# Oxylabs AI Studio Python SDK Agentic Code Guide

## Installation

```bash
pip install oxylabs-ai-studio
```

## Best Practices for Implementation

- Install the latest version of oxylabs-ai-studio.
- Incorporate rate limiting: make sure your implementation respects the rate limits of your purchased plan to prevent service disruptions or overuse.
- Implement a robust retry mechanism: add retry logic for failed requests, but cap the number of retries to avoid infinite loops or excessive API calls.

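The retry guidance above can be sketched as a small standalone helper. This is an illustrative sketch only, not part of the SDK; the name `run_with_retries` and its parameters are placeholders:

```python
import time

def run_with_retries(func, max_retries=3, base_delay=1.0):
    """Call func(), retrying failed attempts with exponential backoff.

    A hard cap on attempts (max_retries) prevents infinite retry loops."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the last error
            time.sleep(base_delay * (2 ** attempt))  # back off before retrying
```

Any SDK call can be wrapped, e.g. `run_with_retries(lambda: scraper.scrape(url=url))`; tune `max_retries` and `base_delay` so the retry traffic stays within your plan's rate limits.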
## Browser-Agent app

### What It Is Good For

A browser automation tool that controls a browser to perform actions such as clicking, scrolling, and navigating. It takes a textual prompt as input and executes the requested actions.

### Python interface

#### Sync interface

```python
from oxylabs_ai_studio.apps.browser_agent import BrowserAgent

browser_agent = BrowserAgent(api_key="<API_KEY>")

prompt = "Find if there is game 'super mario odyssey' in the store."
url = "https://sandbox.oxylabs.io/"
result = browser_agent.run(
    url=url,
    user_prompt=prompt,
    output_format="json",
    schema={"type": "object", "properties": {"page_url": {"type": "string"}}, "required": []},
)
print(result.data)
```

#### Async interface

```python
import asyncio
from oxylabs_ai_studio.apps.browser_agent import BrowserAgent

browser_agent = BrowserAgent(api_key="<API_KEY>")

async def main():
    prompt = "Find if there is game 'super mario odyssey' in the store."
    url = "https://sandbox.oxylabs.io/"
    result = await browser_agent.run_async(
        url=url,
        user_prompt=prompt,
        output_format="json",
        schema={"type": "object", "properties": {"page_url": {"type": "string"}}, "required": []},
    )
    print(result.data)

if __name__ == "__main__":
    asyncio.run(main())
```

Parameters:

- url (str): Target URL to open (required).
- user_prompt (str): Prompt describing the browser actions to perform. Describe the task or actions, not the data you want extracted (required).
- output_format (Literal["json", "markdown"]): Output format (default: "markdown").
- schema (dict | None): OpenAPI schema for structured extraction (required if output_format is "json").

Output (result):

- Python classes:

```python
from typing import Any, Literal

from pydantic import BaseModel

class DataModel(BaseModel):
    type: Literal["json", "markdown", "html", "screenshot"]
    content: dict[str, Any] | str | None

class BrowserAgentJob(BaseModel):
    run_id: str
    message: str | None = None
    data: DataModel | None = None
```

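Since both `data` and `content` are optional, downstream code should guard against `None` before using a result. A minimal defensive accessor (a hypothetical helper, not part of the SDK):

```python
def extract_content(result):
    """Return result.data.content, or None when the run produced no data.

    `result` is expected to have the BrowserAgentJob shape shown above."""
    if result is None or result.data is None:
        return None
    return result.data.content
```
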
## AI-Scraper app

### What It Is Good For

A tool that scrapes website content and returns it either as Markdown or as structured JSON. When opting for JSON output, you must provide a valid JSON schema describing the expected structure.

### Python interface

#### Sync interface

```python
from oxylabs_ai_studio.apps.ai_scraper import AiScraper

scraper = AiScraper(api_key="<API_KEY>")

url = "https://sandbox.oxylabs.io/products/3"
result = scraper.scrape(
    url=url,
    output_format="json",
    schema={"type": "object", "properties": {"price": {"type": "string"}}, "required": []},
    render_javascript=False,
)
print(result)
```

#### Async interface

```python
import asyncio
from oxylabs_ai_studio.apps.ai_scraper import AiScraper

scraper = AiScraper(api_key="<API_KEY>")

async def main():
    url = "https://sandbox.oxylabs.io/products/3"
    result = await scraper.scrape_async(
        url=url,
        output_format="json",
        schema={"type": "object", "properties": {"price": {"type": "string"}}, "required": []},
        render_javascript=False,
    )
    print(result)

if __name__ == "__main__":
    asyncio.run(main())
```

Parameters:

- url (str): Target URL to scrape (required).
- output_format (Literal["json", "markdown"]): Output format (default: "markdown").
- schema (dict | None): OpenAPI schema for structured extraction (required if output_format is "json").
- render_javascript (bool): Render JavaScript (default: False).
- geo_location (str): Proxy location as an ISO 3166-1 alpha-2 (two-letter) country code.

Output (result):

- Python classes:

```python
from pydantic import BaseModel

class AiScraperJob(BaseModel):
    run_id: str
    message: str | None = None
    data: str | dict | None
```

If output_format is "json", data will be a dictionary.
If output_format is "markdown", data will be a string.

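Because the type of `data` depends on `output_format`, callers may want a small runtime type check before processing it. An illustrative helper (not part of the SDK):

```python
def parse_scrape_data(data):
    """Dispatch on the runtime type of AiScraperJob.data.

    Returns ("json", dict) for structured output, ("markdown", str) for text,
    and (None, None) when the job returned no data."""
    if isinstance(data, dict):
        return ("json", data)
    if isinstance(data, str):
        return ("markdown", data)
    return (None, None)
```
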
## Use Cases Examples

### E-commerce Product Scraping

- Task: Locate the category page of a specific domain, extract all product data from the category, and gather detailed information from each product page.
- Proposed Workflow:
  - Use the Browser-Agent app to identify the category page URL and all pagination URLs within that category in a single action.
    Define a JSON schema to return the pagination URLs. Example:

    ```json
    {
      "type": "object",
      "properties": {
        "paginationUrls": {
          "type": "array",
          "description": "Return all URLs from the first to the last page in the category pagination. If some URLs are missing because the category page does not list them all, construct them to match the pattern of the existing ones.",
          "items": {
            "type": "string"
          }
        }
      },
      "required": []
    }
    ```

  - Use the Ai-Scraper app to extract all product URLs from the pagination pages in the category.
  - Use the Ai-Scraper app again to extract detailed data from each product page by defining an appropriate JSON schema.
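
When collecting product URLs across many pagination pages in this workflow, the per-page results can be merged with a small helper that removes duplicates while preserving order. This is a hypothetical utility, not part of the SDK:

```python
def merge_product_urls(pages):
    """Flatten per-page URL lists into one de-duplicated, order-preserving list.

    `pages` is an iterable of URL lists, one list per pagination page."""
    seen = set()
    merged = []
    for page in pages:
        for url in page:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged
```

De-duplication matters here because category pages often repeat featured products across pagination pages, and scraping the same product page twice wastes API calls.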
