3.2. Proxy Provider
Overview
The proxy layer gives each agent a local mitmproxy instance (port 8081) that all browser traffic passes through. Optionally, this local proxy forwards traffic to an upstream proxy fetched from the Proxy Provider service, which serves verified proxies from a database.
Beyond routing, the proxy layer also supports response modifiers, request modifiers, and data retrievers — hooks that let you intercept and transform HTTP traffic in real time. For the Proxy Provider service API and deployment reference, see Proxy Provider Service.
Browser ──► mitmproxy (localhost:8081) ──► Upstream Proxy ──► Internet
                 │
                 ├─ request modifiers
                 ├─ response modifiers
                 └─ data retrievers
Configuration
Add a config_proxy block to default_agent_message in your config_template.json (see JSON Configuration):
"default_agent_message": {
    "config_proxy": {
        "upstream_proxy_enabled": 1,
        "upstream_proxy_broker_type": "provider",
        "proxy_provider_name": "iproyal",
        "proxy_location": "",
        "proxy_type": "",
        "proxy_rotation_strategy": "",
        "verify_ip": 0,
        "verify_quality": 0,
        "verify_location": 0,
        "quality_threshold": 70,
        "max_queries_number": 3
    }
}
Fields
| Field | Type | Description |
|---|---|---|
| upstream_proxy_enabled | int | 0 = no upstream proxy (local mitmproxy only), 1 = enable upstream proxy |
| upstream_proxy_broker_type | string | "provider" = fetch from Proxy Provider service, "direct" = use a hardcoded proxy URL |
| proxy_provider_name | string | Filter proxies by provider name (e.g. "iproyal") |
| proxy_location | string | Filter by geographic location (e.g. "Europe", "US") |
| proxy_type | string | Filter by proxy type (residential, mobile) |
| proxy_rotation_strategy | string | Filter by rotation strategy (sticky, rotating) |
| verify_ip | int | 1 = resolve and return the proxy’s outbound IP |
| verify_quality | int | 1 = run an IPQualityScore fraud check |
| verify_location | int | 1 = verify the proxy IP matches proxy_location |
| quality_threshold | int | Minimum quality score (0-100) to accept a proxy |
| max_queries_number | int | Max retry attempts if proxy checks fail |
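The constraints in the field table can be checked before a config reaches the manager. The sketch below is a hypothetical helper: `validate_config_proxy` and its error strings are not part of the documented API. It applies only the rules stated in the table, treats missing fields as unset, and assumes that `max_queries_number` must be at least 1.

```python
# Hypothetical helper: validate a config_proxy dict against the field table.
# The function name and messages are illustrative, not part of ProxyManager.

FLAG_FIELDS = ("upstream_proxy_enabled", "verify_ip", "verify_quality", "verify_location")
STRING_FIELDS = ("proxy_provider_name", "proxy_location", "proxy_type", "proxy_rotation_strategy")
BROKER_TYPES = ("provider", "direct")


def validate_config_proxy(config):
    """Return a list of problems; an empty list means the config looks valid."""
    problems = []
    # Flags are documented as 0 or 1.
    for name in FLAG_FIELDS:
        if config.get(name, 0) not in (0, 1):
            problems.append(f"{name} must be 0 or 1")
    # Filters are documented as strings; an empty string means "no filter".
    for name in STRING_FIELDS:
        if not isinstance(config.get(name, ""), str):
            problems.append(f"{name} must be a string")
    # The broker type only matters when an upstream proxy is enabled.
    if config.get("upstream_proxy_enabled", 0) == 1:
        if config.get("upstream_proxy_broker_type") not in BROKER_TYPES:
            problems.append('upstream_proxy_broker_type must be "provider" or "direct"')
    threshold = config.get("quality_threshold", 0)
    if not isinstance(threshold, int) or not 0 <= threshold <= 100:
        problems.append("quality_threshold must be an integer between 0 and 100")
    # Assumption: at least one attempt is required.
    retries = config.get("max_queries_number", 1)
    if not isinstance(retries, int) or retries < 1:
        problems.append("max_queries_number must be a positive integer")
    return problems
```

Returning a list rather than raising on the first error lets a caller report every problem in a config file at once.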
Initialization
The ProxyManager is initialized in ctx_agent_context.py:
from src.proxy_manager import ProxyManager


class AgentContext:
    def __init__(self, agent_message):
        self.agent_message = agent_message

    def instantiate_default(self):
        self.proxy_manager = ProxyManager(
            self.agent_message["config_proxy"]
        ).launch_proxy()
On launch_proxy(), the manager:
- Kills any existing process on port 8081
- Fetches an upstream proxy from the Proxy Provider (if enabled)
- Starts a local mitmproxy subprocess on port 8081
The browser is automatically configured to use this proxy via the --proxy-server=http://127.0.0.1:8081 Chrome flag in config_fantomas.config_browser.
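This page does not show how ProxyManager starts mitmproxy internally. As a rough equivalent, the same topology (local listener on port 8081, optional upstream) can be expressed with the `mitmdump` command line. The sketch below only builds the argument list. `build_mitmdump_args` is a hypothetical name, and the flags used (`--listen-host`, `--listen-port`, `--mode upstream:...`, `--upstream-auth`) are standard mitmproxy options rather than something taken from ProxyManager.

```python
from urllib.parse import urlsplit

LOCAL_PORT = 8081  # port used throughout this page


def build_mitmdump_args(upstream_url=None, port=LOCAL_PORT):
    """Build a mitmdump command line for a local proxy with an optional upstream."""
    args = ["mitmdump", "--listen-host", "127.0.0.1", "--listen-port", str(port)]
    if upstream_url:
        parts = urlsplit(upstream_url)
        # Credentials are passed separately, so rebuild the URL without them.
        host = parts.hostname if parts.port is None else f"{parts.hostname}:{parts.port}"
        args += ["--mode", f"upstream:{parts.scheme}://{host}"]
        if parts.username:
            args += ["--upstream-auth", f"{parts.username}:{parts.password or ''}"]
    return args
```

Running the resulting list with `subprocess.Popen` would start a comparable proxy for manual debugging outside the agent.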
Proxy Modes
No upstream proxy
Set upstream_proxy_enabled to 0. Traffic flows through the local mitmproxy only (useful for local development or when you just need traffic interception without IP masking):
"config_proxy": {
    "upstream_proxy_enabled": 0
}
Fetch from Proxy Provider
Set upstream_proxy_enabled to 1 and upstream_proxy_broker_type to "provider". The manager queries the Proxy Provider service for a random proxy matching your filters:
"config_proxy": {
    "upstream_proxy_enabled": 1,
    "upstream_proxy_broker_type": "provider",
    "proxy_provider_name": "iproyal",
    "proxy_location": "Europe",
    "verify_quality": 1,
    "quality_threshold": 70,
    "max_queries_number": 3
}
How the request is built
When upstream_proxy_broker_type is "provider", the ProxyManager builds a GET request to the Proxy Provider’s /fetch_proxy endpoint. Only config fields with truthy values are included as query parameters — fields set to 0, "", or None are omitted.
For example, with this config:
"config_proxy": {
    "upstream_proxy_enabled": 1,
    "upstream_proxy_broker_type": "provider",
    "proxy_provider_name": "iproyal",
    "proxy_location": "Europe",
    "proxy_type": "",
    "proxy_rotation_strategy": "",
    "verify_ip": 1,
    "verify_quality": 1,
    "verify_location": 0,
    "quality_threshold": 70,
    "max_queries_number": 3
}
The resulting HTTP request would be:
GET http://127.0.0.1:5001/fetch_proxy?proxy_provider_name=iproyal&proxy_location=Europe&verify_ip=1&verify_quality=1&quality_threshold=70&max_queries_number=3
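Note that proxy_type, proxy_rotation_strategy, and verify_location are omitted because their values are falsy. The upstream_proxy_enabled and upstream_proxy_broker_type fields are absent from the query string in this example as well.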
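The filtering rule can be reproduced in a few lines. This is a minimal sketch, not the ProxyManager source: `build_fetch_proxy_url` and `LOCAL_ONLY_KEYS` are hypothetical names, and the exclusion of the two `upstream_*` keys is inferred from the example request above, where they do not appear.

```python
from urllib.parse import urlencode

# Assumption: these keys are not sent to the service, which matches the
# example request shown above.
LOCAL_ONLY_KEYS = {"upstream_proxy_enabled", "upstream_proxy_broker_type"}


def build_fetch_proxy_url(config, base_url="http://127.0.0.1:5001"):
    """Build the /fetch_proxy URL, keeping only truthy config values."""
    params = {
        key: value
        for key, value in config.items()
        if key not in LOCAL_ONLY_KEYS and value  # drops 0, "" and None
    }
    return f"{base_url}/fetch_proxy?{urlencode(params)}"
```

Because dictionaries preserve insertion order, the query parameters come out in the same order as the fields in the config block.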
How the response is built
The response from the Proxy Provider is adaptive — it only includes fields for the checks you actually requested. This keeps the response minimal and avoids unnecessary work on the server side.
Always included:
| Field | Description |
|---|---|
| proxy_url | The proxy address (e.g. http://user:pass@host:port) |
| attempt | Which retry attempt returned this proxy (starts at 1) |
Conditionally included — only if the corresponding verify_* flag is set to 1:
| Field | Included when | Content |
|---|---|---|
| verify_ip | verify_ip=1 | {"ip": "1.2.3.4"} or {"ip": null} if resolution failed |
| verify_quality | verify_quality=1 | {"fraud_score_inverted": 92, "quality_threshold": 70, "quality_pass": true} |
| verify_location | verify_location=1 | {"actual_timezone": "Europe/Paris", "expected_location": "Europe", "location_match": true} |
Example — all checks enabled:
{
    "proxy_url": "http://user:pass@proxy.example.com:8080",
    "attempt": 1,
    "verify_ip": {
        "ip": "185.230.12.45"
    },
    "verify_quality": {
        "fraud_score_inverted": 92,
        "quality_threshold": 70,
        "quality_pass": true
    },
    "verify_location": {
        "actual_timezone": "Europe/Paris",
        "expected_location": "Europe",
        "location_match": true
    }
}
Example — no checks enabled:
{
    "proxy_url": "http://user:pass@proxy.example.com:8080",
    "attempt": 1
}
If a check fails (e.g. quality score below threshold or location mismatch), the service automatically retries with a new proxy, up to max_queries_number attempts.
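Because the verification blocks are optional, client code that inspects the response should treat a missing block as "not requested" rather than as a failure. The following sketch shows one way to do that. `failed_checks` is a hypothetical helper built only on the field names documented above; what the service returns once all attempts are exhausted is not described here and is not handled.

```python
def failed_checks(response):
    """Return the names of verification blocks that are present and report a failure."""
    failed = []

    ip_block = response.get("verify_ip")
    if ip_block is not None and ip_block.get("ip") is None:
        failed.append("verify_ip")  # {"ip": null} means resolution failed

    quality = response.get("verify_quality")
    if quality is not None and not quality.get("quality_pass", False):
        failed.append("verify_quality")

    location = response.get("verify_location")
    if location is not None and not location.get("location_match", False):
        failed.append("verify_location")

    return failed
```

An empty list means every requested check passed, including the case where no checks were requested at all.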
Direct proxy URL
Set upstream_proxy_broker_type to "direct". Instead of querying the service, the manager uses the URL from the PROXY_PROVIDER_URL environment variable (see Environment Variables) directly as the upstream proxy:
"config_proxy": {
    "upstream_proxy_enabled": 1,
    "upstream_proxy_broker_type": "direct"
}
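In direct mode a missing or malformed PROXY_PROVIDER_URL is the most likely failure, so it can be worth checking the variable early. The sketch below is illustrative only: `resolve_direct_upstream` is a hypothetical helper, and how ProxyManager itself reads and validates the variable is not shown on this page.

```python
import os
from urllib.parse import urlsplit


def resolve_direct_upstream(environ=os.environ):
    """Return the upstream proxy URL from PROXY_PROVIDER_URL, or raise if unusable."""
    url = environ.get("PROXY_PROVIDER_URL", "").strip()
    if not url:
        raise RuntimeError("PROXY_PROVIDER_URL is not set")
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        # The URL itself is left out of the message because it may contain credentials.
        raise ValueError("PROXY_PROVIDER_URL must look like scheme://user:pass@host:port")
    return url
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.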
Addon Registration and Proxy Lifecycle
The ProxyManager runs mitmproxy in a separate subprocess (via billiard.Process). When launch_proxy() is called, the current modifier and retriever arrays are copied into the child process. This has important implications for when addons can be registered.
Registration must happen before launch
Addons (request modifiers, response modifiers, data retrievers) must be registered before calling launch_proxy(). Once the mitmproxy subprocess starts, it holds its own copy of the arrays — any addon added to the parent process afterward will not be seen by the running proxy.
class AgentContext:
    def instantiate_default(self):
        # 1. Create the ProxyManager (no subprocess yet)
        self.proxy_manager = ProxyManager(self.agent_message["config_proxy"])

        # 2. Register addons while still in the parent process
        self.proxy_manager.add_request_modifier(my_request_modifier)
        self.proxy_manager.add_modifier(my_response_modifier)
        self.proxy_manager.add_retriever("my_queue", my_retriever)

        # 3. Launch: arrays are copied into the subprocess
        self.proxy_manager.launch_proxy()
What happens after launch
| Action | Effect |
|---|---|
| Register addon before launch_proxy() | Addon is active in the proxy subprocess |
| Register addon after launch_proxy() | Addon is ignored: the running subprocess has its own copy |
| Register addon after launch, then call switch_upstream_proxy() | Addon becomes active: switch_upstream_proxy() kills the old subprocess and starts a new one with the current arrays |
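The copy-at-launch behaviour in the table can be imitated without starting any process. In the sketch below a pickle round trip stands in for handing the arrays to the subprocess, and plain strings stand in for addon functions. Both are simplifications chosen for illustration, not a description of how billiard transfers the data.

```python
import pickle

# Parent-side registry; strings stand in for real addon functions.
request_modifiers = ["add_custom_header"]

# "Launch": the child receives a snapshot of the list as it is right now.
snapshot_at_launch = pickle.loads(pickle.dumps(request_modifiers))

# Registering afterwards changes only the parent's list.
request_modifiers.append("extra_modifier")

# "Switch": a new child is started from a fresh snapshot of the current list.
snapshot_after_switch = pickle.loads(pickle.dumps(request_modifiers))

print(snapshot_at_launch)     # the first child never sees extra_modifier
print(snapshot_after_switch)  # the restarted child sees both addons
```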
Proxy switching re-applies addons
switch_upstream_proxy() internally calls exit_local_proxy() followed by launch_proxy(). The new subprocess is created with the current state of the modifier and retriever arrays. This means:
- All previously registered addons are preserved across switches.
- Any addon added after the initial launch but before a switch will be picked up by the new subprocess.
async def ctx_script(job_ctx, agent_ctx):
    # Proxy is already running with initial addons

    # This modifier won't be active yet (subprocess already running)
    agent_ctx.proxy_manager.add_request_modifier(extra_modifier)

    # After switch, ALL addons (initial + extra_modifier) are active
    agent_ctx.proxy_manager.switch_upstream_proxy()
Pickling constraint
Since the subprocess receives addons via serialization, all modifier and retriever functions must be picklable. In practice this means they must be defined at module level — not as lambdas, closures, or nested functions.
# Good — module-level function, picklable
def remove_csp_header(flow):
    if "content-security-policy" in flow.response.headers:
        del flow.response.headers["content-security-policy"]


# Bad: lambda, not picklable
remove_csp = lambda flow: flow.response.headers.pop("content-security-policy", None)


# Bad: closure, not picklable
def make_modifier(header_name):
    def modifier(flow):
        del flow.response.headers[header_name]
    return modifier
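A quick way to catch an unpicklable addon before launch is to try serializing it. The sketch below uses the standard `pickle` module as a stand-in check, on the assumption that it behaves like the serialization used for the subprocess. `is_picklable` and `remove_header` are hypothetical names. It also shows `functools.partial` as a possible replacement for the closure pattern when a modifier needs a parameter.

```python
import functools
import pickle


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


def remove_header(flow, header_name):
    # Module-level function that takes the header as an explicit argument
    # instead of capturing it in a closure.
    flow.response.headers.pop(header_name, None)


# partial binds the argument; the result stays picklable as long as
# remove_header can be imported by name from its module.
remove_csp = functools.partial(remove_header, header_name="content-security-policy")


def make_modifier(header_name):
    # Closure variant from the text above, kept here for comparison.
    def modifier(flow):
        flow.response.headers.pop(header_name, None)
    return modifier


print(is_picklable(lambda flow: None))                          # lambda
print(is_picklable(make_modifier("content-security-policy")))   # closure
```

Both printed checks report that the callable cannot be pickled, which is the situation the constraint above rules out.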
A recommended pattern is to place each addon in its own file inside the addon/ directory (see Reusable Python Modules) and import them where needed:
project/
└── addon/
    ├── request_modifier_add_custom_header.py
    ├── response_modifier_remove_csp.py
    └── retriever_capture_api.py
from addon.request_modifier_add_custom_header import request_modifier_add_custom_header
from addon.response_modifier_remove_csp import response_modifier_remove_csp
from addon.retriever_capture_api import retriever_capture_api
Usage in ctx_script.py
Switching proxy mid-run
To get a fresh upstream proxy during a job (e.g. after a block), call switch_upstream_proxy():
async def ctx_script(job_ctx, agent_ctx):
    # ... some automation ...

    # IP got blocked, rotate to a new proxy
    agent_ctx.proxy_manager.switch_upstream_proxy()

    # Continue with a new IP
This stops the current mitmproxy process, fetches a new upstream proxy, and restarts mitmproxy. All previously registered addons remain active.
Adding response modifiers
Response modifiers intercept and alter HTTP responses passing through the proxy. They receive a flow object with access to flow.response.headers, flow.response.content, etc.
def remove_csp_header(flow):
    if flow.response and "content-security-policy" in flow.response.headers:
        del flow.response.headers["content-security-policy"]


agent_ctx.proxy_manager.add_modifier(remove_csp_header)
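Since a modifier is a plain function of `flow`, it can be exercised without running the proxy. The sketch below uses `types.SimpleNamespace` as a stand-in for the flow object. `fake_flow` is a hypothetical test helper, and the plain dict it uses for headers is case-sensitive, unlike real HTTP headers, so it only approximates what the proxy passes in.

```python
from types import SimpleNamespace


def remove_csp_header(flow):
    if flow.response and "content-security-policy" in flow.response.headers:
        del flow.response.headers["content-security-policy"]


def fake_flow(response_headers=None):
    """Stand-in for a proxy flow; pass None to simulate a flow without a response."""
    if response_headers is None:
        return SimpleNamespace(response=None)
    return SimpleNamespace(response=SimpleNamespace(headers=dict(response_headers)))


flow = fake_flow({
    "content-security-policy": "default-src 'self'",
    "content-type": "text/html",
})
remove_csp_header(flow)
print(flow.response.headers)  # only content-type remains

# A flow without a response must not raise.
remove_csp_header(fake_flow(None))
```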
Adding request modifiers
Request modifiers intercept outgoing requests. They receive a flow object with access to flow.request.headers, flow.request.url, etc.
def add_custom_header(flow):
    flow.request.headers["X-Custom"] = "my-value"


agent_ctx.proxy_manager.add_request_modifier(add_custom_header)
Adding data retrievers
Retrievers extract data from responses and store it in a named queue you can read later. They receive flow and queue as arguments:
def capture_api_response(flow, queue):
    if "/api/data" in flow.request.url and flow.response:
        queue.put(flow.response.text)


agent_ctx.proxy_manager.add_retriever("api_data", capture_api_response)
Then in your script, retrieve the captured data:
async def ctx_script(job_ctx, agent_ctx):
    # Navigate to a page that triggers the API call
    tab = await agent_ctx.browser.get("https://example.com")

    # Retrieve the last captured value from the queue
    data = agent_ctx.proxy_manager.retrieve("api_data")
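The retriever and "last captured value" behaviour can be tried in isolation. In this sketch a standard `queue.Queue` and `SimpleNamespace` objects stand in for the manager's queue and the proxy's flow. `drain_last` is a hypothetical helper that mimics returning the most recent item, and the actual implementation of `retrieve()` is not shown on this page and may differ.

```python
import queue
from types import SimpleNamespace


def capture_api_response(flow, q):
    if "/api/data" in flow.request.url and flow.response:
        q.put(flow.response.text)


def drain_last(q):
    """Empty the queue and return the most recent item, or None if it was empty."""
    last = None
    while True:
        try:
            last = q.get_nowait()
        except queue.Empty:
            return last


def make_flow(url, text):
    # Minimal stand-in exposing only the attributes the retriever reads.
    return SimpleNamespace(request=SimpleNamespace(url=url), response=SimpleNamespace(text=text))


api_data = queue.Queue()
capture_api_response(make_flow("https://example.com/api/data?page=1", "first"), api_data)
capture_api_response(make_flow("https://example.com/static/app.js", "ignored"), api_data)
capture_api_response(make_flow("https://example.com/api/data?page=2", "second"), api_data)

latest = drain_last(api_data)
print(latest)
```

Only the two matching URLs are queued, and draining leaves the queue empty for the next capture.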
Execution order inside the proxy
When a response arrives, the ProxyAddOn processes hooks in this order:
- Data counting: cumulative Content-Length tracking
- Retrievers: data extraction into queues
- Response modifiers: header/body transformations
This means retrievers see the original response before any modifier alters it.
For requests, request modifiers run before the request is forwarded upstream.
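The ordering described above can be sketched as a small pipeline. This is not the ProxyAddOn source: `process_response` and the stand-in flow object are hypothetical, and the sketch only demonstrates the consequence stated in the text, namely that a retriever captures the body before a modifier rewrites it.

```python
import queue
from types import SimpleNamespace


def process_response(flow, retrievers, modifiers, counters):
    # 1. Data counting: add the declared Content-Length to the running total.
    counters["bytes"] += int(flow.response.headers.get("content-length", 0))
    # 2. Retrievers: each one sees the response as received.
    for retriever, q in retrievers:
        retriever(flow, q)
    # 3. Response modifiers: transformations are applied last.
    for modifier in modifiers:
        modifier(flow)


def capture_body(flow, q):
    q.put(flow.response.text)


def redact_body(flow):
    flow.response.text = "[redacted]"


flow = SimpleNamespace(
    response=SimpleNamespace(headers={"content-length": "5"}, text="hello")
)
captured = queue.Queue()
counters = {"bytes": 0}

process_response(flow, [(capture_body, captured)], [redact_body], counters)
original_body = captured.get_nowait()
print(original_body, flow.response.text, counters["bytes"])
```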
Monitoring bandwidth
The proxy tracks total data transferred. Access it with:
total_bytes = agent_ctx.proxy_manager.get_data_count()
Disabling the Proxy Layer
If your project does not need any proxy (no upstream, no traffic interception), remove the ProxyManager from ctx_agent_context.py and remove --proxy-server=http://127.0.0.1:8081 from the browser options in config_template.json.