4.3.1. Database

SQL Server schema, table definitions, service-to-table matrix, bulk import tools, and database conventions

Database Reference

Armada uses Microsoft SQL Server as its persistent data store, accessed through the pymssql library (v2.3.13). There is no ORM — all queries are raw SQL with parameterized placeholders (%s).

Connection Credentials

Every service reads the same four environment variables:

Variable	Description	Example
`SQL_SERVER_NAME`	Server hostname	`myserver.database.windows.net`
`SQL_SERVER_USER`	Login name	`myadmin`
`SQL_SERVER_PASSWORD`	Login password	`••••••`
`SQL_SERVER_DB`	Database name	`my_db`

These are stored in the root .env file and injected into Kubernetes pods via the armada-sql-server-secret secret.

Schema

All tables are created by bootstrap/bootstrap_database.py using idempotent IF NOT EXISTS DDL (see Bootstrap Scripts). There is no migration framework — schema changes must be applied manually or by editing the bootstrap script.

Note: No primary keys, foreign keys, or indexes are defined. Uniqueness and referential integrity are enforced at the application level.

armada_runs

Tracks each orchestrator run.

Column	Type	Description
`run_uuid`	`VARCHAR(255)`	UUID v4 identifying the run
`run_datetime`	`DATETIME`	Timestamp when the run started

armada_jobs

Tracks individual jobs spawned within a run.

Column	Type	Description	Allowed values
`job_uuid`	`VARCHAR(255)`	UUID v4 identifying the job
`run_uuid`	`VARCHAR(255)`	References `armada_runs.run_uuid`
`job_datetime`	`DATETIME`	Timestamp when the job started
`job_associated_agent`	`VARCHAR(255)`	Identifier of the agent that executed the job
`job_status`	`VARCHAR(255)`	Current status of the job	`Running`, `Success`, `Failed`

armada_events

Tracks granular events emitted during a job.

Column	Type	Description	Allowed values
`event_uuid`	`VARCHAR(255)`	UUID v4 identifying the event
`event_content`	`VARCHAR(255)`	Free-text event description
`job_uuid`	`VARCHAR(255)`	References `armada_jobs.job_uuid`
`event_datetime`	`DATETIME`	Timestamp of the event
`event_status`	`VARCHAR(255)`	Status of the event	`Success`, `Failed`

armada_proxies

Stores the pool of available proxies.

Column	Type	Description	Allowed values
`proxy_url`	`VARCHAR(255)`	Full proxy URL	`http://USER:PASS@HOST:PORT`
`proxy_provider_name`	`VARCHAR(255)`	Provider name	e.g. `iproyal`, `brightdata`, `oxylabs`
`proxy_type`	`VARCHAR(255)`	Proxy category	`datacenter`, `residential`, `mobile`
`proxy_rotation_strategy`	`VARCHAR(255)`	Rotation mode	`sticky`, `rotating`
`proxy_location`	`VARCHAR(255)`	IANA timezone identifier (can be empty)	e.g. `Europe/Paris`, `Europe`, `US`

Filtering on proxy_location uses a contains match: setting the filter value to Europe selects proxies in both Europe/Vienna and Europe/Paris. Use a more specific value (e.g. Europe/Paris) to target a single timezone.

armada_output

Stores the output data produced by agent runs.

Column	Type	Description
`run_uuid`	`VARCHAR(255)`	UUID v4 referencing the run
`data`	`VARCHAR(MAX)`	Output data (plain text or JSON string)
`timestamp`	`DATETIME`	Timestamp of the output record

armada_fingerprints

Stores collected browser fingerprints as JSON blobs.

Column	Type	Description
`antibot_vendor`	`VARCHAR(255)`	Vendor name (e.g. `arkose`, `datadome`)
`website`	`VARCHAR(255)`	Target website identifier
`data`	`VARCHAR(MAX)`	JSON string — format depends on vendor (see below)
`collecting_date`	`DATE`	Collection date (`YYYY-MM-DD`)

data field format by vendor:

arkose: {"ts":"<timestamp>","ua":"<user_agent>","bda":"<bda_payload>"}
Other vendors: free-form JSON.

Data type reference

SQL Server type	Format	Example
`DATE`	`YYYY-MM-DD`	`2026-02-24`
`DATETIME`	`YYYY-MM-DD HH:MM:SS`	`2026-02-24 10:30:00`
`VARCHAR(255)`	Plain text	`my_value`
`VARCHAR(MAX)`	Plain text / JSON string	`{"key":"value"}`

Empty values are stored as NULL for date and datetime columns.

Entity Relationships

armada_runs
  └── 1:N ── armada_jobs  (via run_uuid)
                └── 1:N ── armada_events  (via job_uuid)

armada_runs
  └── 1:N ── armada_output  (via run_uuid)

armada_proxies        (standalone lookup table)
armada_fingerprints   (standalone lookup table)

These relationships are not enforced by foreign keys — they are maintained by application code. The backend’s DELETE /api/runs/{run_uuid} endpoint performs a manual cascade: events → jobs → run.

Bulk Data Import

Armada provides CLI scripts to populate tables from CSV files. See the Database Management guide for usage instructions and CSV examples.

Conventions & Design Decisions

Convention	Detail
No connection pooling	Every operation opens and closes its own `pymssql` connection.
Parameterized queries	All SQL uses `%s` placeholders — no string substitution.
No ORM	Raw SQL throughout the entire codebase.
No migrations	Schema is managed via the idempotent bootstrap script only.
Async wrapping	The backend wraps synchronous `pymssql` calls with `asyncio.to_thread()` for FastAPI compatibility.
Consistent credentials	All services use the same 4 env vars, loaded identically.

Dependencies

All services pin the same version:

pymssql==2.3.13

Found in requirements.txt of: services/agent, services/backend, services/proxy-provider, services/fingerprint-provider, and bootstrap.