# Platform Architecture
How XeeNet coordinates distributed ML research across a global grid of volunteer devices.
## System Overview
XeeNet follows a hub-and-spoke model. A central server manages experiment campaigns, task queues, and results aggregation. Worker nodes (desktop applications running on volunteer machines) poll for tasks, execute training runs, and report metrics back.
```mermaid
flowchart TD
    WEB["Web Dashboard<br/>(HTMX + Jinja2)"]
    API["fa:fa-server FastAPI REST API"]
    ORC["fa:fa-brain Orchestrator Agent"]
    ECON["fa:fa-coins Economics Agent"]
    QUEUE["fa:fa-list Task Queue"]
    DB["fa:fa-database SQLite Database"]
    W1["fa:fa-desktop Worker 1<br/>Desktop App"]
    W2["fa:fa-desktop Worker 2<br/>Desktop App"]
    W3["fa:fa-desktop Worker N<br/>Python Agent"]

    WEB -->|"Briefs & Reports"| API
    API --> ORC
    API --> ECON
    ORC -->|"Generate Tasks"| QUEUE
    QUEUE --> DB
    API -->|"Poll / Submit"| W1
    API -->|"Poll / Submit"| W2
    API -->|"Poll / Submit"| W3

    style WEB fill:#2563eb,stroke:#60a5fa,stroke-width:2px,color:#000
    style API fill:#16a34a,stroke:#4ade80,stroke-width:2px,color:#000
    style ORC fill:#16a34a,stroke:#4ade80,stroke-width:2px,color:#000
    style ECON fill:#16a34a,stroke:#4ade80,stroke-width:2px,color:#000
    style QUEUE fill:#ca8a04,stroke:#facc15,stroke-width:2px,color:#000
    style DB fill:#ca8a04,stroke:#facc15,stroke-width:2px,color:#000
    style W1 fill:#9333ea,stroke:#c084fc,stroke-width:2px,color:#000
    style W2 fill:#9333ea,stroke:#c084fc,stroke-width:2px,color:#000
    style W3 fill:#9333ea,stroke:#c084fc,stroke-width:2px,color:#000
```
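The worker side of this loop can be sketched as a single poll–execute–report cycle. This is a minimal illustration, not the platform's actual code: the `Task`/`poll_cycle` names and the endpoint paths in the comments are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Task:
    """Illustrative stand-in for a queued unit of work."""
    task_id: str
    config: dict


def poll_cycle(fetch_task: Callable[[], Optional[Task]],
               run_task: Callable[[Task], dict],
               submit_result: Callable[[str, dict], None]) -> bool:
    """One hub-and-spoke cycle: poll, execute, report.

    Returns True if a task was processed, False if the queue was empty.
    """
    task = fetch_task()                   # e.g. GET a "next task" endpoint (path assumed)
    if task is None:
        return False                      # nothing queued: sleep, then re-poll
    metrics = run_task(task)              # the training run on the volunteer machine
    submit_result(task.task_id, metrics)  # e.g. POST the metrics back to the API
    return True
```

Injecting the transport as callables keeps the cycle testable without a live server; the real desktop worker would wire these to HTTP calls.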
## Directory Structure
```
xeenet/
├── agents/                      # Agent definitions
│   ├── base.py                  # BaseAgent ABC
│   ├── orchestrator/            # Decomposes research goals into tasks
│   │   ├── orchestrator_agent.py   # Runtime implementation
│   │   └── orchestrator_prompt.md  # Agent role/responsibilities
│   ├── worker/                  # Runs tasks on user devices
│   │   └── worker_agent.py      # Subprocess execution + fallback
│   ├── portal/                  # Researcher interface agent
│   └── economics/               # Credits and metering agent
│
├── skills/                      # Reusable modules called by agents
│   ├── task_generation/         # Experiment templates + config generators
│   │   ├── task_templates.py    # ExperimentTemplate, generate_task_batch()
│   │   └── config_generators.py # CharLMConfigGenerator (search space)
│   ├── result_analysis/         # Aggregation and interpretation
│   ├── scheduling/              # Queue management
│   ├── credits/                 # Credit calculations
│   └── infra/                   # Device profiling
│
├── services/
│   ├── api/                     # FastAPI application
│   │   ├── main.py              # App with lifespan, routers
│   │   ├── routers/             # orchestrator, worker, portal routes
│   │   ├── dashboard/           # HTMX dashboard views
│   │   └── templates/           # ~15 Jinja2 templates
│   ├── db/                      # Async SQLAlchemy + SQLite
│   ├── schemas.py               # Pydantic data models
│   └── orchestration.py         # Brief -> campaign -> tasks pipeline
│
├── experiments/
│   └── train_char_lm.py         # Self-contained training script
│
├── desktop/                     # Electron desktop worker
│   └── src/
│       ├── main/                # Main process (8 TypeScript files)
│       ├── renderer/            # UI (HTML + TypeScript)
│       └── shared/              # Cross-process types + IPC channels
│
├── config/                      # Settings (Pydantic + YAML)
├── tests/                       # 110 tests across 8 files
└── static/                      # CSS, JS for dashboard
```
## The Four Agents
XeeNet uses a multi-agent architecture. Each agent has a defined role, a prompt specification, and a Python implementation.
### Orchestrator Agent
Decomposes research briefs into task graphs. Uses the CharLMConfigGenerator
to sample hyperparameter configurations from a defined search space. Each task gets
a unique seed, a time budget, and a code package reference pointing to the training script.
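A hedged sketch of what that decomposition might produce. The `generate_task_batch` name appears in `skills/task_generation/task_templates.py`, but the signature, the search-space values, and the task dictionary layout here are illustrative assumptions.

```python
import random

# Illustrative char-LM search space; the real one lives in CharLMConfigGenerator.
SEARCH_SPACE = {
    "n_layer": [2, 4, 6],
    "n_embd": [128, 256],
    "learning_rate": [1e-3, 3e-4],
}


def generate_task_batch(num_tasks: int, time_budget: int, rng_seed: int = 0) -> list:
    """Sample one config per task; give each task a unique seed and a time budget."""
    rng = random.Random(rng_seed)  # deterministic sampling for reproducible batches
    tasks = []
    for i in range(num_tasks):
        config = {key: rng.choice(values) for key, values in SEARCH_SPACE.items()}
        tasks.append({
            "task_id": f"task-{i:04d}",
            "config": config,
            "code_package_ref": "experiments/train_char_lm.py",
            "time_budget": time_budget,
            "seeds": [i],  # unique seed per task, as described above
        })
    return tasks
```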
### Worker Agent
Executes tasks within resource and schedule constraints. The Python worker resolves the training script, writes config to a temp JSON file, spawns a subprocess with dual deadlines, and parses the JSON metrics from stdout. Falls back to simulation if PyTorch is unavailable.
### Portal Assistant
Conversational interface for researchers. Helps formulate research briefs, explains results, and generates reports. Designed for progressive disclosure: high-level summaries first, drill-down on request.
### Economics Agent
Manages the credits marketplace. Meters compute contributions from workers, calculates costs for researchers, detects fraudulent results, and handles budget planning. Workers are semi-trusted and the agent accounts for adversarial behaviour.
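Metering could look something like the following. The per-minute rates and the GPU multiplier are invented for illustration and are not the platform's actual pricing.

```python
def compute_credits(wall_time_seconds: float, gpu_used: bool,
                    gpu_rate: float = 2.0, cpu_rate: float = 1.0) -> float:
    """Illustrative metering rule: credits scale with wall time,
    and GPU contributions earn a higher per-minute rate."""
    rate = gpu_rate if gpu_used else cpu_rate
    return round(wall_time_seconds / 60.0 * rate, 2)
```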
## Data Flow
The platform uses Pydantic schemas as the cross-language contract. The same data structures are defined in Python (`services/schemas.py`) and mirrored in TypeScript (`desktop/src/shared/types.ts`), ensuring consistency between the backend API and the Electron workers.
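For instance, the `TaskResult` schema could be declared as below. The field names come from the table in this document; the types and docstring are assumptions about the real `services/schemas.py`.

```python
from pydantic import BaseModel


class TaskResult(BaseModel):
    """Metrics reported by a worker; mirrored in desktop/src/shared/types.ts."""
    val_bpb: float            # validation bits-per-byte of the char LM
    train_loss: float
    steps_completed: int
    wall_time_seconds: float
    device_used: str          # e.g. "cpu" or "cuda"
```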
### Core Schemas
| Schema | Purpose | Key Fields |
|---|---|---|
| `ResourceProfile` | Hardware capabilities of a worker | `cpu_cores`, `ram_gb`, `gpu_name`, `gpu_vram_gb` |
| `TaskSpec` | A unit of work to execute | `task_id`, `config` (JSON), `code_package_ref`, `time_budget`, `seeds` |
| `TaskResult` | Metrics from a completed task | `val_bpb`, `train_loss`, `steps_completed`, `wall_time_seconds`, `device_used` |
| `BriefSpec` | A research campaign definition | `title`, `description`, `num_tasks`, `time_budget_per_task` |
| `TrainingCapability` | Python/PyTorch availability on a worker | `pythonPath`, `pythonVersion`, `hasTorch`, `hasCuda` |
## API Layer
The FastAPI backend exposes a versioned JSON API at `/api/v1/` with three router groups:
- Orchestrator routes: Create briefs, list campaigns, view task status
- Worker routes: Poll for tasks, submit results, register workers
- Portal routes: Researcher queries, report generation
The database is async SQLAlchemy with SQLite (via aiosqlite), initialised at app startup through FastAPI's lifespan context manager. This keeps the deployment simple: no external database server required.