# Roadmap
XeeNet is a distributed experimentation network that turns idle compute into a research fabric for bounded, verifiable ML workloads. This roadmap tracks the journey from working prototype to global-scale autoresearch.
## Current State: Phase 1 Complete
The core platform is functional end-to-end as a single-operator research network. A researcher creates a brief, the orchestrator decomposes it into a campaign of bounded experiments with real hyperparameter configurations, workers execute actual PyTorch training, results flow back through the API, and the dashboard displays live progress with factor analysis. Credits are calculated and recorded on every result submission.
### What's Built
| Component | Status | Detail |
|---|---|---|
| FastAPI Backend | Complete | REST API, async DB, 3 router groups, lifespan management |
| Web Dashboard | Complete | HTMX + Jinja2, brief CRUD, campaign status, results display, auto-refresh |
| Orchestrator Agent | Complete | Brief decomposition, config generation, task graph creation |
| Python Worker Agent | Complete | Subprocess execution, dual deadlines, simulated fallback, CLI runner |
| Electron Desktop Worker | Complete | Zero-setup install, GPU detection, task execution, system tray |
| Training Pipeline | Complete | Char-level transformer on TinyShakespeare, real val_bpb, bounded budgets |
| Config Generator | Complete | Reproducible search space sampling with deterministic seeds |
| Campaign Tooling | Complete | Campaign runner, progress monitor, post-campaign factor analysis with JSON export |
| Credits System | Complete | Calculation on result submission, ledger persistence, dashboard display |
| Portal Assistant | Framework | Agent prompt and stub, needs LLM integration |
| Test Suite | Complete | 110 tests across 10 files, all passing |
## Growth Phases
XeeNet grows in four deliberate phases. Each phase proves a specific thesis before the next layer of complexity is introduced. The platform is not a replacement for datacentres — it is a new compute substrate for high-volume ML experimentation.
| Phase | Status | Thesis to Prove |
|---|---|---|
| 1. Single-Operator Network | Complete | A distributed worker network can reliably execute bounded ML experiments and produce useful, verified research insights. |
| 2. Trusted External Contributors | Next | Strangers can safely contribute compute with verifiable results and earned reputation. |
| 3. Research Platform | Planned | External researchers get genuine value from submitting briefs to the network. |
| 4. Economic Layer | Planned | A sustainable incentive model drives long-term participation without regulatory overhead. |
## Phase 2: Trusted External Contributors (Next)
The trust layer is the existential requirement. Without it, adoption stops. These priorities harden the platform for external workers before scaling the researcher side.
### Result Verification
First-class k-of-n redundancy: randomly re-run a subset of tasks on trusted workers, compare metrics within tolerance, and flag anomalies. This is the minimum viable trust layer for a semi-trusted worker network.
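The re-audit and comparison logic above can be sketched in a few lines. This is illustrative only: the function names, the 2% relative tolerance, and the data shapes are assumptions, not the platform's actual API.

```python
import math
import random

def select_for_reaudit(task_ids, k, seed=0):
    """Randomly pick k completed tasks for redundant re-execution on trusted workers.

    Deterministic given the seed, so an audit run is itself reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(list(task_ids), k)

def metrics_agree(original, redundant, rel_tol=0.02):
    """Compare a worker-reported metric against a trusted re-run within tolerance."""
    return math.isclose(original, redundant, rel_tol=rel_tol)

def flag_anomalies(results, reruns, rel_tol=0.02):
    """Return task ids whose original metric disagrees with the trusted re-run."""
    return [tid for tid, metric in reruns.items()
            if not metrics_agree(results[tid], metric, rel_tol)]
```

A flagged task would then feed into the worker-reputation and scheduling layers rather than being silently discarded.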
### Workload Sandboxing
The autoresearch contract already constrains the attack surface — workers run a pre-approved script with JSON config, not arbitrary code. Harden this with containerised or WASM-based execution, no arbitrary filesystem or network access, and deterministic runtime constraints.
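The "pre-approved script plus JSON config" contract can be made concrete at the point where a worker builds its command line. A minimal sketch, assuming a hypothetical allowlist and script name (containerisation and filesystem/network restrictions would wrap around this, not replace it):

```python
import json
from pathlib import Path

# Hypothetical set of pre-approved execution templates; the real platform
# would distribute and pin these, e.g. by content hash.
APPROVED_SCRIPTS = {"train_char_lm.py"}

def build_task_command(script, config):
    """Build the exact argv a worker may execute: approved script + JSON config only.

    Anything outside the allowlist is rejected before a process is ever spawned,
    and the config travels as a single JSON argument, never as shell text.
    """
    if Path(script).name not in APPROVED_SCRIPTS:
        raise ValueError(f"script not in pre-approved set: {script}")
    return ["python", Path(script).name, "--config", json.dumps(config, sort_keys=True)]
```

The point of the sketch is that the attack surface is the allowlist, not the worker: a malicious task cannot name an arbitrary script or smuggle shell syntax through the config.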
### Worker Reputation
Track worker reliability over time: task completion rate, result consistency across redundant runs, uptime history. Reputation scores inform task scheduling priority and eligibility for higher-value workloads.
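One way to blend those three signals into a single score, as a sketch only — the weights and the benefit-of-the-doubt rule for unaudited workers are illustrative assumptions, not a platform constant:

```python
def reputation_score(completed, assigned, consistent, audited, uptime_frac,
                     w=(0.5, 0.3, 0.2)):
    """Blend completion rate, redundant-run consistency, and uptime into [0, 1].

    completed/assigned  -> task completion rate
    consistent/audited  -> fraction of redundant re-runs that agreed
    uptime_frac         -> observed availability over the tracking window
    """
    completion = completed / assigned if assigned else 0.0
    # A never-audited worker gets the benefit of the doubt on consistency;
    # the k-of-n re-audit sampling corrects this over time.
    consistency = consistent / audited if audited else 1.0
    return w[0] * completion + w[1] * consistency + w[2] * uptime_frac
```

A scheduler could then gate higher-value workloads on a threshold (say, score above 0.8 and a minimum number of audited tasks).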
### Cross-Platform Workers
Extend the Electron app's auto-setup beyond Windows, adding platform-specific Python distribution management for macOS and Linux. Ship signed binaries and reproducible runtimes, and document the security model transparently.
## Phase 3: Research Platform (Planned)
Once external workers can contribute safely, open the platform to external researchers. The goal is to prove that the network produces genuine research value — not just compute, but knowledge.
### Workload Admission and Policy Engine
External briefs must pass through a policy engine: constrained job formats, pre-approved execution templates, resource and time-budget caps. The workload model must be strict enough to prevent abuse (cryptomining, data exfiltration, fingerprinting) while flexible enough to support diverse research questions. Public datasets only in the initial release.
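The admission check reduces to validating a brief against hard caps and allowlists before the orchestrator ever sees it. A minimal sketch — every cap, template name, and dataset name below is an illustrative assumption:

```python
# Illustrative policy constants, not actual platform limits.
MAX_WALL_SECONDS = 1800
MAX_PARAMS = 50_000_000
APPROVED_TEMPLATES = {"char_lm", "cifar10", "cartpole"}
PUBLIC_DATASETS = {"tinyshakespeare", "cifar10"}

def admit_brief(brief):
    """Return (admitted, reasons) for an external brief against hard policy caps.

    Collecting every violation, rather than failing fast, gives the researcher
    one actionable rejection message instead of a retry loop.
    """
    reasons = []
    if brief.get("template") not in APPROVED_TEMPLATES:
        reasons.append("unknown execution template")
    if brief.get("dataset") not in PUBLIC_DATASETS:
        reasons.append("dataset not on the public allowlist")
    if brief.get("wall_seconds", 0) > MAX_WALL_SECONDS:
        reasons.append("time budget exceeds cap")
    if brief.get("max_params", 0) > MAX_PARAMS:
        reasons.append("model size exceeds cap")
    return (not reasons), reasons
```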
### Bayesian Optimisation
Replace random search with Bayesian optimisation. Use the accumulated experiment history to inform the next batch of hyperparameter configurations. Focus exploration on promising regions of the search space rather than sweeping blindly.
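To make the idea concrete, here is a self-contained one-dimensional sketch (learning rate only) using a hand-rolled Gaussian-process surrogate and expected improvement. It illustrates the mechanism, not the planned implementation — a production version would use an established library and handle the full multi-dimensional search space.

```python
import numpy as np
from math import erf, sqrt, pi

def rbf(a, b, ls=0.2):
    """Squared-exponential kernel between two 1-D point sets."""
    a = np.asarray(a, float).reshape(-1, 1)
    b = np.asarray(b, float).reshape(-1, 1)
    return np.exp(-0.5 * ((a - b.T) / ls) ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    """GP posterior mean/std at candidates Xs given observed (X, y)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    alpha = np.linalg.solve(K, np.asarray(y, float))
    mu = Ks.T @ alpha
    v = np.linalg.solve(K, Ks)
    var = np.clip(np.diag(rbf(Xs, Xs) - Ks.T @ v), 1e-12, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sigma, best):
    """EI for minimisation: expected amount by which each candidate beats the best loss."""
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2)))
    pdf = np.exp(-0.5 * z ** 2) / sqrt(2 * pi)
    return (best - mu) * cdf + sigma * pdf

def propose_next(history_lr, history_loss, candidates):
    """Pick the candidate with the highest expected improvement over history."""
    mu, sigma = gp_posterior(history_lr, history_loss, candidates)
    return candidates[int(np.argmax(expected_improvement(mu, sigma, min(history_loss))))]
```

The key property for XeeNet is that `propose_next` consumes exactly what the network already produces — verified (config, metric) pairs — so the surrogate improves as the corpus grows.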
### Multiple Experiment Types
Extend beyond character-level LMs. Add experiment templates for image classification (CIFAR-10), reinforcement learning (CartPole), and other self-contained benchmarks. Each template follows the same autoresearch contract: bounded budget, single comparable metric, deterministic seeds.
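The shared contract can be expressed as a small frozen record that every template must satisfy. A sketch, with illustrative field names and values (the real templates would also carry the approved script reference):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExperimentTemplate:
    """The autoresearch contract shared by every experiment type."""
    name: str
    dataset: str
    metric: str            # single comparable metric, lower is better
    max_wall_seconds: int  # bounded budget
    seed: int              # deterministic seed for reproducibility

# Illustrative instances; numbers are placeholders, not platform defaults.
CHAR_LM = ExperimentTemplate("char_lm", "tinyshakespeare", "val_bpb", 1800, 1337)
CIFAR10 = ExperimentTemplate("cifar10", "cifar10", "val_error", 1800, 1337)
```

Freezing the dataclass matters: a template is a contract, so nothing downstream (orchestrator, worker, verifier) should be able to mutate it after admission.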
### Regional Orchestrator Nodes
Deploy local orchestrator nodes in each geographic region. Regional nodes prioritise low-latency clients, handle worker-to-task matching within their zone, and synchronise results with the central server. Reduces cross-region data transfer and improves scheduling responsiveness.
### Intelligent Orchestration
The orchestrator consults the global experiment corpus before designing new campaigns. Historical results inform which hyperparameter regions to explore, which architectures to prioritise, and how to allocate compute across tasks. Each campaign builds on the accumulated knowledge of every previous campaign.
## Phase 4: Economic Layer (Planned)
This layer is introduced only after the research platform has proven its value. It must be legally sound and operationally justified before any monetisation.
### Credits Marketplace
Workers earn credits for completed, verified tasks. Researchers spend credits to submit campaigns. The economics agent manages pricing based on supply and demand, detects fraud, and ensures fair distribution. Credits reflect actual compute contributed: GPU time is worth more than CPU time.
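The "GPU time is worth more than CPU time" rule comes down to a tier multiplier applied to verified compute time. A minimal sketch — the tiers, multipliers, and base rate are illustrative assumptions, and real pricing would be set dynamically by the economics agent:

```python
# Illustrative hardware-tier multipliers, not actual platform pricing.
TIER_MULTIPLIER = {"cpu": 1.0, "gpu_consumer": 4.0, "gpu_datacenter": 10.0}

def credits_for_task(wall_seconds, tier, verified, base_rate=0.01):
    """Credits for one completed task: compute time x hardware tier.

    Unverified results earn nothing — payment is gated on the Phase 2
    verification layer, which is the main anti-fraud lever.
    """
    if not verified:
        return 0.0
    return round(wall_seconds * base_rate * TIER_MULTIPLIER[tier], 4)
```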
### Monetisation
Large research projects and enterprise labs can purchase worker time for high-throughput experimentation. The acquisition thesis is not “replace datacentres” — it is “massively accelerate the experiment loop that informs what to train in the datacentres.” Every major lab runs thousands of small experiments before committing to a large training run. That pre-cluster experimentation phase is the sweet spot.
### Central Ledger with Cryptographic Audit
A centralised append-only ledger with cryptographic task receipts, worker attestations, and audit trails. Provides 95% of the trust guarantees of a distributed blockchain with a fraction of the complexity. Blockchain becomes relevant only if the platform requires trustless settlement between parties who do not trust a central operator.
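The core mechanism is a hash chain: each entry commits to the hash of the previous one, so any retroactive edit invalidates every later receipt. A minimal sketch of that idea (class and field names are illustrative; a real ledger would add worker signatures and persistence):

```python
import hashlib
import json

class AuditLedger:
    """Append-only ledger where each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []
        self._prev = self.GENESIS

    def append(self, record):
        """Append a task receipt and return its chained hash."""
        payload = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((self._prev + payload).encode()).hexdigest()
        self.entries.append({"record": record, "prev": self._prev, "hash": digest})
        self._prev = digest
        return digest

    def verify(self):
        """Re-derive the whole chain; any tampered record breaks verification."""
        prev = self.GENESIS
        for entry in self.entries:
            payload = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True
```

Publishing the latest chain head periodically (even just to a public log) is what lets outside parties audit the operator without a blockchain.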
### Federated Research Programmes
Multiple researchers contribute to shared long-term research goals (e.g., “find the optimal small transformer architecture for character-level language modelling”). Programmes coordinate campaigns across research groups and accumulate results into shared knowledge.
### Agent-Driven Code Modification
Following the autoresearch pattern more deeply: agents propose modifications to training scripts based on experiment results. Given a series of outcomes, the orchestrator suggests architectural changes, new regularisation techniques, or training procedure modifications and generates the code to test them.
## Long-Term: The Global Experiment Corpus
The real moat is not the compute network. It is the experiment database.
Every completed task produces a verified (seed, config, metric) tuple. Over time, that accumulates into a massive structured dataset of “what works in ML.” That corpus cannot be replicated by simply spinning up more GPUs — it represents institutional knowledge at network scale.
Imagine a dataset that answers questions like:
- Across 40 million experiments, which optimiser schedules consistently outperform others for small transformer models?
- How does context length interact with model depth across dozens of hardware tiers?
- What architectural patterns produce the best token efficiency under strict compute budgets?
Most research knowledge today is fragmented across papers, private lab notebooks, and unpublished results. The failures, near-misses, and surprising parameter combinations that drive real scientific progress mostly vanish into internal systems. XeeNet captures that negative space.
The ultimate vision:
- Millions of volunteer devices run bounded ML experiments continuously
- Research programmes self-direct based on accumulated results and lessons
- A global experiment corpus captures insights across all experiments — successes and failures alike
- Researchers launch campaigns that build on the accumulated knowledge of every previous campaign
- A sustainable credits economy incentivises long-term participation
- Results are publicly available, advancing open ML research
SETI@home proved that volunteers will donate compute for science: over its lifetime it attracted more than 5 million participants, sustaining hundreds of teraFLOPS of donated compute at its peak. XeeNet applies the same model to ML research: instead of searching for extraterrestrial signals, we search for optimal neural network architectures — using workloads that are embarrassingly parallel, naturally bounded, and immediately verifiable. The compute requirements are similar; the scientific payoff is immediate and measurable.
## Risks and Constraints
These are the dragons in the cave. Each must be addressed deliberately as the platform grows through the phases above.
| Risk | Phase | Mitigation |
|---|---|---|
| Trust and sandboxing | 2 | Constrained job formats, pre-approved templates, containerised execution, no arbitrary filesystem or network access. The autoresearch contract is itself the primary sandbox. |
| Result integrity | 2 | k-of-n redundant execution, cross-worker metric comparison within tolerance, anomaly detection. First-class verification, not an afterthought. |
| Malicious workloads | 3 | Workload admission policy engine, deterministic runtime constraints, no arbitrary internet access from tasks, pre-approved execution templates only. |
| Dataset privacy | 3 | Public datasets only in early phases. Proprietary data support requires differential privacy, secure enclaves, and data governance review. |
| Cold start problem | 2–3 | Do not build a marketplace first. Build a working network: seed worker fleet, own workloads, prove throughput, then invite external workers, then external researchers. |
| Consumer trust optics | 2 | Signed binaries, reproducible runtimes, transparent code, public security review, explicit resource controls, hard caps on power / time / bandwidth. |
| Regulatory exposure | 4 | If credits become redeemable or cash-equivalent, financial regulation applies. Start with non-cash reputation scoring; add grants, prizes, or sponsorships before direct financial settlement. Professional legal review before any tokenisation. |
| Hardware heterogeneity | 2–3 | Resource profiles as first-class scheduling inputs, task parameterisation by device tier, regional orchestrators for latency-aware matching. |
| Network economics | 3 | Bounded experiments minimise data transfer by design. Small configs in, single metric out. Datasets cached locally on workers, not streamed per task. |