Vision & Market Opportunity
Why distributed autoresearch is the next frontier in ML, and how XeeNet positions itself at the intersection of volunteer computing and autonomous experimentation.
The Scale Problem in ML Research
Machine learning research is fundamentally compute-bound. A single hyperparameter sweep across architecture choices, learning rates, and training schedules can require hundreds or thousands of GPU-hours. Academic labs wait weeks for cluster time. Startups burn through cloud budgets. Individual researchers are locked out entirely.
Meanwhile, billions of dollars' worth of consumer compute sits idle. The average gaming PC is active less than 4 hours a day. Enterprise workstations sit powered on but unused overnight. Every one of these machines has a CPU capable of training small neural networks, and many have GPUs that rival data centre hardware from just a few years ago.
XeeNet bridges this gap: researchers get affordable, elastic compute for ML experiments, and device owners put their idle hardware to productive use.
The Autoresearch Revolution
Andrej Karpathy's autoresearch project demonstrated a powerful insight: ML experiments can be fully autonomous. A training script runs for a fixed compute budget, reports a single metric, and an agent decides what to try next. No human in the loop during execution. No interactive debugging sessions. Just bounded, reproducible experiments that produce comparable results.
This changes the economics of ML research fundamentally. If every experiment is:
- Self-contained (one script, no external dependencies beyond PyTorch)
- Bounded (fixed time budget, guaranteed termination)
- Comparable (single metric like `val_bpb` across all runs)
- Reproducible (seeded RNG, deterministic configs)
...then experiments become perfectly distributable. Any machine, anywhere in the world, can run any experiment and produce a valid result. This is the core insight behind XeeNet.
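The four properties above can be sketched as a single function signature. This is an illustrative stand-in, not XeeNet's actual API: the names (`run_experiment`, `max_steps`) and the toy "training step" are invented for the example, and a real script would train a model and report `val_bpb`.

```python
import random
import time

def run_experiment(config: dict, seed: int, budget_seconds: float) -> dict:
    """One self-contained, bounded, comparable, reproducible experiment."""
    random.seed(seed)                          # seeded RNG -> reproducible
    deadline = time.monotonic() + budget_seconds
    metric = float("inf")
    step = 0
    # Fixed time budget -> guaranteed termination, even if training stalls.
    while time.monotonic() < deadline:
        # Stand-in for a real training step; a real script would update a
        # model here and evaluate its validation metric.
        metric = 1.0 / (1 + step) + random.random() * 1e-3
        step += 1
        if step >= config.get("max_steps", 1000):
            break
    # A single scalar metric makes runs comparable across any hardware.
    return {"config": config, "seed": seed, "val_bpb": metric, "steps": step}

result = run_experiment({"lr": 3e-4, "max_steps": 50}, seed=0, budget_seconds=1.0)
```

Because the function depends only on `(config, seed, budget)`, any worker that evaluates it produces an equally valid result, which is what makes the work distributable.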
Market Landscape
Volunteer Computing
SETI@home proved the model at massive scale: 5.2 million participants contributing 27 PetaFLOPS at peak. Folding@home exceeded an ExaFLOP during COVID-19 protein folding efforts, briefly becoming the world's most powerful computing system. BOINC continues to power dozens of scientific projects across astrophysics, mathematics, and biology.
These projects demonstrate sustained volunteer willingness to donate compute for science. But none target ML research, where the compute demand is growing exponentially and the workload characteristics (bounded training runs, GPU acceleration) are an excellent fit for distributed execution.
Distributed ML Compute
Commercial distributed GPU marketplaces (Vast.ai, RunPod, Lambda) focus on renting full machines at hourly rates. They target ML engineers who need dedicated instances for days or weeks. This leaves a massive gap: researchers who need thousands of short experiments (minutes each) rather than a few long ones.
XeeNet occupies a different niche entirely. Tasks are short (seconds to minutes), self-contained, and fault-tolerant. A worker can disconnect mid-task and the work is simply reassigned. This makes volunteer hardware viable in a way that long-running training jobs never could be.
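One way to picture the reassignment model is a lease-based queue: a task handed to a worker carries an expiry, and if no result arrives before the lease runs out, the task simply goes back in the queue. This is a minimal sketch under assumed names (`TaskQueue`, `lease_seconds`), not XeeNet's scheduler.

```python
import time
from collections import deque

class TaskQueue:
    """Hand out short, idempotent tasks; reassign any whose worker vanishes."""

    def __init__(self, tasks, lease_seconds=120.0):
        self.pending = deque(tasks)
        self.leased = {}                  # task -> lease expiry time
        self.lease_seconds = lease_seconds

    def assign(self, now=None):
        """Give the next task to a worker, reclaiming expired leases first."""
        now = time.monotonic() if now is None else now
        for task, expiry in list(self.leased.items()):
            if expiry <= now:             # worker disconnected mid-task
                del self.leased[task]
                self.pending.append(task) # reassign; no checkpoint needed
        if not self.pending:
            return None
        task = self.pending.popleft()
        self.leased[task] = now + self.lease_seconds
        return task

    def complete(self, task):
        """A result arrived; retire the lease."""
        self.leased.pop(task, None)
```

Because tasks are minutes long and idempotent, losing a worker costs at most one lease period of wasted work, which is why no checkpointing machinery is needed.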
The Opportunity
| Factor | Existing Solutions | XeeNet |
|---|---|---|
| Target user | ML engineers with budgets | Researchers, academics, independents |
| Compute source | Dedicated GPU clusters / cloud | Volunteer idle hardware (any device) |
| Task duration | Hours to days | Seconds to minutes |
| Cost model | Per-hour rental ($0.50-$4/GPU-hr) | Credits economy (contribute to earn) |
| Setup friction | SSH, Docker, driver config | Zero: download, click Start |
| Fault tolerance | Checkpointing (complex) | Inherent: tasks are short and idempotent |
Scale Economics
The economics of XeeNet improve with scale in ways that centralised compute cannot match:
Supply-Side Dynamics
Every new worker that joins the network adds compute capacity at zero marginal cost to the platform. Workers bear their own electricity and hardware costs, but since these machines are already powered on and idle, the incremental cost is near zero. A gaming PC drawing 50W idle versus 150W under load costs the owner roughly $0.01/hour in additional electricity to run ML training. That is 10-100x cheaper than cloud GPU pricing.
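The electricity arithmetic above can be checked directly. The $0.13/kWh price is an assumed US-typical figure, not a number from the text; the wattages and cloud rates come from this document.

```python
# Incremental cost of running ML work on an already-idle gaming PC.
idle_watts = 50
load_watts = 150
price_per_kwh = 0.13            # assumption: rough US-average electricity price

extra_kw = (load_watts - idle_watts) / 1000   # 0.1 kW of additional draw
cost_per_hour = extra_kw * price_per_kwh      # ~$0.013/hour to the owner

# Cloud GPU rental rates quoted in the comparison table.
cloud_low, cloud_high = 0.50, 4.00            # $/GPU-hr
ratio_low = cloud_low / cost_per_hour         # tens of times cheaper
ratio_high = cloud_high / cost_per_hour       # hundreds of times cheaper
```

At these assumed rates the gap works out to tens-to-hundreds of times, consistent with the 10-100x order of magnitude claimed above.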
Demand-Side Dynamics
Researchers benefit from elastic scaling that no fixed cluster can provide. A hyperparameter sweep of 1,000 configurations can run across 500 workers in parallel, completing in the time it takes to run two experiments on a single machine. The autoresearch contract (fixed budget, single metric) means results are directly comparable regardless of which worker ran them.
Network Effects
More workers attract more researchers (faster results). More researchers attract more workers (more credits to earn, more impactful science). This creates a virtuous cycle where the platform becomes more valuable to both sides as it grows. Unlike cloud compute, where every added unit of capacity carries a proportional cost, XeeNet's distributed model adds capacity with adoption at near-zero marginal cost to the platform.
A modest network of 10,000 workers, each contributing 2 hours of idle time daily, provides 20,000 compute-hours per day. At an average of 6 experiments per hour per worker, that is 120,000 experiments per day. A hyperparameter sweep that would take a single GPU months can complete in hours.
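The capacity figures above follow directly from the stated assumptions, made explicit here:

```python
# Network-scale arithmetic from the paragraph above.
workers = 10_000
idle_hours_per_day = 2
experiments_per_worker_hour = 6

compute_hours_per_day = workers * idle_hours_per_day              # 20,000
experiments_per_day = (
    compute_hours_per_day * experiments_per_worker_hour           # 120,000
)

# The same day's output on a single machine at the same experiment rate:
single_machine_days = experiments_per_day / (
    experiments_per_worker_hour * 24
)                                                                 # ~833 days
```

One day of this modest network equals over two years of round-the-clock work on a single machine at the same per-hour rate.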
The Credits Economy
XeeNet uses an internal credits system to balance supply and demand:
- Workers earn credits by completing tasks. Credit value scales with compute contributed: GPU tasks earn more than CPU tasks, longer budgets earn proportionally more.
- Researchers spend credits to submit experiment campaigns. Pricing reflects the actual compute required.
- Bootstrap mechanism: New users receive starter credits to run their first campaign, experiencing the platform before contributing compute.
- Anti-fraud: The economics agent monitors for fabricated results. Random verification re-runs catch workers submitting fake metrics. Repeat offenders lose credits and are excluded from the network.
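The earn/spend symmetry described above can be sketched in a few lines. The specific multipliers and starter-credit amount are invented for illustration; only the structure (credits scale with budget, GPU work earns a premium, researchers pay what contributors earn) comes from the text.

```python
GPU_MULTIPLIER = 4.0      # assumption: GPU tasks earn more than CPU tasks
STARTER_CREDITS = 100.0   # assumption: bootstrap grant for new users

def credits_earned(budget_seconds: float, used_gpu: bool) -> float:
    """Longer budgets earn proportionally more; GPU work earns a premium."""
    base = budget_seconds / 60.0          # e.g. 1 credit per CPU-minute
    return base * (GPU_MULTIPLIER if used_gpu else 1.0)

def campaign_cost(n_experiments: int, budget_seconds: float, gpu: bool) -> float:
    """Pricing reflects the compute required: researchers pay what the
    same work would earn its contributors."""
    return n_experiments * credits_earned(budget_seconds, gpu)
```

Because the same function prices both sides, credits issued to workers and credits spent by researchers balance, and the platform captures no rent on the compute itself.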
This two-sided market creates sustainable incentives. Contributors are rewarded for honest participation. Researchers access compute that scales with the network. The platform captures no rent on the underlying compute itself.
Use Cases
Academic Research
University labs can run large-scale hyperparameter sweeps without waiting for shared cluster allocations. Graduate students get access to distributed compute for their thesis experiments.
Architecture Search
Explore neural network architecture choices (depth, width, attention patterns) across thousands of configurations simultaneously. Find optimal designs for specific tasks and compute budgets.
Reproducibility Studies
Re-run published experiments across diverse hardware to verify claims. Seeded configs and standardised metrics make large-scale reproducibility testing practical for the first time.
Open Science
All experiment results flow into a shared database. The global research community benefits from every run, building collective knowledge about what works and what does not.