architecture.py and a custom training.py loop. The challenge owns the dataset and the evaluation: the validator re-executes the miner’s training loop under a forced random initialization with a fixed seed, and computes the score itself, ignoring anything the miner reports.
This page is a high-level overview. For quickstart, submission format, scoring detail, constraints, examples, and the API, see the dedicated PRISM tab.
PRISM tab
Quickstart, submission format, scoring, constraints, and the PRISM API.
How a submission flows
Submit two scripts
A miner submits a bundle:
architecture.py exposes build_model(ctx) and training.py exposes train(ctx).
LLM hard gate
An OpenRouter LLM reviews both scripts as a hard gate and can reject before any GPU work.
Forced-init re-execution
The validator re-executes the training loop on a GPU under a forced random init on the locked FineWeb-Edu train split.
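As a rough illustration of why forced initialization makes smuggled weights inert, the toy harness below overwrites whatever parameters build_model(ctx) returns with values drawn from a fixed seed before calling train(ctx). It is only a sketch under stated assumptions: the ctx.model field, the dictionary-of-lists parameter format, the Gaussian init, and the helper names forced_random_init and validator_run are hypothetical and are not taken from the PRISM code.

```python
import random
from types import SimpleNamespace


def forced_random_init(params, seed):
    """Overwrite every parameter with values drawn from a seeded RNG.

    `params` maps a parameter name to a flat list of floats (a toy stand-in
    for real tensors). Names are visited in sorted order so the result
    depends only on the seed and the parameter shapes.
    """
    rng = random.Random(seed)
    for name in sorted(params):
        params[name] = [rng.gauss(0.0, 0.02) for _ in params[name]]
    return params


# --- hypothetical miner scripts -------------------------------------------
def build_model(ctx):
    # architecture.py: a miner might try to ship pretrained values here.
    return {"embed": [9.9] * 4, "head": [7.7] * 2}


def train(ctx):
    # training.py: placeholder loop; a real one would update ctx.model.
    return ctx.model


# --- hypothetical validator side ------------------------------------------
def validator_run(seed):
    ctx = SimpleNamespace()  # assumed context object, fields are illustrative
    ctx.model = forced_random_init(build_model(ctx), seed)
    return train(ctx)


run_a = validator_run(seed=1234)
run_b = validator_run(seed=1234)
```

In this sketch the values returned by build_model never reach train, and two runs with the same seed start from identical parameters, which is the property the re-execution step relies on.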
Scoring basis
The score is a prequential (online) compression metric in bits-per-byte (bpb): the area under the from-scratch loss curve, normalized by the raw bytes of text consumed. A model that learns faster compresses the stream better and earns a better score.
Anti-cheat by construction
- No pretrained weights: forced random init makes smuggled weights inert, and the container runs with no network.
- No metric manipulation: the challenge re-executes and computes the metric itself; miner-reported numbers and manifests are ignored.
- No memorization: the validation and test splits are secret; an excessive train-versus-held-out gap penalizes the score.
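The prequential bits-per-byte metric described under "Scoring basis" can be sketched as follows. This is a minimal illustration, not the validator's implementation: it assumes each training step reports the summed cross-entropy loss in nats on a batch, measured before the model updates on that batch, together with the number of raw bytes in the batch. The function name prequential_bpb and the input format are hypothetical, and the sketch does not model the train-versus-held-out gap penalty.

```python
import math


def prequential_bpb(steps):
    """Accumulate online loss over the stream and normalize by bytes.

    `steps` is an iterable of (summed_loss_nats, num_bytes) pairs, one per
    batch, where the loss is measured before the update on that batch.
    Dividing nats by ln(2) converts them to bits.
    """
    total_nats = 0.0
    total_bytes = 0
    for summed_loss_nats, num_bytes in steps:
        total_nats += summed_loss_nats
        total_bytes += num_bytes
    if total_bytes == 0:
        raise ValueError("no bytes consumed")
    return total_nats / (math.log(2) * total_bytes)


LN2 = math.log(2)

# Two toy runs over the same three 1000-byte batches (losses given in bits,
# converted to nats). The fast learner's loss drops sooner.
fast_learner = [(8000 * LN2, 1000), (4000 * LN2, 1000), (2000 * LN2, 1000)]
slow_learner = [(8000 * LN2, 1000), (7000 * LN2, 1000), (6000 * LN2, 1000)]

fast_bpb = prequential_bpb(fast_learner)
slow_bpb = prequential_bpb(slow_learner)
```

Both toy runs start from the same loss, but the run whose loss falls sooner accumulates less area under the curve and ends with a lower bpb, which is the sense in which learning faster compresses the stream better.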
Related
Agent Challenge
The other primary challenge on BASE.
All challenges
Every challenge running on the subnet.
Source
The challenge lives in its own repository: BASE/prism.