API documentation

The CAPA prediction backend

A FastAPI service that turns HLA typing strings and clinical covariates into competing-risk survival curves. Hosted on HuggingFace Spaces — no API key required.


Overview

CAPA's backend is a FastAPI application that loads a trained PyTorch model on startup and exposes a REST API for inference. It runs inside Docker on HuggingFace Spaces — always-on, CPU-only, free tier.

The model pipeline: donor and recipient HLA alleles are resolved to protein sequences, embedded with a frozen ESM-2 protein language model (or looked up from a pre-built embedding cache), passed through a cross-attention interaction network, and decoded by a DeepHit competing-risks survival head that jointly predicts the time-to-event distribution for GvHD, relapse, and TRM.
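To make the final decoding step concrete, here is a minimal sketch of one common way a DeepHit-style head turns raw logits into CIFs: a single softmax over every (event, time bin) pair plus one extra "event-free" slot, followed by a running sum along time. The function name `deephit_cifs`, the extra slot, and the toy logits are assumptions for illustration and are not taken from CAPA's code.

```python
import math
from itertools import accumulate

EVENTS = ("gvhd", "relapse", "trm")


def deephit_cifs(logits, n_bins):
    """Convert flat logits of length len(EVENTS) * n_bins + 1 into per-event CIFs.

    The final logit is a hypothetical "no event within the horizon" slot.
    """
    expected = len(EVENTS) * n_bins + 1
    if len(logits) != expected:
        raise ValueError(f"expected {expected} logits, got {len(logits)}")
    # Numerically stable softmax over the joint (event, time) distribution.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    cifs = {}
    for k, name in enumerate(EVENTS):
        pmf = probs[k * n_bins:(k + 1) * n_bins]
        # A CIF is the running sum of that event's probability mass over time.
        cifs[name] = list(accumulate(pmf))
    return cifs


# Toy input: uniform logits, 4 time bins instead of the API's 100.
cifs = deephit_cifs([0.0] * (3 * 4 + 1), n_bins=4)
print(cifs["gvhd"])
```

Because all events share one softmax, the three final CIF values can never sum to more than 1, which is what makes the risks "competing".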

Live endpoint

Base URL: https://coconutmocha-capa.hf.space
No authentication. CORS is open (*). Rate limiting may apply under heavy load.

What the model returns

For every prediction you get three cumulative incidence functions (CIFs), one per competing event, evaluated at 100 evenly spaced time points from 0 to 730 days post-transplant. Each CIF is a monotone non-decreasing curve in [0, 1]. The scalar risk_score is simply the last CIF value, cif[-1]: the estimated probability of the event occurring within two years.
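The following sketch reproduces the time grid described above and checks the two CIF properties (range and monotonicity). The helper names and the toy curve are hypothetical, and rounding to two decimals is only for display.

```python
def time_grid(n_points=100, horizon_days=730.0):
    """Evenly spaced day values from 0 to horizon_days, endpoints included."""
    step = horizon_days / (n_points - 1)
    return [round(i * step, 2) for i in range(n_points)]


def is_valid_cif(cif):
    """True if every value lies in [0, 1] and the curve never decreases."""
    in_range = all(0.0 <= v <= 1.0 for v in cif)
    monotone = all(a <= b for a, b in zip(cif, cif[1:]))
    return in_range and monotone


grid = time_grid()
toy_cif = [0.0, 0.003, 0.05, 0.331]  # toy values, not model output
risk_score = toy_cif[-1]             # same rule the API uses: last CIF value
print(grid[:3], grid[-1], is_valid_cif(toy_cif), risk_score)
```

The spacing works out to 730 / 99, roughly 7.37 days between consecutive points.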

Quick start

The fastest way to try the API is a single curl call. You need at least one HLA locus for both donor and recipient.

Shell — check server health
curl https://coconutmocha-capa.hf.space/health
Shell — minimal prediction (3 mismatched loci)
curl -s -X POST https://coconutmocha-capa.hf.space/predict \
  -H "Content-Type: application/json" \
  -d '{
    "donor_hla":     { "A": "A*02:01", "B": "B*07:02", "DRB1": "DRB1*15:01" },
    "recipient_hla": { "A": "A*01:01", "B": "B*08:01", "DRB1": "DRB1*03:01" }
  }'
Python — full example with clinical covariates
import requests

payload = {
    "donor_hla": {
        "A": "A*02:01", "B": "B*07:02", "C": "C*07:02",
        "DRB1": "DRB1*15:01", "DQB1": "DQB1*06:02"
    },
    "recipient_hla": {
        "A": "A*01:01", "B": "B*08:01", "C": "C*07:01",
        "DRB1": "DRB1*03:01", "DQB1": "DQB1*02:01"
    },
    "clinical": {
        "age_recipient": 12,
        "age_donor": 34,
        "disease": "ALL",
        "conditioning": "MAC",
        "donor_type": "MUD",
        "stem_cell_source": "BM",
        "sex_mismatch": 0
    }
}

r = requests.post(
    "https://coconutmocha-capa.hf.space/predict",
    json=payload,
    timeout=30,
)
r.raise_for_status()
data = r.json()

print(f"GvHD 2-yr risk:    {data['gvhd']['risk_score']:.3f}")
print(f"Relapse 2-yr risk: {data['relapse']['risk_score']:.3f}")
print(f"TRM 2-yr risk:     {data['trm']['risk_score']:.3f}")
print(f"Mismatches:        {data['mismatch_count']}")
JavaScript (fetch)
const res = await fetch("https://coconutmocha-capa.hf.space/predict", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    donor_hla:     { A: "A*02:01", DRB1: "DRB1*15:01" },
    recipient_hla: { A: "A*01:01", DRB1: "DRB1*03:01" },
  }),
});
const data = await res.json();
console.log(data.gvhd.risk_score);  // e.g. 0.331

Endpoints

GET /health Liveness + readiness probe

Always returns HTTP 200. Check the ready field to know if the model loaded successfully. Use this before sending predictions in scripts to avoid silent 503s.
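A script can gate its predictions on the ready field with a small polling helper. This is a sketch using only the standard library; `fetch_health` and `wait_until_ready` are hypothetical names, and the retry count and delay are arbitrary choices.

```python
import json
import time
import urllib.request

BASE_URL = "https://coconutmocha-capa.hf.space"


def fetch_health(base_url=BASE_URL, timeout=10):
    """GET /health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.load(resp)


def wait_until_ready(fetch=fetch_health, attempts=5, delay_s=2.0):
    """Poll /health until ready is true; raise if it never becomes true."""
    last_error = None
    for attempt in range(attempts):
        health = fetch()
        if health.get("ready"):
            return health
        last_error = health.get("startup_error")
        if attempt < attempts - 1:
            time.sleep(delay_s)
    raise RuntimeError(f"model not ready: {last_error}")
```

Call `wait_until_ready()` once before the first POST. Polling helps while the Space is still starting; a missing or corrupt checkpoint will not fix itself, so the raised error carries startup_error for diagnosis.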

Response
{
  "status":         "ok",
  "model_version":  "model",
  "ready":          true,      // false if checkpoint missing or corrupt
  "startup_error":  null,      // human-readable error string if ready=false
  "uptime_seconds": 1382.4,
  "device":         "cpu"
}
POST /predict Single donor–recipient pair

Takes donor HLA typing, recipient HLA typing, and optional clinical covariates. Returns competing-risk CIF curves for GvHD, relapse, and TRM. Requires at least one HLA locus on each side; missing loci are filled with zero embeddings.

Returns 503 if the model checkpoint has not loaded; 422 if either the donor or the recipient has no HLA locus.
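The "at least one locus on each side" requirement can also be checked on the client before a request is sent. The helpers below are a hypothetical sketch, not part of the API; the server's own validation remains authoritative.

```python
LOCI = ("A", "B", "C", "DRB1", "DQB1", "DPB1")


def typed_loci(hla):
    """Return the locus names that carry a non-empty allele string."""
    return [locus for locus in LOCI if (hla or {}).get(locus)]


def check_predict_payload(payload):
    """Return a list of problems; an empty list means the payload looks sendable."""
    problems = []
    for side in ("donor_hla", "recipient_hla"):
        if not typed_loci(payload.get(side)):
            problems.append(f"{side} needs at least one HLA locus")
    return problems


payload = {"donor_hla": {"A": "A*02:01"}, "recipient_hla": {}}
print(check_predict_payload(payload))
```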

POST /compare Rank multiple donors for one recipient

Accepts a recipient and a list of 2–20 candidate donors with optional labels. Runs /predict for each pair and returns them ranked by composite acute-risk score (GvHD + TRM, lower is better). Useful for donor selection simulations.
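The ranking rule can be reproduced locally from per-donor risk scores. In this sketch the function name `rank_donors` and the numbers are made up; only the composite (GvHD risk + TRM risk, ascending) comes from the description above.

```python
def rank_donors(donors):
    """Sort donors by gvhd_risk + trm_risk (ascending) and assign 1-based ranks."""
    ordered = sorted(donors, key=lambda d: d["gvhd_risk"] + d["trm_risk"])
    return [{**d, "rank": i} for i, d in enumerate(ordered, start=1)]


# Toy risk values for illustration only.
candidates = [
    {"label": "Donor A", "gvhd_risk": 0.28, "trm_risk": 0.19},
    {"label": "Donor B", "gvhd_risk": 0.22, "trm_risk": 0.31},
    {"label": "Donor C", "gvhd_risk": 0.35, "trm_risk": 0.10},
]
ranked = rank_donors(candidates)
print([(d["rank"], d["label"]) for d in ranked])
```

Note that relapse risk does not enter the composite, so a donor with the lowest GvHD + TRM sum ranks first even if its relapse risk is higher.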

Request format

All endpoints accept and return JSON (Content-Type: application/json).

HLA typing object

All fields are optional strings. Supply whichever loci you have — the model runs on any subset. Use standard IMGT allele notation (A*02:01, not A2).

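| Field | Type | Example | Notes |
|-------|------|---------|-------|
| A | string | "A*02:01" | HLA-A allele |
| B | string | "B*07:02" | HLA-B allele |
| C | string | "C*07:02" | HLA-C allele |
| DRB1 | string | "DRB1*15:01" | HLA-DRB1 allele |
| DQB1 | string | "DQB1*06:02" | HLA-DQB1 allele |
| DPB1 | string | "DPB1*04:01" | HLA-DPB1 (optional 6th locus) |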
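A quick client-side format check can catch serological names such as A2 before they reach the API. The regular expression below is an assumption about what "standard notation" means here (locus, asterisk, two to four colon-separated numeric fields); the server's actual parsing rules are not documented on this page and may be stricter or looser.

```python
import re

# Assumed pattern: locus, "*", then two to four colon-separated numeric fields.
ALLELE_RE = re.compile(r"^(A|B|C|DRB1|DQB1|DPB1)\*\d{2,3}(:\d{2,3}){1,3}$")


def locus_of(allele):
    """Return the locus prefix of a well-formed allele string, else None."""
    match = ALLELE_RE.match(allele)
    return match.group(1) if match else None


def check_typing(hla):
    """Map each problematic field of an HLA typing object to a reason."""
    errors = {}
    for field, allele in hla.items():
        locus = locus_of(allele)
        if locus is None:
            errors[field] = "not in LOCUS*NN:NN form"
        elif locus != field:
            errors[field] = f"allele belongs to locus {locus}"
    return errors


print(check_typing({"A": "A2", "B": "B*07:02", "DRB1": "A*02:01"}))
```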

Clinical covariates object (optional)

All fields are optional. Missing values are imputed with zeros or the "unknown" category — the model will still run, just with less information.

| Field | Type | Example | Notes |
|-------|------|---------|-------|
| age_recipient | number | 12 | Years |
| age_donor | number | 34 | Years |
| cd34_dose | number | 5.2 | ×10⁶ cells/kg |
| sex_mismatch | 0 or 1 | 1 | 1 = donor/recipient sex differ |
| disease | string | "ALL" | ALL · AML · CML · MDS · NHL · HD · AA · MM · other |
| conditioning | string | "MAC" | MAC · RIC · NMA |
| donor_type | string | "MUD" | MSD · MUD · MMUD · haplo · cord |
| stem_cell_source | string | "BM" | BM · PBSC · cord |
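When covariates come from a spreadsheet or form, it helps to drop empty values and reject categories outside the listed sets before building the request. The builder below is a hypothetical sketch; how the server itself treats an unlisted category is not stated here, so rejecting it on the client is a conservative assumption.

```python
ALLOWED = {
    "disease": {"ALL", "AML", "CML", "MDS", "NHL", "HD", "AA", "MM", "other"},
    "conditioning": {"MAC", "RIC", "NMA"},
    "donor_type": {"MSD", "MUD", "MMUD", "haplo", "cord"},
    "stem_cell_source": {"BM", "PBSC", "cord"},
}
NUMERIC = ("age_recipient", "age_donor", "cd34_dose")


def build_clinical(**fields):
    """Build the clinical object, skipping None and validating known fields."""
    clinical = {}
    for name, value in fields.items():
        if value is None:
            continue  # omitted fields are imputed by the service
        if name in ALLOWED:
            if value not in ALLOWED[name]:
                raise ValueError(f"{name}={value!r} not in {sorted(ALLOWED[name])}")
        elif name in NUMERIC:
            if value < 0:
                raise ValueError(f"{name} must be non-negative")
        elif name == "sex_mismatch":
            if value not in (0, 1):
                raise ValueError("sex_mismatch must be 0 or 1")
        else:
            raise ValueError(f"unknown clinical field: {name}")
        clinical[name] = value
    return clinical


print(build_clinical(age_recipient=12, disease="ALL", cd34_dose=None))
```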

Full /predict request body

JSON schema
{
  "donor_hla": {          // required — at least one locus
    "A":    "A*02:01",
    "B":    "B*07:02",
    "C":    "C*07:02",
    "DRB1": "DRB1*15:01",
    "DQB1": "DQB1*06:02"
  },
  "recipient_hla": {      // required — at least one locus
    "A":    "A*01:01",
    "B":    "B*08:01",
    "C":    "C*07:01",
    "DRB1": "DRB1*03:01",
    "DQB1": "DQB1*02:01"
  },
  "clinical": {            // optional — all sub-fields optional
    "age_recipient":    12,
    "age_donor":        34,
    "disease":          "ALL",
    "conditioning":     "MAC",
    "donor_type":       "MUD",
    "stem_cell_source": "BM",
    "sex_mismatch":     0
  }
}

Response format

/predict response

JSON
{
  "gvhd": {
    "cumulative_incidence": [0.0, 0.003, …, 0.331],  // 100 values, 0–730 days
    "risk_score":          0.331,                       // = CIF at day 730
    "time_points":         [0.0, 7.37, …, 730.0]        // 100 day values
  },
  "relapse": { /* same shape */ },
  "trm":     { /* same shape */ },
  "attention_weights": [                               // n_loci × n_loci matrix
    [0.42, 0.12, 0.11, 0.21, 0.14],
    …
  ],
  "mismatch_count": 3,                                // allele-level loci mismatches
  "model_version":  "model"
}

| Field | Type | Description |
|-------|------|-------------|
| gvhd / relapse / trm | object | Competing-risk event block: CIF array, risk score, time points |
| cumulative_incidence | float[100] | Monotone CIF values in [0, 1] at each of the 100 time points |
| risk_score | float | 2-year cumulative incidence probability (CIF at day 730) |
| time_points | float[100] | Day values from 0 to 730 corresponding to each CIF entry |
| attention_weights | float[][] or null | n_loci × n_loci cross-attention matrix (donor→recipient, last layer) |
| mismatch_count | int | Number of loci where donor and recipient alleles differ |
| model_version | string | Identifier of the loaded checkpoint |
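The response only reports risk at the 100 grid points, so reading off a value at another day (day 100, say) needs interpolation. The sketch below uses linear interpolation between neighbouring grid points, which is an assumption on the client side and not something the API defines; `cif_at_day` is a hypothetical helper.

```python
from bisect import bisect_right


def cif_at_day(block, day):
    """Linearly interpolate an event block's CIF at an arbitrary day."""
    times = block["time_points"]
    cif = block["cumulative_incidence"]
    if not times[0] <= day <= times[-1]:
        raise ValueError(f"day must be within [{times[0]}, {times[-1]}]")
    hi = bisect_right(times, day)  # index of the first grid point after `day`
    if hi == len(times):
        return cif[-1]             # exactly at the horizon
    lo = hi - 1
    frac = (day - times[lo]) / (times[hi] - times[lo])
    return cif[lo] + frac * (cif[hi] - cif[lo])


# Toy block with a three-point grid; a real response has 100 points.
toy = {"time_points": [0.0, 10.0, 20.0], "cumulative_incidence": [0.0, 0.1, 0.3]}
print(cif_at_day(toy, 15.0))
```

With a real response, `cif_at_day(data["gvhd"], 100)` would give an interpolated day-100 GvHD estimate under the same assumption.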

/compare response

Returns all donors ranked by ascending composite score (GvHD risk + TRM risk). The best donor is first and identified by best_donor_label.

JSON
{
  "donors": [
    {
      "label":          "Donor A",
      "rank":           1,
      "gvhd_risk":      0.28,
      "relapse_risk":   0.31,
      "trm_risk":       0.19,
      "mismatch_count": 1,
      "full_prediction": { /* full /predict response */ }
    },
    …
  ],
  "best_donor_label": "Donor A",
  "model_version":    "model"
}

Run locally

Two options: Docker (self-contained, matches the HF Space exactly) or uv (faster iteration during development).

Option A — Docker

  1. Clone the backend repo

    The Docker image lives in the HuggingFace Space repository, not the main GitHub repo.

    git clone https://huggingface.co/spaces/coconutmocha/capa capa-backend
    cd capa-backend
  2. Build the image
    docker build -t capa-backend .
  3. Run the container
    docker run -p 7860:7860 capa-backend

    The server starts on http://localhost:7860. Visit /health to confirm it's up.

Option B — uv (development)

  1. Clone the backend repo and install
    git clone https://huggingface.co/spaces/coconutmocha/capa capa-backend
    cd capa-backend
    uv sync

    Requires Python 3.11+ and uv. Install uv with curl -LsSf https://astral.sh/uv/install.sh | sh.

  2. Checkpoint is already included

    The repo ships with a bundled checkpoint at runs/best/model.pt. The server finds it automatically — no extra config needed. To override, set CAPA_CHECKPOINT to any valid .pt path.

  3. Start the server
    uv run uvicorn capa.api.predict:app \
      --reload --host 0.0.0.0 --port 8000

    The API is now at http://localhost:8000. The --reload flag restarts on code changes.

  4. Point the frontend at your local server

    Edit web/config.js and change the apiUrl:

    window.CAPA_CONFIG = {
      apiUrl: 'http://localhost:8000'
    };

    Then open web/predict.html in a browser — the prediction UI now calls your local server.

Embedding cache

Without a pre-built HDF5 embedding cache the model uses zero vectors for unknown alleles and logs a warning. This is fine for smoke-testing. To build the cache with real ESM-2 embeddings, run uv run python scripts/preprocess.py — this downloads IPD-IMGT/HLA sequences and runs ESM-2 inference (requires ~4 GB RAM and takes a few minutes on CPU).
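The fallback behaviour described above amounts to a lookup with a zero-vector default. The sketch below uses a plain dict in place of the HDF5 file, and `EMBED_DIM`, the logger name, and `lookup_embedding` are all hypothetical; the real embedding width depends on the ESM-2 variant used.

```python
import logging

logger = logging.getLogger("capa.sketch")
EMBED_DIM = 8  # placeholder width; the real value depends on the ESM-2 variant


def lookup_embedding(allele, cache):
    """Return the cached vector for an allele, or zeros with a warning."""
    vector = cache.get(allele)
    if vector is None:
        logger.warning("no cached embedding for %s; using a zero vector", allele)
        return [0.0] * EMBED_DIM
    return vector


cache = {"A*02:01": [0.5] * EMBED_DIM}  # stand-in for the HDF5 cache
print(lookup_embedding("A*02:01", cache)[:2])
print(lookup_embedding("B*07:02", cache)[:2])
```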

Configuration

All runtime settings are controlled via environment variables. No config files to edit — set variables before starting the server or pass them to docker run -e.

| Variable | Default | Description |
|----------|---------|-------------|
| CAPA_CHECKPOINT | runs/best/model.pt | Absolute or relative path to the .pt checkpoint file. |
| CAPA_EMBED__CACHE_PATH | data/processed/hla_embeddings.h5 | HDF5 embedding cache. If absent, zero vectors are used for all alleles. |
| CAPA_EMBED__DEVICE | cpu | PyTorch device for ESM-2 inference: cpu, cuda, or mps. |
| CAPA_CORS_ORIGINS | * | Comma-separated list of allowed CORS origins, e.g. https://your-frontend.vercel.app. |
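As an illustration of how these variables and defaults combine, the sketch below resolves them from an environment mapping and splits the CORS list on commas. `load_settings` and the returned key names are hypothetical; the backend's own settings loader may work differently.

```python
import os

DEFAULTS = {
    "CAPA_CHECKPOINT": "runs/best/model.pt",
    "CAPA_EMBED__CACHE_PATH": "data/processed/hla_embeddings.h5",
    "CAPA_EMBED__DEVICE": "cpu",
    "CAPA_CORS_ORIGINS": "*",
}


def load_settings(env=None):
    """Resolve each variable from env, falling back to the documented default."""
    env = os.environ if env is None else env
    raw = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    # Split the comma-separated origin list and drop empty entries.
    origins = [o.strip() for o in raw["CAPA_CORS_ORIGINS"].split(",") if o.strip()]
    return {
        "checkpoint": raw["CAPA_CHECKPOINT"],
        "cache_path": raw["CAPA_EMBED__CACHE_PATH"],
        "device": raw["CAPA_EMBED__DEVICE"],
        "cors_origins": origins,
    }


print(load_settings({"CAPA_EMBED__DEVICE": "cuda"}))
```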

Switching between the live and local backend

The frontend reads web/config.js at runtime. Change apiUrl there — no rebuild needed, just refresh the page.

web/config.js
window.CAPA_CONFIG = {
  // Live HF Space (production)
  apiUrl: 'https://coconutmocha-capa.hf.space'

  // Local dev server
  // apiUrl: 'http://localhost:8000'
};