Overview
Each disease lives in app/data/diseases/{slug}/ and contains 14 JSON files — one per data type. TypeScript interfaces for all types are in src/types/disease.ts.
The three most common contributions are: adding a new source (a published paper), adding a data item (symptom, treatment, hypothesis, etc.), and adding ontology identifiers (HPO, DrugBank, UniProt codes).
Data model
Every disease directory contains these files:
| File | Content | Key fields |
|---|---|---|
| disease.json | Core disease entity | name, slug, description, epidemiology |
| sources.json | Published references | authors, title, year, evidenceGrade, refCode |
| symptoms.json | Disease symptoms | name, frequency, severity, category |
| pathway-molecules.json | Molecular signalling cascade | molecule, role, expressionChange, evidenceLevel |
| genetic-findings.json | Gene variants | gene, variant, variantType, significance |
| treatments.json | Treatment evidence | name, mechanism, responseRate, line, evidenceLevel |
| hypotheses.json | Scientific hypotheses | statement, status, evidenceScore, evidenceFor/Against |
| open-questions.json | Unsolved research questions | question, context, status |
| diagnostic-criteria.json | Diagnostic criteria sets | majorCriteria, minorCriteria, sensitivity |
| differential-diagnoses.json | Conditions to rule out | condition, keyDistinction, sharedFeatures |
| complications.json | Long-term risks | name, risk, monitoring |
| research-updates.json | Research update feed | updateType, impactLevel, title, summary |
| clinical-trials.json | Active and completed trials | nctId, title, phase, status |
| preprints.json | Recent preprints | doi, title, server, category |
All data items use a string id following the pattern: {disease-slug}-{type}-{descriptive-slug}.
Examples: schnitzler-syndrome-source-a2, schnitzler-syndrome-symptom-chronic-urticarial-rash, schnitzler-syndrome-treatment-anakinra
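The id pattern above can be sketched as a small helper. This is only an illustration: `slugify` and `buildItemId` are hypothetical names, not functions from the repository, and the slug rules (lowercase, hyphen-separated) are inferred from the examples above.

```typescript
// Hypothetical helpers, not part of the repository.
// Turns a human-readable label into a lowercase, hyphen-separated slug.
function slugify(label: string): string {
  return label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse any run of other characters into one hyphen
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
}

// Builds an id of the form {disease-slug}-{type}-{descriptive-slug}.
function buildItemId(diseaseSlug: string, type: string, label: string): string {
  return `${diseaseSlug}-${type}-${slugify(label)}`;
}

const symptomId = buildItemId("schnitzler-syndrome", "symptom", "Chronic urticarial rash");
// symptomId is "schnitzler-syndrome-symptom-chronic-urticarial-rash"
```

Because disease slugs themselves contain hyphens, an id cannot be split back into its parts unambiguously; build ids from known parts rather than parsing them.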
Add a source
Sources are the foundation — every data item traces back to a source via sourceRefCodes. Open sources.json for the relevant disease and add a new entry.
Required fields: id, type, authors, title, year, evidenceGrade, keyFindings, refCode.
{
  "id": "schnitzler-syndrome-source-b10",
  "type": "journal_article",
  "authors": ["Smith J", "Doe A", "et al."],
  "title": "Title of the publication",
  "year": 2024,
  "evidenceGrade": "B",
  "keyFindings": [
    "Finding 1",
    "Finding 2"
  ],
  "refCode": "B10",
  "journal": "Journal Name",
  "doi": "10.xxxx/xxxxx",
  "pmid": "12345678",
  "category": "pathogenesis",
  "studyType": "cohort_study",
  "sampleSize": 50
}
refCode convention
Each source gets a short refCode (e.g., A2, B7, G1) that other data items reference via their sourceRefCodes array. The letter loosely groups by topic area. Pick the next available code in the relevant letter group, or start a new letter if needed.
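One way to pick the next code in a letter group is sketched below. It assumes that "next available" means one more than the highest number already used for that letter (gaps are not reused); `nextRefCode` is a hypothetical helper, not something the repository provides.

```typescript
// Hypothetical helper: returns the next refCode for a letter group,
// assuming codes look like one uppercase letter followed by digits (A2, B7, G1).
function nextRefCode(existing: string[], letter: string): string {
  const pattern = /^([A-Z])(\d+)$/;
  let highest = 0;
  for (const code of existing) {
    const match = pattern.exec(code);
    if (match !== null && match[1] === letter) {
      highest = Math.max(highest, Number(match[2]));
    }
  }
  // A letter with no existing codes starts at 1.
  return `${letter}${highest + 1}`;
}

const nextInB = nextRefCode(["A2", "B7", "B9", "G1"], "B"); // "B10"
```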
Source types
Allowed type values: journal_article, clinical_guideline, case_report, systematic_review, meta_analysis, expert_opinion, registry_data
Categories
Allowed category values: diagnostics, epidemiology, pathogenesis, therapeutics, reviews, genetics, immunology, case_reports
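Before running the project validator, a source entry can be sanity-checked against the lists above. The sketch below is not the project's `npm run validate` implementation; `checkSource` is a hypothetical function that only covers the required fields and the allowed `type` and `category` values named in this guide.

```typescript
const REQUIRED_SOURCE_FIELDS = [
  "id", "type", "authors", "title", "year", "evidenceGrade", "keyFindings", "refCode",
] as const;

const SOURCE_TYPES: readonly string[] = [
  "journal_article", "clinical_guideline", "case_report", "systematic_review",
  "meta_analysis", "expert_opinion", "registry_data",
];

const SOURCE_CATEGORIES: readonly string[] = [
  "diagnostics", "epidemiology", "pathogenesis", "therapeutics",
  "reviews", "genetics", "immunology", "case_reports",
];

// Hypothetical checker: returns a list of human-readable problems (empty if none).
function checkSource(entry: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const field of REQUIRED_SOURCE_FIELDS) {
    if (entry[field] === undefined || entry[field] === null) {
      problems.push(`missing required field: ${field}`);
    }
  }
  const type = entry["type"];
  if (typeof type === "string" && SOURCE_TYPES.indexOf(type) === -1) {
    problems.push(`unknown type: ${type}`);
  }
  // category is optional here; it is only checked when present.
  const category = entry["category"];
  if (typeof category === "string" && SOURCE_CATEGORIES.indexOf(category) === -1) {
    problems.push(`unknown category: ${category}`);
  }
  return problems;
}
```

Treating `category` as optional is an assumption based on it being absent from the required-fields list.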
Add data items
Data items (symptoms, treatments, hypotheses, etc.) reference sources through sourceRefCodes.
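The link between data items and sources can be checked with a few lines of code. The following is a minimal sketch, assuming sources expose `refCode` and items expose `id` and an optional `sourceRefCodes` array as in the examples in this guide; `findDanglingRefCodes` is a hypothetical name.

```typescript
interface SourceLike { refCode: string }
interface ItemLike { id: string; sourceRefCodes?: string[] }
interface DanglingRef { itemId: string; refCode: string }

// Hypothetical helper: lists every sourceRefCodes entry that does not
// match the refCode of any source in sources.json.
function findDanglingRefCodes(sources: SourceLike[], items: ItemLike[]): DanglingRef[] {
  const known: Record<string, boolean> = {};
  for (const source of sources) {
    known[source.refCode] = true;
  }
  const dangling: DanglingRef[] = [];
  for (const item of items) {
    for (const code of item.sourceRefCodes ?? []) {
      if (known[code] !== true) {
        dangling.push({ itemId: item.id, refCode: code });
      }
    }
  }
  return dangling;
}
```

An empty result means every cited code resolves to a source in the same disease directory.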
Adding a symptom
{
  "id": "schnitzler-syndrome-symptom-lymphadenopathy",
  "name": "Lymphadenopathy",
  "frequency": "common",
  "frequencyPercent": 45,
  "severity": "minor",
  "description": "Palpable lymph node enlargement, usually non-tender.",
  "category": "lymphatic",
  "sourceRefCodes": ["A3", "A5"],
  "identifiers": {
    "hpo": "HP:0002716"
  }
}
Frequency values: universal, very_common, common, occasional, rare
Severity values: cardinal, major, minor
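The allowed values can be expressed as TypeScript literal unions so that typos are caught at compile time. The authoritative interfaces live in src/types/disease.ts and may differ; `SymptomSketch`, `isFrequency` and `isSeverity` below are hypothetical, and which fields are optional is a guess based on the example above.

```typescript
const FREQUENCIES = ["universal", "very_common", "common", "occasional", "rare"] as const;
const SEVERITIES = ["cardinal", "major", "minor"] as const;

type Frequency = (typeof FREQUENCIES)[number];
type Severity = (typeof SEVERITIES)[number];

// Hypothetical shape; optionality of fields is an assumption.
interface SymptomSketch {
  id: string;
  name: string;
  frequency: Frequency;
  severity: Severity;
  sourceRefCodes: string[];
  frequencyPercent?: number;
  description?: string;
  category?: string;
  identifiers?: { hpo?: string };
}

// Runtime guards for values read from JSON, where types are not enforced.
function isFrequency(value: unknown): value is Frequency {
  return typeof value === "string" && (FREQUENCIES as readonly string[]).indexOf(value) !== -1;
}

function isSeverity(value: unknown): value is Severity {
  return typeof value === "string" && (SEVERITIES as readonly string[]).indexOf(value) !== -1;
}

const lymphadenopathy: SymptomSketch = {
  id: "schnitzler-syndrome-symptom-lymphadenopathy",
  name: "Lymphadenopathy",
  frequency: "common",
  severity: "minor",
  sourceRefCodes: ["A3", "A5"],
};
```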
Adding a treatment
{
  "id": "schnitzler-syndrome-treatment-drug-name",
  "name": "Drug Name",
  "mechanism": "Mechanism of action",
  "route": "SC 100mg daily",
  "responseRate": "Description of response rate",
  "onset": "Hours–days",
  "line": "1st",
  "evidenceLevel": "green",
  "explanation": "Detailed explanation of how/why this treatment works.",
  "sourceRefCodes": ["G1", "G5"],
  "identifiers": {
    "drugbank": "DB00026",
    "rxnorm": "140587",
    "atc": "L04AC03"
  }
}
evidenceLevel values: green (strong evidence), amber (moderate), default (limited)
Adding a hypothesis
{
  "id": "schnitzler-syndrome-hypothesis-2",
  "statement": "Clear, testable hypothesis statement",
  "domain": "pathogenesis",
  "status": "leading",
  "evidenceScore": 55,
  "studyCount": 10,
  "evidenceFor": ["Supporting point 1", "Supporting point 2"],
  "evidenceAgainst": ["Counter-evidence 1"],
  "firstProposedYear": 2020,
  "firstProposedBy": "Author et al.",
  "sourceRefCodes": ["B7", "C4"]
}
Status values: leading, emerging, challenged, refuted
Ontology identifiers
Adding standardised ontology identifiers makes every data item interoperable with the broader biomedical ecosystem. We use an identifiers object on each item.
Symptoms — HPO
Add an identifiers.hpo field with the Human Phenotype Ontology code.
{
  "id": "schnitzler-syndrome-symptom-recurrent-fever",
  "name": "Recurrent fever",
  "identifiers": {
    "hpo": "HP:0001954"
  }
}
Treatments — DrugBank, RxNorm, ATC
Add identifiers.drugbank, identifiers.rxnorm, and/or identifiers.atc.
{
  "id": "schnitzler-syndrome-treatment-anakinra",
  "name": "Anakinra",
  "identifiers": {
    "drugbank": "DB00026",
    "rxnorm": "140587",
    "atc": "L04AC03"
  }
}
Molecules — UniProt
Add identifiers.uniprot and optionally identifiers.ncbiGene.
{
  "id": "schnitzler-syndrome-molecule-il-1",
  "molecule": "IL-1β",
  "identifiers": {
    "uniprot": "P01584",
    "ncbiGene": "3553"
  }
}
Genes — HGNC, Ensembl, ClinVar
Add identifiers.hgnc, identifiers.ensembl, and optionally identifiers.clinvar for specific variants.
{
  "id": "schnitzler-syndrome-gene-myd88-l265p-somatic",
  "gene": "MYD88",
  "identifiers": {
    "hgnc": "7562",
    "ensembl": "ENSG00000172936",
    "ncbiGene": "4615",
    "clinvar": "28173"
  }
}
Lookup APIs (all free, no auth)
| Ontology | API endpoint | ID format |
|---|---|---|
| HPO | https://ontology.jax.org/api/hp/search?q={term} | HP:\d{7} |
| DrugBank | https://go.drugbank.com/unearth/q?query={drug} | DB\d{5} |
| RxNorm | https://rxnav.nlm.nih.gov/REST/rxcui.json?name={drug} | Numeric |
| UniProt | https://rest.uniprot.org/uniprotkb/search?query={protein}&format=json | 6 or 10 alphanumeric characters, e.g. P01584 |
| HGNC | https://rest.genenames.org/search/{gene} | Numeric |
| MONDO | https://api.monarchinitiative.org/v3/search?q={disease} | MONDO:\d{7} |
| MeSH | https://id.nlm.nih.gov/mesh/lookup/term?label={term} | D\d{6,9} |
| ClinVar | https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=clinvar&term={variant} | Numeric |
| PubMed | https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=pubmed&id={pmid}&retmode=json | Numeric |
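The ID formats in the table lend themselves to a quick format check before an identifier is committed. The sketch below is illustrative only: `ID_PATTERNS`, `isWellFormedId` and `hpoSearchUrl` are hypothetical names, the object keys mirror the `identifiers` keys used in this guide, and the UniProt pattern is the accession pattern published by UniProt (covering 6- and 10-character accessions). A well-formed ID is not necessarily a real one, so still confirm it against the lookup API.

```typescript
// Format checks only; a matching string may still not exist in the ontology.
const ID_PATTERNS: Record<string, RegExp> = {
  hpo: /^HP:\d{7}$/,
  drugbank: /^DB\d{5}$/,
  rxnorm: /^\d+$/,
  uniprot: /^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$/,
  hgnc: /^\d+$/,
  mondo: /^MONDO:\d{7}$/,
  mesh: /^D\d{6,9}$/,
  clinvar: /^\d+$/,
};

// Returns false for unknown ontologies as well as for malformed values.
function isWellFormedId(ontology: string, value: string): boolean {
  if (!Object.prototype.hasOwnProperty.call(ID_PATTERNS, ontology)) {
    return false;
  }
  return ID_PATTERNS[ontology].test(value);
}

// Builds the HPO search URL from the table, escaping the search term.
function hpoSearchUrl(term: string): string {
  return `https://ontology.jax.org/api/hp/search?q=${encodeURIComponent(term)}`;
}
```

For example, `isWellFormedId("hpo", "HP:0001954")` passes, while `isWellFormedId("hpo", "HP:1954")` fails because the numeric part must have seven digits.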
Evidence grades
Every source must have an evidenceGrade:
| Grade | Meaning | Examples |
|---|---|---|
| A | RCT, meta-analysis, or systematic review | RCTs, systematic reviews, meta-analyses |
| B | Cohort study, case-control, or large case series | Prospective studies, large retrospective series, expert consensus guidelines |
| C | Case report or expert opinion | Individual case reports, narrative reviews, editorials |
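As a rough aid, the table can be turned into a lookup from study design to a suggested grade. This is only a sketch: `suggestGrade` is hypothetical, and of the keys below only `cohort_study`, `meta_analysis`, `systematic_review`, `case_report` and `expert_opinion` appear as values elsewhere in this guide; `rct`, `case_control` and `case_series` are invented labels for illustration. The suggestion never replaces reading the paper.

```typescript
type EvidenceGrade = "A" | "B" | "C";

const GRADE_BY_DESIGN: Record<string, EvidenceGrade> = {
  rct: "A",               // hypothetical key
  meta_analysis: "A",
  systematic_review: "A",
  cohort_study: "B",
  case_control: "B",      // hypothetical key
  case_series: "B",       // hypothetical key; the table says *large* case series
  case_report: "C",
  expert_opinion: "C",
};

// Returns undefined when the design is not covered, so the caller must decide.
function suggestGrade(design: string): EvidenceGrade | undefined {
  return Object.prototype.hasOwnProperty.call(GRADE_BY_DESIGN, design)
    ? GRADE_BY_DESIGN[design]
    : undefined;
}
```

Returning `undefined` rather than defaulting to a grade keeps the decision with the contributor for designs the table does not cover.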
Validate locally
Before submitting a PR, validate your changes from the app/ directory:
cd app
npm run validate # Checks JSON schema, required fields, referential integrity
npm run score # Calculates disease completeness score
Both commands must pass before your PR will be accepted.
Submit a PR
- Fork the repository
- Clone your fork and create a branch:
  git checkout -b add-source-smith-2024
- Edit the relevant JSON files in app/data/diseases/{slug}/
- Validate locally:
  cd app
  npm run validate && npm run score
- Commit with a descriptive message:
  git commit -m "Add Smith 2024 cohort study to Schnitzler sources"
- Push and open a pull request against main
PR templates
GitHub will show template options when you create the PR:
- Add Source — for new publications
- Add Data — for new symptoms, treatments, hypotheses, etc.
- Add Identifiers — for ontology mappings (HPO, DrugBank, UniProt, HGNC)
- Autoresearch Batch — for agent-generated research updates
For AI agents
If you're an AI agent that discovered Kipine through llms.txt or the JSON API:
Read the data
- GET /api/diseases/{slug}.json — complete disease object
- GET /api/diseases/{slug}/{table}.json — individual tables
- GET /api/diseases.json — index of all diseases with record counts
- GET /api/export/{slug}.jsonl — bulk export as newline-delimited JSON
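A client for these endpoints mostly needs to build paths and, for the bulk export, split newline-delimited JSON. The sketch below assumes that `{table}` is a data file name without its extension (for example `symptoms`); the host is not stated in this guide, so only paths are built. `diseaseApiPath` and `parseJsonl` are hypothetical helpers.

```typescript
// Builds the path for a whole disease object or, if a table is given, one table.
function diseaseApiPath(slug: string, table?: string): string {
  const safeSlug = encodeURIComponent(slug);
  return table === undefined
    ? `/api/diseases/${safeSlug}.json`
    : `/api/diseases/${safeSlug}/${encodeURIComponent(table)}.json`;
}

// Parses a .jsonl export: one JSON document per non-empty line.
function parseJsonl(text: string): unknown[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as unknown);
}

const symptomsPath = diseaseApiPath("schnitzler-syndrome", "symptoms");
// symptomsPath is "/api/diseases/schnitzler-syndrome/symptoms.json"
```

Prefix the path with the site's base URL and pass the result to an HTTP client such as `fetch`.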
Improve the data
Fork the repo, edit JSON files in app/data/diseases/{slug}/, run npm run validate && npm run score, and open a PR. Use the autoresearch-batch PR template and include before/after quality scores.
Constraints
- Never remove existing data without justification
- Every new fact must cite a source with a DOI or PMID
- Preserve existing IDs
- Follow the JSON schema — run npm run validate
- Verify PMIDs against PubMed before adding:
  GET https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=pubmed&id={pmid}&retmode=json
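The PMID check can be sketched as a URL builder plus a function that reads the title out of the parsed response, so it can be compared with the title being added. The response shape used here (a `result` object keyed by PMID, with an `error` field when the id is unknown) is an assumption about the ESummary JSON output, so inspect a real response before relying on it. `pubmedSummaryUrl` and `summaryTitle` are hypothetical names.

```typescript
// Assumed shape of the parsed ESummary JSON; only the parts used below.
interface ESummaryResponse {
  result?: Record<string, unknown>;
}

function pubmedSummaryUrl(pmid: string): string {
  return "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"
    + `?db=pubmed&id=${encodeURIComponent(pmid)}&retmode=json`;
}

// Returns the article title for the PMID, or undefined if the record is
// missing, carries an error field, or has no string title.
function summaryTitle(response: ESummaryResponse, pmid: string): string | undefined {
  const record = response.result === undefined ? undefined : response.result[pmid];
  if (typeof record !== "object" || record === null) {
    return undefined;
  }
  const fields = record as Record<string, unknown>;
  if (fields["error"] !== undefined) {
    return undefined;
  }
  const title = fields["title"];
  return typeof title === "string" ? title : undefined;
}
```

Fetch the URL with any HTTP client, parse the body as JSON, and compare the returned title with the `title` in the new source entry; an `undefined` result means the PMID should not be added as-is.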
Code of conduct
Be rigorous with citations. Every clinical claim should trace back to a published source. When in doubt, use evidence grade C and note the limitation. Accuracy matters more than completeness.