The Mission
SpaceX Inspiration4 — September 2021
The first all-civilian orbital spaceflight: 3 days at 590 km altitude, higher than the ISS. Four crew members provided biospecimens at 10 timepoints spanning 289 days, creating the largest molecular dataset from any private astronaut mission.
Jared Isaacman
Mission Commander
Current NASA Administrator
Hayley Arceneaux
Medical Officer
Physician Assistant, St. Jude Children's Research Hospital
Sian Proctor
Pilot
Professor, PhD, State Dept Envoy
Chris Sembroski
Mission Specialist
Data Engineer, Air Force Veteran
Molecular Data Types
Transcriptomics (Gene Expression)
Measures how actively every gene is being read into RNA — a real-time readout of what your cells are doing.
What It Is
RNA sequencing counts how many copies of each gene's mRNA exist in a sample. The Inspiration4 mission used nanopore direct RNA sequencing (first ever from astronauts) plus single-nucleus RNA-seq of PBMCs (immune cells). Together these give both bulk, sample-level resolution and individual cell-type resolution.
What It Implies
If a gene's expression increases in space, the biological process it belongs to is likely being activated. For example, upregulated oxidative phosphorylation genes may indicate that cells are under metabolic stress. The I4 data revealed a 'spaceflight transcriptional signature' enriched in UV response, immune function, and stress pathways.
Math Involved
Log2 fold-change between timepoints. A value of 1.0 means expression doubled; -1.0 means it halved. Z-scores put genes with different expression ranges on a common scale. No complex statistics needed: with n=4, descriptive analysis is more appropriate than hypothesis testing.
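As a minimal sketch of the fold-change arithmetic described here: the function name, the optional `pseudocount` argument, and the example values are illustrative assumptions, not part of the I4 pipeline.

```python
import math

def log2_fold_change(value, baseline, pseudocount=0.0):
    """Return log2(value / baseline): +1.0 is a doubling, -1.0 a halving.

    `pseudocount` is an optional, illustrative offset for counts that may be zero.
    """
    numerator = value + pseudocount
    denominator = baseline + pseudocount
    if numerator <= 0 or denominator <= 0:
        raise ValueError("log2 fold-change needs positive values; consider a pseudocount")
    return math.log2(numerator / denominator)

# Made-up expression values for one gene, not I4 measurements.
print(log2_fold_change(200.0, 100.0))  # 1.0  (expression doubled)
print(log2_fold_change(50.0, 100.0))   # -1.0 (expression halved)
```

Working in log2 space keeps increases and decreases symmetric around zero, which makes them easier to average and plot.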
Hackathon Application
Group genes by pathway (inflammation, oxidative stress, DNA repair) and compute average fold-changes per pathway per crew member. This becomes your per-domain 'activation score' on the dashboard.
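One possible way to compute a per-pathway score for a single crew member is sketched below. The gene sets, fold-change values, and the `pathway_scores` helper are hypothetical placeholders; real gene sets would come from a curated pathway database.

```python
from statistics import mean

def pathway_scores(log2fc_by_gene, pathways):
    """Average the log2 fold-changes of each pathway's genes.

    Genes missing from the data are skipped; pathways with no measured
    genes are left out of the result.
    """
    scores = {}
    for name, genes in pathways.items():
        values = [log2fc_by_gene[g] for g in genes if g in log2fc_by_gene]
        if values:
            scores[name] = mean(values)
    return scores

# Hypothetical gene sets and made-up fold-changes for one crew member.
pathways = {
    "inflammation": ["IL6", "TNF"],
    "dna_repair": ["ATM", "BRCA1"],
}
log2fc = {"IL6": 1.0, "TNF": 0.5, "ATM": -0.5}

print(pathway_scores(log2fc, pathways))
# {'inflammation': 0.75, 'dna_repair': -0.5}
```

Running this once per crew member and timepoint would yield the grid of activation scores the dashboard needs.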
Key Findings
Cytokines (Immune Signaling)
Concentrations of immune signaling proteins in blood — the body's alarm system and communication network.
What It Is
Cytokines are small proteins that cells release to communicate. They regulate inflammation, immune response, and tissue repair. The I4 study measured 18 cytokines/chemokines at each timepoint: TNF-alpha, IL-8, IL-1ra, VEGF, CCL2, CCL4, CXCL5, thrombopoietin, and others.
What It Implies
Elevated pro-inflammatory cytokines (TNF-alpha, IL-8) mean the body is mounting an immune response. Anti-inflammatory cytokines (IL-1ra) rising simultaneously suggests the body is trying to regulate itself. This is the most intuitive biomarker set — 'high = stressed.'
Math Involved
Direct concentration values in pg/mL. Easy to z-score: (value - preflight_mean) / preflight_SD. Aggregate multiple cytokines into an 'Inflammation Index' by averaging their z-scores.
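The z-score formula and the averaged 'Inflammation Index' can be sketched as follows. The concentrations are invented for illustration, the function names are assumptions, and the sample standard deviation is one reasonable choice given the small number of pre-flight draws.

```python
from statistics import mean, stdev

def z_score(value, preflight_values):
    """(value - preflight_mean) / preflight_SD, using the sample SD."""
    mu = mean(preflight_values)
    sd = stdev(preflight_values)  # requires at least two pre-flight values
    if sd == 0:
        raise ValueError("pre-flight values are identical; z-score is undefined")
    return (value - mu) / sd

def inflammation_index(current, preflight):
    """Average the z-scores of every cytokine present in `current`."""
    return mean([z_score(current[name], preflight[name]) for name in current])

# Made-up concentrations in pg/mL, not I4 measurements.
preflight = {"TNF-alpha": [1.0, 2.0, 3.0], "IL-8": [10.0, 12.0, 14.0]}
postflight = {"TNF-alpha": 4.0, "IL-8": 14.0}

print(z_score(4.0, preflight["TNF-alpha"]))       # 2.0
print(inflammation_index(postflight, preflight))  # 1.5
```

With only a handful of pre-flight samples per person the SD estimate is noisy, so the index is better read as a descriptive summary than as a test statistic.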
Hackathon Application
Perfect for the Inflammation domain score. Group pro-inflammatory vs anti-inflammatory cytokines, compute composite z-scores per crew member. Visual: radar chart showing cytokine profiles across timepoints.
Key Findings
Proteomics (Protein Levels)
Comprehensive measurement of all proteins in blood plasma — shows what the body is actually building and deploying.
What It Is
Liquid chromatography tandem mass spectrometry (LC-MS/MS) identifies and quantifies thousands of proteins circulating in plasma. Also includes extracellular vesicle (EV) proteomics — proteins packaged in tiny membrane bubbles that cells use for long-distance communication.
What It Implies
Proteins are the functional workforce of cells. While genes tell you what COULD happen, proteins tell you what IS happening. Elevated antioxidant proteins (SOD, catalase, glutathione peroxidase) suggest the body is actively responding to oxidative damage, which radiation exposure can cause.
Math Involved
Relative abundance values from mass spec. Fold-change comparisons between timepoints. Group proteins by function (antioxidant, inflammatory, structural) and compute pathway-level scores.
Hackathon Application
Core data for the Oxidative Stress domain. Map antioxidant/pro-oxidant protein ratios over time. Also feeds into the Immune Regulation score via complement system and immunoglobulin levels.
Key Findings
Epigenomics (Gene Regulation)
Chemical modifications on DNA and RNA that control which genes can be activated — the body's control panel.
What It Is
Includes m6A RNA methylation (chemical tags on mRNA that control stability/translation) and chromatin accessibility via snATAC-seq (which DNA regions are 'open' and available for gene activation). Also cell-free DNA methylation for tissue-of-origin analysis.
What It Implies
Epigenetic changes mean the body is reprogramming its gene regulation — not just turning genes on/off temporarily, but changing which genes CAN be turned on. The massive m6A spike at R+1 (immediately post-landing) suggests a burst of post-transcriptional gene regulation during re-adaptation to gravity.
Math Involved
Methylation percentages (0-100% per site), accessibility scores from ATAC-seq peaks. Changes computed as delta-methylation between timepoints. Aggregate by genomic region or pathway.
Hackathon Application
Adds depth to all domain scores. Epigenetic 'memory' of spaceflight stress persists longer than transcriptional changes — useful for assessing long-term risk and recovery trajectory.
Key Findings
Telomere Biology
Length of protective caps on chromosome ends — a biological clock tied to aging and cancer risk.
What It Is
Telomeres are repetitive DNA sequences (TTAGGG) at chromosome ends that shorten with each cell division. Telomere length is measured as a T/S ratio (telomere to single-copy gene ratio). Longer telomeres = more divisions remaining; critically short telomeres = cellular senescence or malignancy risk.
What It Implies
Counterintuitively, telomeres elongated during spaceflight, as was also seen in the NASA Twins Study. This may reflect stress-induced telomerase activation, though the mechanism is not established. Telomeres then shortened rapidly post-flight, sometimes below pre-flight baseline, which may indicate accelerated aging upon return.
Math Involved
T/S ratio (relative telomere length). Percentage change from baseline. Simple time-series visualization. No complex math — the biology is the interesting part.
Hackathon Application
Key component of the DNA Damage Response domain. Telomere dynamics combined with CHIP data and cfDNA levels give a comprehensive picture of genomic integrity. The elongation-then-shortening pattern is visually compelling for dashboards.
Key Findings
Microbiome Composition
The community of trillions of microbes living on and inside you — shifts reveal immune and metabolic changes.
What It Is
Metagenomic and metatranscriptomic sequencing of 750 samples from 8 body sites (skin, oral, nasal, gut) plus spacecraft surfaces, at 8 timepoints. Identifies which species are present, their relative abundance, and which genes they're actively expressing.
What It Implies
Microbiome composition affects immune function, metabolism, and even mental health. In the confined spacecraft environment, crew members rapidly exchanged microbes — their microbiomes converged. Shifts in gut bacteria can indicate stress, dietary changes, or immune suppression.
Math Involved
Relative abundance percentages (what % of bacteria are Species X). Alpha diversity (Shannon index: how diverse one person's microbiome is). Beta diversity (Bray-Curtis dissimilarity: how different two people's microbiomes are, from 0 for identical to 1 for no shared taxa). All standard ecological metrics.
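Both diversity metrics are short enough to write by hand. The sketch below uses the natural-log Shannon index and the standard Bray-Curtis dissimilarity (0 means identical composition, 1 means no shared taxa); the taxon names and counts are made up.

```python
import math

def shannon_index(abundances):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero abundance."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive number")
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)

def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two {taxon: abundance} dicts."""
    taxa = set(sample_a) | set(sample_b)
    difference = sum(abs(sample_a.get(t, 0) - sample_b.get(t, 0)) for t in taxa)
    total = sum(sample_a.get(t, 0) + sample_b.get(t, 0) for t in taxa)
    if total == 0:
        raise ValueError("both samples are empty")
    return difference / total

# Made-up read counts for two crew members' skin samples.
crew_1 = {"Staphylococcus": 60, "Cutibacterium": 30, "Corynebacterium": 10}
crew_2 = {"Staphylococcus": 50, "Cutibacterium": 50}

print(round(shannon_index(crew_1.values()), 3))
print(round(bray_curtis(crew_1, crew_2), 3))
```

If the crew's microbiomes converge during flight, pairwise Bray-Curtis values should fall across timepoints. Libraries such as SciPy and scikit-bio also implement these metrics.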
Hackathon Application
Supporting data for Immune Regulation domain. Microbiome diversity loss correlates with immune suppression. The crew-to-crew transfer visualization is compelling — shows shared living environment effects.
Key Findings
Cell-Free DNA (cfDNA)
DNA fragments from dying cells floating in blood — reveals which organs are under stress without biopsies.
What It Is
When cells die (apoptosis or necrosis), their DNA fragments enter the bloodstream. By analyzing methylation patterns on these fragments, you can determine which tissue they came from (each tissue has a unique methylation fingerprint). Also includes cf-mtDNA (mitochondrial cell-free DNA).
What It Implies
Elevated cfDNA from a specific tissue = that tissue is experiencing damage or increased turnover. Elevated immune-cell-origin cfDNA post-landing = immune system activation. Elevated mitochondrial cfDNA = mitochondrial stress/damage across the body.
Math Involved
Fragment counts per tissue-of-origin (deconvolution percentages). Fold-change from baseline. cf-mtDNA is measured as copies/mL. Simple ratios and time-series comparison.
Hackathon Application
Feeds into both Mitochondrial Function (cf-mtDNA) and DNA Damage Response domains. Tissue-of-origin data adds a 'which organs are affected' dimension that other omics can't provide.
Key Findings
Immune Repertoire (TCR/BCR)
The complete library of your immune system's receptors — shows how prepared your body is to fight threats.
What It Is
T-cell receptors (TCR) and B-cell receptors (BCR) are unique proteins on immune cells that recognize specific threats. Sequencing all of them reveals your immune system's diversity (how many different threats can you respond to) and clonality (is one clone dominating, suggesting active immune response).
What It Implies
Reduced TCR/BCR diversity = immunosuppression (fewer unique threats recognizable). Increased clonality = the immune system is actively fighting something (one clone expanding). T-cell frequency reduction during flight = suppressed adaptive immunity.
Math Involved
Diversity indices (Shannon, Simpson), clonality scores (1 - normalized Shannon entropy), clone frequency distributions. Standard ecological diversity metrics applied to immune sequences.
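The clonality score named above, 1 minus normalized Shannon entropy, can be sketched like this. The clone counts are invented, and treating a single-clone repertoire as clonality 1.0 is a convention assumed here.

```python
import math

def clonality(clone_counts):
    """Return 1 - (Shannon entropy / ln(number of clones)).

    0.0 means all clones are equally frequent; values near 1.0 mean one
    clone dominates the repertoire.
    """
    counts = [c for c in clone_counts if c > 0]
    if not counts:
        raise ValueError("no clones with non-zero counts")
    if len(counts) == 1:
        return 1.0  # assumed convention: a single clone is maximally clonal
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return 1.0 - entropy / math.log(len(counts))

# Made-up clone counts, not I4 data.
even_repertoire = [25, 25, 25, 25]
skewed_repertoire = [97, 1, 1, 1]

print(round(clonality(even_repertoire), 3))
print(round(clonality(skewed_repertoire), 3))
```

Comparing this score across timepoints for each crew member gives a single number for whether the repertoire is becoming dominated by fewer clones.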
Hackathon Application
Core component of Immune Regulation domain score. Diversity drop + T-cell frequency reduction = quantifiable immune suppression. Clonality changes add nuance about active vs. passive immune states.
Key Findings
Clinical Blood Panels (CBC)
Standard blood work — white cells, red cells, platelets, metabolic markers. The ground truth of basic health.
What It Is
Complete Blood Count (CBC) and metabolic panels measuring: white blood cell counts (total and differential — neutrophils, lymphocytes, monocytes), red blood cells, hemoglobin, hematocrit, platelets, plus standard chemistry (glucose, creatinine, liver enzymes, electrolytes).
What It Implies
This is the clinical anchor for all other omics data. Low lymphocytes confirm transcriptomic findings of immune suppression. Elevated neutrophils confirm cytokine-level inflammation. Provides clinically validated reference ranges that give context to molecular findings.
Math Involved
Direct counts (cells/uL) and concentrations (mg/dL). Compared against established clinical reference ranges. Percentage change from individual baseline. The simplest math in the entire dataset.
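A minimal sketch of the two comparisons described here, percentage change from an individual baseline and a reference-range check. The lymphocyte values and the range bounds are placeholders; real reference ranges vary by laboratory and should come from the reporting lab.

```python
def percent_change(value, baseline):
    """Percentage change of a measurement relative to the individual's baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (value - baseline) / baseline

def range_flag(value, low, high):
    """Label a value against a reference range supplied by the caller."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"

# Made-up lymphocyte counts in cells/uL with a placeholder reference range.
baseline_count = 1500.0
postflight_count = 1200.0

print(percent_change(postflight_count, baseline_count))  # -20.0
print(range_flag(postflight_count, low=1000.0, high=4800.0))  # normal
```

Showing both numbers side by side is useful because a value can shift noticeably from a person's own baseline while staying inside the population range.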
Hackathon Application
Validation layer for your dashboard. If your molecular-derived 'Inflammation Score' is high but CBC shows normal WBC... something's wrong with your scoring. Use as sanity check and as the 'clinical summary' tab.
Key Findings
Clonal Hematopoiesis (CHIP)
Tracking radiation-induced mutations in blood stem cells — monitoring for cancer-precursor events.
What It Is
Clonal Hematopoiesis of Indeterminate Potential (CHIP) occurs when a blood stem cell acquires a somatic mutation and produces a clone of cells all carrying that mutation. Whole genome sequencing tracks the Variant Allele Frequency (VAF) — what percentage of blood cells carry each mutation.
What It Implies
If a CHIP clone expands during spaceflight, radiation may be driving dangerous mutations. The I4 finding was reassuring: pre-existing CHIP clones remained stable, suggesting 3 days of radiation exposure didn't measurably accelerate clonal expansion. However, longer missions may differ.
Math Involved
Variant Allele Frequency (VAF) — percentage of sequencing reads carrying the mutation (0-50% for heterozygous). Compare VAF at each timepoint. Stable VAF = clone not expanding. Increasing VAF = clone growing (concerning).
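The VAF calculation and a simple stability check might look like the sketch below. The read counts are invented, and the `tolerance` threshold for calling a clone stable is an arbitrary assumption that would need tuning against sequencing depth and noise.

```python
def variant_allele_frequency(alt_reads, total_reads):
    """Fraction of sequencing reads at a site that carry the mutation."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads

def vaf_trend(vafs, tolerance=0.005):
    """Compare the last timepoint with the first.

    `tolerance` is a hypothetical cut-off for changes treated as noise.
    """
    change = vafs[-1] - vafs[0]
    if change > tolerance:
        return "expanding"
    if change < -tolerance:
        return "shrinking"
    return "stable"

# Made-up (mutant reads, total reads) at three timepoints for one variant.
read_counts = [(20, 1000), (21, 1000), (19, 1000)]
vafs = [variant_allele_frequency(alt, total) for alt, total in read_counts]

print(vafs)             # [0.02, 0.021, 0.019]
print(vaf_trend(vafs))  # stable
```

Comparing only the first and last timepoints is deliberately crude; a fitted slope across all timepoints would be a natural refinement.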
Hackathon Application
Part of DNA Damage Response domain. CHIP stability is a 'green flag' — include it as a reassuring indicator. For longer hypothetical missions, you could model projected clone growth rates.
Key Findings
Health Domains
Five scoring dimensions for the astronaut risk dashboard (Track 2)
Immune Regulation
Cytokine levels + TCR/BCR diversity + T-cell frequencies
Inflammation
TNF-alpha, IL-8, CCL2, CRP, neutrophil-to-lymphocyte ratio
Oxidative Stress
Antioxidant protein levels + oxidative phosphorylation genes
DNA Damage Response
CHIP stability + cfDNA levels + telomere dynamics
Mitochondrial Function
cf-mtDNA levels + OxPhos gene expression + metabolic proteins
Winning Strategy
How to turn molecular data into a first-place dashboard
Scoring
- Z-score biomarkers against pre-flight baseline
- Group by health domain, average within
- |z| > 1.5 = elevated, |z| > 2.0 = high concern
- Validate scores against clinical CBC
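The scoring steps above can be sketched as follows, assuming z-scores have already been computed against each crew member's pre-flight baseline. The domain membership, the z-score values, and the label for the lowest tier are illustrative assumptions.

```python
from statistics import mean

def domain_score(z_by_biomarker, members):
    """Average the z-scores of the biomarkers assigned to one health domain."""
    values = [z_by_biomarker[m] for m in members if m in z_by_biomarker]
    if not values:
        raise ValueError("no measured biomarkers for this domain")
    return mean(values)

def classify(z):
    """Apply the thresholds: |z| > 2.0 high concern, |z| > 1.5 elevated."""
    magnitude = abs(z)
    if magnitude > 2.0:
        return "high concern"
    if magnitude > 1.5:
        return "elevated"
    return "within baseline range"  # label assumed for this sketch

# Made-up z-scores for one crew member at one timepoint.
z_scores = {"TNF-alpha": 2.4, "IL-8": 1.8, "CCL2": 1.2, "SOD": -0.5}
inflammation_members = ["TNF-alpha", "IL-8", "CCL2"]

score = domain_score(z_scores, inflammation_members)
print(round(score, 2), classify(score))
```

Averaging signed z-scores lets markers moving in opposite directions cancel out, so it may be worth orienting each biomarker so that higher always means more stress before averaging.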
Visual Identity
- Mission-control aesthetic — dark, dense, professional
- Uncertainty as a first-class visual element
- Temporal scrubbing through timepoints
- Drill-down: domain → category → biomarker
Differentiation
- Crew-to-crew comparative framing
- Recovery trajectory projections
- "What we don't know" explicitly shown
- Deployed interactive web app, not PDF
AI Transparency
- Document AI-assisted interpretation openly
- Show reasoning chains for assignments
- Cite literature for all claims
- Target 'Best Use of AI' prize
Data Access
Where to find the actual datasets
| Source | Content | Location |
|---|---|---|
| NASA OSDR | All 10 OSD datasets (primary) | osdr.nasa.gov |
| SOMA Portal | Interactive data explorer | soma.weill.cornell.edu |
| I4 Multiome | Single-cell PBMC browser | soma.weill.cornell.edu/apps/I4_Multiome/ |
| Nature Collection | 44 SOMA papers | nature.com/collections/ebdbcahdgc |
| NCBI GEO | Multiome sequencing data | GSE264321 |
| NASA Twins Study | 340-day ISS reference | NASA GeneLab |