RNA-seq read estimator

A rough planning estimate of reads needed per sample for RNA-seq, from the number of features and a target average reads per feature. Not a power calculation.

How it works

Formula

reads ≈ (features × target reads per feature) ÷ usable fraction. The usable fraction accounts for reads lost to rRNA, multi-mapping and QC.
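As a minimal sketch, the formula can be written as a small Python function. The name estimate_reads and its parameter names are illustrative assumptions, not part of the tool, and rounding to the nearest whole read is a choice made here for readability.

```python
def estimate_reads(features: int, reads_per_feature: float, usable_fraction: float) -> int:
    """Rough reads per sample: (features x reads per feature) / usable fraction."""
    if features <= 0 or reads_per_feature <= 0:
        raise ValueError("features and reads_per_feature must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be greater than 0 and at most 1")
    # Dividing by the usable fraction inflates the total to cover reads
    # lost to rRNA, multi-mapping and QC.
    return round(features * reads_per_feature / usable_fraction)


# Example inputs: ~20,000 genes, 100 reads per gene on average, 80% usable.
print(estimate_reads(20_000, 100, 0.8))  # 2500000
```

A lower usable fraction raises the estimate: with the same inputs at 50% usable, the function returns 4,000,000 reads.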

Worked example

Human (~20,000 genes), 100 reads per gene on average, 80% usable: (20,000 × 100) ÷ 0.8 = 2,500,000 reads per sample. (For differential expression, many labs target 20–30M reads per sample, which corresponds to a much higher average reads per gene.)
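To compare a depth such as 20–30M reads per sample with the reads-per-gene input, the formula can be rearranged. The sketch below assumes the same ~20,000 genes and 80% usable fraction as the worked example; the function name implied_reads_per_feature is hypothetical.

```python
def implied_reads_per_feature(total_reads: float, features: int, usable_fraction: float) -> float:
    """Invert the estimator: average usable reads per feature at a given depth."""
    if features <= 0:
        raise ValueError("features must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be greater than 0 and at most 1")
    return total_reads * usable_fraction / features


# 20M and 30M reads per sample, ~20,000 genes, 80% usable.
for depth in (20_000_000, 30_000_000):
    average = implied_reads_per_feature(depth, 20_000, 0.8)
    print(f"{depth:,} reads -> about {average:.0f} reads per gene on average")
```

Under these assumptions, 20–30M reads per sample works out to roughly 800–1,200 reads per gene on average, compared with 100 in the worked example.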

When to use it

Use it for a quick, order-of-magnitude planning figure when budgeting an RNA-seq run. This is NOT a statistical power calculation, and it ignores the highly skewed real expression distribution: low-expression genes need far more depth than the average implies.
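The effect of skew can be illustrated with a deliberately simplified model. It assumes each gene receives usable reads in proportion to its share of the expressed RNA pool, ignoring gene length and other biases. The function names and the share values are hypothetical and chosen only for illustration, not measured data.

```python
def expected_reads_for_gene(total_reads: float, usable_fraction: float, expression_share: float) -> float:
    """Expected reads for one gene under a simple proportional model (an assumption)."""
    return total_reads * usable_fraction * expression_share


def reads_needed_for_gene(target_reads: float, usable_fraction: float, expression_share: float) -> float:
    """Total depth needed for one gene to reach target_reads under the same model."""
    return target_reads / (usable_fraction * expression_share)


total = 2_500_000  # depth from the formula with 20,000 genes, 100 reads/gene, 80% usable

# A gene holding exactly the average share (1 in 20,000) gets about 100 reads.
average_gene = expected_reads_for_gene(total, 0.8, 1 / 20_000)

# A hypothetical low-expression gene holding one millionth of the pool gets about 2.
rare_gene = expected_reads_for_gene(total, 0.8, 1e-6)

# Depth needed for that hypothetical gene to reach 100 reads: about 125 million.
rare_gene_depth = reads_needed_for_gene(100, 0.8, 1e-6)

print(f"average gene: about {average_gene:.0f} reads")
print(f"rare gene: about {rare_gene:.0f} reads")
print(f"depth for rare gene to reach 100 reads: about {rare_gene_depth:,.0f}")
```

In this toy model the average-based estimate is met only by genes at or above the average share, which is why the figure should be treated as a budgeting aid rather than a guarantee of per-gene coverage.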

Sensible defaults

Defaults assume the human protein-coding gene set (~20,000), 100 reads/gene and 80% usable reads. The preset gene counts are approximate round figures; set your own with the custom option.

FAQ

Is this a replacement for a power analysis?
No. It is a rough planning heuristic. For detecting differential expression at a chosen effect size and FDR, use a dedicated RNA-seq power tool that models dispersion and the expression distribution.

Where do the preset gene counts come from?
They are approximate protein-coding gene counts (~20,000 human, ~22,000 mouse). Use the custom field for a specific annotation, exome, or panel.