Sequencing adapter lookup
Identify which common sequencing adapter an over-represented sequence matches (Illumina TruSeq, Nextera and more), by exact-substring matching.
How it works
Formula
For each known adapter, find the longest adapter prefix (at least 8 bp) that occurs exactly in the query; this covers the usual case of a read running through into the start of an adapter. A query that is itself wholly contained in an adapter also counts as a match. Matching is exact-substring only: no mismatches are tolerated.
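The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the names ADAPTERS, MIN_PREFIX, Hit and find_adapters are hypothetical, and the two sequences are the short adapter starts from FastQC's adapter list, used here only as sample data. How very short queries are treated in the "query inside adapter" case is also an assumption (any non-empty query is accepted).

```python
from typing import NamedTuple

# Illustrative table only (assumption): a real adapter list would be longer.
ADAPTERS = {
    "Illumina TruSeq / universal": "AGATCGGAAGAGC",
    "Nextera (Tn5) transposase": "CTGTCTCTTATA",
}
MIN_PREFIX = 8  # minimum adapter-prefix length, in bp


class Hit(NamedTuple):
    adapter: str
    kind: str      # "full", "prefix", or "query-in-adapter"
    length: int    # number of matched bases
    position: int  # 0-based offset in the query (in the adapter for "query-in-adapter")


def find_adapters(query, adapters=ADAPTERS, min_prefix=MIN_PREFIX):
    """Return exact-substring adapter hits, longest match first."""
    query = query.strip().upper()
    hits = []
    for name, adapter in adapters.items():
        hit = None
        # Try the longest adapter prefix first and shrink down to min_prefix.
        for length in range(len(adapter), min_prefix - 1, -1):
            pos = query.find(adapter[:length])
            if pos != -1:
                kind = "full" if length == len(adapter) else "prefix"
                hit = Hit(name, kind, length, pos)
                break
        # Fallback: the whole query sits inside the adapter.
        if hit is None and query and query in adapter:
            hit = Hit(name, "query-in-adapter", len(query), adapter.find(query))
        if hit is not None:
            hits.append(hit)
    return sorted(hits, key=lambda h: h.length, reverse=True)


if __name__ == "__main__":
    for h in find_adapters("TTTTAGATCGGAAGAGCACACGTCTGAAC"):
        print(h)
```

Scanning prefixes from longest to shortest means the first hit for an adapter is also its best one, so the inner loop can stop early.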
Worked example
The query TTTTAGATCGGAAGAGCACACGTCTGAAC contains AGATCGGAAGAGC, the start of the Illumina TruSeq / universal adapter, immediately after the leading TTTT, so it is reported as a full TruSeq adapter match.
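The worked example can be checked by hand with plain string search. The snippet below is only a sketch; the variable names are made up for illustration. It also shows where the adapter begins, which is the point at which a trimmer would cut the read.

```python
query = "TTTTAGATCGGAAGAGCACACGTCTGAAC"
adapter_start = "AGATCGGAAGAGC"  # TruSeq / universal adapter start from the example

# str.find returns the 0-based offset of the first exact occurrence, or -1.
offset = query.find(adapter_start)

# Everything before the offset is insert sequence; the rest is adapter read-through.
insert_part = query[:offset]
adapter_part = query[offset:]

print(offset, insert_part, adapter_part)
# prints: 4 TTTT AGATCGGAAGAGCACACGTCTGAAC
```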
When to use it
Use it when a FastQC-style report flags an over-represented sequence and you want to know which adapter to trim. Match the sequence here, then trim that adapter with your tool of choice.
Sensible defaults
The default query sequence contains the TruSeq adapter. To identify your own sequence, paste an over-represented sequence from your QC report in its place.
Source
Curated common subset, NOT exhaustive. Adapter sequences follow the standard references used by FastQC’s adapter list and Illumina’s adapter-sequences documentation.
FAQ
- Why might a real adapter not match?
- The tool does exact substring matching only. Real sequencing reads contain errors (substitutions, indels) that can break an exact match, and error-tolerant matching is not done here.
- Which adapters are included?
- A curated common subset: Illumina TruSeq/universal, Nextera (Tn5) transposase, and Illumina small-RNA 3′/5′ adapters. It is not exhaustive.