Mapping reads to the annotation

Single read mapping

Assigning reads to the annotation, once they have been mapped to genomic locations, is not straightforward. The Flux Capacitor follows a conservative annotation assignment, i.e., reads are assigned uniquely to genomic regions ("segments" or "junctions"). These regions are defined by the exon-intron structure of each locus; an example is shown in Fig.1.

Fig.1: An example locus with two transcripts $\textrm{I}$ and $\textrm{II}$ (names to the left) that overlap in segments of their exons (green boxes denoted by letters A through E; indices indicate segments of overlapping exons). The Flux Capacitor additionally distinguishes 5 non-exonic areas. 19 sequencing reads (arrows with heart labels) have been mapped in the area of the locus as shown.

The locus sketched in Fig.1 consists of 8 exons that cluster in 8 segments (A₁, A₂, $\ldots$, E) separated by 5 non-exonic regions, i.e., the 5' proximal area (F), 3 introns (G, H, J), and the 3' proximal area (K). Additionally, there are junctions between all adjacent segments (e.g., FA₁, A₁A₂, $\ldots$) and between non-adjacent segments that are spliced together (so-called splice junctions, for instance A₂B₁). Each read is assigned to the region it falls into completely.

category	FA₁	A₁A₂	A₂	G	GB₁	B₁B₂	B₁C₁	B₂H	C₁	C₁C₂	C₂	C₃	C₃J	J	E	EK	none
assigned read ID	1	2	3, 19	18	4	5	17	6	7, 16	15	8	14	9	10	11	12	13

Note: Given its mapping, read number 13 is not compatible with the annotation and remains unassigned.
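The assignment rule can be illustrated with a minimal Python sketch. It is not the Flux Capacitor's implementation: the function name `assign_read`, the region list, and all coordinates are hypothetical, and only unspliced reads are handled (splice junctions between non-adjacent segments would require the split alignment of the read, which is omitted here). A read lying completely inside one region gets that region's name, a read spanning the border of two adjacent regions gets the junction name, and anything else stays unassigned.

```python
from bisect import bisect_right

def assign_read(regions, start, end):
    """Assign an unspliced read [start, end] (inclusive) to a region.

    regions: list of (name, first, last) tuples, sorted by position and
    contiguous, with inclusive coordinates (hypothetical layout).
    Returns a segment name, a junction name (two adjacent names
    concatenated), or None if the read does not fit the annotation.
    """
    starts = [first for _, first, _ in regions]
    i = bisect_right(starts, start) - 1   # region containing the read start
    j = bisect_right(starts, end) - 1     # region containing the read end
    if i < 0 or end > regions[-1][2]:
        return None                       # read leaves the annotated locus
    if i == j:
        return regions[i][0]              # completely inside one region
    if j == i + 1:
        return regions[i][0] + regions[j][0]  # junction of adjacent regions
    return None                           # spans more than two regions

# Hypothetical coordinates for the first regions of a locus like Fig.1
regions = [("F", 1, 10), ("A1", 11, 20), ("A2", 21, 30), ("G", 31, 50)]
print(assign_read(regions, 12, 18))  # inside A1
print(assign_read(regions, 8, 12))   # junction FA1
print(assign_read(regions, 5, 25))   # spans three regions, unassigned
```

Because every read ends up in at most one category, the counts per segment and junction never overlap, which is what makes the assignment conservative.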

Read pair mapping

A read pair is mapped validly if and only if both mates map to a segment or junction and their mapping distance, on at least one of the transcripts that support both mapping locations, falls within the boundaries of the expected insert sizes. Fig.2 summarizes how paired reads are counted and how coverage by read pairs is determined.
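The validity condition for a read pair can be sketched as follows. This is an illustration under stated assumptions, not the Flux Capacitor's code: the helper names `transcript_offset` and `pair_is_valid`, the transcript dictionary, and the coordinates are hypothetical, and the mapping distance is taken here to be the outer span of both mates in spliced transcript coordinates, which may differ from the definition the tool uses.

```python
def transcript_offset(exons, pos):
    """Offset of genomic position pos within the spliced transcript.

    exons: list of (first, last) genomic intervals, inclusive, in order.
    Returns None if pos is not covered by any exon of the transcript.
    """
    offset = 0
    for first, last in exons:
        if first <= pos <= last:
            return offset + pos - first
        offset += last - first + 1
    return None

def pair_is_valid(transcripts, mate1, mate2, min_insert, max_insert):
    """True if at least one transcript supports both mates (given as
    (start, end) genomic intervals) at a distance within the expected
    insert size boundaries. Distance = outer span (an assumption)."""
    for exons in transcripts.values():
        offsets = [transcript_offset(exons, p) for p in (*mate1, *mate2)]
        if None in offsets:
            continue  # this transcript does not support both mappings
        distance = max(offsets) - min(offsets) + 1
        if min_insert <= distance <= max_insert:
            return True
    return False

# Hypothetical locus: transcript II has an extra middle exon
transcripts = {
    "I": [(1, 10), (101, 110)],
    "II": [(1, 10), (51, 60), (101, 110)],
}
print(pair_is_valid(transcripts, (3, 5), (103, 105), 10, 15))  # fits on I
print(pair_is_valid(transcripts, (3, 5), (103, 105), 30, 40))  # fits on none
```

The same pair of genomic positions can lie at different distances on different transcripts, which is why the check iterates over all transcripts and accepts the pair as soon as one of them is compatible.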

Fig.2: Examples of exonic structures (green boxes are exons, introns are not drawn to scale) and distinct possible read mappings, for single reads (above the structure) and paired-end reads (below). The read length is 3 and, for paired-end reads, the insert size is 4 (no variation). For simplification, junctions are not shown. (A) There are 10 possible mapping locations ("slots") in a mono-exonic transcript of 12 nt. Reads starting at positions 11 or 12 fall partially outside the annotation, as do reads that start before position 1; such reads are not considered to belong to the exon as annotated. Correspondingly, 4 slots for paired-end reads can be observed. (B) Example of a transcript with 2 exons. Disregarding the splice junction, to which read mappings starting at position 6 or 7 are assigned, we observe 8 slots for single reads and 3 slots for paired-end reads. (C) Example of a transcript with 3 exons (splice junctions disregarded). There are 7 slots for single reads and 2 for paired-end reads.
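The single-read slot counts of panels (A) and (B) can be reproduced with a short Python sketch. The function name `single_read_slots` is hypothetical, and the exon lengths 7 and 5 used for panel (B) are inferred from the caption (junction reads start at position 6 or 7, and 8 slots remain), not stated in the source. Paired-end slots are left out because they depend on how the insert size is measured.

```python
def single_read_slots(exon_lengths, read_length):
    """Number of start positions at which a read fits completely
    inside a single exon (reads crossing splice junctions are ignored)."""
    return sum(max(0, n - read_length + 1) for n in exon_lengths)

# Panel (A): one exon of 12 nt, read length 3
print(single_read_slots([12], 3))    # 10

# Panel (B): two exons, lengths 7 and 5 assumed from the caption
print(single_read_slots([7, 5], 3))  # 8
```

An exon of length n offers n - r + 1 slots for reads of length r, so every additional exon boundary removes r - 1 slots from the exonic count; these are the reads that move to the splice junction.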

