PRO format
The PRO (Simulation Profile) format describes the characteristics of each transcript from the reference annotation, one transcript per line. The columns give the transcript's attributes initially and after each step of the simulation. Columns are tab-separated and hold the following attributes:
| Column Number | Name | Value | Description |
|---|---|---|---|
| 1 | LOCUS_ID | chrom:start-end[W\|C] | identifier for the intrinsic splicing locus, given by the chromosome (chrom), the start and end positions, and the strand (W for Watson, C for Crick) |
| 2 | TRANSCRIPT_ID | String | transcript identifier from the reference annotation. |
| 3 | CODING | [CDS\|NC] | specifies whether the transcript has an annotated coding sequence (CDS) or not (NC) |
| 4 | LENGTH | Integer | the spliced length of the transcript molecule as annotated in the reference annotation |
| 5 | RFREQ_EXP | Float | relative frequency of RNA copies of this transcript after simulated expression |
| 6 | AFREQ_EXP | Integer | absolute number of expressed RNA molecules |
| 7 | RFREQ_LIB | Float | relative frequency of cDNA molecules derived from this transcript after library construction |
| 8 | AFREQ_LIB | Integer | absolute number of cDNA molecules generated from this transcript |
| 9 | RFREQ_SEQ | Float | relative frequency of reads sequenced from this transcript |
| 10 | AFREQ_SEQ | Integer | absolute number of reads sequenced from this transcript |
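
As an illustration of the column layout above, the following is a minimal Python sketch of how one PRO line could be read. It is not part of any official tooling: the names `ProRecord` and `parse_pro_line` are hypothetical, the example values are invented, and the sketch assumes that LOCUS_ID is written as `chrom:start-end` immediately followed by `W` or `C`, and that the first ten tab-separated fields appear in the order listed in the table.

```python
import re
from dataclasses import dataclass

# Assumed LOCUS_ID shape: chrom:start-end followed directly by W or C.
_LOCUS_RE = re.compile(r"^(?P<chrom>.+):(?P<start>\d+)-(?P<end>\d+)(?P<strand>[WC])$")


@dataclass
class ProRecord:
    """One transcript line of a PRO file (hypothetical container)."""
    chrom: str
    start: int
    end: int
    strand: str          # "W" (Watson) or "C" (Crick)
    transcript_id: str
    coding: bool         # True for CDS, False for NC
    length: int
    rfreq_exp: float
    afreq_exp: int
    rfreq_lib: float
    afreq_lib: int
    rfreq_seq: float
    afreq_seq: int


def parse_pro_line(line: str) -> ProRecord:
    """Parse the first ten tab-separated columns of one PRO line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError(f"expected at least 10 columns, got {len(fields)}")

    locus = _LOCUS_RE.match(fields[0])
    if locus is None:
        raise ValueError(f"unrecognised LOCUS_ID: {fields[0]!r}")
    if fields[2] not in ("CDS", "NC"):
        raise ValueError(f"unrecognised CODING value: {fields[2]!r}")

    return ProRecord(
        chrom=locus["chrom"],
        start=int(locus["start"]),
        end=int(locus["end"]),
        strand=locus["strand"],
        transcript_id=fields[1],
        coding=(fields[2] == "CDS"),
        length=int(fields[3]),
        rfreq_exp=float(fields[4]),
        afreq_exp=int(fields[5]),
        rfreq_lib=float(fields[6]),
        afreq_lib=int(fields[7]),
        rfreq_seq=float(fields[8]),
        afreq_seq=int(fields[9]),
    )


# Invented example line, for illustration only.
example = "chr1:1000-5000W\tTX1\tCDS\t1200\t0.25\t250\t0.5\t2000\t0.125\t100"
record = parse_pro_line(example)
print(record.transcript_id, record.strand, record.afreq_seq)
```

The sketch ignores any fields beyond the tenth rather than rejecting them, so that lines carrying extra columns would still parse under the assumptions stated above.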





