Profile (PRO) format

The PRO (simulation profile) format describes the characteristics of each transcript from the reference annotation, both initially and after each step of the simulation. Each line corresponds to one transcript and each column to one attribute. Columns are tab-separated and hold the following attributes:

Column  Name           Value                 Description
1       LOCUS_ID       chrom:start-end[W|C]  identifier for the intrinsic splicing locus, given by the chromosome (chrom), start and end position, and the strand (W for Watson or C for Crick)
2       TRANSCRIPT_ID  String                transcript identifier from the reference annotation
3       CODING         [CDS|NC]              specifies whether the transcript has an annotated CDS (CDS) or not (NC)
4       LENGTH         Integer               spliced length of the transcript molecule as annotated in the reference annotation
5       RFREQ_EXP      Float                 relative frequency of RNA copies of this transcript after simulated expression
6       AFREQ_EXP      Integer               absolute number of expressed RNA molecules
7       RFREQ_LIB      Float                 relative frequency of cDNA molecules derived from this transcript after library construction
8       AFREQ_LIB      Integer               absolute number of cDNA molecules generated from this transcript
9       RFREQ_SEQ      Float                 relative frequency of reads sequenced from this transcript
10      AFREQ_SEQ      Integer               absolute number of reads sequenced from this transcript
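
As an illustration of the column layout above, the following is a minimal Python sketch of how one PRO line might be parsed. It is not part of the simulator: the names `ProRecord`, `parse_pro_line` and `parse_locus` are hypothetical, the sample line is made up, and the sketch assumes that every line carries all ten columns and that the Integer columns are written as plain integers.

```python
import re
from dataclasses import dataclass

# Assumed shape of LOCUS_ID: chrom:start-end followed by W (Watson) or C (Crick).
LOCUS_RE = re.compile(r"^(?P<chrom>.+):(?P<start>\d+)-(?P<end>\d+)(?P<strand>[WC])$")


@dataclass
class ProRecord:
    """One line of a PRO file (hypothetical container, fields in column order)."""
    locus_id: str
    transcript_id: str
    coding: str        # "CDS" or "NC"
    length: int
    rfreq_exp: float
    afreq_exp: int
    rfreq_lib: float
    afreq_lib: int
    rfreq_seq: float
    afreq_seq: int


def parse_pro_line(line: str) -> ProRecord:
    """Split one tab-separated PRO line and convert each column to its type."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, found {len(fields)}")
    return ProRecord(
        locus_id=fields[0],
        transcript_id=fields[1],
        coding=fields[2],
        length=int(fields[3]),
        rfreq_exp=float(fields[4]),
        afreq_exp=int(fields[5]),
        rfreq_lib=float(fields[6]),
        afreq_lib=int(fields[7]),
        rfreq_seq=float(fields[8]),
        afreq_seq=int(fields[9]),
    )


def parse_locus(locus_id: str):
    """Break a LOCUS_ID into (chrom, start, end, strand)."""
    match = LOCUS_RE.match(locus_id)
    if match is None:
        raise ValueError(f"unrecognised LOCUS_ID: {locus_id!r}")
    return match["chrom"], int(match["start"]), int(match["end"]), match["strand"]


# Made-up sample line, for illustration only.
example = "chr1:1000-5000W\tTX0001\tCDS\t1500\t0.25\t1000\t0.2\t800\t0.3\t120"
record = parse_pro_line(example)
locus = parse_locus(record.locus_id)
```

Reading a whole file would then amount to calling `parse_pro_line` on each line. If a given PRO file leaves some columns empty or writes counts in another notation, the conversions would need to be relaxed accordingly.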
