Reading the reference annotation

Requires: REF_FILE_NAME, LOAD_CODING, LOAD_NONCODING
Outputs: PRO_FILE_NAME, columns 1 (locus name), 2 (transcript name), 3 (CDS/NC) and 4 (spliced length)

The first step in simulating an experiment is to load a reference annotation. The input has to be in GTF format at the path specified by REF_FILE_NAME. Every transcript must have "exon" features. LOAD_CODING loads the transcripts that additionally have "CDS" features; LOAD_NONCODING loads those that do not. Starting the read of the reference annotation (button "Run" in the toolbar) first triggers a check of whether the GTF file is well sorted, which makes the subsequent operations efficient. If it is not, the FLUX SIMULATOR sorts your file in the temporary directory, and a sorted copy of the file (with the suffix "_sorted" before the extension) should then appear in the folder containing the project. Use the sorted file instead of the original in future runs, because sorting can account for a substantial part of the running time.
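To see ahead of time whether a file is likely to be re-sorted, a check along the following lines can help. This is only a sketch and not the FLUX SIMULATOR's own code: the exact sort order it expects is not documented here, so the function assumes that "well sorted" means each chromosome forms one contiguous block with non-decreasing start coordinates. The function names `is_sorted_gtf` and `sorted_copy_name` are hypothetical.

```python
import os


def sorted_copy_name(path):
    """Insert "_sorted" before the file extension of a path."""
    stem, ext = os.path.splitext(path)
    return stem + "_sorted" + ext


def is_sorted_gtf(lines):
    """Return True if the GTF lines look sorted.

    Assumption (not taken from the FLUX SIMULATOR): features of one
    chromosome are contiguous and ordered by non-decreasing start.
    """
    seen_chroms = set()
    prev_chrom, prev_start = None, -1
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        chrom, start = fields[0], int(fields[3])
        if chrom != prev_chrom:
            if chrom in seen_chroms:
                return False  # chromosome block was interrupted earlier
            seen_chroms.add(chrom)
        elif start < prev_start:
            return False  # start coordinate went backwards
        prev_chrom, prev_start = chrom, start
    return True


print(sorted_copy_name("ref.gtf"))  # ref_sorted.gtf
```

For a real file, the lines can be passed directly from an open file handle, e.g. `is_sorted_gtf(open(path))`.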

Once the annotation has been read and parsed, several statistics are displayed, including a histogram of spliced transcript lengths (upper panel) and a zoom-in on the first three quartiles (lower panel). This step also initializes the PRO file (PRO_FILE_NAME) by writing its first four columns: splice locus ID, transcript ID, CDS/NC, and spliced transcript length.
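The four columns can be illustrated with a small sketch that derives them from GTF lines. It is not the FLUX SIMULATOR's implementation. The spliced length is taken as the sum of the exon lengths (GTF coordinates are 1-based and inclusive, so an exon spans end - start + 1 bases). As a simplifying assumption, the `gene_id` attribute stands in for the splice locus ID, whose exact format is not described here. The function name `pro_rows` and its parameters are hypothetical; the two flags mimic LOAD_CODING and LOAD_NONCODING.

```python
import re

# Matches GTF attributes of the form: key "value"
ATTR = re.compile(r'(\S+) "([^"]*)"')


def pro_rows(gtf_lines, load_coding=True, load_noncoding=True):
    """Return (locus, transcript, CDS/NC, spliced length) per transcript."""
    transcripts = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        f = line.rstrip("\n").split("\t")
        feature, start, end = f[2], int(f[3]), int(f[4])
        attrs = dict(ATTR.findall(f[8]))
        tid = attrs.get("transcript_id")
        if tid is None:
            continue  # e.g. gene-level features without a transcript
        rec = transcripts.setdefault(
            tid,
            # Assumption: gene_id is used as a stand-in for the locus ID.
            {"locus": attrs.get("gene_id", "."), "length": 0, "cds": False},
        )
        if feature == "exon":
            rec["length"] += end - start + 1  # 1-based, inclusive
        elif feature == "CDS":
            rec["cds"] = True

    rows = []
    for tid, rec in transcripts.items():
        if rec["length"] == 0:
            continue  # transcripts must have "exon" features
        if rec["cds"] and not load_coding:
            continue
        if not rec["cds"] and not load_noncoding:
            continue
        rows.append((rec["locus"], tid, "CDS" if rec["cds"] else "NC", rec["length"]))
    return rows
```

Joining each tuple with tabs gives one line per transcript in the same column order as listed above.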
