Usage

Input files

GTF files:
    Transcript assemblies in GTF format, generated by StringTie from short-read RNA-seq data.
BED12 file:
    BED12 file generated by flair correct. In brief, long-read RNA-seq reads are aligned to the genome with minimap2, the sorted BAM is converted to BED12, and splice sites are then corrected by flair correct using the genome annotation and short-read splice junctions.
TSS signal:
    Counts of uniquely mapped CAGE and/or RAMPAGE reads supporting each transcription start site (TSS), in TSV format (at least 5 fields: chrom, chromStart, chromEnd, strand, score).
TE reference bed:
    TE annotation in BED format (at least 6 fields: chrom, chromStart, chromEnd, ID, name, strand).
eRNA reference bed:
    eRNA annotation in BED format (at least 6 fields: chrom, chromStart, chromEnd, ID, name, strand).
Reference GTF:
    Reference GTF file for gene annotations.
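
The field layouts above can be checked before running the pipeline. The following is a minimal sketch, not part of TEIRI: the function and class names are hypothetical, and it assumes tab-separated input, integer coordinates, and a numeric score.

```python
from typing import NamedTuple


class TssSignal(NamedTuple):
    chrom: str
    start: int
    end: int
    strand: str
    score: float


class AnnotationRecord(NamedTuple):
    chrom: str
    start: int
    end: int
    record_id: str
    name: str
    strand: str


def split_fields(line: str, minimum: int) -> list:
    """Split a tab-separated line and check the minimum field count."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < minimum:
        raise ValueError(f"expected at least {minimum} fields, got {len(fields)}")
    return fields


def parse_tss_signal(line: str) -> TssSignal:
    """TSS signal row: chrom, chromStart, chromEnd, strand, score."""
    chrom, start, end, strand, score = split_fields(line, 5)[:5]
    return TssSignal(chrom, int(start), int(end), strand, float(score))


def parse_annotation(line: str) -> AnnotationRecord:
    """TE or eRNA row: chrom, chromStart, chromEnd, ID, name, strand."""
    chrom, start, end, record_id, name, strand = split_fields(line, 6)[:6]
    return AnnotationRecord(chrom, int(start), int(end), record_id, name, strand)
```

Note that the strand is the fourth field in the TSS signal file but the sixth field in the TE and eRNA annotation files, so the two parsers are not interchangeable.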

TEIRI_correct.py

TEIRI_correct.py extracts and filters first exons, then corrects the TSS of each first exon using supporting reads from long-read RNA-seq, CAGE, and RAMPAGE.
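
To illustrate what a "first exon" means for a long read stored in BED12, here is a minimal sketch based only on the standard BED12 layout (blockSizes in column 11, blockStarts in column 12). It is not TEIRI's implementation, and the function name is hypothetical. The first exon is the leftmost block on the plus strand and the rightmost block on the minus strand.

```python
def first_exon_from_bed12(line: str) -> tuple:
    """Return (chrom, exon_start, exon_end, strand) of the 5'-most block.

    Coordinates are 0-based, half-open, as in BED.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError(f"expected 12 BED fields, got {len(fields)}")

    chrom = fields[0]
    chrom_start = int(fields[1])
    strand = fields[5]
    if strand not in ("+", "-"):
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")

    # Block lists may carry a trailing comma, so skip empty items.
    block_sizes = [int(x) for x in fields[10].split(",") if x]
    block_starts = [int(x) for x in fields[11].split(",") if x]
    if not block_sizes or len(block_sizes) != len(block_starts):
        raise ValueError("blockSizes and blockStarts must be non-empty and equal in length")

    # Blocks are ordered by genomic position, so the 5' exon is the
    # first block on '+' and the last block on '-'.
    index = 0 if strand == "+" else len(block_sizes) - 1
    exon_start = chrom_start + block_starts[index]
    exon_end = exon_start + block_sizes[index]
    return chrom, exon_start, exon_end, strand
```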

usage: TEIRI_correct.py [-h] [-i GTF_LIST] [-r REFERENCE_GTF] [-l CORRECTED_BED12] [-c TSS_SCORE] [--TE_anno TE_ANNO] [--eRNA_anno ERNA_ANNO]
                        [--max_exon1_length MAX_EXON1_LENGTH] [--min_exon1_length MIN_EXON1_LENGTH] [--tss_window TSS_WINDOW] [--max_tss MAX_TSS]
                        [--min_NGS_ratio MIN_NGS_RATIO] [--min_TGS_reads MIN_TGS_READS] [--threshold THRESHOLD] [--TGS_threshold TGS_THRESHOLD]
                        [--SE_exclude SE_EXCLUDE] [--min_NGS_ratio_SE MIN_NGS_RATIO_SE] [--min_TGS_reads_SE MIN_TGS_READS_SE] [--SE_threshold SE_THRESHOLD]
                        [-p PREFIX]
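
As a sketch of how the required inputs map onto the options above, the snippet below assembles a command line without running it. All file names and the prefix are hypothetical placeholders, and the expected value format of each option (for example, whether GTF_LIST is a list file) is an assumption to confirm with TEIRI_correct.py -h.

```python
import shlex

# Hypothetical input paths; replace with your own files.
options = {
    "-i": "gtf_list.txt",             # GTF_LIST (assumed to be a list file)
    "-r": "reference.gtf",            # REFERENCE_GTF
    "-l": "long_reads_corrected.bed", # CORRECTED_BED12 from flair correct
    "-c": "tss_signal.tsv",           # TSS_SCORE from CAGE and/or RAMPAGE
    "--TE_anno": "te_reference.bed",
    "--eRNA_anno": "erna_reference.bed",
    "-p": "sample1",                  # output PREFIX
}


def build_command(script: str, opts: dict) -> list:
    """Return an argv list: interpreter, script, then flag/value pairs."""
    argv = ["python", script]
    for flag, value in opts.items():
        argv.extend([flag, str(value)])
    return argv


command = build_command("TEIRI_correct.py", options)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run once the paths point to real files.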

TEIRI_merge.py

TEIRI_merge.py merges transcripts using the corrected TSSs and is intended for combining multiple samples from a single condition.

usage: TEIRI_merge.py [-h] [-i GTF_LIST] [-r REFERENCE_GTF] [-l CORRECTED_BED12] [--corrected_tss CORRECTED_TSS]
                      [--corrected_tss_single CORRECTED_TSS_SINGLE] [--TGS_weight TGS_WEIGHT] [--illumina_threshold ILLUMINA_THRESHOLD]
                      [--nanopore_threshold NANOPORE_THRESHOLD] [--max_transcripts MAX_TRANSCRIPTS] [--trunctated_exclude TRUNCTATED_EXCLUDE]
                      [--min_transcript_length MIN_TRANSCRIPT_LENGTH] [-p PREFIX]

TEIRI_consolidate.py

TEIRI_consolidate.py consolidates TE-initiated RNAs across multiple conditions (for example, different tissues).

usage: TEIRI_consolidate.py [-h] [-i GTF_LIST] [-r REFERENCE_GTF] [--TE_anno TE_ANNO] [--tss_merge_distance TSS_MERGE_DISTANCE]
                            [--min_exon_length MIN_EXON_LENGTH] [-p PREFIX]