Usage
TERA consists of three sub-commands: teRNA detection (TERA detect), annotation (TERA anno), and quantification (TERA quant).
Input files
RNA sequencing files: paired-end reads in FASTQ format (fastq or fastq.gz).
TE reference: TE annotation in both BED format (8 fields: chrom, chromStart, chromEnd, ID, name, strand, family, class) and GTF format (use script/TEbedtogtf.R to produce the GTF file).
Reference genome: the reference genome sequence (FASTA) and annotation (GTF) are required. You can optionally build the genome index yourself before running TERA.
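As a quick sanity check before running TERA, the 8-field TE BED layout described above can be validated with a few lines of Python. This is only a sketch: it assumes the file is tab-separated with no header line, and the function name `parse_te_bed_line` and the sample record values (`TE_0001`, `AluY`, and so on) are made up for illustration and are not part of TERA.

```python
from typing import NamedTuple


class TERecord(NamedTuple):
    """One line of the 8-field TE BED file (field order taken from the docs)."""
    chrom: str
    chrom_start: int
    chrom_end: int
    te_id: str
    name: str
    strand: str
    family: str
    te_class: str


def parse_te_bed_line(line: str) -> TERecord:
    """Parse one TE BED line; assumes tab-separated fields (an assumption)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 8:
        raise ValueError(f"expected 8 tab-separated fields, got {len(fields)}")
    chrom, start, end, te_id, name, strand, family, te_class = fields
    start_i, end_i = int(start), int(end)  # raises ValueError if not integers
    if start_i < 0 or end_i < start_i:
        raise ValueError(f"invalid interval: {start_i}-{end_i}")
    if strand not in {"+", "-", "."}:
        raise ValueError(f"invalid strand: {strand!r}")
    return TERecord(chrom, start_i, end_i, te_id, name, strand, family, te_class)


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    example = "chr1\t100\t400\tTE_0001\tAluY\t+\tAlu\tSINE\n"
    print(parse_te_bed_line(example))
```

Running such a check over every line of the BED file catches missing columns early, before a long alignment run starts.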
TERA
usage: TERA [-h] {detect,anno,quant} ...
TERA: pipeline for Transposable Element-derived RNA Analysis
positional arguments:
{detect,anno,quant} sub-command help
detect detect help
anno anno help
quant quant help
optional arguments:
-h, --help show this help message and exit
TERA detect
TERA detect is designed for teRNA detection.
usage: tera.py detect [-h] -fq1 FASTQ1 -fq2 FASTQ2 --TE_bed TE_BED --TE_gtf
TE_GTF -r REF_GENOME -a ANNOTATION [-s {RF,FR}]
[-o OUTPUT_DIR] [-p PREFIX] [-m {1,2}] [-S STAR_INDEX]
[-G GMAP_INDEX] [-g GMAP_INDEX_NAME] [-t NTHREAD]
[--genomeSAindexNbases GENOMESAINDEXNBASES]
[--nthreadsort NTHREADSORT] [--nRAMsort NRAMSORT]
[--nRAMassem NRAMASSEM] [--max_intron MAX_INTRON]
[--min_identity MIN_IDENTITY]
[--min_coverage MIN_COVERAGE]
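The usage text above can be turned into a concrete command line. The sketch below assembles the argument list for `detect` in Python, using only flags that appear in the usage text; the unbracketed ones (`-fq1`, `-fq2`, `--TE_bed`, `--TE_gtf`, `-r`, `-a`) are treated as required. The helper name `build_detect_cmd`, the executable name `TERA`, and all file names are assumptions for illustration; adjust them to your installation.

```python
import shlex
from typing import List, Optional


def build_detect_cmd(
    fastq1: str,
    fastq2: str,
    te_bed: str,
    te_gtf: str,
    ref_genome: str,
    annotation: str,
    output_dir: Optional[str] = None,
    prefix: Optional[str] = None,
    nthread: Optional[int] = None,
    program: str = "TERA",  # assumed executable name; may be tera.py on your system
) -> List[str]:
    """Return an argv list for `TERA detect`, ready for subprocess.run."""
    cmd = [
        program, "detect",
        "-fq1", fastq1,
        "-fq2", fastq2,
        "--TE_bed", te_bed,
        "--TE_gtf", te_gtf,
        "-r", ref_genome,
        "-a", annotation,
    ]
    # Optional flags are appended only when a value is supplied.
    if output_dir is not None:
        cmd += ["-o", output_dir]
    if prefix is not None:
        cmd += ["-p", prefix]
    if nthread is not None:
        cmd += ["-t", str(nthread)]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_detect_cmd(
        "sample_1.fastq.gz", "sample_2.fastq.gz",
        "TE.bed", "TE.gtf", "genome.fa", "genes.gtf",
        output_dir="out", prefix="sample", nthread=8,
    )
    print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

Building the command as a list rather than a single string avoids shell-quoting problems when paths contain spaces; the list can be passed directly to `subprocess.run(cmd, check=True)`.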
Main outputs:
${prefix}.gtf (TE transcripts)
${prefix}_TE_exon.bed (TE exons): 11 columns: chromosome, start, end, transcript_id, gene_id, strand, exon_type, TE_ID, TE_name, TE_family, TE_class
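The 11-column TE exon file lends itself to simple downstream summaries. The following sketch reads such a file into dictionaries keyed by the column names listed above and counts TE exons per TE family. It assumes the file is tab-separated with no header line, which is not stated in this documentation; the function names and the sample rows (including the `exon_type` values `typeA` and `typeB`) are invented for illustration.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, Iterator, TextIO

# Column order as listed in the documentation for ${prefix}_TE_exon.bed.
TE_EXON_COLUMNS = [
    "chromosome", "start", "end", "transcript_id", "gene_id", "strand",
    "exon_type", "TE_ID", "TE_name", "TE_family", "TE_class",
]


def read_te_exons(handle: TextIO) -> Iterator[Dict[str, object]]:
    """Yield one dict per TE exon; assumes tab-separated rows without a header."""
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:  # skip blank lines
            continue
        if len(fields) != len(TE_EXON_COLUMNS):
            raise ValueError(f"expected 11 columns, got {len(fields)}")
        row: Dict[str, object] = dict(zip(TE_EXON_COLUMNS, fields))
        row["start"] = int(fields[1])
        row["end"] = int(fields[2])
        yield row


def count_by_family(rows: Iterable[Dict[str, object]]) -> Counter:
    """Count TE exons per TE_family value."""
    return Counter(row["TE_family"] for row in rows)


if __name__ == "__main__":
    # Invented rows, for illustration only.
    sample = (
        "chr1\t100\t200\ttx1\tg1\t+\ttypeA\tTE_1\tAluY\tAlu\tSINE\n"
        "chr1\t500\t650\ttx1\tg1\t+\ttypeB\tTE_2\tAluSx\tAlu\tSINE\n"
        "chr2\t900\t1500\ttx2\tg2\t-\ttypeA\tTE_3\tL1PA2\tL1\tLINE\n"
    )
    print(count_by_family(read_te_exons(io.StringIO(sample))))
```

For a real run, replace the `io.StringIO` object with `open("sample_TE_exon.bed")`, where the file name follows the `${prefix}_TE_exon.bed` pattern.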
TERA anno
TERA anno is designed for teRNA annotation.
usage: tera.py anno [-h] [-i INPUT] [--TE_bed TE_BED] [-o OUTPUT_DIR]
[-p PREFIX] [-a ANNOTATION] [--TE_exon TE_EXON]
[-d EXON_DIFF]
Main outputs:
${prefix}_TE_exon_anno.bed (TE exons)
${prefix}.TE.exon.anno.txt (annotation for TE exons)
${prefix}.TE.unit.anno.txt (annotation for TE units)
TERA quant
TERA quant is designed for teRNA quantification.
usage: tera.py quant [-h] [-fq1 FASTQ1] [-fq2 FASTQ2] [--TE_gtf TE_GTF]
[-l LEVEL] [-s STRANDED_TYPE] [-o OUTPUT_DIR] [-p PREFIX]
[-r REF_GENOME] [-a ANNOTATION] [--TE_exon TE_EXON]
[-q QUANT] [-i INDEX] [-t NTHREAD]
Main outputs:
${prefix}.transcript.quant.out: all transcripts, including TE and non-TE (transcript id, length, eff length, count, TPM)
${prefix}.TE.exon.quant.out: TE exon cluster quantification (exon cluster, count, TPM)
${prefix}.TE.unit.quant.out: TE unit quantification (TE ID, count)
${prefix}.TE.family.quant.out: TE family quantification (family ID, count)
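To illustrate how the transcript-level output might be consumed, the sketch below loads the five documented columns (transcript id, length, eff length, count, TPM) and lists the most highly expressed transcripts. The on-disk layout is an assumption: the code expects tab-separated fields and skips any line whose numeric columns do not parse, so a header line, if present, is ignored. The function names and sample values are hypothetical.

```python
import io
from typing import Dict, List, TextIO, Union

Record = Dict[str, Union[str, float]]

# Column order as listed in the documentation for ${prefix}.transcript.quant.out.
QUANT_COLUMNS = ("transcript_id", "length", "eff_length", "count", "TPM")


def read_transcript_quant(handle: TextIO) -> List[Record]:
    """Read quantification rows; assumes tab-separated fields (an assumption)."""
    records: List[Record] = []
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(QUANT_COLUMNS):
            continue  # blank or malformed line
        try:
            numbers = [float(value) for value in fields[1:]]
        except ValueError:
            continue  # non-numeric columns, most likely a header line
        record: Record = {"transcript_id": fields[0]}
        record.update(zip(QUANT_COLUMNS[1:], numbers))
        records.append(record)
    return records


def top_by_tpm(records: List[Record], n: int = 10) -> List[Record]:
    """Return the n records with the highest TPM, highest first."""
    return sorted(records, key=lambda r: r["TPM"], reverse=True)[:n]


if __name__ == "__main__":
    # Invented values, for illustration only.
    sample = (
        "transcript_id\tlength\teff_length\tcount\tTPM\n"
        "tx1\t1000\t800\t50\t12.5\n"
        "tx2\t500\t300\t200\t80.0\n"
        "tx3\t2000\t1800\t10\t1.5\n"
    )
    for rec in top_by_tpm(read_transcript_quant(io.StringIO(sample)), n=2):
        print(rec["transcript_id"], rec["TPM"])
```

The same reader pattern applies to the TE exon, TE unit, and TE family outputs once the column tuple is adjusted to the fields listed for each file.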