UMI extraction
Usage: ucircfull extract_umi [--help] [--version] --input FQ --anchorx SEQ --umi SEQ [--noumi] --outdir DIR --prefix PREFIX --thread INT [--seqkit PATH] [--porechop PATH]
Extract UMI sequence and identify strand from ucircFL-seq raw fastq
Optional arguments:
-h, --help shows help message and exits
-v, --version prints version information and exits
-i, --input FQ ucircFL-seq raw fastq file. [required]
-x, --anchorx SEQ anchor sequence used in 1st strand cDNA synthesis. [required]
-u, --umi SEQ umi pattern. [default: "CTCNNNYRNNNYRNNNYRNNNGAG"]
-n, --noumi no UMIs were added to 1st strand anchor.
-o, --outdir DIR output directory. [default: "."]
-p, --prefix PREFIX output prefix. [default: "circFL"]
-t, --thread INT number of threads used. [default: 4]
--seqkit PATH path to seqkit. [default: "seqkit"]
--porechop PATH path to porechop. [default: "porechop"]
--debug enable debug output.
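The default `--umi` pattern is written with IUPAC nucleotide codes: `N` is any base, `Y` is a pyrimidine (C or T), and `R` is a purine (A or G). The sketch below illustrates what such a pattern matches by translating it into a regular expression and searching a read for it. It is not how ucircfull performs the extraction, which may tolerate mismatches or take strand into account. The names `iupac_to_regex` and `find_umi` are hypothetical, the example read is invented, and only exact matches are reported.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(pattern: str) -> str:
    """Translate an IUPAC pattern into an equivalent regular expression."""
    return "".join(IUPAC[base] for base in pattern.upper())


def find_umi(read: str, pattern: str = "CTCNNNYRNNNYRNNNYRNNNGAG"):
    """Return the first exact match of the pattern in the read, or None.

    Hypothetical helper for illustration only; exact matching, forward strand.
    """
    match = re.search(iupac_to_regex(pattern), read.upper())
    return match.group(0) if match else None


if __name__ == "__main__":
    # Invented read: 4 flanking bases, a 24-base UMI, 4 flanking bases.
    read = "AAAA" + "CTCACGTAGCATGTTACACCGGAG" + "TTTT"
    print(find_umi(read))
```

A custom pattern passed with `-u` can be checked in the same way before a run, for example to confirm that it has the expected length and fixed flanking bases.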
For ucircFL-seq data generated with the default library preparation ($rawfastq), run:
ucircfull extract_umi -i $rawfastq -x CTACACGACGCTCTTCCGATCT -o . -p $sample -t $thread
Output
$sample_strand.fastq
$sample_umi.fasta