Usage¶
Exonize requires two positional arguments that can be followed by optional arguments.
exonize <gff_file_path> <genome_file_path> [OPTIONS]
Required Arguments¶
<gff_file_path>
: Path to the genome annotation file. This file should be in GFF3 or GFF format.<genome_file_path>
: Path to the genome sequence file. The file should be in FASTA format. A.zip
version is also accepted.
Optional Arguments¶
[-gfeat]
: Specifies the gene feature in the annotations file. Default:gene
.[-cdsfeat]
: Specifies the coding sequence feature in the annotations file. Default:CDS
.[-transfeat]
: Specifies the transcript feature in the genome annotations. Default:transcript
.[-sb]
: Annotation coordinates base of input annotations, either0
or1
. Default:1
.[-fb]
: Frame base of input annotations, either0
or1
. Default:0
.[-l]
: Minimum exon length in bases required for the search. Default:30
.[-e]
: E-value threshold (local BLAST search param). Default:1e-3
.[-ts]
: Self-hit overlap threshold. Default:0.5
.[-te]
: Minimum query coverage cut-off. Default:0.9
.[-ce]
: Overlap threshold for constructing the set of representative exons. Default:0.9
.[-ct]
: Overlap threshold for target coordinate reconciliation (local BLAST search param). Default:0.9
.[-tp]
: Minimum length coverage between the query and target (global MUSCL search param). Default:0.9
.[-ta]
: Threshold for the fraction of aligned positions (global MUSCL search param). Default:0.9
.[-ti]
: Peptide identity threshold (global MUSCL search param). Default:0.4
.[-op]
: Search identifier. Default: The stem of the annotations file.[-cn]
: Number of CPUs to use. Default: The available CPU count.[-odp]
: Path to the output directory. Default: The current directory.[--global-search]
: Enables only the global search mode.[--local-search]
: Enables only the local search mode.[--debug]
: Enables debug mode, saving input and output files for the local search.[--soft-force]
: Overwrites the results database if it already exists.[--hard-force]
: Overwrites all internal files if they already exist.[--csv]
: Outputs a.zip
file with a reduced set of results in CSV format.[-v, --version]
: Shows the program's version number and exits.[-h, --help]
: Shows this help message and exits.
NOTE: If neither flag
--global-search
or--local-search
is specified, both searches will run by default.
Example: Human Y chromosome¶
The following steps demonstrate how to run exonize
on a test dataset.
Download the test data
- If you have installed the package from the repo, move into the
test_human_chrom_Y
directory and download the test data:
cd test_human_chrom_Y
source fetch_data.sh
- If you installed the package via
pip
, you can download the script here:fetch_data.sh
.
Run exonize with default parameters:
exonize Homo_sapiens.GRCh38.111.chromosome.Y.gff3 \
Homo_sapiens.GRCh38.dna.chromosome.Y.fa.gz \
--output_prefix Homo_sapiens_chrom_Y \
--csv
This command will enable the global and local searches.