Frequently Asked Questions

General Questions

What is Funannotate2?

Funannotate2 is a eukaryotic genome annotation pipeline that provides a complete workflow for annotating fungal, plant, and other eukaryotic genomes. It integrates multiple tools and databases to produce gene predictions and functional annotations.

What types of genomes can Funannotate2 annotate?

Funannotate2 is designed primarily for fungal genomes, but it can also annotate other eukaryotic genomes, such as those of plants and insects. Note that the default parameters and databases are optimized for fungal genomes.

What are the system requirements for Funannotate2?

Funannotate2 requires:

Linux or macOS operating system
Python 3.7 or later
At least 8 GB of RAM (16 GB or more recommended for larger genomes)
At least 50 GB of free disk space
Multiple CPU cores (8 or more recommended for faster processing)
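
The requirements above can be checked before installing with a short script. This is a minimal sketch using only the Python standard library; the function names are illustrative and are not part of Funannotate2, and the thresholds are simply the numbers listed above. The RAM check relies on os.sysconf, which may not be available on every platform, so it reports None when the value cannot be read. Machines sold as "8 GB" can report slightly less, so treat the RAM result as approximate.

```python
import os
import shutil
import sys

# Thresholds copied from the requirements listed in this FAQ.
MIN_PYTHON = (3, 7)
MIN_RAM_GB = 8
MIN_DISK_GB = 50
RECOMMENDED_CORES = 8


def total_ram_gb():
    """Return total physical memory in GB, or None if it cannot be determined."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (ValueError, OSError, AttributeError):
        return None
    if pages <= 0 or page_size <= 0:
        return None
    return pages * page_size / 1024**3


def check_system(path="."):
    """Compare this machine against the listed requirements.

    `path` is the directory whose filesystem should hold the working files.
    """
    ram = total_ram_gb()
    free_gb = shutil.disk_usage(path).free / 1024**3
    cores = os.cpu_count() or 1
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "ram_ok": None if ram is None else ram >= MIN_RAM_GB,
        "disk_ok": free_gb >= MIN_DISK_GB,
        "cores_recommended": cores >= RECOMMENDED_CORES,
    }


if __name__ == "__main__":
    for name, ok in check_system().items():
        print(f"{name}: {ok}")
```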

Installation Questions

How do I install Funannotate2?

See the Installation page for detailed instructions.

Why can’t I install GeneMark-ES through conda?

GeneMark-ES is not available through conda due to licensing restrictions. You need to register and download it manually from the GeneMark website.

Usage Questions

What is the recommended workflow for annotating a genome?

The recommended workflow is:

Clean the genome assembly using funannotate2 clean
Train ab initio prediction tools using funannotate2 train
Predict genes using funannotate2 predict
Functionally annotate the predicted genes using funannotate2 annotate

See the Tutorial for a detailed example.
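
The four steps above could be chained from a small driver script. The sketch below only fixes the order of the subcommands named in this FAQ; it does not know any Funannotate2 options, so the arguments for each step are supplied by you (taken from each subcommand's own help text or the Tutorial). The helper names build_commands and run_workflow are hypothetical and not part of Funannotate2.

```python
import shutil
import subprocess

# Subcommands in the order recommended in this FAQ.
STEPS = ("clean", "train", "predict", "annotate")


def build_commands(step_args, executable="funannotate2"):
    """Return one argument list per requested step, in workflow order.

    `step_args` maps a subcommand name to the list of arguments to pass to it.
    The arguments themselves are not validated here.
    """
    unknown = sorted(set(step_args) - set(STEPS))
    if unknown:
        raise ValueError(f"unknown steps: {unknown}")
    return [[executable, step, *step_args[step]] for step in STEPS if step in step_args]


def run_workflow(step_args, executable="funannotate2", dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    commands = build_commands(step_args, executable)
    if not dry_run and shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable} not found on PATH")
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the workflow at the first failing step.
            subprocess.run(argv, check=True)
    return commands
```

With the default dry_run=True the script only prints the commands, which makes it easy to review them before anything is executed.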

How can I improve gene prediction accuracy?

To improve gene prediction accuracy:

Use high-quality protein evidence from closely related species
Use transcript evidence from RNA-seq data
Use a species-specific Augustus model
Use the appropriate GeneMark mode (ES for self-training, ET for transcript-guided training)
Use a BUSCO database appropriate for your organism

What databases does Funannotate2 use for functional annotation?

Funannotate2 uses the following databases for functional annotation:

Pfam: Protein domain annotations
dbCAN: Carbohydrate-active enzyme annotations
MEROPS: Peptidase annotations
SwissProt: Protein annotations
BUSCO: Benchmarking Universal Single-Copy Orthologs

How can I add custom functional annotations?

You can add custom functional annotations by:

Creating a custom database in the appropriate format (FASTA, HMM, etc.)
Using the appropriate search tool (BLAST, HMMER, etc.) to search your proteins against the custom database
Parsing the search results and adding the annotations to the gene models
Using the Funannotate2 API to integrate the custom annotations into the annotation pipeline
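
As an illustration of the parsing step, the sketch below reads BLAST tabular output (the standard 12-column format produced by -outfmt 6) and keeps the best-scoring hit per protein. It assumes you searched with BLAST and chose that output format; the e-value cutoff of 1e-5 is an arbitrary example value. How the resulting table is passed to Funannotate2 is not shown, because that depends on the Funannotate2 API, which this example does not assume anything about.

```python
import csv

# Column names of BLAST's default tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle, max_evalue=1e-5):
    """Return {query_id: (subject_id, evalue, bitscore)} for the top hit per query.

    Hits with an e-value above `max_evalue` are ignored; among the rest,
    the hit with the highest bit score wins.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST6_COLUMNS, row))
        evalue = float(hit["evalue"])
        bitscore = float(hit["bitscore"])
        if evalue > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or bitscore > current[2]:
            best[hit["qseqid"]] = (hit["sseqid"], evalue, bitscore)
    return best


if __name__ == "__main__":
    import sys

    # Usage: python best_hits.py results.tsv
    with open(sys.argv[1], newline="") as fh:
        for query, (subject, evalue, bitscore) in sorted(best_hits(fh).items()):
            print(query, subject, evalue, bitscore, sep="\t")
```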

Troubleshooting

Why does GeneMark-ES fail on my genome?

GeneMark-ES may fail for several reasons:

The genome assembly is too fragmented (try filtering out short contigs)
The genome assembly contains too many Ns (try cleaning the genome)
The genome is not from a eukaryotic organism (GeneMark-ES is designed for eukaryotes)
GeneMark-ES is not installed correctly (check the installation)
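
To judge whether fragmentation or N content is a likely cause, it can help to look at basic assembly statistics first. The following is a minimal, standard-library sketch that reports contig count, total length, N50, and the fraction of N bases, and that filters out contigs below a chosen length. The function names and the minimum length are illustrative choices, not Funannotate2 or GeneMark settings.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def assembly_stats(records):
    """Summarize a list of (header, sequence) pairs."""
    records = list(records)
    lengths = sorted((len(seq) for _, seq in records), reverse=True)
    total = sum(lengths)
    n_count = sum(seq.upper().count("N") for _, seq in records)
    # N50: length of the contig at which the running sum of lengths
    # (longest first) reaches half of the total assembly length.
    running, n50 = 0, 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "contigs": len(lengths),
        "total_bp": total,
        "n50": n50,
        "n_fraction": n_count / total if total else 0.0,
    }


def filter_short(records, min_length):
    """Keep only contigs that are at least `min_length` bases long."""
    return [(header, seq) for header, seq in records if len(seq) >= min_length]


if __name__ == "__main__":
    import sys

    # Usage: python assembly_stats.py genome.fasta
    with open(sys.argv[1]) as fh:
        print(assembly_stats(read_fasta(fh)))
```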

Why does Augustus fail on my genome?

Augustus may fail for several reasons:

The species model does not exist (try using a different species model)
The species model is not appropriate for your organism (try using a more closely related species)
Augustus is not installed correctly (check the installation)
The genome assembly is too fragmented (try filtering out short contigs)
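
A quick way to rule out the first cause is to list the species models that Augustus can see. The sketch below assumes a standard Augustus installation, where models live in a "species" subdirectory of the directory named by the AUGUSTUS_CONFIG_PATH environment variable. Whether Funannotate2 looks up its models in the same place is an assumption here, and the helper names are hypothetical.

```python
import os
from pathlib import Path


def augustus_species_dir(config_path=None):
    """Return the species directory of an Augustus installation."""
    config_path = config_path or os.environ.get("AUGUSTUS_CONFIG_PATH")
    if not config_path:
        raise RuntimeError("AUGUSTUS_CONFIG_PATH is not set")
    return Path(config_path) / "species"


def list_species(config_path=None):
    """Return the names of all species models found, sorted alphabetically."""
    species_dir = augustus_species_dir(config_path)
    if not species_dir.is_dir():
        return []
    return sorted(p.name for p in species_dir.iterdir() if p.is_dir())


def species_exists(name, config_path=None):
    """Check whether a species model directory with this name exists."""
    return name in list_species(config_path)


if __name__ == "__main__":
    import sys

    # Usage: python check_species.py <species_name>
    wanted = sys.argv[1]
    print(f"{wanted}: {'found' if species_exists(wanted) else 'not found'}")
```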

Why are some of my gene models incomplete?

Gene models may be incomplete for several reasons:

The genome assembly is fragmented, and genes span contig boundaries
The gene prediction tools failed to identify the complete gene structure
The gene is genuinely partial (e.g., pseudogene)

Adding protein and transcript evidence can help the prediction tools recover complete gene structures.
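
To see which gene models are affected, the coding sequences can be screened for missing start or stop codons. This is a simplified sketch that assumes each sequence is a spliced CDS on the coding strand and that the organism uses the standard genetic code (ATG start; TAA, TAG, or TGA stop). The function name is illustrative and the check is independent of Funannotate2's own reporting.

```python
# Stop codons of the standard genetic code (an assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_cds(seq):
    """Return a list of reasons why a CDS looks incomplete (empty if none)."""
    seq = seq.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of 3")
    if not seq.startswith("ATG"):
        problems.append("missing start codon")
    if seq[-3:] not in STOP_CODONS:
        problems.append("missing stop codon")
    return problems


if __name__ == "__main__":
    examples = {
        "complete": "ATGAAATAA",
        "five_prime_partial": "AAATAA",
        "three_prime_partial": "ATGAAA",
    }
    for name, cds in examples.items():
        print(name, classify_cds(cds) or "looks complete")
```

Models flagged at only one end and located near a contig edge are good candidates for the first cause listed above, a gene spanning a contig boundary.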

How can I report a bug or request a feature?

You can report bugs or request features by opening an issue on the GitHub repository.