Clean Module

The clean module provides functions for cleaning and preparing genome assemblies for annotation.

Key Functions

clean

The main entry point of the module. It takes a genome FASTA file as input and writes a cleaned FASTA file as output.
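
The overall flow (read a FASTA file, tidy the records, write a new FASTA file) can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the names read_fasta, write_fasta and clean_sketch, the min_length parameter, and the choice of steps (keep the first header token, drop short contigs, sort longest first) are all assumptions made for the example.

```python
def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def write_fasta(records, path, width=60):
    """Write (header, sequence) tuples as FASTA, wrapping sequence lines."""
    with open(path, "w") as handle:
        for header, seq in records:
            handle.write(f">{header}\n")
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def clean_sketch(in_path, out_path, min_length=0):
    """Hypothetical pipeline: tidy headers, filter by length, sort by size."""
    records = read_fasta(in_path)
    # Assumption: keep only the first whitespace-delimited token of each header.
    records = [(h.split()[0] if h.split() else h, s) for h, s in records]
    # Assumption: contigs shorter than min_length are dropped.
    records = [(h, s) for h, s in records if len(s) >= min_length]
    # Assumption: longest contigs come first.
    records.sort(key=lambda rec: len(rec[1]), reverse=True)
    write_fasta(records, out_path)
    return len(records)
```

Calling clean_sketch("genome.fa", "genome.cleaned.fa", min_length=500) would, under these assumptions, return the number of contigs written to the output file.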

check_inputs

Checks that the input files exist and are valid.
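
As a rough sketch of what such a check might involve: the function below verifies that a path points to an existing, non-empty file whose first non-blank line looks like a FASTA header. The name check_fasta_input and the specific validity rules are assumptions for illustration; the module's own checks may differ.

```python
import os


def check_fasta_input(path):
    """Hypothetical validation: raise if the file is missing or not FASTA-like."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Input file not found: {path}")
    if os.path.getsize(path) == 0:
        raise ValueError(f"Input file is empty: {path}")
    with open(path) as handle:
        for line in handle:
            if line.strip():
                # Assumption: a valid FASTA file starts with a '>' header line.
                if not line.startswith(">"):
                    raise ValueError(f"Not a FASTA file (no '>' header): {path}")
                return True
    raise ValueError(f"Input file contains only blank lines: {path}")
```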

clean_header

Cleans FASTA headers by removing unwanted characters and optionally slicing at a specified character.
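
One way to picture this step is shown below. The sketch assumes that "slicing" means keeping the text before the first occurrence of the given character, and that "unwanted" means anything other than letters, digits, underscore, period and hyphen. Both are assumptions, as are the name clean_header_sketch and the split_char parameter.

```python
import re


def clean_header_sketch(header, split_char=None):
    """Hypothetical header cleanup; header is given without the leading '>'."""
    if split_char:
        # Assumption: keep only the part before the first split_char.
        header = header.split(split_char, 1)[0]
    # Assumption: drop every character outside a conservative safe set.
    return re.sub(r"[^A-Za-z0-9_.-]", "", header)


print(clean_header_sketch("contig_1 length=500|cov=3", split_char=" "))
# prints: contig_1
print(clean_header_sketch("contig_1 length=500|cov=3"))
# prints: contig_1length500cov3
```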

filter_contigs

Filters contigs by minimum length.
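
A length filter of this kind can be sketched in a few lines. The example works on in-memory (header, sequence) tuples and treats the threshold as inclusive; the name filter_by_length, the record layout and the inclusive comparison are assumptions, not the module's documented behavior.

```python
def filter_by_length(records, min_length):
    """Keep records whose sequence has at least min_length bases (assumed inclusive)."""
    return [(header, seq) for header, seq in records if len(seq) >= min_length]


contigs = [("c1", "ACGTACGT"), ("c2", "ACG"), ("c3", "ACGTA")]
print(filter_by_length(contigs, 5))
# prints: [('c1', 'ACGTACGT'), ('c3', 'ACGTA')]
```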

sort_contigs

Sorts contigs by size or name.
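
The two sort modes might look like the sketch below. It assumes that sorting by size means longest first and that sorting by name means plain lexicographic order on the header; the function name sort_records and the by parameter are hypothetical.

```python
def sort_records(records, by="size"):
    """Hypothetical sort of (header, sequence) tuples by size or by name."""
    if by == "size":
        # Assumption: longest contig first; ties keep their original order.
        return sorted(records, key=lambda rec: len(rec[1]), reverse=True)
    if by == "name":
        # Assumption: plain lexicographic order on the header text.
        return sorted(records, key=lambda rec: rec[0])
    raise ValueError(f"Unknown sort key: {by!r}")


contigs = [("b", "ACG"), ("a", "ACGTACGT"), ("c", "ACGTA")]
print([name for name, _ in sort_records(contigs, by="size")])
# prints: ['a', 'c', 'b']
print([name for name, _ in sort_records(contigs, by="name")])
# prints: ['a', 'b', 'c']
```

Note that lexicographic ordering places "contig10" before "contig2"; a natural-sort key would be needed if numeric order matters.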