Introduction
About taxopy
taxopy is a Python package that provides an interface for accessing NCBI-formatted taxonomic databases. It supports operations on taxonomic data such as obtaining complete lineages, determining lowest common ancestors (LCAs), and retrieving taxon names from taxonomic identifiers, among others.
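To make the terms "lineage" and "lowest common ancestor" concrete, here is a minimal sketch in plain Python. It does not use taxopy and is not taxopy's implementation: the `PARENT` and `NAMES` tables, and the helper names `lineage` and `lowest_common_ancestor`, are hypothetical and hand-written for illustration, with most intermediate ranks omitted. The only structural assumption borrowed from the NCBI format is that each taxon points to one parent and the root (taxonomic identifier 1) is its own parent.

```python
# Toy child -> parent table, hand-written for illustration only.
# A real NCBI taxdump contains many more nodes and intermediate ranks.
PARENT = {
    9606: 9605,      # Homo sapiens -> Homo
    9605: 207598,    # Homo -> Homininae
    9593: 9592,      # Gorilla gorilla -> Gorilla
    9592: 207598,    # Gorilla -> Homininae
    207598: 9604,    # Homininae -> Hominidae
    9604: 1,         # Hominidae -> root (intermediate ranks omitted here)
    1: 1,            # the root is its own parent
}

NAMES = {
    9606: "Homo sapiens",
    9605: "Homo",
    9593: "Gorilla gorilla",
    9592: "Gorilla",
    207598: "Homininae",
    9604: "Hominidae",
    1: "root",
}


def lineage(taxid):
    """Return the taxonomic identifiers from `taxid` up to the root."""
    path = [taxid]
    while PARENT[taxid] != taxid:  # stop once the root is reached
        taxid = PARENT[taxid]
        path.append(taxid)
    return path


def lowest_common_ancestor(taxids):
    """Return the deepest taxon shared by the lineages of all `taxids`."""
    lineages = [lineage(t) for t in taxids]
    shared = set(lineages[0]).intersection(*lineages[1:])
    # Walking up from the first taxon, the first shared node is the LCA.
    return next(t for t in lineages[0] if t in shared)


human_lineage = [NAMES[t] for t in lineage(9606)]
lca_name = NAMES[lowest_common_ancestor([9606, 9593])]
print(human_lineage)
print(lca_name)
```

In this toy table, the lineages of Homo sapiens and Gorilla gorilla first meet at Homininae, so that taxon is reported as their LCA.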
Installation
You can install taxopy with Python's pip (for example, `pip install taxopy`) or uv, or through the pixi, conda, or mamba package managers.
Enabling fuzzy search of taxon names
taxopy supports fuzzy string matching to search for taxa whose names are similar, but not identical, to the query. This feature is disabled by default to avoid pulling in additional dependencies. To enable it, install taxopy with the fuzzy-matching extra using pip or uv (for example, `pip install "taxopy[fuzzy-matching]"`).
Alternatively, you can install the rapidfuzz library alongside taxopy (for example, `pip install taxopy rapidfuzz`).
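As a rough illustration of what fuzzy name matching means, the sketch below finds the closest entry for a misspelled taxon name. It is an assumption-laden stand-in, not taxopy's API: it uses `difflib` from the Python standard library instead of rapidfuzz, and the helper `fuzzy_find`, the `KNOWN_NAMES` list, and the 0.8 similarity cutoff are all hypothetical choices made for this example.

```python
from difflib import get_close_matches

# Hypothetical, hand-picked list of taxon names used only for this example.
KNOWN_NAMES = ["Homininae", "Hominidae", "Gorilla", "Caudovirales"]


def fuzzy_find(query, names, cutoff=0.8, limit=3):
    """Return up to `limit` names similar to `query`, best match first.

    Matching is case-insensitive; `cutoff` is difflib's similarity ratio
    (between 0 and 1) below which candidates are discarded.
    """
    lookup = {name.lower(): name for name in names}
    hits = get_close_matches(query.lower(), list(lookup), n=limit, cutoff=cutoff)
    return [lookup[hit] for hit in hits]


# A misspelled query still resolves to the intended name.
matches = fuzzy_find("Homininea", KNOWN_NAMES)
print(matches)
```

An exact lookup would fail on the misspelled query, whereas the similarity-based search still ranks the intended name first. The scoring rules of rapidfuzz differ from those of `difflib`, so scores and cutoffs are not interchangeable between the two.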
Acknowledgements
Some of the code used to parse taxdump files in taxopy was adapted from CAT/BAT [1], a tool for taxonomic assignment of contigs and metagenome-assembled genomes.
[1] Von Meijenfeldt, F. A. B., Arkhipova, K., Cambuy, D. D., Coutinho, F. H. & Dutilh, B. E. "Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT". Genome Biology 20, 217 (2019).