Introduction
About taxopy
taxopy is a Python package that provides an interface for accessing NCBI-formatted taxonomic databases. It supports a range of operations on taxonomic data, including obtaining complete lineages, determining lowest common ancestors (LCAs), and retrieving taxon names from taxonomic identifiers.
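To make the terms "lineage" and "lowest common ancestor" concrete, here is a minimal sketch that uses only the Python standard library and a toy taxonomy. It does not call taxopy, and the identifiers, the `PARENT` and `NAMES` tables, and the function names are all hypothetical illustrations rather than taxopy's API or real NCBI identifiers. The only convention borrowed from NCBI taxdump files is that the root node lists itself as its own parent.

```python
# Toy taxonomy: child taxid -> parent taxid. These IDs are made up for
# illustration; the root points to itself, as in NCBI's nodes.dmp.
PARENT = {1: 1, 10: 1, 20: 10, 30: 20, 31: 20, 40: 10}
NAMES = {
    1: "root",
    10: "Bacteria",
    20: "Enterobacteriaceae",
    30: "Escherichia",
    31: "Salmonella",
    40: "Bacillaceae",
}


def lineage(taxid):
    """Return the taxids from the given taxon up to the root."""
    path = [taxid]
    # Walk parent links until we reach the self-referencing root.
    while PARENT[path[-1]] != path[-1]:
        path.append(PARENT[path[-1]])
    return path


def name_lineage(taxid):
    """Return the lineage as taxon names instead of identifiers."""
    return [NAMES[t] for t in lineage(taxid)]


def lowest_common_ancestor(taxids):
    """Return the deepest taxid that appears in every input lineage."""
    lineages = [lineage(t) for t in taxids]
    shared = set(lineages[0]).intersection(*lineages[1:])
    # Lineages run from leaf to root, so the first shared node is the deepest.
    return next(t for t in lineages[0] if t in shared)


print(name_lineage(30))
print(NAMES[lowest_common_ancestor([30, 31])])
print(NAMES[lowest_common_ancestor([30, 40])])
```

In this toy tree, the two genera in the same family share that family as their LCA, while taxa from different families only meet higher up the tree.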
Installation
You can install taxopy with Python's pip or uv (for example, `pip install taxopy`), or through the pixi, conda, or mamba package managers.
Enabling fuzzy search of taxon names
taxopy supports fuzzy string matching to search for taxa whose names are similar, but not identical, to the query. This feature is disabled by default to avoid pulling in additional dependencies. To enable it, install the fuzzy-matching extra with pip or uv (for example, `pip install "taxopy[fuzzy-matching]"`).
Alternatively, you can install the rapidfuzz library alongside taxopy (for example, `pip install rapidfuzz`).
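As a rough illustration of what fuzzy name matching does, the sketch below scores a misspelled query against a short list of names and keeps the close matches. It is not taxopy's implementation: taxopy relies on rapidfuzz, whereas this example substitutes `difflib.SequenceMatcher` from the standard library so that it runs without extra dependencies. The name list, the `fuzzy_search` function, and the `cutoff` value are assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical list of taxon names to search; a real database holds many more.
TAXON_NAMES = [
    "Escherichia coli",
    "Escherichia albertii",
    "Salmonella enterica",
    "Bacillus subtilis",
]


def fuzzy_search(query, names, cutoff=0.8):
    """Return (name, score) pairs whose similarity is at least cutoff, best first."""
    scored = [
        (name, SequenceMatcher(None, query.lower(), name.lower()).ratio())
        for name in names
    ]
    hits = [(name, score) for name, score in scored if score >= cutoff]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# A query with a typo still finds the intended name as the top hit.
for name, score in fuzzy_search("Escherichia colli", TAXON_NAMES):
    print(f"{name}: {score:.2f}")
```

The cutoff trades recall for precision: a lower value tolerates more spelling variation but returns more unrelated names.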
Acknowledgements
Some of the code used to parse taxdump files in taxopy was adapted from CAT/BAT [1], a tool for taxonomic assignment of contigs and metagenome-assembled genomes.
[1] Von Meijenfeldt, F. A. B., Arkhipova, K., Cambuy, D. D., Coutinho, F. H. & Dutilh, B. E. "Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT". Genome Biology 20, 217 (2019).