Introduction
About taxopy
taxopy is a Python package that provides an interface for accessing NCBI-formatted taxonomic databases. It supports various operations on taxonomic data, such as obtaining complete lineages, determining lowest common ancestors (LCAs), and retrieving taxon names from taxonomic identifiers.
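To make the terms "lineage" and "lowest common ancestor" concrete, here is a minimal sketch in plain Python. It does not use taxopy or its API. The parent table is a hand-written, heavily truncated toy hierarchy (the real NCBI taxonomy has many more intermediate ranks), and the names PARENT, NAMES, lineage, and lowest_common_ancestor are hypothetical helpers chosen for illustration only.

```python
# Toy stand-in for an NCBI-style taxonomy: each taxid maps to its parent taxid.
# Assumption: this hierarchy is truncated for illustration; the root (1) is its own parent.
PARENT = {
    1: 1,            # root
    9604: 1,         # Hominidae (intermediate ranks omitted)
    207598: 9604,    # Homininae
    9605: 207598,    # Homo
    9606: 9605,      # Homo sapiens
    9596: 207598,    # Pan
    9598: 9596,      # Pan troglodytes
}

NAMES = {
    1: "root",
    9604: "Hominidae",
    207598: "Homininae",
    9605: "Homo",
    9606: "Homo sapiens",
    9596: "Pan",
    9598: "Pan troglodytes",
}


def lineage(taxid, parent=PARENT):
    """Return the taxids from `taxid` up to the root, starting with `taxid`."""
    path = [taxid]
    while parent[path[-1]] != path[-1]:
        path.append(parent[path[-1]])
    return path


def lowest_common_ancestor(taxids, parent=PARENT):
    """Return the deepest taxid that appears in the lineage of every input taxid."""
    lineages = [lineage(t, parent) for t in taxids]
    shared = set(lineages[0]).intersection(*lineages[1:])
    # Walking upward from the first taxon, the first shared node is the deepest one.
    return next(t for t in lineages[0] if t in shared)


human_lineage = [NAMES[t] for t in lineage(9606)]
lca_name = NAMES[lowest_common_ancestor([9606, 9598])]
```

In this toy table, the lineage of Homo sapiens runs through Homo, Homininae, and Hominidae to the root, and the LCA of Homo sapiens and Pan troglodytes is Homininae.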
Installation
You can install taxopy on your computer using Python's pip or uv, or through the pixi, conda, or mamba package managers.
Enabling fuzzy search of taxon names
taxopy supports fuzzy string matching to search for taxa whose names are similar, but not identical, to the query. This feature is disabled by default to avoid additional dependencies. You can enable it by installing the fuzzy-matching extra with pip or uv, that is, by requesting taxopy[fuzzy-matching] instead of taxopy.
Alternatively, you can install the rapidfuzz library alongside taxopy.
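As a rough illustration of what fuzzy name matching means, the sketch below looks up taxon names that are close to a misspelled query. It is not taxopy's implementation: taxopy relies on rapidfuzz, whereas this example substitutes difflib from the Python standard library so that it runs without extra dependencies. The name table, the fuzzy_taxids helper, and the 0.8 cutoff are assumptions made for illustration.

```python
from difflib import get_close_matches

# Hypothetical, tiny name-to-taxid table used only for this example.
NAMES_TO_TAXID = {
    "Homo sapiens": 9606,
    "Pan troglodytes": 9598,
    "Gorilla gorilla": 9593,
}


def fuzzy_taxids(query, names=NAMES_TO_TAXID, cutoff=0.8):
    """Return (name, taxid) pairs whose names are similar to `query`.

    `cutoff` is difflib's similarity threshold in [0, 1]; higher is stricter.
    """
    matches = get_close_matches(query, list(names), n=3, cutoff=cutoff)
    return [(name, names[name]) for name in matches]


# A query with a missing final letter still finds the intended taxon.
hits = fuzzy_taxids("Homo sapien")
```

A stricter cutoff returns fewer, closer matches; a query with no sufficiently similar name yields an empty list.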
Acknowledgements
Some of the code used to parse taxdump files in taxopy was adapted from CAT/BAT [1], a tool for taxonomic assignment of contigs and metagenome-assembled genomes.
[1] Von Meijenfeldt, F. A. B., Arkhipova, K., Cambuy, D. D., Coutinho, F. H. & Dutilh, B. E. "Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT". Genome Biology 20, 217 (2019).