Apptainer
Apptainer is software for creating and running containers.
Installation
We will install Apptainer using Pixi. Before proceeding, ensure that you have followed the Pixi installation guide. Once Pixi is installed in your environment, you can install Apptainer globally by running:

    pixi global install apptainer
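After the installation, it can help to confirm that the apptainer executable is reachable from your shell. The sketch below assumes that Pixi exposes globally installed tools in $HOME/.pixi/bin (the usual default, which may differ on your system); have_cmd is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: succeeds if the given command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd apptainer; then
    # Print the installed version to confirm that the binary runs.
    apptainer --version
else
    # Assumption: Pixi places globally installed tools in $HOME/.pixi/bin.
    echo "apptainer not found; check that \$HOME/.pixi/bin is on your PATH" >&2
fi
```

If the command is not found, opening a new terminal or re-sourcing your shell configuration is usually the first thing to try.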
Examples
InterProScan
InterProScan is a widely used tool for annotating protein sequences. It can classify sequences into families and predict the presence of domains, repeats, and functional sites. By integrating several analysis tools, InterProScan compares input sequences against reference entries from the InterPro consortium’s member databases, providing comprehensive functional annotations in a single run.
That said, installing InterProScan can be challenging due to its many dependencies, and recent versions are no longer available through Bioconda. A practical solution is to run it inside a container, which makes the installation process much easier and keeps the environment self‑contained.
This guide explains how to run InterProScan version 5.75‑106.0 locally using Apptainer and use it to annotate the proteome of Promethearchaeum syntrophicum.
- Create a directory within your $HOME to store SIF images:

      mkdir -p "$HOME/images"
- Pull the InterProScan image from Docker Hub:

      apptainer pull "$HOME/images/interproscan.sif" "docker://interpro/interproscan:5.75-106.0"
- Download and extract the InterProScan data:

      wcurl "http://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/5.75-106.0/alt/interproscan-data-5.75-106.0.tar.gz"
      tar zxfv interproscan-data-5.75-106.0.tar.gz
- To simplify running InterProScan, we’ll define a wrapper function named interproscan that lets you run the tool with a simple command, skipping the full apptainer syntax each time.

  To set it up, edit your .bashrc file, which is located in your home directory, as shown below:

      $HOME
      - .bashrc
      - …

  Next, add the function below to the end of the file:

      interproscan() {
          local output_dir=""
          local data_dir=""
          local i=1

          # If -help or --help is given, run interproscan.sh --help and exit
          if [[ "$1" == "-help" || "$1" == "--help" ]]; then
              apptainer --silent exec \
                  "$HOME/images/interproscan.sif" \
                  /opt/interproscan/interproscan.sh --help
              return 0
          fi

          # Show usage if no arguments given
          if [[ $# -eq 0 ]]; then
              echo "Usage:"
              echo " interproscan --data-dir <path> [interproscan.sh arguments]"
              echo
              echo "Example:"
              echo " interproscan --data-dir /path/to/data --output-dir /path/to/output \\"
              echo " --applications Pfam,NCBIfam --disable-precalc \\"
              echo " --cpu 16 --input /path/to/input.faa"
              echo
              echo "Documentation for interproscan.sh:"
              echo " https://interproscan-docs.readthedocs.io/en/v5/HowToRun.html"
              return 1
          fi

          # Parse arguments to find output-dir and data-dir
          while [[ $i -le $# ]]; do
              if [[ "${!i}" == "--output-dir" ]] && [[ $((i+1)) -le $# ]]; then
                  ((i++))
                  output_dir="${!i}"
              elif [[ "${!i}" == "--data-dir" ]] && [[ $((i+1)) -le $# ]]; then
                  ((i++))
                  data_dir="${!i}"
              fi
              ((i++))
          done

          if [[ -z "$data_dir" ]]; then
              echo "Error: --data-dir is required" >&2
              return 1
          fi

          # Check if output directory exists and is not empty
          if [[ -d "$output_dir" ]] && [[ -n "$(ls -A "$output_dir" 2>/dev/null)" ]]; then
              echo "Error: Output directory '$output_dir' is not empty" >&2
              return 1
          fi

          # Verify data directory exists
          if [[ ! -d "$data_dir" ]]; then
              echo "Error: Data directory '$data_dir' does not exist" >&2
              return 1
          fi

          # Create output directory (only when --output-dir was given)
          if [[ -n "$output_dir" ]]; then
              mkdir -p "$output_dir"
          fi

          # Filter out --data-dir from arguments since it's not passed to interproscan.sh
          local args=()
          i=1
          while [[ $i -le $# ]]; do
              if [[ "${!i}" == "--data-dir" ]]; then
                  ((i++)) # Skip the flag
                  ((i++)) # Skip the value
              else
                  args+=("${!i}")
                  ((i++))
              fi
          done

          # Execute interproscan with proper mounts
          apptainer --silent exec \
              -B "$data_dir/data:/opt/interproscan/data" \
              "$HOME/images/interproscan.sif" \
              /opt/interproscan/interproscan.sh \
              "${args[@]}"
      }

  Then, source .bashrc to load the new function into your active shell:

      source "$HOME/.bashrc"
- Download the proteome of Promethearchaeum syntrophicum from UniProt:

      wcurl --output UP000321408.faa "https://rest.uniprot.org/uniprotkb/stream?format=fasta&query=%28%28proteome%3AUP000321408%29%29"
- Execute InterProScan to annotate the proteins by searching the Pfam, NCBIfam, CDD, and HAMAP databases:

      interproscan \
          --applications Pfam,NCBIfam,CDD,HAMAP --iprlookup --goterms --pathways \
          --data-dir interproscan-5.75-106.0 --output-dir UP000321408_interproscan \
          --cpu 16 --input UP000321408.faa

  The annotation results are generated in multiple formats and written to separate output files:

      UP000321408_interproscan
      - UP000321408.faa.gff3
      - UP000321408.faa.json
      - UP000321408.faa.tsv
      - UP000321408.faa.xml
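Once the run finishes, a quick way to sanity-check the results is to count how many matches each member database produced. The sketch below assumes the standard InterProScan TSV layout, in which the fourth tab-separated column holds the analysis name (for example Pfam or CDD); count_hits_by_db is a hypothetical helper name, not part of InterProScan.

```shell
# Hypothetical helper: count InterProScan matches per member database.
# Assumption: column 4 of the TSV output holds the analysis name (e.g. Pfam, CDD).
count_hits_by_db() {
    awk -F '\t' '
        { counts[$4]++ }                          # tally one match per row
        END { for (db in counts) print db "\t" counts[db] }
    ' "$1" | sort
}

# Example call on the file produced in the last step:
# count_hits_by_db UP000321408_interproscan/UP000321408.faa.tsv
```

The output is one line per database, with the database name and the number of matches separated by a tab. A database that is missing from the list produced no matches, which may indicate a typo in the --applications argument.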
Samtools
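    # Create the image directory if it does not exist yet
    mkdir -p "$HOME/images"

    # Pull the Samtools image and check that it runs
    apptainer pull "$HOME/images/samtools.sif" "docker://quay.io/biocontainers/samtools:1.22.1--h96c455f_0"
    apptainer --silent exec "$HOME/images/samtools.sif" samtools --help

    # Wrapper function so that samtools can be called directly
    samtools() {
        apptainer --silent exec "$HOME/images/samtools.sif" samtools "$@"
    }

    # Clean the Apptainer cache without asking for confirmation
    apptainer cache clean --force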
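The samtools wrapper relies on "$@" to hand every argument to the containerized program unchanged, including arguments that contain spaces. The sketch below illustrates this with a stand-in: show_args and wrapper are hypothetical names, and show_args simply prints what it receives instead of calling apptainer, so the example runs without any container image.

```shell
# Stand-in for `apptainer exec ... samtools`: prints each argument it
# receives on its own line, wrapped in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Wrapper that forwards its arguments unchanged, in the same way the
# samtools() function does.
wrapper() {
    show_args "$@"
}

# The quoted file name stays a single argument after forwarding.
wrapper view -o "my output.bam" input.bam
```

This prints four lines: [view], [-o], [my output.bam], and [input.bam]. Using an unquoted $@ or $* instead would split "my output.bam" into two separate arguments.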