Piping and redirecting are fundamental concepts in Linux that allow you to control how data flows between commands and files. They make the command line incredibly powerful by enabling you to chain commands together and manage input/output streams.

  • stdin (file descriptor 0): Data flows into a program.
  • stdout (file descriptor 1): Data flows out of a program.
  • stderr (file descriptor 2): Error messages flow out of a program separately.
graph LR
    INPUT["Data sources"]
    
    subgraph PROGRAM["Process"]
        STDIN["stdin"] --> PROC["🖥️"] 
        PROC --> STDOUT["stdout"]
        PROC --> STDERR["stderr"]
    end
    
    OUTPUT["Data destinations"]
    
    INPUT --> STDIN
    STDOUT --> OUTPUT
    STDERR --> OUTPUT
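
To make the separation between the two output streams concrete, here is a minimal sketch. The `emit` function and the scratch directory are made up for this illustration; any command that prints both normal output and errors behaves the same way.

```shell
tmp=$(mktemp -d)   # scratch directory, used only for this demo

# emit is a hypothetical helper that writes one line to each output stream
emit() {
  echo "result"        # goes to stdout (file descriptor 1)
  echo "warning" >&2   # goes to stderr (file descriptor 2)
}

# Send each stream to its own file to show that they are independent
emit > "$tmp/out.txt" 2> "$tmp/err.txt"

echo "stdout received: $(cat "$tmp/out.txt")"
echo "stderr received: $(cat "$tmp/err.txt")"
```

Because the streams are separate, a program's results can be saved or piped onward while its error messages stay visible in the terminal.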

The pipe operator | connects the stdout of one command to the stdin of another command.

graph LR
    CMD1("Command 1")
    CMD2("Command 2")
    TERM("Terminal")
    
    CMD1 -->|"stdout"| PIPE("|")
    PIPE -->|"stdin"| CMD2
    CMD1 -->|"stderr"| TERM
Counting entries in a FASTA file
awk '/^>/' genome.fasta | wc -l
Terminal window
# ls writes to stdout → grep reads from stdin
ls -la | grep '\.txt$'
# cat writes file content to stdout → wc reads from stdin
cat file.txt | wc -l
# Multiple pipes create a pipeline
ps aux | grep firefox | awk '{print $2}'
#      ↑              ↑
# stdout→stdin   stdout→stdin
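
As the diagram indicates, only stdout travels through a pipe, while stderr goes straight to the terminal. The sketch below shows this by counting lines; `produce` is a hypothetical function invented for the example.

```shell
# produce is a made-up helper: two lines on stdout, one on stderr
produce() {
  echo "line 1"
  echo "line 2"
  echo "problem" >&2
}

# Only stdout enters the pipe, so wc sees two lines
# (stderr is silenced here to keep the terminal clean)
n_stdout=$(produce 2>/dev/null | wc -l)

# After 2>&1, stderr is merged into stdout and enters the pipe as well
n_both=$(produce 2>&1 | wc -l)

echo "stdout only: $n_stdout"
echo "stdout + stderr: $n_both"
```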
| Operator | Description | Example |
| --- | --- | --- |
| > | Redirect output to file (overwrite) | echo "Hello" > file.txt |
| >> | Redirect output to file (append) | echo "World" >> file.txt |
| 2> | Redirect errors to file | command 2> errors.log |
| &> | Redirect both output and errors | command &> all.log |
| 2>&1 | Merge error stream (stderr → stdout) | command > /dev/null 2>&1 |
| < | Read input from file | sort < names.txt |
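
The operators above can be tried safely in a scratch directory. This is a small sketch, not a complete reference: the file names are placeholders, and `&>` is left out because it is a Bash extension rather than portable shell syntax.

```shell
tmp=$(mktemp -d)   # scratch directory so no real files are touched

echo "Hello" > "$tmp/file.txt"    # > creates or overwrites the file
echo "World" >> "$tmp/file.txt"   # >> appends a second line

# ls fails on a missing path; 2> sends its error message to a file
ls "$tmp/no-such-file" 2> "$tmp/errors.log" || true

# Order matters with 2>&1: redirect stdout first, then point stderr at it
ls "$tmp/file.txt" "$tmp/no-such-file" > "$tmp/all.log" 2>&1 || true

# < feeds the file to sort on stdin
sorted=$(sort < "$tmp/file.txt")
echo "$sorted"
```

Writing `2>&1` before `> file` would instead leave stderr pointing at the terminal, because redirections are applied from left to right.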

When you redirect output back to the same file you’re reading from, the operation fails in an unexpected way. Consider this example where you want to sort sequences in a FASTA file by length using SeqKit and save the result back to the same file:

Terminal window
seqkit sort -l sequences.fasta > sequences.fasta

This command empties sequences.fasta completely. The shell processes the > redirection first and truncates the file before seqkit gets a chance to read it, resulting in total data loss.
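
The effect is easy to reproduce on a throwaway file. In this sketch, plain `sort` stands in for SeqKit and `names.txt` is a placeholder file created just for the demonstration.

```shell
tmp=$(mktemp -d)
printf 'b\na\nc\n' > "$tmp/names.txt"   # three lines of sample data

# The shell truncates names.txt before sort starts,
# so sort reads an empty file and writes nothing back
sort "$tmp/names.txt" > "$tmp/names.txt"

size=$(wc -c < "$tmp/names.txt")
echo "bytes left in file: $size"
```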

The sponge command solves this problem by reading everything first, then writing to the file:

Terminal window
seqkit sort -l sequences.fasta | sponge sequences.fasta

Since sponge isn’t included in standard Linux installations, you’ll need to install it through the moreutils package. Using Pixi:

Terminal window
pixi global install moreutils

The moreutils package includes sponge along with several other handy command-line tools.
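
Where installing moreutils is not an option, a common workaround is to write to a temporary file and rename it afterwards. The sketch below assumes plain `sort` in place of SeqKit and uses placeholder file names.

```shell
tmp=$(mktemp -d)
printf 'b\na\nc\n' > "$tmp/names.txt"   # sample data

# Write the result to a separate file first; replace the original
# only if the command succeeded (&&), so a failure cannot destroy the input
sort "$tmp/names.txt" > "$tmp/names.txt.tmp" &&
  mv "$tmp/names.txt.tmp" "$tmp/names.txt"

cat "$tmp/names.txt"
```

This takes two steps instead of one, but it needs no extra tools and leaves the original untouched when the command fails.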

/dev/stdin and /dev/stdout are special file paths that map directly to a process’s standard input and output streams. They allow you to treat these streams like regular files. These pseudo-files are particularly useful when working with programs that don’t support piping directly. Instead of creating temporary files, you can:

  • Pass /dev/stdin as a filename argument to programs that expect to read from a file, allowing them to read from piped input instead.
  • Use /dev/stdout as a filename argument when programs expect to write to a file, redirecting their output to the terminal or another piped command.

One use case is MUSCLE, a tool for multiple sequence alignment that only accepts input and output as files. To make MUSCLE read from stdin and display the output in the terminal (stdout), you can run it like this:

Terminal window
echo -e ">seq1\nMYYGR\n>seq2\nMRYR" | muscle -align /dev/stdin -output /dev/stdout
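
The same idea can be tried without MUSCLE installed. This sketch uses `wc` and `sort` purely as stand-ins for a program that expects file names; it assumes a Linux system where /dev/stdin and /dev/stdout are available.

```shell
# wc receives a file name, but that name resolves to the piped data
counted=$(printf 'one\ntwo\nthree\n' | wc -l /dev/stdin)
echo "$counted"

# sort -o normally writes to a named file;
# /dev/stdout sends the result back into the pipeline instead
sorted=$(printf 'b\na\n' | sort -o /dev/stdout)
echo "$sorted"
```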
Terminal window
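# Sort the lines, drop duplicates, and save the result
cat file.txt | sort | uniq > sorted.txt
# Merge stderr into stdout, then display it and save a copy
command 2>&1 | tee output.log
# Discard error messages (such as "Permission denied")
find . -name "*.log" 2> /dev/null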
Terminal window
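# <(...) passes each command's output to zet as if it were a file; prints the lines common to both inputs
zet intersect <(echo -e "A\nB\nC") <(echo -e "B\nC\nD")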
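
The `<(...)` syntax used above is process substitution: the shell runs the inner command and replaces the expression with a path from which its output can be read. Here is a rough sketch using the standard `comm` tool in place of zet. Process substitution is a Bash and Zsh feature, so the example calls `bash` explicitly in case it runs under a plain POSIX shell.

```shell
# Show what <(...) expands to: a path such as /dev/fd/63
bash -c 'echo <(true)'

# comm -12 keeps only the lines present in both (sorted) inputs
common=$(bash -c 'comm -12 <(printf "A\nB\nC\n") <(printf "B\nC\nD\n")')
echo "$common"
```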