Run shell commands in a scientifically reproducible and robust way
MIT License
This is a small tool that executes single shell commands in a scientifically more reproducible and robust way, by doing the following things:
Some further features are planned for later releases, such as:
scishell
, thati:
o:
before output files), instead of running itdot
command.
This is the recommended option for most users.
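Download the binary for your platform, i.e. the one with linux-amd64 in the name for 64-bit Linux operating systems.
Place the sci binary in a folder that is listed in your $PATH variable, such as /usr/bin, or even better ~/bin, if you make sure that the latter is in your $PATH. (If not, you can add export PATH=~/bin:$PATH e.g. to your ~/.bashrc file and then restart your shell.)
You should then be able to run the sci command in any newly started shell.
This method assumes that you have installed the Go toolchain.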
go install github.com/samuell/scicommander/cmd/sci@latest
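This will install the sci binary into Go's bin directory ($GOBIN if set, otherwise $GOPATH/bin, which defaults to ~/go/bin). If that directory is listed in your PATH variable, the sci command should be executable from your shell.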
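If the sci command is not found after installation, the usual cause is that the install directory is missing from PATH. The following is a minimal sketch for checking this in plain shell. The helper name path_contains is hypothetical, and ~/go/bin is assumed to be the install location (the Go default when neither GOBIN nor GOPATH is set); adjust it to your setup.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given directory is listed in $PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed install location; change this if GOBIN or GOPATH is set differently.
bin_dir="$HOME/go/bin"

if path_contains "$bin_dir"; then
  echo "$bin_dir is already on PATH"
else
  # \$PATH is escaped so that the literal text $PATH is printed
  echo "Add this line to ~/.bashrc: export PATH=$bin_dir:\$PATH"
fi
```

The same check works for ~/bin if you installed a downloaded binary there instead.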
(Other installation options to be added shortly)
To view the options of the sci command, execute:
sci -h
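To get the benefits from SciCommander, do the following:
Prepend each of your shell commands with the sci run command.
Wrap the command in quotes, either "" or ''. This is not strictly required, but it is needed if the command uses redirection with > or piping with | (alternatively, one can just add quotes around those characters).
Now you will notice that if you run your script again, it will skip all commands that have already finished and produced output files.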
You will also have files with the extension .au for every output that you decorated with the syntax above.
To convert such an audit report into a nice HTML-report, you can run the following:
sci to-html <audit-file>
To demonstrate how you can use SciCommander, imagine that you want to write the
following toy bioinformatics pipeline, which writes some DNA to a file and
converts it to its reverse complement, as a shell script named my_pipeline.sh:
#!/bin/bash
# Create a fasta file with some DNA
echo AAAGCCCGTGGGGGACCTGTTC > dna.fa
# Compute the complement sequence
cat dna.fa | tr ACGT TGCA > dna.compl.fa
# Reverse the DNA string
cat dna.compl.fa | rev > dna.compl.rev.fa
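As a sanity check on what the pipeline should end up writing to dna.compl.rev.fa, the two transformation steps can be combined into one small function. This is only an illustrative sketch: the function name revcomp is hypothetical, and it assumes the tr and rev utilities are available, as in the script above.

```shell
#!/bin/bash
# Hypothetical helper: reverse-complement a DNA string read from stdin,
# using the same two steps as the pipeline (complement with tr, then rev).
revcomp() {
  tr ACGT TGCA | rev
}

echo AAAGCCCGTGGGGGACCTGTTC | revcomp
# prints GAACAGGTCCCCCACGGGCTTT
```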
Now, to make the commands run through SciCommander, change the syntax in the script like this:
#!/bin/bash
# Create a fasta file with some DNA
sci run echo AAAGCCCGTGGGGGACCTGTTC '>' dna.fa
# Compute the complement sequence
sci run cat dna.fa '|' tr ACGT TGCA '>' dna.compl.fa
# Reverse the DNA string
sci run cat dna.compl.fa '|' rev '>' dna.compl.rev.fa
Notice that we had to wrap all pipe characters (|) and redirection characters
(>) in quotes. This is so that they are not interpreted by bash immediately but
instead passed along with the command to SciCommander, which executes them as
part of the command.
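The effect of the quoting can be seen without SciCommander at all. The sketch below uses a hypothetical stand-in function, show_args, that simply prints the arguments it receives, one per line. It illustrates general shell behaviour, not how sci run is implemented.

```shell
#!/bin/bash
# Hypothetical stand-in for a program such as sci run:
# print each received argument on its own line.
show_args() {
  printf '%s\n' "$@"
}

# Quoted: '>' arrives as an ordinary argument, and no file is created.
show_args echo AAAGCCCGTGGGGGACCTGTTC '>' dna.fa

# Unquoted, bash would perform the redirection itself: the program would
# only receive "echo" and the DNA string, and its own output would end up
# in dna.fa.
```

The quoted call prints four lines (echo, the DNA string, > and dna.fa), which is what lets the receiving program decide how to handle the redirection.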
An alternative is to enclose each full command in single quotes (''):
#!/bin/bash
# Create a fasta file with some DNA
sci run 'echo AAAGCCCGTGGGGGACCTGTTC > dna.fa'
# Compute the complement sequence
sci run 'cat dna.fa | tr ACGT TGCA > dna.compl.fa'
# Reverse the DNA string
sci run 'cat dna.compl.fa | rev > dna.compl.rev.fa'
Now you can run the script as usual, e.g. with:
bash my_pipeline.sh
Now, the files in your folder will look like this, if you list them with ls -tr:
my_pipeline.sh
dna.fa.au
dna.fa
dna.compl.fa.au
dna.compl.fa
dna.compl.rev.fa.au
dna.compl.rev.fa
As you can see, the last .au file is dna.compl.rev.fa.au.
To convert this file to HTML and view it in a browser, you can do:
sci to-html dna.compl.rev.fa.au
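To give a feel for the skip-on-rerun behaviour described earlier, here is a plain-shell sketch of the general idea: run a command only if its output file is missing, and write to a temporary file that is renamed on success. The helper run_once and the .tmp suffix are hypothetical and are not taken from SciCommander's implementation, which also writes audit files that this sketch does not attempt to reproduce.

```shell
#!/bin/bash
# Hypothetical helper, not part of SciCommander:
# run_once OUTPUT_FILE COMMAND [ARGS...]
run_once() {
  local out="$1"
  shift
  if [ -e "$out" ]; then
    echo "skipping: $out already exists"
    return 0
  fi
  local tmp="$out.tmp"
  # Send stdout to a temporary file and rename it only if the command succeeds,
  # so that a failed run never leaves a half-written output file behind.
  if "$@" > "$tmp"; then
    mv "$tmp" "$out"
  else
    rm -f "$tmp"
    return 1
  fi
}

workdir="$(mktemp -d)"
run_once "$workdir/dna.fa" echo AAAGCCCGTGGGGGACCTGTTC   # runs the command
run_once "$workdir/dna.fa" echo AAAGCCCGTGGGGGACCTGTTC   # reports that it is skipping
```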
There is experimental support for running SciCommander commands in bash,
without needing to run them via the sci run command.
To do this, start the SciCommander shell with the following command:
sci shell
And then, you can run the example commands above as follows:
# Create a fasta file with some DNA
echo AAAGCCCGTGGGGGACCTGTTC > dna.fa
# Compute the complement sequence
cat dna.fa | tr ACGT TGCA > dna.compl.fa
# Reverse the DNA string
cat dna.compl.fa | rev > dna.compl.rev.fa
In other words, no extra syntax is needed.
[1] Although Nextflow and Snakemake already take care of some of the benefits, such as atomic writes, SciCommander adds additional features such as detailed per-output audit logs. It can thus be a great complement to these tools.