gargs

better(?) xargs in go

APACHE-2.0 License

gargs - small fix (Latest Release)

Published by brentp almost 7 years ago

fix panic when bash process never started (e.g. because of 'argument list too long')

gargs - run-time in log output

Published by brentp over 7 years ago

v0.3.8

  • report run-time in log output

gargs - compression

Published by brentp over 7 years ago

0.3.7

  • compress temporary files with gzip (BestSpeed)
  • default to bash instead of sh if it exists and SHELL is not specified

gargs - v0.3.6

Published by brentp over 7 years ago

0.3.6

  • output gargs version in help.
  • restore --ordered (-o) to keep order of output same as input.
    This will cache the output of up to 3*processes finished jobs while waiting for the slowest job to finish.
    For example, if the user requested 10 processes (-p 10), there could
    be up to 30 finished jobs waiting for a slow job to finish. If these are in
    memory, they are guaranteed to take <= 1MB (+ go's overhead). If they are larger
    than 1MB, then their data will be on disk.
    This is implemented carefully so that the performance penalty will be small
    unless there are a few extremely long-running outlier processes.
  • set $PROCESS_I environment variable for each line (or batch of lines).
  • read GARGS_PROCESS_BUFFER to let user set size of data before a tempfile is used.
  • read GARGS_WAIT_MULTIPLIER to determine how many finished processes will wait for a single slow process;
    higher values improve concurrency at the expense of memory.
  • better cleanup of tmp files in case of process halt.
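
The bounded waiting described for --ordered can be pictured with a small Go sketch. This is not the gargs source: orderedMap, job, slowFirst and the multiplier parameter are hypothetical names, and the only idea borrowed from the notes above is that at most procs*multiplier jobs queue up behind the oldest unfinished one.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// job pairs one input with a 1-slot channel that will receive its output.
// (Hypothetical type, for illustration only.)
type job struct {
	in  string
	out chan string
}

// orderedMap runs work on every input using procs goroutines and returns the
// outputs in input order. At most procs*multiplier jobs may sit in the
// pending queue while the reader waits for the oldest one to finish.
// Assumes procs >= 1.
func orderedMap(inputs []string, procs, multiplier int, work func(string) string) []string {
	pending := make(chan job, procs*multiplier) // jobs in input order
	todo := make(chan job)                      // jobs handed to workers

	var wg sync.WaitGroup
	for i := 0; i < procs; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range todo {
				j.out <- work(j.in) // never blocks: out has capacity 1
			}
		}()
	}

	go func() {
		for _, in := range inputs {
			j := job{in: in, out: make(chan string, 1)}
			pending <- j // blocks once the window is full
			todo <- j
		}
		close(pending)
		close(todo)
	}()

	results := make([]string, 0, len(inputs))
	for j := range pending {
		results = append(results, <-j.out) // wait for the oldest job first
	}
	wg.Wait()
	return results
}

// slowFirst is a toy workload: the input "a" takes longest to finish.
func slowFirst(s string) string {
	if s == "a" {
		time.Sleep(50 * time.Millisecond)
	}
	return s + "!"
}

func main() {
	// "a" finishes last but is still emitted first.
	fmt.Println(orderedMap([]string{"a", "b", "c", "d"}, 2, 3, slowFirst))
}
```

A larger multiplier lets more finished jobs queue behind a slow one, which keeps the workers busy at the cost of holding more output at once.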

gargs - --log and default to continue on error.

Published by brentp about 8 years ago

0.3.5

  • Flush Stdout every 2 seconds.
  • Nice String() output for *Command that show time to run, error, etc.
  • Colorized errors
  • Fix error/exit-code tracking when a tmpfile is used.
  • remove --continue-on-error (-c) and make that the default. Introduce --stop-on-error (-s).
  • add --log argument where each command is logged. A successful command is prefixed with '#'; a failed command is printed as-is. If the entire execution ends successfully, the last line will be '# SUCCESS'; otherwise it will show, e.g., '# FAILED 3 commands'. The failed commands are easily grep'ed from the log with "grep -v ^# $log"

gargs - usability improvements

Published by brentp about 8 years ago

As of this version gargs no longer reads everything into memory. It reads up to 1MB; if it has not seen an EOF by then, it starts using a temp-file. This reduces memory usage.
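
The read-then-spill behaviour can be sketched in Go roughly as follows. This is an illustration under assumptions, not the gargs implementation: bufferOrSpill, demo and the "spill-*" file pattern are hypothetical, and only the idea of a byte limit before a temp-file is used comes from the note above.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"strings"
)

// bufferOrSpill keeps up to limit bytes of r in memory. If r holds more than
// that, all of the data goes to a temporary file instead. The returned
// cleanup function closes and removes the file (it is a no-op otherwise).
func bufferOrSpill(r io.Reader, limit int64) (data io.Reader, spilled bool, cleanup func(), err error) {
	var buf bytes.Buffer
	// Ask for one byte more than the limit to tell "fits" from "too big".
	_, err = io.CopyN(&buf, r, limit+1)
	if err == io.EOF {
		return &buf, false, func() {}, nil // EOF arrived in time: stay in memory
	}
	if err != nil {
		return nil, false, nil, err
	}

	f, err := os.CreateTemp("", "spill-*") // hypothetical file name pattern
	if err != nil {
		return nil, false, nil, err
	}
	cleanup = func() {
		f.Close()
		os.Remove(f.Name())
	}
	// Write what was already buffered, then stream the rest straight to disk.
	if _, err = io.Copy(f, io.MultiReader(&buf, r)); err == nil {
		_, err = f.Seek(0, io.SeekStart) // rewind so the caller can read it back
	}
	if err != nil {
		cleanup()
		return nil, false, nil, err
	}
	return f, true, cleanup, nil
}

// demo reads input through bufferOrSpill and reports the content and
// whether a temp file was needed.
func demo(input string, limit int64) (string, bool) {
	data, spilled, cleanup, err := bufferOrSpill(strings.NewReader(input), limit)
	if err != nil {
		panic(err)
	}
	defer cleanup()
	b, err := io.ReadAll(data)
	if err != nil {
		panic(err)
	}
	return string(b), spilled
}

func main() {
	s, spilled := demo("tiny", 1<<20)
	fmt.Println(s, spilled) // small input stays in memory
	s, spilled = demo("more than 8 bytes", 8)
	fmt.Println(s, spilled) // input past the limit goes to a temp file
}
```

Reading one byte past the limit is what lets the sketch keep an input of exactly the limit size in memory while spilling anything larger.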

It also fixes --nlines to be quite useful e.g.:

cat regions.txt | gargs -p 20 -n 10 "bcftools view some.bam {}"

will send 10 regions to each process to amortize the cost of loading the index into memory. Note that the place-holder {} is specified only once.
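
As a rough illustration of the batching, the Go sketch below groups input lines n at a time and fills a single {} placeholder. It assumes the lines of a batch are joined with single spaces, which the notes above do not state; batchCommands and the region strings are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// batchCommands groups lines into batches of n and substitutes each batch,
// joined by single spaces (an assumption), for the "{}" placeholder in tmpl.
func batchCommands(lines []string, n int, tmpl string) []string {
	if n < 1 {
		n = 1 // guard against a non-positive batch size
	}
	var cmds []string
	for start := 0; start < len(lines); start += n {
		end := start + n
		if end > len(lines) {
			end = len(lines) // the last batch may be short
		}
		args := strings.Join(lines[start:end], " ")
		cmds = append(cmds, strings.ReplaceAll(tmpl, "{}", args))
	}
	return cmds
}

func main() {
	// Made-up regions, two per command.
	regions := []string{"chr1:1-100", "chr1:101-200", "chr2:1-100"}
	for _, c := range batchCommands(regions, 2, "bcftools view some.bam {}") {
		fmt.Println(c)
	}
}
```

With three regions and a batch size of 2, the sketch builds two commands: the first receives two regions and the second receives the remaining one.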

It also defaults --sep to "\s+" if --nlines is not specified.

Finally, it adds a --retry argument: an integer giving the number of times a failed process should be retried. This is useful for transient network errors.
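
A retry count of this kind can be sketched in Go as below. The names withRetry and errTransient are hypothetical, and the sketch assumes that a retry value of N means one initial attempt plus up to N further attempts.

```go
package main

import (
	"errors"
	"fmt"
)

// errTransient stands in for a temporary failure such as a network hiccup.
var errTransient = errors.New("transient failure")

// withRetry calls run once and, while it keeps failing, up to retries more
// times. It returns the number of attempts made and the last error.
func withRetry(retries int, run func() error) (attempts int, err error) {
	for attempts = 1; ; attempts++ {
		if err = run(); err == nil || attempts > retries {
			return attempts, err
		}
	}
}

func main() {
	calls := 0
	// flaky fails twice, then succeeds.
	flaky := func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	}
	attempts, err := withRetry(5, flaky)
	fmt.Println(attempts, err) // prints: 3 <nil>
}
```

If every attempt fails, the last error is returned after retries+1 attempts, so the caller can still record the command as failed.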

gargs - cleanup

Published by brentp over 8 years ago

Remove the --shell argument; the shell is now taken from $SHELL in the environment.

gargs - --dry-run

Published by brentp over 8 years ago

usage: gargs [--procs PROCS] [--nlines NLINES] [--sep SEP] [--shell SHELL] [--verbose] [--continue-on-error] [--ordered] [--dry-run] COMMAND

positional arguments:
  command                command to execute

options:
  --procs PROCS, -p PROCS
                         number of processes to use [default: 1]
  --nlines NLINES, -n NLINES
                         number of lines to consume for each command. -s and -n are mutually exclusive. [default: 1]
  --sep SEP, -s SEP      regular expression split line with to fill multiple template spots default is not to split. -s and -n are mutually exclusive.
  --shell SHELL          shell to use [default: bash]
  --verbose, -v          print commands to stderr before they are executed.
  --continue-on-error, -c
                         report errors but don't stop the entire execution (which is the default).
  --ordered, -o          keep output in order of input; default is to output in order of return which greatly improves parallelization.
  --dry-run, -d          print (but do not run) the commands (for debugging)
  --help, -h             display this help and exit

gargs - positional arguments.

Published by brentp over 8 years ago

gargs - gargs

Published by brentp over 8 years ago