sq

sq data wrangler

MIT License

sq - v0.39.0

Published by neilotoole over 1 year ago

Added

  • #263: sq version now supports --yaml output.
  • #263: sq version now outputs host OS details with --verbose, --json
    and --yaml flags. The motivation behind this is bug submission: we want
    to know which OS/arch the user is on. E.g. for sq version -j:
    {
      "version": "v0.38.1",
      "commit": "eedc11ec46d1f0e78628158cc6fd58850601d701",
      "timestamp": "2023-06-21T11:41:34Z",
      "latest_version": "v0.39.0",
      "host": {
        "platform": "darwin",
        "arch": "arm64",
        "kernel": "Darwin",
        "kernel_version": "22.5.0",
        "variant": "macOS",
        "variant_version": "13.4"
      }
    }
  • #263: The output of sq inspect and sq inspect -v has been refactored
    significantly, and should now be easier to work with (docs).
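
The host details in the sq version -j output above can be consumed by a script, for example when assembling a bug report. The following Go program is a minimal sketch that assumes the JSON shape shown in the example; the type and function names (VersionInfo, Host, summarize) are hypothetical and not part of sq.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Host mirrors the "host" object in the example output above.
type Host struct {
	Platform       string `json:"platform"`
	Arch           string `json:"arch"`
	Kernel         string `json:"kernel"`
	KernelVersion  string `json:"kernel_version"`
	Variant        string `json:"variant"`
	VariantVersion string `json:"variant_version"`
}

// VersionInfo mirrors the top-level fields in the example output above.
type VersionInfo struct {
	Version       string `json:"version"`
	Commit        string `json:"commit"`
	Timestamp     string `json:"timestamp"`
	LatestVersion string `json:"latest_version"`
	Host          Host   `json:"host"`
}

// sample is the example output copied from the release notes.
const sample = `{
  "version": "v0.38.1",
  "commit": "eedc11ec46d1f0e78628158cc6fd58850601d701",
  "timestamp": "2023-06-21T11:41:34Z",
  "latest_version": "v0.39.0",
  "host": {
    "platform": "darwin",
    "arch": "arm64",
    "kernel": "Darwin",
    "kernel_version": "22.5.0",
    "variant": "macOS",
    "variant_version": "13.4"
  }
}`

// summarize parses the JSON and returns a one-line description
// suitable for pasting into a bug report.
func summarize(data []byte) (string, error) {
	var v VersionInfo
	if err := json.Unmarshal(data, &v); err != nil {
		return "", err
	}
	return fmt.Sprintf("sq %s on %s/%s (%s %s)",
		v.Version, v.Host.Platform, v.Host.Arch,
		v.Host.Variant, v.Host.VariantVersion), nil
}

func main() {
	line, err := summarize([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(line)
}
```

For the sample above, the program prints: sq v0.38.1 on darwin/arm64 (macOS 13.4)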

sq - v0.38.1

Published by neilotoole over 1 year ago

Fixed

  • #261: The JSON writer (--json) could get deadlocked when a record contained
    a large amount of data, triggering an internal Flush() (which is mutex-guarded)
    from within the mutex-guarded WriteRecords() method.
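
The deadlock described above is a general hazard in Go: sync.Mutex is not reentrant, so a method that holds the lock cannot call another method that acquires the same lock. The sketch below shows one common way to avoid it, by moving the work into an unexported helper that expects the lock to be held already. The writer type and its fields are hypothetical stand-ins; how sq itself resolved #261 is not stated in these notes.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// writer is a hypothetical buffered record writer (not sq's actual type).
type writer struct {
	mu  sync.Mutex
	buf bytes.Buffer // pending, unflushed data
	out bytes.Buffer // stands in for the real destination
	max int          // flush threshold in bytes
}

// Flush is the public, mutex-guarded entry point.
func (w *writer) Flush() {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.flushLocked()
}

// flushLocked moves buffered data to the destination.
// The caller must already hold w.mu.
func (w *writer) flushLocked() {
	w.out.Write(w.buf.Bytes())
	w.buf.Reset()
}

// WriteRecords holds w.mu for its whole body. Calling w.Flush() from here
// would block forever because the mutex is already held; calling
// flushLocked() instead avoids taking the lock a second time.
func (w *writer) WriteRecords(recs []string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	for _, r := range recs {
		w.buf.WriteString(r)
		w.buf.WriteByte('\n')
		if w.buf.Len() >= w.max {
			w.flushLocked()
		}
	}
}

func main() {
	w := &writer{max: 8}
	// The first record exceeds the threshold and triggers an internal flush.
	w.WriteRecords([]string{"a large record", "x"})
	w.Flush()
	fmt.Print(w.out.String())
}
```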

sq - v0.38.0

Published by neilotoole over 1 year ago

This release has significant improvements (and breaking changes) to SLQ (sq's query language).

Changed

  • ☢️ #254: The formerly-implicit "WHERE" mechanism now requires an explicit where() function.
    This, alas, is a fairly big breaking change. But it's necessary to remove an ambiguity roadblock.
    See discussion in the issue.

    # Previously
    $ sq '.actor | .actor_id <= 2'
    
    # Now
    $ sq '.actor | where(.actor_id <= 2)'
    
  • #256: Column-only queries are now possible. This has the neat side effect
    that sq can now be used as a calculator.

    $ sq 1+2
    1+2
    3
    

    You may want to use --no-header (-H) when using sq as a calculator.

    $ sq -H 1+2
    3
    $ sq -H '(1+2)*3'
    9
    

Fixed

  • Literals can now be selected (docs).

    $ sq '.actor | .first_name, "X":middle_name, .last_name | .[0:2]'
    first_name  middle_name  last_name
    PENELOPE    X            GUINESS
    NICK        X            WAHLBERG
    
  • Many expressions that previously failed badly now work.

    $ sq '.actor | .first_name, (1+2):addition | .[0:2]'
    first_name  addition
    PENELOPE    3
    NICK        3
    
  • #258: Column aliases can now be arbitrary strings, instead of only a
    valid identifier.

    # Previously, only a valid identifier was allowed
    $ sq '.actor | .first_name:given_name | .[0:2]'
    given_name
    PENELOPE
    NICK
    
    # Now, any arbitrary string can be used
    $ sq '.actor | .first_name:"Given Name" | .[0:2]'
    Given Name
    PENELOPE
    NICK
    
sq - v0.37.1

Published by neilotoole over 1 year ago

Fixed

  • #252: Handle *uint64 returned from DB.

sq - v0.37.0

Published by neilotoole over 1 year ago

Added

  • #244: Shell completion for sq add LOCATION. See docs.

sq - v0.36.2

Published by neilotoole over 1 year ago

Changed

    # MySQL "date_format" func
    $ sq '@sakila/mysql | .payment | _date_format(.payment_date, "%m")'

    # Postgres "date_trunc" func
    $ sq '@sakila/postgres | .payment | _date_trunc("month", .payment_date)'

sq - v0.36.1

Published by neilotoole over 1 year ago

Fixed

  • sq diff: Renamed --count flag to --counts as intended.

sq - v0.36.0

Published by neilotoole over 1 year ago

The major feature is the long-gestating sq diff.

Added

  • #229: sq diff compares two sources, or tables.
  • sq inspect --dbprops is a new mode that returns only the DB properties.
    Relatedly, the properties mechanism is now implemented for all four supported
    DB types (previously, it was only implemented for Postgres and MySQL).
  • CSV format now colorizes output.

Changed

  • sq inspect -v previously returned DB properties in a field named db_variables.
    This field has been renamed to db_properties. The renaming reflects the fact
    that some of those properties are not variables in the sense of being
    modifiable (e.g. the DB server version).
  • The structure of the former db_variables (now db_properties) field has
    changed. Previously it was an array of {"name": "XX", "value": "YY"} values,
    but now is a map, where the keys are strings, and the values can be either
    a scalar (bool, int, string, etc.), or a nested value such as an array
    or map. This change is made because some databases (e.g. SQLite) feature
    complex data in some property values.
  • CSV format now renders byte sequences as [777 bytes] instead of dumping
    the raw bytes.
  • ☢️ TSV format (--tsv) no longer has a shorthand form -T. Apparently that
    shorthand wasn't used much, and -T is needed elsewhere.
  • ☢️ Likewise, --xml no longer has shorthand -X. And --markdown has lost alias --md.
  • In addition to the format flags --text, --json, etc., there is now
    a --format=FORMAT flag, e.g. --format=json. This will allow sq to
    continue to expand the number of output formats, without needing to have
    a dedicated flag for each format.
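
Consumers that parsed the old db_variables array may need to adapt to the db_properties map described above. The following Go sketch converts the old array-of-pairs shape into the new map shape, using the {"name": "XX", "value": "YY"} form quoted in these notes. The function and type names are hypothetical, and the sketch works on the field's value alone, not on the full sq inspect output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nameValue is one element of the old db_variables array.
type nameValue struct {
	Name  string `json:"name"`
	Value any    `json:"value"`
}

// toProperties converts the old array form into the new map form.
// Values are kept as decoded, so scalars, arrays and nested maps all pass through.
func toProperties(oldJSON []byte) (map[string]any, error) {
	var pairs []nameValue
	if err := json.Unmarshal(oldJSON, &pairs); err != nil {
		return nil, err
	}
	props := make(map[string]any, len(pairs))
	for _, p := range pairs {
		props[p.Name] = p.Value
	}
	return props, nil
}

func main() {
	old := []byte(`[{"name": "XX", "value": "YY"}]`)
	props, err := toProperties(old)
	if err != nil {
		panic(err)
	}
	out, err := json.Marshal(props)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // prints {"XX":"YY"}
}
```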

sq - v0.35.0

Published by neilotoole over 1 year ago

Added

  • #8: Results can now be output in YAML.

Fixed

  • sq config get OPT --text now prints only the value, not KEY VALUE.
    If you want to see key and value, consider using --yaml, or --text --verbose.

sq - v0.34.2

Published by neilotoole over 1 year ago

Fixed

  • Both --markdown and the alias --md are now supported.

sq - v0.34.1

Published by neilotoole over 1 year ago

Fixed

  • Fixed a minor issue where sq ls -jv and sq ls -yv produced no output
    if config contained no explicitly set options.

sq - v0.34.0

Published by neilotoole over 1 year ago

This release significantly overhauls sq's config mechanism (#199).
For an overview, see the new config docs.

Alas, this release has several minor breaking changes ☢️.

Added

  • sq config ls shows config.
  • sq config get gets an individual config option.
  • sq config set sets config values.
  • sq config edit edits config.
    • Editor can be specified via $EDITOR or $SQ_EDITOR.
  • sq config location prints the location of the config dir.
  • --config flag is now honored globally.
  • Many more knobs are exposed in config.
  • Logging is much more configurable. There are new knobs:
    $ sq config set log true
    $ sq config set log.level INFO
    $ sq config set log.file /var/log/sq.log
    
    There are also equivalent flags (--log, --log.file and --log.level) and
    envars (SQ_LOG, SQ_LOG_FILE and SQ_LOG_LEVEL).
  • Several more commands support YAML output.

Changed

  • The structure of sq's config file (sq.yml) has changed. The config
    file is automatically upgraded when using the new version.
  • The default location of the sq log file has changed. The new location
    is platform-dependent. Use sq config get log.file -v to view the location,
    or sq config set log.file /path/to/sq.log to set it.
  • ☢️ Envar SQ_CONFIG replaces SQ_CONFIGDIR.
  • ☢️ Envar SQ_LOG_FILE replaces SQ_LOGFILE.
  • ☢️ Format flag --table is renamed to --text. This is changed because while the
    output is mostly in table format, sometimes it's just plain text. Thus
    table was not quite accurate.
  • ☢️ The flag to explicitly specify a driver when piping input to sq has been
    renamed from --driver to --ingest.driver. This change aligns
    the naming of the ingest options and reduces ambiguity.
    # previously
    $ cat mystery.data | sq --driver=csv '.data'
    
    # now
    $ cat mystery.data | sq --ingest.driver=csv '.data'
    
  • ☢️ sq add no longer has the generic --opts x=y mechanism. This flag was
    ambiguous and confusing. Instead, use explicit option flags.
    # previously
    $ sq add ./actor.csv --opts=header=false
    
    # now
    $ sq add ./actor.csv --ingest.header=false
    
  • ☢️ The short form of the sq add --handle flag has been changed from -h to
    -n. While this is not ideal, the -h shorthand is already in use everywhere
    else as the short form of --header.
    # previously
    $ sq add ./actor.csv -h @actor
    
    # now
    $ sq add ./actor.csv -n @actor
    
  • ☢️ The --pretty flag has been removed. Its only previous use was with the
    json format, where --pretty=false would output the JSON in compact form.
    To better align with jq, there is now a --compact / -c flag that behaves
    identically to jq's.
  • ☢️ Because of the above --compact / -c flag, the short form of the --csv
    flag is changing from -c to -C. It's an unfortunate situation, but alignment
    with jq's behavior is an overarching principle that justifies the change.
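
To make the pretty versus compact distinction above concrete, the following Go sketch renders the same JSON record both ways using the standard library. It illustrates the general difference only; the exact bytes that sq emits with or without --compact are not specified in these notes, and the sample record is made up for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// compact removes insignificant whitespace, similar in spirit to jq -c.
func compact(src []byte) (string, error) {
	var buf bytes.Buffer
	if err := json.Compact(&buf, src); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// pretty re-indents the JSON with two spaces per level.
func pretty(src []byte) (string, error) {
	var buf bytes.Buffer
	if err := json.Indent(&buf, src, "", "  "); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Illustrative record, not actual sq output.
	rec := []byte(`{"actor_id": 1, "first_name": "PENELOPE"}`)

	p, err := pretty(rec)
	if err != nil {
		panic(err)
	}
	c, err := compact(rec)
	if err != nil {
		panic(err)
	}
	fmt.Println(p)
	fmt.Println(c)
}
```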

sq - v0.33.0

Published by neilotoole over 1 year ago

The headline feature is source groups.
This is the biggest change to the sq CLI in some time, and should make working with lots of sources much easier.

Added

  • #192: sq now has a mechanism to group sources. A source handle can
    now be scoped. For example, instead of @sakila_prod, @sakila_staging, etc,
    you can use @prod/sakila, @staging/sakila. Use sq group prod to
    set the active group (which sq ls respects). See docs.
  • sq group GROUP sets the active group to GROUP.
  • sq group returns the active group (default is /, the root group).
  • sq ls GROUP lists the sources in GROUP.
  • sq ls --group (or sq ls -g) lists all groups.
  • sq mv moves/renames sources and groups.
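
The scoped handles described above (@prod/sakila, with / as the root group) can be taken apart with simple string handling. The Go sketch below is hypothetical and is not sq's parser. It assumes that everything before the last slash is the group, which would also cover nested groups if sq permits them; these notes do not say whether it does.

```go
package main

import (
	"fmt"
	"strings"
)

// splitHandle splits a handle such as "@prod/sakila" into its group ("prod")
// and source name ("sakila"). A handle without a group, such as "@sakila",
// is reported as belonging to the root group "/".
func splitHandle(handle string) (group, name string, err error) {
	if !strings.HasPrefix(handle, "@") || len(handle) < 2 {
		return "", "", fmt.Errorf("invalid handle: %q", handle)
	}
	rest := handle[1:]
	i := strings.LastIndex(rest, "/")
	if i < 0 {
		return "/", rest, nil
	}
	// Reject an empty group ("@/sakila") or an empty name ("@prod/").
	if i == 0 || i == len(rest)-1 {
		return "", "", fmt.Errorf("invalid handle: %q", handle)
	}
	return rest[:i], rest[i+1:], nil
}

func main() {
	for _, h := range []string{"@prod/sakila", "@staging/sakila", "@sakila"} {
		group, name, err := splitHandle(h)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: group=%s name=%s\n", h, group, name)
	}
}
```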

Changed

  • sq ls now shows the active item in a distinct color. It no longer adds
    an asterisk to the active item.
  • sq ls now sorts alphabetically when using --table format.
  • sq ls now shows the sources in the active group only. But note that
    the default active group is / (the root group), so the default behavior
    of sq ls is the same as before.
  • sq add hello.csv will now generate the handle @hello instead of @hello_csv.
    On a second invocation, it will return @hello1 instead of @hello_csv_1. Why
    this change? Well, with the availability of the source group mechanism, the _ character
    in the handle somehow looked ugly. And more importantly, _ is a relative pain to type.
  • sq ping has changed to support groups. Instead of sq ping --all, you can
    do sq ping GROUP, e.g. sq ping /.
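
The handle generation rule above (@hello first, then @hello1) can be sketched as follows. This is a hypothetical Go illustration of the naming behavior as described in these notes, not sq's implementation. The choice to keep counting with @hello2 on a third invocation is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// suggestHandle derives a handle from a file path by dropping the directory
// and extension. If the handle is already taken, it appends the first free
// numeric suffix, starting at 1.
func suggestHandle(path string, taken map[string]bool) string {
	base := filepath.Base(path)
	base = strings.TrimSuffix(base, filepath.Ext(base))
	h := "@" + base
	if !taken[h] {
		return h
	}
	for i := 1; ; i++ {
		candidate := fmt.Sprintf("@%s%d", base, i)
		if !taken[candidate] {
			return candidate
		}
	}
}

func main() {
	taken := map[string]bool{}
	for i := 0; i < 2; i++ {
		h := suggestHandle("hello.csv", taken)
		taken[h] = true
		fmt.Println(h)
	}
}
```

Run as written, the program prints @hello and then @hello1, matching the behavior described above.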

sq - v0.32.0

Published by neilotoole over 1 year ago

Added

  • #187: For csv sources, sq will now try to auto-detect if the CSV file
    has a header row or not. Previously, this needed to be explicitly specified
    via an awkward syntax:

    $ sq add ./actor.csv --opts=header=true
    

    This change makes working with CSV files significantly lower friction.
    A command like the below now almost always works as expected:

    $ cat ./actor.csv | sq .data
    

    Support for Excel/XLSX header detection is in #191.
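
    These notes do not describe how sq detects the header row. As an illustration only, the Go sketch below shows one common heuristic: treat the first row as a header when none of its fields are numeric and at least one column is entirely numeric in the remaining rows. The function names are hypothetical, and sq's actual algorithm may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isNumeric reports whether s parses as a float64.
func isNumeric(s string) bool {
	_, err := strconv.ParseFloat(strings.TrimSpace(s), 64)
	return err == nil
}

// looksLikeHeader guesses whether rows[0] is a header row.
func looksLikeHeader(rows [][]string) bool {
	if len(rows) < 2 {
		return false // not enough data to compare against
	}
	// A header row is assumed to contain no numeric fields.
	for _, field := range rows[0] {
		if isNumeric(field) {
			return false
		}
	}
	// Look for a column whose data rows are all numeric.
	for col := range rows[0] {
		allNumeric := true
		for _, row := range rows[1:] {
			if col >= len(row) || !isNumeric(row[col]) {
				allNumeric = false
				break
			}
		}
		if allNumeric {
			return true
		}
	}
	return false
}

func main() {
	withHeader := [][]string{
		{"actor_id", "first_name"},
		{"1", "PENELOPE"},
		{"2", "NICK"},
	}
	fmt.Println(looksLikeHeader(withHeader))     // true
	fmt.Println(looksLikeHeader(withHeader[1:])) // false
}
```

    A heuristic like this cannot decide a file whose columns are all text, which is consistent with the "almost always" wording above.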

Fixed

  • sq is now better at detecting the (data) kind of CSV fields. It now more
    accurately distinguishes between Decimal and Int, and knows how to
    handle Datetime.

  • #189: sq now treats CSV empty fields as NULL.
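
The two fixes above interact: if empty fields are NULL, they should not influence the detected kind of a column. The Go sketch below illustrates that idea with a hypothetical detectKind function. The kind names Int, Decimal and Datetime come from these notes; the Text and Null results, the RFC 3339 datetime format, and the overall approach are assumptions and not sq's implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// detectKind guesses the kind of a CSV column from its string values.
// Empty values are treated as NULL and skipped.
func detectKind(vals []string) string {
	isInt, isDecimal, isDatetime := true, true, true
	seen := false
	for _, v := range vals {
		if v == "" {
			continue // NULL does not affect the column's kind
		}
		seen = true
		if _, err := strconv.ParseInt(v, 10, 64); err != nil {
			isInt = false
		}
		if _, err := strconv.ParseFloat(v, 64); err != nil {
			isDecimal = false
		}
		if _, err := time.Parse(time.RFC3339, v); err != nil {
			isDatetime = false
		}
	}
	switch {
	case !seen:
		return "Null" // assumption: a column with only empty fields
	case isInt:
		return "Int"
	case isDecimal:
		return "Decimal"
	case isDatetime:
		return "Datetime"
	default:
		return "Text"
	}
}

func main() {
	fmt.Println(detectKind([]string{"1", "", "2"}))
	fmt.Println(detectKind([]string{"1", "2.5"}))
	fmt.Println(detectKind([]string{"2023-06-21T11:41:34Z", ""}))
	fmt.Println(detectKind([]string{"PENELOPE", "NICK"}))
}
```

The four calls print Int, Decimal, Datetime and Text respectively. Checking Int before Decimal matters, because every integer string also parses as a float.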

sq - v0.31.0

Published by neilotoole over 1 year ago

Added

  • #173: Predefined variables via the --arg flag (docs):
    $ sq --arg first TOM '.actor | .first_name == $first'
    

Changed

  • Use --md instead of --markdown for outputting Markdown.

Fixed

  • #185: sq inspect now better handles "too many connections" situations.
  • go.mod: Moved to jackc/pgx v5.
  • Refactor: switched to slog logging library.

sq - v0.30.0

Published by neilotoole over 1 year ago

Added

  • #164: Implemented unique function (docs):
    $ sq '.actor | .first_name | unique'
    
    This is equivalent to:
    SELECT DISTINCT first_name FROM actor
    
  • Implemented count_unique function (docs).
    $ sq '.actor | count_unique(.first_name)'
    

Changed

  • The count function has been changed (docs):
    • Added no-args version: .actor | count is equivalent to SELECT COUNT(*) AS "count" FROM "actor".
    • BREAKING CHANGE: The "star" version (.actor | count(*)) is no longer supported; use the
      no-args version instead.
  • Function columns are now named according to the sq token, not the SQL token.
    # previous behavior
    $ sq '.actor | max(.actor_id)'
    max("actor_id")
    200
    
    # now
    $ sq '.actor | max(.actor_id)'
    max(.actor_id)
    200
    
sq - v0.29.0

Published by neilotoole over 1 year ago

Added

Changed

  • Renamed groupby to group_by to match jq.
  • Renamed orderby to order_by to match jq.

sq - v0.28.0

Published by neilotoole over 1 year ago

Added

sq - v0.27.0

Published by neilotoole over 1 year ago

Added

sq - v0.26.0

Published by neilotoole over 1 year ago

Added

  • #98: Whitespace is now allowed in SLQ selector names. You can
    do @sakila | ."film actor" | ."actor id".

Fixed

  • #155: sq inspect now populates schema field in JSON for MySQL,
    SQLite, and SQL Server (Postgres already worked).