dolt

Dolt – Git for Data

APACHE-2.0 License

Downloads: 2.4K
Stars: 17.1K
Committers: 143

dolt - 0.22.7

Published by oscarbatori almost 4 years ago

We are pleased to announce Dolt 0.22.7.

This release focuses on bug fixes and SQL performance improvements. In particular, we delivered large performance improvements in our SQL implementation. The scope of these improvements is detailed on our benchmarks page.

Merged PRs

  • 1116: partition ranges, covering indexes, smarter iterators
  • 1111: README quotes changed bugfix for windows terminal
    On the README there are instructions on how to add values into a table. The values in one portion have single quotes on the outside and have double-quotes for any string. While that format works in a Unix terminal, it doesn't work in
  • 1109: Attempt to add default decimal type to FromKind
  • 1108: Fixed dolt status output incorrectly displayed for staged files
    The function printStagedDiffs always returned 0, even when there were diffs not staged. That return value also caused printStatus to print "nothing to commit, working tree clean", which was not the case.
    I changed printStagedDiffs to return the number of staged tables plus the number of staged docs instead. This also prevents printStatus from entering the if statement that prints that message.
  • 1107: go/libraries/doltcore/{row, sqle, table}: Generalize TableReader
    Created table.SqlTableReader as a replacement for directly reading from table maps. Used it to replace types.MapIterator in sqle.doltTableRowIter
  • 1106: Added verify-constraints command
  • 1105: /MySQLDockerfile: peg version to match Gemfile.lock BUNDLED WITH
  • 1103: /go/cmd/dolt: added feature flag for keyless schemas
  • 1102: go/libraries/doltcore/{doltdb,table}: remove row access methods from doltdb.Table
    Removed:
    • Table.GetRowByPKVals()
    • Table.GetRow()
    • Table.GetRows()
    Had to do some refactoring along the way to fix dependency cycles.
    Reversed dependency rowconv -> pipeline to pipeline -> rowconv
  • 1101: Ensure that MERGE() works properly with fast forward.
    Added test case as well.
  • 1098: Export NewJSONReader to use in dolthubapi
  • 1097: Fixed table import allowing NULLs in the primary key
    Fixes https://github.com/dolthub/dolt/issues/1096
  • 1093: go/libraries/doltcore/table/editor: Convert TableEditor to interface
  • 1090: Add --author, -m to COMMIT. Add --author to MERGE()
    COMMIT('-m', 'hi', '--author', 'John Doe [email protected]')
    MERGE('feature-branch', '--author', 'John Doe [email protected]')
  • 1089: Add the Dolt mascot to README
  • 1088: fixed rand seed
  • 1087: increase query parallelism from the default of 2 to 8
  • 1085: split TableEditors and IndexEditor to their own package
  • 1084: bats/: keyless spec
    This is a set of skipped BATS tests that provide a spec for keyless tables.
  • 1082: Fixed internal index comparisons considering unnecessary parameters
    Fixes https://github.com/dolthub/dolt/issues/1081
  • 1080: Fixed shell error loop on UNIQUE violation
    Fixes https://github.com/dolthub/dolt/issues/1079
  • 1078: Upgraded to latest go-mysql-server with support for indexed joins on any number of tables
  • 1075: Support for CURRENT_USER SQL function without ()
  • 1074: Add dolt_commit error check when autocommit is off
    Fails loudly when autocommit is off for dolt_commit.
  • 245: Fixed tuple comparisons
  • 240: Enginetests for Keyless tables
  • 239: naked functions
    Fix for naked CURRENT_USER function call was in vitess, this just adds tests.

Closed Issues

  • 1099: MERGE() is creating a new commit on FFs.
  • 1096: Table import can allow NULLs in the primary key
  • 1081: "string is too large for column"
  • 1079: Indefinitely errors in SQL shell once a UNIQUE constraint has been violated
  • 1071: Throw error in DOLT_COMMIT if autocommit is not true
  • 241: expression.Tuple is uncomparable
dolt - 0.22.6

Published by oscarbatori almost 4 years ago

Merged PRs

  • 1068: Add -a flag to cli and DOLT_COMMIT
    This PR adds a -a flag to dolt commit and DOLT_COMMIT. It stages all tables.
    It also cleans up some of the previous work done in #1056 by removing all method handlers in repo_state and moving them to the RepoStateReader and RepoStateWriter.
    It does not refactor the RSR/RSW interfaces in environment.go. This will be done in a subsequent PR.

Closed Issues

  • 1067: Support dolt commit -a
dolt - 0.22.5

Published by oscarbatori almost 4 years ago

This is a patch release that adds no new features or bug fixes.

Merged PRs

  • 1065: fix typo in GA for Homebrew
  • 1064: Fix brew formula
dolt - 0.22.3

Published by oscarbatori almost 4 years ago

Merged PRs

  • 1062: Updated go-mysql-server with a patch to fix failing function expressions
  • 238: Zachmu/funcs
    Got rid of all embedded function fields in function types, since they make it impossible for the analyzer to finish (function fields are not comparable with reflect.DeepEqual, which the analyzer uses to decide if the query plan has settled).
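
The comparability problem described in PR 238 can be reproduced in isolation. The sketch below is not go-mysql-server code: the withFunc and withoutFunc types and the settled helper are hypothetical stand-ins for a plan node that embeds a function field and one that does not. It relies only on the documented behavior of reflect.DeepEqual, which treats two func values as equal only when both are nil.

```go
package main

import (
	"fmt"
	"reflect"
)

// withFunc is a hypothetical plan node that stores its implementation
// as a function field.
type withFunc struct {
	Name string
	Eval func(int) int
}

// withoutFunc is the same hypothetical node with the function field removed.
type withoutFunc struct {
	Name string
}

func double(x int) int { return 2 * x }

// settled reports whether two successive plans are deeply equal. This mirrors
// the kind of fixed-point check the PR description attributes to the analyzer.
func settled(prev, next interface{}) bool {
	return reflect.DeepEqual(prev, next)
}

func main() {
	a := withFunc{Name: "double", Eval: double}
	b := withFunc{Name: "double", Eval: double}
	// Non-nil func fields are never deeply equal, so this never settles.
	fmt.Println(settled(a, b))

	c := withoutFunc{Name: "double"}
	d := withoutFunc{Name: "double"}
	fmt.Println(settled(c, d))
}
```

With the function field present the two otherwise identical nodes never compare equal, so a loop that waits for the plan to stop changing would not terminate on its own.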
dolt - 0.22.2

Published by oscarbatori almost 4 years ago

Merged PRs

  • 1059: pass in-memory gc gen when we conjoin
    On the conjoin path, we're not passing "garbage collection generation" when we update the manifest. NomsBlockStore interprets this as the conjoin having been preempted by an out-of-process write and blocks the write.
  • 1052: Vinai/dolt commit author no config
    Add a bats test that models the following behavior.
    1. Unsets name and user.
    2. Makes a sql change
    3. Add a commit with --author
dolt - 0.22.1

Published by oscarbatori almost 4 years ago

Merged PRs

  • 1045: Consolidated benchmark directory
  • 1043: Vinai/1034 add author info
    This adds the --author tag to dolt commit
  • 1042: Vinai/clean up tags
    Cleans up some of the comments left on #1041
  • 1041: Vinai/1023 remove tag info
    Fixes #1022 and #1023
  • 1040: go/libraries/doltcore/remotestorage: Add hedged download requests to remotestorage/chunk_store.
  • 1039: go/libraries/doltcore/remotestorage: Refactor urlAndRanges a bit.
  • 1038: go/libraries/doltcore/remotestorage: Simplify concurrentExec implementation with errgroup.
  • 1037: proto: Add StreamDownloadLocations to ChunkStoreService.
  • 1036: go/cmd/dolt/commands: added write queries and ancestor commit to filter-branch
  • 1033: Temporary parallelism implementation on indexes
  • 1031: reset --hard
  • 1029: Added dolt_version()
    As version() is used to emulate the target MySQL version, I've added dolt_version() so that one may specifically query the dolt version.
  • 1026: Increased the default sql server timeout to 8 hours
  • 1025: go/libraries/doltcore/remotestorage: chunk_store.go: Small improvements to GetDownloadLocations batch size and HTTP GET error logging.
  • 1024: dolt filter-branch
  • 1022: Skipping tests broken by recent changes to info schema (EXTRA)
  • 1021: != operator now uses indexes

Closed Issues

  • 1034: Add --author option to dolt commit
  • 1023: Remove tag info from EXTRA in dolt SQL schema
dolt - 0.22.0

Published by oscarbatori almost 4 years ago

We are excited to announce the minor version release of Dolt 0.22.0.

SQL Tables

We continue to expand the SQL tables that surface information about the commit graph. In this release we added:

  • dolt_commits
  • dolt_commit_ancestors
  • dolt_commit_diffs_<table>

SQL

We added support for prepared statements to our SQL server.

Merged PRs

  • 1019: Fix dolt ls --all
  • 1018: mysql-client-tests: Add some simple client connector tests for prepared statements.
  • 1016: Rewrote the README
  • 1015: go/go.mod: Bump go-mysql-server; support prepared statements.
  • 1014: Added bats test for index merging from branch without index
  • 1013: dolt_commits and dolt_commit_ancestors tables
  • 1012: added reset_hard() sql function
  • 1011: Bh/commit diff
  • 1009: Richer commit message for Dolt Homebrew bump
  • 1008: Mergeable Indexes Pt. 2
    Tests for mergeable indexes
  • 1002: s/liquidata-inc/dolthub/ for ishell and mmap-go
  • 1001: NewCreatingWriter breaks dolthubapi with recent changes
    There might be a better fix for this, but dolthubapi uses NewCreatingWriter which breaks with Andy's recent changes (it's being used in dolthubapi here)
  • 233: Reorder Master
  • 232: Indexes search for exact match rather than submatches
  • 231: added 'auto_increment' to EXTRA field for DESCRIBE table;
  • 229: Add support for prepared statements.

Closed Issues

  • 1007: dolt push does not seem to push correctly in Windows Powershell
  • 1003: AUTO_INCREMENT column info does not display in describe table output
dolt - 0.21.4

Published by oscarbatori almost 4 years ago

Merged PRs

  • 999: Another fix to brew bump job
  • 997: Fix typo in brew bump
    Taken from a failed run on the most recent release.
dolt - 0.21.2

Published by oscarbatori almost 4 years ago

Merged PRs

  • 995: support for ALTER TABLE AUTO_INCREMENT
  • 994: Updated namespace for sqllogictest
  • 993: Added WSL notice to README
  • 990: mysql auto increment semantics
  • 989: Fix a few docs typos
  • 988: {bats, go}: Some fixes to InferSchema and add bats test
  • 987: Turbine Import Fix
  • 985: go/**/*.go: Update copyright headers for company name change.
  • 982: go/libraries/utils/async: Have ActionExecutor use sync.WaitGroup.
  • 981: Attempt to clean up error signaling in diff summary.
  • 980: In prettyPrint, defer closing the iterator before doing anything else
    We were missing close() when an UPDATE or INSERT etc. had an error during cursor iteration, therefore leaving a server process running. Also save sql history file before executing the query, so it gets saved even if the user interrupts execution.
  • 976: /.github/workflows/ci-go-tests.yaml: run go tests only when go/ changes
    I think this might be a good addition... will only run go tests when there are go changes
  • 975: Extract some import logic to be used in dolthubapi
    In reference to this comment https://github.com/dolthub/ld/pull/5262#discussion_r514465176
    I had some duplicate logic in dolthubapi for the import table API. I extracted some logic so that I can use InferSchema and MoveDataToRoot to reduce some of the duplication.
  • 974: Skipped two newly added test queries that don't work in dolt yet
  • 973: Support for COM_LIST_FIELDS, fixed SHOW INDEXES
  • 972: Update README.md
    Removed errant Liquidata reference
  • 971: Added GitHub workflow tests for race conditions
    Will fail until https://github.com/dolthub/dolt/pull/967 is merged into master, however the workflow only works when the PR is based against master. Therefore this PR does not target the aforementioned PR's branch.
  • 970: Memory fix for CREATE INDEX
    Used a pre-existing 16 million row repo to test CREATE INDEX memory usage on.
    Before: 72.47GB RAM usage, 18min 48sec
    After: 1.88GB RAM usage, 2min 2sec
    Copied the same strategy as used in table_editor.go to periodically flush the contents once some arbitrary number of operations has been performed.
  • 967: go: Make all tests pass under -race.
  • 966: go/store/types/edits: Rework AsyncSortedEdits to use errgroup, and a transient goroutine for each work item.
  • 965: dolt merge --no-ff
  • 225: Andy/mysql auto increment
  • 224: Zachmu/xx
    Use xxhash everywhere, and standardize the construction of hash keys.
  • 223: Zachmu/in subquery
    Implemented hashed lookups for IN (SELECT ... ) expressions. This is about 5x faster than using indexed lookups into the subquery table in tests.
    In a followup I'm going to replace the existing CRC64 hashing with xxhash everywhere it's used, so we're back to a single hash function.
  • 221: Fixed bug in delete and update caused by indexes being pushed down to tables
  • 220: Support for COM_LIST_FIELDS, fixed SHOW INDEXES
  • 219: Zachmu/turbine perf
    1. Do pushdown analysis within subqueries
    2. Push index lookups down to tables in subqueries
  • 218: Fix unit tests to run with -race.
  • 217: validate auto_increment on in-line and out-of-line PK defs
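
The memory fix in PR 970 is described as periodically flushing buffered contents once some number of operations has been performed. A minimal sketch of that idea follows, under stated assumptions: flushingEditor, its fields, and buildIndex are hypothetical names, and the real table_editor.go logic is not reproduced here. The point is only that peak buffered state is bounded by the flush threshold rather than by the number of rows.

```go
package main

import "fmt"

// flushingEditor is a hypothetical editor that buffers index edits in memory
// and flushes them whenever the buffer reaches flushEvery entries.
type flushingEditor struct {
	flushEvery int
	pending    []string
	written    int // total edits flushed so far
	flushes    int // number of flushes performed
	maxPending int // largest buffer size observed
}

// Add buffers one edit and flushes if the threshold has been reached.
func (e *flushingEditor) Add(key string) {
	e.pending = append(e.pending, key)
	if len(e.pending) > e.maxPending {
		e.maxPending = len(e.pending)
	}
	if len(e.pending) >= e.flushEvery {
		e.Flush()
	}
}

// Flush "writes" the buffered edits and resets the buffer. A real editor
// would persist them; here we only count them.
func (e *flushingEditor) Flush() {
	if len(e.pending) == 0 {
		return
	}
	e.written += len(e.pending)
	e.flushes++
	e.pending = e.pending[:0]
}

// buildIndex feeds rows edits through the editor and flushes the remainder.
func buildIndex(rows, flushEvery int) *flushingEditor {
	e := &flushingEditor{flushEvery: flushEvery}
	for i := 0; i < rows; i++ {
		e.Add(fmt.Sprintf("row-%d", i))
	}
	e.Flush()
	return e
}

func main() {
	e := buildIndex(10, 4)
	fmt.Println(e.written, e.flushes, e.maxPending) // prints "10 3 4"
}
```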

Closed Issues

  • 978: Support UNIQUE in CREATE TABLE statements, not just in CREATE INDEX statements
  • 962: Index creation must not be limited by working memory
  • 961: UNIQUE does not work on field level
  • 959: Cannot create UNIQUE index on FK fields Dolt considers it duplicate
dolt - 0.21.1

Published by oscarbatori almost 4 years ago

We are excited to announce the release of Dolt 0.21.1, a patch release with functionality and performance improvements.

Benchmarks

A significant new aspect of the Dolt release process will be providing SQL benchmarks. You can read a blog post about our approach to benchmarking using sysbench here, and you can find the benchmarking tools here. By way of example, the benchmarks for this release were created with the following command:

./run_benchmarks.sh bulk_insert oscarbatori v0.21.0 v0.22.1

This produced the following result, which we host on DoltHub:

Merged PRs

  • 957: go/store/{datas,util/tempfiles}: Fix some races in map writes. One affects clones, one affects only tests.
  • 953: create auto_increment tables with out-of-line PK defs
  • 952: go/libraries/doltcore/sqle: Add support for UPDATE and DELETE using table indexes.
  • 949: auto increment
  • 947: don't drop column values on column drop
  • 946: go/cmd/dolt: commands/sql: Small improvement to only call rowIter.Close() once on sql results iterators.
  • 945: Use docker-compose for orchestrating benchmarking
  • 944: go/store/types: value_store: Optimize GC to work in parallel and use less memory.
  • 942: feature gating
  • 941: Upgraded to latest go-mysql-server and re-enabled query plan tests
  • 939: Added new indexes overwriting auto-generated indexes
  • 938: go/store/{nbs,chunks}: Convert some core methods to provide results in callbacks. Convert some functions to use errgroup.
  • 937: Update README.md with the latest dolt commands
  • 934: Add go routine to clone
    I parallelized the table file writing process by using goroutines. Specifically, I made use of the "golang.org/x/sync/errgroup" package, which allows for convenient error management across a waitgroup.
    A couple of benchmarks I tested this on were:
    1. Dolt-benchmarks-test: No real difference in speed
    2. Coronavirus: Original: ~30sec. Current: 15sec
    3. Tatoeba Sentence Translation: Original: ~17mins. Current: 10mins
  • 933: /go/libraries/doltcore/diff: Ignore NULLs in cell-wise diff
    fix for https://github.com/dolthub/dolt/issues/899
    The from root in this repo has NULL values written to the map which causes erroneous diffs.
    https://www.dolthub.com/repositories/dolthub/us-supreme-court-cases/compare/master/hb502v6tf3uj43ijfhot6dopmgdm1muk
  • 932: /go/cmd/dolt/commands: Help Text Fix
  • 216: Updated sql.MergeableIndexLookup interface
  • 215: memory: *_index.go: Construct sql equality evaluations with accurate types in the literals.
  • 214: auto increment
  • 213: triggers bugfix
    Fixed bug in insert triggers, which couldn't handle out-of-order column insertions.
    Fixes https://github.com/dolthub/dolt/issues/950
  • 212: sql/analyzer: pushdown.go: Allow pushdown on Update, RowUpdateAccumulator and DeleteFrom plan nodes.
  • 211: join bugs
  • 210: sql/plan: {update,insert,update,process}.go: Fix some potential issues with context lifecycle and reuse.
    • insert, update, delete: Only call underlying table editors with our captured context once when we are Close(). Return a nil error after that.
    • process: Change to only call onDone when the rowTrackingIter is Closed.
    • process: Change to call childIter.Close() before onDone is called. Child iterators have a right to Close() before the context in which they are running is canceled.
  • 208: Create UNIQUE index if present in column definition
  • 207: Pushdown and plan decoration
    Two major changes:
    1. Changes to pushdown operation, to push table predicates below join nodes and to fix many bugs and deficiencies. Also large refactoring.
    2. Added DecoratedNodes to query plans to illustrate when indexes are being used to access tables outside the context of a join
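
PR 934 above parallelizes table file writes with goroutines and collects errors through errgroup. To stay self-contained, the sketch below swaps in the standard library (sync.WaitGroup plus sync.Once to keep the first error) rather than errgroup itself. The writeTableFile and writeAll functions are hypothetical and do no real I/O.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// writeTableFile is a hypothetical stand-in for writing one table file.
// It fails on an empty name so the error path can be exercised.
func writeTableFile(name string) error {
	if name == "" {
		return errors.New("empty table file name")
	}
	return nil
}

// writeAll writes every table file in its own goroutine and returns the
// first error recorded, or nil if all writes succeed.
func writeAll(names []string) error {
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for _, name := range names {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			if err := writeTableFile(n); err != nil {
				// Only the first failing goroutine records its error.
				once.Do(func() { firstErr = err })
			}
		}(name)
	}
	// Wait establishes that all writes to firstErr are visible here.
	wg.Wait()
	return firstErr
}

func main() {
	fmt.Println(writeAll([]string{"a", "b", "c"}))
	fmt.Println(writeAll([]string{"a", "", "c"}))
}
```

Unlike this sketch, a version built on errgroup.WithContext can also cancel the remaining work once one write fails.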
dolt - 0.21.0

Published by oscarbatori about 4 years ago

We are excited to announce the release of Dolt 0.21.0. This release contains a host of exciting features, as well as our usual blend of bug fixes and performance improvements.

Squash merge

Through our own internal data collaboration projects, we realized that a squash command, which condenses change sets as a courtesy to collaborators, is an essential tool. It is now available in Dolt.

NFS Mounted Drives

A user highlighted that Dolt didn't work with NFS mounted drives due to the way it was interacting with the filesystem. We have now fixed this.

Garbage Collection

We now have a dolt gc command for cleaning up unreferenced data. This was requested by several users as a space saving mechanism in production settings.

Performance Improvements

We continue to aggressively pursue performance improvements, most notably a huge improvement in full table scans.

sysbench tooling

As we detailed in a blog post yesterday, we have created tooling to give our development team and contributors a simple way to measure SQL performance. For example, to compare an arbitrary commit to the current working set (to test whether changes introduce the expected performance benefits):

$ ./run_benchmarks.sh bulk_insert <username> 19f9e571d033374ceae2ad5c277b9cfe905cdd66

This will build Dolt at the appropriate commits, spin up containers with sysbench, and execute the benchmarks.

Documentation Fixes

An open source contributor provided several fixes to our CLI documentation, which we have gratefully merged.

GCP Remotes

We have fixed Google Cloud Platform remotes, prompted by a bug report from a user experimenting with Dolt.

Merged PRs

  • 930: Bump go-mysql-server
  • 929: store/types: value_store.go: GC implementation uses errgroup instead of atomicerr.
  • 928: gc chunks
    Implements garbage collection by traversing a Database from its root chunk and copying all reachable chunks to a new set of NBS tables.
    While "garbage collection generation" will protect the NBS from corruption by out-of-process writers, GC is not currently thread safe for concurrent use in-process. Getting to online GC will require work around protecting in-progress writes that are not yet reachable from the root chunk.
  • 927: /.github/workflows/ci-bats-tests.yaml: skip aws tests if no secrets found
  • 925: benchmark tools
  • 923: doc corrections
    fixed some typos (I think 😊)
  • 922: go/util/sremotesrv: grpc.go: Echo the client's NbsVersion in GetRepoMetadata.
  • 921: fix gcp remotes
  • 920: go/go.mod: Adopt dolthub/fslock fork. Forked version uses Open(RDRW) for lock file on *nix, which works on NFS.
  • 918: /.github/workflows/ci-bats-tests.yaml: remove deprecated syntax
  • 917: Increase maximum SQL statement length to 100MB (initially 512K)
  • 915: Daylon's suggestions for bheni perf PR Pt. 2
  • 914: Fix for reading old dolt_schemas
  • 913: squash merge
  • 912: go/store/{datas,nbs}: Use application-level temp dir for byte sink chunk files with datas.Puller.
  • 911: Daylon's suggestions for bheni perf PR
  • 910: Adding "Garbage Collection Generation" to manifest file
    This new manifest field will support NomsBlockStore garbage collection and protect against NBS corruption. Storing gcGen in the manifest will support deleting chunks from an NBS in a safe way. NBS instances that see a different gcGen than they saw when they last read the manifest will error and require clients to re-attempt their write.
    NBS will now have three forms of write errors (not including IO errors or other kinds of unexpected errors):
    • nbs.errOptimisticLockFailedTables: Another writer landed a manifest update since the last time we read the manifest. The root chunk is unchanged and the set of chunks referenced in the manifest is either the same or has strictly grown. Therefore the NBS can handle this by rebasing on the new set of tables in the manifest and re-attempting to add the same set of novel tables.
    • nbs.errOptimisticLockFailedRoot: Another writer landed a manifest update that includes a new root chunk. The set of chunks referenced in the manifest is either the same or has strictly grown, but it is not known which chunks are reachable from the new root chunk. The NBS has to pass this value to the client and let them decide. If the client is a datas.database it will attempt to rebase, read the head of the dataset it is committing to, and execute its mergePolicy (Dolt passes a noop mergePolicy).
    • chunks.ErrGCGenerationExpired: This is similar to a moved root chunk, but with no guarantees about what chunks remain in the ChunkStore. Any information from CS.Has(ctx, chunk) is now stale. Writers must rewrite all data to the chunkstore.
  • 909: use tr to lowercase output instead of {output,,}
    lowercasing via parameter expansion ${output,,} is only supported on Bash 4+. I switched to using tr so I could run the tests locally.
  • 205: Implemented drop trigger
    As discussed, we disallow dropping any triggers that are referenced in other triggers.
  • 204: Added trigger statements
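
The three write-error cases listed under PR 910 can be read as a comparison between the manifest a writer last saw and the manifest currently on disk. The following is a rough sketch of that classification, not the NBS implementation: the manifest struct, its fields, checkWrite, and the local error variables are all hypothetical, and checking the GC generation first is an assumption based on it offering the weakest guarantees.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the three error kinds named in the PR description.
var (
	errLockFailedTables = errors.New("optimistic lock failed: tables changed")
	errLockFailedRoot   = errors.New("optimistic lock failed: root changed")
	errGCGenExpired     = errors.New("gc generation expired")
)

// manifest is a simplified, hypothetical view of the relevant manifest fields.
type manifest struct {
	gcGen  string // garbage collection generation
	root   string // root chunk identifier
	tables string // identifier for the current set of table files
}

// checkWrite compares the manifest a writer last read (seen) with the
// current one and reports which kind of conflict, if any, occurred.
func checkWrite(seen, current manifest) error {
	switch {
	case seen.gcGen != current.gcGen:
		// Nothing previously known about stored chunks can be trusted;
		// the writer must rewrite all of its data.
		return errGCGenExpired
	case seen.root != current.root:
		// Chunks only grew, but reachability from the new root is unknown;
		// the client decides how to proceed.
		return errLockFailedRoot
	case seen.tables != current.tables:
		// Same root, more tables: rebase on the new table set and retry.
		return errLockFailedTables
	default:
		return nil
	}
}

func main() {
	seen := manifest{gcGen: "g1", root: "r1", tables: "t1"}
	fmt.Println(checkWrite(seen, seen))
	fmt.Println(checkWrite(seen, manifest{gcGen: "g1", root: "r1", tables: "t2"}))
	fmt.Println(checkWrite(seen, manifest{gcGen: "g1", root: "r2", tables: "t2"}))
	fmt.Println(checkWrite(seen, manifest{gcGen: "g2", root: "r2", tables: "t3"}))
}
```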
dolt - 0.20.2

Published by oscarbatori about 4 years ago

We are excited to announce the release of Dolt 0.20.2, including a minor version bump as we introduce a new feature: SQL triggers.

SQL Triggers

SQL triggers are SQL snippets that are executed every time a row is inserted, updated, or deleted. Here is a simple insert example taken from the blog post announcing the feature:

$ dolt sql
> create table a (x int primary key);
> create table b (y int primary key);
> create trigger adds_one before insert on a for each row set new.x = new.x + 1;
> insert into a values (1), (3);
Query OK, 2 rows affected
trigger_blog> select * from a;
+---+
| x |
+---+
| 2 |
| 4 |
+---+

Any legal SQL statement can be executed as a trigger. Here we just defined a simple increment.
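
For readers who think in code rather than SQL, the adds_one example above can be modeled as a hook that rewrites each incoming row before it is stored. This is only a conceptual sketch: the row, table, and beforeInsert types are hypothetical and have nothing to do with how Dolt or go-mysql-server implement triggers.

```go
package main

import "fmt"

// row models a row of table a, which has a single integer column x.
type row struct{ x int }

// beforeInsert is a hypothetical hook that may rewrite the incoming row,
// playing the role of "BEFORE INSERT ... FOR EACH ROW".
type beforeInsert func(newRow *row)

// table is a toy in-memory table with a list of insert hooks.
type table struct {
	rows     []row
	triggers []beforeInsert
}

// insert runs every hook on each new row, then stores the result.
func (t *table) insert(values ...int) {
	for _, v := range values {
		r := row{x: v}
		for _, trig := range t.triggers {
			trig(&r)
		}
		t.rows = append(t.rows, r)
	}
}

func main() {
	a := &table{}
	// Equivalent in spirit to: set new.x = new.x + 1
	a.triggers = append(a.triggers, func(newRow *row) { newRow.x = newRow.x + 1 })
	a.insert(1, 3)
	for _, r := range a.rows {
		fmt.Println(r.x) // prints 2, then 4
	}
}
```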

Merged PRs

  • 908: Added comments for clarity
  • 907: Release
  • 906: Fixed conflict resolution and additional trigger tests
  • 905: Updated to latest go-mysql-server
  • 904: Added trigger functionality to Dolt
  • 900: Reference new org name
  • 897: Fixed CREATE LIKE multi-db
    Fixes https://github.com/liquidata-inc/dolt/issues/654
  • 896: Moved everything over to SHOW CREATE TABLE and fixed diff panic
  • 894: Fixed UNIQUE NULL handling to match MySQL
  • 892: Andy/gc table files
  • 890: Working Ruby ruby/mysql test
    Not to be confused with mysql/ruby which uses the MySQL C API.
  • 889: Release
  • 202: Zachmu/triggers 5
    Added additional validation for trigger creation and execution:
    • Use of NEW / OLD row references
    • Circular trigger chains
  • 200: Zachmu/triggers 4
    Support for DELETE and UPDATE triggers
  • 199: Reference new org name
  • 198: Added proper support for SET NAMES, and also turned off strict checking for setting unknown system variables.
  • 197: Zachmu/user vars
    User vars now working. Can stomp on a system var of the same name, as before my last batch of changes.
  • 196: Allow CREATE TABLE LIKE to reference different databases
  • 195: Zachmu/triggers 3
    This gets SET new.x = foo expressions working for triggers. This required totally rewriting how we were handling setting system variable as well, since these two kinds of statements are equivalent in the parser.
    Also deletes the convert_dates analyzer rule, which impacts 0 engine tests.
  • 194: No longer return span iters from most nodes by default.
  • 193: Implemented CREATE TABLE LIKE and updated information_schema
    Tests will come in a separate PR
dolt - 0.19.2

Published by oscarbatori about 4 years ago

Merged PRs

  • 886: no parallelism if GOMAXPROCS == 1
  • 885: cpp mysql client tests
  • 883: mysql client tests install golang
  • 878: Go MySQL client test
  • 877: validate ref strings when resolving ref specs
    fix for https://github.com/liquidata-inc/dolt/pull/874
  • 875: go: Changes to support some commit walks used in Dolthub diffs when the commits come from different repositories.
  • 874: Added skip bats test for ref spec panic on diff
  • 873: Fixed ActionExecutor causing duplicate key error loop
  • 870: Fixed bug with diffing column defaults
  • 869: update vitess
  • 868: Added perl mysql client tests
  • 867: Added Python SQLAlchemy test to mysql-client-tests
  • 190: Harrison pr
    https://github.com/liquidata-inc/go-mysql-server/pull/189/files and a couple fixes
  • 188: triggers 2
    Insert triggers working for the following cases:
    • Insert some rows
    • Delete some rows
    • Update some rows
    Missing, needs to be added:
    • set new.x = blah as part of a BEFORE INSERT trigger. Need to rewrite the SET handling parser logic for that.
    • error testing for bad triggers (like inserting on the same table the trigger is defined on)
    As part of this, I rewrote the execution logic for Update, Delete, and Insert entirely.
dolt - 0.19.1

Published by oscarbatori about 4 years ago

Merged PRs

  • 862: /go/{go.mod, go.sum}: Update go.mod with go-mysql-server@master
  • 861: dotnet mysql client test
  • 860: Fixed column renaming breaking default values
  • 859: mysql client test c
  • 187: Additions to utc_timestamp and timediff
  • 186: fix collations
  • 185: Fixed bug with column renames breaking default values
  • 184: /sql/expression/function/{date.go, date_test.go, registery.go}: Add utc_timestamp
  • 183: Fixes for bugs found during integration of column defaults
  • 180: triggers
  • 178: Column Defaults Implementation Part 2
    Here is a comprehensive set of tests for default logic. Practically everything that was added in https://github.com/liquidata-inc/go-mysql-server/pull/174 is covered here, including some edge cases. In addition, the memory table implementation was broken/insufficient in a few ways, so I patched those up.
    The biggest change besides the tests is the additional pass when projecting expressions. This is required in order for defaults that reference other columns (especially those that come after) to be able to pull the correct value. This was something I noticed only after I wrote a test that wasn't behaving as expected (compared to the MySQL client). In fact, all of the changes outside of enginetests were due to fixing bugs that were found from testing.
  • 176: Add import statements to readme example
    The example in the readme has no import statements, so it's unclear to someone new to the project. So I added some import statements!
  • 174: Column Defaults Implementation Part 1
    This is missing basically all of the new tests, which will come in a separate PR. Proper expressions -> string behavior will also come in a separate PR. Besides that, this is pretty much most of it barring additional bug fixes. All existing tests (some of which use defaults already) pass.
    For integrators, they'll make use of the new engine.ApplyDefaults method.

Closed Issues

  • 181: undefined ViewSelectPositionStart & ViewSelectPositionEnd
  • 175: Vitess Dep
  • 169: Feedback from attempted usage in Integration tests
dolt - 0.19.0

Published by oscarbatori about 4 years ago

We are excited to announce a minor version release of Dolt, going from 0.18.4 to 0.19.0. It is prompted by the addition of tags, modeled on Git, and read-tables, a command that provides a form of shallow clone by fetching only the data at a given commit or branch on a remote.

Tags

We are excited about Dolt as a data distribution format, and we want to provide tools for folks distributing data to robustly version their releases. Tags enable data publishers to signal discrete data releases to their users. Users can diff between two tags to compare changes.

dolt read-tables

In discussions with some of our users we found that they were excited about using Dolt as a format for collaborating on data, but needed to then use that data in automated settings where it would be unacceptable to clone the whole history on every clone, for example ephemeral compute resources that use the data for automated jobs. We provided read-tables as a way to obtain the Dolt database at a given commit, or branch.

Other

As usual, this release includes bug fixes and performance improvements. We particularly emphasized fixes to go-mysql-server, which is becoming an increasingly important interface for many of our users.

Merged PRs

  • 858: Push/Pull Tags
  • 851: tag redesign
  • 850: Db/containerize mysql test env
  • 849: /mysql-client-tests/{node/, MySQLDockerfile}: POC dockerizing test env
    so the docker container seems to work with the following changes... my JS/Node is kinda wack these days too...
    so this can run like this:
    $ cd mysql-client-tests
    $ docker build -t mysql-tests -f MySQLDockerfile .
    $ docker run -it --rm mysql-tests:latest /bin/bash
    root@e1fbe3c4579f:/mysql-client-tests# bats mysql-client-tests.bats
    ✓ python mysql.connector client
    ✓ python pymysql client
    ✓ mysql-connector-java client
    ✓ node mysql client
    4 tests, 0 failures
    root@e1fbe3c4579f:/mysql-client-tests# exit
    
  • 848: MySQL Client Tests
    This PR will contain the MySQL client tests in each language. I've chosen BATS as the framework to drive the testing because I'm familiar with it. We can swap that out if necessary.
  • 847: add ReadAheadTableReader.go
    The pipeline framework already handles this, but when reading outside a pipeline it can benefit performance to have one goroutine reading the data, separate from the goroutine that processes it.
  • 844: go 1.15 fixes
  • 843: partitioning of dolt tables
  • 842: Tie Dolt commit struct fields to Noms commit struct fields
    As far as I can tell, the constants in doltdb/Commit.go only work because they match the constants here
    Also, there is currently a mismatch between doltdb.parentListField and datas.ParentListField which should cause the list of parents to never be found
  • 841: parallelism
  • 840: query differ patch
  • 839: Bug fix for reverse sequence cursor.
    Not sufficient to just mark a cursor reverse after the fact, need to initialize cursors at all levels of the tree as reverse as well. Still needs tests, but fixes panics in sqllogictest.
  • 162: Changed the test data for one_pk, two_pk to be able to differentiate …
    …between the various columns. This gives us more confidence that we're choosing the right field index in subqueries.
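
The read-ahead idea in PR 847 above, one goroutine reading while another processes, can be sketched with a buffered channel. This is an illustration under assumptions, not the contents of ReadAheadTableReader.go: the rowReader interface, sliceReader, readAhead, and collect are hypothetical names, and a production version would also need a way to stop the background goroutine if the consumer quits early.

```go
package main

import "fmt"

// rowReader is a hypothetical minimal reader interface.
type rowReader interface {
	// Next returns the next row and true, or "" and false at the end.
	Next() (string, bool)
}

// sliceReader serves rows from an in-memory slice.
type sliceReader struct {
	rows []string
	pos  int
}

func (r *sliceReader) Next() (string, bool) {
	if r.pos >= len(r.rows) {
		return "", false
	}
	row := r.rows[r.pos]
	r.pos++
	return row, true
}

// readAhead wraps another reader. A background goroutine pulls rows from
// the source and keeps up to depth of them buffered in a channel.
type readAhead struct {
	ch chan string
}

func newReadAhead(src rowReader, depth int) *readAhead {
	ra := &readAhead{ch: make(chan string, depth)}
	go func() {
		defer close(ra.ch) // signals end of data to the consumer
		for {
			row, ok := src.Next()
			if !ok {
				return
			}
			ra.ch <- row
		}
	}()
	return ra
}

// Next returns the next buffered row, blocking until one is available.
func (ra *readAhead) Next() (string, bool) {
	row, ok := <-ra.ch
	return row, ok
}

// collect drains a reader into a slice, preserving order.
func collect(r rowReader) []string {
	var out []string
	for {
		row, ok := r.Next()
		if !ok {
			return out
		}
		out = append(out, row)
	}
}

func main() {
	src := &sliceReader{rows: []string{"a", "b", "c"}}
	fmt.Println(collect(newReadAhead(src, 2)))
}
```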
dolt - 0.18.3

Published by oscarbatori about 4 years ago

We are releasing Dolt 0.18.3 with a host of bug fixes and performance improvements. In particular:

  • SHOW TABLES now supports AS OF syntax, which can point at a timestamp or branch
  • subqueries can now reference their surrounding outer scope, bringing them into line with MySQL subqueries
  • substantial memory footprint improvements, particularly when importing data

Merged PRs

  • 837: /{.github, go}: remove check-committers check from ci
  • 835: Updated go-mysql-server, added test for SHOW TABLES AS OF
  • 834: Andy/release tags
    First pass at implementing dolt tag. Basic functionality to create, list, and delete tags. For now the command is hidden. At the Noms layer, tags are implemented as commits. Still todo:
    • Checkout a tag (Git does this via detached head)
    • Push/Pull tags to/from a remote
  • 831: Updated to hard fork of vitess
  • 829: Zachmu/mysql update
    Fixed compile errors and problems in query differ in latest go-mysql-server (not checked in yet). Will update go-mysql-server in go.mod before checking this in. Tests will fail until then.
  • 827: add --show-current option to dolt branch
    I added --show-current option to dolt branch cli.
    ref. https://github.com/liquidata-inc/dolt/issues/818
  • 826: shallow clone
  • 825: Updated version for release of version 0.18.2
  • 159: Updated references for hard fork of vitess
  • 158: Fixed panic when using non-literals as column defaults
    This fixes #104
  • 157: Fixed the run script and a couple of the integrations
  • 156: Bug fix for issue 152
    This fixes #152
  • 155: Zachmu/delete js
    Deleted JS integration and unused theme files.
  • 154: Zachmu/analyzer scope 5
    Final-ish solution for subquery expression analysis. Haven't yet merged the original PR since I still haven't fixed dolt to work with this.
  • 151: Zachmu/analyzer scope 3
    Partial implementation of subqueries with outer scope resolution. Not every subquery works yet, but I believe that this is strictly additive in terms of capabilities: no query that worked before is broken by this change, and many now pass. I want to get this merged in since it includes a few large interface changes and file moves that could conflict with other work. I will verify that this increases pass rates in sqllogictest before merging.
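
To illustrate what outer scope resolution means for the subquery work in PRs 151 and 154, here is a minimal sketch in Go. It is not go-mysql-server's analyzer code: the Scope type, its fields, and the Resolve method are hypothetical names chosen for this example. The idea shown is only that a column name is looked up in the subquery's own scope first and then in each enclosing scope.

```go
package main

import "fmt"

// Scope is a hypothetical stand-in for an analyzer scope. It maps the column
// names visible at one query level to the table that provides them.
type Scope struct {
	Columns map[string]string // column name -> table name
	Outer   *Scope            // enclosing query scope, nil at the top level
}

// Resolve looks up a column in this scope, then walks outward through the
// enclosing scopes until it finds a match or runs out of scopes.
func (s *Scope) Resolve(col string) (string, bool) {
	for cur := s; cur != nil; cur = cur.Outer {
		if table, ok := cur.Columns[col]; ok {
			return table, true
		}
	}
	return "", false
}

func main() {
	// Illustrative tables only; the names echo the one_pk / two_pk test data.
	outer := &Scope{Columns: map[string]string{"pk": "one_pk"}}
	inner := &Scope{Columns: map[string]string{"pk1": "two_pk"}, Outer: outer}

	// The subquery scope can see its own column and the outer query's column.
	for _, col := range []string{"pk1", "pk", "missing"} {
		table, ok := inner.Resolve(col)
		fmt.Println(col, table, ok)
	}
}
```

Because the inner scope is checked first, a column defined at both levels resolves to the innermost definition in this sketch.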

Closed Issues

  • 830: Dropping named foreign key does not allow the foreign key name to be reused
  • 152: Better error message for unquoted string in AS OF queries
dolt - 0.18.2

Published by oscarbatori about 4 years ago

We have released Dolt 0.18.2 with bug fixes, performance improvements (mainly around memory usage), and improvements to the way we handle merges.

Merged PRs

  • 824: /.github/workflows/ci-bats-tests.yaml: revert changes now that breaki…
    fixed here https://github.community/t/github-action-environment-variable-missing-breaking-issue/125913
  • 823: /.github/workflows/ci-bats-tests.yaml: Skip aws remotes bats tests
  • 822: /go/go.mod: update eventsapi generated types
    related to https://github.com/liquidata-inc/dolt/issues/816
  • 820: foreign keys can use primary keys as parent table indexes
    This change makes a special case for foreign keys that reference primary key(s) in the parent table. If all of the parent table columns are primary keys, we create an index to use for the foreign key.
    Ideally we'd make use of the existing clustered index as the parent table index. Because we will soon be adding keyless tables and allowing primary key changes, I'm punting, for now, on a more elegant implementation.
  • 814: Fixed memory leak when running sqllogictests
    Performance seems roughly the same compared to the pre-FK changes from my testing. Perhaps not the best solution, but it fixes the leak while preserving performance.
  • 813: Foreign Key, Index changes
    • Added diff support for foreign keys and indexes
    • Updated commit, stage, and reset logic to match tables based on column tags instead of table names
  • 812: Andy/range reader empty map
  • 811: Andy/foreign key changes
  • 807: ALTER TABLE x ADD INDEX unnamed
  • 805: Added support and tests for CREATE TABLE INDEX
  • 804: Resolved memory leak
    Fixes a memory leak that was observed during a large import. It appears that the primary key tuples were never deallocated, although I could not pinpoint what's holding on to the tuples. The tableEditAccumulator seems to be the only place, and it is discarded after it goes through flushEditAccumulator, so that doesn't seem to be it. However, setting its fields to nil fixes the leak, indicating that something is still holding on to old tableEditAccumulators. I also checked if async.ActionExecutor was the culprit, but manual debugging and testing showed that it wasn't holding on to any references once they went through the work method. This isn't a true fix, as there is still technically a leak for whatever is holding on to the tableEditAccumulator, but it's a leak of mere kilobytes per hour rather than megabytes per minute.
    Here is an image from pprof from before.
    image
    Same query, but with the changes. Will not be equal to zero as the tuples (returned from here) are still stored in memory before being written to disk, so this is what we'd expect/desire.
    image
    Also verified each by watching memory usage in Task Manager (Windows 10).
  • 800: Updated version for release of version 0.18.1
  • 149: Added indexes to CREATE TABLE
  • 147: create parent table indexes for foreign keys
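
PR 820 above special-cases foreign keys whose referenced parent columns are all primary key columns. The check behind that decision can be sketched in Go as follows. This is an illustration under assumptions, not Dolt's code: the function name and the use of plain column-name slices are hypothetical.

```go
package main

import "fmt"

// referencesOnlyPrimaryKeys reports whether every referenced parent column is
// part of the parent table's primary key. Both arguments are column names.
// An empty reference list is treated as "no" in this sketch.
func referencesOnlyPrimaryKeys(primaryKey, referenced []string) bool {
	if len(referenced) == 0 {
		return false
	}
	pk := make(map[string]bool, len(primaryKey))
	for _, col := range primaryKey {
		pk[col] = true
	}
	for _, col := range referenced {
		if !pk[col] {
			return false
		}
	}
	return true
}

func main() {
	parentPK := []string{"id", "region"}

	// Every referenced column is a primary key column.
	fmt.Println(referencesOnlyPrimaryKeys(parentPK, []string{"id"}))

	// "name" is not part of the primary key.
	fmt.Println(referencesOnlyPrimaryKeys(parentPK, []string{"id", "name"}))
}
```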
dolt - 0.18.1

Published by oscarbatori over 4 years ago

We are pleased to announce a patch release with bug fixes and performance improvements, specifically:

  • SQL Alchemy Python library could not parse SHOW CREATE statements from Dolt because the datatypes came back upper case (valid SQL, but breaks a hashtable look up in SQL Alchemy metadata parsing)
  • Fix the dolt_history_<table> tables, which had broken commit filtering
  • dolt commit now returns an error when attempting to create a commit with unresolved conflicts still in the working set

Merged PRs

  • 799: go/libraries/doltcore/env/actions: commit.go: Error when attempting to create a commit with unresolved conflicts still in the working set.
    I realized when working on a blog post that dolt commit will currently go ahead and create the merge commit even if there are unstaged conflicts in one of the merged tables. This is not the behavior we want.
    Currently, we do not allow staging tables that have conflicts. This change makes it so dolt commit also fails if there are any conflicts in any of the tables in the working set.
  • 796: Andy/history table bug
  • 795: go/libraries/doltcore/doltdb: Use parentsList in a Commit if it is available.
  • 794: uncommenting tests
  • 793: Zachmu/analyzer update
    Upgraded to latest go-mysql-server, fixed breaking changes from analyzer renamings / signature changes.
  • 789: schema merge
    Foreign Key merge is somewhat incomplete here, hence the skipped BATS. I'm going to implement the FK changes we discussed earlier this week in order to finish FK merge.
  • 788: Db/fix ci checkfmt
    Not really sure what's going on here... I reran your branch, which seems to consistently fail the go/utils/repofmt/checkfmt.sh... That script uses goimports (go install golang.org/x/tools/cmd/goimports) to check the import order of the file... Not really sure why this works.
    Alternatively, I could separate each script into its own run step in the workflow...
  • 787: go/store/datas: commit.go: Start storing commit parents as a List instead of a Set.
    Commit parent order has meaning in our use case. The branch that gets merged into is always the first commit in the parents list. Up until this change, the noms layer stores commit parents as a Set, which is unordered and orders the parents based on their commit hash.
    This changes commit struct to carry both parents Set and parentsList List. CommitOptions supplies a ParentsList, but both get stored for backwards compatibility with existing dolt clients.
    This change does not include changes to start using the stored parentsList in things like ancestor traversal or dolt log. The intended migration is that the read logic will read from parentsList in a commit if it is present, and will fall back to the parents Set if it is not.
    Eventually we will be able to migrate to not writing parents Set anymore. There is no current plan to drop support for reading parents Set when parentsList is not available.
  • 786: go/libraries/doltcore/doltdb: commit_spec.go: Make CommitSpec internal state unexported.
  • 785: go/cmd/dolt/commands: merge.go: Make merge operate on commit specs, not branch refs directly.
  • 784: go/cmd/dolt: Change commit spec handling so that abbreviated forms of remotes are supported.
    This changes doltdb.CommitSpec to carry the original input in the refs case, instead of trying to add a prefix or anything else. Instead, doltdb.Resolve() takes the current HEAD ref, and fully resolves the CommitSpec itself.
    There was some confusion in usage across the code base with NewCommitSpec. In particular, there was a lot of NewCommitSpec("HEAD", "some-branch-name-that-is-not-CWB") in order to get NewCommitSpec("some-branch-name-that-is-not-CWB", ""). This collapsed all such uses to the same syntax. "HEAD" is only used for resolving the CWB now.
    In places where the CommitSpec was statically known to not be HEAD, I've not bothered to always thread the CWBRef to the Resolve call.
    This change will break ld Dolt usage, but I will follow up there when this lands in master.
  • 782: go/libraries/doltcore/merge: resolve.go: Fix panic when resolving conflicts for a deleted-in-our-branch row.
  • 781: Zachmu/show foreign keys
    Implemented new foreign key interfaces for go-mysql-server, and upgrade to latest vitess and go-mysql-server
  • 780: Release
  • 145: Zachmu/analyzer scope
    Refactoring / renaming, method comments, and new tests related to analyzer deep dive. Added a new Scope param to every analyzer function (not yet used).
  • 144: Lower-case types in SHOW CREATE TABLE and DESCRIBE TABLE output.
    This matches MySQL behavior, and is required by at least one third-party tool (SqlAlchemy)
  • 143: Zachmu/fk bugfix
    Fixed a couple issues in test setup revealed by testing foreign keys with dolt
  • 142: Formatted the repo
  • 141: Zachmu/desc table
    Added key info to output of DESCRIBE TABLE / SHOW COLUMNS
  • 140: Zachmu/show create foreign keys
    Support for foreign keys in SHOW CREATE TABLE statements, and engine tests of the same.
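
The read-side migration described in PR 787 above (read from parentsList when it is present, otherwise fall back to the parents Set) can be sketched in Go as follows. This is a simplified illustration, not the Noms or Dolt implementation: the Commit type, its fields, and orderedParents are hypothetical, and parent hashes are modeled as plain strings.

```go
package main

import (
	"fmt"
	"sort"
)

// Commit is a hypothetical stand-in for a stored commit. Newer writers fill
// in both fields; commits written by older clients only have Parents.
type Commit struct {
	Parents     map[string]struct{} // legacy unordered set of parent hashes
	ParentsList []string            // ordered parents, nil on older commits
}

// orderedParents prefers the ordered ParentsList and falls back to the legacy
// set, sorted by hash, when no list was stored.
func orderedParents(c Commit) []string {
	if c.ParentsList != nil {
		return c.ParentsList
	}
	hashes := make([]string, 0, len(c.Parents))
	for h := range c.Parents {
		hashes = append(hashes, h)
	}
	sort.Strings(hashes)
	return hashes
}

func main() {
	set := map[string]struct{}{"zzz": {}, "aaa": {}}

	// New-style commit: the merged-into branch ("zzz") stays first.
	merged := Commit{Parents: set, ParentsList: []string{"zzz", "aaa"}}
	fmt.Println(orderedParents(merged))

	// Old-style commit: only the set exists, so order comes from the hashes.
	legacy := Commit{Parents: set}
	fmt.Println(orderedParents(legacy))
}
```

Sorting the fallback by hash mirrors the ordering the PR describes for the Set; it does not recover which parent was the merged-into branch.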
dolt - 0.18.0

Published by oscarbatori over 4 years ago

We are excited to announce the release of significant enhancements to Dolt's SQL implementation, as well as a new feature, prompting a minor version bump to 0.18.0.

Diffable Queries

Dolt's diff command now supports a -q flag which will evaluate the results of a query at two commits and compare them:
dolt diff -q <query>

This gives users a richer programmatic interface for analyzing the impact of changes through time by showing the exact changes in the results of potentially complex queries at various points in the commit graph. For example, consider the National Vulnerabilities Database on DoltHub:

% dolt diff p30hoseurm9qfl7jhb9t4l7jfnshk6v9 -q 'select floor(impact_score), count(*) from cve group by floor(impact_score)'
+-----+-------------------------+----------+
|     | FLOOR(CVE.impact_score) | COUNT(*) |
+-----+-------------------------+----------+
|  <  | <NULL>                  | 8468     |
|  >  | <NULL>                  | 8428     |
|  <  | 1                       | 3125     |
|  >  | 1                       | 3124     |
|  <  | 2                       | 41465    |
|  >  | 2                       | 41668    |
|  <  | 3                       | 14324    |
|  >  | 3                       | 14318    |
|  <  | 4                       | 5385     |
|  >  | 4                       | 5418     |
|  <  | 5                       | 22039    |
|  >  | 5                       | 22034    |
|  <  | 6                       | 33798    |
|  >  | 6                       | 33890    |
|  <  | 9                       | 282      |
|  >  | 9                       | 283      |
|  <  | 10                      | 16798    |
|  >  | 10                      | 16864    |
+-----+-------------------------+----------+

This shows the changes in the counts of various vulnerabilities across a range of impact score buckets. We are excited to continue enhancing the richness of the programmatic interfaces we provide into Dolt's commit graph to facilitate more robust automated interactions.
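
To make the shape of this output concrete, here is a minimal Go sketch of how two ordered result sets can be compared row by row to produce "<" and ">" lines. It is not Dolt's query differ: the Row and DiffLine types and the diffSorted function are hypothetical, rows are reduced to a key and a count, and treating "<" as the first result set and ">" as the second is an assumption of this example.

```go
package main

import "fmt"

// Row is a hypothetical result row: a grouping key and its count.
type Row struct {
	Key   int
	Count int
}

// DiffLine tags a row with the side it came from: "<" for the first result
// set and ">" for the second.
type DiffLine struct {
	Side string
	Row  Row
}

// diffSorted walks two result sets that are both sorted by Key and returns
// only the rows that differ between them.
func diffSorted(from, to []Row) []DiffLine {
	var out []DiffLine
	i, j := 0, 0
	for i < len(from) && j < len(to) {
		switch {
		case from[i].Key < to[j].Key: // key only in the first set
			out = append(out, DiffLine{"<", from[i]})
			i++
		case from[i].Key > to[j].Key: // key only in the second set
			out = append(out, DiffLine{">", to[j]})
			j++
		default: // same key: report both rows if the values changed
			if from[i] != to[j] {
				out = append(out, DiffLine{"<", from[i]}, DiffLine{">", to[j]})
			}
			i++
			j++
		}
	}
	for ; i < len(from); i++ {
		out = append(out, DiffLine{"<", from[i]})
	}
	for ; j < len(to); j++ {
		out = append(out, DiffLine{">", to[j]})
	}
	return out
}

func main() {
	// Illustrative rows: key 7 is unchanged, so it is not reported.
	from := []Row{{1, 3125}, {2, 41465}, {7, 10}}
	to := []Row{{1, 3124}, {2, 41668}, {7, 10}}
	for _, d := range diffSorted(from, to) {
		fmt.Printf("%s %d %d\n", d.Side, d.Row.Key, d.Row.Count)
	}
}
```

The single pass works only because both inputs share the same sort order, which is why a query differ needs a deterministic ordering of its results before comparing them.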

Foreign Keys

Many of the major use cases identified for Dolt involve syncing with existing RDBMSs. With this in mind, we are committed to moving towards total SQL compatibility as quickly as possible, and foreign keys are a major piece of this. Many application servers use foreign keys to maintain valid structure around their data; without support for those constructs, replicating an existing table structure that uses them would require manual schema manipulation.

In this release we now support inter-table foreign keys. We do not yet support intra-table foreign keys (i.e. a constraint on a table against the same table), which are less common, though allowed in the standard. We will support such constraints in a future release.

New SQL functions

We added support for the list of functions below. You can see an index of functions that we support relative to the MySQL standard in our docs.

GET_LOCK
IS_FREE_LOCK
IS_USED_LOCK
RELEASE_LOCK
RELEASE_ALL_LOCKS
ASCII
BIN
BIT_LENGTH
SIGN
UCASE
UNHEX
ACOS
ASIN
ATAN
CRC32
COS
COT
HEX
DEGREES
RADIANS
SIN
TAN
CURDATE
CURRENT_DATE
CURRENT_TIME
CURRENT_TIMESTAMP
CURTIME
DAYNAME
MICROSECOND
MONTHNAME
TIME_TO_SEC
WEEKOFYEAR
DATE_FORMAT
WEEK

Merged PRs

  • 778: Bh/update gms
  • 777: Merge Block
    As discussed (and in relation to https://github.com/liquidata-inc/dolt/issues/773), we will fail on all merges where the schemas are not the same. As a result, one index bats test no longer makes sense, and all other tests that rely on the behavior have been skipped/commented out.
  • 776: moved query_diff to diff -q
  • 774: Foreign Keys Part 3 Episode 2
    The last bit of things to add for foreign keys!
  • 772: Zachmu/show indexes
    Implemented new interface methods needed by latest go-mysql-server
  • 771: Foreign Keys Part 3 Episode 1
    Here are the merge changes along with some bats tests
  • 770: /Jenkinsfile: Remove stages that are now in github actions
  • 769: /.github/workflows/ci-compatibility-tests.yaml: Add compatibility test github actions
  • 768: Added GROUP BY support, tests to dolt query_diff
  • 767: /.github/workflows/ci-check-repo.yaml: Iterating on github actions check repo
    looking into fixing this: https://github.com/liquidata-inc/dolt/pull/767/checks?check_run_id=808971285#step:4:296
    2020/06/25 21:04:02 Error running `git merge-base remotes/origin/db/ci-github-actions-check-repo remotes/origin/master` to find merge parent: exit status 128
    exit status 1
    
    the exit status 128 is from //go/utils/checkcommiters/main.go ... im thinking could be the go version? i had to use 1.13 here to get past the error that occurred when i used go version ^1.13 regarding the -mod=readonly flag....
  • 766: Andy/json dates
    fix addressing https://github.com/liquidata-inc/dolt/issues/755
  • 764: /.github/workflows/ci-bats-tests.yaml: Add dolt bats tests ci github actions
    Seems to run the tests for both linux and macos ... need to fix the broken tests that have specific dependency issues
  • 762: Db/ci GitHub actions go
    Run go tests for dolt on PRs
  • 760: /.github/workflows/bump-brew-formula.yml: Attempt to fix syntax and dispel yaml syntax error
    My best guess at the correct syntax... not a huge fan of actions rn lol
  • 758: Added skipped test for json DATETIME bug.
    Also, added || false on some regexes to create deterministic behavior in old versions of bash.
  • 757: .gitmodules,proto/third_party/golang-protobuf: Remove unused golang-protobuf submodule.
  • 754: proto/{third_party,Makefile}: Adopt protobuf-go and grpc-go for protobuf message generation.
  • 753: Foreign Keys Part 3
    This PR implements the commit functionality, along with commit --force.
  • 750: bats/remotes.bats: Add bats tests, some skipped, for some remote branch ref handling tests.
  • 749: go/go.mod: go get -u ./....
  • 747: bats: Change remotesrv pid handling to capture pid of background process.
  • 746: query_diff via lazy projections
    Another iteration of query-diff. The methodology is:
    • Generate a query plan at each RootValue from & to
    • Alter the query plan to lazily evaluate projections. This allows access to any column that is used in ordering the query results
    • Determine the order of the query results by extracting all primary key columns and SortFields
    • Iterate over the modified query plans, diffing their results by comparing row order. Then evaluate the projections to produce the final diff
      This version does not yet incorporate Noms layer diffing. It also does not yet handle aggregate functions.
      depends on https://github.com/liquidata-inc/go-mysql-server/pull/127
  • 745: README.md: Correct GOROOT to GOPATH in installation instructions.
  • 743: Upgraded go-mysql-server and vitess
  • 740: /benchmark/sql_regressions/run_regressions.sh: Iterating on timing commands
    spits out timing info for these different commands
  • 738: go/store/datas: FindCommonAncestor: Fix potential for explosive growth of RefsByHeight queues if the same references are visited multiple times.
  • 737: First draft of GitHub Action for bumping homebrew on tag
    This relies on us being able to keep a private secret on this repo. I know the repo is public but I believe we can keep a token private to repo admins and then provide access to that to the GitHub Action so that it can raise a PR against the Homebrew repository?
  • 735: Release
  • 139: Added format script
  • 138: Andy/query diff 4
  • 135: Zachmu/show create indexes
    Added non-primary keys to SHOW CREATE TABLE output
  • 134: test fix
  • 133: Zachmu/show indexes
    Fixed show index statements to work with native indexes. Added several new methods to Index interface, which will break existing integrators.
  • 130: more functions
  • 129: pow fix: these children aren't twins
  • 128: Zachmu/wsl fix
    Added WSL-checking to socket checking code for linux, as it appears to be broken. In this case, don't do socket polling, same as Windows and Darwin. Also added additional trace logging to handler.go, and moved the socket polling goroutine to its own method.
  • 127: made Project() public
  • 125: sort by function name
  • 124: Zachmu/vitess upgrade
  • 123: Zachmu/index regression
    Fixed bug in ascend / descend index lookups, added engine tests for same. Also realized that the in-memory unmergeable index implementation didn't support ascend / descend, so fixed that.
  • 122: Fixed a broken test (seemingly broken forever?)
  • 121: Removed CODEOWNERS, require manual review request for PRs
  • 120: crc and hex funcs
  • 119: Zachmu/readme
    Updated docs and README
  • 118: datetime functions
  • 117: trig functions

Closed Issues

  • 765: db abilities
  • 755: From User: Datetime columns in JSON output not returned
  • 741: README refers to 'GOROOT' incorrectly
dolt - 0.17.2

Published by oscarbatori over 4 years ago

A patch release containing bug fixes and performance improvements. In particular, we focused on reducing the number of skipped BATS tests.

Merged PRs

  • 733: go/libraries: sqle: history_table: Thread context on row iterators.
  • 729: Zachmu/engine test
    Run the go-mysql-server engine tests on dolt.
  • 728: added --with-tags flag to schema export
    I only changed dolt schema export. Other commands like dolt show tables or dolt diff --sql still output tags. This is easy to change on a case-by-case basis.
  • 724: Bh/conflicts table3
  • 723: Bh/conflicts table2
  • 115: Zachmu/engine test dolt
    Modified test harness and data to work with doltdb
  • 113: Zachmu/test harness 2
    Rewrote engine tests to use a harness, so that integrators can run them against their own implementation. In the process, broke the giant engine_test.go into many smaller files.

Closed Issues

  • 726: The apache license is missing the certain copyright owner