dolt

Dolt – Git for Data

APACHE-2.0 License

Downloads
2.4K
Stars
17.1K
Committers
143

dolt - 0.23.2

Published by github-actions[bot] over 3 years ago

This release supports ALTER TABLE MODIFY COLUMN statements.

Merged PRs

  • 1321: disable query diff
    query diff was a little-known, poorly documented feature that was the source of a large amount of maintenance work. We can recover it from git in the future, or reimplement it once it is a priority.
  • 1317: Removed extraneous diff.go file
  • 1312: bats tests for primary key column change diff panic
  • 1309: added two skipped bats
  • 1306: Check constraints on commit
  • 1299: Vinai/dolt merge Part 1. Just Fast Forward Merge
    This pr starts the Dolt_Merge process that works with a fast forward.
  • 1290: Added DELIMITER support to the dolt shell
  • 1281: ALTER TABLE MODIFY COLUMN
    Adds support for changing column types.

Closed Issues

  • 1313: Dolt diff panics if you rename a primary key using ALTER
dolt - 0.23.0

Published by github-actions[bot] over 3 years ago

This release introduces several improvements to the query analyzer:

  • Tables can now be joined to subqueries using indexes
  • Indexes and other optimizations now work in INSERT ... SELECT statements
  • Null-safe equals operator (<=>)

Merged PRs

  • 1303: go/go.mod: Bump go-mysql-server
  • 295: Changed analysis to isolate the values side of an INSERT statement
    Also cleaned up create trigger analysis, which broke when I did this at first. Now catches more errors than before.
  • 294: Fixed a bug in join planning for Inserts.
    Table reordering was leaving nodes above the join with incorrect field indexes. This was getting fixed by other analyzer steps for some top-level nodes, but not Inserts.
  • 293: Implement NullSafeEquals: The <=> operator in MySQL.
  • 292: Make join planning work for insert statements
  • 291: Consider SubqueryAlias nodes when planning indexed joins.
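
The <=> semantics added in PR 293 can be sketched in Go as follows. This is a minimal illustration rather than go-mysql-server's implementation: nullSafeEquals is a hypothetical name, and a nil pointer stands in for SQL NULL.

```go
package main

import "fmt"

// nullSafeEquals mimics MySQL's <=> operator for nullable integers.
// Assumption of this sketch: a nil pointer represents SQL NULL.
func nullSafeEquals(a, b *int64) bool {
	if a == nil || b == nil {
		// Plain = would yield NULL here; <=> yields true only when both sides are NULL.
		return a == nil && b == nil
	}
	return *a == *b
}

func main() {
	one, two := int64(1), int64(2)
	fmt.Println(nullSafeEquals(nil, nil))   // true
	fmt.Println(nullSafeEquals(&one, nil))  // false
	fmt.Println(nullSafeEquals(&one, &one)) // true
	fmt.Println(nullSafeEquals(&one, &two)) // false
}
```

Unlike the regular = operator, the result is never NULL, which is why it is useful in join conditions over nullable columns.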
dolt - 0.22.14

Published by github-actions[bot] over 3 years ago

This release addresses correctness bugs and provides performance improvements, and continues to flesh out Dolt's version control features as SQL functions.

Merged PRs

  • 1298: change table projection implementation
  • 1293: Insertion optimizations
    Go from sql rows to types.Tuples directly and use those tuples for insertion without using row.Row.
  • 1289: Added DOLT_SQL_DEBUG_LOG and DOLT_SQL_DEBUG_LOG_VERBOSE environment vars
    Used to turn on query analyzer debugging
  • 1287: Implement SQL dolt_checkout function
    This pr adds DOLT_CHECKOUT functionality.
  • 290: pushdown to indexed tables
  • 288: remove slow span tag
  • 287: Update README.md to reference Dolt
  • 286: Fixed several bugs preventing indexes from being used in some joins
  • 285: Implemented JOIN_ORDER optimizer hints
    Also:
    • Got rid of expensive comment stripping pass in the parser
    • Fixed test behavior of MySQL executed comment statements like /*!40101 SET NAMES utf8 */
    • Made SHOW VARIABLES output sorted
  • 284: Updated copyright headers and added missing ones
  • 283: Fixed type bugs

dolt - 0.22.13

Published by github-actions[bot] over 3 years ago

This release addresses a number of correctness bugs and improves query performance.

Merged PRs

  • 1285: Upgraded to latest go-mysql-server, and added a skip for a new query plan test
  • 1284: fix projected columns with single partition indexes
  • 1280: o/libraries/doltcore/table/editor: Add ability to configure map editor flush interval with environment variables.
  • 1269: Implement StatisticsTable interface to provide table statistics to the analyzer.
  • 1267: Fix bug where batched inserts containing subqueries chose non flushing option.
  • 1264: adds ability to redirect stdin from a file
  • 282: Bug fix for pushing a projection down to a table in a subquery more than once
  • 278: remove sync.Once from autoincrement expression
  • 275: add StatisticsTable interface
  • 274: Fixed various bugs in subquery execution, added table ordering optimization
    Fixes many correctness issues in subquery execution:
    • Incorrect field indexes for subquery aliases in some cases
    • Incorrect field indexes for subquery expressions in the case of column pruning in other parts of the query
    • Queries selecting the same table more than once with aliases getting incorrect indexes applied
    Also introduces performance enhancements:
    • Table ordering optimizations based on rows counts of tables in a join
    • Better use of index pushdown for some join queries
  • 273: Added the VALUES() function
    Requested in: https://github.com/dolthub/dolt/issues/1225
    VALUES() is deprecated in the latest versions of MySQL (as of version 8.0.20, released April 2020), but it is recent enough that I feel its inclusion in the engine is justified.

Closed Issues

  • 281: go get github.com/dolthub/go-mysql-server got the wrong path?
  • 279: err:go get github.com/dolthub/go-mysql-server
dolt - 0.22.12

Published by github-actions[bot] over 3 years ago

This release has two notable features:

  1. Binary type support. BINARY, VARBINARY, TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB now supported.
  2. dolt diff --cached now supported

Other than that, this release contains performance improvements and bug fixes.

Merged PRs

  • 1252: go/store/nbs: s3_table_reader: Parallelize loading large table file indexes.
  • 1251: Implement dolt diff --cached command
    Made this PR to compete with the one submitted by an open source contributor, so I could commit bats test changes in conjunction.
  • 1249: corrected sql row for init commit row in dolt_commit_ancestors table
  • 1245: fixes panic in the atomic package
    The atomic package has the following limitation on 32-bit platforms:

    On both ARM and x86-32, it is the caller's responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically. The first word in a variable or in an allocated struct, array, or slice can be relied upon to be 64-bit aligned.

  • 1242: projected column fixes
    For both indexed and unindexed tables when the engine would call WithProjection on our sql.Table implementation we would return a new sql.Table that embedded the old one and had a member which was the list of projected columns. Then, when sql.Table.PartitionRows was called, it was called on the embedded object directly and the projected columns were not accessible.
  • 1240: Skip And Less Optimizations
  • 1239: Decrease the number of allocations for indexed reads
  • 1237: Fixed a grammatical mistake in the man page for dolt conflicts resolve.
  • 1233: fix panic when key does not exist in table
  • 1230: change panic to error for '.' in Dataset name
    fix for #1144
  • 1229: added sysbench scripts
  • 1228: go/store/nbs: WithoutConjoiner() to configure NBS to not conjoin.
  • 1221: cli output fix
    fixes issues where printf formatting was being unintentionally applied for some cli output
  • 1220: Print pipeline fix
    Fixes issue https://github.com/dolthub/dolt/issues/1219 by not calling on an iterator after it's already returned EOF
  • 1218: remotestorage: Fix download aggregation to correctly aggregate based on prior chunk in batch, not first chunk.
  • 1217: Restored proper DATE functionality
    For DATE, we accidentally forgot to add the time truncation during the perf improvements, so this adds it back in.
  • 1215: Write to root on every loop of sql shell.
    This pr fixes a bug where the root gets written on every loop of the dolt sql shell.
  • 1214: Automated release notes generation
  • 1213: Read tuples from sequences directly without conversion to Values
    Read tuples from sequences directly instead of using values.
  • 1212: decrease allocations made while reading values
  • 1210: Vinai/refactor docs
    This pr refactors DocsReadWriter to simplify the interface. It then removes many of the methods in environment.go that are related to the drw interface.
  • 1209: Added binary types to dolt
    Added BINARY, VARBINARY, TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB types.
  • 271: add ability to remove rules
  • 269: Fix: not check tcp6 socket state while ipv6 is disabled
    If ipv6 is disabled on the system, tcp6 will not exist in the /proc/net dir.
    So opening /proc/net/tcp6 will produce the error open /proc/net/tcp6: No such file or directory. We should not always check tcp6 socket state (unless it is opened).
  • 268: Bug fix for https://github.com/dolthub/dolt/issues/1219
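
The alignment rule quoted under PR 1245 can be illustrated with a short Go sketch. The counters struct and recordHits function are hypothetical and are not taken from Dolt; the only point is that the atomically accessed int64 is declared as the first field, which the sync/atomic documentation says can be relied upon to be 64-bit aligned.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// counters is a hypothetical struct. The int64 is deliberately the first
// field: the first word of an allocated struct is 64-bit aligned, even on
// ARM and x86-32. Placing it after the bool could misalign it on 32-bit
// platforms and make the atomic calls panic.
type counters struct {
	hits  int64 // accessed atomically; keep first
	ready bool
}

// recordHits atomically increments the counter n times and returns its value.
func recordHits(c *counters, n int) int64 {
	for i := 0; i < n; i++ {
		atomic.AddInt64(&c.hits, 1)
	}
	return atomic.LoadInt64(&c.hits)
}

func main() {
	c := &counters{}
	fmt.Println(recordHits(c, 3)) // 3
}
```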

Closed Issues

  • 1219: show indexes crash
  • 1211: Refactor Docs methods out of environment.go
dolt - 0.22.11

Published by github-actions[bot] almost 4 years ago

This is a bug fix release for 0.22.10. This fixes a bug in clone that would sometimes keep large clones from succeeding by not retrying failed downloads.

Merged PRs

  • 1208: go/store/datas: pull.go: Fix Sources refetch on clone download failure to actually try with the newly fetched URLs.
dolt - 0.22.10

Published by github-actions[bot] almost 4 years ago

We are excited to announce 0.22.10 of Dolt. We include the usual slate of bug fixes and performance improvements, as well as:

  • DOLT_RESET exposed in SQL, part of our roadmap of exposing all Dolt's version control features in SQL
  • ON DUPLICATE KEY for defining behavior when a duplicate primary key is encountered

As usual, we are grateful for contributions and bug reports.

Merged PRs

  • 1202: varint benchmark and decoding changes
  • 1199: Adds dfuncs that can only be registered with Dolthub API
  • 1192: Use map.IterRange for iterating over rows
    Before
    ~/datasets/wikipedia-ngrams>GOMAXPROCS=1 time dolt sql -r null -q 'SELECT * FROM bigram_counts'
    310.13 real       285.36 user         9.16 sys
    
    After
    ~/datasets/wikipedia-ngrams>GOMAXPROCS=1 time dolt sql -r null -q 'SELECT * FROM bigram_counts'
    253.91 real       233.10 user         9.67 sys
    
  • 1190: go/cmd/dolt: Add debugging flag to run with Jaeger span reporting to localhost.
    This adds a --jaeger flag to dolt CLI which installs a Jaeger Tracer as the
    global opentracing Tracer. The Tracer is configured to report to an HTTP
    collector running on http://localhost:14268, which is the port that docker
    image jaegertracing/all-in-one listens on.
    Also adds some parameters in places where Dolt constructs sql.Contexts to pass
    the correct Tracer through.
    Also adds a few new Span points, in things like nbs.Get and
    metaSequenceImpl.getChildSequence.
  • 267: sql/plan/exchange.go: Increase exchange node row chan buffer to 16 * parallelism.
  • 265: Fixed bug in inserting literal NULL values as part of a SELECT statement
  • 264: Fix bugs in union distinct semantics, allow them in inserts
  • 263: Export the struct param for NNary
  • 262: sql/plan: exchange.go: Recover from panics in goroutine spawned by iterPartition.
  • 258: sql/session.go: NewSpanIter: Enable this when the tracer is not Noop.
  • 257: disable projection on indexed join

Closed Issues

  • 261: Preserve NULL values on insert into tables using sub select queries
  • 260: Support for UNION in INSERT statements
  • 259: Support for UNION DISTINCT
dolt - 0.22.9

Published by github-actions[bot] almost 4 years ago

Merged PRs

  • 1169: go/libraries/doltcore/sqle: Keyless tables don't have PK index -- fix describe panic
  • 1167: C# test for alternate MySQL connector library, upgraded existing to use dotnet 5 (up from 3)
  • 1162: unrolled decode varint decode loop
    30% faster on the benchmark in this PR.
    BenchmarkUnrolledDecodeUVarint/binary.UVarint-8 1000000000 0.0372 ns/op
    BenchmarkUnrolledDecodeUVarint/unrolled-8 1000000000 0.0258 ns/op
  • 1159: Bh/hang fix
    Fixes https://github.com/dolthub/dolt/issues/1153
    and disables GC on import errors.
  • 1157: Address escaping in longtext/json
    Longtext is no longer being exported properly in json and csv. This pr fixes that. Cc linked comment.
  • 1151: Implement dolt status table
    Implement the dolt_status table. Schema is of the form table_name, staged(bool), status.
  • 1149: /go/libraries/doltcore/sqle: move table cache to DoltSession, purge on root change
    I'm ambivalent about where the table cache lives, but we need to have access to it when we change the root of a sqle.Database
    This PR purges the table cache when we change roots, i.e. SET @@dolt_head = hashof(...). The intention is to limit the scope of edit sessions, and table mutations in general, and contain them to the working root of a sqle.Database.
    Currently there is a tricky reference chain within the table cache: sqle.tableCache -> sqle.WritableDoltTable -> sqle.sqlTableEditor -> table/editor.TableEditSession
    When we change roots in the database we also change the root within the TableEditSession. They're both referencing the new working root.
    Without purging the cache, we will keep old tables from the previous working root that still have a reference to the TableEditSession.
    This hasn't caused any issues (yet), but I think it's prudent to limit the scope and lifetime of these interconnected pieces of state. The next step is to create a fresh TableEditSession each time we switch roots.
  • 256: added describe queries for keyless tables
  • 255: This function implements an Naryfunction type.
    Allows you to define sqle functions that have multiple children.
  • 254: Fixed UNHEX/HEX roundtrip
    Simple fix but I ended up completely reevaluating our binary type implementation. Fixed a bug found in the cast package we were using to convert strings, and also changed UNHEX to return the proper SQL type.
  • 252: Added hash functions
  • 249: Alias bug fixes
    Fixes a number of buggy behaviors involving column indexes and table name resolution.
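
The loop unrolling mentioned in PR 1162 can be sketched as follows. This illustrates the general technique under stated assumptions and is not the code from that PR: decodeUvarint and matchesStdlib are hypothetical names, only the one-, two-, and three-byte cases are unrolled, and every other input falls back to binary.Uvarint.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeUvarint decodes an unsigned varint, handling the short encodings
// with straight-line code instead of a loop. It returns the value and the
// number of bytes consumed, like binary.Uvarint.
func decodeUvarint(buf []byte) (uint64, int) {
	if len(buf) > 0 && buf[0] < 0x80 {
		return uint64(buf[0]), 1
	}
	if len(buf) > 1 && buf[1] < 0x80 {
		return uint64(buf[0]&0x7f) | uint64(buf[1])<<7, 2
	}
	if len(buf) > 2 && buf[2] < 0x80 {
		return uint64(buf[0]&0x7f) | uint64(buf[1]&0x7f)<<7 | uint64(buf[2])<<14, 3
	}
	// Longer, empty, or truncated inputs: defer to the standard library.
	return binary.Uvarint(buf)
}

// matchesStdlib encodes v and checks that both decoders agree on it.
func matchesStdlib(v uint64) bool {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, v)
	got, gotN := decodeUvarint(buf[:n])
	want, wantN := binary.Uvarint(buf[:n])
	return got == want && gotN == wantN && got == v
}

func main() {
	v, n := decodeUvarint([]byte{0xAC, 0x02})
	fmt.Println(v, n)                   // 300 2
	fmt.Println(matchesStdlib(1 << 40)) // true
}
```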

Closed Issues

  • 1161: Primary keyless tables seem to break DESCRIBE
  • 1153: p.StopWithErr(err) is hanging on large imports
dolt - 0.22.8

Published by github-actions[bot] almost 4 years ago

Merged PRs

  • 1142: make script executable
  • 1141: uncomment platform specific code
  • 1139: index bug fix
  • 1136: send map sizes for legacy diff summary
  • 1134: Andy/ungate keyless tables
  • 1133: Andy/keyless import/export
    Adds keyless table support for dolt table import. Running table import -c now creates keyless tables if the -pk option is not provided.
  • 1132: Added release automation for Dolt
    The Release workflow is kicked off by the user hitting the run workflow GUI, and then entering version.
    The basic workflow is as follows:
    • bump-version
      • checks out the code, and creates a branch, for example v0.23.0-release if the release parameter is 0.23.0
      • updates the version string
      • snaps a commit
      • creates a tag
      • pushes the tag and branch
    • create-release
      • checks out the code at the newly created tag
      • creates a release
      • builds the binaries
      • uploads the binaries to the release
    • homebrew-bump
      • creates a PR to bump the Homebrew formula
    Possible enhancements:
    • validate version string passed by user
    • use GitHub API to create PR of release branch back to master
    • notifications
    • MSI creation
  • 1130: Andy/keyless tables merge
  • 1124: /go/libraries/doltcore/diff: Keyless Table Diff
  • 1120: Vinai/docs read writer
    This pr creates a DocsReadWriter which factors out some of the additional docs methods that were stuck in repo_state.json. Subsequent refactoring across files to account for this change.
  • 248: additional tests
    add a table with multiple keys
    an index that has a subset of those keys in a different order
    a couple queries
  • 246: Error changes for INSERT ON DUPLICATE KEY UPDATE

Closed Issues

  • 1126: Incorrect Foreign Key error on merge
dolt - 0.22.7

Published by oscarbatori almost 4 years ago

We are pleased to announce Dolt 0.22.7.

This release focuses on bug fixes and performance improvements in SQL. In particular, we delivered huge performance improvements in our SQL implementation. You can find the scope of these performance improvements detailed on our benchmarks page.

Merged PRs

  • 1116: partition ranges, covering indexes, smarter iterators
  • 1111: README quotes changed bugfix for windows terminal
    On the README there are instructions on how to add values into a table. The values in one portion have single quotes on the outside and have double-quotes for any string. While that format works in a Unix terminal, it doesn't work in
  • 1109: Attempt to add default decimal type to FromKind
  • 1108: Fixed dolt status output incorrectly displayed for staged files
    The function printStagedDiffs always returned 0, even when there were diffs not staged. This return was also causing it to print in printStatus "nothing to commit, working tree clean". This was not the case.
    I changed printStagedDiffs to return the number of the staged tables plus the number of staged docs instead. This prevents it from entering the if statement with the print also.
  • 1107: go/libraries/doltcore/{row, sqle, table}: Generalize TableReader
    Created table.SqlTableReader as a replacement for directly reading from table maps. Used it to replace types.MapIterator in sqle.doltTableRowIter
  • 1106: Added verify-constraints command
  • 1105: /MySQLDockerfile: peg version to match Gemfile.lock BUNDLED WITH
  • 1103: /go/cmd/dolt: added feature flag for keyless schemas
  • 1102: go/libraries/doltcore/{doltdb,table}: remove row access methods from doltdb.Table
    Removed:
    • Table.GetRowByPKVals()
    • Table.GetRow()
    • Table.GetRows()
    Had to do some refactoring along the way to fix dependency cycles.
    Reversed dependency rowconv -> pipeline to pipeline -> rowconv
  • 1101: Ensure that MERGE() works properly with fast forward.
    Added test case as well.
  • 1098: Export NewJSONReader to use in dolthubapi
  • 1097: Fixed table import allowing NULLs in the primary key
    Fixes https://github.com/dolthub/dolt/issues/1096
  • 1093: go/libraries/doltcore/table/editor: Convert TableEditor to interface
  • 1090: Add --author, -m to COMMIT. Add --author to MERGE()
    COMMIT('-m', 'hi', '--author', 'John Doe [email protected]')
    MERGE('feature-branch', '--author', 'John Doe [email protected]')
  • 1089: Add the Dolt mascot to README
  • 1088: fixed rand seed
  • 1087: increase query parallelism from the default of 2 to 8
  • 1085: split TableEditors and IndexEditor to their own package
  • 1084: bats/: keyless spec
    This is a set of skipped BATS tests that provide a spec for keyless tables.
  • 1082: Fixed internal index comparisons considering unnecessary parameters
    Fixes https://github.com/dolthub/dolt/issues/1081
  • 1080: Fixed shell error loop on UNIQUE violation
    Fixes https://github.com/dolthub/dolt/issues/1079
  • 1078: Upgraded to latest go-mysql-server with support for indexed joins on any number of tables
  • 1075: Support for CURRENT_USER SQL function without ()
  • 1074: Add dolt_commit error check when autocommit is off
    Fails loudly when autocommit is off for dolt_commit.
  • 245: Fixed tuple comparisons
  • 240: Enginetests for Keyless tables
  • 239: naked functions
    Fix for naked CURRENT_USER function call was in vitess, this just adds tests.

Closed Issues

  • 1099: MERGE() is creating a new commit on FFs.
  • 1096: Table import can allow NULLs in the primary key
  • 1081: "string is too large for column"
  • 1079: Indefinitely errors in SQL shell once a UNIQUE constraint has been violated
  • 1071: Throw error in DOLT_COMMIT if autocommit is not true
  • 241: expression.Tuple is uncomparable
dolt - 0.22.6

Published by oscarbatori almost 4 years ago

Merged PRs

  • 1068: Add -a flag to cli and DOLT_COMMIT
    This pr adds a -a flag to dolt commit and DOLT_COMMIT. It stages all tables.
    It also cleans up some of the previous work done in #1056 by removing all method handlers in repo_state and moving them to the RepoStateReader and RepoStateWriter
    It does not refactor the RSR/RSW interfaces in environment.go. This will be done in a subsequent pr.

Closed Issues

  • 1067: Support dolt commit -a
dolt - 0.22.5

Published by oscarbatori almost 4 years ago

This is a patch release that adds no new features or bug fixes.

Merged PRs

  • 1065: fix typo in GA for Homebrew
  • 1064: Fix brew formula
dolt - 0.22.3

Published by oscarbatori almost 4 years ago

Merged PRs

  • 1062: Updated go-mysql-server with a patch to fix failing function expressions
  • 238: Zachmu/funcs
    Got rid of all embedded function fields in function types, since they make it impossible for the analyzer to finish (function fields are not comparable with reflect.DeepEqual, which the analyzer uses to decide if the query plan has settled).
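
The problem described in PR 238 follows from how Go's reflect package compares function values: two funcs are deeply equal only if both are nil. A minimal sketch, in which withFunc, withoutFunc, and settled are hypothetical names and not go-mysql-server types:

```go
package main

import (
	"fmt"
	"reflect"
)

// withFunc is a hypothetical plan node that embeds a function value.
type withFunc struct {
	name string
	eval func(int) int
}

// withoutFunc is a hypothetical node that carries only comparable data.
type withoutFunc struct {
	name string
}

func double(x int) int { return 2 * x }

// settled reports whether two successive plans are deeply equal, the kind
// of fixed-point check the PR description refers to.
func settled(before, after interface{}) bool {
	return reflect.DeepEqual(before, after)
}

func main() {
	// Identical contents, but the non-nil func field makes them unequal,
	// so a fixed-point loop based on this check would never terminate.
	fmt.Println(settled(withFunc{"f", double}, withFunc{"f", double})) // false
	fmt.Println(settled(withoutFunc{"f"}, withoutFunc{"f"}))           // true
}
```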
dolt - 0.22.2

Published by oscarbatori almost 4 years ago

Merged PRs

  • 1059: pass in-memory gc gen when we conjoin
    On the conjoin path, we're not passing "garbage collection generation" when we update the manifest. NomsBlockStore interprets this as the conjoin having been preempted by an out-of-process write and blocks the write.
  • 1052: Vinai/dolt commit author no config
    Add a bats test that models the following behavior.
    1. Unsets name and user.
    2. Makes a sql change
    3. Add a commit with --author
dolt - 0.22.1

Published by oscarbatori almost 4 years ago

Merged PRs

  • 1045: Consolidated benchmark directory
  • 1043: Vinai/1034 add author info
    This adds the --author tag to dolt commit
  • 1042: Vinai/clean up tags
    Cleans up some of the comments left on #1041
  • 1041: Vinai/1023 remove tag info
    Fixes #1022 and #1023
  • 1040: go/libraries/doltcore/remotestorage: Add hedged download requests to remotestorage/chunk_store.
  • 1039: go/libraries/doltcore/remotestorage: Refactor urlAndRanges a bit.
  • 1038: go/libraries/doltcore/remotestorage: Simplify concurrentExec implementation with errgroup.
  • 1037: proto: Add StreamDownloadLocations to ChunkStoreService.
  • 1036: go/cmd/dolt/commands: added write queries and ancestor commit to filter-branch
  • 1033: Temporary parallelism implementation on indexes
  • 1031: reset --hard
  • 1029: Added dolt_version()
    As version() is used to emulate the target MySQL version, I've added dolt_version() so that one may specifically query the dolt version.
  • 1026: Increased the default sql server timeout to 8 hours
  • 1025: go/libraries/doltcore/remotestorage: chunk_store.go: Small improvements to GetDownloadLocations batch size and HTTP GET error logging.
  • 1024: dolt filter-branch
  • 1022: Skipping tests broken by recent changes to info schema (EXTRA)
  • 1021: != operator now uses indexes

Closed Issues

  • 1034: Add --author option to dolt commit
  • 1023: Remove tag info from EXTRA in dolt SQL schema
dolt - 0.22.0

Published by oscarbatori almost 4 years ago

We are excited to announce the minor version release of Dolt 0.22.0.

SQL Tables

We continue to expand the SQL tables that surface information about the commit graph. In this release we added:

  • dolt_commits
  • dolt_commit_ancestors
  • dolt_commit_diffs_<table>

SQL

We added support for prepared statements to our SQL server.

Merged PRs

  • 1019: Fix dolt ls --all
  • 1018: mysql-client-tests: Add some simple client connector tests for prepared statements.
  • 1016: Rewrote the README
  • 1015: go/go.mod: Bump go-mysql-server; support prepared statements.
  • 1014: Added bats test for index merging from branch without index
  • 1013: dolt_commits and dolt_commit_ancestors tables
  • 1012: added reset_hard() sql function
  • 1011: Bh/commit diff
  • 1009: Richer commit message for Dolt Homebrew bump
  • 1008: Mergeable Indexes Pt. 2
    Tests for mergeable indexes
  • 1002: s/liquidata-inc/dolthub/ for ishell and mmap-go
  • 1001: NewCreatingWriter breaks dolthubapi with recent changes
    There might be a better fix for this, but dolthubapi uses NewCreatingWriter which breaks with Andy's recent changes (it's being used in dolthubapi here)
  • 233: Reorder Master
  • 232: Indexes search for exact match rather than submatches
  • 231: added 'auto_increment' to EXTRA field for DESCRIBE table;
  • 229: Add support for prepared statements.

Closed Issues

  • 1007: dolt push does not seem to push correctly in Windows Powershell
  • 1003: AUTO_INCREMENT column info does not display in describe table output
dolt - 0.21.4

Published by oscarbatori almost 4 years ago

Merged PRs

  • 999: Another fix to brew bump job
  • 997: Fix typo in brew bump
    From a failed run on the most recent release.
dolt - 0.21.2

Published by oscarbatori almost 4 years ago

Merged PRs

  • 995: support for ALTER TABLE AUTO_INCREMENT
  • 994: Updated namespace for sqllogictest
  • 993: Added WSL notice to README
  • 990: mysql auto increment semantics
  • 989: Fix a few docs typos
  • 988: {bats, go}: Some fixes to InferSchema and add bats test
  • 987: Turbine Import Fix
  • 985: go/**/*.go: Update copyright headers for company name change.
  • 982: go/libraries/utils/async: Have ActionExecutor use sync.WaitGroup.
  • 981: Attempt to clean up error signaling in diff summary.
  • 980: In prettyPrint, defer closing the iterator before doing anything else
    We were missing close() when an UPDATE or INSERT etc. had an error during cursor iteration, therefore leaving a server process running. Also save sql history file before executing the query, so it gets saved even if the user interrupts execution.
  • 976: /.github/workflows/ci-go-tests.yaml: run go tests only when go/ changes
    I think this might be a good addition... will only run go tests when there are go changes
  • 975: Extract some import logic to be used in dolthubapi
    In reference to this comment https://github.com/dolthub/ld/pull/5262#discussion_r514465176
    I had some duplicate logic in dolthubapi for the import table api. I extracted some logic so that I can use InferSchema and MoveDataToRoot to reduce some of the duplication.
  • 974: Skipped two newly added test queries that don't work in dolt yet
  • 973: Support for COM_LIST_FIELDS, fixed SHOW INDEXES
  • 972: Update README.md
    Removed errant Liquidata reference
  • 971: Added GitHub workflow tests for race conditions
    Will fail until https://github.com/dolthub/dolt/pull/967 is merged into master, however the workflow only works when the PR is based against master. Therefore this PR does not target the aforementioned PR's branch.
  • 970: Memory fix for CREATE INDEX
    Used a pre-existing 16 million row repo to test CREATE INDEX memory usage on.
    Before:
    72.47GB RAM Usage
    18min 48sec
    After:
    1.88GB RAM Usage
    2min 2sec
    Copied the same strategy as used in table_editor.go to periodically flush the contents once some arbitrary amount of operations have been performed.
  • 967: go: Make all tests pass under -race.
  • 966: go/store/types/edits: Rework AsyncSortedEdits to use errgroup, and a transient goroutine for each work item.
  • 965: dolt merge --no-ff
  • 225: Andy/mysql auto increment
  • 224: Zachmu/xx
    Use xxhash everywhere, and standardize the construction of hash keys.
  • 223: Zachmu/in subquery
    Implemented hashed lookups for IN (SELECT ... ) expressions. This is about 5x faster than using indexed lookups into the subquery table in tests.
    In a followup I'm going to replace the existing CRC64 hashing with xxhash everywhere it's used, so we're back to a single hash function.
  • 221: Fixed bug in delete and update caused by indexes being pushed down to tables
  • 220: Support for COM_LIST_FIELDS, fixed SHOW INDEXES
  • 219: Zachmu/turbine perf
    1. Do pushdown analysis within subqueries
    2. Push index lookups down to tables in subqueries
  • 218: Fix unit tests to run with -race.
  • 217: validate auto_increment on in-line and out-of-line PR defs
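
The periodic-flush strategy described under PR 970 can be sketched as follows. This is an illustration of the idea only: flushingEditor, its fields, and the threshold are hypothetical and do not reflect the code in table_editor.go.

```go
package main

import "fmt"

// flushingEditor is a hypothetical editor that buffers edits in memory and
// flushes them once maxOps operations have accumulated, so the buffer never
// grows with the size of the table.
type flushingEditor struct {
	maxOps  int
	pending []string
	flushed [][]string // stands in for durable storage
}

// Add buffers one edit and flushes when the threshold is reached.
func (e *flushingEditor) Add(key string) {
	e.pending = append(e.pending, key)
	if len(e.pending) >= e.maxOps {
		e.Flush()
	}
}

// Flush moves any buffered edits to storage and empties the buffer.
func (e *flushingEditor) Flush() {
	if len(e.pending) == 0 {
		return
	}
	e.flushed = append(e.flushed, e.pending)
	e.pending = nil
}

func main() {
	e := &flushingEditor{maxOps: 3}
	for i := 0; i < 7; i++ {
		e.Add(fmt.Sprintf("row-%d", i))
	}
	e.Flush() // final flush for the partial batch
	fmt.Println(len(e.flushed), len(e.pending)) // 3 0
}
```

Peak memory is bounded by maxOps rather than by the number of rows, which is the effect the before and after figures in the PR describe.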

Closed Issues

  • 978: Support UNIQUE in CREATE TABLE statements, not just in CREATE INDEX statements
  • 962: Index creation must not be limited by working memory
  • 961: UNIQUE does not work on field level
  • 959: Cannot create UNIQUE index on FK fields Dolt considers it duplicate
dolt - 0.21.1

Published by oscarbatori almost 4 years ago

We are excited to announce the release of Dolt 0.21.1, a patch release with functionality and performance improvements.

Benchmarks

A significant new aspect of the Dolt release process will be providing SQL benchmarks. You can read a blog about our approach to benchmarking using sysbench here, and you can find the benchmarking tools here. By way of example the benchmarks for this release were created with the following command:

./run_benchmarks.sh bulk_insert oscarbatori v0.21.0 v0.22.1

This produced the following result, which we host on DoltHub:

Merged PRs

  • 957: go/store/{datas,util/tempfiles}: Fix some races in map writes. One affects clones, one affects only tests.
  • 953: create auto_increment tables with out-of-line PR defs
  • 952: go/libraries/doltcore/sqle: Add support for UPDATE and DELETE using table indexes.
  • 949: auto increment
  • 947: don't drop column values on column drop
  • 946: go/cmd/dolt: commands/sql: Small improvement to only call rowIter.Close() once on sql results iterators.
  • 945: Use docker-compose for orchestrating benchmarking
  • 944: go/store/types: value_store: Optimize GC to work in parallel and use less memory.
  • 942: feature gating
  • 941: Upgraded to latest go-mysql-server and re-enabled query plan tests
  • 939: Added new indexes overwriting auto-generated indexes
  • 938: go/store/{nbs,chunks}: Convert some core methods to provide results in callbacks. Convert some functions to use errgroup.
  • 937: Update README.md with the latest dolt commands
  • 934: Add go routine to clone
    I parallelized the table file writing process by using goroutines. Specifically, I made use of the "golang.org/x/sync/errgroup" package, which allows for convenient error management across a waitgroup.
    A couple of benchmarks I tested this on were
    1. Dolt-benchmarks-test: No difference in speed really
    2. Coronavirus: Original ~30sec. Current 15sec
    3. Tatoeba Sentence Translation: Original: ~17mins Current: 10mins
  • 933: /go/libraries/doltcore/diff: Ignore NULLs in cell-wise diff
    fix for https://github.com/dolthub/dolt/issues/899
    The from root in this repo has NULL values written to the map which causes erroneous diffs.
    https://www.dolthub.com/repositories/dolthub/us-supreme-court-cases/compare/master/hb502v6tf3uj43ijfhot6dopmgdm1muk
  • 932: /go/cmd/dolt/commands: Help Text Fix
  • 216: Updated sql.MergeableIndexLookup interface
  • 215: memory: *_index.go: Construct sql equality evaluations with accurate types in the literals.
  • 214: auto increment
  • 213: triggers bugfix
    Fixed bug in insert triggers, which couldn't handle out-of-order column insertions.
    Fixes https://github.com/dolthub/dolt/issues/950
  • 212: sql/analyzer: pushdown.go: Allow pushdown on Update, RowUpdateAccumulator and DeleteFrom plan nodes.
  • 211: join bugs
  • 210: sql/plan: {update,insert,update,process}.go: Fix some potential issues with context lifecycle and reuse.
    • insert, update, delete: Only call underlying table editors with our captured
      context once when we are Close(). Return a nil error after that.
    • process: Change to only call onDone when the rowTrackingIter is Closed.
    • process: Change to call childIter.Close() before onDone is called. Child
      iterators have a right to Close() before the context in which they are
      running is canceled.
  • 208: Create UNIQUE index if present in column definition
  • 207: Pushdown and plan decoration
    Two major changes:
    1. Changes to pushdown operation, to push table predicates below join nodes and to fix many bugs and deficiencies. Also large refactoring.
    2. Added DecoratedNodes to query plans to illustrate when indexes are being used to access tables outside the context of a join
dolt - 0.21.0

Published by oscarbatori about 4 years ago

We are excited to announce the release of Dolt 0.21.0. This release contains a host of exciting features, as well as our usual blend of bug fixes and performance improvements.

Squash merge

Through our own internal data collaboration projects, we realized that a squash command, which condenses change sets as a courtesy to collaborators, is an essential tool. It is now in Dolt.

NFS Mounted Drives

A user highlighted that Dolt didn't work with NFS mounted drives due to the way it was interacting with the filesystem. We have now fixed this.

Garbage Collection

We now have a dolt gc command for cleaning up unreferenced data. This was requested by several users as a space saving mechanism in production settings.

Performance Improvements

We continue to aggressively pursue performance improvements, most notably a huge improvement in full table scans.

sysbench tooling

As we detailed in a blog post yesterday, we have created tooling that gives our development team and contributors a simple way to measure SQL performance. For example, to compare an arbitrary commit to the current working set (to test whether changes introduce the expected performance benefits):

$ ./run_benchmarks.sh bulk_insert <username> 19f9e571d033374ceae2ad5c277b9cfe905cdd66

This will build Dolt at the appropriate commits, spin up containers with sysbench, and execute the benchmarks.

Documentation Fixes

An open source contributor provided several fixes to our CLI documentation, which we have gratefully merged.

GCP Remotes

We have fixed Google Cloud Platform remotes, motivated by a bug report from a user experimenting with Dolt.

Merged PRs

  • 930: Bump go-mysql-server
  • 929: store/types: value_store.go: GC implementation uses errgroup instead of atomicerr.
  • 928: gc chunks
    Implements garbage collection by traversing a Database from its root chunk and copying all reachable chunks to a new set of NBS tables.
    While "garbage collection generation" will protect the NBS from corruption by out-of-process writers, GC is not currently thread safe for concurrent use in-process. Getting to online GC will require work around protecting in-progress writes that are not yet reachable from the root chunk.
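    As a rough illustration of the traverse-and-copy idea (not the NBS implementation): start at the root, follow references, and copy every chunk that is reached into a fresh store, so anything unreachable is simply left behind. The chunk type, string addresses, and collect function are hypothetical, and like the GC described above the sketch assumes no concurrent writers.

```go
package main

import (
	"fmt"
	"sort"
)

// chunk is a hypothetical stand-in: some data plus the addresses of the
// chunks it references.
type chunk struct {
	data string
	refs []string
}

// collect walks the store from root and returns a new store containing
// only the chunks reachable from it.
func collect(store map[string]chunk, root string) map[string]chunk {
	kept := make(map[string]chunk)
	stack := []string{root}
	for len(stack) > 0 {
		addr := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if _, done := kept[addr]; done {
			continue // already copied
		}
		c, ok := store[addr]
		if !ok {
			continue // dangling reference; nothing to copy
		}
		kept[addr] = c
		stack = append(stack, c.refs...)
	}
	return kept
}

func main() {
	store := map[string]chunk{
		"root":   {data: "commit", refs: []string{"t1", "t2"}},
		"t1":     {data: "table 1", refs: []string{"rows"}},
		"t2":     {data: "table 2", refs: []string{"rows"}},
		"rows":   {data: "shared rows"},
		"orphan": {data: "old rows no commit points at"},
	}
	kept := collect(store, "root")
	addrs := make([]string, 0, len(kept))
	for a := range kept {
		addrs = append(addrs, a)
	}
	sort.Strings(addrs)
	fmt.Println(addrs) // prints [root rows t1 t2]
}
```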
  • 927: /.github/workflows/ci-bats-tests.yaml: skip aws tests if no secrets found
  • 925: benchmark tools
  • 923: doc corrections
    fixed some typos (I think 😊)
  • 922: go/util/sremotesrv: grpc.go: Echo the client's NbsVersion in GetRepoMetadata.
  • 921: fix gcp remotes
  • 920: go/go.mod: Adopt dolthub/fslock fork. Forked version uses Open(RDRW) for lock file on *nix, which works on NFS.
  • 918: /.github/workflows/ci-bats-tests.yaml: remove deprecated syntax
  • 917: Increase maximum SQL statement length to 100MB (initially 512K)
    Signed-off-by: Zach Musgrave
  • 915: Daylon's suggestions for bheni perf PR Pt. 2
  • 914: Fix for reading old dolt_schemas
  • 913: squash merge
  • 912: go/store/{datas,nbs}: Use application-level temp dir for byte sink chunk files with datas.Puller.
  • 911: Daylon's suggestions for bheni perf PR
  • 910: Adding "Garbage Collection Generation" to manifest file
    This new manifest field will support NomsBlockStore garbage collection and protect against NBS corruption. Storing gcGen in the manifest will support deleting chunks from an NBS in a safe way. NBS instances that see a different gcGen than they saw when they last read the manifest will error and require clients to re-attempt their write.
    NBS will now have three forms of write errors (not including IO errors or other kinds of unexpected errors):
    • nbs.errOptimisticLockFailedTables: Another writer landed a manifest update since the last time we read the manifest. The root chunk is unchanged and the set of chunks referenced in the manifest is either the same or has strictly grown. Therefore the NBS can handle this by rebasing on the new set of tables in the manifest and re-attempting to add the same set of novel tables.
    • nbs.errOptimisticLockFailedRoot: Another writer landed a manifest update that includes a new root chunk. The set of chunks referenced in the manifest is either the same or has strictly grown, but it is not known which chunks are reachable from the new root chunk. The NBS has to pass this value to the client and let them decide. If the client is a datas.database it will attempt to rebase, read the head of the dataset it is committing to, and execute its mergePolicy (Dolt passes a noop mergePolicy).
    • chunks.ErrGCGenerationExpired: This is similar to a moved root chunk, but with no guarantees about what chunks remain in the ChunkStore. Any information from CS.Has(ctx, chunk) is now stale. Writers must rewrite all data to the chunkstore.
  • 909: use tr to lowercase output instead of {output,,}
    Lowercasing via the parameter expansion ${output,,} is only supported on Bash 4+. I switched to using tr so I could run the tests locally.
  • 205: Implemented drop trigger
    As discussed, we disallow dropping any triggers that are referenced in other triggers.
  • 204: Added trigger statements