dolt

Dolt – Git for Data

APACHE-2.0 License

Downloads: 2.4K
Stars: 17.1K
Committers: 143

dolt - 1.8.2

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6368: Allow -- to escape arg parsing
    Fixes: https://github.com/dolthub/dolt/issues/6001
  • 6367: Bug fixes for dolt_patch
    Adding support in dolt_patch() to:
    • return data diff patch statements when there are schema changes
    • return schema diff patch statements for charset/collation changes
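
PR 6368 above lets -- end option parsing, so an argument that starts with a dash (such as the branch named -b from issue 6001) is read as a positional argument. The sketch below shows the same convention using Go's standard flag package. It is not Dolt's argument parser; the -d flag and the parseArgs helper are hypothetical stand-ins.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseArgs parses a hypothetical "branch" command line. It returns whether
// -d was set and the remaining positional arguments.
func parseArgs(args []string) (bool, []string, error) {
	fs := flag.NewFlagSet("branch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output
	del := fs.Bool("d", false, "delete the named branch")
	if err := fs.Parse(args); err != nil {
		return false, nil, err
	}
	return *del, fs.Args(), nil
}

func main() {
	// Without "--", "-b" is treated as an unknown flag and parsing fails.
	_, _, err := parseArgs([]string{"-d", "-b"})
	fmt.Println("without --:", err)

	// With "--", everything after the terminator is positional.
	del, rest, err := parseArgs([]string{"-d", "--", "-b"})
	fmt.Println("with --:", del, rest, err)
}
```

In the second call the flag set stops at the terminator and hands -b back as an ordinary argument, which is the behavior the PR title describes.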

Closed Issues

  • 6001: Customer wants to delete a branch named -b
  • 6332: Process.Progress (map[string]TableProgress) race condition

Latency

Read Tests              MySQL   Dolt    Multiple
covering_index_scan     2.07    2.97    1.4
groupby_scan            12.98   17.95   1.4
index_join              1.27    4.74    3.7
index_join_scan         1.21    2.26    1.9
index_scan              32.53   58.92   1.8
oltp_point_select       0.14    0.47    3.4
oltp_read_only          2.71    8.13    3.0
select_random_points    0.31    0.78    2.5
select_random_ranges    0.37    1.14    3.1
table_scan              33.12   58.92   1.8
types_table_scan        74.46   170.48  2.3
reads_mean_multiplier                   2.4

Write Tests             MySQL   Dolt    Multiple
bulk_insert             0.001   0.001   1.0
oltp_delete_insert      6.09    6.55    1.1
oltp_insert             2.86    3.07    1.1
oltp_read_write         6.55    15.83   2.4
oltp_update_index       2.86    3.19    1.1
oltp_update_non_index   2.97    3.19    1.1
oltp_write_only         3.82    7.84    2.1
types_delete_insert     5.37    7.17    1.3
writes_mean_multiplier                  1.4

Overall Mean Multiple                   2.0

dolt - 1.8.1

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6363: CLI remote connections
    Enables SQL-enabled CLI commands to connect to remote servers by adding three flags:
    • --host - The server to connect to. Using this flag forces the use of a remote connection no matter what.
    • --port - The port to use. Defaults to 3306. No effect when --host is not used.
    • --no-tls - Disable the use of SSL. This is required to connect to your local host.
    Also special-cased the sql command so it can start without a database. This enables connecting to an empty repo so you can run CREATE DATABASE FOO.
  • 6362: add warning message for unmigrated cli commands
    Adds a warning message when an unmigrated CLI command is used against a remote server.
  • 6288: migrates dolt merge to use sql queries
    This change updates dolt merge to use the appropriate sql engine to generate results.
    Related: https://github.com/dolthub/dolt/issues/3922

go-mysql-server

  • 1877: Add "Sliding Range Join" execution plan
    This is the draft implementation of the "Sliding Range Join" execution plan. This allows for more performant joins when the join condition checks that the column on one table is within a range specified by two columns on the other table.
    TODO: Elaborate in this description before making this PR not a draft.
  • 1875: match notation for decimal parsing
    Follow up to https://github.com/dolthub/go-mysql-server/pull/1874
    Turns out it is possible to specify floats using scientific notation, which caused some issues with conversions especially around large decimals.
  • 1874: avoid scientific notation for floats/decimals
    We use some string comparison logic to find precision loss and determine if something should be float/decimal type.
    When printing very large floats, the %v format option in the fmt package defaults to scientific notation, for example 1.234e567.
    This does not work well with our (hacky) string code.
    The %f option doesn't work very well either, as it pads the output with trailing zeros such as .000000.
    It appears strconv.FormatFloat does what we want, so I just picked that.
    fmt package docs: https://pkg.go.dev/fmt
    Format Float docs: https://pkg.go.dev/strconv#FormatFloat
    fix for: https://github.com/dolthub/dolt/issues/6322
  • 1873: Supporting mysql.help_ tables
    This first pass adds the table schemas to the mysql database for help_keyword, help_category, help_topic, and help_relation. There is no support for data in the tables yet; we're starting with just table schemas to see if that's enough for tool compatibility.
    Related to: https://github.com/dolthub/dolt/issues/6308
    I still need to do acceptance testing with the exact repro in the issue linked above. I'll follow up with Max on that tomorrow to confirm. I'm hoping just supporting the schema will be enough for the FusionAuth tool, but we may end up needing to populate these tables, too.
  • 1862: implementing lateral joins
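
The formatting problem described in PR 1874 above can be reproduced with the standard library alone. This sketch compares the three options the PR mentions. The 'f', -1, 64 arguments are an assumption made for illustration, since the notes do not say which FormatFloat parameters go-mysql-server passes, and plainFloat is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strconv"
)

// plainFloat formats f in decimal notation without an exponent, using the
// smallest number of digits that round-trips back to the same float64.
func plainFloat(f float64) string {
	return strconv.FormatFloat(f, 'f', -1, 64)
}

func main() {
	f := 1e21
	fmt.Printf("%v\n", f)      // %v switches to scientific notation for large exponents
	fmt.Printf("%f\n", f)      // %f stays decimal but always prints six fractional digits
	fmt.Println(plainFloat(f)) // decimal, no exponent, no zero padding
}
```

Only the last form gives a digit string that string-based precision checks can compare directly.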

Latency

Read Tests              MySQL   Dolt    Multiple
covering_index_scan     2.07    2.97    1.4
groupby_scan            12.98   18.28   1.4
index_join              1.25    4.82    3.9
index_join_scan         1.21    2.26    1.9
index_scan              33.12   59.99   1.8
oltp_point_select       0.14    0.47    3.4
oltp_read_only          2.71    8.13    3.0
select_random_points    0.31    0.8     2.6
select_random_ranges    0.37    1.14    3.1
table_scan              33.12   59.99   1.8
types_table_scan        74.46   170.48  2.3
reads_mean_multiplier                   2.4

Write Tests             MySQL   Dolt    Multiple
bulk_insert             0.001   0.001   1.0
oltp_delete_insert      4.74    5.99    1.3
oltp_insert             2.43    2.97    1.2
oltp_read_write         6.21    15.55   2.5
oltp_update_index       2.43    3.07    1.3
oltp_update_non_index   2.39    2.97    1.2
oltp_write_only         3.49    7.7     2.2
types_delete_insert     4.74    6.67    1.4
writes_mean_multiplier                  1.5

Overall Mean Multiple                   2.0

dolt - 1.8.0

Published by github-actions[bot] over 1 year ago

This release contains backwards incompatible changes:

  • The dolt_merge() procedure can now merge with a dirty working set if no conflicts will occur. Previously, a merge always failed if the working set had any uncommitted changes.

Per Dolt’s versioning policy, this is a minor version bump because users of dolt_merge() results will need to be aware of this new behavior.

Merged PRs

dolt

  • 6360: Slightly better copy in README opening
  • 6352: Bug fix: Encode binary values in hex for SQL patch statements
    Current SQL formatting code converts binary values to strings, which won't round-trip back to the database correctly. This PR changes binary values to be hex-encoded.
    Fixes https://github.com/dolthub/dolt/issues/6350
  • 6346: Allow empty databases in the shell prompt
    Easy to reproduce issue:
    mkdir foobar
    cd foobar
    dolt sql-server
    
    dolt sql
    > select 1;
    
    A panic follows. This actually happens for any command, including create database dba.
  • 6344: updates dolt_merge to allow dirty working sets with no conflicts
    Currently, the dolt_merge stored procedure doesn't allow any uncommitted changes in the working set; it aborts the merge if any are found. However, the CLI (and git) allows uncommitted changes in the working set if the merge won't overwrite them. This change updates the dolt_merge stored procedure to allow that.
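
PR 6352 above switches binary values in SQL patch statements to hex encoding. Here is a minimal sketch of that idea, assuming MySQL's X'...' hexadecimal literal syntax. The notes do not show the exact literal form Dolt emits, and hexLiteral, the table t, and the column pk are all hypothetical.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// hexLiteral renders raw bytes as a MySQL hexadecimal literal such as
// X'00FF'. Every byte value survives, unlike a string conversion of bytes
// that are not valid text.
func hexLiteral(b []byte) string {
	return "X'" + strings.ToUpper(hex.EncodeToString(b)) + "'"
}

func main() {
	// 0x27 is a single quote, which is awkward inside a quoted string literal.
	key := []byte{0x00, 0xff, 0x27}
	fmt.Printf("INSERT INTO t (pk) VALUES (%s);\n", hexLiteral(key))
}
```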

Closed Issues

  • 6350: dolt patch generates invalid statement for tables with varbinary keys
  • 6322: Incorrect DECIMAL precision
  • 6319: dolt gives constraint violation on merge when one doesn't exist

What's Changed

Full Changelog: https://github.com/dolthub/dolt/compare/v1.7.6...v1.8.0

Latency

Read Tests              MySQL   Dolt    Multiple
covering_index_scan     2.03    3.02    1.5
groupby_scan            12.98   17.95   1.4
index_join              1.3     4.82    3.7
index_join_scan         1.23    2.3     1.9
index_scan              32.53   58.92   1.8
oltp_point_select       0.14    0.47    3.4
oltp_read_only          2.71    8.13    3.0
select_random_points    0.31    0.8     2.6
select_random_ranges    0.37    1.14    3.1
table_scan              33.12   58.92   1.8
types_table_scan        74.46   170.48  2.3
reads_mean_multiplier                   2.4

Write Tests             MySQL   Dolt    Multiple
bulk_insert             0.001   0.001   1.0
oltp_delete_insert      6.32    6.43    1.0
oltp_insert             2.76    3.19    1.2
oltp_read_write         6.55    15.83   2.4
oltp_update_index       3.02    3.19    1.1
oltp_update_non_index   2.81    3.19    1.1
oltp_write_only         3.96    7.84    2.0
types_delete_insert     5.47    7.04    1.3
writes_mean_multiplier                  1.4

Overall Mean Multiple                   2.0

dolt - 1.7.6

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6345: Small changes to help improve the wait_for_connection failures
    • Pick a random port from 2048 to 6146.
    • Use nc rather than lsof to check whether the port is in use.
    • Wait longer, in case the server retries successfully.
    I tested by running 10 changes through GH Actions: 5 jobs were controls with no changes, and 5 had this change. The control group had 2 jobs that failed waiting to connect, while the group with this change had no failures.
  • 6331: Update unique constraint violations when rows are deleted
    Issue https://github.com/dolthub/dolt/issues/6319 shows a case where our unique constraint checker reports false positives. This can happen when a row that is removed and a row that is inserted have duplicate values for columns that have a unique index declared over them. The root cause is that the deleted row is being incorrectly included in the rows checked for the unique constraint.
    I made two changes in this PR to fix this issue:
    • When row inserts or row deletes were processed by the unique constraint validator, it previously did not update the data in its index copy. This means that if a row delete diff event came in first, the row wasn't removed from the index, and when the row insert event came in, the validator saw what looked like a legitimate unique constraint violation with the deleted row. (Note that this local secondary index data is a copy of the secondary index data maintained by other parts of the prolly table merger. The primary and secondary index mergers were correctly updating the primary and secondary indexes on disk, but the copies held by the uniqueValidator were not being updated. It would be ideal if these mutable maps were shared by the various merge validators in the future.)
    • When a row delete event is processed by the unique constraint validator, in addition to removing the row from the secondary index copy as described above, the code now looks in the artifact map, which is where violations are stored, and removes any unique constraint violations for this row. If only one unique constraint violation is then left for the same unique values in the index, that last violation is also removed (i.e. it's never valid to have a single row that violates a unique constraint).
    I briefly considered other ways to address this, such as processing all row deletion events first and then resetting the row diff iterator to replay other diff events, but directly updating the violation artifacts felt like the most efficient approach, although the logic is a bit complicated.
  • 6328: go/go.mod,go/.../events/go.mod: Bump golang.org/x/net dependency.
  • 6324: dolt checkout message
    Currently, dolt checkout can't work while a server is running on a local database. There are more details to be sorted out with session persistence in the server which aren't addressed yet.
    The SQLEngine execution for using SQL as the backend is complete, but now we simply print a message stating that the user should stop the server.
  • 6321: Migrate the clean command to sql
    The dolt_clean() stored procedure pretty much does exactly what the CLI does, so this was an easy migration. What the dolt clean command doesn't have is tests, and neither does the dolt_clean() procedure for that matter. Unfortunately, the command experience itself isn't great, and differs significantly from git clean (https://github.com/dolthub/dolt/issues/6313). So instead of spending time on testing and a full rewrite of dolt clean, I'm making this small tactical change to do nothing but the migration.
    Related: https://github.com/dolthub/dolt/issues/3922
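
The first fix in PR 6331 above comes down to keeping the validator's index copy in step with the diff events it sees. The sketch below is a deliberately simplified model, not Dolt's implementation: the event type, the map-based index, and every name in it are hypothetical.

```go
package main

import "fmt"

// diffEvent is a hypothetical row-level change seen during a merge.
type diffEvent struct {
	deleted bool   // true for a row delete, false for a row insert
	pk      int    // primary key of the affected row
	unique  string // value of the uniquely indexed column
}

// countViolations replays events against a copy of a unique index
// (unique value -> primary key). When applyDeletes is false it mimics the
// old behavior, where deletes never reached the copy.
func countViolations(index map[string]int, events []diffEvent, applyDeletes bool) int {
	idx := make(map[string]int, len(index))
	for k, v := range index {
		idx[k] = v
	}
	violations := 0
	for _, e := range events {
		if e.deleted {
			if applyDeletes {
				delete(idx, e.unique)
			}
			continue
		}
		if owner, ok := idx[e.unique]; ok && owner != e.pk {
			violations++ // another row in the copy already holds this value
			continue
		}
		idx[e.unique] = e.pk
	}
	return violations
}

func main() {
	index := map[string]int{"alice@example.com": 1}
	events := []diffEvent{
		{deleted: true, pk: 1, unique: "alice@example.com"},
		{deleted: false, pk: 2, unique: "alice@example.com"},
	}
	fmt.Println("stale copy:", countViolations(index, events, false))
	fmt.Println("updated copy:", countViolations(index, events, true))
}
```

With the stale copy, the insert collides with a row that no longer exists, which is the false positive reported in issue 6319.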

Closed Issues

  • 6325: Vulnerability of dependency "golang.org/x/net"
  • 6246: dolt sql prompt is missing the current db when a server is running
  • 6308: mysql.help_topic not found error

Latency

Read Tests              MySQL   Dolt    Multiple
covering_index_scan     2.07    3.07    1.5
groupby_scan            12.98   17.95   1.4
index_join              1.25    4.74    3.8
index_join_scan         1.18    2.26    1.9
index_scan              33.12   58.92   1.8
oltp_point_select       0.14    0.47    3.4
oltp_read_only          2.66    7.98    3.0
select_random_points    0.3     0.78    2.6
select_random_ranges    0.36    1.14    3.2
table_scan              33.12   58.92   1.8
types_table_scan        75.82   170.48  2.2
reads_mean_multiplier                   2.4

Write Tests             MySQL   Dolt    Multiple
bulk_insert             0.001   0.001   1.0
oltp_delete_insert      6.09    6.21    1.0
oltp_insert             2.71    3.19    1.2
oltp_read_write         6.32    15.55   2.5
oltp_update_index       2.81    3.19    1.1
oltp_update_non_index   2.97    3.19    1.1
oltp_write_only         3.96    7.84    2.0
types_delete_insert     5.88    6.91    1.2
writes_mean_multiplier                  1.4

Overall Mean Multiple                   2.0

dolt - 1.7.5

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6323: Bug Fix: Merge with convergent schema changes
    There were a couple of gaps in our schema merge logic and testing when convergent schema changes happen on both sides of a merge. This change fixes an issue where columns could be omitted if they changed in exactly the same way, and also an issue with spurious FK conflicts if identical FKs were added on both sides of a merge.
  • 6317: Fix intermittent BATS failure
    Fix intermittent BATS failure that happens when a tag like "v2" appears in the commit hash.
  • 6311: During CWBHeadRef/SetCWBHeadRef, surface RSLoadErr if it is present.
    This avoids the panic seen in https://github.com/dolthub/dolt/issues/6306 by surfacing the un-processed error RSLoadErr during CWBHeadRef/SetCWBHeadRef calls.
  • 6310: go/go.mod: Bump zap and cloud.google.com/go/storage.
  • 6302: Add more node integration tests for the hosted workbench
  • 6299: Migrate dolt tag to use SQL
    This change updates dolt tag command to use only SQL.
    Related: #3922
  • 6296: Migrate dolt revert to sql backend
    • Update the dolt revert command to use dolt_revert() stored procedure.
    • Update dolt_revert() stored procedure to respect the dolt_ignore table
    • Add dolt_revert() dedicated engine tests
    • dolt revert bats tests enabled for remote testing
    https://github.com/dolthub/dolt/issues/3922
  • 6281: return commit hash for dolt_merge() when --no-ff is not specified
    Fix a bug where we didn't return the commit hash for dolt_merge() when the --no-ff argument wasn't specified, and we performed a non-ff merge.
    Fast-forward merges now also return the new HEAD commit hash.
    Additionally, changed many enginetests to use the new DoltCommitType to validate dolt commit hashes.
    Also prevents panics when doing call dolt_merge().
    Companion PR:
    https://github.com/dolthub/go-mysql-server/pull/1865
  • 6278: Bump google.golang.org/grpc from 1.29.1 to 1.53.0 in /go/gen/proto/dolt/services/eventsapi
    Bumps google.golang.org/grpc from 1.29.1 to 1.53.0.
  • 6252: Set the DB name on the sql.Context when in remote mode
    The sql.Context used in the remote connection case is kind of an empty shell, since the connection has all the details for working with the DB. One side effect of having a dummy sql.Context was not having the database name set in the context. The shell uses it to set its prompt, but this could have ramifications elsewhere we haven't discovered.
    No new tests - we don't have expect set up for this tool yet, and I don't want to block on this improvement which is fairly harmless if the existing tests pass.
    https://github.com/dolthub/dolt/issues/6246
  • 6209: Migrate dolt checkout to new CLI framework.
    This PR migrates dolt checkout to invoke SQL commands instead of manipulating the database directly. This allows it to work even on remote connections.
    We implement this by adding an additional flag to the dolt_checkout stored procedure: --global. This tells the SQL environment to persist this checkout for subsequent connections, making it the new default branch for future connections.
    If a server is currently running, dolt_checkout("--global") will return an error. This is because we try to avoid changing the default branch on a server until we can properly consider the intended effect on existing connections.
    Because working sets behave differently in our SQL environment than on the command line (the command line has a single working set, SQL land has a working set per branch), we only permit changing the default branch when there would be no observable difference in behavior: that is, when both the former and the new branch are clean. If this isn't the case, we return an error. This guarantees that a user calling dolt checkout from the command line will never see behavior that differs from the original CLI implementation. They will, at worst, get a helpful error message.

go-mysql-server

  • 1873: Supporting mysql.help_ tables
    This first pass adds the table schemas to the mysql database for help_keyword, help_category, help_topic, and help_relation. There is no support for data in the tables yet; we're starting with just table schemas to see if that's enough for tool compatibility.
    Related to: https://github.com/dolthub/dolt/issues/6308
    I still need to do acceptance testing with the exact repro in the issue linked above. I'll follow up with Max on that tomorrow to confirm. I'm hoping just supporting the schema will be enough for the FusionAuth tool, but we may end up needing to populate these tables, too.
  • 1865: Add CustomValidator interface
    In many places we expect to see a commit hash in our result.
    Since the hashes take into account the system time when computing, it is difficult to predict what they will be.
    This PR adds a new interface that can be implemented on the dolt side to check for commit hashes.
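
PR 1865 above describes validating the shape of a value when its exact content can't be predicted. A sketch of that pattern follows. The interface name and method signature are illustrative and may not match go-mysql-server's, and the hash format (32 characters drawn from 0-9 and a-v) is an assumption about how Dolt encodes commit hashes.

```go
package main

import (
	"fmt"
	"regexp"
)

// valueValidator is a hypothetical stand-in for a custom validator: a test
// expectation that checks a result value instead of comparing it exactly.
type valueValidator interface {
	Validate(val interface{}) (bool, error)
}

// commitHashValidator accepts any string that looks like a commit hash.
type commitHashValidator struct{}

// Assumed format: 32 characters from the base32 alphabet 0-9, a-v.
var hashPattern = regexp.MustCompile(`^[0-9a-v]{32}$`)

// Validate reports whether val is a string in the assumed hash format.
// It returns an error when val is not a string at all.
func (commitHashValidator) Validate(val interface{}) (bool, error) {
	s, ok := val.(string)
	if !ok {
		return false, fmt.Errorf("expected string, got %T", val)
	}
	return hashPattern.MatchString(s), nil
}

func main() {
	var v valueValidator = commitHashValidator{}
	ok, err := v.Validate("0123456789abcdefghijklmnopqrstuv")
	fmt.Println(ok, err)
}
```

A test can then place a validator where it would otherwise need a literal hash, so the expectation holds regardless of the system time that fed into the commit.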

vitess

  • 254: Fix for using unquoted reserved words ('count' specifically) in the VALUES function
  • 253: Automatically concatenate adjacent string literals
    Fixes https://github.com/dolthub/dolt/issues/5232
    This is to match the mysql behavior:
    mysql> select "a" 'b'   "c";
    +-----+
    | a   |
    +-----+
    | abc |
    +-----+
    
    The grammar can't accommodate this so it has to go in the lexer. It doesn't work if the strings are broken up by a mysql special comment, so we still have to special case that.
    Also removed support for using string literals as table aliases. MySQL has a mode to support using double-quoted strings only as identifiers, but it's not on by default and isn't supported anywhere else in the grammar.
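
The lexer-level approach from PR 253 above can be pictured as a pass that merges neighboring string tokens before the grammar sees them. This sketch uses a hypothetical token type and is not the vitess lexer. It also ignores the special-comment case the PR mentions.

```go
package main

import "fmt"

// token is a hypothetical lexer token; isString marks string literals.
type token struct {
	isString bool
	text     string
}

// mergeAdjacentStrings collapses each run of consecutive string literals
// into a single string token, leaving all other tokens untouched.
func mergeAdjacentStrings(in []token) []token {
	out := make([]token, 0, len(in))
	for _, t := range in {
		last := len(out) - 1
		if t.isString && last >= 0 && out[last].isString {
			out[last].text += t.text
			continue
		}
		out = append(out, t)
	}
	return out
}

func main() {
	// Tokens for: select "a" 'b' "c"
	toks := []token{{false, "select"}, {true, "a"}, {true, "b"}, {true, "c"}}
	fmt.Println(mergeAdjacentStrings(toks))
}
```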

Closed Issues

  • 6155: Feature Request: Is there any chance to support Supabase or VectorDB?
  • 6306: Panic during status check on db that failed to clone

Latency

Read Tests              MySQL   Dolt    Multiple
covering_index_scan     1.89    2.71    1.4
groupby_scan            12.08   16.71   1.4
index_join              1.16    4.33    3.7
index_join_scan         1.12    2.11    1.9
index_scan              30.26   55.82   1.8
oltp_point_select       0.14    0.46    3.3
oltp_read_only          2.86    7.98    2.8
select_random_points    0.29    0.75    2.6
select_random_ranges    0.35    1.08    3.1
table_scan              30.81   56.84   1.8
types_table_scan        69.29   158.63  2.3
reads_mean_multiplier                   2.4

Write Tests             MySQL   Dolt    Multiple
bulk_insert             0.001   0.001   1.0
oltp_delete_insert      5.0     5.88    1.2
oltp_insert             2.39    2.86    1.2
oltp_read_write         6.43    15.27   2.4
oltp_update_index       2.52    3.02    1.2
oltp_update_non_index   2.52    2.97    1.2
oltp_write_only         3.55    7.43    2.1
types_delete_insert     4.74    6.55    1.4
writes_mean_multiplier                  1.5

Overall Mean Multiple                   2.0

dolt - 1.7.4

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6287: Run backwards compatibility tests against more recent versions of dolt.
    We no longer support the old __LD_1__ file format, so the backwards compatibility tests right now are only testing against massively out-of-date and unsupported versions of Dolt.
    This list includes v0.50.0, the first version where the new file format is the default, as well as v0.75.0, the other major release before v1.0.0, and then v1.0.0 and every minor release since.
  • 6285: Fix: unstaged ignored tables are staged after a cherry-pick abort
    Fix issue where unstaged ignored tables are staged after a cherry-pick abort
  • 6284: Autoincrement bug fixes
    This PR adds the ability to manually reset a table's next auto increment to a lower value than those previously used, and fixes a number of other long-standing bugs related to auto increment management.
    Fixes https://github.com/dolthub/dolt/issues/6253
  • 6259: Migrate dolt cherry-pick to only use SQL
    This change migrates dolt cherry-pick to only use SQL.
    Related: https://github.com/dolthub/dolt/issues/3922

go-mysql-server

  • 1868: Fix bug in OrderedDistinct over Projection
    A typo caused the code to check the table name twice for column equality, rather than the table name and the column name. The bug led to using OrderedDistinct in inappropriate cases.
  • 1867: Support for USING character set expressions
    Fixes https://github.com/dolthub/dolt/issues/6291
  • 1865: Add CustomValidator interface
    In many places we expect to see a commit hash in our result.
    Since the hashes take into account the system time when computing, it is difficult to predict what they will be.
    This PR adds a new interface that can be implemented on the dolt side to check for commit hashes.
  • 1864: No parallelism for children of ordered distinct
    We permitted parallelism into an OrderedDistinct node, which is a specialized Distinct node that expects results sorted on a specific index key. This change prevents parallelizing children of OrderedDistinct.
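
PR 1864 above hinges on OrderedDistinct needing sorted input. The sketch below shows why, under a simplified model in which rows are plain ints and the node remembers only the previous row. The function name is hypothetical and this is not go-mysql-server's implementation.

```go
package main

import "fmt"

// orderedDistinct removes duplicates by comparing each row with the one
// before it. It needs no lookup table, but is only correct on sorted input.
func orderedDistinct(rows []int) []int {
	var out []int
	for i, r := range rows {
		if i == 0 || r != rows[i-1] {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	// Sorted input: duplicates are adjacent and get removed.
	fmt.Println(orderedDistinct([]int{1, 1, 2, 2, 3}))
	// Interleaved input, as parallel children might produce: duplicates survive.
	fmt.Println(orderedDistinct([]int{1, 2, 1, 3, 2}))
}
```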

vitess

  • 253: Automatically concatenate adjacent string literals
    Fixes https://github.com/dolthub/dolt/issues/5232
    This is to match the mysql behavior:
    mysql> select "a" 'b'   "c";
    +-----+
    | a   |
    +-----+
    | abc |
    +-----+
    
    The grammar can't accommodate this so it has to go in the lexer. It doesn't work if the strings are broken up by a mysql special comment, so we still have to special case that.
    Also removed support for using string literals as table aliases. MySQL has a mode to support using double-quoted strings only as identifiers, but it's not on by default and isn't supported anywhere else in the grammar.
  • 252: Fixed keyword usage in primary key clauses
    Fixes https://github.com/dolthub/dolt/issues/6290
  • 251: Fixed missing support for collations as strings
    Fixes issue https://github.com/dolthub/dolt/issues/6192
    We only allowed collations to be declared after the CREATE TABLE portion in their non-string form. This adds support for the string form.

Closed Issues

  • 6253: Can't set auto_increment counter to values already used even if records are deleted
  • 5232: Support implicit string concatenation
  • 6291: Support CONVERT USING <collation> syntax
  • 6290: primary key (password) requires password to be backticked

Latency

Read Tests              MySQL   Dolt    Multiple
covering_index_scan     1.93    2.66    1.4
groupby_scan            12.3    17.01   1.4
index_join              1.21    4.33    3.6
index_join_scan         1.14    2.14    1.9
index_scan              30.26   55.82   1.8
oltp_point_select       0.15    0.46    3.1
oltp_read_only          2.86    7.98    2.8
select_random_points    0.3     0.75    2.5
select_random_ranges    0.35    1.08    3.1
table_scan              30.81   55.82   1.8
types_table_scan        69.29   158.63  2.3
reads_mean_multiplier                   2.3

Write Tests             MySQL   Dolt    Multiple
bulk_insert             0.001   0.001   1.0
oltp_delete_insert      5.0     5.88    1.2
oltp_insert             2.39    2.91    1.2
oltp_read_write         6.32    15.27   2.4
oltp_update_index       2.43    3.02    1.2
oltp_update_non_index   2.48    2.97    1.2
oltp_write_only         3.62    7.56    2.1
types_delete_insert     5.09    6.43    1.3
writes_mean_multiplier                  1.5

Overall Mean Multiple                   2.0

dolt - 1.7.3

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6289: Update Dolt's binary size in README
    We went from 68M to 103M because of the new collations.
  • 6283: Remove temporary workaround for loopback bug in GMS
    Resolves: https://github.com/dolthub/dolt/issues/6239
  • 6280: Add ignore support to dolt_cherry_pick procedure.
    This updates the dolt_cherry_pick procedure to properly work with ignored tables.
  • 6277: adds test for reset handling ignored tables
    Adds tests for dolt reset and dolt_reset() to check their handling of ignored tables.
  • 6276: Use correct local branch name when setting upstream in dolt_checkout
    I found an issue in dolt_checkout that occurs in the narrow case where both the -b and --track flags are provided, in order to create a new branch that tracks an upstream branch with a different name.
    In that case, we fail to set the upstream correctly because we accidentally look for a local branch to mark using the remote branch name, instead of the local branch name.
    I added a regression test.
  • 6271: Revert transaction-unsafe implementation of SessionStateAdapter::SetCWBHeadRef
    This recently added implementation of SetCWBHeadRef was designed to fix an issue when evaluating call dolt_branch('-m', $ACTIVE_BRANCH, $NEW_BRANCH) (which would cause a panic) by changing the active branch on the SQL session to match the new branch. However, this implementation has some problems.
    When operating on the DB from a SQL context, we use transactions to ensure that changes are applied atomically. Currently, however, renaming branches doesn't use transactions, because a rename touches multiple branches at once and we can't really model that as a transaction. This becomes a problem when mixed with DoltSession operations, which do use transactions.
    My original implementation attempted to work around this by calling commitTransaction before and after setting the new current working branch. This would allow SwitchActiveBranch to see the newly created branch. However, this is not a correct use of manual transaction management and is not generally safe.
    No matter the solution to the original bug, doing manual transaction management in SetCWBHeadRef is not a good idea. It's been reverted, and the call to SetCWBHeadRef within RenameBranch has been removed.
    The last commit in this PR is an attempt to implement the branch switch within RenameBranch itself. This also uses manual transaction management, but this is isolated to a single execution path. I welcome feedback on whether there's a better way to do this.
  • 6270: Remove alter_statement column from dolt_schema_diff results.
    Remove alter_statement column from dolt_schema_diff results.
    This change is a result of feedback about the dolt_schema_diff output: https://github.com/dolthub/docs/pull/1575#discussion_r1247212714
    We're removing the alter_statement column, and requiring the user to get this information from dolt_patch.
    The migrated command dolt diff is now doing this: we identify the changed tables using dolt_schema_diff, then get the alter statements from dolt_patch.
    This update also resolves https://github.com/dolthub/dolt/issues/6265, which is a feature request to return empty to_*/from_* columns in cases when the table does not exist at the to or from revision.

go-mysql-server

  • 1864: No parallelism for children of ordered distinct
    We permitted parallelism into an OrderedDistinct node, which is a specialized Distinct node that expects results sorted on a specific index key. This change prevents parallelizing children of OrderedDistinct.
  • 1861: chore: remove refs to deprecated io/ioutil
  • 1860: chore: unnecessary use of fmt.Sprintf
  • 1859: chore: use copy(to, from) instead of a loop
  • 1856: Support IPV6 loopback address for looking up user credentials
    Map "::1" and "127.0.0.1" to localhost when looking up users.
    There don't appear to be tests for this code path. TBD if I'll add some.
    Related to: https://github.com/dolthub/dolt/issues/6239
  • 1854: Prevent loops in stored procedures from returning multiple result sets
    The query in https://github.com/dolthub/dolt/issues/6230 was causing rows from many result sets to be returned from a stored procedure. We already have code that limits BEGIN/END blocks to return the last SELECTed result set; this PR extends that logic to loop constructs as well.
    Fixes: https://github.com/dolthub/dolt/issues/6230
    Dolt CI Checks: https://github.com/dolthub/dolt/pull/6245
  • 1853: chore: slice replace loop
  • 1852: Alter stored procedure execution to deal with statements that commit transactions
    This change adds checks to begin a new transaction whenever there isn't one during stored procedure execution. This lets things like dolt_commit() execute correctly in stored procedures.
  • 1851: memo.Literal has different type than lookup
    This panics on dolt:
    CREATE TABLE tab2(pk INTEGER PRIMARY KEY, col0 INTEGER, col1 FLOAT, col2 TEXT, col3 INTEGER, col4 FLOAT, col5 TEXT);
    CREATE UNIQUE INDEX idx_tab2_0 ON tab2 (col1 DESC,col4 DESC);
    CREATE INDEX idx_tab2_1 ON tab2 (col1,col0);
    CREATE INDEX idx_tab2_2 ON tab2 (col4,col0);
    CREATE INDEX idx_tab2_3 ON tab2 (col3 DESC);
    INSERT INTO tab2 VALUES(0,344,171.98,'nwowg',833,149.54,'wjiif');
    INSERT INTO tab2 VALUES(1,353,589.18,'femmh',44,621.85,'qedct');
    SELECT pk FROM tab2 WHERE ((((((col0 IN (SELECT col3 FROM tab2 WHERE ((col1 = 672.71)) AND col4 IN (SELECT col1 FROM tab2 WHERE ((col4 > 169.88 OR col0 > 939 AND ((col3 > 578))))) AND col0 >= 377) AND col4 >= 817.87 AND (col4 > 597.59)) OR col4 >= 434.59 AND ((col4 < 158.43)))))) AND col0 < 303) OR ((col0 > 549)) AND (col4 BETWEEN 816.92 AND 983.96) OR (col3 BETWEEN 421 AND 96);
    
    The PutField function expects the value to match the tuple descriptor exactly, and will panic if it does not.
    The section of code in memo that creates a new range uses the type from the expression, but in other places it uses the index column expression types.
    An alternative solution would be to have some logic in dolt to convert to the corresponding sql.Type based on the val.Enc.
  • 1850: Name resolution correctness tests
    This fixes many of the remaining correctness tests for TestSimpleQueries, TestsJoinOps, TestJoinPlanning, TestColumnAliases, TestDerivedTableOuterScopeVisibility, TestAmbiguousColumnResolution, TestReadOnlyVersionedQueries with the new name resolution strategy.
    Many of the query plans are slightly different but mostly equivalent. Join rearrangements and un-nesting in particular are better after this change, because I needed the transform logic to work for both. There are a variety of other bugs the slight plan differences exposed that are fixed now.
    This does not fix every set of enginetests, there is still a lot to do. But I'm locking in compatibility for most of the core tests to prevent backsliding.
    The next follow-up is probably replacing the old name resolution. I will need to figure out if triggers, procs, prepared statements need any sort of special treatment.
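
The host mapping in PR 1856 above is small enough to sketch directly. normalizeHost is a hypothetical helper, not the go-mysql-server function, and it covers only the two addresses the PR names.

```go
package main

import "fmt"

// normalizeHost maps the IPv4 and IPv6 loopback addresses to "localhost"
// so that a user defined for 'localhost' matches either one.
func normalizeHost(host string) string {
	switch host {
	case "::1", "127.0.0.1":
		return "localhost"
	default:
		return host
	}
}

func main() {
	for _, h := range []string{"::1", "127.0.0.1", "10.0.0.5"} {
		fmt.Printf("%s -> %s\n", h, normalizeHost(h))
	}
}
```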

Closed Issues

  • 6239: dolt sql-server --host 0.0.0.0 authentication behavior differs from mysql
  • 6265: Include added and dropped tables in dolt_schema_diff table function

Latency

Read Tests              MySQL   Dolt    Multiple
covering_index_scan     1.93    2.71    1.4
groupby_scan            12.3    16.41   1.3
index_join              1.18    4.33    3.7
index_join_scan         1.14    2.18    1.9
index_scan              30.26   55.82   1.8
oltp_point_select       0.15    0.46    3.1
oltp_read_only          2.91    8.13    2.8
select_random_points    0.3     0.77    2.6
select_random_ranges    0.35    1.1     3.1
table_scan              30.81   56.84   1.8
types_table_scan        70.55   161.51  2.3
reads_mean_multiplier                   2.3

Write Tests             MySQL   Dolt    Multiple
bulk_insert             0.001   0.001   1.0
oltp_delete_insert      5.09    6.09    1.2
oltp_insert             2.48    2.97    1.2
oltp_read_write         6.55    15.55   2.4
oltp_update_index       2.57    3.07    1.2
oltp_update_non_index   2.71    3.02    1.1
oltp_write_only         3.75    7.7     2.1
types_delete_insert     5.18    6.67    1.3
writes_mean_multiplier                  1.4

Overall Mean Multiple                   2.0

dolt - 1.7.2

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6268: chore: slice replace loop
  • 6267: change commit to print with printCommitInfo
    Changes commit to print with printCommitInfo. This prevents the segfault errors that came from using dolt log, as was previously done.
  • 6258: Update dolt_schema_diff test data to be more descriptive
    This matches the upcoming documentation for dolt_schema_diff function: https://github.com/dolthub/docs/pull/1575
  • 6247: removed io/ioutil usage
    Also fixed a resource leak in a test revealed by this fix.
  • 6241: Migrate dolt conflicts resolve to use only SQL
    This change migrates dolt conflicts resolve to only use SQL.
    Related: https://github.com/dolthub/dolt/issues/3922
  • 6223: Migrate dolt conflicts cat to SQL
    Migrate dolt conflicts cat to use SQL.
    Currently only dolt conflicts cat is migrated, dolt conflicts resolve is not yet migrated.
    Related: https://github.com/dolthub/dolt/issues/3922
  • 6214: updates dolt reset to use sql queries
    This change updates dolt reset to use the appropriate sql engine to generate results.
    Related: https://github.com/dolthub/dolt/issues/3922

Closed Issues

  • 6269: Three table join sensitive to join order

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 1.96 2.66 1.4
groupby_scan 12.3 16.71 1.4
index_join 1.16 4.25 3.7
index_join_scan 1.12 2.11 1.9
index_scan 30.26 53.85 1.8
oltp_point_select 0.14 0.46 3.3
oltp_read_only 2.86 7.98 2.8
select_random_points 0.29 0.74 2.6
select_random_ranges 0.35 1.06 3.0
table_scan 30.81 55.82 1.8
types_table_scan 69.29 155.8 2.2
reads_mean_multiplier 2.4
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 4.91 5.88 1.2
oltp_insert 2.39 2.86 1.2
oltp_read_write 6.43 15.0 2.3
oltp_update_index 2.76 2.97 1.1
oltp_update_non_index 2.61 2.97 1.1
oltp_write_only 3.49 7.43 2.1
types_delete_insert 4.91 6.32 1.3
writes_mean_multiplier 1.4
Overall Mean Multiple 2.0
dolt - 1.7.1

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6250: fixes infinite loop related
  • 6249: Fix for cherry-pick docs and PK mapping error message
    • Update cherry-pick docs now that schema and data conflicts are supported in cherry-pick.
    • Fix a bug in schema.go where the incorrect column name (Invalid) was being printed out due to a shadowed col variable.
  • 6244: dolt clone: Fix dolt clone run against a sql-server where the database has been GCd.
    A long-standing bug in the remotesapi which the sql-server exposes could cause a dolt clone to fail when running against a database which had been garbage collected. This change fixes the bug in the server. It also patches the client behavior so that it will tolerate responses from older versions of dolt.
  • 6243: Fix dolt status formatting
    dolt status output was missing some newlines:
    taylor@MacBook-Pro-3 test % dolt status
    On branch main
    Your branch is ahead of 'origin/main' by 1 commit.
    (use "dolt push" to publish your local commits)Changes not staged for commit:
    (use "dolt add <table>" to update what will be committed)
    (use "dolt checkout <table>" to discard changes in working directory)
    modified:         table1
    taylor@MacBook-Pro-3 test % dolt status
    
    taylor@MacBook-Pro-3 test % dolt status
    On branch main
    Your branch is up to date with 'origin/main'.nothing to commit, working tree clean
    
  • 6242: adds print statements for errors in commit
    adds missing print statements for errors occurring in dolt commit
  • 6240: Add global arguments section to dump-docs command
  • 6221: Migrate dolt diff and dolt show to SQL implementation
    This change migrates the current dolt diff implementation to use only SQL.
    This is done by offloading most of the diffing logic to dolt_diff and dolt_schema_diff table functions, and making some of the common code generic, so we don't have to rely on DoltDb objects (which rely on dolt internals).
    This change also partially migrates dolt show to SQL.
    dolt show has the ability to log some of the storage internals in two cases:
    1. --no-pretty flag is specified
    2. dolt show is invoked against non-commit hashes
      When either of the above cases occurs, we use DoltEnv to get these structures and log them. In all other cases, we use SQL to get the list of changes.

Closed Issues

  • 6230: Stored procedure includes results from SET statements
  • 4204: Set commit message and author while using @@dolt_transaction_commit
  • 3521: Multi-database state syncing in transactions
  • 5933: DBeaver does not remove tables correctly when database isn't specified in connection
  • 3789: Loosen identifier validation to match MySQL
  • 4885: SAVEPOINT fails if database is not selected.
  • 5042: Dolt allows creating trigger in not current schema
  • 5314: USE database/branch broken for write statements
  • 5364: mysql system db isn't "use"able via sql-server
  • 5816: call dolt_checkout(<table>) does not work for revision databases

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 1.93 2.71 1.4
groupby_scan 12.52 16.71 1.3
index_join 1.18 4.33 3.7
index_join_scan 1.12 2.07 1.8
index_scan 30.81 54.83 1.8
oltp_point_select 0.14 0.46 3.3
oltp_read_only 2.81 7.98 2.8
select_random_points 0.29 0.75 2.6
select_random_ranges 0.35 1.08 3.1
table_scan 30.81 55.82 1.8
types_table_scan 69.29 161.51 2.3
reads_mean_multiplier 2.4
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 4.82 6.09 1.3
oltp_insert 2.52 2.91 1.2
oltp_read_write 6.21 15.0 2.4
oltp_update_index 2.48 2.97 1.2
oltp_update_non_index 2.39 2.97 1.2
oltp_write_only 3.49 7.43 2.1
types_delete_insert 5.28 6.55 1.2
writes_mean_multiplier 1.5
Overall Mean Multiple 2.0
dolt - 1.7.0

Published by github-actions[bot] over 1 year ago

This release contains backwards incompatible changes:

  • Dolt commands which have been migrated to use SQL (sql, add, blame, status, commit, branch) have new behavior which may impact existing user workflows. Specifically, if the --user flag is in use, there is a new --password argument which will be required. If one is not presented, you will be prompted for a password, and the command will hang waiting for input. Furthermore, these commands previously accepted incorrect credentials silently. If bad credentials are given now, the commands will fail appropriately. If you want to use a password in your automated jobs, but don't want to specify the password on the command line, you can set the DOLT_CLI_PASSWORD environment variable.

Per Dolt’s versioning policy, this is a minor version bump (major.minor.patch).

Merged PRs

dolt

  • 6237: chore: fmt modify
    unnecessary use of fmt.Sprintf
    use fmt.Errorf(...) instead of errors.New(fmt.Sprintf(...))
    use fmt.Printf instead of fmt.Println(fmt.Sprintf(...)) (but don't forget the newline)
  • 6235: Bring back pretty content for commit when committing to a local db
    The output of the commit command changed with: https://github.com/dolthub/dolt/pull/6138
    This change brings back the pretty log command output when running in a local context. This is a temporary change since we probably won't get to the log command for a while yet.
  • 6234, 6233, 6232: Continuous Integration Updates
  • 6229: Support for dolt_ procedures inside user stored procedures
    These are just tests, the actual necessary changes are in https://github.com/dolthub/go-mysql-server/pull/1852
    This addresses a large part of https://github.com/dolthub/dolt/issues/5829, but see the last comment there: because of how we analyze / execute stored procedures, things like dolt_checkout() don't work in stored procedures in some cases (namely when using unqualified table names).
  • 6176: Enable authentication in the CLI commands migrated to sql backend
    Enable Authentication in the following ways:
    • New Global Flag --password
    • DOLT_CLI_PASSWORD environment variable
    • Ask for a password with a prompt.
    • Automatic authentication to a server using the secret in the sql-server.lock file
      One significant change in behavior is that previously if a user presented a nonsense username/password, we'd accept it as a super user identity. If a real user was specified, we would promote that user to a super user, regardless of whether the password was correct.
      Now, if a --user flag is presented, the user must present a password by flag, env var or prompt. If the user/pwd combination is not a known user, the command will fail. This applies to both local and remote mode.
      Important to call out that this isn't about security. If you want to be a super user, you can just not provide a --user. This behavior is to enable consistent behavior of client applications where they need to test permissions. We were making this impossible before because bad credentials were being promoted.
      Related: https://github.com/dolthub/dolt/issues/3922
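
For automated jobs, the DOLT_CLI_PASSWORD route described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and placing --user before the subcommand is an assumption based on it being described as a global flag. Only the --user flag, the sql subcommand with -q, and the DOLT_CLI_PASSWORD variable are taken as given.

```python
import os
import subprocess

def build_dolt_command(user, sql_query):
    """Build argv for a SQL-backed dolt command run as a specific user."""
    # Assumption: global flags such as --user go before the subcommand.
    return ["dolt", "--user", user, "sql", "-q", sql_query]

def build_env(password, base_env=None):
    """Copy the environment and add the password, keeping it off the command line."""
    env = dict(os.environ if base_env is None else base_env)
    env["DOLT_CLI_PASSWORD"] = password
    return env

def run_as_user(user, password, sql_query):
    """Run the command non-interactively and return the CompletedProcess."""
    return subprocess.run(
        build_dolt_command(user, sql_query),
        env=build_env(password),
        stdin=subprocess.DEVNULL,  # precaution: never wait on interactive input
        capture_output=True,
        text=True,
        check=False,  # callers inspect returncode; bad credentials now fail
    )
```

Calling run_as_user("alice", "s3cret", "select 1") (hypothetical credentials) and checking returncode is how a job would detect rejected credentials, since bad user/password combinations are no longer silently accepted.
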

go-mysql-server

  • 1848: IntDiv.Type() should always return either uint64 or int64
    Previously, our IntDiv.convertLeftRight() used IntDiv.Type() to determine the larger type between IntDiv.Left.Type() and IntDiv.Right.Type() to avoid precision loss when doing internal calculations. Now, that logic is moved from IntDiv.Type() to IntDiv.convertLeftRight(), and IntDiv.Type() can only return uint64 or int64.
    This should fix the sql correctness regression from https://github.com/dolthub/go-mysql-server/pull/1834
  • 1847: Fix TargetSchema.Resolved() to check targetSchema column default expressions
    A couple SchemaTarget implementations weren't checking if the targetSchema was resolved as part of the Resolved() method. Added tests, audited the other implementations, and simplified the logic to use a new method on Schema to check that column default expressions are resolved.
    Fixes: https://github.com/dolthub/dolt/issues/6206
    Dolt CI Run: https://github.com/dolthub/dolt/pull/6213
  • 1839: Slow degenerate semi join, hoist select opt
    This enables recursive subquery decorrelations, and adds a hash join execution option for semi joins that is equivalent to cached subquery existence checks.

Closed Issues

  • 5829: dolt_commit() will break some following statements in stored procedures

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 1.96 2.66 1.4
groupby_scan 12.3 17.01 1.4
index_join 1.16 4.41 3.8
index_join_scan 1.12 2.11 1.9
index_scan 30.81 54.83 1.8
oltp_point_select 0.15 0.46 3.1
oltp_read_only 2.86 7.98 2.8
select_random_points 0.3 0.75 2.5
select_random_ranges 0.35 1.06 3.0
table_scan 30.81 54.83 1.8
types_table_scan 70.55 155.8 2.2
reads_mean_multiplier 2.3
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 5.18 5.88 1.1
oltp_insert 2.57 2.91 1.1
oltp_read_write 6.32 15.27 2.4
oltp_update_index 2.48 3.07 1.2
oltp_update_non_index 2.57 2.91 1.1
oltp_write_only 3.55 7.43 2.1
types_delete_insert 5.09 6.55 1.3
writes_mean_multiplier 1.4
Overall Mean Multiple 1.9
dolt - 1.6.1

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6215: fix error return code in commit
    fixes commit to return an error when there is an error processing result rows
  • 6212: Add dolt_schema_diff table function
    dolt_schema_diff will return the schema diffs between refs and optionally will filter those changes to a specific table.
    This is a new table function that provides us with the information we need for dolt diff.
    The original PR for this change was accidentally merged too early. The comments from that PR have been integrated into this change.
  • 6203: Bug fixes for resolving the default DB name to the correct branch, was broken in several cases

go-mysql-server

  • 1847: Fix TargetSchema.Resolved() to check targetSchema column default expressions
    A couple SchemaTarget implementations weren't checking if the targetSchema was resolved as part of the Resolved() method. Added tests, audited the other implementations, and simplified the logic to use a new method on Schema to check that column default expressions are resolved.
    Fixes: https://github.com/dolthub/dolt/issues/6206
    Dolt CI Run: https://github.com/dolthub/dolt/pull/6213
  • 1846: update information_schema.processlist to correctly display status of processes and databases
    We used to hardcode "Query", now we reference process.Command
    Additionally, we now get the database from the current session and use that variable.
    fix for: https://github.com/dolthub/dolt/issues/6023
  • 1843: Improvements to CAST and CONVERT functions
    This PR adds support for casting/converting to FLOAT and DOUBLE types with the CAST and CONVERT functions. It also adds support for length (aka precision) and scale type constraints (e.g. CAST(1.2345 AS DECIMAL(3,2))).
    Parser support for DOUBLE and FLOAT with CAST and CONVERT: https://github.com/dolthub/vitess/pull/249
    Fixes: https://github.com/dolthub/dolt/issues/5835
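
To make the precision and scale constraints concrete, here is a small Python sketch of what CAST(1.2345 AS DECIMAL(3,2)) asks for: at most 3 significant digits, 2 of them after the decimal point. It is an illustration built on Python's decimal module, not go-mysql-server's implementation; the function name is hypothetical, and raising on out-of-range input is a simplification (MySQL either errors or clamps depending on SQL mode).

```python
from decimal import Decimal, ROUND_HALF_UP

def cast_decimal(value: str, precision: int, scale: int) -> Decimal:
    """Illustrate CAST(value AS DECIMAL(precision, scale))."""
    quantum = Decimal(10) ** -scale  # scale=2 -> Decimal("0.01")
    rounded = Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
    # Largest magnitude that fits: (precision - scale) integer digits.
    limit = Decimal(10) ** (precision - scale) - quantum
    if abs(rounded) > limit:
        # Simplification: treat overflow as an error.
        raise ValueError(f"{value} does not fit DECIMAL({precision},{scale})")
    return rounded

print(cast_decimal("1.2345", 3, 2))  # 1.23
```
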

Closed Issues

  • 6206: Panic when altering a column in a table where another column uses a function in a default expression
  • 6183: dolt is hitting a sql lock when trying to start after shutting down
  • 5835: CAST not fully supported
  • 6023: information_schema.processlist result is inconsistent with show processlist

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 1.93 2.71 1.4
groupby_scan 12.3 16.71 1.4
index_join 1.18 4.33 3.7
index_join_scan 1.14 2.14 1.9
index_scan 30.26 55.82 1.8
oltp_point_select 0.14 0.46 3.3
oltp_read_only 2.86 8.13 2.8
select_random_points 0.29 0.75 2.6
select_random_ranges 0.35 1.08 3.1
table_scan 30.81 55.82 1.8
types_table_scan 70.55 158.63 2.2
reads_mean_multiplier 2.4
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 4.49 5.67 1.3
oltp_insert 2.18 2.76 1.3
oltp_read_write 6.09 15.27 2.5
oltp_update_index 2.26 2.91 1.3
oltp_update_non_index 2.26 2.81 1.2
oltp_write_only 3.25 7.43 2.3
types_delete_insert 4.57 6.32 1.4
writes_mean_multiplier 1.5
Overall Mean Multiple 2.0
dolt - 1.6.0

Published by github-actions[bot] over 1 year ago

This release contains backwards-incompatible changes:

  • information_schema tables now contain entries for revision-qualified database names (e.g. mydb/mybranch) if any are active in the current session. There is always an entry for the base database name (mydb) as well. This also applies to statements such as show databases.

In line with Dolt's version policy, this release is therefore a minor version increment (major.minor.patch)
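
Clients that enumerate databases may now see both mydb and mydb/mybranch. A minimal Python sketch of collapsing such names back to base databases is shown below; the helper names are hypothetical, and it assumes base database names themselves contain no "/" character.

```python
def split_revision_db(name):
    """Split 'mydb/mybranch' into ('mydb', 'mybranch'); revision is None if absent."""
    base, sep, revision = name.partition("/")
    return base, (revision if sep else None)

def base_databases(names):
    """Return base database names in first-seen order, without duplicates."""
    seen = []
    for name in names:
        base, _ = split_revision_db(name)
        if base not in seen:
            seen.append(base)
    return seen

print(base_databases(["mydb", "mydb/mybranch", "information_schema"]))
# ['mydb', 'information_schema']
```
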

Merged PRs

dolt

  • 6210: Revert "Add dolt_schema_diff table function"
    Reverts dolthub/dolt#6185
    Change 6185 was merged in ahead of approval.
  • 6204: sqle: cluster: Add dolt_cluster_transition_to_standby stored procedure.
    This new stored procedure can be used to gracefully transition a primary to a standby while not all replicas in the cluster are up and able to become current. In contrast to dolt_assume_cluster_role, this stored procedure can only transition to role standby, but takes a second integer parameter representing the number of replica servers in the cluster which must be made current in order for the transition to succeed.
  • 6191: Use a table's workingRoot when getting declared FKs.
    Fixes https://github.com/dolthub/dolt/issues/6178
  • 6190: Ignore the sql-server.lock file if the PID in the file matches current dolt process
    https://github.com/dolthub/dolt/issues/6183
    Regarding automated testing: This is an awkward one because we don't know what PID a process will be given. I tested locally with Docker images I hand-crafted, but otherwise I'm going to punt on testing this in CI.
  • 6188: Reverting changes to database name management: branch-qualified databases now included in info_schema tables
    Fixes https://github.com/dolthub/dolt/issues/6173
  • 6185: Add dolt_schema_diff table function
    dolt_schema_diff will return the schema diffs between refs and optionally will filter those changes to a specific table.
    This is a new table function that provides us with the information we need for dolt diff.
  • 6162: Savepoint operations now support multiple databases in a transaction
  • 6138: update dolt commit to use sql queries
    This change updates dolt commit to use the appropriate sql engine to generate results.
    Related: https://github.com/dolthub/dolt/issues/3922
  • 6128: Migrate dolt branch to new CLI framework.
    This PR migrates dolt branch to invoke SQL commands instead of manipulating the database directly. This allows it to work even on remote connections.
    The only thing that hasn't been migrated yet is dolt branch --datasets. It's not a documented flag. We're currently using it for an internal test and can migrate it if we can figure out a way to rewrite the test.

go-mysql-server

  • 1846: update information_schema.processlist to correctly display status of processes and databases
    We used to hardcode "Query", now we reference process.Command
    Additionally, we now get the database from the current session and use that variable.
    fix for: https://github.com/dolthub/dolt/issues/6023
  • 1844: fix panic for group by binary type
    We made a bad type assertion for sql.StringType.
    Additionally, this fixes an issue where UnaryExpressions with GetFields would incorrectly throw a functional dependency error with ONLY_FULL_GROUP_BY enabled.
    Fix for second part of: https://github.com/dolthub/dolt/issues/6179
  • 1841: adding version and version_comment values
    @@version now returns 8.0.11
    @@version_comment now returns "Dolt"; in mysql, this appears to be dependent on OS / method of install
  • 1840: deduplicate (hash) intuple for and queries
    An earlier PR was originally supposed to fix this (https://github.com/dolthub/go-mysql-server/pull/1677), but AND statements weren't covered.
    fix for: https://github.com/dolthub/dolt/issues/6189
  • 1838: resolve aliases in subqueries in function arguments
    The rule reorderProjection also replaces subqueries with getfields in projections when they are used by subqueries, but it did not check for function expressions.
    This meant that aliases in subqueries as arguments to functions threw a "x" could not be found error.
    This PR just has the section of reorderProjection that is supposed to find deferredColumns also look at the arguments of functions recursively (because we can nest functions).
    Additionally, there was another schema type bug:
    tmp> select 0 as foo, if((select foo), 123, 456);
    +-----+----------------------------+
    | foo | if((select foo), 123, 456) |
    +-----+----------------------------+
    | 0   | 127                        |
    +-----+----------------------------+
    1 row in set (0.00 sec)
    
    MySQL returns an Integer type for if statement, and if either argument is a String, it always returns a String.
    fix for: https://github.com/dolthub/dolt/issues/6174
  • 1836: update cached table count in prepared statements
    Prepared statements were caching table counts. We need to update the table count when finalizing prepared statements to bring table count up to date with any intermediate edits.
  • 1834: fix expected schema for sum(literal)
    The code path we take when printing rows to the shell is different from the one we take when spooling from a server.
    In the sql case, we ignore the schema we get from analysis.
    In the server case, we actually read the schema, and ensure that the rows are of that type.
    When doing sum(literal), we use the type of the literal. In this issue, the literal was 1, so an INT8, which caps out at 127.
    sum() is always supposed to return a float64, so I made a change to do that.
    I checked by starting mysql with the --column-type-info option, and it does appear that any column coming from sum() has a DECIMAL type.
    Fix for: https://github.com/dolthub/dolt/issues/6120
  • 1827: Remove db-specific transaction interfaces / logic
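
The 127 mentioned in the sum(literal) fix (1834) above is the maximum of a signed 8-bit integer. The Python sketch below shows how a correct total can be reported wrongly once it is forced into a too-narrow schema type. It assumes the value is clamped rather than wrapped, which matches the 127 reported in the notes; the function names are hypothetical and this is not the server's code.

```python
INT8_MIN, INT8_MAX = -128, 127

def clamp_to_int8(value: int) -> int:
    """Force a value into the signed 8-bit range (assumed clamping behavior)."""
    return max(INT8_MIN, min(INT8_MAX, value))

def sum_with_schema_type(values, widen: bool):
    """Sum values, then coerce the result to the declared result type."""
    total = sum(values)
    # widen=True models the fix: the result type no longer follows the literal.
    return float(total) if widen else clamp_to_int8(total)

rows = [1] * 200  # sum(1) over a 200-row table
print(sum_with_schema_type(rows, widen=False))  # 127
print(sum_with_schema_type(rows, widen=True))   # 200.0
```
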

vitess

  • 251: Fixed missing support for collations as strings
    Fixes issue https://github.com/dolthub/dolt/issues/6192
    We only allowed collations to be declared after the CREATE TABLE portion in their non-string form. This adds support for the string form.
  • 249: Various small parser improvements
    Various small parser improvements:
    • Allow column definitions to use the MySQL INVISIBLE keyword. The implementation still ignores the INVISIBLE keyword, but it will no longer cause a parser error.
    • Support DOUBLE and FLOAT in the CAST and CONVERT functions.
    • Allow a trigger body to be a single CALL statement.
  • 248: Support for index hint in foreign key definition
  • 247: Support FK definitions inline in column definitions
    Adds support for declaring FK references inline in column definitions. Does not support ON DELETE and ON UPDATE yet. Example: ALTER TABLE t ADD COLUMN col2 int REFERENCES other_table(id);
    Also cleaned up a few rules around non-reserved keywords to enable event to be used unquoted in ALTER TABLE statements.

Closed Issues

  • 6178: "SHOW CREATE TABLE foo AS OF 'bar'" shows incorrect constraints
  • 6095: Allow conflict resolving for dolt_cherry_pick()
  • 6192: Dolt does not allow strings as collate name
  • 6179: Phpmyadmin crashes Dolt
  • 6174: Unable to resolve alias in subquery in projection
  • 6189: Incorrect query results for GROUP BY
  • 6173: list branch-qualified databases in information_schema
  • 6120: select count(1) from <table> does not return 1
  • 6181: Commit/push via stored procedure is unreliable
  • 6157: select count(*) from <table> confusingly slow for some users
  • 1800: Extracting update query columns

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 1.96 2.71 1.4
groupby_scan 12.3 17.01 1.4
index_join 1.16 4.49 3.9
index_join_scan 1.12 2.18 1.9
index_scan 30.81 55.82 1.8
oltp_point_select 0.14 0.46 3.3
oltp_read_only 2.86 7.98 2.8
select_random_points 0.29 0.75 2.6
select_random_ranges 0.35 1.06 3.0
table_scan 30.81 55.82 1.8
types_table_scan 69.29 158.63 2.3
reads_mean_multiplier 2.4
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 5.47 6.32 1.2
oltp_insert 2.81 3.13 1.1
oltp_read_write 6.67 15.55 2.3
oltp_update_index 2.76 3.25 1.2
oltp_update_non_index 2.86 3.19 1.1
oltp_write_only 3.96 7.7 1.9
types_delete_insert 5.99 7.04 1.2
writes_mean_multiplier 1.4
Overall Mean Multiple 2.0
dolt - 1.5.0

Published by github-actions[bot] over 1 year ago

This release contains backwards incompatible changes:

  • The dolt_merge() procedure now returns the commit hash of the merge as well as the fast-forward and conflicts result flags that it returned before.
  • The databases key is no longer accepted in config.yaml files. Use the data_dir key instead.

Per Dolt’s versioning policy, this is a minor version bump (major.minor.patch) because consumers of the dolt_merge() results will have to update to the new return values.
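
One way for consumers to absorb this kind of change is to read the dolt_merge() result by column name instead of by position. The Python sketch below uses hypothetical column names and made-up values purely for illustration; check the actual result set of your Dolt version before relying on any of them.

```python
def parse_merge_row(columns, row):
    """Pair column names with values so callers do not depend on column order."""
    return dict(zip(columns, row))

# Hypothetical column names and values, for illustration only.
columns = ("hash", "fast_forward", "conflicts")
row = ("abc123", 0, 0)

result = parse_merge_row(columns, row)
print(result["hash"], result["fast_forward"], result["conflicts"])

# Positional unpacking written for the old two-column result would fail here:
#   fast_forward, conflicts = row   # ValueError: too many values to unpack
```
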

Merged PRs

dolt

  • 6163: add hash column to result set for dolt_merge(..., '--no-ff')
    fix for: https://github.com/dolthub/dolt/issues/6159
  • 6158: improve valid identifier logic
    Changed our regex for identifiers to better match MySQL.
    Now, we allow all characters within the unicode range [\u0001-\uFFFF].
    However, to use characters outside the ASCII range (most of Unicode), identifiers must be back-quoted:
    create table `ドルト` (i int primary key);
    
    This works in MySQL without the backquotes, but that is a separate fix likely stemming from vitess.
    This is now valid in dolt and MySQL, which might be bad for our branching capabilities.
    create table `branch/table` (i int primary key);
    
    Fix for: https://github.com/dolthub/dolt/issues/6156
    Docs: https://dev.mysql.com/doc/refman/8.0/en/identifiers.html
  • 6114: Better docs for sql-server command
    Provides a full example of a YAML file with all currently supported fields and their defaults (several were missing).
    Also removes the databases YAML config element.
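
The identifier rules described for 6158 above can be approximated in a few lines of Python. This is a sketch, not Dolt's actual regex: the pattern names and helpers are hypothetical, and the set of characters treated as safe without quoting (ASCII letters, digits, $ and _) is taken from MySQL's documented rules for unquoted identifiers, simplified to ignore edge cases such as all-digit names.

```python
import re

# Any character in the range the PR describes: U+0001 through U+FFFF.
VALID_IDENTIFIER = re.compile(r"[\u0001-\uFFFF]+")
# Simplified set of characters that can be used without back-quotes.
UNQUOTED_SAFE = re.compile(r"[0-9A-Za-z$_]+")

def is_valid_identifier(name: str) -> bool:
    """True if every character falls in U+0001..U+FFFF and the name is non-empty."""
    return VALID_IDENTIFIER.fullmatch(name) is not None

def needs_backquotes(name: str) -> bool:
    """True if the name contains anything outside the unquoted-safe set."""
    return UNQUOTED_SAFE.fullmatch(name) is None

for name in ["my_table", "ドルト", "branch/table"]:
    print(name, is_valid_identifier(name), needs_backquotes(name))
```

Both ドルト and branch/table pass the range check but require back-quotes in this sketch, which mirrors the two examples in the notes.
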

go-mysql-server

  • 1836: update cached table count in prepared statements
    Prepared statements were caching table counts. We need to update the table count when finalizing prepared statements to bring table count up to date with any intermediate edits.
  • 1830: Use SO_REUSEADDR and SO_REUSEPORT options when creating the sql server on Unix
    This prevents a transient error we've been seeing where the server sometimes fails to start, and the OS claims port already in use, even though we've already confirmed that the port is not in use prior to running dolt sql-server.
  • 1829: plan.TableCountLookup short circuits count()
    In many cases it is unnecessary to read an entire table to report count(*). We can use the RowCount() interface to jump to the answer.
  • 1828: Consolidated collation maps
    Main file to check is the generate/main.go file.
  • 1825: implement create spatial ref sys
    This implements the CREATE SPATIAL REFERENCE SYSTEM ... statement, which lets users add custom SRIDs to the information schema.
    MySQL docs: https://dev.mysql.com/doc/refman/8.0/en/create-spatial-reference-system.html
    MySQL is much more restrictive when it comes to what is a valid DEFINITION for an entry in this table, and the rules are unclear, so we are much more permissive for now.
    Additionally, this information persists in MySQL between server restarts, which we do not do. However, MySQL does throw a warning stating that updating may discard any changes the user makes.
    Lastly, the values persist between test runs, and we don't support deleting from information_schema, so some tests are modified.
    fix for: https://github.com/dolthub/dolt/issues/6002
  • 1822: join filter closure and constant join lookups
    This PR adds a set of join planning improvements.
    1. Table aliases can accept multi column indexes
      We have never been able to choose a multi-expression range scan through table aliases.
      Before:
    tmp2> explain select * from t alias where a = 1 and b = 1 and c = 1;
    +-----------------------------------------------------------+
    | plan                                                      |
    +-----------------------------------------------------------+
    | Filter                                                    |
    |  ├─ (((alias.a = 1) AND (alias.b = 1)) AND (alias.c = 1)) |
    |  └─ TableAlias(alias)                                     |
    |      └─ IndexedTableAccess(t)                             |
    |          ├─ index: [t.a]                                  |
    |          ├─ filters: [{[1, 1]}]                           |
    |          └─ columns: [a b c]                              |
    +-----------------------------------------------------------+
    
    After:
    tmp2> explain select * from t alias where a = 1 and b = 1 and c = 1;
    +-----------------------------------------------------------+
    | plan                                                      |
    +-----------------------------------------------------------+
    | Filter                                                    |
    |  ├─ (((alias.a = 1) AND (alias.b = 1)) AND (alias.c = 1)) |
    |  └─ TableAlias(alias)                                     |
    |      └─ IndexedTableAccess(t)                             |
    |          ├─ index: [t.a,t.b,t.c]                          |
    |          ├─ filters: [{[1, 1], [1, 1], [1, 1]}]           |
    |          └─ columns: [a b c]                              |
    +-----------------------------------------------------------+
    
    This has silently been impacting join performance in particular, where table aliases are more common. This is a small change but I'd expect this to have a broad positive impact for customers.
    2. Join equivalence closure
    A join like select * from xy join uv on x = u join ab on u = a has two initial join edges, x = u and u = a. Those edges create expression groupings xy x uv, uv x ab, xy x uv x ab. This misses a transitive edge, x = a, with a corresponding join group xy x ab. We should generate plans for most transitive edges now (transitive edges in apply joins are harder).
    For joins with many tables this will unlock many potential join paths.
    3. Use functional dependencies to find more lookup and merge joins
    We can use constants and aggregated equivalency sets (equal filters) to be more aggressive with lookup join selection. Previously we only searched the current join ON equal conditions for expressions that match an index prefix for a lookup join, but constants are also valid lookup keys.
    Refer to https://github.com/dolthub/dolt/issues/5993 and https://github.com/dolthub/dolt/issues/3797 for in-depth examples.
    4. Use functional dependencies to do better lookup join costing
    Even though we do not have index statistics, we can still use functional dependencies on indexes to detect whether a lookup will have MAX_1_ROW. Two examples where we can detect MAX_1_ROW: our lookup index is the primary key, and our lookup key provides a constant or equals expression for every pk column; our lookup index is unique, our lookup key has constants or equal expressions for every column, and we can prove that every key expression is non-nullable.
    MAX_1_ROW lookups are a rare binary condition; most of the time selectivity is in the continuous range 0-1. When they do occur they are usually the most efficient access pattern. Many of the test changes from HASH_JOIN or MERGE_JOIN to LOOKUP_JOIN are a result of this improvement. The issues linked above in (3) have practical examples.
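
The join equivalence closure in (2) can be illustrated with a union-find over equality edges. This is a self-contained Python sketch of the idea, not the planner's code; the function names are hypothetical. Given x = u and u = a, it derives the missing edge between x and a.

```python
from itertools import combinations

def equivalence_classes(edges):
    """Group columns connected by equality edges using union-find."""
    parent = {}

    def find(col):
        parent.setdefault(col, col)
        while parent[col] != col:
            parent[col] = parent[parent[col]]  # path halving
            col = parent[col]
        return col

    for left, right in edges:
        parent[find(left)] = find(right)

    classes = {}
    for col in list(parent):
        classes.setdefault(find(col), set()).add(col)
    return list(classes.values())

def transitive_edges(edges):
    """Return equality pairs implied by the input edges but not listed in them."""
    known = {frozenset(edge) for edge in edges}
    implied = set()
    for cls in equivalence_classes(edges):
        for a, b in combinations(sorted(cls), 2):
            if frozenset((a, b)) not in known:
                implied.add((a, b))
    return sorted(implied)

print(transitive_edges([("x", "u"), ("u", "a")]))  # [('a', 'x')]
```

Each implied pair corresponds to an extra join group the planner can consider, such as xy x ab in the example above.
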

vitess

  • 247: Support FK definitions inline in column definitions
    Adds support for declaring FK references inline in column definitions. Does not support ON DELETE and ON UPDATE yet. Example: ALTER TABLE t ADD COLUMN col2 int REFERENCES other_table(id);
    Also cleaned up a few rules around non-reserved keywords to enable event to be used unquoted in ALTER TABLE statements.
  • 246: support CREATE SPATIAL REFERENCE SYSTEM ... syntax
    Syntax for: https://github.com/dolthub/dolt/issues/6002
  • 240: Support more JSON_TABLE functionality
    Source: https://dev.mysql.com/doc/refman/8.0/en/json-table-functions.html
    JSON_TABLE(
    expr,
    path COLUMNS (column_list)
    )   [AS] alias
    column_list:
    column[, column][, ...]
    column:
    name FOR ORDINALITY
    |  name type PATH string path [on_empty] [on_error]
    |  name type EXISTS PATH string path
    |  NESTED [PATH] path COLUMNS (column_list)
    on_empty:
    {NULL | DEFAULT json_string | ERROR} ON EMPTY
    on_error:
    {NULL | DEFAULT json_string | ERROR} ON ERROR
    
    Note: the MySQL docs indicate that PATH is optional in the NESTED case, but it doesn't seem that way.
    I chose to follow what they say rather than what they do.

Closed Issues

  • 6123: event reserved keyword quoting differences with MySQL
  • 6002: Support CREATE SPATIAL REFERENCE SYSTEM query
  • 6159: Feature request: call dolt_merge(..., '--no-ff') should return the commit hash
  • 6156: Dolt does not appear to support all MySQL FK and Table valid naming characters

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 1.93 2.71 1.4
groupby_scan 12.3 17.01 1.4
index_join 1.16 4.41 3.8
index_join_scan 1.12 2.14 1.9
index_scan 30.81 55.82 1.8
oltp_point_select 0.14 0.46 3.3
oltp_read_only 2.86 7.98 2.8
select_random_points 0.3 0.75 2.5
select_random_ranges 0.35 1.06 3.0
table_scan 30.81 55.82 1.8
types_table_scan 70.55 158.63 2.2
reads_mean_multiplier 2.4
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 5.88 6.21 1.1
oltp_insert 2.81 3.13 1.1
oltp_read_write 6.79 15.27 2.2
oltp_update_index 3.07 3.19 1.0
oltp_update_non_index 2.97 3.13 1.1
oltp_write_only 4.03 7.56 1.9
types_delete_insert 5.57 6.79 1.2
writes_mean_multiplier 1.3
Overall Mean Multiple 1.9
dolt - 1.4.2

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6164: Bug fix: Enable dolt_cherry_pick stored procedure to work with @@autocommit
    Previously, when dolt_cherry_pick encountered a data or schema conflict, or a constraint violation, it would return an error with a message about the problem and how to resolve it. This works well when @@autocommit=0, but when @@autocommit=1 the error causes the automatic transaction management to roll back the changes, which prevents users from being able to examine and fix the conflicts.
    This PR changes dolt_cherry_pick to be more similar to how dolt_merge reports conflicts – instead of returning an error message, dolt_cherry_pick now returns additional fields that show the number of tables with data conflicts, number of tables with schema conflicts, and number of tables with constraint violations.
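
Because problems are now reported as counts instead of an error, callers need to inspect the returned row. The Python sketch below shows one way to do that. The field names are hypothetical stand-ins for the three counts described above, and the leading commit hash field is an assumption; consult the procedure's documentation for the real column names.

```python
from typing import NamedTuple

class CherryPickResult(NamedTuple):
    # Field names are hypothetical; the notes only describe what each count means.
    commit_hash: str             # assumption: the result also carries a hash
    data_conflicts: int          # tables with data conflicts
    schema_conflicts: int        # tables with schema conflicts
    constraint_violations: int   # tables with constraint violations

def needs_manual_resolution(result: CherryPickResult) -> bool:
    """True if any of the three problem counts is non-zero."""
    return any((result.data_conflicts,
                result.schema_conflicts,
                result.constraint_violations))

def summarize(result: CherryPickResult) -> str:
    """Produce a one-line, human-readable summary of the outcome."""
    if not needs_manual_resolution(result):
        return "cherry-pick applied cleanly"
    return (f"{result.data_conflicts} table(s) with data conflicts, "
            f"{result.schema_conflicts} with schema conflicts, "
            f"{result.constraint_violations} with constraint violations")
```
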

go-mysql-server

  • 1823: Trim spaces and empty statements to the right in planbuilder.Parse
  • 1807: add support for more JSON_TABLE() functionality
    This PR adds support for:
    • FOR ORDINALITY columns, which are simply auto-incrementing row counters
    • DEFAULT <value> ON ERROR/EMPTY, which fills in values when encountering either an error or a missing value
      • when this isn't specified, NULL is used
    • ERROR ON ERROR/EMPTY, which throws an error when encountering either an error or a missing value
      • when this isn't specified, we ignore errors and fill in values with NULL
    • NESTED columns, which are a way to extract data from objects within objects and so on
      • when there are multiple NESTED columns, they are "sibling" nested: they take turns being NULL
    Note: there is a skipped test highlighting a bug in either our jsonpath implementation or something here...
    Companion PR: https://github.com/dolthub/vitess/pull/240
    MySQL docs: https://dev.mysql.com/doc/refman/8.0/en/json-table-functions.html
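
To make the FOR ORDINALITY and DEFAULT ... ON EMPTY behavior described above concrete, here is a minimal Go sketch that emulates them over a JSON array of objects. It is not the go-mysql-server implementation: jsonTable and jsonTableRow are hypothetical names, the path is hard-coded to $.a, and the default is passed as a plain string rather than a JSON string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jsonTableRow is one output row of the emulated JSON_TABLE call.
type jsonTableRow struct {
	Ord int    // FOR ORDINALITY column: a 1-based row counter
	A   string // value found at $.a, or the default when the key is missing
}

// jsonTable roughly emulates
//   JSON_TABLE(doc, '$[*]' COLUMNS (ord FOR ORDINALITY,
//                                   a VARCHAR(10) PATH '$.a' DEFAULT <def> ON EMPTY))
// for a document that is a JSON array of objects.
func jsonTable(doc, def string) ([]jsonTableRow, error) {
	var items []map[string]any
	if err := json.Unmarshal([]byte(doc), &items); err != nil {
		return nil, err
	}
	rows := make([]jsonTableRow, 0, len(items))
	for i, item := range items {
		r := jsonTableRow{Ord: i + 1, A: def} // start from the ON EMPTY default
		if v, ok := item["a"]; ok {
			r.A = fmt.Sprint(v) // the path matched, so use the extracted value
		}
		rows = append(rows, r)
	}
	return rows, nil
}

func main() {
	rows, err := jsonTable(`[{"a":"x"},{"b":1},{"a":"z"}]`, "none")
	if err != nil {
		panic(err)
	}
	for _, r := range rows {
		fmt.Println(r.Ord, r.A)
	}
}
```

The second element has no key a, so its row carries the default, while the ordinality column keeps counting regardless.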

vitess

  • 245: allow event as table and column name
    This PR allows the non-reserved keyword EVENT to be used as a table or column name without quoting.
    One edge case is still unsupported: using EVENT as a user name or host name.
  • 240: Support more JSON_TABLE functionality
    Source: https://dev.mysql.com/doc/refman/8.0/en/json-table-functions.html
    JSON_TABLE(
        expr,
        path COLUMNS (column_list)
    )   [AS] alias

    column_list:
        column[, column][, ...]

    column:
        name FOR ORDINALITY
        |  name type PATH string path [on_empty] [on_error]
        |  name type EXISTS PATH string path
        |  NESTED [PATH] path COLUMNS (column_list)

    on_empty:
        {NULL | DEFAULT json_string | ERROR} ON EMPTY

    on_error:
        {NULL | DEFAULT json_string | ERROR} ON ERROR
    
    Note: the MySQL docs indicate that PATH is optional in the NESTED case, but it doesn't seem that way.
    I chose to follow what they say rather than what they do.

Closed Issues

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 1.93 2.61 1.4
groupby_scan 12.3 16.71 1.4
index_join 1.18 4.33 3.7
index_join_scan 1.14 2.11 1.9
index_scan 30.81 53.85 1.7
oltp_point_select 0.14 0.46 3.3
oltp_read_only 2.86 7.98 2.8
select_random_points 0.3 0.75 2.5
select_random_ranges 0.35 1.06 3.0
table_scan 30.81 54.83 1.8
types_table_scan 69.29 155.8 2.2
reads_mean_multiplier 2.3
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 5.57 6.21 1.1
oltp_insert 2.76 2.91 1.1
oltp_read_write 6.43 15.27 2.4
oltp_update_index 2.61 3.25 1.2
oltp_update_non_index 2.76 3.13 1.1
oltp_write_only 3.68 7.56 2.1
types_delete_insert 5.18 7.04 1.4
writes_mean_multiplier 1.4
Overall Mean Multiple 2.0
dolt - 1.4.1

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6144: Feature: Support for cherry-picking commits that require conflict resolution
    This change enables dolt cherry-pick and call dolt_cherry_pick(...) to cherry-pick commits that cause conflicts or constraint violations and to cherry-pick commits that contain schema changes.
  • 6143: Ripped out batch processing in SQL engine / session
    This functionality is pretty fragile and requires a lot of special-casing in various places to get right, and I no longer have confidence it's correct. There are lots of statement types that work incorrectly in the primary use case (importing a large SQL script). It won't work when talking to a running server without a lot of work, and probably won't provide much benefit. So kill it.
  • 6142: New interface method to return underlying dolt dbs from a SqlDatabase implementation (rather than type switch)
  • 6106: Create the batsee output directory if one doesn't exist yet.
    Also adding the output directory to .gitignore.

go-mysql-server

  • 1823: Trim spaces and empty statements to the right in planbuilder.Parse

Closed Issues

  • 6136: WHERE clause with LIKE expression over JSON column results in nil panic

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 1.96 2.97 1.5
groupby_scan 12.3 17.01 1.4
index_join 1.16 4.41 3.8
index_join_scan 1.12 2.14 1.9
index_scan 30.81 54.83 1.8
oltp_point_select 0.14 0.46 3.3
oltp_read_only 2.86 7.98 2.8
select_random_points 0.29 0.75 2.6
select_random_ranges 0.35 1.06 3.0
table_scan 31.37 55.82 1.8
types_table_scan 70.55 158.63 2.2
reads_mean_multiplier 2.4
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 4.82 5.99 1.2
oltp_insert 2.61 2.91 1.1
oltp_read_write 6.43 15.27 2.4
oltp_update_index 2.57 3.02 1.2
oltp_update_non_index 2.61 3.02 1.2
oltp_write_only 3.75 7.56 2.0
types_delete_insert 4.82 6.55 1.4
writes_mean_multiplier 1.4
Overall Mean Multiple 2.0
dolt - 1.4.0

Published by github-actions[bot] over 1 year ago

This release contains potentially breaking behavior changes having to do with transaction behavior and database naming.

  • CALL dolt_checkout('mybranch') is no longer an error if the currently checked out branch has uncommitted changes.
  • use mydb/mybranch; select * from information_schema.columns and similar queries used to return table_schema with the branch-qualified database name. information_schema tables, show databases, etc. now always contain only base, unqualified database names. The database() function still returns the name of the database in the connection string or in use statements.

Per Dolt's versioning policy, this may require users to update code that expects call dolt_checkout() to fail on a dirty working set. Thus, the minor (major.minor.patch) version bump.

Merged PRs

dolt

  • 6137: update maven version for orm tests
  • 6133: Update README.md
    Add some notes on where the final go binary goes for folks not familiar with golang
  • 6122: Re-merge of #5968 (transactions and db naming)
    This reverts commit bb03b8cb25d3710d98acc0035a4ec9dccd38634e.
  • 6117: Moved remote bats into their own workflow so they don't cancel local bats when they fail
  • 6107: Skip remotes-sql-server.bats and sql-server.bats when running BATS in remote-engine configuration.
    This should prevent the test failures that are blocking PRs, even if I don't fully understand why these test failures are happening.
  • 6011: Rewrite dolt status subcommand to use SQL queries to generate results.
    This change adds a remote column to the dolt_branches table. This column identifies the remote for the branch.
    This change also adds the dolt_count_commits function, which returns the number of commits between two commits.
    Added more BATS tests for dolt status.
  • 5968: Transaction and DB name resolution changes
    This change introduces two major behavior changes to how dolt manages database names and session state in transactions:
    1. It's now possible to edit different branch heads and databases without explicitly switching to a new working set via dolt_checkout(). E.g., this session updates the branch b1:
    use mydb;
    insert into `mydb/b1`.t1 values (1);
    commit;
    
    These database-qualified INSERT / UPDATE etc. statements used to silently drop data if the database named wasn't the current database, in some cases. They now work correctly in all cases. We still fail transactions that attempt to make changes to more than one working set for now, but this restriction will be removed in a future revision.
    2. The behavior of dolt_checkout() has changed. It is no longer an error to call dolt_checkout() with a dirty working set. USE mydb/branch is now almost exactly equivalent to dolt_checkout('branch'). dolt_checkout() now has the side-effect of setting the session's current DB name to the base name of the database. E.g. after use mydb/branch, select database(), active_branch() returns mydb/branch, branch. If you then call dolt_checkout('branch2'), then the same query returns mydb, branch2. dolt_checkout is different from USE statements in that it changes which branch the unqualified database name (mydb) resolves to for this session.
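
The difference between USE and dolt_checkout() described above can be sketched as a toy model of per-session state. This is an illustration only, not Dolt's session code: the session type, its fields, and the "main" fallback branch are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// session is a toy model of the per-connection state described above.
type session struct {
	currentDB string            // what database() would return
	heads     map[string]string // base database name -> branch it resolves to
}

func newSession() *session { return &session{heads: map[string]string{}} }

// use models `USE name`, where name is either "db" or "db/branch".
// It only changes the current database name.
func (s *session) use(name string) { s.currentDB = name }

// checkout models dolt_checkout(branch): it re-points the unqualified
// database name at the branch and resets the current DB to the base name.
func (s *session) checkout(branch string) {
	base, _, _ := strings.Cut(s.currentDB, "/")
	s.heads[base] = branch
	s.currentDB = base
}

// activeBranch models active_branch().
func (s *session) activeBranch() string {
	base, branch, qualified := strings.Cut(s.currentDB, "/")
	if qualified {
		return branch
	}
	if b, ok := s.heads[base]; ok {
		return b
	}
	return "main" // assumed default branch for this sketch
}

func main() {
	s := newSession()
	s.use("mydb/branch")
	fmt.Println(s.currentDB, s.activeBranch()) // mydb/branch branch
	s.checkout("branch2")
	fmt.Println(s.currentDB, s.activeBranch()) // mydb branch2
}
```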

go-mysql-server

  • 1821: Bug fix: The result schema of SELECT INTO should always be an empty schema
    SELECT INTO currently returns its child node's schema as its result schema, but it doesn't actually return row data in that schema. This causes a problem over a SQL connection when clients see a result schema and then see row data that doesn't match that schema. This causes clients to freak out and close the connection from their side. Since SELECT INTO always sends its results to a file or SQL vars (and NOT over the SQL connection), its result schema should always be the empty schema.
    Fixes: https://github.com/dolthub/dolt/issues/6105
  • 1819: Lazy load large character set encoding weight maps.
    Improves dolt startup times substantially.
  • 1818: Throw correct error for non-existent qualified column on table function
    Fix for: https://github.com/dolthub/dolt/issues/6101
  • 1817: Ignore FULLTEXT in CREATE TABLE
    This change allows us to ignore any FULLTEXT keys when using CREATE TABLE. This should unblock customers who just need their statements to parse, but don't actually need the functionality of FULLTEXT. We still error when trying to manually add a FULLTEXT index using ALTER or CREATE INDEX. Since this isn't really "correct" behavior, I did not add any tests.
  • 1816: Support TableFunction aliasing
    Added a string field to expression.UnresolvedTableFunction so that an alias can be specified.
    Removed the rule disambiguate_table_functions; TableFunctions now default to using the function name as the table name when an alias isn't provided.
    Companion PR: https://github.com/dolthub/vitess/pull/244
    Fix for: https://github.com/dolthub/dolt/issues/5928
  • 1814: Add filters to memo
    Scalar expressions added to memo along with scalar properties, expression ids, filter closures. Goal here is equivalent behavior to before, just with filters represented differently. Filter organization mostly mirrors the plan package, except scalar and relational expressions are both represented as expression groups here. Done in a rush, still back and forth on whether there should be an interface there.
    Additionally:
    • scalar expressions added to memo along with scalar properties, expression id
    • rewrites join planning and costing to use bitset representations of filters
    • refactors codegen so definition files are yaml, source is compiled independently from target code
    The organization is a bit wonky b/c this should be using my name resolution symbol tables, and the entire tree should be memoized not just the join tree (used temporary solutions for the problems created by both of these).
    Re: https://github.com/dolthub/dolt/issues/5993
  • 1791: Functional dependencies
    Functional dependencies track key uniqueness, constant propagation, column equivalence, and nullability sets.
    This information is built bottom-up from table scans through projections, and is used to answer certain questions about relational nodes:
    1. What is the equivalence closure for a join condition?
    2. Are a set of filters redundant?
    3. Do a set of index expressions comprise a strict key for a LOOKUP_JOIN?
    4. Does a subquery decorrelation scope have a strict/null-safe key for an ANTI_JOIN?
    5. Are the grouping columns a strict key of the table? (If so, only_full_group_by is unnecessary.)
    6. Is the relation sorted on a given column set? (is a Sort already enforced)
    7. Is a relation constant? (Max1Row)
      Questions (1) and (3) contribute towards fixing this issue: https://github.com/dolthub/dolt/issues/5993. Question (2) contributes to filter pruning. Question (4) is relevant for this issue: https://github.com/dolthub/dolt/issues/5954.
  • 1787: Changes to USE and prepared statements
    This introduces two changes to how databases are resolved:
    1. USE statements are now handled by the Session with a new interface
    2. Tables in prepared statements now retain a copy of their Database implementation, rather than re-resolving it by name during execution.
      Both of these changes are to support Dolt's new database name semantics.
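
For PR 1819 in the list above (lazy loading of large character set weight maps), a common Go pattern for deferring an expensive table build until first use is sync.Once. The sketch below shows that pattern under stated assumptions: weightMap, exampleWeights, and the tiny a-z table are hypothetical stand-ins, and the actual go-mysql-server change may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// weightMap is a hypothetical stand-in for a large collation weight table
// that is only built the first time it is needed.
type weightMap struct {
	once    sync.Once
	build   func() map[rune]int32
	weights map[rune]int32
}

// get builds the table on first use and then serves lookups from memory.
func (w *weightMap) get(r rune) int32 {
	w.once.Do(func() { w.weights = w.build() })
	return w.weights[r]
}

// builds counts how many times the expensive build function ran.
var builds int

// exampleWeights costs nothing at program start; the map is built lazily.
var exampleWeights = &weightMap{build: func() map[rune]int32 {
	builds++
	m := make(map[rune]int32, 26)
	for r := 'a'; r <= 'z'; r++ {
		m[r] = int32(r-'a') + 1
	}
	return m
}}

func main() {
	c := exampleWeights.get('c')
	z := exampleWeights.get('z')
	fmt.Println(c, z, builds) // the build function ran once for both lookups
}
```

Because the map is no longer populated during package initialization, processes that never touch that character set do not pay for it at startup.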

vitess

  • 245: allow event as table and column name
    This PR allows the non-reserved keyword EVENT to be used as a table or column name without quoting.
    One edge case is still unsupported: using EVENT as a user name or host name.
  • 244: parse table_functions with aliases
    Syntax support for: https://github.com/dolthub/dolt/issues/5928
  • 243: Add ignore/replace modifiers to load data
  • 242: allow EVENTS to be parsed as non-reserved keyword
    Moved the EVENTS keyword into the non_reserved_keyword list, allowing statements that use the information_schema.events table to parse.
    For some reason EVENT cannot be moved into non_reserved_keyword: doing so causes shift/reduce and reduce/reduce conflicts.

Closed Issues

  • 6123: event reserved keyword quoting differences with MySQL
  • 6101: Missing column triggers "table not found" error
  • 5928: Table function aliases
  • 6105: Server connection lost when executing prepared statement from mysql client
  • 6119: dolt sql-client does not pick database
  • 6109: Flaky CI tests when running start_sql_server
  • 6082: Commands that use new SQL backend display incorrect error when run outside of a repo
  • 5983: LOAD Data into a table with column defaults panics
  • 5678: Feature request: Add --skip-empty to DOLT_COMMIT()
  • 5982: LOAD DATA does not support replace or ignore options

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 2.0 2.66 1.3
groupby_scan 12.52 16.71 1.3
index_join 1.21 4.33 3.6
index_join_scan 1.14 2.11 1.9
index_scan 30.81 54.83 1.8
oltp_point_select 0.14 0.46 3.3
oltp_read_only 2.86 7.98 2.8
select_random_points 0.29 0.75 2.6
select_random_ranges 0.35 1.06 3.0
table_scan 30.81 54.83 1.8
types_table_scan 71.83 155.8 2.2
reads_mean_multiplier 2.3
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 5.99 6.32 1.1
oltp_insert 2.86 3.13 1.1
oltp_read_write 6.79 15.55 2.3
oltp_update_index 2.97 3.13 1.1
oltp_update_non_index 2.91 3.13 1.1
oltp_write_only 4.1 7.7 1.9
types_delete_insert 5.88 6.91 1.2
writes_mean_multiplier 1.4
Overall Mean Multiple 1.9
dolt - 1.3.0

Published by github-actions[bot] over 1 year ago

This release contains a behavior change to LOAD DATA LOCAL, which now has the same effect as IGNORE by default in order to match MySQL.

Per Dolt's versioning policy, this may require users to update code that uses LOAD DATA LOCAL. Thus, the minor (major.minor.patch) version bump.

Merged PRs

dolt

  • 6092: Integration tests for load data ignore/replace
  • 6086: Add support for --skip-empty to dolt commit and dolt_commit()
    The new --skip-empty flag can be passed to dolt commit ... or call dolt_commit(...) and have the commit operation be a no-op, instead of an error, if there are no changes staged to commit. It is an error to use --skip-empty together with --allow-empty.
    Fixes: https://github.com/dolthub/dolt/issues/5678
  • 6077: Evaluate column default expressions during merge
    Previously, we only supported evaluating column default expressions during merge when the expression only contained literals. This PR expands that support to evaluate column default expressions that contain functions and column references.
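
The flag semantics described for PR 6086 above (--skip-empty turns an empty commit into a no-op, and cannot be combined with --allow-empty) can be summarized as a small decision function. This is a sketch of the documented behavior, not Dolt's code: commitOpts, commit, and both error messages are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// commitOpts models the two flags discussed above.
type commitOpts struct {
	skipEmpty  bool // --skip-empty
	allowEmpty bool // --allow-empty
}

// Hypothetical error values; the real messages may differ.
var (
	errConflictingFlags = errors.New("cannot combine --skip-empty with --allow-empty")
	errNothingToCommit  = errors.New("nothing to commit")
)

// commit reports whether a commit would be created for the given number
// of staged changes.
func commit(stagedChanges int, o commitOpts) (bool, error) {
	if o.skipEmpty && o.allowEmpty {
		return false, errConflictingFlags
	}
	if stagedChanges > 0 {
		return true, nil
	}
	switch {
	case o.skipEmpty:
		return false, nil // no-op instead of an error
	case o.allowEmpty:
		return true, nil // an empty commit is created
	default:
		return false, errNothingToCommit
	}
}

func main() {
	created, err := commit(0, commitOpts{skipEmpty: true})
	fmt.Println(created, err) // false <nil>
	_, err = commit(0, commitOpts{})
	fmt.Println(err) // nothing to commit
}
```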

go-mysql-server

  • 1812: Support load data ignore/replace
    Here are the docs for load data with ignore/replace modifiers: https://dev.mysql.com/doc/refman/8.0/en/load-data.html#load-data-error-handling
    This also changes LOCAL to have the same effect as IGNORE, to match MySQL.
  • 1811: Hoist subquery filters bug
    Hoist filters is supposed to move filters that do not reference tables in the current scope upwards. We did not descend into subqueries when checking for that condition, mistakenly hoisting filters in some cases.
    Re: https://github.com/dolthub/dolt/issues/6089
  • 1797: Add filters to memo
    Scalar expressions added to memo along with scalar properties, expression ids, filter closures. Goal here is equivalent behavior to before, just with filters represented differently. Filter organization mostly mirrors the plan package, except scalar and relational expressions are both represented as expression groups here. Done in a rush, still back and forth on whether there should be an interface there.
    Additionally:
    • scalar expressions added to memo along with scalar properties, expression id
    • rewrites join planning and costing to use bitset representations of filters
    • refactors codegen so definition files are yaml, source is compiled independently from target code
      The organization is a bit wonky b/c this should be using my name resolution symbol tables, and the entire tree should be memoized not just the join tree (used temporary solutions for the problems created by both of these).
      Re: https://github.com/dolthub/dolt/issues/5993
  • 1791: Functional dependencies
    Functional dependencies track key uniqueness, constant propagation, column equivalence, and nullability sets.
    This information is built bottom-up from table scans through projections, and is used to answer certain questions about relational nodes:
    1. What is the equivalence closure for a join condition?
    2. Are a set of filters redundant?
    3. Do a set of index expressions comprise a strict key for a LOOKUP_JOIN?
    4. Does a subquery decorrelation scope have a strict/null-safe key for an ANTI_JOIN?
    5. Are the grouping columns a strict key of the table? (If so, only_full_group_by is unnecessary.)
    6. Is the relation sorted on a given column set? (is a Sort already enforced)
    7. Is a relation constant? (Max1Row)
      Questions (1) and (3) contribute towards fixing this issue: https://github.com/dolthub/dolt/issues/5993. Question (2) contributes to filter pruning. Question (4) is relevant for this issue: https://github.com/dolthub/dolt/issues/5954.
      This master's thesis explains how to build the derivation graph starting at page 113: https://cs.uwaterloo.ca/research/tr/2000/11/CS-2000-11.thesis.pdf. The graph is composed of (determinant) -> (dependent) relationships on columns to track these properties. They color edges and nodes to differentiate constant, nullability, equivalence attributes. Any set of columns uniquely determines the value of constants, so they have empty determinants: () -> (colSet). We differentiate strict keys (set of columns unique and non-nullable index) from lax keys (index that may be non-unique or nullable).
      Cockroach implemented a version that uses flattened to/from sets rather than individual nodes for determinant/dependents, and makes optimizations for quickly computing candidate keys: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/props/func_dep.go.
      My encoding is a little different. First, I assume that attributes trickle down from nullability -> constant -> equivalence -> functional dependencies. An FD built in this order simplifies the upstream additions in a way that avoids having to recompute dependency closures (ex: nullability, constant, and equiv columns don't recompute keys). Second, I assume FDs will be limited to primary and secondary key indexes; keys will have either strict or lax determinants, and the dependents are always assumed to be the rest of the table. So far this drops LEFT_JOIN right-equivalence relations that translate to lax-keys after the join, which could opportunistically be converted back to strict keys by downstream operators. If this was a mistake, we can undo that and add dependent column sets back in.
      We need to support a handful of operators to use FDs in the join memo:
    • Table scan
    • Cross join
    • Inner join
    • Left join
    • Project (Distinct)
    • Filter
      Missing:
    • Full outer join
    • Synthesized columns
      Additionally:
    • the memo needs to embed equal filters in a format with expression ids
    • join reordering should compute equivalence closures for join edges
    • join selection should use functional dependencies to check if lookup expressions are valid
      Missing practical considerations:
    • when we determine a lookup expression comprises a strict key for a table, we need a way to backfill constants and equivalences used to make that decision
    • filters should maybe be represented in memo selection nodes to support redundancy elimination
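
As an illustration of question (3) in the functional dependencies notes above, the sketch below checks whether a set of lookup columns, together with constants and column equivalences, covers a strict key. It is a deliberately simplified model: funcDeps, colSet, and the integer column ids are hypothetical and do not mirror the go-mysql-server encoding.

```go
package main

import "fmt"

// colSet is a set of column ids.
type colSet map[int]bool

// funcDeps is a toy functional-dependency summary for one relation.
type funcDeps struct {
	consts     colSet   // columns pinned to a constant: () -> (col)
	equiv      [][2]int // column pairs known to be equal, e.g. from a = c
	strictKeys []colSet // unique, non-nullable keys
}

// closure returns every column determined by cols via constants and
// column equivalences.
func (fd funcDeps) closure(cols colSet) colSet {
	out := colSet{}
	for c := range cols {
		out[c] = true
	}
	for c := range fd.consts {
		out[c] = true
	}
	// Propagate equivalences until nothing new is added.
	for changed := true; changed; {
		changed = false
		for _, e := range fd.equiv {
			if out[e[0]] != out[e[1]] {
				out[e[0]], out[e[1]] = true, true
				changed = true
			}
		}
	}
	return out
}

// coversStrictKey reports whether cols determine at least one strict key.
func (fd funcDeps) coversStrictKey(cols colSet) bool {
	cl := fd.closure(cols)
	for _, key := range fd.strictKeys {
		covered := true
		for c := range key {
			if !cl[c] {
				covered = false
				break
			}
		}
		if covered {
			return true
		}
	}
	return false
}

func main() {
	// Columns a=1, b=2, c=3; strict key (a, b); filters b = 5 and a = c.
	fd := funcDeps{
		consts:     colSet{2: true},
		equiv:      [][2]int{{1, 3}},
		strictKeys: []colSet{{1: true, 2: true}},
	}
	fmt.Println(fd.coversStrictKey(colSet{3: true})) // true: c gives a, and b is constant
	fmt.Println(fd.coversStrictKey(colSet{}))        // false: a is not determined
}
```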

vitess

  • 243: Add ignore/replace modifiers to load data
  • 241: Walking sub-nodes for SHOW TABLE statements
    When preparing a SHOW TABLES statement with a bound variable in the filter clause (e.g. SHOW TABLES FROM mydb WHERE Tables_in_mydb = ?;) GMS and Vitess were identifying the bound variable parameters differently and causing the SQL client on the other end to panic. Vitess code in conn.go walks the parsed tree and looks for SQLVal instances to identify the parameters and then returns that metadata over the SQL connection. The SHOW TABLES statement above fails because the sqlparser AST wasn't including all the members of the SHOW TABLES node in the walk. This case is a little tricky to test directly in go-mysql-server, because it only repros in a running sql-server when running over a Vitess conn.
    The GMS and Vitess layers are both calculating bind variable metadata, with two different techniques, and whenever they get out of sync, we will see issues like this that only appear when running over a SQL connection. Longer term, we may want to consider allowing GMS to return its bind variable metadata and avoid Vitess needing to re-calculate it, if we see more instances of this problem.
    Fixes: https://github.com/dolthub/go-mysql-server/issues/1793
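
The failure mode described for PR 241 above comes down to a tree walk that must visit every child of a node to find bind variables. The sketch below shows the idea with a hypothetical, simplified node type; it is not the vitess sqlparser AST or its Walk function.

```go
package main

import "fmt"

// node is a hypothetical, simplified AST node.
type node struct {
	kind     string // e.g. "show_tables", "filter", "bindvar"
	children []*node
}

// countBindVars walks the whole tree and counts bind variable placeholders.
// If a node type fails to expose one of its members as a child, any
// placeholder below that member is silently missed.
func countBindVars(n *node) int {
	if n == nil {
		return 0
	}
	count := 0
	if n.kind == "bindvar" {
		count++
	}
	for _, c := range n.children {
		count += countBindVars(c)
	}
	return count
}

func main() {
	// Models: SHOW TABLES FROM mydb WHERE Tables_in_mydb = ?
	filter := &node{kind: "filter", children: []*node{
		{kind: "comparison", children: []*node{{kind: "column"}, {kind: "bindvar"}}},
	}}

	complete := &node{kind: "show_tables", children: []*node{{kind: "db_name"}, filter}}
	missing := &node{kind: "show_tables", children: []*node{{kind: "db_name"}}}

	fmt.Println(countBindVars(complete)) // 1: the filter is part of the walk
	fmt.Println(countBindVars(missing))  // 0: the filter was left out of the walk
}
```

When the walk reports zero parameters but the engine expects one, the two layers disagree about the prepared statement's metadata, which is the mismatch the PR addresses.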

Closed Issues

  • 5982: LOAD DATA does not support replace or ignore options
  • 5678: Feature request: Add --skip-empty to DOLT_COMMIT()
  • 6089: Analyzer Error: failed to replan join: field "id" is not on schema

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 1.93 2.71 1.4
groupby_scan 12.3 16.71 1.4
index_join 1.18 4.18 3.5
index_join_scan 1.14 2.11 1.9
index_scan 30.26 54.83 1.8
oltp_point_select 0.14 0.47 3.4
oltp_read_only 2.86 8.13 2.8
select_random_points 0.3 0.77 2.6
select_random_ranges 0.35 1.1 3.1
table_scan 30.26 55.82 1.8
types_table_scan 70.55 158.63 2.2
reads_mean_multiplier 2.4
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 5.0 6.32 1.3
oltp_insert 2.52 3.02 1.2
oltp_read_write 6.43 15.55 2.4
oltp_update_index 2.57 3.13 1.2
oltp_update_non_index 2.66 3.07 1.2
oltp_write_only 3.62 7.84 2.2
types_delete_insert 5.0 6.91 1.4
writes_mean_multiplier 1.5
Overall Mean Multiple 2.0
dolt - 1.2.5

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6090: Actually run bats tests in a configuration where all commands connect to a server process.
    Due to a typo in the .yaml script, we weren't actually running bats tests with the $SQL_ENGINE environment variable set, which meant that we weren't actually testing the ability for commands to seamlessly connect to running server processes. This PR fixes this.
  • 6065: update dolt blame to use sql backend and accept specific revision
    This change updates dolt blame to use the appropriate sql engine to generate results. This change also allows users to specify a specific revision to annotate from.
    Related: https://github.com/dolthub/dolt/issues/3922
  • 6015: Migrate dolt add to use Sql backend.
    This PR migrates dolt add to invoke SQL commands instead of manipulating the database directly. This allows it to work even on remote connections.

go-mysql-server

Closed Issues

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 1.96 2.76 1.4
groupby_scan 12.3 17.01 1.4
index_join 1.16 4.25 3.7
index_join_scan 1.12 2.14 1.9
index_scan 30.81 55.82 1.8
oltp_point_select 0.15 0.47 3.1
oltp_read_only 2.86 8.13 2.8
select_random_points 0.29 0.77 2.7
select_random_ranges 0.35 1.08 3.1
table_scan 30.81 55.82 1.8
types_table_scan 69.29 158.63 2.3
reads_mean_multiplier 2.4
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 5.57 6.55 1.2
oltp_insert 2.66 3.13 1.2
oltp_read_write 6.79 15.83 2.3
oltp_update_index 2.81 3.19 1.1
oltp_update_non_index 2.81 3.25 1.2
oltp_write_only 4.03 7.84 1.9
types_delete_insert 5.47 7.04 1.3
writes_mean_multiplier 1.4
Overall Mean Multiple 2.0
dolt - 1.2.4

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

go-mysql-server

Closed Issues

  • 6063: Excessive Memory usage related to JSON column and multi table JOIN
  • 6076: Dolt backup restore drops user root, prevents access to other databases.
  • 5936: Add dolt table import --append
  • 5662: JSON operator ->> not supported

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 1.96 2.71 1.4
groupby_scan 12.3 17.01 1.4
index_join 1.18 4.1 3.5
index_join_scan 1.12 2.07 1.8
index_scan 30.26 55.82 1.8
oltp_point_select 0.14 0.48 3.4
oltp_read_only 2.86 8.13 2.8
select_random_points 0.29 0.77 2.7
select_random_ranges 0.35 1.08 3.1
table_scan 30.81 56.84 1.8
types_table_scan 69.29 161.51 2.3
reads_mean_multiplier 2.4
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 5.09 6.32 1.2
oltp_insert 2.48 3.13 1.3
oltp_read_write 6.32 15.55 2.5
oltp_update_index 2.43 3.13 1.3
oltp_update_non_index 2.61 3.02 1.2
oltp_write_only 3.55 7.84 2.2
types_delete_insert 5.0 6.91 1.4
writes_mean_multiplier 1.5
Overall Mean Multiple 2.0
dolt - 1.2.3

Published by github-actions[bot] over 1 year ago

Merged PRs

dolt

  • 6071: replace encoding/json with goccy/go-json for improved unmarshalling
  • 6060: Test CI with automatic server connection for CLI commands.
    This adds additional CI workflows that start a Dolt SQL server before running the bats tests. The dolt commands in the tests should connect to the running server instead of running their own engine.
    The file local-remote.bats contains a list of bats files that are not yet confirmed to work against a remote server because they contain dolt commands that haven't been migrated yet. This is a burndown list that will shrink as we migrate more commands.

go-mysql-server

  • 1806: improve conversion from JsonDocument to string
  • 1803: Added "utf8mb3_czech_ci" collation, fixed missing collation check for enum/set
    Fixes https://github.com/dolthub/go-mysql-server/issues/1801
    Adds the requested collation, and fixes the panic. The panic came from an oversight when checking for a collation's implementation. enum and set use the collation during type creation, which occurs before we've verified the collation's implementation. The other string types do not use the collation during type creation, so we return the appropriate error as a result.

Closed Issues

  • 6076: Dolt backup restore drops user root, prevents access to other databases.
  • 5936: Add dolt table import --append
  • 5662: JSON operator ->> not supported
  • 1801: Creating an enum column with collation utf8_czech_ci causes panic

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 1.89 2.71 1.4
groupby_scan 12.08 16.71 1.4
index_join 1.16 4.25 3.7
index_join_scan 1.1 2.11 1.9
index_scan 30.26 54.83 1.8
oltp_point_select 0.14 0.47 3.4
oltp_read_only 2.71 7.98 2.9
select_random_points 0.29 0.77 2.7
select_random_ranges 0.34 1.08 3.2
table_scan 30.26 54.83 1.8
types_table_scan 68.05 158.63 2.3
reads_mean_multiplier 2.4
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 5.0 6.09 1.2
oltp_insert 2.43 2.97 1.2
oltp_read_write 6.32 15.55 2.5
oltp_update_index 2.52 3.07 1.2
oltp_update_non_index 2.57 3.02 1.2
oltp_write_only 3.55 7.7 2.2
types_delete_insert 4.82 6.79 1.4
writes_mean_multiplier 1.5
Overall Mean Multiple 2.0