dolt

Dolt – Git for Data

APACHE-2.0 License

dolt - 1.21.4

Published by github-actions[bot] 12 months ago

Merged PRs

dolt

  • 6870: Revert "Change parent order processing in commit iterator …"
    Reverting this commit after receiving customer testing feedback that there are still cases where we aren't finding the most specific commit and that query latency for some dolt_diff_<tablename> queries has increased.
    This reverts commit 2e5e6aa968c645d150ba112a6d507b3d516f296c
  • 6866: Store source database info in tables and columns.
    This allows the engine to differentiate between tables with the same name from different databases, in line with MySQL. This also moves us toward a less error-prone model for resolving table references in queries, where we generate IDs for tables and columns during plan building and use those instead of doing lots of string comparisons.
    The main reason for supporting this is that it allows writing queries like:
    SELECT * from `db/main`.table join `db/dev`.table using (pk)
    without having to put each branch in its own subquery.
    This is the dolt side of https://github.com/dolthub/go-mysql-server/pull/2090

go-mysql-server

  • 2091: Non-ambiguous ORDER BY col should not error
  • 2090: Allow queries with duplicate table names if those tables come from different databases and all references are unambiguously resolvable.

Closed Issues

  • 5305: Argument order is very important to the dolt CLI
  • 6861: Merging main into another branch breaks dolt_diff_<table>

dolt - 1.21.3

Published by github-actions[bot] 12 months ago

Merged PRs

dolt

  • 6862: Ensure dolt_diff_<tablename> reports most-specific commit that changed a row
    Changes parent order processing in commit iterator so that we prioritize traversing the parent of a merge commit that supplied the commits merged in that commit, before we traverse the other merge parent.
    Fixes: https://github.com/dolthub/dolt/issues/6861

Closed Issues

  • 6861: Merging main into another branch breaks dolt_diff_<table>
  • 6859: CREATE VIEW with columns

Latency

Read Tests              MySQL    Dolt     Multiple
covering_index_scan     2.14     2.76     1.3
groupby_scan            13.22    17.32    1.3
index_join              1.37     4.41     3.2
index_join_scan         1.27     2.14     1.7
index_scan              34.33    55.82    1.6
oltp_point_select       0.17     0.41     2.4
oltp_read_only          3.3      7.3      2.2
select_random_points    0.33     0.67     2.0
select_random_ranges    0.39     0.92     2.4
table_scan              34.33    55.82    1.6
types_table_scan        74.46    158.63   2.1
reads_mean_multiplier                     2.0

Write Tests             MySQL    Dolt     Multiple
bulk_insert             0.001    0.001    1.0
oltp_delete_insert      5.28     5.88     1.1
oltp_insert             2.61     2.81     1.1
oltp_read_write         7.04     13.95    2.0
oltp_update_index       2.52     2.86     1.1
oltp_update_non_index   2.71     2.81     1.0
oltp_write_only         3.75     6.91     1.8
types_delete_insert     4.91     6.32     1.3
writes_mean_multiplier                    1.3

Overall Mean Multiple                     1.7

dolt - 1.21.2

Published by github-actions[bot] 12 months ago

Merged PRs

dolt

  • 6864: dbfactory: aws: For AWS remotes using explicit file credentials, periodically refresh the credentials the AWS client uses from the file contents.
    Some use cases put attenuated, expiring credentials into files. It's nice to pick up the new credentials without needing to recreate the client.
  • 6858: Allow dolt version in folders without full permissions
    This change allows dolt version to run in folders without write permissions.
    Resolves: https://github.com/dolthub/dolt/issues/2898
  • 6857: Improve import table error message
    This change improves the error message for dolt table import to be more readable and useful. Notably, this change adds the column names, table name, and file line number to the error output.
    Resolves: https://github.com/dolthub/dolt/issues/5718
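
The credential refresh described in PR 6864 above can be pictured with a small wrapper that re-reads a credentials file once a refresh interval has elapsed. This is only a rough Python sketch of the idea, not Dolt's Go implementation: the class name FileCredentials, the refresh_interval parameter, and the JSON file layout are all assumptions made for illustration.

```python
import json
import time


class FileCredentials:
    """Hypothetical helper: serves credentials from a file and re-reads it periodically."""

    def __init__(self, path, refresh_interval=60.0, clock=time.monotonic):
        self._path = path
        self._interval = refresh_interval
        self._clock = clock          # injectable so the refresh can be tested
        self._loaded_at = None
        self._creds = None

    def get(self):
        now = self._clock()
        # Reload on first use, or once the cached copy is older than the interval.
        if self._loaded_at is None or now - self._loaded_at >= self._interval:
            with open(self._path) as f:
                self._creds = json.load(f)   # assumed JSON layout, for illustration only
            self._loaded_at = now
        return self._creds
```

A long-lived client would call get() each time it needs credentials, so rotated values are picked up without recreating the client.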

Closed Issues

  • 2898: dolt version errors out if run in a folder without permissions
  • 5910: Can't push if sql-server is running
  • 6860: dolt_diff_$table doesn't return the original commit that changed a row
  • 5718: Error messages on insert can be enhanced to aid with debugging
  • 6854: Can't reverse tables to model in navicat. Problem with name resolution on an order by clause on the triggers table.
  • 6852: "SQLSTATE[HY093]: Invalid parameter number" with Laravel 10

dolt - 1.21.1

Published by github-actions[bot] about 1 year ago

Merged PRs

dolt

  • 6847: has_ancestor() function
    Motivated by this query:
    select from_x, to_x, to_commit, tag_name as version from dolt_diff_xy,
    LATERAL (
    select tag_name
    from dolt_tags
    where has_ancestor(to_commit, tag_hash)
    order by date desc
    limit 1
    ) tag
    
    "Get the latest tag for a commit", which we solve by reading the tags table backwards, and returning the first that is an ancestor of our target commit.
  • 6845: add --silent option for push, pull, fetch
    Adds a --silent option for push, pull, fetch to suppress output of progress information.
    Resolves: https://github.com/dolthub/dolt/issues/6828
  • 6834: migrate dolt ls to use sql queries
    This change updates dolt ls to use the appropriate sql engine to generate results. This change also creates a system variable that, when enabled, shows dolt system tables in show tables and information_schema.tables.
    Related: https://github.com/dolthub/dolt/issues/3922
    Resolves: https://github.com/dolthub/dolt/issues/2073
  • 6833: JSON and stats interface refactor
    Bring Dolt into line with the new stats and json interfaces.
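
The "latest tag for a commit" approach described for has_ancestor() in PR 6847 above (read the tags newest first, return the first one that is an ancestor of the target commit) can be sketched outside SQL as follows. The commit graph representation and both function names are hypothetical, and this sketch treats a commit as its own ancestor, which is an assumption rather than documented Dolt behavior.

```python
def has_ancestor(parents, commit, ancestor):
    """Return True if `ancestor` is reachable from `commit` via parent links."""
    stack, seen = [commit], set()
    while stack:
        c = stack.pop()
        if c == ancestor:
            return True
        if c in seen:
            continue
        seen.add(c)
        stack.extend(parents.get(c, []))
    return False


def latest_tag_for(parents, tags, commit):
    """tags is a list of (name, commit_hash, date); returns the newest matching tag name or None."""
    for name, tag_hash, _date in sorted(tags, key=lambda t: t[2], reverse=True):
        if has_ancestor(parents, commit, tag_hash):
            return name
    return None
```

With a linear history c1 <- c2 <- c3 and tags v1 on c1 and v2 on c3, asking for c2 yields v1, because v2 points at a commit that is not in c2's history.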

go-mysql-server

  • 2091: Non-ambiguous ORDER BY col should not error
  • 2088: fix off by one for found_rows when limit > count(*)
    We had a bug where we could increment the limit counter before receiving the EOF error.
    fixes https://github.com/dolthub/dolt/issues/6829
    companion pr: https://github.com/dolthub/vitess/pull/283
  • 2085: small fixes for new test harness using server engine
    This PR adds small fixes to server_engine, which is used to run the existing engine tests against a running server.
  • 2081: Refactor JSON interfaces, stats interfaces
    This tries to organize some of the stats and json code.
    Stats has interfaces in the sql package to avoid circular dependencies (catalog depends on stats, stats depends on sql.Row, etc). So stats package can depend on sql for row and types, and most of GMS can use the generic sql interfaces without having to take a dependency on the stats package.
    memory and enginetests are the two main places where we depend on the stats concrete implementations. Rather than putting the statistic objects in memory, I chose to put them into a separate package. The two main reasons are 1) the Dolt side has a conversion to the memory implementation as a presentation layer, and it made more sense in my head for the "presentation layer" to be in the stats package rather than memory; 2) I'm going to add logic that operates on histograms, and it seemed convenient to have that in its own package with a concrete implementation there for testing. If this is wrong I can always reverse course.
    Most of the json changes are to try to make custom json types (statistic, histogram) more closely aligned with the default JSONDocument implementation, remove unused interfaces, deduplicate methods that all have a generic implementation. Other than typing additions in helper functions, the biggest change is probably adding the ToInterface{} method, which converts a generic JSONWrapper interface (newly changed from JSONValue to avoid overloading the term) into a map of strings->interface{}. I mostly did this because I was having to go from Statistic->[]byte->map[string]interface{} to use our json serialization primitives. ToInterface{} might be less performant than inlining search, extract, etc on structs (for example, we don't need to convert Statistic to an interface to extract the row_count value), but the generic intermediate makes it pretty easy to plug in new json types. Again this may be misguided, happy to reverse course if I'm overcomplicating something.
  • 2080: drop sort when secondary index is available
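
The off-by-one described in PR 2088 above (the limit counter being incremented before the end-of-input signal arrives) is a general iterator pitfall. The sketch below illustrates that class of bug in Python; it is not the go-mysql-server code, and both function names are made up for the example.

```python
def limited_rows(rows, limit):
    """Return at most `limit` rows plus the number of rows actually received."""
    it = iter(rows)
    out, found = [], 0
    while found < limit:
        try:
            row = next(it)       # may signal end of input (the "EOF" case)
        except StopIteration:
            break
        found += 1               # count only after a row was really received
        out.append(row)
    return out, found


def limited_rows_buggy(rows, limit):
    """Same loop, but the counter is bumped before we know a row exists."""
    it = iter(rows)
    out, found = [], 0
    while found < limit:
        found += 1               # counted too early
        try:
            row = next(it)
        except StopIteration:
            break
        out.append(row)
    return out, found
```

The two versions agree whenever the limit is at most the row count; they diverge only when the limit exceeds it, which matches the condition named in the PR title.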

vitess

Closed Issues

  • 6817: [no conn] no binlog connection to source, attempting to establish one {}
  • 1256: dolt diff empty output with table filter on newly created tables
  • 6828: $ dolt push generates lots of Uploading... text in cron
  • 2073: Need a way to list all dolt system tables in SQL context
  • 6829: SQL_CALC_FOUND_ROWS doesn't work?

dolt - 1.21.0

Published by github-actions[bot] about 1 year ago

This release contains backwards incompatible changes:

  • The dolt_docs system table was updated so that it is created on SQL writes and returns an empty index on reads. Previously reading or writing to the dolt_docs table in a SQL context would error if no docs existed and users needed to manually create the table. Now reading and writing to dolt_docs will no longer error due to the table not existing, and trying to create a dolt_docs table will error.

Per Dolt’s versioning policy, this is a minor version bump because these changes may impact existing applications. Please reach out to us on GitHub or Discord if you have questions or need help with any of these changes.

Merged PRs

dolt

  • 6840: migrate dolt merge-base to use sql queries
    This change updates dolt merge-base to use the appropriate sql engine to generate results.
    Related: https://github.com/dolthub/dolt/issues/3922
  • 6831: go: sqle: cluster: When performing a graceful transition to standby, take mysql and dolt_branch_control replication state into account.
    Graceful transitions to standby block on the primary until a certain number of replicas are trued up. They then return a status of whether each database on each replica is caught up, so that a control plane agent can pick a caught up server to be next primary, for example.
  • 6827: Automatically create dolt_docs table
    Currently if you are using a SQL-only interface like Hosted, you have to manually create the dolt_docs system table to use it. This changes that functionality to be the same as the dolt_ignore system table, which is created on writes and returns an empty index on reads. The table will always exist as far as the mysql engine is concerned.
    Related to https://github.com/dolthub/dolt/issues/6809
  • 6826: Support AS OF with dolt_ignore table.
    Fixes https://github.com/dolthub/dolt/issues/6823
    To make this work cleanly, this PR creates a new interface, VersionedTable, which is implemented by DoltTable and classes that wrap it (Like WriteableDoltTable and IgnoreTable.)
  • 6822: Add defaults for dolt table row counts
    We could do a better job keeping track of row counts for special tables. In the meantime, we want tables to implement sql.StatisticsTable and return row count values high enough that we use HASH_JOIN instead of INNER_JOIN.
  • 6794: Prolly stats
    Add in-place ANALYZE TABLE support for Prolly trees:
    • every index prefix gets a histogram
    • level of tree with > 20 chunks (or level 0) is chosen for buckets
    • each chunk = a histogram bucket
    • full table scan to fill bucket, no sampling or sketches
    Adds a variety of unit tests and enginetests as guardrails for statistic values, data format, and serialization to the GMS side.
    This is not hooked into the costing logic yet, and there is no automatic refresh lifecycle.
    Dolt companion: https://github.com/dolthub/go-mysql-server/pull/2071
  • 6787: add --all flag for dolt push
    This PR adds an --all flag to the dolt push command and updates some error messages as well as the messages returned on a successful push.
    Also includes:
    • removing some duplicate tests from dolt-push.bats
    • splitting remote dolt push and pull tests from remotes.bats into remotes-push-pull.bats

go-mysql-server

  • 2088: fix off by one for found_rows when limit > count(*)
    We had a bug where we could increment the limit counter before receiving the EOF error.
    fixes https://github.com/dolthub/dolt/issues/6829
    companion pr: https://github.com/dolthub/vitess/pull/283
  • 2086: Server handling parsed statements
    Initially this was going to be a bit more involved, as I was planning on having Dolt expose a new interface, and we'd directly pass in GMS ASTs rather than Vitess ASTs. The Dolt interface approach turned out to be a lot more involved than first anticipated, and the construction of GMS ASTs needs state that we will not have at higher layers, and exposing such state is also a lot more involved. Therefore, I've made a compromise by accepting Vitess ASTs instead, which makes this vastly simpler. It's not going to be quite as powerful, but I think it can still serve our purposes for the foreseeable future.
    This basically works by hijacking the fact that we'll sometimes process Vitess ASTs via the prepared cache. If we receive a Vitess AST, then we skip the cache, otherwise we access the cache like the normal workflow.
  • 2084: Remove Dead Grant/Revoke code
  • 2079: reverse filters for reverse lookups
    fixes https://github.com/dolthub/dolt/issues/6824
  • 2078: fix panic when order by column is out of range
    We used to blindly index ORDER BY values, which resulted in panics.
    Now, we throw the appropriate error.
    Additionally, this matches MySQL behavior when performing ORDER BY with indexes < 1.
    • ORDER BY 0 = error
    • ORDER BY -1 = noop
  • 2077: sql.StatsTable->RowCount returns whether the estimate is exact
    Add a return argument for whether a RowCount can be a substitute for count(*).
  • 2076: pick float type if one side is non-number type
    If one side of a comparison is a non-number type and the other side is a number type, then convert to float to compare, since the non-number type value can be a float.
  • 2075: Fix information_schema row count regression
    If a table does not implement RowCount() we used to use 1000 as a default value for row costing. A recent refactor changed that to 0. This fixes information schema tables to report the 1000 value again, which is usually accurate for small databases because of default tables and columns. I also fixed some issues with database reporting for info schema tables.
    This regression probably still exists for some dolt tables and table functions. I will do a pass and see if I can add some more accurate values on the Dolt side.
  • 2074: optimize min(pk) and max(pk)
    This PR adds an optimization to queries that have a MIN or MAX aggregation over a PRIMARY KEY column.
    Since indexes are already sorted (and we support a reverse iterator) we can look at the first/last row to answer queries in this form.
    The new analyzer rule, replaceAgg, converts queries of the format select max(pk) ... from ... to the equivalent select pk ... from ... order by pk limit 1. Then, we depend on replacePkSort to apply IndexedTableAccess.
    Additionally, this PR has the replacePkSort optimization apply to queries that have filters (specifically those that were pushed down to IndexedTableAccess).
    There is also some refactoring and tidying up.
    Fixes https://github.com/dolthub/dolt/issues/6793
  • 2071: Updates for Dolt stats
    • json_value and json_length added
    • json_table edited to support json document inputs.
    • our custom json marshaller supports types that implement the json.Marshaller interface
    • increased recursive iter limit to 10,000 to more easily generate 3-level prolly trees for statistics testing
    Note: the json_value notation is different than mysql's. I accept the type as a third parameter, rather than expecting a RETURNING clause.
  • 2067: adding sqllogictests
    This PR adds some utility scripts to convert CRDB testing files into SQLLogicTest format.
    Additionally, this adds tests focusing on join and subqueries.
    Some notable tests are added as skipped enginetests.
  • 1888: Remove filters from LookupJoins when they're provably not required.
    During joins, we still evaluate every filter for every candidate row. But based on the join implementation, some of those filters must necessarily be true, so we don't need to evaluate them.
    In most joins the performance cost of this isn't that bad, but this problem is most noticeable in LookupJoins where a point lookup is constructed from several columns, at which point the filter evaluation can dominate the runtime.
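
The observation in PR 1888 above, that some join filters are necessarily true once a point lookup has been done on the same columns, can be illustrated with a toy lookup join. This is a hedged Python sketch of the idea only; the function name, the dict-based index, and the evaluation counter are assumptions for the example and do not mirror the go-mysql-server implementation.

```python
from collections import defaultdict


def lookup_join(left, right, key_cols, check_redundant_filter=False):
    """Join two lists of dict rows with a point lookup on `key_cols`.

    Returns (joined_rows, number_of_filter_evaluations).
    """
    # Build a secondary "index" on the right side, keyed by the join columns.
    index = defaultdict(list)
    for r in right:
        index[tuple(r[c] for c in key_cols)].append(r)

    joined, evaluations = [], 0
    for l in left:
        key = tuple(l[c] for c in key_cols)
        for r in index.get(key, []):
            if check_redundant_filter:
                evaluations += 1
                # Always true here: r was found by looking up exactly this key.
                if any(l[c] != r[c] for c in key_cols):
                    continue
            joined.append({**r, **l})
    return joined, evaluations
```

Both modes return the same rows; the only difference is the number of equality checks performed per candidate row, which is the cost the PR aims to avoid.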

vitess

Closed Issues

  • 6829: SQL_CALC_FOUND_ROWS doesn't work?
  • 6824: Reverse indexed table walks generate incorrect ordering when using multiple filters ranges.
  • 6460: Push/Pull/Fetch while sql-server is running
  • 6406: FOUND_ROWS() returns incorrect results
  • 6823: Panic using AS OF with dolt_ignore system table
  • 6819: JSON numbers less than 0.5 is treated as 0
  • 6626: Implement dolt push --all

Latency

Read Tests              MySQL    Dolt     Multiple
covering_index_scan     2.07     2.71     1.3
groupby_scan            12.98    17.32    1.3
index_join              1.32     4.49     3.4
index_join_scan         1.25     2.14     1.7
index_scan              33.72    55.82    1.7
oltp_point_select       0.17     0.4      2.4
oltp_read_only          3.25     7.17     2.2
select_random_points    0.32     0.67     2.1
select_random_ranges    0.38     0.92     2.4
table_scan              33.72    55.82    1.7
types_table_scan        74.46    158.63   2.1
reads_mean_multiplier                     2.0

Write Tests             MySQL    Dolt     Multiple
bulk_insert             0.001    0.001    1.0
oltp_delete_insert      4.57     5.37     1.2
oltp_insert             2.22     2.66     1.2
oltp_read_write         6.67     13.46    2.0
oltp_update_index       2.39     2.66     1.1
oltp_update_non_index   2.3      2.61     1.1
oltp_write_only         3.3      6.67     2.0
types_delete_insert     4.82     5.77     1.2
writes_mean_multiplier                    1.4

Overall Mean Multiple                     1.7

dolt - 1.20.0

Published by github-actions[bot] about 1 year ago

Merged PRs

This release contains backwards incompatible changes:

  • The exit status of the dolt merge command line tool has changed for merges with conflicts. Previously, merges which resulted in conflicts returned an exit status of 0, indicating success. This release changes the behavior to return a 1 in the event of a failed merge.

Per Dolt’s versioning policy, this is a minor version bump because these changes may impact existing applications. Please reach out to us on GitHub or Discord if you have questions or need help with any of these changes.
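
Scripts that wrap dolt merge may need to start checking the exit status after this change. A minimal Python sketch of such a check follows; the helper name merge_is_clean and the branch name feature-branch are placeholders, and the commented invocation assumes dolt is on PATH and run inside a repository.

```python
import subprocess


def merge_is_clean(cmd):
    """Run a merge command; True when it exits 0, False for any non-zero status."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0


# Assumed invocation (not executed here):
# if not merge_is_clean(["dolt", "merge", "feature-branch"]):
#     print("merge failed or produced conflicts; resolve before continuing")
```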

dolt

go-mysql-server

  • 2074: optimize min(pk) and max(pk)
    This PR adds an optimization to queries that have a MIN or MAX aggregation over a PRIMARY KEY column.
    Since indexes are already sorted (and we support a reverse iterator) we can look at the first/last row to answer queries in this form.
    The new analyzer rule, replaceAgg, converts queries of the format select max(pk) ... from ... to the equivalent select pk ... from ... order by pk limit 1. Then, we depend on replacePkSort to apply IndexedTableAccess.
    Additionally, this PR has the replacePkSort optimization apply to queries that have filters (specifically those that were pushed down to IndexedTableAccess).
    There is also some refactoring and tidying up.
    Fixes https://github.com/dolthub/dolt/issues/6793
  • 2073: Bug fixes for explicit DEFAULT values in INSERT statements
    Previously this was broken when DEFAULT values referred to other columns.
    Fixes https://github.com/dolthub/dolt/issues/6430
    Needs more tests, but this fixes the immediate buggy behavior.
    Also fixes a related bug in MySQL: https://bugs.mysql.com/bug.php?id=112708
  • 2072: support json_valid() function
    Add support for JSON_VALID(val) function
  • 2070: Add routines to PrivilegeSets
    Added some Unit tests as well. This is in preparation for supporting routine grants.
  • 2068: Virtual column proof of concept
    This needs a lot more tests, but this demonstrates that the approach can work well enough.
  • 2065: Remove redundant information_schema creation in the driver example
    In the driver's example, information_schema is duplicated.
    It is created in two places.
    1. factory.Resolve()
    2. Catalog in Analyzer in Driver.OpenConnector()
    There is no need to create it in the factory.Resolve().
    I checked databases using this:
    diff --git a/driver/_example/main.go b/driver/_example/main.go
    index 34e0580ed..adbc1a249 100644
    --- a/driver/_example/main.go
    +++ b/driver/_example/main.go
    @@ -35,6 +35,14 @@ func main() {
    rows, err := db.Query("SELECT * FROM mytable")
    must(err)
    dump(rows)
    +
    +       rows, err = db.Query("SHOW DATABASES")
    +       must(err)
    +       for rows.Next() {
    +               var db string
    +               must(rows.Scan(&db))
    +               fmt.Println("db:", db)
    +       }
    }
    func must(err error) {
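
The MIN/MAX rewrite described in PR 2074 above rests on the equivalence between an aggregate over a primary key and a sorted single-row read. The sketch below checks that equivalence using Python's built-in sqlite3 module as a stand-in engine (an assumption for illustration; it is not Dolt or go-mysql-server), with a made-up table t. Note that the sorted form for MAX reads the index in descending order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(3, "c"), (1, "a"), (7, "g")])

# Aggregate form and sorted single-row form of the same question.
max_agg = conn.execute("SELECT MAX(pk) FROM t").fetchone()[0]
max_sorted = conn.execute("SELECT pk FROM t ORDER BY pk DESC LIMIT 1").fetchone()[0]
min_agg = conn.execute("SELECT MIN(pk) FROM t").fetchone()[0]
min_sorted = conn.execute("SELECT pk FROM t ORDER BY pk LIMIT 1").fetchone()[0]
```

The two forms differ on an empty table: the aggregate returns one row containing NULL, while the sorted form returns no rows, so an engine applying this kind of rewrite has to preserve the aggregate's behavior in that case.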
    

Closed Issues

  • 6793: No index used when calling select max(date) ...
  • 6430: Column defaults do not get applied correctly with explicit DEFAULT keyword
  • 6785: Read-only databases allow creating new branches using DOLT_CHECKOUT()
  • 6808: Support JSON_VALID()
  • 6788: Make uuid primary keys function more like auto_increment
  • 6799: Make Laravel example app work with Dolt
  • 6286: Fix field indexes after analysis

Latency

Read Tests              MySQL    Dolt     Multiple
covering_index_scan     2.11     2.71     1.3
groupby_scan            12.98    17.01    1.3
index_join              1.32     4.57     3.5
index_join_scan         1.25     2.14     1.7
index_scan              34.33    54.83    1.6
oltp_point_select       0.17     0.4      2.4
oltp_read_only          3.25     7.17     2.2
select_random_points    0.32     0.68     2.1
select_random_ranges    0.38     0.9      2.4
table_scan              34.33    54.83    1.6
types_table_scan        74.46    155.8    2.1
reads_mean_multiplier                     2.0

Write Tests             MySQL    Dolt     Multiple
bulk_insert             0.001    0.001    1.0
oltp_delete_insert      4.41     5.37     1.2
oltp_insert             2.18     2.66     1.2
oltp_read_write         6.55     13.46    2.1
oltp_update_index       2.22     2.66     1.2
oltp_update_non_index   2.22     2.61     1.2
oltp_write_only         3.25     6.55     2.0
types_delete_insert     4.49     5.77     1.3
writes_mean_multiplier                    1.4

Overall Mean Multiple                     1.8

dolt - 1.19.0

Published by github-actions[bot] about 1 year ago

This release contains backwards incompatible changes:

  • MySQL 8.0 Server Connection Metadata – When connecting to a Dolt sql-server, Dolt now advertises that it is a MySQL 8.0 server, instead of a MySQL 5.7 server as in previous Dolt versions. This is a backwards incompatible change, because some clients (e.g. Laravel) use this metadata to determine which SQL features and syntax to use.
  • Recovery of Dropped Databases – Dropped databases can now be restored by using the dolt_undrop() stored procedure. This is a backwards incompatible change, because disk drive space is no longer automatically reclaimed after dropping a database; instead, it can be manually purged with the new dolt_purge_dropped_databases(); procedure.

Per Dolt’s versioning policy, this is a minor version bump because these changes may impact existing applications. Please reach out to us on GitHub or Discord if you have questions or need help with any of these changes.

Merged PRs

dolt

  • 6795: /.github/scripts/sql-correctness/get-dolt-correctness-job-json.sh: change correctness precision
  • 6784: go: sqle: cluster: Make users and grants replication respect dolt_cluster_ack_writes_timeout_secs.
  • 6783: go: sqle: cluster: Fix a bug when replicating users and grants and branch control where two updates which came together quickly could fail to replicate the second update.
  • 6782: go: sqle: cluster: Make branch control replication respect dolt_cluster_ack_writes_timeout_secs
  • 6771: integration-tests/go-sql-server-driver: Add a test for undropping a cluster replicated database.
  • 6762: Feature: New stored procedure dolt_undrop() to restore dropped databases
    This change introduces a new stored procedure, dolt_undrop(<database name>) that can be used to restore a dropped database. Dropped databases are now moved to a new dolt_dropped_databases directory until they are either restored or permanently purged with the new dolt_purge_dropped_databases stored procedure.
    Usage:
    # Use dolt_undrop() with the database name to restore a dropped database
    drop database mydb;
    call dolt_undrop('mydb');
    ...
    # Use dolt_purge_dropped_databases() with no arguments to permanently delete all dropped databases
    drop database mydb;
    call dolt_purge_dropped_databases;
    
    Required Privileges: Calling dolt_undrop() does not require any special privileges. Calling dolt_purge_dropped_databases() requires SUPER privileges.
    Testing Notes: Tests are in BATS, since Dolt enginetests use InMemoryBlobStore instead of writing db data to a filesystem, so there isn't currently a good way to undrop databases in enginetests. If the noms data files were stored through the InMemFS, then the same code would work in our tests to undrop databases. Currently the repo state files are able to be undropped, but not the actual database data, since it disappears with the InMemoryBlobStore reference when we drop the database.
    Fixes: https://github.com/dolthub/dolt/issues/6484
    Documentation Updates: https://github.com/dolthub/docs/pull/1768

go-mysql-server

  • 2065: Remove redundant information_schema creation in the driver example
    In the driver's example, information_schema is duplicated.
    It is created in two places.
    1. factory.Resolve()
    2. Catalog in Analyzer in Driver.OpenConnector()
    There is no need to create it in the factory.Resolve().
    I checked databases using this:
    diff --git a/driver/_example/main.go b/driver/_example/main.go
    index 34e0580ed..adbc1a249 100644
    --- a/driver/_example/main.go
    +++ b/driver/_example/main.go
    @@ -35,6 +35,14 @@ func main() {
    rows, err := db.Query("SELECT * FROM mytable")
    must(err)
    dump(rows)
    +
    +       rows, err = db.Query("SHOW DATABASES")
    +       must(err)
    +       for rows.Next() {
    +               var db string
    +               must(rows.Scan(&db))
    +               fmt.Println("db:", db)
    +       }
    }
    func must(err error) {
    
  • 2064: Implement fast join ordering for large joins that can be implemented as a series of lookups.
    Fixes https://github.com/dolthub/dolt/issues/6713
    This PR is built on top of https://github.com/dolthub/go-mysql-server/pull/2063 and includes its changes, plus two additional commits.
    The purpose of this PR is to attempt to construct a left-deep join ordering where every join is possible to optimize into a lookup. It uses Functional Dependencies to identify the column set that determines every other set in the final join. If a "lookup only join order" exists, then these columns must be in the innermost join. We use that as the basis and pick join filters one at a time to create a new join plan.
    Currently this only handles the case where a single column determines every other column, and only the case where at each juncture, only a single possible choice can be made for the next table in the join. This is sufficient to handle the queries used in sqllogictests.
    Right now, this is intentionally limited in scope: it's designed to only affect the extremely large (up to 64 tables) joins used in sqllogictests, and won't currently be used outside of large joins. But there's no reason that this can't be used for other joins. In fact, in some cases I observed this generate correct join plans that our current brute-force reordering misses. (edge.applicable claims that these ordering violate a conflict rule even though they should be correct. We may be overly conservative in some places.)
  • 2063: Improve functional dependency analysis for joins.
    This is a prerequisite for fixing https://github.com/dolthub/dolt/issues/6713.
    Basically, this PR does two things:
    • Adds additional bookkeeping to FDS in order to track "partial FDS keys", that is, column sets that determine some (but not all) of the other columns in a relation. Before this PR, we attempted to only track keys that determined the entire relation. This would cause us to lose some information and prohibit some optimizations. (There were also a couple of cases in nested joins where we accidentally added partial keys to FDS anyway, by adding a key from a child table to the FDS of the parent. If we ever used these keys it could have caused correctness issues, but it doesn't look like we ever did.)
    • Improves the simplifyCols method in FDS to use partial keys in order to improve analysis.
    Overall, these changes allow us to compute much better FDS keys for complicated joins, by allowing us to remember and reuse keys derived from child tables to improve the computed key for tables later in the join.
  • 2062: "CREATE TABLE" fails with a "database not found" error when using the driver
    In the main branch (commit 4cc2f2c), executing the following SQL in the driver's example (driver/_example) fails with the error database not found: mydb:
    diff --git a/driver/_example/main.go b/driver/_example/main.go
    index 34e0580ed..64fe11a2b 100644
    --- a/driver/_example/main.go
    +++ b/driver/_example/main.go
    @@ -35,6 +35,9 @@ func main() {
    rows, err := db.Query("SELECT * FROM mytable")
    must(err)
    dump(rows)
    +
    +       _, err = db.Exec("CREATE TABLE table2 (id integer, primary key (id))")
    +       must(err)
    }
    func must(err error) {
    
    output:
    John Doe [email protected] ["555-555-555"] 2023-10-09 21:29:48.750044594 +0900 JST m=+0.016306123
    John Doe [email protected] [] 2023-10-09 21:29:48.750060666 +0900 JST m=+0.016322195
    Jane Doe [email protected] [] 2023-10-09 21:29:48.750067418 +0900 JST m=+0.016328947
    Evil Bob [email protected] ["555-666-555", "666-666-666"] 2023-10-09 21:29:48.750073628 +0900 JST m=+0.016335158
    2023/10/09 21:29:48 database not found: mydb
    exit status 1
    
    This is because the DefaultSessionBuilder of the driver is initialized with an empty DBProvider.
    driver/session.go:22
    In this PR, I have fixed this by passing the DatabaseProvider returned by factory.Resolve() to the SessionBuilder.
  • 2061: More histogram support, redo stats table
    Reshape statistics interfaces to better support custom Dolt implementation.
    • Add ANALYZE TABLE <table> [UPDATE/DROP] HISTOGRAM ON <column,...> USING <json blob> support and a few tests
    • Replace use of update information schema cardinality updates with histogram updates
    • Default catalog has an in-memory histogram
    • New memory histogram impl
    • Delete old statistics table/histogram code
      The only prod difference is that the new update path can overwrite the default table count, while it was a testing-only thing before.
      companion: https://github.com/dolthub/vitess/pull/279
      Dolt bump seems OK
  • 2060: Fixed various bugs in last_insert_id
    Fixes https://github.com/dolthub/dolt/issues/6776
    Also adds better test coverage for LAST_INSERT_ID() and the INSERT_ID field in the ok response for updates.
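
The functional dependency reasoning in PRs 2063 and 2064 above (finding a column set that determines every other column, and distinguishing it from "partial" keys that determine only some) can be illustrated with the textbook attribute-closure computation. This is a hedged Python sketch of that standard technique, not the FDS code in go-mysql-server; the function names, the column names, and the example dependencies are all invented for illustration.

```python
def closure(cols, fds):
    """Return every column determined by `cols` under the given functional dependencies.

    fds is a list of (determinant, dependent) pairs of column sets.
    """
    result = set(cols)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we also know all of rhs.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_key(cols, fds, all_cols):
    """A column set is a full key when its closure covers every column."""
    return closure(cols, fds) >= set(all_cols)
```

For a chain of lookups a -> b -> c where a.id determines a.b_id, the join filter makes a.b_id determine b.id, and so on, a.id is a full key for the joined relation, while b.id determines only the columns of b and c and is therefore a partial key in the sense used above.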

vitess

  • 280: Update the default server version to 8.0.33
    The connection handshake was advertising a server version of 5.7.9-Vitess and some clients were using that info and trying to speak MySQL-5.7 to Dolt (example issue).
    This change updates the default advertised server version to 8.0.33-Dolt.
    Dolt CI tests are running at: https://github.com/dolthub/dolt/pull/6798
  • 279: Add ANALYZE HISTOGRAM support
  • 277: Allow parsing of CREATE TABLE t AS (...) UNION (...)
    This allows parsing of CREATE TABLE AS statements when the expression being used to create the table is a set operation like UNION, INTERSECT, or EXCEPT.
    The "AS" keyword is typically optional. But this change only allows set ops to be used with CREATE ... AS when the AS is explicit. This is to avoid an ambiguity in the current grammar when attempting to parse CREATE TABLE t (, where what follows could be a set op or a table definition. Fully matching MySQL's spec here would require rewriting our grammar to avoid this ambiguity, which is outside the scope of the PR. However, this PR makes us strictly more correct than we were before.

Closed Issues

  • 6484: Drop Database is destructive. Could make it more safe.
  • 5820: Dolt works poorly with prepared emulation off in PHP PDO
  • 6641: CREATE TABLE {name} SELECT ... UNION SELECT ... fails to parse.
  • 6797: Laravel example app requires environment variable Dolt does not support
  • 6713: Dolt takes forever to generate join plans for a join of many tables.
  • 6776: SELECT LAST_INSERT_ID() returns BIGINT SIGNED for BIGINT UNSIGNED ids
dolt - 1.18.1

Published by github-actions[bot] about 1 year ago

Merged PRs

dolt

  • 6781: /go/utils/publishrelease/buildbinaries.sh: fix golang tag;
  • 6777: go/go.mod: Require Go 1.21 in our go.mod.
  • 6767: go: sqle: cluster: Improvements for DROP DATABASE replication.
    Fix a bug in session handling for the replication api endpoint which would prevent a dropped database from being recreated on a replica.
    Fix a race condition when a database is recreated after it is dropped. In that case, we stop attempting to replicate the drop, so that it does not replicate after the new database does.
  • 6764: update fetch context handling
    Updates fetch context handling to properly exit with an error if force cancelled.
  • 6738: migrate dolt push to use sql queries
    This change updates dolt push to use the appropriate sql engine to generate results.
    Related: https://github.com/dolthub/dolt/issues/3922
  • 6703: When a three-way merge contains an altered column, convert values to the new column type in order to prevent conflicts.
    Fixes https://github.com/dolthub/dolt/issues/6660
    I thought this was going to be a simple tweak, but it turns out that three-way merges in the presence of altered columns have some interesting subtleties.
    First off, we don't just compare the bytes of cells when merging, we actually de-serialize them and check for equality. I suspect this doesn't matter in most cases, but is done because:
    • Some types may have multiple ways to represent the same value, and
    • In the event of an altered column, two distinct values (of different types) may have the same representation.
      This leads to some subtle edge cases: for example, if one branch alters a column type and then modifies a row so the cell has the same representation as the original, unaltered value, we still need to detect that as a modification; otherwise we might silently drop it during the merge.
      On the other side of the coin, if one branch alters a column and the other branch modifies that column, that's not necessarily a conflict even though the representations of both cells have changed.
      The logic for merging cells is more complicated now than it was before, but I added comments that should make it easy to follow. In brief, we detect when columns have been altered and convert values as appropriate before checking to see if they've been modified by the branch.
      In slightly more detail, if we have a cell with value base (of type A), a branch right which alters the column to have type B, and a branch left which modifies the cell (but is still type A):
    • We convert base to type B in order to detect whether the branch right also modified the cell.
    • We also convert the value on left to type B in order to detect whether a conflict exists with the value on right.
      I'm still writing extra tests, and the Convert method doesn't support every type yet, but the design is ready for feedback.
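The conversion steps above can be sketched roughly as follows. This is a minimal illustration in Python, not Dolt's Go implementation: `merge_cell`, `MergeConflict`, and the `convert` callback are hypothetical names, and cell values are modeled as plain Python objects rather than serialized bytes.

```python
from typing import Any, Callable


class MergeConflict(Exception):
    """Raised when both branches changed a cell to different values."""


def merge_cell(base: Any, left: Any, right: Any,
               convert: Callable[[Any], Any]) -> Any:
    """Merge one cell where `right` altered the column type and `left` did not.

    `base` and `left` are still type A; `right` is already type B.
    `convert` (hypothetical) maps a type-A value to type B.
    """
    base_b = convert(base)  # base expressed in the new type B
    left_b = convert(left)  # left's value expressed in the new type B

    left_changed = left != base      # compared in the original type A
    right_changed = right != base_b  # compared in the new type B

    if left_changed and right_changed:
        if left_b == right:
            return right  # both sides made the same change
        raise MergeConflict((left_b, right))
    if left_changed:
        return left_b
    return right  # right changed, or neither side changed
```

With `convert=str`, `merge_cell(1, 2, "1", str)` returns `"2"`: the type change on the right is not treated as a modification, so the edit on the left wins without a conflict.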

go-mysql-server

  • 2060: Fixed various bugs in last_insert_id
    Fixes https://github.com/dolthub/dolt/issues/6776
    Also adds better test coverage for LAST_INSERT_ID() and the INSERT_ID field in the ok response for updates.
  • 2059: fix db qualified column names in order by
    We did not handle the case where database qualified column names would be in the order by clause.
    This PR fixes that issue and adds a variety of tests with database qualified column names and database qualified table names.
    There is still a case where joining two tables with the same name from two different databases results in an error; there's a skipped test for this, and a workaround is to alias the tables.
    fixes https://github.com/dolthub/dolt/issues/6773
  • 2054: prevent projections to be pushed past lateral joins
    pruneTables will pushdown projections as low as possible to avoid passing around columns that won't make it to the final result.
    In the case of lateral joins, a column that isn't referenced in the topmost projection may still be referenced by subqueries, which causes problems. This PR prevents the projection pushdown optimization from applying to the children of lateral joins.
    A better solution might be to determine which columns are referenced, and trim out all others. However, that seems hard.
    fixes https://github.com/dolthub/dolt/issues/6741
  • 2051: Trigger view errors don't prevent writes on database with preexisting trigger views
    PR https://github.com/dolthub/go-mysql-server/pull/2034 made it so that we cannot create triggers on views, but past versions of Dolt permitted creating those triggers. After that change, databases with a trigger view will have all writes blocked with the trigger view error. We should probably not try to parse and bind all triggers for every write, but as long as we do, this adds a warning rather than an error for a non-DDL trigger parse.
  • 2050: Fix binding of time.Time via driver
    When using the driver in v0.17.0, an error occurs when binding a variable of type time.Time as a query parameter.
    The error message looks like this:
    type time.Time not supported as bind var: 2023-10-01 17:37:49.382855116 +0900 JST m=+0.017340108
    
    This issue did not occur in v0.16.0.
    While it's possible to resolve this by modifying sqltypes.BuildBindVariable() in vitess,
    this PR resolves it within the driver itself by converting the variable into a timestamp literal string.
  • 2049: adding gosql prepared tests
    We are lacking in tests that use the gosql driver to connect and run queries against an in-memory GMS server. This PR creates a new test suite so it's easier to write these tests. Additionally, there are some tests targeting a bug involving unsigned integers being read as signed.
  • 2048: Prevent identifiers longer than 64 characters
    Also fixed a bug where we allowed multiple column names with the same case-insensitive name
    Fixes https://github.com/dolthub/dolt/issues/6611
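The two checks in this PR can be sketched as below. This is an illustrative Python sketch, not the GMS code: `validate_column_names` and the error strings are made up for the example; only the 64-character limit and the case-insensitive duplicate rule come from the description above.

```python
MAX_IDENTIFIER_LENGTH = 64  # limit named in the PR title


def validate_column_names(names):
    """Return a list of error strings; an empty list means the names are fine."""
    errors = []
    seen = set()
    for name in names:
        if len(name) > MAX_IDENTIFIER_LENGTH:
            errors.append(f"identifier too long: {name!r}")
        key = name.lower()  # column names are compared case-insensitively
        if key in seen:
            errors.append(f"duplicate column name: {name!r}")
        seen.add(key)
    return errors
```

For example, `validate_column_names(["id", "ID"])` reports one duplicate, and a 65-character name reports one length error.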

Closed Issues

  • 6776: SELECT LAST_INSERT_ID() returns BIGINT SIGNED for BIGINT UNSIGNED ids
  • 6773: Column could not be found when using fully qualified table name in ORDER BY
  • 4798: Table functions need unique names in queries, like tables
  • 6195: Error resolving union'ed recursive CTE query
  • 6660: Dolt revert panics after changing column from VARCHAR to TEXT
  • 6194: LATERAL joins
  • 6758: SECURITY ERROR v1.70.x
  • 6741: LATERAL join bugs

Latency

Read Tests               MySQL    Dolt  Multiple
covering_index_scan       2.14    2.76       1.3
groupby_scan             13.22   17.32       1.3
index_join                1.32    4.49       3.4
index_join_scan           1.25    2.14       1.7
index_scan               34.33   55.82       1.6
oltp_point_select         0.17     0.4       2.4
oltp_read_only             3.3    7.17       2.2
select_random_points      0.32    0.68       2.1
select_random_ranges      0.38    0.92       2.4
table_scan               34.33   55.82       1.6
types_table_scan         74.46  158.63       2.1
reads_mean_multiplier                        2.0

Write Tests              MySQL    Dolt  Multiple
bulk_insert              0.001   0.001       1.0
oltp_delete_insert        4.57    5.37       1.2
oltp_insert               2.26    2.71       1.2
oltp_read_write           6.67    13.7       2.1
oltp_update_index          2.3    2.76       1.2
oltp_update_non_index     2.35    2.66       1.1
oltp_write_only            3.3    6.79       2.1
types_delete_insert       4.74    5.77       1.2
writes_mean_multiplier                       1.4

Overall Mean Multiple                        1.7
dolt - 1.18.0

Published by github-actions[bot] about 1 year ago

Merged PRs

dolt

  • 6760: dolt backup: Fix dropped error on the final stage of updating the destination with the newly pushed backup.
  • 6754: go/libraries/doltcore/sqle/cluster: Replicate DROP DATABASE statements so that a dropped database is also dropped on replicas.
  • 6753: adds message column to dolt_push output
    Adds a message column to the output of dolt_push() to provide additional information.
  • 6752: go/libraries/doltcore/sqle/cluster: Fix a possible deadlock in permissions replication.

go-mysql-server

  • 2047: Don't reorder joins that are too large to efficiently analyze.
    The current implementation of the join order builder scales poorly if there are too many joins. It's likely possible to improve it, but in the meantime, I'm disabling join reordering on joins that have too many tables (currently defined to be more than 20.)
    In these situations, the analyzer takes longer to run the reordering than it does to actually execute any of our test cases, so running the analysis in this case can only slow us down.
    I expect this is unlikely to adversely affect users because joins this large are rare, and when they do occur they are often written in a way that the explicit order is good enough.
    For example, this test from sqllogictests:
    SELECT x63,x53,x62,x52,x11,x5,x40,x64,x27,x28,x21,x41,x22,x30,x16,x14,x56,x32,x46,x50,x1,x34   FROM t46,t34,t1,t32,t53,t21,t63,t11,t30,t62,t27,t50,t16,t64,t40,t56,t22,t28,t52,t5,t41,t14  WHERE a21=b5    AND b30=a52    AND a62=b46    AND a14=3    AND b52=a28    AND b53=a14    AND a63=b28    AND b40=a56    AND a11=b64    AND a53=b22    AND b1=a34    AND b32=a41    AND a50=b63    AND a64=b62    AND b11=a30    AND b27=a40    AND a22=b56    AND b21=a46    AND a1=b50    AND b34=a16    AND a27=b16  AND a5=b41;
    
    takes 30 minutes to reorder, and 15 seconds to run when reordering is disabled.
    MySQL runs the query in under a second, demonstrating that reordering can still massively improve performance if we can make the algorithm more efficient. But this is a good stopgap measure.
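A minimal sketch of the cutoff, assuming the threshold of 20 tables quoted above. The function names are hypothetical, and the factorial count of left-deep orderings is only a rough proxy for how fast the search space grows; it is not a description of how the GMS join order builder actually enumerates plans.

```python
import math

MAX_REORDER_TABLES = 20  # "more than 20" tables disables reordering


def should_reorder(num_tables: int) -> bool:
    """Skip join reordering once a join has more tables than the threshold."""
    return num_tables <= MAX_REORDER_TABLES


def left_deep_orderings(num_tables: int) -> int:
    """Number of left-deep join orders for n tables (n!), as a size proxy."""
    return math.factorial(num_tables)
```

The sqllogictests query above joins 22 tables, so `should_reorder(22)` is `False` and the join is executed in the order written.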
  • 2044: use session builder from harness in the server engine
    Small fixes for memory harness for enginetest:
    • use sessionBuilder from the harness instead of DefaultSessionBuilder
    • convert row result for SHOW queries

Closed Issues

  • 6611: Creating attribute with more than 64 character is allowed but causes error.
  • 6724: dolt merge doesn't produce deterministic hashes

Latency

Read Tests               MySQL    Dolt  Multiple
covering_index_scan       2.07    3.02       1.5
groupby_scan             12.98   17.95       1.4
index_join                1.27    4.74       3.7
index_join_scan           1.23    2.22       1.8
index_scan               32.53   58.92       1.8
oltp_point_select         0.14    0.39       2.8
oltp_read_only            2.71     7.3       2.7
select_random_points      0.31    0.72       2.3
select_random_ranges      0.37    0.97       2.6
table_scan               32.53   58.92       1.8
types_table_scan         74.46  170.48       2.3
reads_mean_multiplier                        2.2

Write Tests              MySQL    Dolt  Multiple
bulk_insert              0.001   0.001       1.0
oltp_delete_insert        4.82    5.57       1.2
oltp_insert               2.43    2.71       1.1
oltp_read_write           5.99   13.95       2.3
oltp_update_index         2.35    2.71       1.2
oltp_update_non_index     2.43    2.76       1.1
oltp_write_only           3.36    6.79       2.0
types_delete_insert       4.74    5.99       1.3
writes_mean_multiplier                       1.4

Overall Mean Multiple                        1.9
dolt - 1.17.1

Published by github-actions[bot] about 1 year ago

Merged PRs

dolt

  • 6744: go/store/nbs: store.go: Clean up how we update hasCache so that we only update it after successfully writing the memtable.
  • 6740: Support for DOLT_COMMITTER_DATE and DOLT_AUTHOR_DATE environment vars
    These two new environment vars perform the same function as GIT_COMMITTER_DATE and GIT_AUTHOR_DATE, respectively. They set the two timestamps associated with newly created commits on dolt commit and dolt merge.
    Setting these two environment variables allows the deterministic creation of commit hashes, as requested in https://github.com/dolthub/dolt/issues/6724
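Why pinning the two dates makes hashes reproducible can be illustrated with a toy model. The sketch below is not Dolt's commit serialization: `toy_commit_hash`, its field layout, and the use of SHA-1 are assumptions made for the example. The only point is that a hash over content plus timestamps is stable when the timestamps are fixed.

```python
import hashlib


def toy_commit_hash(root: str, parents: list, message: str,
                    author_date: str, committer_date: str) -> str:
    """Hash every field of a toy commit, both timestamps included."""
    payload = "\n".join([root, *parents, message, author_date, committer_date])
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()
```

Two calls with identical content and identical dates return the same hash; changing either date changes the hash, which is the nondeterminism the environment variables remove.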

go-mysql-server

  • 2041: not panic on Star.IsNullable()
    This reverts https://github.com/dolthub/go-mysql-server/pull/2039 because the fix was not the correct choice for the issue https://github.com/dolthub/dolt/issues/6659.
  • 2039: AliasedExpr.InputExpression should be compared case insensitive
  • 2038: error msg for invalid reference to non-existent table or column in existing view
    Catches invalid references to non-existent tables or columns in existing views. This includes SELECT queries on a view that references a table or column that was removed or renamed.
    Note: For now, it does not catch references to invalid functions, users without the appropriate privileges, or queries other than SELECT.
    Fixes: https://github.com/dolthub/dolt/issues/6691
  • 2032: fix order by on unioned schemas
    When unioning two SELECT statements that have different column types, we would get -1 during assignExecIndexes, resulting in a panic.
    This PR fixes the issue by matching on unqualified column names when we don't have an exact match.
    Exact matches fail because the second table's column is wrapped in a convert node, which gives it an unqualified alias over the column name.
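The fallback can be sketched as a two-step lookup. This is a simplified Python illustration, not the `assignExecIndexes` code: `find_column_index` is a hypothetical helper, and columns are modeled as plain strings such as `"t1.x"`.

```python
def find_column_index(schema, name):
    """Return the position of `name` in `schema`, or -1 if it is absent.

    `schema` is a list of possibly qualified column names such as "t1.x".
    """
    if name in schema:  # exact match first
        return schema.index(name)
    unqualified = name.rsplit(".", 1)[-1]  # drop any table qualifier
    for i, col in enumerate(schema):
        if col.rsplit(".", 1)[-1] == unqualified:
            return i
    return -1
```

Looking up `"t1.x"` in a schema that only contains the unqualified alias `"x"` now returns 0 instead of -1.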
  • 2030: unskipping fixed tests
    We have many tests that are marked skip/broken, but they are working now.
    This PR unskips and cleans up some of these skipped tests.
  • 2022: TPC-X query plan tests
    Added schemas, stats, query plans for:
    • TPC-H
    • TPC-DS
    • IMDB join planning benchmark
      Added plangen to auto-update the tests.
      We cannot parse all of the TPC-DS query plans yet. I saw some ROLLUP and aggregation validation errors.
      Excluding data ops benchmark because the plans are not interesting.
  • 1786: support event execution
    This PR adds event execution logic implementing EventScheduler interface in the engine.
    Notes:
    • Event Scheduler status cannot be updated at run-time.
    • Event DISABLE ON SLAVE status is not supported. It will be set to DISABLE by default.
      Corresponding Dolt changes: https://github.com/dolthub/dolt/pull/6108

vitess

  • 278: fix unsigned flag for COM_STMT_EXECUTE when new_params_bind_flag is set
    In the previous implementation, we assumed that the way the MySQL Protocol specifies Column Definitions is the same as how it specifies parameter types for COM_STMT_EXECUTE. The difference lies specifically in the flags that come after the field type.
    When reading/writing a field type (for a Column Definition), MySQL expects/writes a 1 byte wide enum_field_type followed by a 2 byte wide Column Definition Flag.
    However, when reading a COM_STMT_EXECUTE payload (that specifies parameters through new_params_bind_flag), MySQL indicates parameter_types with the same 1 byte wide enum_field_type followed by a 1 byte wide flag that indicates signedness.
    So basically, read 0x80 for COM_STMT_EXECUTE parameters, but read/write 0x20 for field_types/column definitions.
    I'm assuming MySQL does it this way because the majority of the Column Definition Flags are nonsensical/meaningless when paired up with parameters to prepared statements. Regardless, this was a subtle bug, and we should have tests for parsing COM_STMT_EXECUTE with new_params_bind_flag.
    Fixes https://github.com/dolthub/dolt/issues/6728
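The difference between the two layouts can be sketched as follows. This is an illustrative Python decoder, not the vitess code; the function and constant names are made up for the example, while the byte widths and the 0x80 and 0x20 values are the ones described above.

```python
import struct

PARAM_UNSIGNED_FLAG = 0x80   # 1-byte flag after the type in COM_STMT_EXECUTE
COLUMN_UNSIGNED_FLAG = 0x20  # bit in the 2-byte column definition flags


def parse_param_types(payload: bytes, num_params: int):
    """Decode (field_type, is_unsigned) pairs from COM_STMT_EXECUTE parameter types."""
    types = []
    for i in range(num_params):
        field_type, flag = payload[2 * i], payload[2 * i + 1]
        types.append((field_type, bool(flag & PARAM_UNSIGNED_FLAG)))
    return types


def parse_column_type(payload: bytes):
    """Decode (field_type, is_unsigned) from a 1-byte type plus 2-byte flags."""
    field_type, flags = struct.unpack_from("<BH", payload)
    return field_type, bool(flags & COLUMN_UNSIGNED_FLAG)
```

Field type 0x08 is MYSQL_TYPE_LONGLONG, so the bytes `08 80` in a parameter list describe a BIGINT UNSIGNED parameter, while `08 20 00` in a column definition describes a BIGINT UNSIGNED column.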
  • 277: Allow parsing of CREATE TABLE t AS (...) UNION (...)
    This allows parsing of CREATE TABLE AS statements when the expression being used to create the table is a set operation like UNION, INTERSECT, or EXCEPT.
    The "AS" keyword is typically optional. But this change only allows set ops to be used with CREATE ... AS when the AS is explicit. This is to avoid an ambiguity in the current grammar when attempting to parse CREATE TABLE t (, where what follows could be a set op or a table definition. Fully matching MySQL's spec here would require rewriting our grammar to avoid this ambiguity, which is outside the scope of the PR. However, this PR makes us strictly more correct than we were before.
  • 276: Allow parsing of SECONDARY_ENGINE = NULL
    This is a simple change to allow parsing a NULL value for the SECONDARY_ENGINE attribute for CREATE TABLE and ALTER TABLE statements.

Closed Issues

  • 6724: dolt merge doesn't produce deterministic hashes
  • 6728: Out of Range for bigint unsigned with question mark
  • 6691: Renaming a table breaks views using that table
  • 5498: Support CREATE EVENT statement
  • 6393: Handle schema merge for column and FK drop automatically
  • 6406: FOUND_ROWS() returns incorrect results
  • 6343: Produce a diff of two arbitrary queries
  • 6572: Prepared statements cache AST nodes
  • 1782: Error 1105: -128 out of range for BIGINT UNSIGNED

Latency

Read Tests               MySQL    Dolt  Multiple
covering_index_scan       2.07    2.97       1.4
groupby_scan             12.98   18.28       1.4
index_join                 1.3    4.82       3.7
index_join_scan           1.25    2.26       1.8
index_scan               33.12   59.99       1.8
oltp_point_select         0.14     0.4       2.9
oltp_read_only            2.71     7.3       2.7
select_random_points      0.31    0.72       2.3
select_random_ranges      0.37    0.97       2.6
table_scan               33.12   58.92       1.8
types_table_scan         75.82  173.58       2.3
reads_mean_multiplier                        2.2

Write Tests              MySQL    Dolt  Multiple
bulk_insert              0.001   0.001       1.0
oltp_delete_insert        5.18    5.88       1.1
oltp_insert               2.71    2.86       1.1
oltp_read_write           6.32   14.21       2.2
oltp_update_index         2.61    2.91       1.1
oltp_update_non_index     2.76    2.86       1.0
oltp_write_only           3.75    7.04       1.9
types_delete_insert       5.28    6.21       1.2
writes_mean_multiplier                       1.3

Overall Mean Multiple                        1.9
dolt - 1.17.0

Published by github-actions[bot] about 1 year ago

This release contains backwards incompatible changes:

  • Scheduled Events Executor – An event executor thread is started up by default to execute Scheduled Events.

Per Dolt’s versioning policy, this is a minor version bump because applications using previous versions of Dolt were able to create events, but they would not be executed. This release changes that behavior – when a dolt sql-server is running, created events are now executed on their defined schedule. This behavior can be disabled by starting the dolt sql-server with the --event-scheduler=OFF parameter.

Merged PRs

dolt

  • 6733: Moved all environment variables relevant to a customer into a common file so we can collect and document them more easily
    This is in a new package with no other dependencies so it can be referenced anywhere without causing dependency cycles.
    Also deleted altertests, which have been disabled for over a year.
  • 6702: reduce release sizes and speed up release time
    • Uses pigz instead of gzip for faster gzipping
    • Uses 7zip to create zip and 7z archives for windows (7z archive is ~49% smaller than zip archive)
    • add ldflags='-s' to go build to strip the binary and reduce its size (reduces to 89MB)
    • Upload 7z to releases
    • tested action on fork, works
      https://github.com/phanirithvij/dolt/releases/tag/v1.16.2
      If pigz seems unnecessary feel free to remove it, it works well though.

Closed Issues

  • 6659: HEX(*) fails with panic
  • 6691: Renaming a table breaks views using that table
  • 6730: Read-Only databases should still allow calling dolt_checkout()
  • 6712: unix_timestamp(date) doesn't respect a session time zone
dolt - 1.16.5

Published by github-actions[bot] about 1 year ago

Merged PRs

dolt

go-mysql-server

  • 2035: Tests for errors during insert not fouling a session
  • 2034: prevent creating trigger on view
    fixes https://github.com/dolthub/dolt/issues/6432
  • 2033: Add VERBOSE_ANALYZER environment variable.
    This environment variable will make the analyzer output the current optimization plan after each analyzer rule.
  • 2031: UNIX_TIMESTAMP() respects session time zone
    • UNIX_TIMESTAMP() now interprets the time value in the current session time zone before returning the final value. Previously the initial value was parsed as UTC, which is incorrect.
    • The default value of system_time_zone global variable will be set to the system TZ instead of UTC.
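The effect of the session time zone can be shown with a small sketch. This is Python rather than the GMS implementation; `unix_timestamp` is a hypothetical helper, and the session time zone is modeled as a fixed whole-hour offset for simplicity.

```python
from datetime import datetime, timedelta, timezone


def unix_timestamp(literal: str, session_offset_hours: int) -> int:
    """Interpret a 'YYYY-MM-DD HH:MM:SS' literal in the session time zone."""
    naive = datetime.strptime(literal, "%Y-%m-%d %H:%M:%S")
    session_tz = timezone(timedelta(hours=session_offset_hours))
    return int(naive.replace(tzinfo=session_tz).timestamp())
```

The same literal yields a value 32400 seconds smaller in a +09:00 session than in a UTC session, because local midnight there occurs nine hours earlier.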
  • 1786: support event execution
    This PR adds event execution logic implementing EventScheduler interface in the engine.
    Notes:
    • Event Scheduler status cannot be updated at run-time.
    • Event DISABLE ON SLAVE status is not supported. It will be set to DISABLE by default.
      Corresponding Dolt changes: https://github.com/dolthub/dolt/pull/6108

Closed Issues

  • 6730: Read-Only databases should still allow calling dolt_checkout()
  • 5498: Support CREATE EVENT statement
  • 6706: Dolt sql-server hangs weirdly when used as the backing store to Altium Designer
  • 6432: Prevent creating triggers on views
  • 6712: unix_timestamp(date) doesn't respect a session time zone
  • 4190: Stitchdata requires innodb_lock_wait_timeout be defined to connect to a Dolt database through MySQL connector
  • 4477: Indexes on tables in non-current database aren't used
  • 2145: SQL method to examine schema diffs
  • 2395: Precision of int-typed columns lost in information schema
  • 3518: Support Parquet as an output format in the -r option
  • 3700: Implement dolt patch and dolt apply
  • 3797: Suboptimal join query plan for TPC-C "new order" transaction
  • 4304: UpdateJoins don't honor check constraints on all tables
  • 5324: Old versions of dolt not available through homebrew
  • 6502: Support information_schema.columns AS OF in parser
  • 5996: Leak in hosted instance causing a crash
dolt - 1.16.4

Published by github-actions[bot] about 1 year ago

Merged PRs

dolt

go-mysql-server

  • 2029: Return innodb_lock_wait_timeout=1 always
    See: https://github.com/dolthub/dolt/issues/4190
  • 2025: Failure to push filter causes dropped filter
    We make a hard assumption during join planning that there are no errant filters in the join tree. Every filter is either a join edge, or sitting on its relation. When this is not true, the memo can generate a transitive edge between two relations that loses track of the original filter. The process for triggering this bug is 1) filter in an ON condition gets moved to the middle of the tree, 2) the filter fails to get pushed to its join edge/relation, 3) we generate a transitive join edge that loses track of that filter, and then 4) we choose the transitive join edge in costing.
    You'll see the filter restored in the integration query plans in the PR. I added a minimal repro with the appropriate ON conditions and forced a transitive edge that drops the filter if pushdown regresses in the future.

vitess

  • 276: Allow parsing of SECONDARY_ENGINE = NULL
    This is a simple change to allow parsing a NULL value for the SECONDARY_ENGINE attribute for CREATE TABLE and ALTER TABLE statements.
  • 275: Adding parser support for VISIBLE and INVISIBLE modifiers for indexes
    Fixes: https://github.com/dolthub/dolt/issues/6690

Closed Issues

Latency

Read Tests               MySQL    Dolt  Multiple
covering_index_scan       2.07    2.91       1.4
groupby_scan             12.98   17.95       1.4
index_join                1.27    4.74       3.7
index_join_scan           1.21    2.22       1.8
index_scan               33.12   57.87       1.7
oltp_point_select         0.14    0.39       2.8
oltp_read_only            2.66    7.17       2.7
select_random_points       0.3    0.72       2.4
select_random_ranges      0.37    0.95       2.6
table_scan               33.12   57.87       1.7
types_table_scan         74.46  170.48       2.3
reads_mean_multiplier                        2.2

Write Tests              MySQL    Dolt  Multiple
bulk_insert              0.001   0.001       1.0
oltp_delete_insert        5.77    5.88       1.0
oltp_insert               2.76    3.02       1.1
oltp_read_write           6.43   14.21       2.2
oltp_update_index         2.81    2.97       1.1
oltp_update_non_index     2.91    2.91       1.0
oltp_write_only           3.82    7.17       1.9
types_delete_insert       5.47    6.32       1.2
writes_mean_multiplier                       1.3

Overall Mean Multiple                        1.8
dolt - 1.16.3

Published by github-actions[bot] about 1 year ago

Merged PRs

dolt

  • 6699: go/libraries/doltcore/sqle: cluster: When a database is dropped, shutdown the replication goroutines and stop returning its status in the dolt_cluster_status table.
  • 6694: Support for fulltext index rewrites inline with table rewrites
    Also includes changes required from always closing table editors, even on an error.
    Companion PR with support for GMS table rewrites, and similar changes to GMS fulltext index management:
    https://github.com/dolthub/go-mysql-server/pull/2008

go-mysql-server

  • 2026: Do not error for SHOW view indexes
    Until we support view indexes, return nil for show view keys/indexes.
    fixes https://github.com/dolthub/dolt/issues/6705
  • 2025: Failure to push filter causes dropped filter
    We make a hard assumption during join planning that there are no errant filters in the join tree. Every filter is either a join edge, or sitting on its relation. When this is not true, the memo can generate a transitive edge between two relations that loses track of the original filter. The process for triggering this bug is 1) filter in an ON condition gets moved to the middle of the tree, 2) the filter fails to get pushed to its join edge/relation, 3) we generate a transitive join edge that loses track of that filter, and then 4) we choose the transitive join edge in costing.
    You'll see the filter restored in the integration query plans in the PR. I added a minimal repro with the appropriate ON conditions and forced a transitive edge that drops the filter if pushdown regresses in the future.
  • 2018: Avoid corrupting the privileges file
    Currently, the last of these commands results in a panic due to the revoke inserting a database with an empty string for a name. This is fairly awkward to test in GMS, so I'm going to take the easy route and create a bats test in dolt.
    lcl:~/Documents/data_dir_1/db3$ dolt init
    Successfully initialized dolt data repository.
    lcl:~/Documents/data_dir_1/db3$ dolt sql
    # Welcome to the DoltSQL shell.
    # Statements must be terminated with ';'.
    # "exit" or "quit" (or Ctrl-D) to exit.
    db3> CREATE USER 'testuser'@'localhost' IDENTIFIED BY 'password';
    Query OK, 0 rows affected (0.00 sec)
    db3> CREATE DATABASE foo;
    db3> GRANT INSERT ON foo.* TO 'testuser'@'localhost';
    Query OK, 0 rows affected (0.00 sec)
    db3> REVOKE INSERT ON foo.* FROM 'testuser'@'localhost';
    Query OK, 0 rows affected (0.00 sec)
    db3> SHOW GRANTS FOR testuser@localhost;
    panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x2 addr=0x40 pc=0x101940894]
    
  • 2008: Rewrote memory table editors to enable table rewrites
    This also fixes a number of bugs that were discovered during this process, notably not properly closing table editors in all instances.

Closed Issues

  • 6705: SHOW KEYS FROM only works for views
  • 6700: incorrect commit timestamps
  • 6648: Allow hex values for statement parameters that expect int
  • 6682: No support for TINYINT(1)?
  • 6690: support (IN)VISIBLE index feature
dolt - 1.16.2

Published by github-actions[bot] about 1 year ago

Merged PRs

dolt

  • 6688: Retain display width for TINYINT(1)
    MySQL allows integer fields to specify a display width (e.g. TINYINT(1)). Dolt currently parses that information, but doesn't retain it anywhere. This PR changes that behavior to match MySQL and retain the display width setting so that it can be passed back to callers.
    As of MySQL 8.1.0, the display width setting is ONLY retained for signed TINYINT fields and ONLY when the display width is set to 1.
    Fixes: https://github.com/dolthub/dolt/issues/6682
    GMS PR: https://github.com/dolthub/go-mysql-server/pull/2017

go-mysql-server

vitess

  • 275: Adding parser support for VISIBLE and INVISIBLE modifiers for indexes
    Fixes: https://github.com/dolthub/dolt/issues/6690
  • 274: Allow CREATE TABLE and ALTER TABLE to accept hexnum and float values when integers are expected.
    Fixes https://github.com/dolthub/dolt/issues/6644 and https://github.com/dolthub/dolt/issues/6648
    MySQL is permissive in what it expects for DDL statement parameters: in many places where ints are expected, the parser will also accept a hex number (and convert) or a float number (and truncate.)
    None of these values are currently used by GMS, so we don't need to add any additional processing logic. But the parser needs to accept them when they appear.
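The permissive behavior can be sketched as a small coercion helper. This is an illustration in Python, not the vitess grammar change; `coerce_ddl_int` is a hypothetical name, and it only models the convert and truncate rules stated above.

```python
def coerce_ddl_int(token: str) -> int:
    """Accept an integer, hex number, or float literal where an int is expected."""
    text = token.strip()
    if text.lower().startswith("0x"):
        return int(text, 16)     # hex number: convert
    try:
        return int(text)         # plain integer
    except ValueError:
        return int(float(text))  # float: truncate toward zero
```

For example, `coerce_ddl_int("0x10")` gives 16 and `coerce_ddl_int("3.9")` gives 3.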
  • 272: Set character set IDs to current MySQL version
    The character set values were set for MySQL 5.0, so they've been updated to the correct values for 8.0.
  • 271: parsing intersect and except
    This PR adds support for parsing the keywords INTERSECT and EXCEPT.
    These work similarly to UNION and accept the DISTINCT and ALL keywords.
    Additionally, there are new precedence tests; INTERSECT has a higher precedence than UNION and EXCEPT.
    The rest are parsed left to right.
    syntax for https://github.com/dolthub/dolt/issues/6643
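The precedence rule can be illustrated with a tiny evaluator over Python sets. This is a sketch of the semantics only, not the parser: `eval_set_ops` is a hypothetical helper, and plain sets model the DISTINCT form of each operator.

```python
def eval_set_ops(tokens):
    """Evaluate [set, op, set, op, ...] with INTERSECT binding tighter than
    UNION and EXCEPT, which are then applied left to right."""
    # Pass 1: fold every INTERSECT into its left operand.
    folded = [tokens[0]]
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        if op == "INTERSECT":
            folded[-1] = folded[-1] & operand
        else:
            folded += [op, operand]
    # Pass 2: apply UNION and EXCEPT left to right.
    result = folded[0]
    for op, operand in zip(folded[1::2], folded[2::2]):
        result = result | operand if op == "UNION" else result - operand
    return result
```

With a = {1, 2}, b = {2, 3}, c = {3, 4}, the expression a UNION b INTERSECT c evaluates as a UNION (b INTERSECT c) = {1, 2, 3}, whereas a strictly left-to-right reading would give {3}.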

Closed Issues

  • 6644: Cast floating point values for statement parameters that expect int
  • 6643: Implement EXCEPT and INTERSECT
  • 6683: Selecting from a view shows column names as all lowercase

Latency

Read Tests               MySQL    Dolt  Multiple
covering_index_scan       2.07    2.86       1.4
groupby_scan             12.98   17.95       1.4
index_join                1.27    4.74       3.7
index_join_scan           1.21    2.22       1.8
index_scan               33.12   58.92       1.8
oltp_point_select         0.14     0.4       2.9
oltp_read_only            2.66    7.17       2.7
select_random_points      0.31    0.72       2.3
select_random_ranges      0.37    0.95       2.6
table_scan               33.12   57.87       1.7
types_table_scan         75.82  167.44       2.2
reads_mean_multiplier                        2.2

Write Tests              MySQL    Dolt  Multiple
bulk_insert              0.001   0.001       1.0
oltp_delete_insert        5.47    5.67       1.0
oltp_insert               2.52    3.02       1.2
oltp_read_write           6.09   14.21       2.3
oltp_update_index         2.52    3.02       1.2
oltp_update_non_index     2.66    2.91       1.1
oltp_write_only           3.62    7.17       2.0
types_delete_insert       5.47    6.21       1.1
writes_mean_multiplier                       1.4

Overall Mean Multiple                        1.9
dolt - 1.16.1

Published by github-actions[bot] about 1 year ago

Merged PRs

dolt

go-mysql-server

  • 2016: Bug fix: Preserve column name case for views
    Fixes https://github.com/dolthub/dolt/issues/6683
    Dolt CI Checks: https://github.com/dolthub/dolt/pull/6684
  • 2015: GROUP BY identifiers should prefer binding to table columns over projections.
    This also means an expression is allowed to project from the GROUP BY column multiple times.
    Fixes https://github.com/dolthub/dolt/issues/6676
  • 2012: Insert on dup col ordinal bug
    A certain set of conditions causes an error for indexing on duplicate update expressions:
    • The source is a SELECT statement (not a VALUES row)
    • All columns are specified by the INSERT (not sure why, but partial columns seem to get rearranged correctly. I think we must insert a compensating projection to handle column defaults)
    • The source columns are not the same order as the destination table schema
    • On duplicate update expression references a column from the new row
      For the query below, we were indexing the on duplicate expression in the wrong order, causing the output row to be two zero types:
    create table xy (x int primary key, y datetime);
    insert into xy (y,x)
    select * from (select cast('2019-12-31T12:00:00Z' as date), 0) dt(a,b)
    on duplicate key update x=dt.b+1, y=dt.a;
    
    The way we resolve inserts is still a bit weird. We resolve the source, and then afterwards add a projection to rearrange columns to match the target schema. I ran into a lot of problems trying to rearrange that ordering (first add projection, then analyze), mostly due to our inability to fix indexes on the source node's projection (VALUE nodes don't have a schema, and it isn't obvious when walking a tree that a given projection is going to be special). When we add the projection afterwards, however, it avoids the indexing rule so we can inline the values safely.
    My current fix is to mimic the projection mapping inside indexing: index the duplicate expression values based on the ordinal of the destination schema. LOAD DATA for some reason needs its insert columns to not be specified, which will probably be the source of different issues at some point.
    fixes: https://github.com/dolthub/dolt/issues/6675
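The ordinal mapping described above can be sketched as follows. This is a simplified Python model, not the GMS indexing rule: `on_duplicate_indexes` is a hypothetical helper that treats schemas and column lists as lists of names.

```python
def on_duplicate_indexes(dest_schema, insert_columns, source_columns):
    """Map each source column name to its position in the row after the
    source has been rearranged to match the destination schema."""
    # insert_columns[i] receives source_columns[i].
    source_for_dest = dict(zip(insert_columns, source_columns))
    return {source_for_dest[col]: i
            for i, col in enumerate(dest_schema) if col in source_for_dest}
```

For the query above, the destination schema is (x, y), the insert columns are (y, x), and the source columns are (a, b). The helper returns `{"b": 0, "a": 1}`, so `dt.b` must read position 0 and `dt.a` position 1 of the new row.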

Closed Issues

  • 6676: MySQL allows duplicate column names, but Dolt doesn't
  • 6675: dolt table import -u fails to properly parse order of columns
  • 6663: Dolt uses incorrect type for result of SUM and AVG

Latency

Read Tests               MySQL    Dolt  Multiple
covering_index_scan       2.11    2.91       1.4
groupby_scan             12.98   17.95       1.4
index_join                1.27    4.74       3.7
index_join_scan           1.21    2.22       1.8
index_scan               32.53   57.87       1.8
oltp_point_select         0.14     0.4       2.9
oltp_read_only            2.71     7.3       2.7
select_random_points      0.31    0.72       2.3
select_random_ranges      0.37    0.95       2.6
table_scan               33.12   57.87       1.7
types_table_scan         74.46  167.44       2.2
reads_mean_multiplier                        2.2

Write Tests              MySQL    Dolt  Multiple
bulk_insert              0.001   0.001       1.0
oltp_delete_insert        4.74    5.57       1.2
oltp_insert               2.35    2.76       1.2
oltp_read_write           5.99   13.95       2.3
oltp_update_index          2.3    2.81       1.2
oltp_update_non_index      2.3    2.71       1.2
oltp_write_only            3.3    7.04       2.1
types_delete_insert       4.65    6.09       1.3
writes_mean_multiplier                       1.4

Overall Mean Multiple                        1.9
dolt - 1.16.0

Published by github-actions[bot] about 1 year ago

This release contains backwards incompatible changes:

  • Branch control permissions are now replicated from the primary to all standby servers when using cluster replication.

Merged PRs

dolt

  • 6678: Hash-qualified DB errors on first reference if hash doesn't exist
  • 6669: go: sqle: cluster: First pass at replicating branch control permissions.
  • 6657: Delete docstring for push
  • 6652: migrates fetch to use sql queries
    This change updates dolt fetch to use the appropriate sql engine to generate results.
    Related: https://github.com/dolthub/dolt/issues/3922

go-mysql-server

  • 2014: Improve message from json_extract() when json path doesn't start with a '$'
  • 2013: Change Project nodes so they can't return negative zero.
    In some cases we differ from MySQL in what types we use for intermediate results. This doesn't usually affect the final output, and can be more performant.
    But if the final result of an expression is a float when MySQL deduces it to be a decimal, and we don't do anything else to the value that would cause it to be coerced (such as inserting it into a table with a schema), then we could end up displaying a result of the wrong type to the user. Usually this doesn't matter, unless that result is the float value -0 when the user expects a decimal.
    Ideally we'd prefer to detect the expected type and do a cast, but this is an acceptable stopgap measure.
  • 2012: Insert on dup col ordinal bug
    A certain set of conditions causes an error for indexing on duplicate update expressions:
    • The source is a SELECT statement (not a VALUES row)
    • All columns are specified by the INSERT (not sure why, but partial columns seem to get rearranged correctly. I think we must insert a compensating projection to handle column defaults)
    • The source columns are not the same order as the destination table schema
    • On duplicate update expression references a column from the new row
      For the query below, we were indexing the on duplicate expression in the wrong order, causing the output row to be two zero types:
    create table xy (x int primary key, y datetime);
    insert into xy (y,x)
    select * from (select cast('2019-12-31T12:00:00Z' as date), 0) dt(a,b)
    on duplicate key update x=dt.b+1, y=dt.a;
    
    The way we resolve inserts is still a bit weird. We resolve the source, and then afterwards add a projection to rearrange columns to match the target schema. I ran into a lot of problems trying to rearrange that ordering (first add projection, then analyze), mostly due to our inability to fix indexes on the source node's projection (VALUE nodes don't have a schema, and it isn't obvious when walking a tree that a given projection is going to be special). When we add the projection afterwards, however, it avoids the indexing rule so we can inline the values safely.
    My current fix is to mimic the projection mapping inside indexing. Index the duplicate expression values based on the ordinal of the destination schema. LOAD DATA for some reason needs its insert columns to not be specified, which will probably be the source of different issues at some point.
    fixes: https://github.com/dolthub/dolt/issues/6675
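The projection mapping mentioned in 2012 can be pictured with a small Go sketch. It is a simplified illustration under stated assumptions, not the analyzer code: destOrdinals and project are hypothetical names, and columns are plain strings. It shows how the INSERT column list (y, x) is mapped onto the destination schema (x, y) from the example query.

```go
package main

import "fmt"

// destOrdinals returns, for each column of the destination schema, the
// position of that column in the INSERT column list, or -1 if it is absent.
// Hypothetical helper, for illustration only.
func destOrdinals(destSchema, insertCols []string) []int {
	pos := make(map[string]int, len(insertCols))
	for i, c := range insertCols {
		pos[c] = i
	}
	out := make([]int, len(destSchema))
	for i, c := range destSchema {
		if p, ok := pos[c]; ok {
			out[i] = p
		} else {
			out[i] = -1
		}
	}
	return out
}

// project rearranges a source row into destination-schema order.
// Columns missing from the INSERT list are left as nil.
func project(row []interface{}, mapping []int) []interface{} {
	out := make([]interface{}, len(mapping))
	for i, src := range mapping {
		if src >= 0 {
			out[i] = row[src]
		}
	}
	return out
}

func main() {
	dest := []string{"x", "y"}       // create table xy (x ..., y ...)
	insertCols := []string{"y", "x"} // insert into xy (y,x)
	mapping := destOrdinals(dest, insertCols)
	fmt.Println(mapping) // [1 0]

	sourceRow := []interface{}{"2019-12-31", 0} // (a, b) from the subquery
	fmt.Println(project(sourceRow, mapping))    // [0 2019-12-31]
}
```

Expressions that reference the new row have to be indexed with the same mapping; otherwise they read the wrong position, which matches the symptom described above.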
  • 2009: Re-enable query logging by default for DEBUG log level
  • 2007: Fixed character set IDs
    Character set IDs should correlate with their default collation's ID. Previously, they were arbitrarily assigned by sorting their names alphabetically. This should not be a breaking change for anyone, as the comment on the CharacterSetID mentions that the ID may change, and should not be persisted. Dolt, the largest integrator, abides by this rule.
  • 2006: Semi join and FDs bug
    Returning no projections from a table causes "column not found" errors when we try to reference those expressions higher in the tree. This fixes the semi join transform to stop creating empty projections.
    This fixes two bugs. The first is that we were too conservative checking whether index keys were strict FDs for a join relation. When a relation has a constant applied to a primary key, we can assume all of the columns returned by that join will be constant. Fixing that made it easier to test certain semi -> right lookup join transforms which were buggy. For the same case, when we are doing a lookup into a table where a constant filter satisfies an index key, we still need to return a projection set that covers non-pruneable columns used in higher-level nodes.
  • 2004: Enable use of slices in tuples for HashLookups
    Currently FusionAuth crashes Dolt with the following error:
    panic: runtime error: hash of unhashable type []uint8
    
    All FusionAuth IDs are binary(16), and joining on those values in a HashLookup resulted in two []uint8 values being used as a key to a hashtable. Nested arrays in tuples were tripping on an optimization made for short arrays. We've verified that optimization doesn't actually make a difference, so this change simplifies the code and makes it more generic.
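The panic quoted in 2004 is standard Go behavior: a slice stored in an interface value cannot be hashed as a map key. The sketch below reproduces it and shows one common workaround, converting the bytes to a string. tryInsert and hashableKey are hypothetical names, and the string conversion is an assumption for illustration, not necessarily what the PR does.

```go
package main

import "fmt"

// tryInsert uses key as a map key and reports whether doing so panicked.
func tryInsert(key interface{}) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	m := map[interface{}]bool{}
	m[key] = true // panics at runtime if key holds a slice
	return false
}

// hashableKey converts byte slices to strings so they can be hashed.
// Other values are returned unchanged. Hypothetical helper.
func hashableKey(v interface{}) interface{} {
	if b, ok := v.([]byte); ok {
		return string(b)
	}
	return v
}

func main() {
	id := []byte{0x01, 0x02, 0x03} // stands in for a binary(16) value
	fmt.Println(tryInsert(id))              // true
	fmt.Println(tryInsert(hashableKey(id))) // false
}
```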
  • 2000: adding new ways to say varchar
    This PR makes it so the engine recognizes more ways to specify that a column is of type VARCHAR
    Companion PR: https://github.com/dolthub/vitess/pull/270
    Fixes https://github.com/dolthub/dolt/issues/6650
  • 1996: Move join indexing after all
    This should put almost all indexing logic into one rule that runs once at the end of analysis for a given query. It should require one walk of the tree, be much more correct for nested join and subquery indexing, and allow us to add nodes with special indexing logic much more easily.
    Summary:
    • new rule fixupIndexes replaces all of the other default indexing code
    • FixFieldIndexes still exists for insert source projection wrapping and LOAD DATA, but both of these can be easily rewritten to remove it
    • CheckConstraintTable interface for Checks() and WithChecks() helpers
      The way the new fixup works is to traverse the tree in 3 phases. Each phase has a default mode but can be handled case-by-case for special node logic:
      (1) Descend child nodes of the current node. Collect children and child "scopes" that contain schema info
      (2) Index the current node's expressions.
      (3) Re-build the current node, and fix its "schema" to pass upwards into a parent
      The walk and scopes used for indexing should mirror exactly what we do on the execution side. As a result, some of the nutty logic for nodes could be standardized by changing what we do at execution time.
      note: The initial implementation is not particularly memory-efficient. Currently trying to filter for Dolt-side uses of old indexing functions that would need to be refactored.
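The three phases described for 1996 can be sketched on a toy plan tree. This is a rough illustration under stated assumptions, not the fixupIndexes rule itself: the node type, its fields, and the use of qualified column names as strings are all hypothetical.

```go
package main

import "fmt"

// node is a toy plan node. All names here are hypothetical.
type node struct {
	name     string
	exprs    []string // columns referenced by this node's expressions
	schema   []string // output columns; nil means "pass the child scope through"
	children []*node
	indexes  []int // set by fixup: ordinal of each expr within the child scope
}

// indexOf returns the position of col in scope, or -1 if it is missing.
func indexOf(scope []string, col string) int {
	for i, c := range scope {
		if c == col {
			return i
		}
	}
	return -1
}

// fixup walks the tree bottom-up and returns the schema passed to the parent.
func fixup(n *node) ([]string, error) {
	// (1) Descend into children and collect their schemas into one scope.
	var scope []string
	for _, c := range n.children {
		s, err := fixup(c)
		if err != nil {
			return nil, err
		}
		scope = append(scope, s...)
	}
	// (2) Index the current node's expressions against that scope.
	n.indexes = n.indexes[:0]
	for _, e := range n.exprs {
		idx := indexOf(scope, e)
		if idx < 0 {
			return nil, fmt.Errorf("%s: column %q not found", n.name, e)
		}
		n.indexes = append(n.indexes, idx)
	}
	// (3) Fix the node's schema and pass it upward.
	if n.schema == nil {
		n.schema = scope
	}
	return n.schema, nil
}

func main() {
	a := &node{name: "a", schema: []string{"a.x", "a.y"}}
	b := &node{name: "b", schema: []string{"b.x", "b.z"}}
	join := &node{name: "join", exprs: []string{"a.x", "b.x"}, children: []*node{a, b}}
	proj := &node{name: "project", exprs: []string{"b.z", "a.y"},
		schema: []string{"b.z", "a.y"}, children: []*node{join}}

	if _, err := fixup(proj); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(join.indexes, proj.indexes) // [0 2] [3 1]
}
```

Because every node is indexed against the scope its children actually return, a single walk is enough, which is the property the PR description is after.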
  • 1995: Added option to change protocol listener
    This allows plugging in other protocol listeners besides the default MySQL one.
  • 1993: JSON Array Mutation
    The json_array_insert() and json_array_append() methods are in the family of other mutation functions (set,insert,replace,remove).
  • 1989: Improve the efficiency of newIndexAnalyzerForNode by avoiding visiting Filter nodes.
    This prevents super-linear runtime in generateIndexScans

vitess

Closed Issues

  • 6675: dolt table import -u fails to properly parse order of columns
  • 6627: Amending commits on a primary breaks remote based replication
  • 6650: Support all the different ways to specify character types
dolt - 1.15.0

Published by github-actions[bot] about 1 year ago

This release contains backwards incompatible changes:

  • Servers running as remote replicas now pull divergent remote heads (e.g. those pushed with dolt push -f) by default. This setting is controlled by the server variable @@dolt_read_replica_force_pull. This variable previously had a default of 0, and now defaults to 1. To restore the previous behavior on remote replicas, run set @@persist.dolt_read_replica_force_pull = 0 and restart the server.
  • Certain stored procedures have a different result schema. dolt_gc and dolt_backup previously returned 1 for success and 0 for failure, and this has been reversed. Additionally, dolt_backup, dolt_fetch, dolt_gc, and dolt_push have changed the name of the column in their result schema, from success to status.

Per Dolt’s versioning policy, this is a minor version bump because applications written for older versions of dolt may need changes to continue functioning the same.
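For applications that must work against servers on both sides of this change, one option is to interpret the result by column name. The Go sketch below is a minimal illustration: procSucceeded is a hypothetical helper, and it assumes the caller has already read the single result column's name and integer value from a call such as dolt_gc or dolt_backup.

```go
package main

import (
	"fmt"
	"strings"
)

// procSucceeded interprets the result of a stored procedure call based on
// the result column's name. Hypothetical helper, for illustration only.
func procSucceeded(column string, value int64) (bool, error) {
	switch strings.ToLower(column) {
	case "status": // 1.15.0 and later: 0 means success
		return value == 0, nil
	case "success": // before 1.15.0: 1 means success
		return value == 1, nil
	default:
		return false, fmt.Errorf("unexpected result column %q", column)
	}
}

func main() {
	ok, _ := procSucceeded("status", 0)
	fmt.Println(ok) // true
	ok, _ = procSucceeded("success", 0)
	fmt.Println(ok) // false
}
```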

Merged PRs

dolt

  • 6654: Standardize stored procedure success output messages
    Standardizes stored procedures that report a success/failure message to print Status of 0 on success and Status of 1 on failure. Changes from the old output of printing Success of 1 on success and Success of 0 on failure.
  • 6651: fixes log panic
    Fixes panic when calling log in non-dolt repo by adding missing return statement.
  • 6647: fix formatting issue in status
    fixes missing line break in dolt status when up to date with remote
  • 6635: Change default of replication pull to force
    Fixes https://github.com/dolthub/dolt/issues/6627
    This change alters the default of @@dolt_read_replica_force_pull from 0 to 1. By default, read replicas will now always pull from the remote, even when the current head has diverged (such as in the case of a commit --amend or push -f to the remote).
    Also changes dolt pull to automatically perform merges on the remote tracking branch as needed -- previously we required that this be a fast-forward merge only, which means that you couldn't pull down a force-pushed commit without a lot of other commands and headache.
    Also fixes a bug where boolean values couldn't be persisted with on or off.

Closed Issues

  • 6653: Parse unique parameters for MySQL's MERGE engine.
  • 6627: Amending commits on a primary breaks remote based replication
  • 6645: Support TABLE_CHECKSUM as an alias for CHECKSUM
  • 6632: Parse SECONDARY_ENGINE parameter in queries
  • 6642: Support MySQL TABLE statement.
  • 6585: Support for JSON modification functions

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 2.07 2.91 1.4
groupby_scan 13.22 18.28 1.4
index_join 1.27 4.74 3.7
index_join_scan 1.21 2.22 1.8
index_scan 33.12 58.92 1.8
oltp_point_select 0.14 0.39 2.8
oltp_read_only 2.71 7.17 2.6
select_random_points 0.31 0.7 2.3
select_random_ranges 0.37 0.95 2.6
table_scan 33.12 58.92 1.8
types_table_scan 74.46 167.44 2.2
reads_mean_multiplier 2.2
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 5.77 5.99 1.0
oltp_insert 2.71 3.13 1.2
oltp_read_write 6.43 14.21 2.2
oltp_update_index 2.66 3.07 1.2
oltp_update_non_index 2.97 2.97 1.0
oltp_write_only 3.82 7.17 1.9
types_delete_insert 6.43 6.32 1.0
writes_mean_multiplier 1.3
Overall Mean Multiple 1.8
dolt - 1.14.1

Published by github-actions[bot] about 1 year ago

Merged PRs

dolt

  • 6637: Bug fix: Check constraint merging
    We had a typo in the check constraint merging code that was only triggered when an existing check constraint was altered on the "source" side of the merge and the "destination" side of the merge had the exact same definition as the common ancestor. The new test shows a concrete example. Note that the schema merge tests run in both directions, from left to right and from right to left, so they are better at catching bugs like this that only appear in one merge direction.
  • 6631: go: sqle: async replication: Fix a bug in the logic which decides what to push to stop pushing deleted branches after we push them once.
  • 6630: go: sqle/cluster: mysqldb_persister.go: Add exponential backoff to users and grants replication to avoid spamming logs.
  • 6622: allows dolt log outside dolt repos
    Allows dolt log to be called outside of dolt repos
  • 6621: fixes profiles not working with multiple profiles added
    Fixes an issue with adding and using multiple profiles

go-mysql-server

  • 1994: Fixes for subquery indexing/correlation tracking
    I missed a place where we were using getField indexes to do a correlation check. This adds the changes necessary to replace that check with one that uses a subquery's tracked correlation column set. This also adds an expression id column to GetField expression that preserves the id tracking information between iterative passes of the analyzer. It would probably be preferable to avoid all cases where we unnecessarily re-run rules, but this is more near at hand.
  • 1993: JSON Array Mutation
    The json_array_insert() and json_array_append() methods are in the family of other mutation functions (set,insert,replace,remove).
  • 1992: Delete fixidx from pushdown
    Summary:
    • delete fixidx usages in pushdownFilters and generateIndexScans
    • use a mapping on subquery aliases when pushing filters through subqueries
    • better max1Row memo now
    • misc changes to re-indexing to compensate for work previously done during pushdown
  • 1991: Fix show processlist panic
    close https://github.com/dolthub/dolt/issues/6625
  • 1990: Cast limit/offset to int type for prepared path
    fixes: https://github.com/dolthub/dolt/issues/6610
  • 1988: JSON Mutation functions implemented as stored procedures
    Earlier PR (https://github.com/dolthub/go-mysql-server/pull/1983) enabled us to modify json document objects, but none of that functionality was exposed as actual JSON_* functions. This change ties it together. The following functions will behave identically to MySQL (to the best of my knowledge).
  • 1987: Move JSON Functions into their own sub directory
    Purely mechanical refactor done by the IDE. This is in preparation for adding support for several more JSON functions.
  • 1986: When pushing down filters, ensure removal of original filter.
    In cases where the filter expression changed during push down because the column IDs changed, we were accidentally checking for expressions with the new column IDs, not the old ones, so the old filter expressions weren't being removed.
  • 1985: Fix panic in merge join when using custom Indexes that don't allow range lookups.
    For instance, dolt has Commit indexes for tables that use commit hash as an index, but ranges don't make sense for those.
    There's no equivalent in GMS, so I created "point_lookup_table" table function for use in tests.
  • 1984: Partially reorder indexing rules; use unique ids rather than execution ids
    Name binding stores caching information upfront. Rule interdependencies pushed me into fixing a bunch of other rules before tests would pass. All together I think most of the changes are simplifications that I was planning on doing related to the fixidx refactor. I was hoping to make it more piecemeal. Hopefully this gets us ~50% of the way towards removing those dependencies.
    fixidx is mostly contained to reorderJoins and fixAuxiliaryExpressions now, both near the end of analysis. If we move the indexing in reorderJoins into fixAuxiliaryExpressions, all indexing will happen at the end of analysis. That would let us index complicated joins with subqueries correctly and all queries more reliably.
    summary:
    • rewrite cacheability to use correlated column references
    • volatile functions now prevent caching
    • rewrite moveFiltersOutOfJoinConditions to put filters below join when appropriate
    • subquery decorrelation uses (and updates) correlated column references
    • alias subquery strings simplified to use the query string, not the plan string
    • fix jsonTable and lateral join analysis
    • fixAuxiliaryExpressions at end of analysis
    • recursive analyzer rules (insert, trigger, procedure) are all at end of analysis now
  • 1983: JSON Mutation
    Add JSON Value mutation operations and tests. These changes do not alter the operation of dolt in any way yet - that will come in a second PR which updates the JSON_SET procedure, and adds support for the JSON_REPLACE, JSON_INSERT, and JSON_REMOVE procedures at the same time. This is laying the foundation for that work.
  • 1982: Inline flatten aliases
    This will skip a tree walk for most queries, inlining the rule in the places where nested table aliases can occur during binding.
  • 1981: Make the IntSequence test function consistently use int64 for its generated values
    The previous implementation had an issue where it assumed the type used in the received IndexLookup, but this type can actually depend on exactly how the lookup was generated (and whether the bounds value was parsed from the query or generated internally.) This caused a panic if it was used in Lookup joins.
    This makes no such assumptions and adds extra tests.
  • 1934: New Merge Join planner
    This should fix https://github.com/dolthub/dolt/issues/6020 once finished, and then some.
    The killer new feature in this new join planner is "Multi-Column Merge Joins", that is, merges where the comparison used for merges incorporates multiple filter conditions. This allows us to, in some cases, choose a much more selective index for merge joins. This improves both memory usage and performance because there will be fewer cases where the join iterator needs to keep multiple secondary rows in memory and cross-join them with multiple primary rows.
    The algorithm goes like this:
    • For each index on the left table:
      • Compute the max set of filter expressions that match that index
      • Check to see if any indexes on the right table match that same set of filters in the same order.
      • If so, use this set of filter expressions to generate a Merge Join plan. If there are multiple expressions, we combine them into a comparison on tuples.
      • Remove the last filter expression and check again; repeating until the "matched filters" list is empty.
    I added a test in join_planning_tests that demonstrates the potential of this new algorithm, allowing us to select a better index than otherwise allowed.
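The search loop described for 1934 can be sketched in Go as follows. This is a simplified model under stated assumptions, not the planner's code: indexes are ordered lists of column names, filters are equalities between one left and one right column, and all type and function names are hypothetical. The sketch emits a candidate for every matching prefix; the real planner may stop earlier or cost the candidates differently.

```go
package main

import "fmt"

// filter models an equality condition left = right between two columns.
type filter struct{ left, right string }

// plan records which indexes and filters a candidate merge join would use.
type plan struct {
	leftIdx, rightIdx int
	filters           []filter
}

// matchLeft returns the longest run of filters that follows the index's
// column order, starting from the first index column.
func matchLeft(idx []string, filters []filter) []filter {
	var matched []filter
	for _, col := range idx {
		found := false
		for _, f := range filters {
			if f.left == col {
				matched = append(matched, f)
				found = true
				break
			}
		}
		if !found {
			break
		}
	}
	return matched
}

// rightMatches reports whether idx starts with the right-hand columns of
// the matched filters, in the same order.
func rightMatches(idx []string, matched []filter) bool {
	if len(idx) < len(matched) {
		return false
	}
	for i, f := range matched {
		if idx[i] != f.right {
			return false
		}
	}
	return true
}

// mergeJoinPlans enumerates candidate plans, trying the widest filter set
// for each left index first and then dropping the last filter.
func mergeJoinPlans(leftIdxs, rightIdxs [][]string, filters []filter) []plan {
	var plans []plan
	for li, l := range leftIdxs {
		matched := matchLeft(l, filters)
		for len(matched) > 0 {
			for ri, r := range rightIdxs {
				if rightMatches(r, matched) {
					used := append([]filter(nil), matched...) // copy before shrinking
					plans = append(plans, plan{leftIdx: li, rightIdx: ri, filters: used})
				}
			}
			matched = matched[:len(matched)-1]
		}
	}
	return plans
}

func main() {
	left := [][]string{{"a"}, {"a", "b"}}
	right := [][]string{{"x", "y"}}
	filters := []filter{{"a", "x"}, {"b", "y"}}
	for _, p := range mergeJoinPlans(left, right, filters) {
		fmt.Printf("left index %d, right index %d, %d filter(s)\n",
			p.leftIdx, p.rightIdx, len(p.filters))
	}
}
```

In this toy input the two-column left index yields a plan that compares (a, b) with (x, y) as a tuple, which is the multi-column case the PR highlights.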

Closed Issues

  • 6633: DBeaver loses connection after calling DOLT_GC()
  • 6536: Dolt performance comparison to postgres and mysql
  • 6625: SHOW FULL PROCESSLIST; causes panic
  • 6610: invalid type: double since 1.13.5

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 2.07 2.97 1.4
groupby_scan 13.22 17.95 1.4
index_join 1.27 4.65 3.7
index_join_scan 1.21 2.22 1.8
index_scan 33.12 57.87 1.7
oltp_point_select 0.14 0.39 2.8
oltp_read_only 2.71 7.17 2.6
select_random_points 0.31 0.7 2.3
select_random_ranges 0.37 0.95 2.6
table_scan 33.12 57.87 1.7
types_table_scan 75.82 167.44 2.2
reads_mean_multiplier 2.2
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 4.82 5.67 1.2
oltp_insert 2.3 2.81 1.2
oltp_read_write 5.99 13.95 2.3
oltp_update_index 2.43 2.86 1.2
oltp_update_non_index 2.52 2.81 1.1
oltp_write_only 3.36 6.91 2.1
types_delete_insert 4.65 5.88 1.3
writes_mean_multiplier 1.4
Overall Mean Multiple 1.9
dolt - 1.14.0

Published by github-actions[bot] about 1 year ago

This release contains backwards incompatible changes:

  • dolt conflicts resolve and dolt_conflicts_resolve() have been temporarily changed to disallow automatically resolving schema conflicts until a new merge workflow is released that ensures all data is properly merged after schema conflicts are resolved. Until then, when there are schema conflicts, the merge can be aborted, schemas manually brought into sync, and then re-merged. For more details, see tracking issue: https://github.com/dolthub/dolt/issues/6616

Per Dolt’s versioning policy, this is a minor version bump because previous versions of the dolt conflicts resolve CLI command and the dolt_conflicts_resolve() stored procedure would allow merging schema changes without merging data changes.

Merged PRs

dolt

  • 6619: Temporarily restrict dolt_conflicts_resolve() from resolving schema conflicts
    Until we fix the workflow around merging data after schema conflicts have been resolved, customers may be surprised by the results of dolt_conflicts_resolve() when there are schema changes, so putting this block in out of extra caution.
    For more details, see issue https://github.com/dolthub/dolt/issues/6616
  • 6607: sqle: cluster: Set the engine to read-only when a replica is in standby mode. Set it back to read-write when it becomes primary.
    This prevents standby replicas from running some DDL which they were previously erroneously allowed to run, including CREATE USER, GRANT, CREATE DATABASE and DROP DATABASE.

Closed Issues

  • 6020: Choose most selective index for Merge Join.

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 2.07 2.97 1.4
groupby_scan 12.98 17.95 1.4
index_join 1.3 4.74 3.6
index_join_scan 1.23 2.26 1.8
index_scan 33.12 58.92 1.8
oltp_point_select 0.14 0.4 2.9
oltp_read_only 2.66 7.3 2.7
select_random_points 0.3 0.7 2.3
select_random_ranges 0.37 1.03 2.8
table_scan 33.12 58.92 1.8
types_table_scan 74.46 170.48 2.3
reads_mean_multiplier 2.3
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 5.28 5.77 1.1
oltp_insert 2.76 2.86 1.0
oltp_read_write 6.55 14.21 2.2
oltp_update_index 2.86 3.02 1.1
oltp_update_non_index 2.86 2.97 1.0
oltp_write_only 3.96 7.17 1.8
types_delete_insert 5.18 6.09 1.2
writes_mean_multiplier 1.3
Overall Mean Multiple 1.9