dolt

Dolt – Git for Data

APACHE-2.0 License

Downloads: 2.4K
Stars: 17.1K
Committers: 143

dolt - 1.29.5

Published by github-actions[bot] 10 months ago

Merged PRs

dolt

  • 7159: Bug fix: dolt_blame_<tablename> view should backtick quote identifiers
    Fixes the dolt_blame_<tablename> view definitions so that table names and primary key column names are backtick quoted. Without this, customers can't use the dolt_blame_<tablename> views if the table name or primary key columns contain characters (e.g. "-") that require identifier quoting.
  • 7151: go: Migrate to always use Try accessors on flatbuffer submessage access.
  • 7145: fix column_diff_table for modified json columns
    JSON types are represented as map[string]interface{}, which panics in Go when compared with the != operator.
    This PR changes the logic to use our gmstypes.Compare() logic instead.
    Fixes https://github.com/dolthub/dolt/issues/7140
  • 7143: Poorly formatted version doesn't error
    Fixes an issue where bad data in version_check.txt causes an error. This change makes dolt version overwrite any bad data in this file instead.
    Fixes: https://github.com/dolthub/dolt/issues/7138
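For the identifier quoting fix in PR 7159 above, a minimal Python sketch of MySQL-style backtick quoting may help. The helper names `quote_identifier` and `select_from_blame_view` are hypothetical and are not taken from Dolt's code. The only rule assumed is the MySQL convention that a backtick inside a quoted identifier is written as two backticks.

```python
def quote_identifier(name: str) -> str:
    """Wrap a MySQL identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def select_from_blame_view(table: str) -> str:
    # Hypothetical query builder: the view name follows the
    # dolt_blame_<tablename> pattern mentioned in the release note.
    return "SELECT * FROM " + quote_identifier("dolt_blame_" + table)


print(select_from_blame_view("my-table"))
```

Without the quoting step, a table name such as my-table would be read as the expression `my - table`, which is why the unquoted view definitions failed for such names.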

go-mysql-server

Closed Issues

  • 7139: panic: interface conversion: sql.Node is *plan.Filter, not *plan.RangeHeap
  • 7138: dolt version "broken" (?) in 1.29.3
  • 7147: Unexpected Results of In expressions
  • 7140: Panic when trying to filter dolt_column_diff table

Latency

Read Tests               MySQL    Dolt     Multiple
covering_index_scan      2.14     2.86     1.3
groupby_scan             12.98    17.63    1.4
index_join               1.34     5.0      3.7
index_join_scan          1.25     2.14     1.7
index_scan               34.33    55.82    1.6
oltp_point_select        0.17     0.43     2.5
oltp_read_only           3.3      7.7      2.3
select_random_points     0.32     0.73     2.3
select_random_ranges     0.38     0.87     2.3
table_scan               34.33    55.82    1.6
types_table_scan         74.46    158.63   2.1
reads_mean_multiplier                      2.1

Write Tests              MySQL    Dolt     Multiple
oltp_delete_insert       5.28     5.88     1.1
oltp_insert              2.66     2.86     1.1
oltp_read_write          7.17     14.73    2.1
oltp_update_index        2.66     3.02     1.1
oltp_update_non_index    2.71     2.97     1.1
oltp_write_only          3.89     7.17     1.8
types_delete_insert      5.28     6.43     1.2
writes_mean_multiplier                     1.4

Overall Mean Multiple                      1.7
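
The Multiple column in the latency tables appears to be the Dolt latency divided by the MySQL latency, rounded to one decimal place. That reading is inferred from the published numbers and is not stated in the release notes. A small Python sketch, using three rows from the 1.29.5 read tests:

```python
def multiple(mysql_latency: float, dolt_latency: float) -> float:
    """Dolt latency relative to MySQL, rounded to one decimal place (assumed)."""
    return round(dolt_latency / mysql_latency, 1)


# (test name, MySQL latency, Dolt latency) copied from the 1.29.5 read tests.
rows = [
    ("covering_index_scan", 2.14, 2.86),
    ("index_join", 1.34, 5.0),
    ("oltp_point_select", 0.17, 0.43),
]

for name, mysql, dolt in rows:
    print(name, multiple(mysql, dolt))
```

A multiple above 1.0 means Dolt was slower than MySQL on that test, and a multiple below 1.0 means it was faster.
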
dolt - 1.29.4

Published by github-actions[bot] 10 months ago

Merged PRs

dolt

go-mysql-server

  • 2196: Prepend subquery scope to sql.TableFunction nodes
    Most but not all table functions implement sql.Table. Table functions that do not implement sql.Table still need to return prepended rows to maintain indexing rules.
  • 2195: fix type promotion for in expressions
    TODO: check type promotion for int -> float/decimal for all expressions
    fixes https://github.com/dolthub/dolt/issues/7120
  • 2193: Set the original_name field in response metadata in addition to the name field
    A customer reported that the MySQL C++ Connector library was unable to retrieve column name information from a Dolt sql-server. After looking at the two wire captures between MySQL and Dolt, this is because the MySQL C++ Connector library pulls the column name from the original_name field, not from the name field.
    I've updated the unit tests that assert the expected response metadata fields are populated, and I'll follow up next with some changes in the Dolt repo to our C++ Connector library acceptance tests so that they use response metadata and assert that it is filled in.
    After that, it would be good to proactively look at any other response metadata fields that we aren't setting. For example, the Flags field seems important to fill in correctly for tooling to use.

Closed Issues

  • 7133: dolt login should give better error message if it can't open a browser
  • 7120: Unexpected Results when Using IN for Floating-Point
  • 7040: Support unique indexes on TEXT fields without a prefix length (MariaDB compatibility)

Latency

Read Tests               MySQL    Dolt     Multiple
covering_index_scan      2.18     2.86     1.3
groupby_scan             12.98    17.32    1.3
index_join               1.37     5.0      3.6
index_join_scan          1.27     2.14     1.7
index_scan               33.72    55.82    1.7
oltp_point_select        0.17     0.43     2.5
oltp_read_only           3.3      7.7      2.3
select_random_points     0.32     0.72     2.2
select_random_ranges     0.38     0.86     2.3
table_scan               34.33    55.82    1.6
types_table_scan         74.46    161.51   2.2
reads_mean_multiplier                      2.1

Write Tests              MySQL    Dolt     Multiple
oltp_delete_insert       5.57     5.99     1.1
oltp_insert              2.76     2.91     1.1
oltp_read_write          7.3      14.73    2.0
oltp_update_index        2.76     3.07     1.1
oltp_update_non_index    2.81     2.97     1.1
oltp_write_only          3.96     7.17     1.8
types_delete_insert      5.37     6.55     1.2
writes_mean_multiplier                     1.3

Overall Mean Multiple                      1.7
dolt - 1.29.3

Published by github-actions[bot] 11 months ago

Merged PRs

dolt

  • 7126: fixes dolt version out of date warning
    Changes the dolt version out-of-date warning to check against GitHub when the current build version is ahead of the stored latest release version. Fixes inconsistencies where the current build is running a version that was released within the past week.
  • 7125: Clear out reflog contents consistently after GC
    When GC is executed, the in-memory reflog data buffer was being cleared out so that only the one, most recent entry was kept. For a sql-server, this means you can log back in and see one entry in the reflog. For a sql CLI command, since it's a new process running now, it doesn't have the reflog data buffer in memory anymore, so it has an empty reflog. This meant there was a slight behavior difference between using GC and checking the reflog depending on whether you are connecting to a sql-server or using the sql CLI (or silently connecting to a sql-server through the sql CLI command when running in local-remote mode).
    To smooth this small inconsistency out, the reflog data buffer is now completely cleared out during GC.
  • 7123: dolt table import: json,csv: Support BOM file headers.
    The semantics are as follows:
    For CSV files, the default import is an uninterpreted character encoding where newline has to match 0xa and the delimiters have to match. In general Dolt expects UTF8, but non-UTF8 characters in string fields can make it through to the imported table for encodings which are close enough to ASCII, for example. If there is a UTF8, UTF16LE or UTF16BE BOM header, then character decoding of the input stream switches to the indicated encoding.
    For JSON files, the default import is UTF8 character encoding. If there is a UTF8, UTF16LE or UTF16BE BOM header, then character decoding of the input stream switches to the indicated encoding.
  • 7118: Allow automatic merging in the presence of collation changes.
    This allows automatic merging in the case where:
    • One branch changes the collation of the column.
    • The other branch modifies cells in that column.
    It's still a requirement that only one branch is allowed to modify the column definition. So for instance, if one branch changes the collation, and the other branch widens the column, that will still be a schema merge conflict. There's no reason we can't allow it, but the logic is more complicated so I'm saving it for a follow-up PR.
  • 7104: Feature: Support BLOB/TEXT columns in unique indexes, without requiring a prefix length
    Allows TEXT and BLOB columns to be used in unique keys, without requiring that a prefix length be specified. This causes the secondary index to store a hash of the content, which is used to enforce the uniqueness constraint. This is useful to enforce uniqueness over very long fields without having to specify a threshold with a prefix length.
    This feature is supported by MariaDB and PostgreSQL, but not by MySQL. A new SQL system variable strict_mysql_compatibility is also introduced in case customers want to opt out of extensions like this and stick to the exact behavior of MySQL. The default value of strict_mysql_compatibility is false.
    Unique secondary indexes using content-hashed fields have several restrictions, such as not being eligible for use in range scans or in any scans that require a specific order.
    There are two remaining tasks to wrap up this feature. Neither one is a correctness issue that would cause incorrect data to be added to the index, so they seemed like good candidates for follow-up PRs.
    • Use the real content value in uniqueness constraint error messages – When a unique key violation error is thrown from a content-hashed secondary index, the hashed content value is used in the error message, instead of the real content value. This makes the error message difficult to use, but doesn't affect correctness, or errors from unique indexes that don't use content-hashed fields.
    • Validate real content value on hash collision – When a hash collision occurs, we should fall back to looking at the full content to make sure it's not a false positive, but this is not implemented yet. A collision should be extremely unlikely and does not affect unique indexes that don't use content-hashed fields. Without this check, we're still enforcing uniqueness; there's just a small risk of a false positive where we'd incorrectly identify two values as the same if their SHA1 hash is the same.
    Depends on: https://github.com/dolthub/go-mysql-server/pull/2186
    Related to: https://github.com/dolthub/dolt/issues/7040
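
For the BOM header handling described in PR 7123 above, a rough Python sketch of the detection step might look like the following. The function name `decode_with_bom` is hypothetical. The default of UTF-8 matches what the note says for JSON files; the note describes the CSV default as an uninterpreted encoding, which this sketch does not model.

```python
import codecs

# BOM byte sequences mapped to the encoding they indicate.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_with_bom(data: bytes, default_encoding: str = "utf-8") -> str:
    """Decode bytes, switching encoding if a known BOM header is present."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            # Strip the BOM, then decode the rest with the indicated encoding.
            return data[len(bom):].decode(encoding)
    return data.decode(default_encoding)
```

A real importer would decode the input as a stream instead of reading all bytes first; the sketch only illustrates how the header selects the encoding.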

go-mysql-server

  • 2193: Set the original_name field in response metadata in addition to the name field
    A customer reported that the MySQL C++ Connector library was unable to retrieve column name information from a Dolt sql-server. After looking at the two wire captures between MySQL and Dolt, this is because the MySQL C++ Connector library pulls the column name from the original_name field, not from the name field.
    I've updated the unit tests that assert the expected response metadata fields are populated, and I'll follow up next with some changes in the Dolt repo to our C++ Connector library acceptance tests so that they use response metadata and assert that it is filled in.
    After that, it would be good to proactively look at any other response metadata fields that we aren't setting. For example, the Flags field seems important to fill in correctly for tooling to use.
  • 2186: Feature: Support BLOB/TEXT columns in unique indexes, without requiring a prefix length
    Allows TEXT and BLOB columns to be used in unique keys, without requiring that a prefix length be specified. This causes the secondary index to store a hash of the content, instead of the content itself, and then that hash is used to enforce the uniqueness constraint. This is useful to enforce uniqueness over very long fields without having to specify a threshold with a prefix length.
    This feature is supported by MariaDB and PostgreSQL, but not by MySQL. A new SQL system variable strict_mysql_compatibility is also introduced in case customers want to opt out of extensions like this and stick to the exact behavior of MySQL. The default value of strict_mysql_compatibility is false.
    Unique secondary indexes using content-hashed fields have several restrictions, such as not being eligible for use in range scans or in any scans that require a specific order.
    The GMS in-memory secondary index implementation takes a simple approach – it doesn't actually hash encode the content-hashed fields, and instead includes the full column value. This is consistent with how the GMS in-memory index implementation handles other features, such as prefix lengths, which are also a no-op and the full content is stored in the secondary index.
    Dolt integration: https://github.com/dolthub/dolt/pull/7104
    Related to: https://github.com/dolthub/dolt/issues/7040
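
The content-hashed unique index described in PRs 7104 and 2186 can be illustrated with a toy Python class. This is only a sketch of the idea (store a SHA-1 digest in place of the full value), and the class and method names are hypothetical; it says nothing about how Dolt lays out the index on disk.

```python
import hashlib


class HashedUniqueIndex:
    """Toy unique index that stores a SHA-1 digest instead of the full value."""

    def __init__(self):
        self._digests = set()

    def insert(self, content: bytes) -> None:
        digest = hashlib.sha1(content).digest()
        if digest in self._digests:
            # Only the hash is available for the error message, because
            # the index never stored the original content.
            raise ValueError(f"duplicate unique key (hash {digest.hex()})")
        self._digests.add(digest)
```

Because only digests are kept, such an index can answer "have I seen this exact value?" but cannot serve range scans or ordered scans, which matches the restrictions listed in the PR description.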

Closed Issues

  • 6709: dolt_merge() MySql return content missing column names since Dolt 1.11.1
  • 7116: Dolt Checkout: -B Support

Latency

Read Tests               MySQL    Dolt     Multiple
covering_index_scan      2.11     2.86     1.4
groupby_scan             13.22    17.32    1.3
index_join               1.34     5.0      3.7
index_join_scan          1.27     2.14     1.7
index_scan               34.33    55.82    1.6
oltp_point_select        0.17     0.43     2.5
oltp_read_only           3.3      7.7      2.3
select_random_points     0.32     0.72     2.2
select_random_ranges     0.39     0.87     2.2
table_scan               34.33    55.82    1.6
types_table_scan         75.82    161.51   2.1
reads_mean_multiplier                      2.1

Write Tests              MySQL    Dolt     Multiple
oltp_delete_insert       5.67     5.99     1.1
oltp_insert              2.86     2.97     1.0
oltp_read_write          7.43     14.73    2.0
oltp_update_index        2.86     3.07     1.1
oltp_update_non_index    2.91     2.97     1.0
oltp_write_only          4.03     7.3      1.8
types_delete_insert      5.67     6.55     1.2
writes_mean_multiplier                     1.3

Overall Mean Multiple                      1.7
dolt - 1.29.2

Published by github-actions[bot] 11 months ago

Merged PRs

dolt

go-mysql-server

  • 2189: Upgraded xxhash to v2
  • 2187: fix round() handling of scale, precision, and nulls
    This PR makes ROUND() behavior match MySQL more closely, specifically when handling NULLs.
    Additionally, it refactors the function to no longer use custom logic, and to rely on the decimal.Decimal library for conversions.
    The slowness from the original issue stems from the math.Pow() function attempting to raise precision to some huge negative number. This PR solves that problem by constraining the precision values to our DecimalMaxScale (30).
    A better constraint would be -308 since that's the max scale of a float supported by MySQL. However, we don't go nearly as far. If we knew the scale of the passed in value we could also constrain the precision that way.
    This PR also rewrites the ROUND() unit tests to be structured like other unit tests for functions, and fixes their handling of null arguments.
    fixes: https://github.com/dolthub/dolt/issues/7073
  • 2182: fix IN_SUBQUERY projection bugs
    Correctness regression fix. With a bit more work this could probably be a smaller query:
    CREATE VIEW view_2_tab1_157 AS SELECT pk, col0 FROM tab1 WHERE NOT ((col0 IN (SELECT col3 FROM tab1 WHERE ((col0 IS NULL) OR col3 > 5 OR col3 <= 50 OR col1 < 83.11))) OR col0 > 75);
    
    The CREATE panicked because the top-level projections get pushed into the source node, and my recent refactors failed to map projections onto the reported table output column sets.
  • 2178: Improve handling of charset and collate in column options.
    https://github.com/dolthub/vitess/pull/293 should be merged before this.
    This PR does two main things:
    • Parse and validate the collate option, even on binary columns.
      Currently the collate option is ignored on columns of binary type, and we just assume binary collation because it's the only one allowed. This is usually correct but causes some problems.
      CREATE TABLE t (pk varbinary(10) collate utf8mb4_0900_bin); shouldn't parse, because it's attempting to use an illegal collation for column pk. However, we currently ignore the option and parse it anyway.
      CREATE TABLE t (pk varbinary(10)) collate utf8mb4_0900_bin; on the other hand, needs to succeed. Binary columns do not inherit the default collation of the table.
    • Reject the charset option except on columns that allow it.
      According to MySQL, only text, sets, and enums are allowed to have character sets. Attempting to specify a character set for any other column type is an error. Before this PR, we were simply ignoring the character set where it didn't make sense.
      A good way to think of it is that varbinary is like a shorthand for varchar charset binary. In fact, you're even allowed to write CREATE TABLE t (pk varchar(10) collate binary); and MySQL will generate a varbinary(10) column. Since the column already has a specified char set, it doesn't default to the table charset/collation. And you can't supply an explicit charset to the column because it already has one implicitly.
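
For the ROUND() change in PR 2187 above, the two behaviors called out (NULL handling and constraining the precision argument) can be sketched in Python with the standard decimal module. This is not the go-mysql-server implementation: `sql_round` is a hypothetical helper, the working precision of 100 digits is an arbitrary choice, and only the limit of 30 comes from the release note.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

MAX_SCALE = 30  # the DecimalMaxScale value quoted in the release note

# Working precision for quantize(); 100 digits is an arbitrary assumption.
_CTX = Context(prec=100, rounding=ROUND_HALF_UP)


def sql_round(value, places=0):
    """ROUND()-like helper: NULL (None) in, NULL out; places is clamped."""
    if value is None or places is None:
        return None
    # Clamp the requested precision so huge arguments cannot blow up the work.
    places = max(-MAX_SCALE, min(MAX_SCALE, places))
    # Decimal('0.01') rounds to two decimal places, Decimal('1E+1') to tens.
    quantum = Decimal(1).scaleb(-places)
    return value.quantize(quantum, context=_CTX)
```

With the clamp in place, a call such as `sql_round(Decimal("1.5"), -(10**9))` does a bounded amount of work instead of computing with an enormous exponent.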

vitess

  • 294: Allow SqlType to parse "CHARACTER" and add tests for every other possible type that could be passed in.
    SqlType is a function in Vitess for normalizing every type name. It was missing an entry for the "CHARACTER" keyword.
    I added tests that should verify every single valid type keyword in the grammar, so this shouldn't happen again.
  • 293: Add additional types to sqlparser.SQLType()
    This function is used when the parser needs to map type names to underlying types in order to judge the validity of certain queries. Some types are aliases for others, (like REAL is an alias for float64) but they weren't included in SQLType(), so certain expressions that used these types could panic.
  • 292: Parse COLLATE BINARY on individual columns.
    We should be able to parse statements like:
    create table test (pk varchar(255) collate binary)
    This particular example will eventually get rewritten as create table test (pk varbinary(255)), but that doesn't happen during parsing, so the added vitess tests still expect varchar.
  • 291: round trip SHOW PLUGINS

Closed Issues

  • 3417: dolt version should tell me when I'm out of date
  • 7121: dolt init freezes under windows with 1.29.0
dolt - 1.29.1

Published by github-actions[bot] 11 months ago

Merged PRs

dolt

  • 7107: go/libraries/doltcore/sqle: DatabaseProvider: Fix a bug where a database created with call dolt_clone() would not be dropped until the server was restarted.
    Fixes #7106.
  • 7094: fix inf loop for creds search on windows
    There was an infinite loop for dolt init and dolt sql when there wasn't a .dolt directory in the current directory or any child directory. This issue was specific to Windows, because we were only checking for / and not \.
  • 7092: migrate dolt gc to use sql queries
    This change updates dolt gc to use the appropriate sql engine to generate results.
    Related: https://github.com/dolthub/dolt/issues/3922
  • 7091: minor doc fixes
    Updates doc strings for query-diff and ls.
  • 7090: Added a periodic heartbeat event metric to sql-server
    Also adds new constants for Doltgres as an application
  • 7088: add --single-branch flag for dolt clone
    Adds --single-branch flag for dolt clone to match git behavior more closely.
    Fixes: https://github.com/dolthub/dolt/issues/3873
  • 7065: Cache the computation of default columns during three-way merges.
    This is a slow operation because it actually generates the CREATE TABLE string for the merged schema and uses yacc to parse it so we can correctly resolve default column expressions.
    But we should only need to do this once, not once per row.
    This also contains a rough performance regression test. Without this fix, the test should time out on GitHub's CI.
  • 7032: CLI command for reflog
    Adds the CLI command dolt reflog, which displays results from the dolt_reflog() table function.
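
For the Windows infinite loop fixed in PR 7094 above, the general hazard is a directory walk whose stop condition only recognizes "/" as the root. A Python sketch of a separator-agnostic walk follows; the function name `ancestors` is hypothetical and the code is not a translation of Dolt's Go code. It stops when taking the parent no longer changes the path, which holds for both "/" and "C:\".

```python
import ntpath
import posixpath


def ancestors(path, pathmod):
    """List path and each parent directory, stopping at the filesystem root."""
    result = [path]
    while True:
        parent = pathmod.dirname(path)
        if parent == path:
            # dirname() of a root ("/" or "C:\\") returns the root itself,
            # so this check terminates on both POSIX and Windows paths.
            return result
        result.append(parent)
        path = parent


print(ancestors("C:\\Users\\me", ntpath))
print(ancestors("/home/me", posixpath))
```

Passing the path module explicitly lets the same function be exercised with Windows-style and POSIX-style paths on any platform.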

go-mysql-server

  • 2185: fix panic of concurrent map writes, when using in memory mode
    Replaces #2179
  • 2182: fix IN_SUBQUERY projection bugs
    Correctness regression fix. With a bit more work this could probably be a smaller query:
    CREATE VIEW view_2_tab1_157 AS SELECT pk, col0 FROM tab1 WHERE NOT ((col0 IN (SELECT col3 FROM tab1 WHERE ((col0 IS NULL) OR col3 > 5 OR col3 <= 50 OR col1 < 83.11))) OR col0 > 75);
    
    The CREATE panicked because the top-level projections get pushed into the source node, and my recent refactors failed to map projections onto the reported table output column sets.
  • 2181: Improve IN_SUBQUERY table disambiguation
    When unnesting an IN_SUBQUERY into a parent scope with a table name clash, rename the child table and update its references to the new name. Prevent EXISTS subqueries from unnesting if doing so doesn't fully decorrelate the child scope.
  • 2177: Properly round IndexAccess Bounds for float/decimal type filters over integer columns
    When generating index ranges, we don't convert the range bounds type to the index bound type until later on.
    Additionally, when we do convert (specifically floats to ints), we round the floats at the halfway point, which leads to index lookups skipping rows depending on how the bound rounds.
    This PR changes that to convert the types earlier (keeping the key type and index type consistent), and to round floating point bounds correctly so that no rows are left out.
    fixes https://github.com/dolthub/dolt/issues/7072
  • 2175: Fix existant typo
  • 2174: Having aggregate alias bug
    fixes: https://github.com/dolthub/dolt/issues/7082
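
For the index bounds fix in PR 2177 above, the rounding direction has to depend on the comparison operator. For example, rounding 1.6 to the nearest integer turns `col > 1.6` into `col > 2`, which skips the row where col = 2. A Python sketch of operator-aware conversion, under the assumption of an integer column and a single float bound (`int_bounds` is a hypothetical helper, not go-mysql-server code):

```python
import math


def int_bounds(op: str, bound: float):
    """Translate `col <op> bound` on an integer column into an inclusive
    (low, high) integer range; None means unbounded on that side."""
    if op == ">":
        return (math.floor(bound) + 1, None)
    if op == ">=":
        return (math.ceil(bound), None)
    if op == "<":
        return (None, math.ceil(bound) - 1)
    if op == "<=":
        return (None, math.floor(bound))
    raise ValueError(f"unsupported operator: {op}")


print(int_bounds(">", 1.6))
print(int_bounds("<=", 2.5))
```

Lower bounds round up and upper bounds round down, with strict comparisons adjusted by one when the bound is already an integer value.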

Closed Issues

  • 7106: Attempting to drop a database which was created with dolt_clone fails.
  • 7073: Potential Issue Using ROUND
  • 3873: dolt clone -branch should omit remote tracking branches for other branches
  • 7082: HAVING clause is handled incorrectly when it references a name that matches both a column and an alias.
  • 1035: dolt version needs --verbose flag that includes commit hash and timestamp
  • 6983: Issue saying the database is readonly
  • 7072: Unexpected Results when Querying with COT
dolt - 1.29.0

Published by github-actions[bot] 11 months ago

This release contains backwards incompatible changes:

  • Deprecated dolt sql-client – dolt sql-client has been fully deprecated and will not be accessible via the CLI anymore. sql-client previously served as a built-in MySQL client that could be used to connect to running servers. Now, dolt sql can serve the same function with the appropriate global arguments.

Per Dolt’s versioning policy, this is a minor version bump because these changes may impact existing applications. Please reach out to us on GitHub or Discord if you have questions or need help with any of these changes.

Merged PRs

dolt

  • 7077: remove dolt sql-client command
    Fully deprecates dolt sql-client by removing the CLI command.
    Resolves: https://github.com/dolthub/dolt/issues/6886
  • 7069: go/cmd/dolt: Implement new semantics for sql-server.lock.
    sql-server.lock is changed to sql-server.info and it is only used to get connection information to a running sql-server. Exclusive access to the databases is mediated through the file system locks, which sql-server now asserts it has before running against a database.
  • 7054: remove dolt stored procedure aliases
  • 7042: Run nullness checking on merged rows during three-way merges.
    This fixes one of the failures from https://github.com/dolthub/dolt/issues/7034, but it's the more important one (the panic)
    Before, we would panic when attempting to compute a merged row if that row would contain a null in a non-null column.
    Now, we allow this row to be generated, and we validate it before merging it into the table.
  • 7035: Detect data conflicts even when one branch has the same binary representation as the ancestor.
    Fixes https://github.com/dolthub/dolt/issues/6746
    This fixes a bug where the following situation fails to detect a conflict:
    Branch      Value
    Ancestor    (1, 2)
    Left        (1, 2, NULL)
    Right       (1, 2, 3)
    Both branches add a column but add different values to it. This should be a data conflict, but we currently don't detect it because Ancestor and Left have the same bytes in storage.
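
One way to picture the situation in PR 7035 is a per-cell three-way merge in which "the column does not exist" is kept distinct from NULL. The Python sketch below is an illustration under that assumption; the `MISSING` marker, `MergeConflict`, and `merge_cell` are hypothetical names and do not reflect Dolt's storage format or merge code.

```python
MISSING = object()  # hypothetical marker: the column does not exist in this schema


class MergeConflict(Exception):
    pass


def merge_cell(ancestor, left, right):
    """Three-way merge of a single cell value."""
    if left == right:
        return left
    if left == ancestor:
        return right  # only the right branch changed the cell
    if right == ancestor:
        return left   # only the left branch changed the cell
    raise MergeConflict(f"left={left!r}, right={right!r}")


# If the ancestor's absent third column is treated as NULL (None), the left
# value looks unchanged and the right value is taken without a conflict.
print(merge_cell(None, None, 3))
```

Calling `merge_cell(MISSING, None, 3)` raises `MergeConflict`, because both branches now count as having added a value and the two values differ.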

go-mysql-server

  • 2171: Bug fixes for type handling in IF and COALESCE functions
  • 2170: have flush binary logs be noop
    fixes https://github.com/dolthub/dolt/issues/7055
  • 2169: Unique table and column ids
    The motivation for this PR is making the costing/exploration phase consistent with the data structures for index costing. That means switching memo ScalarExpr back to sql.Expression. Moving the previously join-specific ids in ScalarExpr to tables and column refs lets us preserve most memo data structures and join transformation logic.
  • 2168: fix panic in math funcs
    fixes https://github.com/dolthub/dolt/issues/7060
    Additionally, fixes POW() to not have the same panic and returns warnings instead of errors for certain inputs to LOG().
  • 2166: prevent panic on nil cast for time functions
    fixes https://github.com/dolthub/dolt/issues/7056
  • 2165: fix update <table> set <column> = default
    This PR fixes a bug where attempting to update a column to its default would throw an unresolved error.
  • 2158: have DayName() return NULL on bad DATETIME conversions
    fixes https://github.com/dolthub/dolt/issues/7039
  • 2156: Decorrelate IN_SUBQUERY refactor
    Decorrelating IN_SUBQUERY into subquery aliases adds new relations to the query plans in a way that makes table and column identifier tracking difficult. So instead of converting select * from xy where x in (select u from uv) into a subquery alias join:
    Filter(
      Tablescan(xy),
      InSubquery(
        x,
        Subquery(
          Project(u, Tablescan(uv)))))
    =>
    Join(
      (x = u),
      Tablescan(xy),
      SubqueryAlias(
        'scalarSubq0',
        Tablescan(uv)))
    
    we disambiguate the subquery and create a table-table join:
    Filter(
      Tablescan(xy),
      InSubquery(
        x,
        Subquery(
          Project(u, Tablescan(uv)))))
    =>
    Join(
      (x = u),
      Tablescan(xy),
      Tablescan(uv))
    
    We disambiguate table names and column references in the process of unnesting scopes to avoid table name clashes.
    This change is better in most places, but worse in instances when we cannot unnest now because the RHS equality expression is not a valid get field reference, for example when the subquery returns the value of a GROUP_BY or WINDOW or a synthesized alias (anything without a source column Id). This is fixable in the future.
    This also fixes a few ANTI_JOIN problems that this refactor exposed.

Closed Issues

  • 6062: Dolt interactive sql-client cli: Support database name for connection
  • 6886: Deprecate dolt sql-client
  • 7029: sql-server.lock file preventing docker instance from running
  • 7055: Support flush binary logs
  • 6746: Three Way Merging won't consider a row to have a data conflict if either side has the same binary representation as the base.
  • 7060: Crash by SQRT
  • 7056: Crash by Time Functions
  • 7048: Similar queries return different results with different orderings of the where clause
  • 2159: VSCode debug Build Error

Latency

Read Tests               MySQL    Dolt     Multiple
covering_index_scan      2.14     2.86     1.3
groupby_scan             12.98    17.63    1.4
index_join               1.32     5.09     3.9
index_join_scan          1.25     2.14     1.7
index_scan               34.33    55.82    1.6
oltp_point_select        0.17     0.43     2.5
oltp_read_only           3.3      7.7      2.3
select_random_points     0.32     0.72     2.2
select_random_ranges     0.38     0.87     2.3
table_scan               34.33    55.82    1.6
types_table_scan         75.82    158.63   2.1
reads_mean_multiplier                      2.1

Write Tests              MySQL    Dolt     Multiple
oltp_delete_insert       7.98     6.79     0.9
oltp_insert              3.75     3.36     0.9
oltp_read_write          8.28     15.27    1.8
oltp_update_index        3.82     3.49     0.9
oltp_update_non_index    3.82     3.43     0.9
oltp_write_only          5.37     7.7      1.4
types_delete_insert      7.7      7.43     1.0
writes_mean_multiplier                     1.1

Overall Mean Multiple                      1.6
dolt - 1.28.2

Published by github-actions[bot] 11 months ago

Merged PRs

dolt

go-mysql-server

vitess

  • 289: add partial support for ':=' assignment operator
    This PR adds support for set expressions and assignment expressions.
    Does not include support for select expressions, as that is deprecated in MySQL.

Closed Issues

  • 7039: Crash by Function DAYNAME
  • 7046: Crash by ACOS
  • 7049: Can't use MYSQL workbench Data Import/Restore function with Dolt
  • 7038: Unexpected Results about Decimal-Boolean Casting in Filters
dolt - 1.28.1

Published by github-actions[bot] 11 months ago

Merged PRs

dolt

  • 7044: Track whether or not rows actually need to be remapped to the result schema.
    This prevents a bunch of expensive computation in places where it isn't necessary.
    Basically when computing the schema diff, we now compute:
    • Whether the left side needs to be remapped to the final schema
    • Whether the right side needs to be remapped to the final schema
    • Whether we should invalidate all secondary indexes because a column changed its type.
    The last one's the odd one out, since it's less a fact about the merge and more this derived property that could change later if we figure out a good way to update secondary indexes in the presence of schema change. But we can bikeshed the specifics once users aren't blocked by this.
    We store all this data in a new MergeInfo struct, which gets passed around the merge logic.

Closed Issues

  • 7026: Crashing by Division Operators
  • 7033: MariaDB default values for LONGTEXT types don't work in Dolt

Latency

Read Tests               MySQL    Dolt     Multiple
covering_index_scan      2.11     2.81     1.3
groupby_scan             13.22    17.63    1.3
index_join               1.34     5.09     3.8
index_join_scan          1.25     2.18     1.7
index_scan               34.33    55.82    1.6
oltp_point_select        0.17     0.43     2.5
oltp_read_only           3.36     7.7      2.3
select_random_points     0.32     0.73     2.3
select_random_ranges     0.39     0.87     2.2
table_scan               34.33    55.82    1.6
types_table_scan         75.82    158.63   2.1
reads_mean_multiplier                      2.1

Write Tests              MySQL    Dolt     Multiple
oltp_delete_insert       7.98     6.79     0.9
oltp_insert              3.75     3.36     0.9
oltp_read_write          8.28     15.27    1.8
oltp_update_index        3.82     3.43     0.9
oltp_update_non_index    3.82     3.36     0.9
oltp_write_only          5.37     7.7      1.4
types_delete_insert      7.7      7.43     1.0
writes_mean_multiplier                     1.1

Overall Mean Multiple                      1.6
dolt - 1.28.0

Published by github-actions[bot] 11 months ago

This release contains backwards incompatible changes:

  • Stopped Replacing Spaces in DB Names – Dolt databases take their names from the name of their directory on disk. Previously, any spaces in directory names were automatically replaced with underscores to form the name of that database. This behavior has been changed to preserve spaces in database names, in order to match MySQL's database naming rules. To re-enable the older behavior, set the DOLT_DBNAME_REPLACE environment variable to any value before starting dolt sql or dolt sql-server.

Per Dolt’s versioning policy, this is a minor version bump because these changes may impact existing applications. Please reach out to us on GitHub or Discord if you have questions or need help with any of these changes.
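
The database naming change above can be sketched in Python as follows. The function `database_name` is hypothetical. The older rule is taken from the descriptions in these notes (spaces replaced with underscores, and, per PR 7020, runs of underscores collapsed to one). Treating any non-empty DOLT_DBNAME_REPLACE value as "set" is an assumption.

```python
import re


def database_name(dir_name: str, env: dict) -> str:
    """Derive a database name from its directory name."""
    if env.get("DOLT_DBNAME_REPLACE"):
        # Older behavior: spaces become underscores and consecutive
        # underscores collapse into one.
        return re.sub(r"_+", "_", dir_name.replace(" ", "_"))
    # Newer behavior: the directory name is used as-is.
    return dir_name


print(database_name("my  db", {}))
print(database_name("my  db", {"DOLT_DBNAME_REPLACE": "1"}))
```

In practice the second argument would be `os.environ`; a plain dict is used here so the example does not depend on the caller's environment.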

Merged PRs

dolt

  • 7024: Concurrent branches map
    This PR converts all instances of the RepoState.Branches to a concurrent map (implementation found in utils). Avoids issues when the branches field is accessed concurrently from different routines.
    This PR is waiting for https://github.com/dolthub/dolt/pull/7019 to be merged first to integrate those changes.
  • 7023: fix dolt fetch and dolt pull
  • 7020: use database name that is the same as directory name on disk
    As of Dolt version 1.27.0, we allow hyphens in database names, replace space characters with underscores, and collapse consecutive underscores into a single underscore.
    This PR resolves:
    • Using a database name that is the same as the directory name on disk allows a round trip from the database name back to its location on disk.
    Note:
    • A potential issue might arise when the directory name has trailing space characters. MySQL does not allow database names with trailing space characters, such as "mydb ", whereas Dolt does. This will cause an import of a dolt dump into MySQL to fail.
  • 7019: Concurrent backups map
    This PR converts all instances of the RepoState.Backups to a concurrent map (implementation found in utils). Avoids issues when the backups field is accessed concurrently from different routines.
  • 7016: Allow dolt config to run in folders without write permissions
    This change allows dolt config to run in folders without write permissions.
    Fixes: https://github.com/dolthub/dolt/issues/2313
  • 7015: Update docs for merge's --abort param
    The merge CLI command and SQL stored procedure both validate that the working set is clean before starting a merge, so the warning about uncommitted changes isn't relevant.
  • 7008: fix panic on foreign key references dropped table for dolt diff
  • 7004: cmd/dolt/commands/sqlserver: Restructure the start up sequence for sql-server.
    We explicitly model Services, which can have an Init step, a Run step and a Stop step. Every registered service gets initialized in the order it was registered, then they all run concurrently until Stop is called, at which point they are all stopped in reverse order. It's possible for clients to wait for init to be completed and be delivered any errors encountered on startup. They can also wait for stop, to be delivered any errors encountered on shutdown.
  • 6980: Correctly identify and report data conflicts when deleting rows and/or columns during a three-way merge.
    Fixes https://github.com/dolthub/dolt/issues/6747
    Also fixes https://github.com/dolthub/dolt/issues/6766
    https://github.com/dolthub/dolt/issues/6747 is important to fix because ignoring data conflicts means that a merge could put a user in a state that wouldn't be possible from a linear sequence of transactions. Issues like this (especially ones that don't display any warning to the user) need to be addressed before they cause problems.

go-mysql-server

  • 2155: Allow BLOB/JSON/TEXT columns to have literal default values (MariaDB compatibility)
    Technically, MySQL does NOT allow BLOB/JSON/TEXT columns to have a literal default value, and requires them to be specified as an expression (i.e. wrapped in parens). We diverge from this behavior and allow it, for compatibility with MariaDB.
    While testing with a binary literal, I noticed that SQLVal was converting that value to "BLOB" instead of the actual binary content, so I fixed that one, too.
    Related to: https://github.com/dolthub/dolt/issues/7033
    Dolt CI Checks: https://github.com/dolthub/dolt/pull/7036
  • 2154: Use max prec and scale for decimal oob
  • 2153: null in-tuple bugs
    fixes: https://github.com/dolthub/dolt/issues/7025
  • 2151: add decimal type to convert functions
    fixes https://github.com/dolthub/dolt/issues/7018
  • 2149: updating RangeTree node MaxUpperBound again
    A previous fix involved updating the MaxUpperBound in the RangeTree when traversing the right node; it turns out we need to do that when creating a new node as well.
    To better catch overlapping range expressions, we now verify that the resulting ranges do not overlap (an operation which isn't too expensive). This fixes some plans from an index refactor.
    Additionally, this also fixes a skipped test where the ranges were not overlapping but different than the brute force approach.
  • 2148: fix for foreign key that references dropped table in information_schema.referential_constraint table
  • 2147: Fix mod bool conversion
    fixes: https://github.com/dolthub/dolt/issues/7006
  • 2146: Update MaxUpperBound when inserting into RangeTree
    Recent changes to index costing exposed a bug in RangeTree. This bug is responsible for a small regression in the sqllogictests, involving a complicated filter.
    When generating ranges over an index to satisfy a filter, we have to ensure that the resulting ranges do not overlap; we accomplish this efficiently through the use of a RangeTree (a multi-dimensional red-black tree) to split up and merge ranges.
    Given a Range, the RangeTree returns a set of Ranges that overlap it, and we replace these with a set of ranges that cover the same area, minus the overlap.
    However, during insertion (specifically when the new node is a right child), we do not update the parent's MaxUpperBound. This is important for deciding to search subtrees during FindConnection(). As a result, the RangeTree did not find all overlapping ranges, and would produce a set of Ranges that still contained overlaps.
    Additionally, this PR adds a slightly better method of checking if the resulting ranges have overlaps.
    There is also a skipped test; it's possible that there are multiple sets of non overlapping ranges that satisfy the same sets of filters.
    There will be a follow up PR that has a better method of verifying the correctness of these ranges.
  • 2145: mysql server handler intercept support
    Split from the PR https://github.com/dolthub/go-mysql-server/pull/2036.
    Add mysql server handler intercept support.
  • 2144: Push filters insensitive to table name
    Filter pushing bug that is specific to 1) table names with capital letters, and 2) filters that need to move through joins. The problem is not indexing specifically, but checking for an index is the easiest way to test this.
    dolt bump: https://github.com/dolthub/dolt/pull/7001
  • 2125: Minimal index searchable interface
    re: https://github.com/dolthub/go-mysql-server/pull/2036

Closed Issues

  • 7030: Error running dolt_diff() table function through ODBC
  • 7025: Unexpected Results when Querying with NULL values
  • 6766: Merging an altered column with a deleted row currently results in a merge conflict.
  • 6747: Three Way Merging won't detect a data conflict in a dropped column.
  • 7009: stop renaming database names with blank spaces
  • 7018: Unexpected Results about Floating-point Type Casting
  • 2313: dolt config needs write access in current directory for dolt config --add --global
  • 7006: Unexpected Results when Using '%' operator
  • 7005: Dropping a table referenced by a foreign key breaks some information_schema tables, dolt diff

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 2.18 2.81 1.3
groupby_scan 13.22 17.63 1.3
index_join 1.37 5.09 3.7
index_join_scan 1.27 2.18 1.7
index_scan 34.33 54.83 1.6
oltp_point_select 0.17 0.43 2.5
oltp_read_only 3.3 7.7 2.3
select_random_points 0.32 0.72 2.2
select_random_ranges 0.38 0.86 2.3
table_scan 34.33 54.83 1.6
types_table_scan 74.46 155.8 2.1
reads_mean_multiplier 2.1
Write Tests MySQL Dolt Multiple
oltp_delete_insert 7.98 6.79 0.9
oltp_insert 3.75 3.36 0.9
oltp_read_write 8.28 15.0 1.8
oltp_update_index 3.82 3.43 0.9
oltp_update_non_index 3.82 3.36 0.9
oltp_write_only 5.37 7.56 1.4
types_delete_insert 7.56 7.43 1.0
writes_mean_multiplier 1.1
Overall Mean Multiple 1.6

dolt - 1.27.0

Published by github-actions[bot] 11 months ago

This release contains backwards incompatible changes:

  • Databases in a dolt server take their names from the name of their directory on disk. Previously, any hyphens (-) in such directory names were automatically replaced with underscores (_) when exposing the name of that database. This behavior has been changed so that hyphens in database names are preserved. To re-enable the older behavior, set the DOLT_DBNAME_REPLACE_HYPHENS environment variable.
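As a rough illustration of the naming change, the sketch below derives a database name from a directory name and applies the legacy hyphen replacement only when requested. `dbNameFromDir` is a hypothetical helper, not Dolt's actual function, and the sketch covers hyphen handling only; the check for a non-empty `DOLT_DBNAME_REPLACE_HYPHENS` value follows the PR description below.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// dbNameFromDir is a hypothetical helper: it keeps hyphens by default and
// replaces them with underscores only when the legacy behavior is requested.
func dbNameFromDir(dirName string, replaceHyphens bool) string {
	if replaceHyphens {
		return strings.ReplaceAll(dirName, "-", "_")
	}
	return dirName
}

func main() {
	// Any non-empty value re-enables the older behavior.
	legacy := os.Getenv("DOLT_DBNAME_REPLACE_HYPHENS") != ""
	fmt.Println(dbNameFromDir("test-db", legacy))
}
```

Applications that hard-code the underscore form of a database name would need either to switch to the hyphenated name or to set the environment variable before starting the server.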

Per Dolt’s versioning policy, this is a minor version bump because these changes may impact existing applications. Please reach out to us on GitHub or Discord if you have questions or need help with any of these changes.

Merged PRs

dolt

  • 6995: allow hyphen in db name to match its dir name
    The PR adds support for hyphens (-) in database names.
    Previously, we replaced hyphens with underscores when deriving a database name from a directory name. For example, for a directory named test-db, the database name was test_db. Now the database name for this directory is test-db, the same as the directory name. This is the default behavior, but it can be disabled (reverting to the previous behavior of replacing hyphens with underscores) by setting the DOLT_DBNAME_REPLACE_HYPHENS environment variable to a non-empty value.
  • 6994: Concurrent remotes map
    This is a first stab at converting the remotes map to a concurrent map. A new generic concurrent map has been added to the utils folder, as I couldn't find a widespread generic implementation of a concurrent map yet.
    I am creating the PR to get some feedback, even though the following still need to be done:
    • tests for the concurrent map
    • probably a conversion of backups and branches needs to be done as well
    This completely fixes the issue I've seen in my project tests when high concurrency is involved.
    This PR attempts to fix https://github.com/dolthub/dolt/issues/6965. @zachmu when you have some time, let me know what you think.

Closed Issues

  • 6491: Stop renaming databases with hyphens
  • 6965: RepoState.Remotes map not safe for concurrent use

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 2.11 2.76 1.3
groupby_scan 12.98 17.63 1.4
index_join 1.32 5.09 3.9
index_join_scan 1.25 2.18 1.7
index_scan 34.33 54.83 1.6
oltp_point_select 0.17 0.43 2.5
oltp_read_only 3.3 7.56 2.3
select_random_points 0.32 0.72 2.2
select_random_ranges 0.38 0.86 2.3
table_scan 34.33 55.82 1.6
types_table_scan 74.46 158.63 2.1
reads_mean_multiplier 2.1
Write Tests MySQL Dolt Multiple
oltp_delete_insert 7.98 6.79 0.9
oltp_insert 3.75 3.36 0.9
oltp_read_write 8.28 15.0 1.8
oltp_update_index 3.82 3.43 0.9
oltp_update_non_index 3.82 3.36 0.9
oltp_write_only 5.37 7.56 1.4
types_delete_insert 7.56 7.43 1.0
writes_mean_multiplier 1.1
Overall Mean Multiple 1.6

dolt - 1.26.1

Published by github-actions[bot] 11 months ago

Merged PRs

dolt

  • 7003: Print an error when server args don't validate
    We aren't printing any errors from sql-server when there are unparsable flags.
    Previous Behavior:
    $ dolt sql-server --rasdjdsdlsdk
    $ echo $?
    1
    
    New Behavior:
    $ dolt sql-server --rasdjdsdlsdk
    error: sql-server does not take positional arguments, but found 1: asdjdsdlsdk
    $ echo $?
    1
    
  • 6996: go: sqle: cluster: commithook: When shutting down the server, the cluster replication commithook could deadlock after the replication thread missed a wakeup signal.
    Fix it so that the wakeup thread is guaranteed to see the canceled context or the wakeup signal.
  • 6992: go/libraries/doltcore/remotesrv: Ensure we stop the gRPC server and cancel inflight requests when we are multiplexing one port for HTTP and gRPC traffic.
  • 6977: Improve the persistence of the file that stores USERs and GRANTs.
    The USERs and GRANTs on a sql-server instance are stored in a separate file from the Merkle DAG table data that makes up the Dolt databases themselves. Previously, the contents of this file were not written in a crash resistant way, and they could be lost or corrupted after a crash or when taking a block-device snapshot.
    This change also changes the file's permission bits to be 0600, instead of 0777, which was much more permissive than intended.
  • 6974: Support merging schemas with virtual / generated columns
    Substantially addresses https://github.com/dolthub/dolt/issues/6945, although a couple of hard-to-crack edge cases remain; they are included in this PR as skipped tests.

go-mysql-server

  • 2144: Push filters insensitive to table name
    Filter pushing bug that is specific to 1) table names with capital letters, and 2) filters that need to move through joins. The problem is not indexing specifically, but checking for an index is the easiest way to test this.
    dolt bump: https://github.com/dolthub/dolt/pull/7001
  • 2142: Idx histogram manipulation
    Add simple histogram mutators for filter types. Use histogram costs for index selection when available. Added stats docs.
    Dolt enginetests seem to be passing. Companion here: https://github.com/dolthub/dolt/pull/6997
    TODO:
    • I'd like to block statistics when only partially provided
    • TPCC plans are changed and I want to revert. Blocking partial statistics might fix those, I'm trying to get all of the actual index statistics for those tables as a better enginetest/blog example.
  • 2141: Fixing field metadata for JSON and geometry types
    JSON and geometry types should always report a binary collation in MySQL's field metadata. While debugging https://github.com/dolthub/dolt/issues/6970, I noticed that MySQL was sending a binary collation for these types, but GMS was sending back the default collation.
  • 2140: Respect character_set_results when emitting field metadata
    For non-binary types, we need to respect the value for the character_set_results session var (when not NULL) and use that for the field metadata returned in the MySQL wire protocol.
    The unexpected charset/collation metadata is causing DataGrip to be unable to work with some types in the table editor (see https://github.com/dolthub/dolt/issues/6970 for more details).
    I've validated this behavior with MySQL by inspecting packets with Wireshark. For testing, we were already testing the behavior of character_set_results and the charset translation, but we weren't testing the field metadata. I added support to check against the expected charset field metadata, but had to use reflection to get that data from the MySQL driver, since it's not exposed through the standard golang sql database APIs.
  • 2135: Resolve indexes of columns in CREATE TABLE statements early

Closed Issues

  • 6945: virtual column merges don't work
  • 6970: Regression: Pasting into Datagrip TEXT field puts null

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 2.11 2.71 1.3
groupby_scan 12.98 17.63 1.4
index_join 1.37 5.0 3.6
index_join_scan 1.27 2.18 1.7
index_scan 34.33 54.83 1.6
oltp_point_select 0.17 0.43 2.5
oltp_read_only 3.3 7.56 2.3
select_random_points 0.32 0.72 2.2
select_random_ranges 0.38 0.86 2.3
table_scan 34.33 54.83 1.6
types_table_scan 74.46 155.8 2.1
reads_mean_multiplier 2.1
Write Tests MySQL Dolt Multiple
oltp_delete_insert 7.98 6.79 0.9
oltp_insert 3.75 3.36 0.9
oltp_read_write 8.28 15.0 1.8
oltp_update_index 3.82 3.36 0.9
oltp_update_non_index 3.82 3.36 0.9
oltp_write_only 5.28 7.56 1.4
types_delete_insert 7.56 7.43 1.0
writes_mean_multiplier 1.1
Overall Mean Multiple 1.6

dolt - 1.26.0

Published by github-actions[bot] 12 months ago

This release contains backwards incompatible changes:

  • Geometry Address Encoding – Previously, Geometry Types were stored inline, which subjected them to a 65K size limit. To support larger Geometries, we now store them as BLOBs. This required an entirely new encoding as well as storage layer changes. As a result, we have bumped the feature version. For now, newer versions of the dolt client can still read the inline geometries, but older dolt clients will not be able to read newer databases. Eventually, we will deprecate the old geometry encoding entirely so we recommend doing a dolt dump and import if you have any tables with spatial types.

Per Dolt’s versioning policy, this is a minor version bump because these changes may impact existing applications. Please reach out to us on GitHub or Discord if you have questions or need help with any of these changes.

Merged PRs

dolt

  • 6978: Change to use a crash resistant, atomic file update protocol for a number of file writes.
    Makes updates to branch_control, repo_state.json and some persisted config files atomic and crash resistant.
  • 6976: go/store/nbs: file_table_persister: fsync whole table file writes.
    Dolt sometimes persists whole table files, instead of appending blocks to the chunk journal. In order to be as crash resistant as the chunk journal, we should fsync the writes before we add the new table file to the database.
    Cases which use this code path and could have seen reduced durability across crashes include: dolt_fetch, dolt_pull, dolt_clone, dolt_gc, and dolt_push/dolt_backup to a file remote.
  • 6973: make GeomAddrEnc known as an address encoding
    The addition of the new GeomAddrEnc needs to be explicitly known as an address type, so that dolt gc doesn't delete the chunks created.
    fixes https://github.com/dolthub/dolt/issues/6969
  • 6942: indexing refactor bump

go-mysql-server

  • 2140: Respect character_set_results when emitting field metadata
    For non-binary types, we need to respect the value for the character_set_results session var (when not NULL) and use that for the field metadata returned in the MySQL wire protocol.
    The unexpected charset/collation metadata is causing DataGrip to be unable to work with some types in the table editor (see https://github.com/dolthub/dolt/issues/6970 for more details).
    I've validated this behavior with MySQL by inspecting packets with Wireshark. For testing, we were already testing the behavior of character_set_results and the charset translation, but we weren't testing the field metadata. I added support to check against the expected charset field metadata, but had to use reflection to get that data from the MySQL driver, since it's not exposed through the standard golang sql database APIs.
  • 2138: Fixed SET working for invalid charsets/collations
    Fixes https://github.com/dolthub/dolt/issues/6972
  • 2137: fix nested subquery filter in exists
    Previously, we did not explore the children of a subquery when attempting to decorrelate filters for exists queries; we now do this through the use of subquery.Correlated().
    We should also avoid using uncacheable subqueries as keys for IndexLookups.
    fixes https://github.com/dolthub/dolt/issues/6898
  • 2136: Fix nil range correctness bug
    We accidentally treated an OR conjunction that contributes no index filters as equivalent to an empty range. As a result, we were dropping ranges in index scans.
    For example, this is the flattened version of the test query added:
    (1: and
      (3: tab3.col0 > 80)
      (4: or
        (5: and
          (6: tab3.col0 >= 87)
          (7: tab3.col0 <= 9))
        (8: tab3.col0 IS NULL))
      (9: or
        (10: tab3.col1 >= 71.7)
        (11: and
          (12: tab3.col0 >= 94)
          (13: or
            (15: tab3.col0 < 66)
            (16: and
              (17: tab3.col0 = 85)
              (18: tab3.col1 <= 42.15))
            (19: tab3.col0 = 30)))))
    
    Filters 3 and 4 are supported by the chosen index. We want to exclude 9. We build the range collection ORs before leaves, so we attempt to collect 4->9->3. So when we convert 9 to a range collection, we expect nil, but need that nil to not zero out the 3 ranges we have already accumulated.
  • 2134: Costed index scan framework
    This is the accumulation of the following PRs:
    https://github.com/dolthub/go-mysql-server/pull/2093
    https://github.com/dolthub/go-mysql-server/pull/2104
    https://github.com/dolthub/go-mysql-server/pull/2112
    https://github.com/dolthub/go-mysql-server/pull/2124
  • 2131: Fixed Full-Text defaults
    Fixes https://github.com/dolthub/dolt/issues/6941
    Full-Text schema validation also considered the default expressions, which do not matter for the pseudo-index tables. The default expression evaluation occurs before the table editors are involved, so they're always receiving fully-realized rows. This means that we can skip the default check, since it doesn't matter whether the tables have default expressions or not.
    The root cause for the bug was due to comparing pointers. Same contents for the expressions, but different pointers to different objects, so defaults would always cause failure.
  • 2126: fix inner join filter pushdown, and alias matching
    This PR fixes an issue we have with InnerJoins where the filter condition uses a function and references columns from both tables. We need to properly search the expressions in a SubqueryExpression.
    Additionally, this addresses a getIdx bug involving unqualifying aliases.
    partially fixes https://github.com/dolthub/dolt/issues/6898
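The root cause noted for PR 2131 above (comparing pointers to expressions that have the same contents) is a common Go pitfall. The sketch below illustrates only that pitfall: `DefaultExpr` and `sameDefault` are hypothetical names, and the actual fix in the PR was to skip the default check rather than to switch to a deep comparison.

```go
package main

import (
	"fmt"
	"reflect"
)

// DefaultExpr is a hypothetical stand-in for a column default expression.
type DefaultExpr struct {
	Text string
	Args []string
}

// sameDefault compares the contents of two expressions rather than
// their addresses.
func sameDefault(a, b *DefaultExpr) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	a := &DefaultExpr{Text: "concat(?, ?)", Args: []string{"x", "y"}}
	b := &DefaultExpr{Text: "concat(?, ?)", Args: []string{"x", "y"}}

	fmt.Println(a == b)            // pointer comparison: distinct allocations
	fmt.Println(sameDefault(a, b)) // content comparison
}
```

Two separately built expressions with identical contents never compare equal by pointer, so any validation based on `==` fails whenever a default is present.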

vitess

  • 289: add partial support for ':=' assignment operator
    This PR adds support for set expressions and assignment expressions.
    This does not include support for select expressions, as that usage is deprecated in MySQL.
  • 288: allow unquoted non reserved keywords for drop and rename column ddl
    fixes https://github.com/dolthub/dolt/issues/6950

Closed Issues

  • 6969: Upload to dolthub won't work for new GEOMETRY columns
  • 6972: Panic caused by invalid result character set
  • 6927: "value exceeded max field size of 65kb" when inserting large geometry

dolt - 1.25.0

Published by github-actions[bot] 12 months ago

Merged PRs

dolt

Closed Issues

  • 6898: Nested subqueries with EXISTS and functions / aggregations produce wrong results
  • 6941: FULLTEXT index creation fails with column C1 has an incorrect definition error
  • 6964: Support := for variable assignment

dolt - 1.24.3

Published by github-actions[bot] 12 months ago

Merged PRs

dolt

  • 6957: Avoid deserializing value on FK check
  • 6956: fix sql_mode column access of dolt_schemas table
    Databases that were created before the sql_mode column was added to the dolt_schemas table do not have this column, so we need to check that it exists before accessing it.
  • 6933: store GEOMETRY types as BLOBs
    To get around the 65k limit on our fields, we should store GEOMETRY types as BLOBs.
    Additionally, MySQL stores these as out of band BLOBs.
    This will also probably worsen the performance of GEOMETRY types as it just takes longer to reference.
    Fixes https://github.com/dolthub/dolt/issues/6927

go-mysql-server

  • 2133: Add query plans for index/join sysbench queries
  • 2132: ReferenceChecker interface
    re: https://github.com/dolthub/dolt/pull/6957
    It is expensive and unnecessary to deserialize blobs during FK reference check lookups.
  • 2130: More TPC-C tests, fix the slow HASH_JOIN
    The randIO parameter for LOOKUP_JOIN costing was perhaps too strict, since that cost is already stacked on top of the sequential cost. This isn't a replacement for better costing, but boosts TPC-C perf a bit and isn't less correct than the previous version.
    This was the motivating query, executed as a HASH_JOIN before:
    sbt> explain SELECT COUNT(DISTINCT (s_i_id)) FROM order_line3, stock3 WHERE ol_w_id = 1 AND ol_d_id = 5 AND ol_o_id < 3003 AND ol_o_id >= 2983 AND s_w_id= 1 AND s_i_id=ol_i_id AND s_quantity < 18;
    +------------------------------------------------------------------------------------------------------------+
    | plan                                                                                                       |
    +------------------------------------------------------------------------------------------------------------+
    | Project                                                                                                    |
    |  ├─ columns: [countdistinct([stock3.s_i_id])]                                                              |
    |  └─ GroupBy                                                                                                |
    |      ├─ SelectedExprs(COUNTDISTINCT([stock3.s_i_id]))                                                      |
    |      ├─ Grouping()                                                                                         |
    |      └─ LookupJoin                                                                                         |
    |          ├─ IndexedTableAccess(order_line3)                                                                |
    |          │   ├─ index: [order_line3.ol_w_id,order_line3.ol_d_id,order_line3.ol_o_id,order_line3.ol_number] |
    |          │   ├─ filters: [{[1, 1], [5, 5], [2983, 3003), [NULL, ∞)}]                                       |
    |          │   └─ columns: [ol_o_id ol_d_id ol_w_id ol_i_id]                                                 |
    |          └─ Filter                                                                                         |
    |              ├─ ((stock3.s_w_id = 1) AND (stock3.s_quantity < 18))                                         |
    |              └─ IndexedTableAccess(stock3)                                                                 |
    |                  ├─ index: [stock3.s_w_id,stock3.s_i_id]                                                   |
    |                  ├─ columns: [s_i_id s_w_id s_quantity]                                                    |
    |                  └─ keys: 1, order_line3.ol_i_id                                                           |
    +------------------------------------------------------------------------------------------------------------+
    
  • 2121: fix panic when calling ST_POINTFROMWKB() with no arguments

vitess

Closed Issues

  • 6950: Cannot DROP column if it's named geometry
  • 6927: "value exceeded max field size of 65kb" when inserting large geometry
  • 6910: No value for column when attempting to insert a record without providing a value for a column with a default
  • 6946: Error when running large dump
  • 6951: Slow merge when new index was added (re-indexing every merge?)

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 2.11 2.71 1.3
groupby_scan 13.22 17.01 1.3
index_join 1.34 5.09 3.8
index_join_scan 1.27 2.18 1.7
index_scan 34.33 54.83 1.6
oltp_point_select 0.17 0.41 2.4
oltp_read_only 3.36 7.3 2.2
select_random_points 0.32 0.68 2.1
select_random_ranges 0.39 0.94 2.4
table_scan 34.33 55.82 1.6
types_table_scan 74.46 158.63 2.1
reads_mean_multiplier 2.0
Write Tests MySQL Dolt Multiple
oltp_delete_insert 7.98 6.79 0.9
oltp_insert 3.75 3.36 0.9
oltp_read_write 8.28 14.73 1.8
oltp_update_index 3.82 3.43 0.9
oltp_update_non_index 3.82 3.3 0.9
oltp_write_only 5.37 7.56 1.4
types_delete_insert 7.7 7.3 0.9
writes_mean_multiplier 1.1
Overall Mean Multiple 1.6

dolt - 1.24.2

Published by github-actions[bot] 12 months ago

Merged PRs

dolt

  • 6955: Optimize when secondary indexes need to be fully rebuilt after a merge
    At the end of a merge, Dolt was rebuilding some secondary indexes that did not need to be rebuilt. https://github.com/dolthub/dolt/issues/6951 shows one example where the destination side of a merge has an existing index that does not exist on the source side of the merge. Dolt's merge code correctly updates that secondary index in place as the merge diffs are processed, but we were still doing an unnecessary full rebuild on the secondary index after processing all the diffs for that table.
    Making this optimization uncovered another bug where column default values weren't being applied when merging a new row into a secondary index. This PR also fixes that bug, by ensuring that we remap a tuple and apply the column default value, before updating the secondary index with the new tuple.
  • 6953: Fix import bats test for existing table with FK constraints
    Fixes the bats test for overwriting data in an existing table with FK constraints to properly test the intended behavior.
    Related: https://github.com/dolthub/dolt/issues/2281
  • 6952: Importing NULL doesn't violate foreign key constraints
    Adds a test verifying that importing a NULL value doesn't violate foreign key constraints.
    Related: https://github.com/dolthub/dolt/issues/2108
  • 6947: Switch reflog data to be stored in a ring buffer
    Changes the in-memory reflog buffer to be a ring buffer, to limit how much memory is used.
    Although we do have an async.RingBuffer type already, it has slightly different use case. This implementation is more tailored to the reflog buffer's needs, including supporting iteration, and less locking and synchronization.
  • 6912: add warning message to dolt sql-client
    Adds a warning message to dolt sql-client that this command will be deprecated. Command will be fully deprecated in 2 weeks.
    Related: https://github.com/dolthub/dolt/issues/6886

go-mysql-server

  • 2121: fix panic when calling ST_POINTFROMWKB() with no arguments

Closed Issues

  • 2281: Overwriting an existing table using import -c -f does not remove foreign key constraints
  • 2108: NULL entries treated as foreign key constraint violations on import
  • 6950: Cannot DROP column if it's named geometry
  • 6900: support int1, int2, ... types
  • 4787: Feature Request: Include csv as a dolt diff format option.
  • 4930: Output 'conflicts cat' to csv
  • 5824: Dolt SQL REPL sometimes eats up preceding lines in terminal when query is being edited
  • 5849: Cloning database with new format tables to instance with old format tables causes errors
  • 6012: ERROR 1: MySQL error message:value exceeded max field size of 65kb
  • 6118: \G does not work with dolt sql-client
  • 6220: Differences between dolt sql results and dolt sql-server results
  • 6805: Dolt doesn't log any errors with PHP and Laravel when expecting a MySQL 5.7 communication protocol
  • 6453: Export to parquet terminates with SIGKILL on large repo
  • 6527: dolt version panics in some directories
  • 6709: dolt_merge() MySql return content missing column names since Dolt 1.11.1

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 2.14 2.71 1.3
groupby_scan 12.98 17.32 1.3
index_join 1.32 5.0 3.8
index_join_scan 1.27 2.18 1.7
index_scan 33.72 55.82 1.7
oltp_point_select 0.17 0.41 2.4
oltp_read_only 3.3 7.3 2.2
select_random_points 0.32 0.69 2.2
select_random_ranges 0.38 0.94 2.5
table_scan 33.72 55.82 1.7
types_table_scan 74.46 161.51 2.2
reads_mean_multiplier 2.1
Write Tests MySQL Dolt Multiple
oltp_delete_insert 7.98 6.79 0.9
oltp_insert 3.75 3.36 0.9
oltp_read_write 8.28 14.73 1.8
oltp_update_index 3.82 3.43 0.9
oltp_update_non_index 3.82 3.36 0.9
oltp_write_only 5.37 7.56 1.4
types_delete_insert 7.7 7.43 1.0
writes_mean_multiplier 1.1
Overall Mean Multiple 1.6

dolt - 1.24.1

Published by github-actions[bot] 12 months ago

Merged PRs

dolt

  • 6944: Enforce a configurable max size for reflog data
    I'll replace this with a ring buffer very soon, but I wanted to get something in place to prevent the slice from growing too large.
  • 6943: Adding an env var to disable reflog data tracking
    Allows customers to set DOLT_DISABLE_REFLOG to disable reflog data from being kept in memory.
  • 6940: Mention Doltgres in the README

Closed Issues

  • 3472: Indexing data inside JSON fields
  • 3089: Support Generated Columns

dolt - 1.24.0

Published by github-actions[bot] 12 months ago

This release contains backwards incompatible changes:

  • Chunk Journal Format – To support the new dolt_reflog() feature, a new timestamp field was added to Dolt's chunk journal format. This means Dolt databases stored on disk using Dolt versions 1.24.0 or higher cannot be opened with older Dolt versions. If you need to downgrade from Dolt 1.24.0 to an older version, you can stop any dolt sql-server that is running, then run dolt gc with Dolt 1.24.0 or higher, and then you will be able to use an older Dolt version.
  • Schema storage format – To support virtual and generated columns, Dolt now writes different information to its binary table schema format. This change is backwards compatible, but databases edited with this version of Dolt or later will not be usable with older versions of Dolt. Impacted users will need to upgrade to the latest Dolt binary.

Per Dolt’s versioning policy, this is a minor version bump because these changes may impact existing applications. Please reach out to us on GitHub or Discord if you have questions or need help with any of these changes.

Merged PRs

dolt

  • 6937: Adding support for listing all entries in dolt_reflog()
    Allows dolt_reflog() to be called without any arguments to list the contents of the reflog/chunk journal. Internal ref types and ref types that don't always point to a commit (i.e. working set refs) are still excluded.
    While testing this, I realized that we needed to change to iterate the chunk journal roots in order (instead of in reverse order), otherwise we don't identify the correct, first root where a ref was set to a new value and the ref updates don't appear in the right order in the reflog.
  • 6935: Remove bulk_insert CI benchmark
  • 6919: Partial support for virtual columns
    Merges don't work yet, coming in a separate PR.
    But storage format details should be final.
  • 6882: Feature: Dolt reflog support
    A new dolt_reflog() table function allows you to query Dolt's chunk journal to view the history of named refs, such as branches and tags.
    Usage Example:
    > select * from dolt_reflog('branch1');
    +--------------------+---------------------+----------------------------------+-------------------------------+
    | ref                | ref_timestamp       | commit_hash                      | commit_message                |
    +--------------------+---------------------+----------------------------------+-------------------------------+
    | refs/heads/branch1 | 2023-10-25 20:54:37 | v531ptpmv2tquig8v591tsjghtj84ksg | inserting row 42              |
    | refs/heads/branch1 | 2023-10-25 20:53:12 | rvt34lqrbtdr3dhnjchruu73lik4e398 | inserting row 100000          |
    | refs/heads/branch1 | 2023-10-25 20:53:06 | v531ptpmv2tquig8v591tsjghtj84ksg | inserting row 42              |
    | refs/heads/branch1 | 2023-10-25 20:52:43 | ihuj1l7fmqq37sjhtlrgpup5n76gfhju | inserting row 1 into table xy |
    +--------------------+---------------------+----------------------------------+-------------------------------+
    4 rows in set (0.00 sec)
    
    Documentation: https://github.com/dolthub/docs/pull/1815
    Compatibility:
    This change needs to be released as a minor version bump because it changes the chunk journal record format for root hash records to include a timestamp field. Older versions of Dolt's chunk journal reader aren't forward compatible to read this new format (but the chunk journal reader is backwards compatible to still read older journals). If a customer needs to downgrade after using a version with this support, they can stop their sql-server, run dolt gc to move the chunk journal content into oldgen, delete the chunk journal file if necessary (it should be removed automatically by dolt gc), and then restart their sql-server with an older Dolt version.
    Future Extensions:
    • root author – I started looking at recording the SQL user who created the new root value, but was hesitant to put too much data into the chunk journal records. This does seem like useful information to track though, so I'm still thinking about whether this is worth adding now or later.
    • reflog rows for deletion – Currently, dolt_reflog() results show when a ref is changed to a new value, but there is not a row in the result set marking when the ref is deleted. This would be shown as a NULL commit_hash and commit_message. Combining this with the point above would allow the reflog to show who deleted a ref.
    • show all refs – The current implementation requires a ref name and filters results on that ref. We should allow viewing the full reflog without filtering on a ref.
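
The ordering point from 6937 can be illustrated with a small sketch. This is not Dolt's chunk journal code: `rootRecord`, `reflogEntry`, and `buildReflog` are hypothetical names, and a root is modelled as a timestamp plus a map of ref values. It only shows why walking roots oldest-first attributes each change to the first root where the ref took its new value, before the rows are reversed for newest-first display.

```go
package main

import "fmt"

// rootRecord is a hypothetical, simplified stand-in for a chunk journal root
// record: a timestamp plus the ref values visible at that root.
type rootRecord struct {
	Timestamp string
	Refs      map[string]string // ref name -> commit hash
}

// reflogEntry is one row of the simplified reflog.
type reflogEntry struct {
	Ref, Timestamp, CommitHash string
}

// buildReflog walks roots oldest-first so each change is attributed to the
// first root where the ref took its new value, then returns rows newest-first.
func buildReflog(roots []rootRecord, ref string) []reflogEntry {
	var entries []reflogEntry
	last := ""
	for _, r := range roots {
		hash, ok := r.Refs[ref]
		if !ok {
			last = "" // ref absent at this root; a later value counts as new
			continue
		}
		if hash == last {
			continue // unchanged since the previous root, not a new reflog row
		}
		entries = append(entries, reflogEntry{Ref: ref, Timestamp: r.Timestamp, CommitHash: hash})
		last = hash
	}
	// Reverse in place so the most recent update is listed first.
	for i, j := 0, len(entries)-1; i < j; i, j = i+1, j-1 {
		entries[i], entries[j] = entries[j], entries[i]
	}
	return entries
}

func main() {
	// Made-up timestamps and hashes, listed in chronological order.
	roots := []rootRecord{
		{Timestamp: "20:52:43", Refs: map[string]string{"refs/heads/branch1": "hash-a"}},
		{Timestamp: "20:53:00", Refs: map[string]string{"refs/heads/branch1": "hash-a"}},
		{Timestamp: "20:53:06", Refs: map[string]string{"refs/heads/branch1": "hash-b"}},
	}
	for _, e := range buildReflog(roots, "refs/heads/branch1") {
		fmt.Println(e.Ref, e.Timestamp, e.CommitHash)
	}
}
```

Scanning in reverse would attribute `hash-a` to the 20:53:00 root rather than the 20:52:43 root where it was first set.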

Closed Issues

dolt - 1.23.0

Published by github-actions[bot] 12 months ago

This release contains backwards incompatible changes:

  • The behavior of the filter-branch command has changed. It requires a new option flag, --query. It no longer ignores some errors unless you provide the new --continue flag. Its output has been altered.

Per Dolt’s versioning policy, this is a minor version bump because these changes may impact existing applications. Please reach out to us on GitHub or Discord if you have questions or need help with any of these changes.

Merged PRs

dolt

  • 6936: More flexibility for filter-branch command
    Fixes https://github.com/dolthub/dolt/issues/6895
    The command can now process multiple queries by default, and ignores errors while executing them when the --continue flag is provided. This is a breaking change:
    1. Previously, certain errors were detected and ignored by default. Now this behavior is controlled by the --continue flag.
    2. Previously, the query to execute was specified as an argument. Now it's a flag, --query, or can be read via STDIN.
  • 6932: /.github/scripts/performance-benchmarking/get-mysql-dolt-job-json.sh: revert precision 3
  • 6923: fix empty Field in prolly.Range
    When creating prolly.Ranges for pointLookupPartitions, we never set the Field member variable. As a result, point lookups wouldn't return correctly.
    This impacts specifically keyless multi-arity lookups. Indexes on regular tables and single key keyless lookups were not impacted by this bug.
    Companion PR: https://github.com/dolthub/go-mysql-server/pull/2118
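
The error-handling change in 6936 can be sketched as follows. This is an illustration of the described policy only, not the filter-branch implementation: `runQueries` and `fakeExec` are hypothetical helpers, and the `continueOnError` parameter stands in for the --continue flag.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// fakeExec is a hypothetical executor used only for this sketch: it fails
// for any query that mentions a table called "missing".
func fakeExec(query string) error {
	if strings.Contains(query, "missing") {
		return errors.New("table not found")
	}
	return nil
}

// runQueries executes each query in order. Without continueOnError it stops
// at the first failure; with it, the error is recorded and the run goes on.
func runQueries(queries []string, exec func(string) error, continueOnError bool) (ran int, errs []error) {
	for _, q := range queries {
		ran++
		if err := exec(q); err != nil {
			errs = append(errs, fmt.Errorf("%q: %w", q, err))
			if !continueOnError {
				return ran, errs
			}
		}
	}
	return ran, errs
}

func main() {
	queries := []string{
		"DELETE FROM t WHERE pk = 1",
		"DELETE FROM missing WHERE pk = 1",
		"DELETE FROM t WHERE pk = 2",
	}
	ran, errs := runQueries(queries, fakeExec, false)
	fmt.Println("default: ran", ran, "queries,", len(errs), "error(s)")
	ran, errs = runQueries(queries, fakeExec, true)
	fmt.Println("continue: ran", ran, "queries,", len(errs), "error(s)")
}
```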

go-mysql-server

  • 2120: Prevent virtual columns from being used in primary keys
    Also tests for stored generated columns in primary keys
  • 2116: Grant Options privs need the AdminOnly treatment too
    This addresses a gap discovered while writing dolt tests - Grant Option on procedures is not currently validated correctly, resulting in only super users being able to set grants on procedures. This should address that.
  • 2115: Fix ExistsSubquery with functions
    Our hoistSelectExists optimization incorrectly generates SemiJoins when there are OuterScope column references in projections in subqueries. In the future, a possible optimization could be to have SemiLateralJoins that properly grant this visibility.
    Also contains a small refactoring and extra debug information for the coalesce function.
    Fixes one of the queries here: https://github.com/dolthub/dolt/issues/6898

Closed Issues

  • 6895: More flexibility for dolt filter-branch
  • 4840: Support PostgreSQL

Latency

Read Tests              MySQL   Dolt     Multiple
covering_index_scan     2.11    2.76     1.3
groupby_scan            13.22   17.32    1.3
index_join              1.32    5.0      3.8
index_join_scan         1.25    2.18     1.7
index_scan              33.72   55.82    1.7
oltp_point_select       0.17    0.42     2.5
oltp_read_only          3.36    7.43     2.2
select_random_points    0.32    0.69     2.2
select_random_ranges    0.39    0.94     2.4
table_scan              34.33   55.82    1.6
types_table_scan        74.46   161.51   2.2
reads_mean_multiplier                    2.1

Write Tests             MySQL   Dolt     Multiple
bulk_insert             0.0     0.0      0.0
oltp_delete_insert      7.98    6.79     0.9
oltp_insert             3.75    3.36     0.9
oltp_read_write         8.28    14.73    1.8
oltp_update_index       3.82    3.36     0.9
oltp_update_non_index   3.82    3.3      0.9
oltp_write_only         5.28    7.56     1.4
types_delete_insert     7.7     7.17     0.9
writes_mean_multiplier                   1.0

Overall Mean Multiple   1.6

dolt - 1.22.0

Published by github-actions[bot] 12 months ago

This release contains backwards incompatible changes:

  • Dolt stored procedures which are administrative in nature or require remote authentication are no longer available to SQL users with only the Execute permission on the database. Users who could previously run dolt_fetch(), dolt_pull(), and dolt_push() will need to have explicit grants to the procedures they need access to. See Documentation for more specifics.

Per Dolt’s versioning policy, this is a minor version bump because these changes may impact existing applications. Please reach out to us on GitHub or Discord if you have questions or need help with any of these changes.

Merged PRs

dolt

  • 6917: Dolt admin procedures
    Currently we encourage our users to grant execute permissions at the DB level, which gives users the ability to run somewhat privileged operations such as dolt_gc and dolt_push. This change marks sensitive procedures such that explicit grants are required for sensitive features.
    Related: https://github.com/dolthub/go-mysql-server/pull/2110
  • 6916: dependabot squash
    GRPC upgrades, from dependabot.
  • 6909: Remove schema overlap check for tables with same name
    This change takes out the schema check for tables with the same name that get modified. Previously, only tables that get dropped and re-added with the same name and same schema would get labelled as modified. Now, tables that get dropped and re-added with the same name regardless of schema will get labelled as modified.
    Fixes: https://github.com/dolthub/dolt/issues/5738
  • 6907: change CLI reference header
    Updates the header generated by dump-docs
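
The effect of the admin procedures change (6917) can be sketched as a permission check. This is a simplified model, not go-mysql-server's privilege code: the `grants` type, `canExecute`, and the `my_proc` procedure are hypothetical, and super users are ignored. It assumes only what the notes state: a database-level Execute grant no longer covers procedures marked admin-only, which need an explicit procedure-level grant.

```go
package main

import "fmt"

// grants is a hypothetical, flattened view of one user's EXECUTE privileges.
type grants struct {
	dbExecute   map[string]bool // database name -> EXECUTE granted on the whole database
	procExecute map[string]bool // "db.procedure" -> EXECUTE granted on that procedure
}

// canExecute reports whether the user may call db.proc. An explicit
// procedure-level grant always suffices. A database-level grant suffices
// only when the procedure is not marked admin-only.
func canExecute(g grants, db, proc string, adminOnly bool) bool {
	if g.procExecute[db+"."+proc] {
		return true
	}
	if adminOnly {
		return false // the database-level grant is deliberately not consulted
	}
	return g.dbExecute[db]
}

func main() {
	user := grants{dbExecute: map[string]bool{"mydb": true}}

	// my_proc is a made-up, ordinary procedure; dolt_push is treated as admin-only.
	fmt.Println(canExecute(user, "mydb", "my_proc", false))  // true
	fmt.Println(canExecute(user, "mydb", "dolt_push", true)) // false

	// After an explicit procedure-level grant, the admin-only call is allowed.
	user.procExecute = map[string]bool{"mydb.dolt_push": true}
	fmt.Println(canExecute(user, "mydb", "dolt_push", true)) // true
}
```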

go-mysql-server

  • 2116: Grant Options privs need the AdminOnly treatment too
    This addresses a gap discovered while writing dolt tests - Grant Option on procedures is not currently validated correctly, resulting in only super users being able to set grants on procedures. This should address that.
  • 2113: Allow ScriptTestAssertion to specify when an assertion needs a new session
    As part of testing Dolt's reflog feature, I need to call dolt_gc() and then check the reflog behavior. Because dolt_gc() invalidates the session, I needed to add this hook that signals the test framework to create a new session for a ScriptTestAssertion.
  • 2110: Procedure privileges
    Adds support for respecting procedure and function permissions. Also added AdminOnly flag for external procedures to indicate that they should not have their privileges evaluated in the standard MySQL hierarchical way. This will allow us to tighten dolt procedures access.
    The ability to grant access is still blocked behind an environment variable. That will remain until dolt changes have been released.
  • 2109: adding join and subquery tests
    Convert many of the sqllogictests into enginetests for visibility
  • 2108: Push not filters
    Applies De Morgan's laws and leaf filter inversions to push NOT expressions as low in filter trees as possible. This will make index costing for NOT filters easier.
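
The rewrite described in 2108 can be sketched on a toy expression tree. This is an illustration of the general technique only, not the go-mysql-server analyzer rule: the `expr` types, the `inverse` table, and `pushNot` are hypothetical. NOT over AND/OR is distributed with De Morgan's laws, and a NOT that reaches a comparison leaf is absorbed by inverting the operator.

```go
package main

import "fmt"

// expr is a hypothetical, minimal filter expression tree.
type expr interface{ String() string }

type cmp struct {
	col, op string
	val     int
}
type notExpr struct{ child expr }
type andExpr struct{ left, right expr }
type orExpr struct{ left, right expr }

func (c cmp) String() string     { return fmt.Sprintf("%s %s %d", c.col, c.op, c.val) }
func (n notExpr) String() string { return "NOT (" + n.child.String() + ")" }
func (a andExpr) String() string { return "(" + a.left.String() + " AND " + a.right.String() + ")" }
func (o orExpr) String() string  { return "(" + o.left.String() + " OR " + o.right.String() + ")" }

// inverse maps each comparison operator to its logical complement.
var inverse = map[string]string{
	"=": "!=", "!=": "=", "<": ">=", ">=": "<", ">": "<=", "<=": ">",
}

// pushNot rewrites e so NOT sits as low as possible. negate records whether
// an odd number of NOTs has been seen above the current node.
func pushNot(e expr, negate bool) expr {
	switch n := e.(type) {
	case notExpr:
		return pushNot(n.child, !negate) // absorb the NOT into the flag
	case andExpr:
		if negate { // NOT (a AND b) == (NOT a) OR (NOT b)
			return orExpr{pushNot(n.left, true), pushNot(n.right, true)}
		}
		return andExpr{pushNot(n.left, false), pushNot(n.right, false)}
	case orExpr:
		if negate { // NOT (a OR b) == (NOT a) AND (NOT b)
			return andExpr{pushNot(n.left, true), pushNot(n.right, true)}
		}
		return orExpr{pushNot(n.left, false), pushNot(n.right, false)}
	case cmp:
		if !negate {
			return n
		}
		if inv, ok := inverse[n.op]; ok {
			return cmp{n.col, inv, n.val} // leaf inversion
		}
		return notExpr{n} // no known inverse: keep NOT directly on the leaf
	default:
		if negate {
			return notExpr{e}
		}
		return e
	}
}

func main() {
	in := notExpr{andExpr{cmp{"x", "=", 1}, orExpr{cmp{"y", "<", 5}, notExpr{cmp{"z", ">", 2}}}}}
	fmt.Println(in)                 // NOT ((x = 1 AND (y < 5 OR NOT (z > 2))))
	fmt.Println(pushNot(in, false)) // (x != 1 OR (y >= 5 AND z > 2))
}
```

After the rewrite every leaf is a plain comparison, which is the form that is easier to cost against an index.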

Closed Issues

  • 5738: dolt table import with create and force option fails to modify table in some instances
  • 6891: Dolt panics querying dolt_history_$table

dolt - 1.21.5

Published by github-actions[bot] 12 months ago

Merged PRs

dolt

  • 6893: Bug fix for dolt_history_ system tables
    When filtering on an indexed column from the primary index (i.e. a pk column), if a column in the underlying table had been modified and its tag had changed in previous versions, we weren't able to create the lookup builder, which caused a segfault. This PR changes the lookup to use the primary index as a covering index, since it's the closest we have. This is also consistent with how we had implemented Noms range lookups.
    Related to: https://github.com/dolthub/dolt/issues/6891
  • 6883: Remotes: AWS: Fix a bug where uploading table files to S3 could have unbounded memory usage.
  • 6871: Remove skip on now passing test for multi-db joins on tables with the same name
    Unskipped a bats test I added for a join across multiple databases on tables of the same name that is now fixed.

go-mysql-server

  • 2102: view aliasing bug
    The View.ViewExpr AST object seems to drop aliasing information that we depend on for query schema presentation. I want to circle back to have the view AST expression be consistent with the view definition, but this forces a re-parsing to get the correct view schema.
  • 2100: Refactor of the NewPrivilegedOperation method
    Adding support for routines is coming, and a bunch of string arguments is cumbersome.
    Note to reviewer: start at sql/privileges.go!
  • 2099: Add the mysql.procs_priv table
    In order to support procedure and function permissions, the mysql.procs_priv table needs to be supported. This change adds support, but gates the ability to create these grants because they are not being used for actual permission checks yet. That will come next.
    I have engine tests on another branch due to the gate. I'll add them when the feature is complete. There are also bats tests in flight in the dolt repo which can be seen here:
    https://github.com/dolthub/dolt/compare/4d8fe2a8757ef97503a172138b659980bdd2eaac...macneale4/privs_tests_wip
  • 2098: Logging improvements for prepared statements
    Updating from mysql.Handler changes in 7da194ad69efa0e4cc3992f860ecb92550482d62 – adding debug logging for ComPrepare and including params count, other minor logging cleanup.
    Related Vitess PR: https://github.com/dolthub/vitess/pull/286
  • 2095: give child SQA isLateral if parent SQA isLateral
    If a parent SubqueryAlias is marked IsLateral, then mark its child SubqueryAliases with IsLateral as well.
    This essentially gives the child SubqueryAlias visibility to the left subtree all the time, which fixes the linked issue.
    We should be able to differentiate between different scopes of lateral joins and only grant visibility to those columns, which could be a subset of the parent/left columns. However, I don't think we are close to getting that working.
    fixes https://github.com/dolthub/dolt/issues/6843
  • 2094: Add support for view column clause
    re: https://github.com/dolthub/vitess/pull/285
    closes: https://github.com/dolthub/dolt/issues/6859
  • 2083: Virtual column index support
    Implements index support for virtual columns, and fixes several bugs related to generated columns for certain statements.
    Changes the memory table index implementation to actually store secondary indexes separately, rather than fake it via the primary index. The indexes are stored via sorted slices for now. I'll do another pass and replace them, and primary storage, with btrees for speed now that I have this proof of concept working.
    Also introduces a new interface for rebuilding a single index, rather than doing a complete table rewrite every time an index is created.
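
The sorted-slice approach mentioned in 2083 can be sketched as follows. This is not the go-mysql-server memory table code: `indexEntry`, `sortedIndex`, and their methods are hypothetical, and keys are plain integers. It shows a secondary index kept separately from primary storage as a slice ordered by (key, primary key), with binary search for both insertion and lookup.

```go
package main

import (
	"fmt"
	"sort"
)

// indexEntry maps one secondary-index key to the primary key of its row.
type indexEntry struct {
	Key int // indexed column value
	PK  int // primary key of the row holding that value
}

// sortedIndex is a hypothetical secondary index kept as a slice ordered by
// (Key, PK), stored separately from the table's primary storage.
type sortedIndex struct {
	entries []indexEntry
}

func less(a, b indexEntry) bool {
	if a.Key != b.Key {
		return a.Key < b.Key
	}
	return a.PK < b.PK
}

// Insert places e at its sorted position: binary search finds the slot, then
// the tail is shifted right by one. The shift makes each insert O(n).
func (ix *sortedIndex) Insert(e indexEntry) {
	i := sort.Search(len(ix.entries), func(i int) bool { return !less(ix.entries[i], e) })
	ix.entries = append(ix.entries, indexEntry{})
	copy(ix.entries[i+1:], ix.entries[i:])
	ix.entries[i] = e
}

// Lookup returns the primary keys of all rows whose indexed value equals key.
func (ix *sortedIndex) Lookup(key int) []int {
	i := sort.Search(len(ix.entries), func(i int) bool { return ix.entries[i].Key >= key })
	var pks []int
	for ; i < len(ix.entries) && ix.entries[i].Key == key; i++ {
		pks = append(pks, ix.entries[i].PK)
	}
	return pks
}

func main() {
	var ix sortedIndex
	for _, e := range []indexEntry{{20, 1}, {10, 2}, {20, 3}, {10, 4}} {
		ix.Insert(e)
	}
	fmt.Println(ix.Lookup(20)) // [1 3]
	fmt.Println(ix.Lookup(15)) // []
}
```

The linear shift on insert is one reason a tree structure, such as the btrees the PR mentions as a follow-up, tends to scale better for write-heavy tables.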

vitess

  • 285: Add create view with columns
    ex:
    create view v_today(today) as select CURRENT_DATE()
    
  • 284: Include the With clause in walked subtrees for Select statements
    We weren't walking the With clause for select statements, which caused us to not find any bind vars in use there.
    Related to: https://github.com/dolthub/dolt/issues/6852

Closed Issues

  • 6590: Enhancement suggestion (low priority): add an equivalent to git bundle
  • 6843: lateral recursive cte join error
  • 6711: VisualDB client errors
  • 2279: In multiple database mode, two tables having the same name is an issue on JOINs

Latency

Read Tests              MySQL   Dolt     Multiple
covering_index_scan     2.14    2.76     1.3
groupby_scan            13.22   17.32    1.3
index_join              1.32    4.41     3.3
index_join_scan         1.25    2.18     1.7
index_scan              34.33   55.82    1.6
oltp_point_select       0.17    0.41     2.4
oltp_read_only          3.36    7.3      2.2
select_random_points    0.32    0.68     2.1
select_random_ranges    0.39    0.92     2.4
table_scan              34.33   55.82    1.6
types_table_scan        74.46   164.45   2.2
reads_mean_multiplier                    2.0

Write Tests             MySQL   Dolt     Multiple
bulk_insert             0.001   0.001    1.0
oltp_delete_insert      5.37    5.99     1.1
oltp_insert             2.71    2.86     1.1
oltp_read_write         7.3     14.21    1.9
oltp_update_index       2.76    2.97     1.1
oltp_update_non_index   2.86    2.91     1.0
oltp_write_only         3.82    7.04     1.8
types_delete_insert     5.28    6.32     1.2
writes_mean_multiplier                   1.3

Overall Mean Multiple   1.7