polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

OTHER License

Downloads: 9.7M
Stars: 26.3K
Committers: 213

polars - Python Polars 0.16.5

Published by github-actions[bot] over 1 year ago

🚀 Performance improvements

  • speedup quantile/median ~2x (#6861)
  • remove unneeded series allocations in groupby aggs (#6855)

✨ Enhancements

  • restore dataframe class (#6869)
  • add include_index option on init from pandas frames (#6847)
  • properly implement null array (#6817)
  • avoid panic error in strftime with invalid format (#6810)

🐞 Bug fixes

  • fix crash in write_csv when mixed tz-naive and tz-aware datetimes are present (#6828)
  • accept more types in groupby.agg (#6709)
  • Fix pl.from_dataframe() as pyarrow.interchange was not i… (#6844)
  • fix schema of functions: (#6845)
  • stabilize integer operation to minimal required dtype (#6841)
  • use explicit type-arg for PythonDataType (#6481)
  • fix numpy/datetime regression (#6835)
  • implement to_list for null dtype (#6834)
  • Raise ValueError on passing multiple expressions to a NumPy ufunc (#6821)
  • respect schema in ndjson (#6819)

🛠️ Other improvements

  • Fail tests on warning (#6868)
  • further improve struct expr docstrings (#6852)
  • Deprecate non-keyword args for some functions (#6851)
  • un-skip passing test (#6854)
  • parenthesise col type signature to improve hint interaction with PyCharm (#6850)
  • Deprecate positional join args (#6826)
  • Rename argsort/argsort_by to arg_sort/arg_sort_by (#6829)
  • Update dprint config excludes (#6822)
  • Fix some broken noqa comments (#6823)
  • Run mypy as part of the lint workflow (#6820)
  • various minor docstring rendering fixes (#6818)
  • fix lazy groupby docstring/rendering (#6816)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @ghuls, @josh, @kngwyu, @oysols, @ritchie46, @stinodego and @zundertj

polars - Python Polars 0.16.4

Published by github-actions[bot] over 1 year ago

🚀 Performance improvements

  • remove PySequence downcast (#6803)
  • optimize arg_min/arg_max (#6799)

✨ Enhancements

  • boolean Series broadcast comparison (eq/neq) against scalar True/False (#6797)

🐞 Bug fixes

  • ensure join frame types are consistent (#6798)
  • enable empty DataFrame (and Series) init from List of Structs/Lists schema/dtype (#6795)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @igmriegel and @ritchie46

polars - Rust Polars 0.27.0

Published by github-actions[bot] over 1 year ago

🏆 Highlights

  • Formalize list aggregation difference between groupbys, selection and window functions (#6487)

⚠️ Breaking changes

  • error on string <-> date cmp (#6498)
  • Formalize list aggregation difference between groupbys, selection and window functions (#6487)
  • show where error messages originated (#6482)
  • str.strip with multiple chars (#5929)
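
Note on "str.strip with multiple chars": the entry does not spell out the new behaviour. As a rough analogy only (plain Python, not Polars code), and assuming the change treats the argument as a set of characters the way Python's built-in `str.strip(chars)` does, the semantics can be sketched as follows. The helper name `strip_chars` is hypothetical.

```python
def strip_chars(value: str, chars: str) -> str:
    """Remove any leading/trailing characters that appear in `chars`.

    `chars` is treated as a set of characters, not as a prefix or suffix
    string. This mirrors Python's str.strip(chars); it is an assumption
    that the Polars change behaves the same way.
    """
    charset = set(chars)
    start, end = 0, len(value)
    # Advance past leading characters that belong to the set.
    while start < end and value[start] in charset:
        start += 1
    # Walk back over trailing characters that belong to the set.
    while end > start and value[end - 1] in charset:
        end -= 1
    return value[start:end]


if __name__ == "__main__":
    print(strip_chars("xxyhelloyx", "xy"))
```

Under this reading, the order of the characters in the argument does not matter, and characters inside the string are never removed.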

🚀 Performance improvements

  • update string replacement codepaths following new benchmarking (#6777)
  • improve dynamic groupby performance on sorted keys (#6599)
  • faster frame-init from list of dicts (when omitting fields), and ensure fields are read according to the declared schema (#6472)
  • Improve rechunk check (#6268)
  • reuse allocated scratches in ipc writer (#6287)
  • use dedicated writer thread for sink_parquet (#6285)
  • first check rev-map on categorical equality check (#6085)
  • ensure set_at_idx is O(1) (#5977)
  • use iterator instead of loop polars_io::csv::parser::skip_condition (#5157)

✨ Enhancements

  • accept separator for pivot and to_dummies (#6780)
  • rename 'tz' to 'time_zone' in convert_time_zone and replace_time_zone (#6784)
  • rename with_time_zone to convert_time_zone and cast_time_zone to replace_time_zone (#6768)
  • support timezone in csv writer (#6722)
  • implement series abstractions for Int128Type (#6679)
  • parse timezone from Datetime (#6766)
  • formally support duration division (#6758)
  • add argmin/max for utf8 data (#6746)
  • Support an ignore_nulls param for EWM calculations. (#5749) (#6742)
  • deprecate tz_localize (#6693)
  • guarantee schema-stable col(dtype) selection (#6674)
  • better-characterise NotFound exceptions (#6670)
  • disallow with_time_zone from/to tz-naive (#6659)
  • let cast_time_zone work on tz-naive and deprecate tz-localize (#6649)
  • implement fill_null for list data (#6635)
  • expression functions should be nullable (#6629)
  • add streamable udfs (#6614)
  • is_first for struct dtype (#6595)
  • Added from_str_radix method to StringNameSpace for parsing strings in any base to i32 (#6570)
  • improve predicate pushdown (#6579)
  • raise error on invalid binary cmp (#6564)
  • let cast_time_zone accept None (#6539)
  • add utc parameter to strptime (#6496)
  • add meta 'has_multiple_outputs', 'is_regex_projec… (#6500)
  • error on string <-> date cmp (#6498)
  • show where error messages originated (#6482)
  • faster frame-init from list of dicts (when omitting fields), and ensure fields are read according to the declared schema (#6472)
  • allow expr in str.contains (#6443)
  • add float formatting option (#6432)
  • allow expressions as arguments in str.ends_with (#6361)
  • accept expr in str.starts_with (#6355)
  • add strict parameter to decoding expressions (#6342)
  • allow unordered struct creating from anyvalues (#6321)
  • parse abbrev month name (#6314)
  • add dt.combine for combining date and time components (#6121)
  • add sink_ipc (#6286)
  • ensure ooc sort works ooc with all-constant values (#6235)
  • The 1 billion row sort (#6156)
  • optionally treat missing UTF8 values as the empty string at CSV parse-time (#6203)
  • When moving error out of LogicalPlan, leave behind String with error message instead of None (#6199)
  • generalize the cloud storage builders (#5972)
  • Implement DataFrame.unique(keep="none") (#6169)
  • add arr.take expression (#6116)
  • allow extend_constant to work with date literals (#6114)
  • allow nested categorical cast (#6113)
  • add a rounded_corners modifier to pl.Config.set_tbl_formatting (#6108)
  • Get polars to compile to wasm target (#6050)
  • add search_sorted for arrays and utf8 dtype (#6083)
  • improve error message when writing nested data to… (#6040)
  • updated default table format from "UTF8_FULL" to "UTF8_FULL_CONDENSED" (#5967)
  • str.strip with multiple chars (#5929)
  • support glob in parquet object_storage (#5928)
  • read decimal as f64 (#5938)
  • improve query plan scan formatting (#5937)
  • allow all null cast (#5933)
  • truncate by calendar weeks (#5759)
  • merge sorted dataframes (#5817)
  • impl hex and base64 for binary (#5892)
  • streaming parquet from object_stores (#5871)

🐞 Bug fixes

  • always rechunk if n_chunks > n_rows (#6786)
  • fix ndjson empty array parsing (#6785)
  • make some list expressions aware of groupby context (#6776)
  • use explicit drop function node (#6769)
  • don't set sorted flag if we reverse sort the left … (#6772)
  • handle edge-case with string-literal replacement when the replace value looks like a capture group (#6765)
  • respect skip_rows in glob parsing csv (#6754)
  • Improve error message in DataFrame constructor (#6715)
  • arrow map dtype conversion (#6732)
  • dedicated rename implementation. (#6688)
  • return correct display/repr names for NaN-related expressions (#6721)
  • strftime with time zone directive (#6673)
  • improve error message in date_range with invalid units (#6671)
  • remove uses of rayon global thread pool (#6682)
  • true-divide output type (#6665)
  • cast to and from fixed offsets (#6602)
  • raise error on string numeric arithmetic (#6601)
  • partially assert sortedness in groupby dynamic (#6593)
  • raise oob if negative index given to take (#6590)
  • fix predicate pushdown key check (#6577)
  • fix schema of apply with many inputs on empty df (#6571)
  • let lhs determine struct order in supertype (#6572)
  • validate utc, fmt, and tz-aware in strptime (#6550)
  • add strptime to filter boundary (#6560)
  • list eval all null array (#6545)
  • implement ser/de for BinaryChunked (#6543)
  • raise if tz_localize called on UTC-aware (#6526)
  • make concat_list group aware (#6527)
  • error on invalid expanding expression (#6521)
  • create from dicts directly as struct categorical (#6520)
  • fix oob in arr.get by expressions (#6519)
  • fix cse schema (#6518)
  • panic when max_len -1 is reached (#6494)
  • Formalize list aggregation difference between groupbys, selection and window functions (#6487)
  • validate tz in with_time_zone (#6417)
  • faster frame-init from list of dicts (when omitting fields), and ensure fields are read according to the declared schema (#6472)
  • use consistent floor division for floats/ints (#6460)
  • split semi/anti join optimization (#6459)
  • fix doc comment in ParallelStrategy (#6444)
  • fix projection pushdown on double semi join (#6440)
  • cumulative_eval ensure output dtype is respected (#6435)
  • auto-detect %+ as tz-aware (#6434)
  • correct error message in cast_time_zone (#6411)
  • only use float simd on specific alignment (#6427)
  • no early escape when window is equal to len in rolling_float (#6408)
  • raise error on invalid sort_by argument (#6382)
  • take offset into account with str.explode (#6384)
  • Return empty batch for pl.read_csv_batched().next_… (#6381)
  • implement ser/de for StructChunked (#6359)
  • series of empty structs (#6347)
  • don't cast nulls before trying normal cast (#6339)
  • expand all nested wildcards in functions (#6334)
  • fix groupby rolling by_key if groups are empty (#6333)
  • parse abbrev month name (#6314)
  • disallow alias in inline join expressions (#6312)
  • feature flag "get_sink" ipc (#6306)
  • block proj-pd and pred-pd on swapping rename (#6303)
  • convert nested dictionary with i64 keys (#6299)
  • fix panic dynamic_groupby on empty dataframe (#6294)
  • Parse negative dates with polars parser (#6256)
  • Add list inner dtype when printing Series (#6233)
  • fix when then otherwise with arity and aggregation… (#6224)
  • pass name to value counts in aggregation (#6221)
  • don't set fast_explode on list of structs (#6220)
  • explode of empty nullable list (#6190)
  • fix empty streaming joins (#6149)
  • fix streaming joins where the join order has been … (#6143)
  • write tz-aware datetimes to csv (#6135)
  • Print error message on mmap IPC file only in verbose mode (#6098)
  • fix invalid dtype in chunked array after struct cast (#6093)
  • don't run cse cache_states if no projections found (#6087)
  • Update read_csv error message (#6082)
  • propagate nulls in binary arithmetic/aggregation (#6076)
  • deal with unnest schema expansion in projection pd (#6063)
  • correct output dtype for cummin/cumsum/cummax (#6062)
  • block streaming on literal series/range (#6058)
  • ndjson struct inference (#6049)
  • deal with empty structs (#6039)
  • fix aggregation that filters out all data (#6036)
  • fix diff overflow (#6033)
  • keep column names in is_null/is_not_null (#6032)
  • keep name when sorting categorical in lexical order (#6029)
  • properly set null anyvalue if categorical is neste… (#6025)
  • make weekday tz-aware (#5989)
  • fix categorical in struct anyvalue issue (#5987)
  • fix invalid boolean simplification (#5976)
  • allow empty sort on any dtype (#5975)
  • properly deal with categoricals in streaming queries (#5974)
  • don't panic on ignored context (#5958)
  • don't allow named expression in arr.eval (#5957)
  • fix panic in join expressions (#5954)
  • block ordered predicates before explode (#5951)
  • adhere to schema in arr.eval of empty list (#5947)
  • fix arrow nested null conversion (#5946)
  • allow None in arr.slice length (#5934)
  • fix time to duration cast (#5932)
  • error on addition with datetime/time (#5931)
  • don't create categoricals in streaming (#5926)
  • object filter should keep single chunk (#5913)
  • csv, read escaped "" as missing (#5912)
  • fix pivot of signed integers (#5909)
  • fix latest oob in streaming conversion (#5902)
  • fix date + duration offsets outside of nanosecond datetime bounds (#5889)
  • adapt k to len in topk (#5888)

🛠️ Other improvements

  • propagate error in date_range with invalid time zone (#6759)
  • update arrow to 0.16 (#6748)
  • remove unreachable path in write_anyvalue (#6727)
  • add groupby_dynamic to docs (#6725)
  • disallow chunked datetime with_time_zone on tz-naive, remove unnecessary with_time_zone (#6681)
  • update required Rust version from 1.58 to 1.62 (#6680)
  • add test for raising error in apply (#6664)
  • Minor documentation fix (#6657)
  • Add release flow info to contributing guide (#6480)
  • address todo and use regex in tz_aware check (#6479)
  • Address chrono deprecation warnings (#6478)
  • fix doc comment in ParallelStrategy (#6444)
  • move binary to polars-ops (#6401)
  • fix a typo in csv read example (#6389)
  • remove roundtrip to builder (#6383)
  • update rustc to 2023-01-19 (#6341)
  • run cse optimization only if joins and caches… (#6337)
  • update base64 requirement from 0.13 to 0.21 (#6249)
  • Remove benches and criterion dependency (#6273)
  • update chrono-tz requirement from 0.6 to 0.8 (#6255)
  • Enable Dependabot (#5036)
  • Add missing feature attributes for csv-file (#6229)
  • don't set aggregated flag on null propagated aggregation. (#6191)
  • Revert "Use auto_doc_cfg" (#6164)
  • remove vertical take (#6112)
  • add single threaded sort internally (#6103)
  • mark from_chunks as unsafe (#6094)
  • replace exact instances of Option/Result combinators (#6088)
  • ensure reverse indices exist in global string cache (#5970)
  • refactored describe (#5922)
  • don't decode into utf8 (#5910)
  • remove unused deps (#5903)

Thank you to all our contributors for making this release possible!
@2-5, @AnatolyBuga, @ChayimFriedman2, @MarceColl, @MarcoGorelli, @MatveyF, @abalkin, @alexander-beedie, @c-peters, @cannero, @chitralverma, @cojmeister, @dannyvankooten, @dependabot, @dependabot[bot], @flowlight0, @gab23r, @gam-phon, @ghuls, @gitkwr, @huitseeker, @jgmartin, @jjerphan, @johngunerli, @josh, @jvanbuel, @n8henrie, @ozgrakkurt, @papparapa, @phaile2, @plaflamme, @rben01, @ritchie46, @romanovacca, @ropoctl, @sorhawell, @stinodego, @universalmind303, @winding-lines, @yuntai and @zundertj

polars - Python Polars 0.16.3

Published by github-actions[bot] over 1 year ago

✨ Enhancements

  • add update method to ldf/df (#6787)
  • accept separator for pivot and to_dummies (#6780)
  • rename 'tz' to 'time_zone' in convert_time_zone and replace_time_zone (#6784)
  • Allow other expressions for default arg in map_dict (#6781)
  • minor ergonomic affordance; allow pl.concat from generator expression (#6779)
  • rename with_time_zone to convert_time_zone and cast_time_zone to replace_time_zone (#6768)
  • Add map_dict expression. (#5899)
  • support timezone in csv writer (#6722)
  • default to 1d interval in date_range (#6771)
  • parse timezone from Datetime (#6766)
  • Add option to use PyArrow backed-extension arrays when … (#6756)
  • formally support duration division (#6758)
  • add argmin/max for utf8 data (#6746)
  • Improve numpy support: conversion of numpy arrays with … (#6738)
  • Improved assert equal messages (#6737)
  • Support an ignore_nulls param for EWM calculations. (#5749) (#6742)
  • scan_ds predicate pushdown for string cmp (#6734)
  • don't require pyarrow for utf8 -> numpy conversion (#6733)
  • More ergonomic with_columns args (#6686)
  • Add return_as_string arg to DF.glimpse; default=False (#6678)
  • better-characterise NotFound exceptions (#6670)
  • disallow with_time_zone from/to tz-naive (#6659)
  • More ergonomic select args (#6667)
  • let cast_time_zone work on tz-naive and deprecate tz-localize (#6649)
  • improved exceptions on attempt to use invalid schema/dtypes (#6653)

🐞 Bug fixes

  • always rechunk if n_chunks > n_rows (#6786)
  • fix ndjson empty array parsing (#6785)
  • make some list expressions aware of groupby context (#6776)
  • use explicit drop function node (#6769)
  • don't set sorted flag if we reverse sort the left … (#6772)
  • handle edge-case with string-literal replacement when the replace value looks like a capture group (#6765)
  • respect skip_rows in glob parsing csv (#6754)
  • Improve error message in DataFrame constructor (#6715)
  • arrow map dtype conversion (#6732)
  • respect 'None' in from_dicts (#6726)
  • dedicated rename implementation. (#6688)
  • return correct display/repr names for NaN-related expressions (#6721)
  • strftime with time zone directive (#6673)
  • typing for Series methods that can return None (#6690)
  • ensure that iter_rows always returns all values from all chunks/batches in accelerated codepath (#6708)
  • Support numpy ufunc when expression not first arg (#6675)
  • Raise ValueError on adding float to Series of dtype date (#6677)
  • remove uses of rayon global thread pool (#6682)
  • true-divide output type (#6665)
  • improve behaviour of dict-expansion (scalars) when mixed with numpy arrays (#6663)
  • Preserve Expr name in is_between (#6661)
  • Tiny improvement of Field repr (#6640)

🛠️ Other improvements

  • Update mypy to version 1.0.0 (#6744)
  • integrate ignore_nulls into EWM parametric tests (#6751)
  • redirect tz_localize (#6749)
  • Reorganize benchmark test folder (#6695)
  • Split long test modules (namespaces) (#6668)
  • Use pytest marker for slow tests (#6642)
  • unify nan_to_null and nan_to_none parameter names, expose to DataFrame init, add test coverage (#6637)
  • update extend_constant docs/typing (and test coverage) (#6646)

Thank you to all our contributors for making this release possible!
@AnatolyBuga, @MarcoGorelli, @MatveyF, @alexander-beedie, @ghuls, @jgmartin, @phaile2, @plaflamme, @ritchie46, @sorhawell, @stinodego, @yuntai and @zundertj

polars - Python Polars 0.16.2

Published by github-actions[bot] over 1 year ago

🚀 Performance improvements

  • improve dynamic groupby performance on sorted keys (#6599)

✨ Enhancements

  • implement fill_null for list data (#6635)
  • expression functions should be nullable (#6629)
  • Implement unary plus operation on exprs and series (#6517)
  • add streamable udfs (#6614)
  • is_first for struct dtype (#6595)
  • Added from_str_radix method to StringNameSpace for parsing strings in any base to i32 (#6570)
  • Implement DataFrame Interchange Protocol through pyarrow (#6581)
  • improve predicate pushdown (#6579)
  • raise error on invalid binary cmp (#6564)
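
Note on from_str_radix: the entry describes parsing strings written in an arbitrary base into 32-bit signed integers. The sketch below is a plain-Python analogy, not the Polars or Rust implementation. The function name `parse_radix_i32` is hypothetical, and returning `None` for unparseable or out-of-range input is an assumption about how failures might be surfaced. Python's `int()` is also more lenient than Rust's parser (it tolerates surrounding whitespace and underscores), so edge cases may differ.

```python
from typing import Optional

# Bounds of a 32-bit signed integer (i32).
I32_MIN = -(2**31)
I32_MAX = 2**31 - 1


def parse_radix_i32(text: str, radix: int) -> Optional[int]:
    """Parse `text` as an integer in base `radix`, constrained to i32.

    Returns None when the text is not valid in that base or when the
    value does not fit in 32 signed bits (an assumed failure mode).
    """
    try:
        value = int(text, radix)
    except ValueError:
        return None
    if I32_MIN <= value <= I32_MAX:
        return value
    return None


if __name__ == "__main__":
    for text, radix in [("ff", 16), ("101", 2), ("zz", 10)]:
        print(text, radix, parse_radix_i32(text, radix))
```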

🐞 Bug fixes

  • make string_repr private (#6636)
  • treat literal values consistently in select context, improve related typing (#6628)
  • Fix _repr_html_ double-height rows (#5645) (#6534)
  • cast to and from fixed offsets (#6602)
  • raise error on string numeric arithmetic (#6601)
  • don't convert "ns"-precision temporal types via pyarrow (#6592)
  • partially assert sortedness in groupby dynamic (#6593)
  • raise oob if negative index given to take (#6590)
  • fix predicate pushdown key check (#6577)
  • fix schema of apply with many inputs on empty df (#6571)
  • let lhs determine struct order in supertype (#6572)
  • ensure consistent handling of 1D numpy arrays with respect to other sequences (#6569)
  • validate utc, fmt, and tz-aware in strptime (#6550)
  • add strptime to filter boundary (#6560)

🛠️ Other improvements

  • make string_repr private (#6636)
  • add example of using is_between with string bounds, and extend test coverage for the same (#6627)
  • provide additional examples for diff methods (#6630)
  • Consistent handling of env vars (#6626)
  • make structify behaviour experimental, while also extending it to aliased expressions (#6615)
  • Disallow clippy borrow deref ref (#6605)
  • Update ruff version and some settings (#6588)
  • Add release flow info to contributing guide (#6480)
  • Use assert_series_equal instead of s.series_equal(...) (#6582)
  • cleanup last vestiges of experimental kwargs setting (#6568)
  • Use assert_frame_equal instead of assert df.frame_equal(...) (#6553)
  • Update to PyO3 to 0.18.0 (#6531)

Thank you to all our contributors for making this release possible!
@2-5, @MarcoGorelli, @abalkin, @alexander-beedie, @cojmeister, @dependabot, @dependabot[bot], @jjerphan, @plaflamme, @ritchie46 and @stinodego

polars - Python Polars 0.16.0

Published by github-actions[bot] over 1 year ago

🏆 Highlights

  • Formalize list aggregation difference between groupbys, selection and window functions (#6487)
  • automagically upconvert with_columns kwarg expressions with multiple output names to struct; extend **named_kwargs support to select (#6497)

⚠️ Breaking changes

  • error on string <-> date cmp (#6498)
  • Formalize list aggregation difference between groupbys, selection and window functions (#6487)
  • show where error messages originated (#6482)
  • Remove deprecated paths from Series.__getitem__ (#6048)
  • change behaviour of named rows (#6302)
  • Remove deprecated read/write_json arguments (#5990)
  • make schema, schema_overrides, and orient consistent on all user-facing interfaces (#6387)
  • Groupby iteration now returns tuples of (name, data) (#6350)
  • Remove Groupby.pivot (#6016)
  • Remove deprecated argument aliases (#5993)
  • Change Series.shuffle default behaviour (#5991)
  • Change Expr.is_between default behaviour (#5985)
  • Restrict certain function parameters to be keyword-only (#6464)
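
Note on keyword-only parameters: the last entry above does not list which functions are affected. As a generic illustration of what such a restriction means for callers, the sketch below uses a made-up function (`read_table` and its parameters are hypothetical, not a Polars API). Parameters declared after the bare `*` can no longer be passed positionally.

```python
def read_table(source, *, has_header=True, separator=","):
    """Hypothetical reader: everything after `*` must be passed by keyword."""
    return {"source": source, "has_header": has_header, "separator": separator}


if __name__ == "__main__":
    # Keyword usage keeps working.
    print(read_table("data.csv", separator=";"))

    # Positional usage of a keyword-only parameter raises TypeError.
    try:
        read_table("data.csv", False)
    except TypeError as exc:
        print("rejected:", exc)
```

Call sites that relied on positional order need to be updated to name the arguments explicitly.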

✨ Enhancements

  • let cast_time_zone accept None (#6539)
  • automagically upconvert with_columns kwarg expressions with multiple output names to struct; extend **named_kwargs support to select (#6497)
  • add some missing type annotation in series dispatch methods (#6523)
  • better errors in get_ptr and a probability on a boolean… (#6522)
  • add utc parameter to strptime (#6496)
  • add meta 'has_multiple_outputs', 'is_regex_projec… (#6500)
  • error on string <-> date cmp (#6498)
  • ~30% faster iter_rows(named=True) and to_dicts(), if pyarrow available (#6493)
  • show where error messages originated (#6482)
  • Remove deprecated paths from Series.__getitem__ (#6048)
  • change behaviour of named rows (#6302)
  • Remove deprecated read/write_json arguments (#5990)
  • Groupby iteration now returns tuples of (name, data) (#6350)
  • Remove Groupby.pivot (#6016)
  • Remove deprecated argument aliases (#5993)
  • Change Series.shuffle default behaviour (#5991)
  • Change Expr.is_between default behaviour (#5985)
  • Restrict certain function parameters to be keyword-only (#6464)

🐞 Bug fixes

  • implement ser/de for BinaryChunked (#6543)
  • on frame-init from generator, initial chunk_size cannot be smaller than infer_schema_length (#6541)
  • raise if tz_localize called on UTC-aware (#6526)
  • make concat_list group aware (#6527)
  • error on invalid expanding expression (#6521)
  • create from dicts directly as struct categorical (#6520)
  • fix oob in arr.get by expressions (#6519)
  • fix cse schema (#6518)
  • panic when max_len -1 is reached (#6494)
  • Formalize list aggregation difference between groupbys, selection and window functions (#6487)
  • validate tz in with_time_zone (#6417)

🛠️ Other improvements

  • Remove verify_series_and_expr_api util (#6524)
  • Disable some tests for Windows (#6532)
  • Remove unnecessary brackets in doc examples (#6332)
  • Enable some tests for Windows (#6511)
  • Fix test issue with tmp directory (#6508)
  • Fix some deprecation warnings (#6495)
  • added all missing examples for temporal expressions (#6488)
  • Utilize pytest-xdist for faster unittests (#6483)
  • I/O test improvements (#6475)
  • make schema, schema_overrides, and orient consistent on all user-facing interfaces (#6387)
  • improved error message from Expr on incorrect usage in boolean context (#6473)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @gab23r, @papparapa, @ritchie46, @romanovacca, @stinodego and @zundertj

polars - Python Polars 0.15.18

Published by github-actions[bot] over 1 year ago

✨ Enhancements

  • More precise pipe type annotation (#6457)

🐞 Bug fixes

  • use consistent floor division for floats/ints (#6460)
  • split semi/anti join optimization (#6459)
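
Note on floor division: the entry does not say which inconsistency was fixed. Assuming "floor division" means rounding the quotient toward negative infinity, the plain-Python sketch below (not Polars code; the helper names are hypothetical) shows why integers and floats can disagree on negative operands when one code path truncates toward zero instead.

```python
import math


def floor_div(a, b):
    """Quotient rounded toward negative infinity."""
    return math.floor(a / b)


def trunc_div(a, b):
    """Quotient rounded toward zero (C-style integer division)."""
    return math.trunc(a / b)


if __name__ == "__main__":
    # Python's own // operator floors for both ints and floats.
    print(-7 // 2, -7.0 // 2.0)
    # The two conventions only differ when the exact quotient is negative
    # and not a whole number.
    print(floor_div(-7, 2), trunc_div(-7, 2))
    print(floor_div(7, 2), trunc_div(7, 2))
```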

🛠️ Other improvements

  • Specify deltalake minimum version (#6363)
  • deprecate iterrows in favour of iter_rows, add new @redirect class decorator (#6461)
  • Improve IO test structure (#6453)
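
Note on the `@redirect` class decorator: these notes do not show how it works. As a hypothetical sketch of the general idea only (the decorator signature, the `Table` class, and the warning text are all assumptions, not the Polars implementation), a class decorator can forward a deprecated method name to its replacement while emitting a `DeprecationWarning`.

```python
import warnings


def _make_forwarder(old_name, new_name):
    """Build a method that warns, then calls the renamed method."""

    def forwarder(self, *args, **kwargs):
        warnings.warn(
            f"`{old_name}` is deprecated; use `{new_name}` instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(self, new_name)(*args, **kwargs)

    forwarder.__name__ = old_name
    return forwarder


def redirect(mapping):
    """Hypothetical class decorator: maps deprecated names to new names."""

    def decorate(cls):
        for old_name, new_name in mapping.items():
            setattr(cls, old_name, _make_forwarder(old_name, new_name))
        return cls

    return decorate


@redirect({"iterrows": "iter_rows"})
class Table:
    """Toy stand-in for a frame type; not a Polars class."""

    def __init__(self, rows):
        self._rows = list(rows)

    def iter_rows(self):
        return iter(self._rows)


if __name__ == "__main__":
    warnings.simplefilter("always")
    print(list(Table([(1, "a"), (2, "b")]).iterrows()))
```

Keeping the old name alive for a deprecation period lets existing code continue to run while steering users toward the new spelling.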

Thank you to all our contributors for making this release possible!
@alexander-beedie, @josh, @ritchie46 and @stinodego

polars - Python Polars 0.15.17

Published by github-actions[bot] over 1 year ago

✨ Enhancements

  • allow expr in str.contains (#6443)
  • Deprecate with_column (#6128)
  • expose efficient iterator over DataFrame slices (#6414)
  • add float formatting option (#6432)
  • 10% speedup for to_dicts method (#6415)
  • add datetime/duration dtype selector groups covering the different timeunits (#6425)
  • allow internal api to get pointer to values buffer (#6385)
  • infer ISO8601 datetimes (#6357)
  • minor improvement to auto-detection of ambiguous data orientation (#6376)
  • allow expressions as arguments in str.ends_with (#6361)
  • Make groupby rolling/dynamic iterable (#6372)
  • accept expr in str.starts_with (#6355)
  • Move explode to namespaces (#6351)
  • Rename Series.struct.to_frame to .struct.unnest (#6352)
  • auto-detect %+ as tz-aware (#6434)

🐞 Bug fixes

  • fix projection pushdown on double semi join (#6440)
  • ensure column-exclusion works with the new dtype groups, and improve some related typing (#6442)
  • ensure from_dicts and DataFrame init from list of dicts behave consistently, update/improve related docstrings (#6431)
  • cumulative_eval ensure output dtype is respected (#6435)
  • allow from pandas null structs (#6430)
  • fixed interaction of schema_overrides with frame-init from list of dicts (#6424)
  • only use float simd on specific alignment (#6427)
  • no early escape when window is equal to len in rolling_float (#6408)
  • is_between typing with time in start and end (#6393)
  • don't incorrectly infer Zulu time (#6378)
  • raise error on invalid sort_by argument (#6382)
  • take offset into account with str.explode (#6384)
  • Return empty batch for pl.read_csv_batched().next_… (#6381)
  • ensure pyarrow.compute module is loaded (#6353)
  • implement ser/de for StructChunked (#6359)
  • series of empty structs (#6347)

🛠️ Other improvements

  • add explicit note about use of Config as a context manager (#6439)
  • ensure from_dicts and DataFrame init from list of dicts behave consistently, update/improve related docstrings (#6431)
  • Fix docstring of series.interpolate (#6399)
  • Remove duplicate test (#6390)
  • deprecate columns param for DataFrame init; transitioning to schema (#6366)
  • Add docs and tests to Expr.flatten (#6370)
  • Example of filtering partitioned delta tables (#6365)
  • Uppercase project URL refs (#6362)

Thank you to all our contributors for making this release possible!
@ChayimFriedman2, @MarcoGorelli, @alexander-beedie, @c-peters, @flowlight0, @gab23r, @gam-phon, @ghuls, @jgmartin, @josh, @ritchie46, @romanovacca, @stinodego, @universalmind303 and @zundertj

polars - Python Polars 0.15.16

Published by github-actions[bot] over 1 year ago

🚀 Performance improvements

  • Improve rechunk check (#6268)
  • reuse allocated scratches in ipc writer (#6287)
  • use dedicated writer thread for sink_parquet (#6285)

✨ Enhancements

  • add strict parameter to decoding expressions (#6342)
  • allow unordered struct creating from anyvalues (#6321)
  • allow pass_name in aggregation apply (#6318)
  • parse abbrev month name (#6314)
  • Add warning for new behaviour of named rows (#6300)
  • add dt.combine for combining date and time components (#6121)
  • improvements to dtype-based column selection (#6295)
  • add sink_ipc (#6286)
  • additional schema_overrides param for more ergonomic DataFrame init (#6230)

🐞 Bug fixes

  • don't cast nulls before trying normal cast (#6339)
  • properly dispatch categorical string comparison (#6336)
  • expand all nested wildcards in functions (#6334)
  • fix groupby rolling by_key if groups are empty (#6333)
  • Fix some type hints and bugs for groupby (#6329)
  • Reject None input for head/tail (#6326)
  • parse abbrev month name (#6314)
  • default to pyarrow for writing parquet (#6313)
  • disallow alias in inline join expressions (#6312)
  • block proj-pd and pred-pd on swapping rename (#6303)
  • convert nested dictionary with i64 keys (#6299)
  • Print instantiated dtypes in glimpse (#6298)
  • infer y-m-d datetime even if single element (#6297)
  • fix panic dynamic_groupby on empty dataframe (#6294)
  • implement missing DataFrame __floordiv__ op (#6280)
  • Allow low and high in date_range to be str (#6275)
  • allow integer-compatible row indexes that are not strictly typed as int (#6266)
  • Parse negative dates with polars parser (#6256)

🛠️ Other improvements

  • run cse optimization only if joins and caches… (#6337)
  • Fix wrong description for variable_name argument in melt (#6331)
  • Fix random groupby test failure (#6327)
  • fixup test names, adjust test_struct (#6317)
  • simplify _from_pandas constructor (#6310)
  • Ignore hash doctests (#6304)
  • Fix docstring formatting for truncate (#6291)
  • Move package metadata to pyproject.toml (#6271)
  • Move io tests to the same folder (#6277)
  • Enable Dependabot (#5036)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @c-peters, @dependabot, @dependabot[bot], @ghuls, @n8henrie, @ritchie46, @stinodego and @universalmind303

polars - Python Polars 0.15.15

Published by github-actions[bot] almost 2 years ago

✨ Enhancements

  • ensure ooc sort works ooc with all-constant values (#6235)
  • The 1 billion row sort (#6156)
  • optionally treat missing UTF8 values as the empty string at CSV parse-time (#6203)
  • check file target is not an existing directory (#6187)
  • support negative indexing for DataFrame head and tail methods (#6173)
  • Implement DataFrame.unique(keep="none") (#6169)
  • support use of explicit Struct dtypes on DataFrame/Series init (#6145)

🐞 Bug fixes

  • Add list inner dtype when printing Series (#6233)
  • strptime now respects pl.Datetime's time_unit (#6231)
  • fix when then otherwise with arity and aggregation… (#6224)
  • collect now uses the storage_options given to scan_parquet (#6223)
  • set_sorted keep schema (#6222)
  • pass name to value counts in aggregation (#6221)
  • don't set fast_explode on list of structs (#6220)
  • address a frame init/construction error, and expose infer_schema_length to frame init (#6210)
  • explode of empty nullable list (#6190)
  • fix oob arr.take (#6189)
  • Make with_columns in with_columns_kwargs mode compatible with more data types (#6126)
  • Update docstring with_columns to reflect a new dataframe is being returned (#6122)
  • fix empty streaming joins (#6149)
  • fix streaming joins where the join order has been … (#6143)
  • write tz-aware datetimes to csv (#6135)
  • add null behavior for oob indices (#6133)

🛠️ Other improvements

  • Create DataFrame from schema (#6225)
  • don't set aggregated flag on null propagated aggregation. (#6191)
  • undo cargo.toml change (#6219)
  • Improve drop_nulls docstrings (#6127)
  • Clarify docstrings for closed argument (#6198)
  • minor docs and typing updates (plus additional test coverage for related areas) (#6182)
  • explain n_field_strategy (#6158)

Thank you to all our contributors for making this release possible!
@MarceColl, @MarcoGorelli, @alexander-beedie, @gab23r, @ghuls, @jvanbuel, @n8henrie, @rben01, @ritchie46, @ropoctl, @sorhawell, @stinodego, @winding-lines and @zundertj

polars - Python Polars 0.15.14

Published by github-actions[bot] almost 2 years ago

🚀 Performance improvements

  • first check rev-map on categorical equality check (#6085)

✨ Enhancements

  • add arr.take expression (#6116)
  • allow extend_constant to work with date literals (#6114)
  • allow nested categorical cast (#6113)
  • add a rounded_corners modifier to pl.Config.set_tbl_formatting (#6108)
  • huge speedup of scalar-to-array expansion on frame init from dict (#6111)
  • extend existing fast range->Series init to lists of ranges in a Series (#6099)
  • additional (opt-in) options for assert_frame_equal (#6096)
  • add search_sorted for arrays and utf8 dtype (#6083)
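
A note on the search_sorted entry (#6083): the idea is a binary search for the index at which a value would be inserted to keep a sorted sequence sorted. The sketch below uses the standard library's bisect module on plain lists. The function name is a hypothetical stand-in, and the choice of the leftmost insertion point for ties is an assumption, not something the entry states.

```python
from bisect import bisect_left


def search_sorted(sorted_values, value):
    """Index where value would be inserted to keep sorted_values sorted."""
    # bisect_left returns the leftmost valid insertion point (assumed tie rule).
    return bisect_left(sorted_values, value)


# Works for numbers and for strings, as long as the input is already sorted.
print(search_sorted([1, 3, 5, 7], 4))
print(search_sorted(["apple", "banana", "cherry"], "blueberry"))
```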

🐞 Bug fixes

  • ensure multi-line type hints are parenthesised (#6100)
  • fix invalid dtype in chunked array after struct cast (#6093)
  • don't run cse cache_states if no projections found (#6087)
  • Support all datatypes in glimpse and align with head/tail (#6091)
  • Update read_csv error message (#6082)
  • propagate nulls in binary arithmetic/aggregation (#6076)

🛠️ Other improvements

  • Fix docstring with_context (#6118)
  • Use Dataframe.item internally and in tests (#6109)
  • Assert deprecation warning on check_column_names (#6110)
  • enable unused import autofix via ruff (#6102)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @gitkwr, @huitseeker, @ritchie46, @stinodego and @zundertj

polars - Python Polars 0.15.13

Published by github-actions[bot] almost 2 years ago

✨ Enhancements

  • Improve iterating over GroupBy (#6051)
  • much faster lazy type-checks (#6064)
  • support array-expansion of scalars on frame init from dict (#6034)
  • improve error message when writing nested data to… (#6040)
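
A note on the scalar array-expansion entry (#6034): the sketch below shows the general idea in plain Python, where scalar values in a column dict are repeated to match the length of the sequence-valued columns. The helper expand_scalars is hypothetical, and the rule that a dict of only scalars yields a single row is an assumption made for the illustration.

```python
def expand_scalars(data):
    """Repeat scalar values so every column has the same length."""
    # Collect the lengths of all sequence-valued columns.
    lengths = {len(v) for v in data.values() if isinstance(v, (list, tuple))}
    if len(lengths) > 1:
        raise ValueError("sequence columns have differing lengths")
    # Assumed rule: with no sequences at all, fall back to one row.
    height = lengths.pop() if lengths else 1
    return {
        name: list(v) if isinstance(v, (list, tuple)) else [v] * height
        for name, v in data.items()
    }


print(expand_scalars({"a": [1, 2, 3], "b": 0, "c": "x"}))
```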

🐞 Bug fixes

  • bound complex type from 3.8 to 3.11 (#6071)
  • deal with unnest schema expansion in projection pd (#6063)
  • correct output dtype for cummin/cumsum/cummax (#6062)
  • block streaming on literal series/range (#6058)
  • improve handling of dict-type "columns" param on frame-init (#6045)
  • Fix typing for DataFrame.select (#6047)
  • ndjson struct inference (#6049)
  • fix stringcache. latest refactor introduced a hashing error (#6056)
  • allow mixed field order and availability in apply that r… (#6041)
  • deal with empty structs (#6039)
  • fix aggregation that filters out all data (#6036)
  • fix diff overflow (#6033)
  • keep column names in is_null/is_not_null (#6032)
  • keep name when sorting categorical in lexical order (#6029)
  • tweaked property/accessor behaviour (#6021)
  • properly set null anyvalue if categorical is neste… (#6025)
  • Fix from_epoch function signature (#6024)
  • Validate estimated_size parameter (#6018)

🛠️ Other improvements

  • suggest forward fill in cumsum/cummax (#6061)
  • Fix SIM105 issues. (#6042)
  • Remove trailing spaces in glimpse output (#6037)
  • Remove unnecessary noqa's (#6035)
  • Fix flake8-pytest-style errors in tests. (#6031)
  • update read_sql and row docstrings (#6028)
  • Enable the isort-style import autofix via ruff (#6020)
  • Update py-polars/Cargo.lock (#6013)
  • Refactor pivot tests (#6012)
  • Use ruff instead of isort, flake8 and pyupgrade (#5916)
  • Properly deprecate groupby.pivot (#6000)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @ghuls, @ritchie46, @stinodego and @universalmind303

polars - Python Polars 0.15.11

Published by github-actions[bot] almost 2 years ago

🚀 Performance improvements

  • ensure set_at_idx is O(1) (#5977)

✨ Enhancements

  • allow eq,ne,lt etc (#5995)
  • Improve Expr.is_between API (#5981)
  • large speedup for df.iterrows (~200-400%) (#5979)
  • updated default table format from "UTF8_FULL" to "UTF8_FULL_CONDENSED" (#5967)
  • Access rows as namedtuples (#5966)
  • Improve assert_frame_equal messages (#5962)
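
A note on the namedtuple row access entry (#5966): a named tuple lets a row be read by field name as well as by position. The sketch below builds such rows from a plain column dict with the standard library. The data and the Row type name are made up for illustration, and this is not the Polars code path.

```python
from collections import namedtuple

# Hypothetical column-oriented data.
columns = {"id": [1, 2], "name": ["a", "b"]}

# One field per column; zip(*...) turns columns into rows.
Row = namedtuple("Row", list(columns))
rows = [Row(*values) for values in zip(*columns.values())]

for row in rows:
    # Fields are reachable by name and by position.
    print(row.id, row.name, row[0])
```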

🐞 Bug fixes

  • make weekday tz-aware (#5989)
  • fix categorical in struct anyvalue issue (#5987)
  • fix invalid boolean simplification (#5976)
  • allow empty sort on any dtype (#5975)
  • properly deal with categoricals in streaming queries (#5974)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @ritchie46 and @stinodego

polars - Python Polars 0.15.9

Published by github-actions[bot] almost 2 years ago

🚀 Performance improvements

  • improve reducing window function performance ~33% (#5878)

✨ Enhancements

  • str.strip with multiple chars (#5929)
  • add iterrows (#5945)
  • read decimal as f64 (#5938)
  • improve query plan scan formatting (#5937)
  • allow all null cast (#5933)
  • allow objects in struct types (#5925)
  • handle Series init from python sequence of numpy arrays (#5918)
  • merge sorted dataframes (#5817)
  • impl hex and base64 for binary (#5892)
  • Add datatype hierarchy (#5901)
  • Add .item() on DataFrame and Series (#5893)
  • make get_any_value fallible (#5877)
  • Add string representation for data types (#5861)
  • directly push all operator result into sink, prev… (#5856)
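
A note on the "merge sorted dataframes" entry (#5817): merging two inputs that are already sorted on a key can be done in one linear pass, without re-sorting the combined data. The sketch below shows that technique on plain lists of row tuples using heapq.merge from the standard library. The data is made up, and nothing here reflects how Polars implements it.

```python
from heapq import merge

# Two hypothetical row sets, each already sorted by the first field.
left = [(1, "a"), (4, "d")]
right = [(2, "b"), (3, "c"), (5, "e")]

# heapq.merge walks both inputs once and yields rows in key order.
merged = list(merge(left, right, key=lambda row: row[0]))
print(merged)
```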

🐞 Bug fixes

  • don't panic on ignored context (#5958)
  • don't allow named expression in arr.eval (#5957)
  • error on invalid dtype (#5956)
  • fix panic in join expressions (#5954)
  • block ordered predicates before explode (#5951)
  • adhere to schema in arr.eval of empty list (#5947)
  • fix from_dict schema_inference=0 (#5948)
  • fix arrow nested null conversion (#5946)
  • allow None in arr.slice length (#5934)
  • fix time to duration cast (#5932)
  • error on addition with datetime/time (#5931)
  • don't create categoricals in streaming (#5926)
  • object filter should keep single chunk (#5913)
  • csv, read escaped "" as missing (#5912)
  • fix pivot of signed integers (#5909)
  • don't allow duplicate columns in read_csv arg (#5908)
  • fix latest oob in streaming conversion (#5902)
  • adapt k to len in topk (#5888)
  • fix lazy swapping rename (#5884)
  • fix window function with nullable values; regression due… (#5874)
  • improve equality consistency between types (#5873)
  • evaluate whole branch expression to determine if r… (#5864)
  • fix top_k on empty (#5865)
  • fix slice in streaming (#5854)
  • Fix type hint for IO *_options arguments (#5852)

🛠️ Other improvements

  • Fix docs for sink_parquet (#5952)
  • Fix misspelling in LazyFrame docstring (#5917)
  • add bin, series.is_sorted and merge_sorted (#5914)

Thank you to all our contributors for making this release possible!
@AnatolyBuga, @alexander-beedie, @cannero, @chitralverma, @dannyvankooten, @johngunerli, @ozgrakkurt, @ritchie46, @stinodego, @winding-lines and @zundertj

polars - Rust Polars 0.26.0

Published by github-actions[bot] almost 2 years ago

⚠️ Breaking changes

  • remove Series::append_array (#5681)
  • iso weekday (#5598)
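
A note on the "iso weekday" entry (#5598): the entry does not spell out the change. Assuming it refers to ISO 8601 weekday numbering (Monday = 1 through Sunday = 7) replacing a zero-based scheme (Monday = 0 through Sunday = 6), the difference can be seen with the standard library's datetime module:

```python
from datetime import date

monday = date(2023, 1, 2)
sunday = date(2023, 1, 8)

# Zero-based numbering: Monday is 0, Sunday is 6.
print(monday.weekday(), sunday.weekday())

# ISO 8601 numbering: Monday is 1, Sunday is 7.
print(monday.isoweekday(), sunday.isoweekday())
```

Code that compares weekday values against hard-coded numbers is the most likely to be affected by a change of this kind.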

🚀 Performance improvements

  • improve reducing window function performance ~33% (#5878)
  • improve performance reducing window functions with numeric output ~-14% (#5841)
  • set_sorted flag when creating from literal (#5728)
  • use sorted fast path in streaming groupby (#5727)
  • ensure fast_explode propagates (#5676)
  • fix quadratic time complexity of groupby in stream… (#5614)
  • Aggregate projection pushdown (#5556)
  • improve streaming primitive groupby (#5575)
  • vectorize integer vec-hash by using very simple, … (#5572)
  • specialized utf8 groupby in streaming (#5535)

✨ Enhancements

  • make get_any_value fallible (#5877)
  • directly push all operator result into sink, prev… (#5856)
  • add sink_parquet (#5480)
  • Support parsing more float string representations. (#5824)
  • implement mean aggregation for duration (#5807)
  • implement sensible boolean aggregates (#5806)
  • allow expression as quantile input (#5751)
  • accept expression in str.extract_all (#5742)
  • tz-aware strptime (#5736)
  • Add "fmt_no_tty" feature for formatting support without r… (#5725)
  • lazy diagonal concat. (#5647)
  • to_struct add upper_bound (#5714)
  • inversely scale chunk_size with thread count in s… (#5699)
  • add streaming minmax (#5693)
  • improve dynamic inference of anyvalues and structs (#5690)
  • support is_in for boolean dtype (#5682)
  • add a cache to strptime (#5628)
  • add nearest interpolation strategy (#5626)
  • make cast recursive (#5596)
  • add arg_min/arg_max for series of dtype boolean (#5592)
  • prefer streaming groupby if partitionable (#5580)
  • make map_alias fallible (#5532)
  • pl.min & pl.max accept wildcard similar to pl.sum (#5511)
  • add predicate pushdown to anonymous_scan (#5467)
  • make streaming work with multiple sinks in a sing… (#5474)
  • add streaming slice operation (#5466)
  • run partial streaming queries (#5464)
  • streaming left joins (#5456)
  • file statistics so we only (try to) keep smallest table in memory (#5454)
  • streaming inner joins. (#5400)
  • build_info() provides detailed information how polars was built (#5423)
  • add missing width property to LazyFrame (#5431)
  • allow regex and wildcard in groupby (#5425)
  • Streaming joins architecture and Cross join implementation. (#5339)
  • add support for am/pm notation in parse_dates read_csv (#5373)
  • add reduce/cumreduce expression as an easier fold (#5364)
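
A note on the reduce/cumreduce entry (#5364): in general terms, a fold needs an explicit initial accumulator, while a reduce uses the first input as its starting value, and a cumulative reduce keeps every intermediate result. The sketch below shows that relationship on plain Python lists standing in for columns. The helper add_columns and the data are hypothetical, and the assumption is that the Polars expressions follow these usual definitions.

```python
from functools import reduce
from itertools import accumulate

# Three hypothetical "columns" of equal length.
columns = [[1, 2], [10, 20], [100, 200]]


def add_columns(left, right):
    """Element-wise sum of two columns."""
    return [a + b for a, b in zip(left, right)]


# Fold: the caller supplies the initial accumulator.
folded = reduce(add_columns, columns, [0, 0])

# Reduce: the first column acts as the initial accumulator.
reduced = reduce(add_columns, columns)

# Cumulative reduce: every intermediate accumulator is kept.
cumulative = list(accumulate(columns, add_columns))

print(folded, reduced, cumulative)
```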

🐞 Bug fixes

  • fix lazy swapping rename (#5884)
  • improve equality consistency between types (#5873)
  • evaluate whole branch expression to determine if r… (#5864)
  • fix top_k on empty (#5865)
  • fix slice in streaming (#5854)
  • correct invalid type in struct anyvalue access (#5844)
  • don't set fast_explode if null values in list (#5838)
  • duration formatting (#5837)
  • respect fetch in union (#5836)
  • keep f32 dtype in fill_null by int (#5834)
  • err on epoch on time dtype (#5831)
  • fix panic in hmean (#5808)
  • asof join by logical groups (#5805)
  • fix parquet regression upstream in arrow2 (#5797)
  • Fix lazy cumsum and cumprod result types (#5792)
  • fix nested writer (#5777)
  • fix(rust, python) Summation on empty series evaluates to Some(0) (#5773)
  • empty concat utf8 (#5768)
  • projection pushdown with union and asof join (#5763)
  • check null values in asof_join + groupby (#5756)
  • fix generic streaming groupby on logical types (#5752)
  • fix date_range on expressions (#5750)
  • fix dtypes in join_asof_by (#5746)
  • fix group order in binary aggregation (#5744)
  • implement min/max aggregation for utf8 in groupby (#5737)
  • fix all_null/sorted into_groups panic (#5733)
  • asof join 'by', 'forward' combination (#5720)
  • fix pivot on floating point indexes (#5704)
  • fix arange with column/literal input (#5703)
  • fix double projection that leads to uneven union d… (#5700)
  • Fix a bug in floating regex handling used in CSV type inference (#5695)
  • fix asof join schema (#5686)
  • fix owned arithmetic schema (#5685)
  • take glob into account in scan_csv 'with_schema_mo… (#5683)
  • fix boolean schema in agg_max/min (#5678)
  • fix boolean arg-max if all equal (#5680)
  • early error on duplicate names in streaming groupby (#5638)
  • fix streaming groupby aggregate types (#5636)
  • convert panic to err in concat_list (#5637)
  • fix dot diagram of single nodes (#5624)
  • fix dynamic struct inference (#5619)
  • keep dtype when eval on empty list (#5597)
  • fix ternary with list output on empty frame (#5595)
  • fix tz-awareness of truncate (#5591)
  • check chunks before doing chunked_id join optimiza… (#5589)
  • invert cast_time_zone conversion (#5587)
  • asof join ensure join column is not dropped when '… (#5585)
  • fix ub due to invalid dtype on splitting dfs (#5579)
  • fix(rust, python); fix projection pushdown in asof joins (#5542)
  • streaming hstack allow duplicates (#5538)
  • fix streaming empty join panic (#5534)
  • fix duplicate caches in cse and prevent quadratic … (#5528)
  • allow appending categoricals that are all null (#5526)
  • tz-aware strftime (#5525)
  • make 'truncate' tz-aware (#5522)
  • fix coalesce expression expansion (#5521)
  • fix nested aggregation in when then and window expr… (#5520)
  • fix sort_by expression if groups already aggregated (#5518)
  • fix bug in batched parquet reader that dropped dfs… (#5506)
  • fix bugs in skew and kurtosis (#5484)
  • compute correct offset for streaming join on multi… (#5479)
  • return error on invalid sortby expression (#5478)
  • add missing AnyValueBuffer specialisation for Duration dtype (#5436)
  • fix freeze/stall when writing more than 2^31 string values to parquet (#5366)
  • properly handle json with unclosed strings (#5427)
  • fix null poisoning in rank operation (#5417)
  • correct expr::diff dtype for temporal columns (#5416)
  • fix cse for nested caches (#5412)
  • don't set sorted flag in argsort (#5410)
  • explicit nan comparison in min/max agg (#5403)
  • Correct CSV row indexing (#5385)

🛠️ Other improvements

  • Update rustc and fix clippy (#5880)
  • update arrow (#5862)
  • move join dispatch to polars-ops (#5809)
  • Remove dbg statement from union (#5791)
  • Continue removing compilation warnings (#5778)
  • shrink anyvalue size (#5770)
  • update arrow (#5766)
  • chore(rust,python) Change allow_streaming to streaming (#5747)
  • remove rev-map from ChunkedArray (#5721)
  • simplify fast projection by schema (#5716)
  • Reindent df! docs code (#5698)
  • remove Series::append_array (#5681)
  • Remove unused symbols and unneeded mut qualifier (#5672)
  • Include license files in Rust crates (#5675)
  • Use NaiveTime::from_hms_opt instead of NaiveTime::from_hms (#5664)
  • use xxhash3 for string types (#5617)
  • iso weekday (#5598)
  • Improve contributing guide (#5558)
  • streaming improvements (#5541)
  • Refer to DataFrame::unique instead of distinct (#5482)
  • don't panic if part of query cannot run strea… (#5458)
  • make generic join builder more dry (#5439)
  • use IdHash for streaming groupby generic (#5435)
  • fix freeze/stall when writing more than 2^31 string values to parquet (#5366)

Thank you to all our contributors for making this release possible!
@AnatolyBuga, @CalOmnie, @Kuhlwein, @MarcoGorelli, @OneRaynyDay, @YuRiTan, @alexander-beedie, @andrewpollack, @ankane, @braaannigan, @chitralverma, @dannyvankooten, @ghais, @ghuls, @jjerphan, @matteosantama, @messense, @owrior, @pickfire, @ritchie46, @s1ck, @sa-, @slonik-az, @sorhawell, @stinodego, @universalmind303 and @zundertj

polars - Python Polars 0.15.7

Published by ritchie46 almost 2 years ago

🚀 Performance improvements

  • improve performance reducing window functions with numeric output ~-14% (#5841)

✨ Enhancements

  • allow more pyarrow literals (#5842)
  • add sink_parquet (#5480)
  • release GIL when writing (#5830)
  • Support parsing more float string representations. (#5824)
  • implement mean aggregation for duration (#5807)
  • implement sensible boolean aggregates (#5806)

🐞 Bug fixes

  • correct invalid type in struct anyvalue access (#5844)
  • don't set fast_explode if null values in list (#5838)
  • duration formatting (#5837)
  • respect fetch in union (#5836)
  • keep f32 dtype in fill_null by int (#5834)
  • fix(python): fix delta issues (#5802)
  • err on epoch on time dtype (#5831)
  • fix panic in hmean (#5808)
  • asof join by logical groups (#5805)

🛠️ Other improvements

  • lazily import connectorx (#5835)

Thank you to all our contributors for making this release possible!
@chitralverma, @ghuls and @ritchie46

polars - Python Polars 0.15.6

Published by github-actions[bot] almost 2 years ago

🐞 Bug fixes

  • fix struct dataset (#5798)
  • fix parquet regression upstream in arrow2 (#5797)

🛠️ Other improvements

  • remove unused cmake-rs patch (#5794)

Thank you to all our contributors for making this release possible!
@OneRaynyDay, @messense, @ritchie46 and @universalmind303

polars - Python Polars 0.15.3

Published by github-actions[bot] almost 2 years ago

🚀 Performance improvements

  • set_sorted flag when creating from literal (#5728)
  • use sorted fast path in streaming groupby (#5727)

✨ Enhancements

  • push down predicates to pyarrow datasets (#5780)
  • Support for reading delta lake tables (#5761)
  • Add DataFrame.glimpse() (#5622)
  • allow expression as quantile input (#5751)
  • accept expression in str.extract_all (#5742)
  • tz-aware strptime (#5736)
  • lazy diagonal concat. (#5647)
  • to_struct add upper_bound (#5714)

🐞 Bug fixes

  • fix(rust, python) Summation on empty series evaluates to Some(0) (#5773)
  • empty concat utf8 (#5768)
  • projection pushdown with union and asof join (#5763)
  • check null values in asof_join + groupby (#5756)
  • fix generic streaming groupby on logical types (#5752)
  • fix date_range on expressions (#5750)
  • fix dtypes in join_asof_by (#5746)
  • fix group order in binary aggregation (#5744)
  • implement min/max aggregation for utf8 in groupby (#5737)
  • fix all_null/sorted into_groups panic (#5733)
  • address several edge-cases found when asserting NaN equality (#5732)
  • asof join 'by', 'forward' combination (#5720)

🛠️ Other improvements

  • add DataFrame.pearson_corr to reference (#5772)
  • Parse fixed timezone offsets without pytz (#5769)
  • chore(rust,python) Change allow_streaming to streaming (#5747)
  • Remove pyarrow nightlies requirement. (#5719)
  • fix incorrect accepted type in df.write_csv (#5715)

Thank you to all our contributors for making this release possible!
@AnatolyBuga, @MarcoGorelli, @alexander-beedie, @andrewpollack, @braaannigan, @chitralverma, @ghuls, @ritchie46, @sa- and @zundertj

polars - Python Polars 0.15.2

Published by github-actions[bot] almost 2 years ago

🚀 Performance improvements

  • ensure fast_explode propagates (#5676)

✨ Enhancements

  • Series.get_chunks (#5701)
  • inversely scale chunk_size with thread count in s… (#5699)
  • add streaming minmax (#5693)
  • Support large page sizes on aarch64 linux builds (#5694)
  • improve dynamic inference of anyvalues and structs (#5690)
  • support is_in for boolean dtype (#5682)
  • add notebook html repr for Series (#5653)

🐞 Bug fixes

  • fix pivot on floating point indexes (#5704)
  • fix arange with column/literal input (#5703)
  • fix double projection that leads to uneven union d… (#5700)
  • Fix Series -> Expr dispatch for @property methods (#5689)
  • fix asof join schema (#5686)
  • fix owned arithmetic schema (#5685)
  • take glob into account in scan_csv 'with_schema_mo… (#5683)
  • fix boolean schema in agg_max/min (#5678)
  • fix boolean arg-max if all equal (#5680)
  • respect python objects read method even if filename is f… (#5677)
  • Fix DataFrame.n_chunks return type (#5650)

🛠️ Other improvements

  • Parametrize test_parquet_datetime (#5696)
  • Function and lazy function docstrings (#5657)
  • Fix formatting (#5658)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @ankane, @braaannigan, @ghais, @ghuls, @jjerphan, @pickfire, @ritchie46, @stinodego and @zundertj

polars - Python Polars 0.15.1

Published by github-actions[bot] almost 2 years ago

⚠️ Breaking changes

  • Update Expr.sample signature and change random seeding (#4648)
  • rollup breaking changes (#5602)
  • iso weekday (#5598)
  • Change null_equal default to True for Series.series_equal (#5051)

🚀 Performance improvements

  • fix quadratic time complexity of groupby in stream… (#5614)
  • Improve performance of indexing operations on Series. (#5610)
  • Aggregate projection pushdown (#5556)

✨ Enhancements

  • add a cache to strptime (#5628)
  • add nearest interpolation strategy (#5626)
  • Update Expr.sample signature and change random seeding (#4648)
  • Change null_equal default to True for Series.series_equal (#5051)
  • make cast recursive (#5596)
  • add arg_min/arg_max for series of dtype boolean (#5592)

🐞 Bug fixes

  • early error on duplicate names in streaming groupby (#5638)
  • fix streaming groupby aggregate types (#5636)
  • convert panic to err in concat_list (#5637)
  • fix dot diagram of single nodes (#5624)
  • fix dynamic struct inference (#5619)
  • tz-aware filtering (#5603)
  • keep dtype when eval on empty list (#5597)
  • fix ternary with list output on empty frame (#5595)
  • fix tz-awareness of truncate (#5591)
  • check chunks before doing chunked_id join optimiza… (#5589)
  • invert cast_time_zone conversion (#5587)
  • asof join ensure join column is not dropped when '… (#5585)

🛠️ Other improvements

  • Remaining docstring examples for frame and lazyframe (#5630)
  • use xxhash3 for string types (#5617)
  • only trigger build.rs file if that file itself has cha… (#5618)
  • iso weekday (#5598)
  • Merge release workflows (#5564)
  • Fix broken lint workflow (#5584)

Thank you to all our contributors for making this release possible!
@Kuhlwein, @braaannigan, @ghuls, @matteosantama, @ritchie46 and @stinodego