polars - Rust Polars 0.40.0

Published by ritchie46 5 months ago

💥 Breaking changes

Remove incremental read based batched CSV reader (#16259)
separate rolling_*_by from rolling_*(..., by=...) in Rust (#16102)
Move CSV read options from CsvReader to CsvReadOptions (#16126)
Rename all 'Chunk's to RecordBatch (#16063)
prepare for join coalescing argument (#15418)
Rename CsvParserOptions to CsvReaderOptions, use in CsvReader (#15919)
Add context trace to LazyFrame conversion errors (#15761)
Move schema resolving of file scan to IR phase (#15739)
Move schema resolving to IR phase. (#15714)
Rename LogicalPlan and builders to reflect their uses better (#15712)

🚀 Performance improvements

Use branchless uleb128 decoding for parquet (#16352)
Reduce error bubbling in parquet hybrid_rle (#16348)
use is_sorted in ewm_mean_by, deprecate check_sorted (#16335)
Optimize is_sorted for numeric data (#16333)
do not use pyo3-built (#16309)
Faster bitpacking for Parquet writer (#16278)
Avoid importing ctypes.util in CPU check script if possible (#16307)
Don't rechunk when converting DataFrame to numpy/ndarray (#16288)
use zeroed vec in ewm_mean_by for sorted fastpath (#16265)
use zeroable_vec in ewm_mean_by (#16166)
Improve cost of chunk_idx compute (#16154)
Don't rechunk by default in concat (#16128)
Ensure rechunk is parallel (#16127)
Don't traverse deep datasets that we repr as union in CSE (#16096)
Ensure better chunk sizes (#16071)
Don't rechunk in parallel collection (#15907)
Improve non-trivial list aggregations (#15888)
Ensure we hit specialized gather for binary/strings (#15886)
Limit the cache size for to_datetime (#15826)
skip initial null items and don't recompute slope in interpolate (#15819)
Fix quadratic in binview growable same source (#15734)
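
The uleb128 entry above refers to unsigned LEB128, a variable-length integer encoding in which each byte carries 7 payload bits and a continuation flag. As a rough illustration only, here is the plain looped form in Python. It is not the branchless Rust routine from #16352, and the function names are made up for this sketch.

```python
from typing import Tuple


def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_uleb128(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one value starting at `pos`; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # continuation bit clear: done
            return result, pos
        shift += 7
```

The loop branches on the continuation bit for every byte; a branchless decoder aims to remove exactly that per-byte branch.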

✨ Enhancements

Raise when joining on the same keys twice (#16329)
Don't require data to be sorted by by column in rolling_*_by operations (#16249)
Add struct.field expansion (regex, wildcard, columns) (#16320)
Faster bitpacking for Parquet writer (#16278)
Add struct.with_fields (#16305)
Handle implicit SQL string → temporal conversion in the BETWEEN clause (#16279)
Add new index/range based selector cs.by_index, allow multiple indices for nth (#16217)
Show warning if expressions are very deep (#16233)
Native CSV file list reading (#16180)
Register memory mapped files and raise when written to (#16208)
Raise when encountering invalid supertype in functions during conversion (#16182)
Add SQL support for GROUP BY ALL syntax and fix several issues with aliased group keys (#16179)
Allow implicit string → temporal conversion in SQL comparisons (#15958)
separate rolling_*_by from rolling_*(..., by=...) in Rust (#16102)
Add run-length encoding to Parquet writer (#16125)
add date pattern dd.mm.YYYY (#16045)
Add RLE to RLE_DICTIONARY encoder (#15959)
Support non-coalescing joins in default engine (#16036)
Move diagonal & horizontal concat schema resolving to IR phase (#16034)
raise more informative error messages in rolling_* aggregations instead of panicking (#15979)
Convert concat during IR conversion (#16016)
Improve dynamic supertypes (#16009)
Additional uint datatype support for the SQL interface (#15993)
Support Decimal read from IPC (#15965)
Add typed collection from par iterators (#15961)
Add by argument for Expr.top_k and Expr.bottom_k (#15468)
Add option to disable globbing in csv (#15930)
Add option to disable globbing in parquet (#15928)
Rename CsvParserOptions to CsvReaderOptions, use in CsvReader (#15919)
Expressify dt.round (#15861)
Improve error messages in context stack (#15881)
Add dynamic literals to ensure schema correctness (#15832)
dt.truncate supports broadcasting lhs (#15768)
Expressify str.json_path_match (#15764)
Support decimal float parsing in CSV (#15774)
Add context trace to LazyFrame conversion errors (#15761)
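
For readers unfamiliar with the run-length encoding mentioned in the Parquet writer entries above, the basic idea can be sketched in a few lines of Python. This is a simplified illustration that stores (value, run length) pairs; it does not reproduce Parquet's hybrid RLE/bit-packing layout, and the helper names are hypothetical.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, length in pairs:
        out.extend([value] * length)
    return out
```

Long runs of identical values, which are common in dictionary-encoded columns, shrink to a single pair, while data without runs gains nothing from this scheme.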

🐞 Bug fixes

correct AExpr.to_field for bitwise and logical and/or (#16360)
cargo clippy for uleb128 safety comment (#16368)
Infer CSV schema as supertype of all files (#16349)
Address overflow combining u64 hashes in Debug builds (#16323)
Don't exclude explicitly named columns in group-by context expr expansion (#16318)
Harden Series.reshape against invalid parameters (#16281)
Fix list.sum dtype for boolean (#16290)
Don't stackoverflow on all/any horizontal (#16287)
compilation error when both lazy and ipc features are enabled (#16284)
rolling_*_by was throwing incorrect error when dataframe was sorted but contained multiple chunks (#16247)
Clippy Error for CPUID (#16241)
Reading CSV with low_memory gave no data (#16231)
Empty unique (#16214)
Fix empty drop nulls (#16213)
Fix get expression group-by state (#16189)
Fix rolling empty group OOB (#16186)
offset=-0i was being treated differently to offset=0i in rolling (#16184)
Fix panic on empty frame joins (#16181)
Fix streaming glob slice (#16174)
Fix CSV skip_rows_after_header for streaming (#16176)
Flush parquet at end of batches tick (#16073)
Check CSE name aliases for collisions. (#16149)
Don't override CSV reader encoding with lossy UTF-8 (#16151)
Add missing allow macros for windows (#16130)
Ensure hex and bitstring literals work inside SQL IN clauses (#16101)
Revert "Add RLE to RLE_DICTIONARY encoder" (#16113)
Respect user passed 'reader_schema' in 'scan_csv' (#16080)
Lazy csv + projection; respect null values arg (#16077)
Materialize dtypes when converting to arrow (#16074)
Fix casting decimal to decimal for high precision (#16049)
Fix printing max scale decimals (#16048)
Decimal supertype for dyn int (#16046)
Do not set sorted flag on lexical sorting (#16032)
properly handle nulls in DictionaryArray::iter_typed (#16013)
Fix CSE case where upper plan has no projection (#16011)
Crash/incorrect group_by/n_unique on categoricals created by (q)cut (#16006)
Ternary supertype dynamics (#15995)
Treat splitting by empty string as iterating over chars (#15922)
Fix PartialEq for DataType::Unknown (#15992)
Do not reverse null indices in descending arg_sort (#15974)
Finish adding typed_lit to help schema determination in SQL "extract" func (#15955)
do not panic when comparing against categorical with incompatible dtype (#15857)
Join validation for multiple keys (#15947)
Set default limit for String column display to 30 and fix edge cases (#15934)
typo in add_half_life takes ln(negative) (#15932)
Remove ffspec from parquet reader (#15927)
avoid WRITE+EXEC for CPUID check (#15912)
fix inconsistent decimal formatting (#15457)
Preserve NULLs for is_not_nan (#15889)
double projection check should only take the upstream projections into account (#15901)
Ensure we don't create invalid frames when combining unit lit + … (#15903)
Clear cached rename schema (#15902)
Fix OOB in struct lit/agg aggregation (#15891)
create (q)cut labels in fixed order (#15843)
Tag shrink_dtype as non-streaming (#15828)
drop-nulls edge case; remove drop-nulls special case (#15815)
ewm_mean_by was skipping initial nulls when it was already sorted by "by" column (#15812)
Consult cgroups to determine free memory (#15798)
raise if index count like 2i is used when performing rolling, group_by_dynamic, upsample, or other temporal operations (#15751)
Don't deduplicate sort that has slice pushdown (#15784)
Fix incorrect is_between pushdown to scan_pyarrow_dataset (#15769)
Handle null index correctly for list take (#15737)
Preserve lexical ordering on concat (#15753)
Remove incorrect unsafe pointer cast for int -> enum (#15740)
pass series name to apply for cut/qcut (#15715)
count of null column shouldn't panic in agg context (#15710)

📖 Documentation

Clarify arrow usage (#16152)
Solve inconsistency between code and comment (#16135)
add filter docstring examples to date and datetime (#15996)
update the link to R API docs (#15973)
Fix a typo in categorical section of the user guide (#15777)
Fix incorrect column name in LazyFrame.sort doc example (#15658)

📦 Build system

Update Rust nightly toolchain version (#16222)
Don't import jemalloc (#15942)
Use default allocator for lts-cpu (#15941)
replace all macos-latest referrals with macos-13 (#15926)
pin mimalloc and macos-13 (#15925)
use jemalloc in lts-cpu (#15913)

🛠️ Other improvements

simplify interpolate code, add test for rolling_*_by with nulls (#16334)
Move expression expansion to conversion module (#16331)
Add polars-expr README (#16316)
Move physical expressions to new crate (#16306)
Use cls (not self) in classmethods (#16303)
conditionally print the CSEs (#16292)
Rename ChunkedArray.chunk_id to chunk_lengths (#16273)
Use Scalar instead of Series some aggregations (#16277)
Use CsvReadOptions in LazyCsvReader (#16283)
Do not hardcode bash path in Makefile (#16263)
Add IR::Reduce (not yet implemented) (#16216)
Remove incremental read based batched CSV reader (#16259)
move all describe, describe_tree and dot-viz code to IR instead of DslPlan (#16237)
move describe to IR instead of DSL (#16191)
Use Duration.is_zero instead of comparing Duration.duration_ns to 0 (#16195)
Remove unused code (#16175)
Don't override CSV reader encoding with lossy UTF-8 (#16151)
Move CSV read options from CsvReader to CsvReadOptions (#16126)
Bump sccache action (#16088)
Fix failures in test coverage workflow (#16083)
Rename all 'Chunk's to RecordBatch (#16063)
Use UnionArgs for DSL side (#16017)
Add some comments (#16008)
prepare for join coalescing argument (#15418)
Pin coverage job to MacOS 13 for now (#15918)
Reorganize from_iter and dispatch to collect_ca when possible (#15904)
More polars-io cleanup (#15885)
Improve type-coercion (#15879)
Move type coercion to IR conversion phase (#15868)
Reorganize polars_io::parquet module (#15860)
Reorganize polars_io::csv module (#15831)
Always expand horizontal_any/all (#15816)
Rename decimal_float to decimal_comma (#15817)
Move IO-related options structs to polars-io (#15806)
Split coverage calculation (#15780)
Update readme (#15787)
Move schema resolving of file scan to IR phase (#15739)
Factor out ensure_is_constant_duration (#15733)
Move schema resolving to IR phase. (#15714)
Rename LogicalPlan and builders to reflect their uses better (#15712)

Thank you to all our contributors for making this release possible!
@CanglongCl, @JulianCologne, @KDruzhkin, @MarcoGorelli, @NedJWestern, @NexVeridian, @NickCondron, @Robinsane, @ShivMunagala, @TobiasDummschat, @YichiZhang0613, @alexander-beedie, @avimallu, @bertiewooster, @brandon-b-miller, @c-peters, @coastalwhite, @dangotbanned, @datenzauberai, @deanm0000, @dependabot, @dependabot[bot], @eitsupi, @gasmith, @haocheng6, @ion-elgreco, @itamarst, @janpipek, @jr200, @jrycw, @jsarbach, @luke396, @marenwestermann, @max-muoto, @mbuhidar, @nameexhaustion, @orlp, @pydanny, @r-brink, @reswqa, @ritchie46, @stinodego, @thalassemia, @tharunsuresh-code, @twoertwein, @wence- and @wsyxbcl

polars - Python Polars 0.19.14

Published by ritchie46 11 months ago

🏆 Highlights

Support Python 3.12 (#12094)
make 1D numpy to polars conversion zero-copy for numeric data (#12403)

⚠️ Deprecations

Rename DataFrame column index methods (#12542)
Rename Series.set_at_idx to scatter (#12540)
Deprecate Series.view (#12539)
Rename cumulative functions cumsum -> cum_sum and similar (#12513)
Rename take to gather (#12528)
Add dedicated horizontal aggregation methods to DataFrame (#12492)
Rename take_every to gather_every (#12531)
Deprecate Series.inner_dtype property (#12494)
Deprecate parse_int in favor of to_integer (#12464)
Deprecate DataType method is_not (#12458)
Deprecate Series methods is_boolean and is_utf8 (#12457)
Add DataType.is_integer and other dtype groups (#12200)
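
When migrating a code base across these renames, a quick text scan can help locate call sites. The following is a minimal sketch, assuming a plain regex search is good enough for a first pass. The rename table covers only the old and new names spelled out in the entries above, the helper name is hypothetical, and a match on `.take(` may belong to another library, so each hit needs manual review.

```python
import re

# Old -> new method names, copied from the deprecation entries above.
RENAMES = {
    "set_at_idx": "scatter",
    "cumsum": "cum_sum",
    "take_every": "gather_every",
    "take": "gather",
    "parse_int": "to_integer",
}

# Match `.old_name` only when it is directly followed by an opening parenthesis.
_PATTERN = re.compile(r"\.(" + "|".join(map(re.escape, RENAMES)) + r")(?=\()")


def find_deprecated_calls(source):
    """Return (line_number, old_name, new_name) for each suspected call."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMES[old]))
    return hits
```

The scan only reports candidates and never rewrites code, which keeps false positives harmless.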

🚀 Performance improvements

speed up parquet download of streaming engine (#12544)
speed up cov/corr with SIMD + strength-reduction ~3x 0.19.13/ ~2x numpy (#12471)
apply predicates and statistics of parquet files in streaming mode (#12439)
use online algorithm for cov/corr ~2x (#12412)
make 1D numpy to polars conversion zero-copy for numeric data (#12403)
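
The "online algorithm" for cov/corr mentioned above computes the statistic in a single pass by updating running means. A minimal Python sketch of the textbook single-pass covariance update follows; it illustrates the general technique, not the SIMD kernel from #12412 or #12471, and the function name is made up for this example.

```python
def online_covariance(xs, ys):
    """Single-pass sample covariance using running means."""
    n = 0
    mean_x = 0.0
    mean_y = 0.0
    co_moment = 0.0  # running sum of (x - mean_x) * (y - mean_y)
    for x, y in zip(xs, ys):
        n += 1
        dx = x - mean_x                 # deviation from the OLD mean of x
        mean_x += dx / n
        mean_y += (y - mean_y) / n
        co_moment += dx * (y - mean_y)  # uses the NEW mean of y
    if n < 2:
        raise ValueError("need at least two paired observations")
    return co_moment / (n - 1)
```

Because each observation is touched once and no intermediate array of deviations is built, the data does not need to be traversed a second time after the means are known.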

✨ Enhancements

Add dedicated horizontal aggregation methods to DataFrame (#12492)
support http scan_parquet (#12517)
Add support for UTF-8 BOM option in write_csv and sink_csv (#12253)
remove lexical (replace with atoi_simd, ryu, and itoa). (#12512)
more changes for versioned plugins (#12504)
plugins add version and context (#12433)
Add DataType.is_integer and other dtype groups (#12200)
include i128 in more primitive functions (#12413)
write rolling functions as private expressions. (#12379)

🐞 Bug fixes

fix incorrect ternary agg states (#12538)
fix and improve ternary evaluation on groups (#12529)
saturating sub in debug msg (#12525)
fix panic when writing Decimal type to parquet (#12532)
prefetch struct columns in async projection pd (#12514)
rechunk cross join output in streaming (#12511)
Ensure behaviour of Series comparison with timedelta matches that of other types (#12497)
fix as_list logical types (#12507)
fix streaming cross join on empty df (#12491)
dont overflow when calculating date range over very long periods (#12479)
Allow append/zip_with/extend on local categoricals (#12369)
Do not panic if time is invalid (#12466)
ensure explicit "return_dtype" is respected by map_dicts (#12436)
empty csv no-raise (#12434)
Fix scan_csv error type (#12355)
binary operations in aggregation context on literals (#12430)
raw HTML output alignment was incorrect for dtype in header (#12422)
update groups state after binary aggregation (#12415)
Remove extra \n when reading file-like object wi… (#12333)
Issue correct PolarsInefficientMapWarning for lshift/rshift operations (#12385)
revert ternary special broadcast, ensure broadcast is always to max height (#12395)
ensure first/last return null if empty (#12401)

🛠️ Other improvements

fix and improve ternary evaluation on groups (#12529)
Add polars-ds to list of community plugins (#12527)
Future-proof consortium standard test (#12524)
add schema test (#12523)
remove lexical (replace with atoi_simd, ryu, and itoa). (#12512)
add test for previous commit (#12510)
Update polars-hash reference (#12505)
Add note on hash stability and mention polars-hash (#12496)
Support Python 3.12 (#12094)
Improved import polars timing test; now much more consistent/reliable (#12478)
Use .with_columns() in all .list namespace examples (#12475)
update rustc (#12468)
Fix docs trigger (#12449)
Update for new maturin release (#12437)
Remove 'experimental' tag for auto-structify setting (#12435)
make "DataFrame" and "Series" case more consistent across docs/comments/errors (#12428)
dprint/markdown link checker minor updates (#12409)
Use manylinux_2_17 for building x86-64 wheel (#12408)
Use manylinux 2.24 instead of 2.28 for compatibility reasons (#12397)
use with_columns in is_in example, and fix some bullet points not rendering (#12383)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @abstractqqq, @alexander-beedie, @c-peters, @cmdlineluser, @hirohira9119, @ion-elgreco, @jerome3o, @nameexhaustion, @reswqa, @ritchie46, @stinodego and @uchiiii

polars - Python Polars 0.17.5

Published by stinodego 12 months ago

🚀 Performance improvements

use online variance kernel for aggregation (#8306)
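
An online variance kernel suits aggregation because partial results from separate chunks or groups can be combined without revisiting the data. The sketch below shows the well-known Welford update together with the pairwise merge formula of Chan et al. in plain Python; it is an assumption-laden illustration rather than the kernel added in #8306, and all names are hypothetical.

```python
from functools import reduce

EMPTY = (0, 0.0, 0.0)  # (count, mean, sum of squared deviations)


def update(state, x):
    """Fold one value into a (count, mean, m2) state (Welford's update)."""
    n, mean, m2 = state
    n += 1
    delta = x - mean
    mean += delta / n
    m2 += delta * (x - mean)
    return n, mean, m2


def merge(a, b):
    """Combine two partial states without touching the raw values."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return EMPTY
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2


def sample_variance(state):
    """Unbiased sample variance from a finished state."""
    n, _, m2 = state
    if n < 2:
        raise ValueError("need at least two values")
    return m2 / (n - 1)


# Two chunks are reduced independently, then merged.
left = reduce(update, [2.0, 4.0, 4.0, 4.0], EMPTY)
right = reduce(update, [5.0, 5.0, 7.0, 9.0], EMPTY)
combined = merge(left, right)
```

Here `combined` describes all eight values, so `sample_variance(combined)` matches the variance of the concatenated data up to floating-point rounding.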

Thank you to all our contributors for making this release possible!
@ritchie46

polars - Python Polars 0.18.2

Published by ritchie46 over 1 year ago

🚀 Performance improvements

increase streaming groupby spill size from 256 to 10_000 (#9312)
Improve rolling min and max for data without nulls (#9277)

✨ Enhancements

allow use of StringCache object as a function decorator (#9309)
allow use of Config object as a function decorator (#9307)
serde for 'to_physical' expr (#9294)

🐞 Bug fixes

fix rolling weighted mean (#9292)
fix overly-broad string matching in selectors (#9303)
fix when loading model data from upcoming pydantic 2.x release (#9296)

🛠️ Other improvements

fix extraneous indent in examples block (#9297)
Fix typo in Selectors documentation (#9295)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @magarick, @ritchie46, @stinodego and @thomascamminady

polars - Python Polars 0.15.7

Published by ritchie46 almost 2 years ago

🚀 Performance improvements

improve performance reducing window functions with numeric output ~-14% (#5841)

✨ Enhancements

allow more pyarrow literals (#5842)
add sink_parquet (#5480)
release GIL when writing (#5830)
Support parsing more float string representations. (#5824)
implement mean aggregation for duration (#5807)
implement sensible boolean aggregates (#5806)

🐞 Bug fixes

correct invalid type in struct anyvalue access (#5844)
don't set fast_explode if null values in list (#5838)
duration formatting (#5837)
respect fetch in union (#5836)
keep f32 dtype in fill_null by int (#5834)
fix delta issues (#5802)
err on epoch on time dtype (#5831)
fix panic in hmean (#5808)
asof join by logical groups (#5805)

🛠️ Other improvements

lazily import connectorx (#5835)

Thank you to all our contributors for making this release possible!
@chitralverma, @ghuls and @ritchie46

polars - Python Polars 0.14.31

Published by ritchie46 almost 2 years ago

🚀 Performance improvements

improve streaming primitive groupby (#5575)
vectorize integer vec-hash by using very simple, … (#5572)

✨ Enhancements

prefer streaming groupby if partitionable (#5580)

🐞 Bug fixes

fix ub due to invalid dtype on splitting dfs (#5579)

🛠️ Other improvements

Remove old Python changelog file (#5577)
namespace registration docs update (#5565)
Improve contributing guide (#5558)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @ghuls, @ritchie46 and @stinodego

polars - Python Polars 0.14.15

Published by stinodego about 2 years ago

polars - Rust Polars 0.24.3

Published by stinodego about 2 years ago

polars - Rust Polars 0.24.0

Published by ritchie46 about 2 years ago

New rust polars release! 🚀

This is the release of rust polars 0.24.0. This release comes with a lot of bug fixes, performance improvements and added functionality. The changes that stand out are larger-than-RAM memory mapping of IPC files and a new common-subplan optimization that prunes duplicated sub-plans from the query plan and thereby potentially saves a lot of duplicated work.
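
To make the idea of pruning duplicated sub-plans concrete, here is a toy Python model. It assumes a query plan can be written as nested tuples of the form (operator, child or argument, ...), which is an invention for this sketch and not how the Polars optimizer represents plans.

```python
def count_nodes(plan):
    """Operator nodes executed if every sub-plan is run separately."""
    return 1 + sum(count_nodes(arg) for arg in plan[1:] if isinstance(arg, tuple))


def unique_subplans(plan, seen=None):
    """Collect structurally distinct sub-plans; duplicates collapse to one."""
    if seen is None:
        seen = set()
    if plan not in seen:
        seen.add(plan)
        for arg in plan[1:]:
            if isinstance(arg, tuple):
                unique_subplans(arg, seen)
    return seen


# The same filtered scan feeds both sides of a join.
scan = ("scan", "data.ipc")
filtered = ("filter", scan, "x > 0")
plan = ("join", ("agg", filtered, "sum"), ("agg", filtered, "mean"))

naive_work = count_nodes(plan)            # counts the shared branch twice
shared_work = len(unique_subplans(plan))  # counts the shared branch once
```

In this example the naive count is 7 operator nodes while only 5 distinct sub-plans exist, so caching the shared filter and scan would save two node executions.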

Update to arrow2 0.14.0

See the 0.14.0 release for all upstream improvements.

New Contributors

@ydarma made their first contribution in https://github.com/pola-rs/polars/pull/4269
@gaoxinge made their first contribution in https://github.com/pola-rs/polars/pull/4300
@SimonSchneider made their first contribution in https://github.com/pola-rs/polars/pull/4436
@lorenzwalthert made their first contribution in https://github.com/pola-rs/polars/pull/4445
@neeldug made their first contribution in https://github.com/pola-rs/polars/pull/4384
@isaacthefallenapple made their first contribution in https://github.com/pola-rs/polars/pull/4522
@Chuxiaof made their first contribution in https://github.com/pola-rs/polars/pull/4524
@luk-f-a made their first contribution in https://github.com/pola-rs/polars/pull/4565
@OneRaynyDay made their first contribution in https://github.com/pola-rs/polars/pull/4621
@abalkin made their first contribution in https://github.com/pola-rs/polars/pull/4650
@tikkanz made their first contribution in https://github.com/pola-rs/polars/pull/4676
@hpux735 made their first contribution in https://github.com/pola-rs/polars/pull/4693
@huang12zheng made their first contribution in https://github.com/pola-rs/polars/pull/4823
@owrior made their first contribution in https://github.com/pola-rs/polars/pull/4840
@jly36963 made their first contribution in https://github.com/pola-rs/polars/pull/4886

Full Changelog: https://github.com/pola-rs/polars/compare/rust-polars-v0.23.0...rust-polars-v0.24.0

polars - Rust polars 0.23.0

Published by ritchie46 about 2 years ago

What's Changed

respect ipc column ordering by @ritchie46 in https://github.com/pola-rs/polars/pull/3591
zfill expression by @ritchie46 in https://github.com/pola-rs/polars/pull/3593
Patch release by @ritchie46 in https://github.com/pola-rs/polars/pull/3595
Fix TOML typos by @ryanrussell in https://github.com/pola-rs/polars/pull/3598
Anonymous scan lazyframe by @universalmind303 in https://github.com/pola-rs/polars/pull/3561
ljust and rjust expressions by @ritchie46 in https://github.com/pola-rs/polars/pull/3603
cast string to categorical in 'is_in' by @ritchie46 in https://github.com/pola-rs/polars/pull/3606
python data type units by @ritchie46 in https://github.com/pola-rs/polars/pull/3609
unset sorted metadata on append by @ritchie46 in https://github.com/pola-rs/polars/pull/3610
feat(nodejs): scan json by @universalmind303 in https://github.com/pola-rs/polars/pull/3611
Expand regex function input by @ritchie46 in https://github.com/pola-rs/polars/pull/3613
node 0.5.3 release by @universalmind303 in https://github.com/pola-rs/polars/pull/3612
improve when then otherwise for lists by @ritchie46 in https://github.com/pola-rs/polars/pull/3614
python polars 0.13.44 by @ritchie46 in https://github.com/pola-rs/polars/pull/3615
Fix mode for multiple modes by @GregoryBL in https://github.com/pola-rs/polars/pull/3566
fix empty list edge case by @ritchie46 in https://github.com/pola-rs/polars/pull/3621
fix invalid concat dtype by @ritchie46 in https://github.com/pola-rs/polars/pull/3622
respect n_rows by @ritchie46 in https://github.com/pola-rs/polars/pull/3624
Python: scan_ipc/parquet can scan from fsspec sources e.g. s3. by @ritchie46 in https://github.com/pola-rs/polars/pull/3626
Fix Series init (as pl.Object dtype) from mixed-type input and extend test coverage by @alexander-beedie in https://github.com/pola-rs/polars/pull/3627
restrict parallel branches in lazy Union by @ritchie46 in https://github.com/pola-rs/polars/pull/3628
native exp expression by @ritchie46 in https://github.com/pola-rs/polars/pull/3629
python dict parallel dataframe creation by @ritchie46 in https://github.com/pola-rs/polars/pull/3630
Enhanced column typedef/inference support for DataFrame init by @alexander-beedie in https://github.com/pola-rs/polars/pull/3633
fix row count file projection pushdown by @ritchie46 in https://github.com/pola-rs/polars/pull/3635
fix list concat by @ritchie46 in https://github.com/pola-rs/polars/pull/3636
rust publish makefile by @ritchie46 in https://github.com/pola-rs/polars/pull/3637
improve explode of empty lists by @ritchie46 in https://github.com/pola-rs/polars/pull/3638
Improve numpy ufunc support. fixes: #3228 by @ghuls in https://github.com/pola-rs/polars/pull/3583
Update various python build requirements. by @ghuls in https://github.com/pola-rs/polars/pull/3641
is_in for struct dtype by @ritchie46 in https://github.com/pola-rs/polars/pull/3639
Update black and change some code so it sees it as a call chain. by @ghuls in https://github.com/pola-rs/polars/pull/3645
concat list determine supertype by @ritchie46 in https://github.com/pola-rs/polars/pull/3649
update arrow by @ritchie46 in https://github.com/pola-rs/polars/pull/3650
Parallel csv writer by @ritchie46 in https://github.com/pola-rs/polars/pull/3652
fix groups state in complex aggregation by @ritchie46 in https://github.com/pola-rs/polars/pull/3656
Rust Comment Readability Fixes by @ryanrussell in https://github.com/pola-rs/polars/pull/3662
Add Expr.reverse() Python API example by @cnpryer in https://github.com/pola-rs/polars/pull/3660
Added StringCache Python API example by @cnpryer in https://github.com/pola-rs/polars/pull/3659
improve dtype selection by @ritchie46 in https://github.com/pola-rs/polars/pull/3664
accept regex in filter by @ritchie46 in https://github.com/pola-rs/polars/pull/3666
python: improve html render by @ritchie46 in https://github.com/pola-rs/polars/pull/3667
Python: infer_schema_len arg to from_dicts by @ritchie46 in https://github.com/pola-rs/polars/pull/3669
Add LICENSE link to py-polars by @gyscos in https://github.com/pola-rs/polars/pull/3674
python: fix and test globbing by @ritchie46 in https://github.com/pola-rs/polars/pull/3675
python polars 0.13.45 by @ritchie46 in https://github.com/pola-rs/polars/pull/3676
Add useful example for pl.StringCache(). by @ghuls in https://github.com/pola-rs/polars/pull/3677
Fix StringCache docstring typo by @cnpryer in https://github.com/pola-rs/polars/pull/3678
Fix polars.Expr.apply() Python API docs text by @cnpryer in https://github.com/pola-rs/polars/pull/3661
Anonymous scan enhancements & cleanup by @universalmind303 in https://github.com/pola-rs/polars/pull/3657
add pyarrow install to quickstart setup by @ritchie46 in https://github.com/pola-rs/polars/pull/3682
fix oob in sorted groupby by @ritchie46 in https://github.com/pola-rs/polars/pull/3681
fix branch supertypes by @ritchie46 in https://github.com/pola-rs/polars/pull/3683
fix cargo.toml for docs.rs by @ritchie46 in https://github.com/pola-rs/polars/pull/3684
python polars 0.13.46 by @ritchie46 in https://github.com/pola-rs/polars/pull/3686
ndjson reader complex types support by @universalmind303 in https://github.com/pola-rs/polars/pull/3665
fix groupby aggregation on empty df by @ritchie46 in https://github.com/pola-rs/polars/pull/3688
Nodejs groupbyrolling by @universalmind303 in https://github.com/pola-rs/polars/pull/3670
Add pl.Expr.hash Python example by @cnpryer in https://github.com/pola-rs/polars/pull/3679
Adding 'line-height' at 95% to df _html.py print by @LVG77 in https://github.com/pola-rs/polars/pull/3691
unique counts for logical types by @ritchie46 in https://github.com/pola-rs/polars/pull/3694
Update arrow and prepare for mutable arithmetics by @ritchie46 in https://github.com/pola-rs/polars/pull/3695
Improve lit agg by @ritchie46 in https://github.com/pola-rs/polars/pull/3702
panic on invalid groupby rolling input by @ritchie46 in https://github.com/pola-rs/polars/pull/3703
docs: Readability improvements in py-polars by @ryanrussell in https://github.com/pola-rs/polars/pull/3700
docs: polars-lazy readability improvements by @ryanrussell in https://github.com/pola-rs/polars/pull/3701
Python: parallel concat df by @gunjunlee in https://github.com/pola-rs/polars/pull/3671
fix ipc column order by @ritchie46 in https://github.com/pola-rs/polars/pull/3706
nodejs release by @universalmind303 in https://github.com/pola-rs/polars/pull/3698
add coc by @ritchie46 in https://github.com/pola-rs/polars/pull/3712
inplace arithmetic by @ritchie46 in https://github.com/pola-rs/polars/pull/3709
format empty df by @ritchie46 in https://github.com/pola-rs/polars/pull/3719
Add typing overloads for DataFrame.hstack() by @adamgreg in https://github.com/pola-rs/polars/pull/3697
Add Series to DataFrame.with_columns() argument annotation by @adamgreg in https://github.com/pola-rs/polars/pull/3696
fix rolling groupby ordering with 'by' argument by @ritchie46 in https://github.com/pola-rs/polars/pull/3720
allow literal as aggregation by @ritchie46 in https://github.com/pola-rs/polars/pull/3722
Improve performance of categorical casting by @ritchie46 in https://github.com/pola-rs/polars/pull/3724
Add flag to allow str.contains to search for string literals (#3711) by @alexander-beedie in https://github.com/pola-rs/polars/pull/3718
fix join negative keys by @ritchie46 in https://github.com/pola-rs/polars/pull/3730
fix arr.get() offsets by @ritchie46 in https://github.com/pola-rs/polars/pull/3731
update arrow by @ritchie46 in https://github.com/pola-rs/polars/pull/3732
fix from_pandas object null array by @ritchie46 in https://github.com/pola-rs/polars/pull/3733
python polars 0.13.47 by @ritchie46 in https://github.com/pola-rs/polars/pull/3734
Replace OOB slice indexing with spare_capacity_mut by @saethlin in https://github.com/pola-rs/polars/pull/3737
pow fast paths by @ritchie46 in https://github.com/pola-rs/polars/pull/3738
Simplify contains check that opts-in to contains_literal fast-path by @alexander-beedie in https://github.com/pola-rs/polars/pull/3736
fix arithmetic bug introduced in #3709 by @ritchie46 in https://github.com/pola-rs/polars/pull/3741
check nan in sort by single column by @ritchie46 in https://github.com/pola-rs/polars/pull/3742
python fix concat by @ritchie46 in https://github.com/pola-rs/polars/pull/3743
patch python polars 0.13.48 by @ritchie46 in https://github.com/pola-rs/polars/pull/3744
ternary literal predicates by @ritchie46 in https://github.com/pola-rs/polars/pull/3747
python polars 0.13.49 by @ritchie46 in https://github.com/pola-rs/polars/pull/3748
unset sorted on take by @ritchie46 in https://github.com/pola-rs/polars/pull/3756
reexport polars for extension libraries by @universalmind303 in https://github.com/pola-rs/polars/pull/3760
add global pl by @universalmind303 in https://github.com/pola-rs/polars/pull/3763
arg_where expression by @ritchie46 in https://github.com/pola-rs/polars/pull/3757
update arrow by @ritchie46 in https://github.com/pola-rs/polars/pull/3762
python lhs power and broadcast by @ritchie46 in https://github.com/pola-rs/polars/pull/3768
allow regex expansion in binary/ternary expressions by @ritchie46 in https://github.com/pola-rs/polars/pull/3769
str.ends_with/ str.starts_with by @ritchie46 in https://github.com/pola-rs/polars/pull/3770
fix bug in agg projections and init tpch schema tests by @ritchie46 in https://github.com/pola-rs/polars/pull/3771
always include offset in groupby_dynamic by @ritchie46 in https://github.com/pola-rs/polars/pull/3779
Cache file reads (tpch 2/7) ~5% faster by @ritchie46 in https://github.com/pola-rs/polars/pull/3774
python fix arr.contains type by @ritchie46 in https://github.com/pola-rs/polars/pull/3782
improve predicate combination and schema state by @ritchie46 in https://github.com/pola-rs/polars/pull/3788
fix duration computation by @ritchie46 in https://github.com/pola-rs/polars/pull/3790
Update arrow2 to support IPC Stream Reading with projections by @joshuataylor in https://github.com/pola-rs/polars/pull/3793
Some API alignment (missing funcs) between DataFrame, LazyFrame, and Series by @alexander-beedie in https://github.com/pola-rs/polars/pull/3791
Docs: sort entries within subsections by @alexander-beedie in https://github.com/pola-rs/polars/pull/3794
csv don't skip delimiter in whitespace trimming by @ritchie46 in https://github.com/pola-rs/polars/pull/3796
don't copy the sorted flag on many operations by @ritchie46 in https://github.com/pola-rs/polars/pull/3795
csv don't skip trailing delimiters when inferring schema. by @ghuls in https://github.com/pola-rs/polars/pull/3799
Allow date_range to produce date ranges as well as datetime by @alexander-beedie in https://github.com/pola-rs/polars/pull/3798
quarter expression by @ritchie46 in https://github.com/pola-rs/polars/pull/3797
Update rustc to 2022-06-22 by @ritchie46 in https://github.com/pola-rs/polars/pull/3801
Fix Node installation instructions by @Smittyvb in https://github.com/pola-rs/polars/pull/3804
python polars 0.13.50 by @ritchie46 in https://github.com/pola-rs/polars/pull/3802
rolling groupby fix index column output order by @ritchie46 in https://github.com/pola-rs/polars/pull/3806
Add support for IPC Streaming Read/Write by @joshuataylor in https://github.com/pola-rs/polars/pull/3783
chore: chunked_array readability improvements by @ryanrussell in https://github.com/pola-rs/polars/pull/3810
Add serde feature to field to fix serde feature by @joshuataylor in https://github.com/pola-rs/polars/pull/3808
fix join asof on floats by @ritchie46 in https://github.com/pola-rs/polars/pull/3812
chore: /polars/polars-core/src/frame/ readability by @ryanrussell in https://github.com/pola-rs/polars/pull/3813
Fixing small typos in docs by @thatlittleboy in https://github.com/pola-rs/polars/pull/3811
fix join asof tolerance by @ritchie46 in https://github.com/pola-rs/polars/pull/3816
docs: use quotes in pip install instruction by @thatlittleboy in https://github.com/pola-rs/polars/pull/3820
Improve parquet reading performance ~35-40% by @ritchie46 in https://github.com/pola-rs/polars/pull/3821
from anyvalue for small integers by @ritchie46 in https://github.com/pola-rs/polars/pull/3826
add date offset by @ritchie46 in https://github.com/pola-rs/polars/pull/3827
fix sorted unique by @ritchie46 in https://github.com/pola-rs/polars/pull/3837
fix ternary groupby agg_list/not_aggregated combination by @ritchie46 in https://github.com/pola-rs/polars/pull/3835
don't parallelize upsample by @ritchie46 in https://github.com/pola-rs/polars/pull/3836
python fix time divide by zero by @ritchie46 in https://github.com/pola-rs/polars/pull/3838
Improve map/apply docstrings by @braaannigan in https://github.com/pola-rs/polars/pull/3750
don't cache in-expression window functions by @ritchie46 in https://github.com/pola-rs/polars/pull/3840
Hypothesis testing framework integrations for Polars by @alexander-beedie in https://github.com/pola-rs/polars/pull/3842
docs: Improve expr.string documentation by @thatlittleboy in https://github.com/pola-rs/polars/pull/3841
make hypothesis optional and don't fail if not installed by @ritchie46 in https://github.com/pola-rs/polars/pull/3849
update arrow by @ritchie46 in https://github.com/pola-rs/polars/pull/3848
python: fix time conversion by @ritchie46 in https://github.com/pola-rs/polars/pull/3851
Make frame/series asserts more resilient against integer overflow by @alexander-beedie in https://github.com/pola-rs/polars/pull/3850
parquet: allow writing smaller row groups by @ritchie46 in https://github.com/pola-rs/polars/pull/3852
python polars 0.13.51 by @ritchie46 in https://github.com/pola-rs/polars/pull/3854
allow branching null with struct dtype by @ritchie46 in https://github.com/pola-rs/polars/pull/3856
Address distinction between DataType and DataType() by @alexander-beedie in https://github.com/pola-rs/polars/pull/3857
Deprecate df/ldf argument to .join by @thomasaarholt in https://github.com/pola-rs/polars/pull/3855
null_probability functionality for dataframes/series test strategies. by @alexander-beedie in https://github.com/pola-rs/polars/pull/3860
Modern style type hints by @stinodego in https://github.com/pola-rs/polars/pull/3863
Concise empty class syntax by @stinodego in https://github.com/pola-rs/polars/pull/3864
fix groups after take expression by @ritchie46 in https://github.com/pola-rs/polars/pull/3881
fix predicate pushdown in union + count expression by @ritchie46 in https://github.com/pola-rs/polars/pull/3882
add join/union branch in window cache keys by @ritchie46 in https://github.com/pola-rs/polars/pull/3884
Fast/cheap empty clone ops by @alexander-beedie in https://github.com/pola-rs/polars/pull/3883
parquet read: fix remaining_rows counter by @ritchie46 in https://github.com/pola-rs/polars/pull/3887
Parquet writing: reduce heap allocs by @ritchie46 in https://github.com/pola-rs/polars/pull/3879
Negative-indexing support for additional functions, and frame-level take_every by @alexander-beedie in https://github.com/pola-rs/polars/pull/3888
Make numpy an optional requirement by @stinodego in https://github.com/pola-rs/polars/pull/3861
Address deprecation warnings while running pytest by @stinodego in https://github.com/pola-rs/polars/pull/3889
Fix reading of gzipped CSV files. Fixes: #3895 by @ghuls in https://github.com/pola-rs/polars/pull/3896
Relocate hypothesis unit tests to parallel tests_parametric dir by @alexander-beedie in https://github.com/pola-rs/polars/pull/3899
Assign dtypes to expected columns when dtypes is a list and column se… by @ghuls in https://github.com/pola-rs/polars/pull/3901
docs: fix link to series method in DataFrame by @duskmoon314 in https://github.com/pola-rs/polars/pull/3897
docs: Improve py-polars docs by @thatlittleboy in https://github.com/pola-rs/polars/pull/3873
Complete pythonic slice support (inc. negative indexing/stride) for DataFrame and Series by @alexander-beedie in https://github.com/pola-rs/polars/pull/3904
Update docstring outputs by @ghuls in https://github.com/pola-rs/polars/pull/3912
Make embedded CSV test strings easier to read. by @ghuls in https://github.com/pola-rs/polars/pull/3907
Quiet an unnecessary warning (tests), and minor optimisation for slices with negative stride by @alexander-beedie in https://github.com/pola-rs/polars/pull/3913
fix dataframe explode with empty lists by @ritchie46 in https://github.com/pola-rs/polars/pull/3916
Implement pow/rpow for Series by @stinodego in https://github.com/pola-rs/polars/pull/3908
Fix Series __setitem__ and take by @stinodego in https://github.com/pola-rs/polars/pull/3910
fix negative offset in groupby_rolling by @ritchie46 in https://github.com/pola-rs/polars/pull/3918
make string formatting configurable by @ritchie46 in https://github.com/pola-rs/polars/pull/3919
Expr docstrings by @braaannigan in https://github.com/pola-rs/polars/pull/3871
parquet: parallelize over row groups ~3x by @ritchie46 in https://github.com/pola-rs/polars/pull/3924
Don't unwrap IPC Stream, instead use ? to not panic by @joshuataylor in https://github.com/pola-rs/polars/pull/3927
Corrected .select type hint to Sequence[str, Expr] by @thomasaarholt in https://github.com/pola-rs/polars/pull/3931
add impl from anyvalue for literal by @savente93 in https://github.com/pola-rs/polars/pull/3921
update arrow: ipc limit and reduce categorical-> dictionary bound checks by @ritchie46 in https://github.com/pola-rs/polars/pull/3926
fix window expression case by @ritchie46 in https://github.com/pola-rs/polars/pull/3937
fix oob panic on expand_at_index and series from pyarrow chunkedarray by @ritchie46 in https://github.com/pola-rs/polars/pull/3938
block equality/ordering based predicates on null producing joins by @ritchie46 in https://github.com/pola-rs/polars/pull/3939
Extended with_columns to allow **kwargs style named expressions by @alexander-beedie in https://github.com/pola-rs/polars/pull/3917
upcast float16 to float32 by @ritchie46 in https://github.com/pola-rs/polars/pull/3940
python: fix already mutable borrowed append by @ritchie46 in https://github.com/pola-rs/polars/pull/3943
Fixed assert_frame_equal and assert_series_equal for NaN values by @alexander-beedie in https://github.com/pola-rs/polars/pull/3941
Add from_numpy constructor by @stinodego in https://github.com/pola-rs/polars/pull/3944
Fix Pandas date_range warnings in tests by @zundertj in https://github.com/pola-rs/polars/pull/3945
fix ipc ordering by @ritchie46 in https://github.com/pola-rs/polars/pull/3947
Remove "import polars as pl" from docstrings by @zundertj in https://github.com/pola-rs/polars/pull/3948
[docs] improve python polars documentation by @thatlittleboy in https://github.com/pola-rs/polars/pull/3954
Modern style type hints for the test suite by @stinodego in https://github.com/pola-rs/polars/pull/3949
Fixed most See Also docstring formatting, quietened the last warnings coming from doctests by @alexander-beedie in https://github.com/pola-rs/polars/pull/3932
python: loosen truncate sorted restriction in docstring by @ritchie46 in https://github.com/pola-rs/polars/pull/3956
groupby apply: use inner type to infer dtype by @ritchie46 in https://github.com/pola-rs/polars/pull/3955
python polars 0.13.52 by @ritchie46 in https://github.com/pola-rs/polars/pull/3957
Fix pytest warning by @stinodego in https://github.com/pola-rs/polars/pull/3962
Update README.md by @cxtruong70 in https://github.com/pola-rs/polars/pull/3959
implicit datelike string comparison warning by @ritchie46 in https://github.com/pola-rs/polars/pull/3967
fix count union predicate by @ritchie46 in https://github.com/pola-rs/polars/pull/3969
docs: conventions, mwe and docstring fixes by @thatlittleboy in https://github.com/pola-rs/polars/pull/3973
Pythonic slice support for LazyFrame (efficient computation paths only) by @alexander-beedie in https://github.com/pola-rs/polars/pull/3970
add from_numpy to docs by @thatlittleboy in https://github.com/pola-rs/polars/pull/3976
use bitflags crate by @ritchie46 in https://github.com/pola-rs/polars/pull/3978
fix accidentally slow cross join by @ritchie46 in https://github.com/pola-rs/polars/pull/3980
ensure main lazyframe gets file cache opt state by @ritchie46 in https://github.com/pola-rs/polars/pull/3981
chore(tests): small readability fixes by @ryanrussell in https://github.com/pola-rs/polars/pull/3989
Remove unnecessary imports by @zundertj in https://github.com/pola-rs/polars/pull/3988
Add support for loading a collection of parquet files by @andrei-ionescu in https://github.com/pola-rs/polars/pull/3894
improve from dictionary -> categorical by @ritchie46 in https://github.com/pola-rs/polars/pull/3996
fix col aggregation schema and ternary on empty series by @ritchie46 in https://github.com/pola-rs/polars/pull/3995
release memory on 0% selectivity by @ritchie46 in https://github.com/pola-rs/polars/pull/4000
col(dtypes).exclude() by @ritchie46 in https://github.com/pola-rs/polars/pull/4001
fix explode offsets for empty lists by @ritchie46 in https://github.com/pola-rs/polars/pull/4005
reduce peak memory of reading parquet by row groups ~-22% by @ritchie46 in https://github.com/pola-rs/polars/pull/4006
fix rolling groupby with negative windows by @ritchie46 in https://github.com/pola-rs/polars/pull/4010
fix: Lazyframe::from(lp) #3877 by @universalmind303 in https://github.com/pola-rs/polars/pull/4012
Date encode types by @ritchie46 in https://github.com/pola-rs/polars/pull/4013
csv: allow multiple null values by @ritchie46 in https://github.com/pola-rs/polars/pull/4016
python polars 0.13.53 by @ritchie46 in https://github.com/pola-rs/polars/pull/4017
Improve lazy state struct by @ritchie46 in https://github.com/pola-rs/polars/pull/4008
python: fix pyarrow imports by @ritchie46 in https://github.com/pola-rs/polars/pull/4025
fix lazy schema by @ritchie46 in https://github.com/pola-rs/polars/pull/4027
Align the exclude docstrings and annotation by @thatlittleboy in https://github.com/pola-rs/polars/pull/4020
docs: add mwe and internal links by @thatlittleboy in https://github.com/pola-rs/polars/pull/4019
impl explode for nested lists by @ritchie46 in https://github.com/pola-rs/polars/pull/4028
allow joining on expressions by @ritchie46 in https://github.com/pola-rs/polars/pull/4029
allow nulls last in sort by expressions by @ritchie46 in https://github.com/pola-rs/polars/pull/4030
python polars 0.13.54 by @ritchie46 in https://github.com/pola-rs/polars/pull/4031
feat: implement contains for DataFrame and LazyFrame by @thatlittleboy in https://github.com/pola-rs/polars/pull/4035
Remove py-polars legacy package by @stinodego in https://github.com/pola-rs/polars/pull/4037
Native trigonometry functions by @stinodego in https://github.com/pola-rs/polars/pull/4034
parquet: stop reading when slice is reached by @ritchie46 in https://github.com/pola-rs/polars/pull/4046
fix cross join by @ritchie46 in https://github.com/pola-rs/polars/pull/4045
More trigonometry by @stinodego in https://github.com/pola-rs/polars/pull/4047
Update flake8 settings by @stinodego in https://github.com/pola-rs/polars/pull/4038
pivot: fix categorical logicaltype by @ritchie46 in https://github.com/pola-rs/polars/pull/4048
Update mypy settings by @stinodego in https://github.com/pola-rs/polars/pull/4049
fix: reproducible Expr.hash by @thatlittleboy in https://github.com/pola-rs/polars/pull/4033
Fix constructor orient type hint by @stinodego in https://github.com/pola-rs/polars/pull/3961
Improve coverage report settings by @stinodego in https://github.com/pola-rs/polars/pull/4039
Added literal param to string-replace functions, optimized replace performance in small-string regime (30-80% faster) by @alexander-beedie in https://github.com/pola-rs/polars/pull/4057
parquet: low memory arg by @ritchie46 in https://github.com/pola-rs/polars/pull/4050
Upgrade Windows 10 tests, benchmark and doc jobs to Python3.10 by @zundertj in https://github.com/pola-rs/polars/pull/4059
Revert "Upgrade Windows 10 tests, benchmark and doc jobs to Python3.10" by @ritchie46 in https://github.com/pola-rs/polars/pull/4062
fill_null expr: ensure minimal supertype by @ritchie46 in https://github.com/pola-rs/polars/pull/4061
Fix connector-x integration for PostgreSQL by @valxv in https://github.com/pola-rs/polars/pull/4063
node updates by @universalmind303 in https://github.com/pola-rs/polars/pull/3984
python polars 0.13.55 by @ritchie46 in https://github.com/pola-rs/polars/pull/4064
Handle wrong input for orient argument by @stinodego in https://github.com/pola-rs/polars/pull/4065
Turn on doctests; fix wrong examples by @zundertj in https://github.com/pola-rs/polars/pull/4060
Mypy warn redundant casts by @zundertj in https://github.com/pola-rs/polars/pull/4055
Add mypy optional error codes by @stinodego in https://github.com/pola-rs/polars/pull/4054
recursively convert arrow logical types in to_arrow by @ritchie46 in https://github.com/pola-rs/polars/pull/4067
improve unique performance by @ritchie46 in https://github.com/pola-rs/polars/pull/4070
Small formatting fixes by @stinodego in https://github.com/pola-rs/polars/pull/4071
[mypy] Add error codes by @stinodego in https://github.com/pola-rs/polars/pull/4072
reduce contention of global string cache: >4x performance improvement by @ritchie46 in https://github.com/pola-rs/polars/pull/4078
Add lazy() method to LazyFrame by @zundertj in https://github.com/pola-rs/polars/pull/4077
[flake8] Enable flake8-bugbear extension by @stinodego in https://github.com/pola-rs/polars/pull/4073
csv: allow reading with different eol character by @ritchie46 in https://github.com/pola-rs/polars/pull/4080
docs: rework some MWE and minor formatting fixes by @thatlittleboy in https://github.com/pola-rs/polars/pull/4082
Upgrade maturin to 0.13.0 by @messense in https://github.com/pola-rs/polars/pull/4086
dataframe display: use POLARS_FMT_STR_LEN by @ritchie46 in https://github.com/pola-rs/polars/pull/4088
don't allow comparing local categoricals by @ritchie46 in https://github.com/pola-rs/polars/pull/4087
implement list hash for simply nested lists by @ritchie46 in https://github.com/pola-rs/polars/pull/4090
improve error on missing column access by @ritchie46 in https://github.com/pola-rs/polars/pull/4095
value_counts add sorted argument by @ritchie46 in https://github.com/pola-rs/polars/pull/4094
from_rows improve schema correctness by @ritchie46 in https://github.com/pola-rs/polars/pull/4097
Cache length of ChunkedArray. by @ritchie46 in https://github.com/pola-rs/polars/pull/4105
fix explode with empty lists by @ritchie46 in https://github.com/pola-rs/polars/pull/4113
fix so rank by @ritchie46 in https://github.com/pola-rs/polars/pull/4114
fix explode for sliced arrays by @ritchie46 in https://github.com/pola-rs/polars/pull/4115
python: to_numpy use first type as supertype by @ritchie46 in https://github.com/pola-rs/polars/pull/4116
python: remove css line for vscode by @ritchie46 in https://github.com/pola-rs/polars/pull/4117
Remove read_excel hacks by @cnpryer in https://github.com/pola-rs/polars/pull/4081
python allow set by string by @ritchie46 in https://github.com/pola-rs/polars/pull/4118
fill_nan preserve name by @ritchie46 in https://github.com/pola-rs/polars/pull/4119
Fix prefix/suffix docstrings. by @ghuls in https://github.com/pola-rs/polars/pull/4122
allow summing of duration in selection context by @ritchie46 in https://github.com/pola-rs/polars/pull/4124
python: improve setitem by @ritchie46 in https://github.com/pola-rs/polars/pull/4121
python polars 0.13.56 by @ritchie46 in https://github.com/pola-rs/polars/pull/4127
Assert deprecation warning on DataFrame.setitem in tests by @zundertj in https://github.com/pola-rs/polars/pull/4126
Run PR workflows on definition changes by @zundertj in https://github.com/pola-rs/polars/pull/4125
fix 'fatal: unsafe repository' in python build by @ritchie46 in https://github.com/pola-rs/polars/pull/4129
Nested dict by @ritchie46 in https://github.com/pola-rs/polars/pull/4131
improve performance of building global string cache from arrow dictio… by @ritchie46 in https://github.com/pola-rs/polars/pull/4132
csv writer quote if string contains new line char by @ritchie46 in https://github.com/pola-rs/polars/pull/4134
fix explode edge cases by @ritchie46 in https://github.com/pola-rs/polars/pull/4133
add pl.cut utility by @ritchie46 in https://github.com/pola-rs/polars/pull/4137
python polars 0.13.57 by @ritchie46 in https://github.com/pola-rs/polars/pull/4141
Mypy disallow untyped calls by @ritchie46 in https://github.com/pola-rs/polars/pull/4140
Improve re-raises of Exceptions by @zundertj in https://github.com/pola-rs/polars/pull/4142
pivot fix categorical index by @ritchie46 in https://github.com/pola-rs/polars/pull/4149
Fix typo by @stinodego in https://github.com/pola-rs/polars/pull/4146
Wrap long strings by @stinodego in https://github.com/pola-rs/polars/pull/4144
Fix Python line lengths to 88 characters by @stinodego in https://github.com/pola-rs/polars/pull/4152
add is_in for categoricals by @ritchie46 in https://github.com/pola-rs/polars/pull/4153
python 0.13.58 by @ritchie46 in https://github.com/pola-rs/polars/pull/4154
Docstring lints & improvements by @stinodego in https://github.com/pola-rs/polars/pull/4155
pivot: fix logical type of multiple indexes by @ritchie46 in https://github.com/pola-rs/polars/pull/4159
more tests by @ritchie46 in https://github.com/pola-rs/polars/pull/4163
Use latest arrow2 to support latest nightly rust by @gyscos in https://github.com/pola-rs/polars/pull/4162
Fix invalid inputs for trigonometric functions by @stinodego in https://github.com/pola-rs/polars/pull/4164
update schema in udfs by @ritchie46 in https://github.com/pola-rs/polars/pull/4165
python: expose idx type by @ritchie46 in https://github.com/pola-rs/polars/pull/4167
Improve getitem for Dataframe/Series. by @ghuls in https://github.com/pola-rs/polars/pull/4160
Dataframe equality by @stinodego in https://github.com/pola-rs/polars/pull/4076
Docstring improvements & enable lints by @stinodego in https://github.com/pola-rs/polars/pull/4161
Native implementation of the sign function by @stinodego in https://github.com/pola-rs/polars/pull/4147
Minor docs updates by @stinodego in https://github.com/pola-rs/polars/pull/4173
Validation for groupby arguments by @stinodego in https://github.com/pola-rs/polars/pull/4176
update arrow by @ritchie46 in https://github.com/pola-rs/polars/pull/4177
throw error on schema failure by @ritchie46 in https://github.com/pola-rs/polars/pull/4178
with_columns update on duplicates by @ritchie46 in https://github.com/pola-rs/polars/pull/4179
fold regex expand by @ritchie46 in https://github.com/pola-rs/polars/pull/4181
python: prefer pyarrow when we can memory map the file by @ritchie46 in https://github.com/pola-rs/polars/pull/4182
window functions: sort cached groups if needed by @ritchie46 in https://github.com/pola-rs/polars/pull/4184
reduce supertype match by calling twice/ allow Some(tz)/None supertype by @ritchie46 in https://github.com/pola-rs/polars/pull/4186
Added const empty initializer to DataFrame by @TheDan64 in https://github.com/pola-rs/polars/pull/4187
fix utf8 explode for nulls and empty strings by @ritchie46 in https://github.com/pola-rs/polars/pull/4189
type-coercion: ignore unknown until replaced by @ritchie46 in https://github.com/pola-rs/polars/pull/4192
python: always use stdlib http reader and improve memmap ipc reader a… by @ritchie46 in https://github.com/pola-rs/polars/pull/4193
slice pushdown for cross joins by @ritchie46 in https://github.com/pola-rs/polars/pull/4194
csv: ignore quoted lines in skip lines by @ritchie46 in https://github.com/pola-rs/polars/pull/4191
Small fixes in type formatting by @stinodego in https://github.com/pola-rs/polars/pull/4195
use native ndjson reader by @ritchie46 in https://github.com/pola-rs/polars/pull/4196
python polars: 0.13.59 by @ritchie46 in https://github.com/pola-rs/polars/pull/4198
Miscellaneous improvements by @matteosantama in https://github.com/pola-rs/polars/pull/4203
Add flake8 extension: comprehensions by @stinodego in https://github.com/pola-rs/polars/pull/4200
Add flake8 extension: simplify by @stinodego in https://github.com/pola-rs/polars/pull/4201
don't use pyarrow read if we have categoricals in the schema by @ritchie46 in https://github.com/pola-rs/polars/pull/4205
python: don't lock gil in arr.contains by @ritchie46 in https://github.com/pola-rs/polars/pull/4210
fix nested struct append by @ritchie46 in https://github.com/pola-rs/polars/pull/4217
use default context for col upstream col expression type by @ritchie46 in https://github.com/pola-rs/polars/pull/4219
ensure weekday starts at 0 by @ritchie46 in https://github.com/pola-rs/polars/pull/4220
python datetime consistency by @ritchie46 in https://github.com/pola-rs/polars/pull/4221
python: improve error by @ritchie46 in https://github.com/pola-rs/polars/pull/4223
Upgrade black, blackdoc, mypy, flake8 by @matteosantama in https://github.com/pola-rs/polars/pull/4209
python: ensure utf8 encoding when writing dot file by @ritchie46 in https://github.com/pola-rs/polars/pull/4225
convert arrow map to list by @ritchie46 in https://github.com/pola-rs/polars/pull/4226
fast path for sorted min/max by @ritchie46 in https://github.com/pola-rs/polars/pull/4228
Set no_implicit_reexport = true in pyproject.toml by @matteosantama in https://github.com/pola-rs/polars/pull/4211
fix and improve rolling_skew by @ritchie46 in https://github.com/pola-rs/polars/pull/4232
ternary expr: validate predicate in groupby context by @ritchie46 in https://github.com/pola-rs/polars/pull/4237
Overload pl.from_arrow type hints by @matteosantama in https://github.com/pola-rs/polars/pull/4236
python: allow horizontal expanding sum by @ritchie46 in https://github.com/pola-rs/polars/pull/4242
improve strictness/consistency of when then otherwise by @ritchie46 in https://github.com/pola-rs/polars/pull/4241
reinstate old ternary behavior as experimental by @ritchie46 in https://github.com/pola-rs/polars/pull/4244
correct dtype for power by @ritchie46 in https://github.com/pola-rs/polars/pull/4246
csv: improve data/datetime/bool overwrite by @ritchie46 in https://github.com/pola-rs/polars/pull/4247
Release rust 0.23.0 by @ritchie46 in https://github.com/pola-rs/polars/pull/4248

New Contributors

@GregoryBL made their first contribution in https://github.com/pola-rs/polars/pull/3566
@gyscos made their first contribution in https://github.com/pola-rs/polars/pull/3674
@LVG77 made their first contribution in https://github.com/pola-rs/polars/pull/3691
@gunjunlee made their first contribution in https://github.com/pola-rs/polars/pull/3671
@saethlin made their first contribution in https://github.com/pola-rs/polars/pull/3737
@joshuataylor made their first contribution in https://github.com/pola-rs/polars/pull/3793
@Smittyvb made their first contribution in https://github.com/pola-rs/polars/pull/3804
@thatlittleboy made their first contribution in https://github.com/pola-rs/polars/pull/3811
@braaannigan made their first contribution in https://github.com/pola-rs/polars/pull/3750
@thomasaarholt made their first contribution in https://github.com/pola-rs/polars/pull/3855
@duskmoon314 made their first contribution in https://github.com/pola-rs/polars/pull/3897
@savente93 made their first contribution in https://github.com/pola-rs/polars/pull/3921
@cxtruong70 made their first contribution in https://github.com/pola-rs/polars/pull/3959
@andrei-ionescu made their first contribution in https://github.com/pola-rs/polars/pull/3894
@valxv made their first contribution in https://github.com/pola-rs/polars/pull/4063
@matteosantama made their first contribution in https://github.com/pola-rs/polars/pull/4203

Full Changelog: https://github.com/pola-rs/polars/compare/rust-polars-v0.22.1...rust-polars-v0.23.0

polars - Rust polars 0.22.1

Published by ritchie46 over 2 years ago

What's Changed

partial support for list arithmetic by @ritchie46 in https://github.com/pola-rs/polars/pull/3307
shuffle sample option by @ritchie46 in https://github.com/pola-rs/polars/pull/3308
improve predicate pushdown by @ritchie46 in https://github.com/pola-rs/polars/pull/3313
Improve partitioned agg by @ritchie46 in https://github.com/pola-rs/polars/pull/3314
list to struct by @ritchie46 in https://github.com/pola-rs/polars/pull/3317
oncecell in favor of lazy_static by @ritchie46 in https://github.com/pola-rs/polars/pull/3319
Update cummax documentation by @briandk in https://github.com/pola-rs/polars/pull/3323
scan pyarrow dataset by @ritchie46 in https://github.com/pola-rs/polars/pull/3327
fix panic in csv parser by @ritchie46 in https://github.com/pola-rs/polars/pull/3339
implement anyvalue -> datatype for all variants by @ritchie46 in https://github.com/pola-rs/polars/pull/3340
remove badge by @ritchie46 in https://github.com/pola-rs/polars/pull/3341
Added PartitionedWriter for disk partitioning. by @illumination-k in https://github.com/pola-rs/polars/pull/3331
Fast json by @universalmind303 in https://github.com/pola-rs/polars/pull/3324
add hash to rust expressions by @ritchie46 in https://github.com/pola-rs/polars/pull/3350
serde for group options by @elferherrera in https://github.com/pola-rs/polars/pull/3349
Check if length of index in pivot operation is non-zero. Fixes: #3343. by @ghuls in https://github.com/pola-rs/polars/pull/3346
improve agg_list performance of chunked numerical data by @ritchie46 in https://github.com/pola-rs/polars/pull/3351
Fix init of DataFrame with empty dataset (eg:"[]") and column/schema typedefs by @alexander-beedie in https://github.com/pola-rs/polars/pull/3353
rechunk on default sort and groupby by @ritchie46 in https://github.com/pola-rs/polars/pull/3354
more partitioned groupby by @ritchie46 in https://github.com/pola-rs/polars/pull/3355
Add extension_module in python example by @Maxyme in https://github.com/pola-rs/polars/pull/3358
allow join on same cat source by @ritchie46 in https://github.com/pola-rs/polars/pull/3363
fix rename same name by @ritchie46 in https://github.com/pola-rs/polars/pull/3364
initial timezone support by @ritchie46 in https://github.com/pola-rs/polars/pull/3357
pivot index maintain logical type by @ritchie46 in https://github.com/pola-rs/polars/pull/3367
use array_ref in favor of chunks by @ritchie46 in https://github.com/pola-rs/polars/pull/3368
entropy normalization arg by @ritchie46 in https://github.com/pola-rs/polars/pull/3369
categorical keep type in comparison by @ritchie46 in https://github.com/pola-rs/polars/pull/3370
rechunk in asof and allow concat to empty df by @ritchie46 in https://github.com/pola-rs/polars/pull/3376
improve overflow of numeric mean by @ritchie46 in https://github.com/pola-rs/polars/pull/3377
fix parquet stats by @ritchie46 in https://github.com/pola-rs/polars/pull/3378
delay rechunk optimization by @ritchie46 in https://github.com/pola-rs/polars/pull/3381
Allow Z in native strptime by @ritchie46 in https://github.com/pola-rs/polars/pull/3382
more partitioned aggregators by @ritchie46 in https://github.com/pola-rs/polars/pull/3385
improve partition_by by @ritchie46 in https://github.com/pola-rs/polars/pull/3386
Add overload support to partition_by. by @ghuls in https://github.com/pola-rs/polars/pull/3388
Check if some arguments for read_csv and scan_csv got a 1 byte input. by @ghuls in https://github.com/pola-rs/polars/pull/3389
fix rayon SO in partition_by by @ritchie46 in https://github.com/pola-rs/polars/pull/3391
fix bug in predicate pushdown on dependent predicates by @ritchie46 in https://github.com/pola-rs/polars/pull/3394
fix predicate pushdown for predicates that do aggregations by @ritchie46 in https://github.com/pola-rs/polars/pull/3396
cumulative_eval by @ritchie46 in https://github.com/pola-rs/polars/pull/3400
ensure that Cast expressions first updates groups before it flattens by @ritchie46 in https://github.com/pola-rs/polars/pull/3401
improve and simplify ternary aggregation by @ritchie46 in https://github.com/pola-rs/polars/pull/3403
fix explode empty df by @ritchie46 in https://github.com/pola-rs/polars/pull/3405
Improve list builders, iteration and construction by @ritchie46 in https://github.com/pola-rs/polars/pull/3419
feature gate timezones by @ritchie46 in https://github.com/pola-rs/polars/pull/3422
fix cumulative_eval on window expressions by @ritchie46 in https://github.com/pola-rs/polars/pull/3421
csv allow only header and fix lazy rename by @ritchie46 in https://github.com/pola-rs/polars/pull/3423
upgrade arrow by @ritchie46 in https://github.com/pola-rs/polars/pull/3425
infer dtype of empty list in recursive list construction & fix struct.arr take by @ritchie46 in https://github.com/pola-rs/polars/pull/3433
fix struct list concat by @ritchie46 in https://github.com/pola-rs/polars/pull/3435
csv parser fallback on chrono if datetime pattern fails by @ritchie46 in https://github.com/pola-rs/polars/pull/3436
improve rolling_quantile kernel (no nulls) ~28x by @ritchie46 in https://github.com/pola-rs/polars/pull/3437
improve rolling_{min/max/sum/mean} performance ~3.4x by @ritchie46 in https://github.com/pola-rs/polars/pull/3444
struct add chunk and impl reverse by @ritchie46 in https://github.com/pola-rs/polars/pull/3445
fix struct equality by @ritchie46 in https://github.com/pola-rs/polars/pull/3446
Struct error on different dict orders by @ritchie46 in https://github.com/pola-rs/polars/pull/3447
Inherit Exception in fallback exception classes by @adamgreg in https://github.com/pola-rs/polars/pull/3450
Struct creations/append/extend stricter schema by @ritchie46 in https://github.com/pola-rs/polars/pull/3454
don't allow predicate pushdown if compared column is being coerced by @ritchie46 in https://github.com/pola-rs/polars/pull/3457
improve rolling_min/max for columns with null values by @ritchie46 in https://github.com/pola-rs/polars/pull/3458
Improve rolling_sum/rolling_mean for windows with null values. by @ritchie46 in https://github.com/pola-rs/polars/pull/3466
explode series after slide fast path by @ritchie46 in https://github.com/pola-rs/polars/pull/3467
Improve struct by @ritchie46 in https://github.com/pola-rs/polars/pull/3468
improve rolling_var performance by @ritchie46 in https://github.com/pola-rs/polars/pull/3470
power by expression and improve rust lazy ergonomics by @ritchie46 in https://github.com/pola-rs/polars/pull/3475
add specialized rolling_std kernel by @ritchie46 in https://github.com/pola-rs/polars/pull/3476
fix null commutativity by @ritchie46 in https://github.com/pola-rs/polars/pull/3479
use anyvalue if first apply list result is empty by @ritchie46 in https://github.com/pola-rs/polars/pull/3480
Added describe method to rust library by @glennpierce in https://github.com/pola-rs/polars/pull/3320
Groupby Optimization for sorted keys: ~15x perf gain. by @ritchie46 in https://github.com/pola-rs/polars/pull/3489
make cat merge fallible and loosen restrictions on categorical appends by @ritchie46 in https://github.com/pola-rs/polars/pull/3491
Fix LazyFrame.join_asof documentation reference by @adamgreg in https://github.com/pola-rs/polars/pull/3493
feat: support pl.Time in Series.str.strptime by @fsimkovic in https://github.com/pola-rs/polars/pull/3496
str().extract_all / str().count_match by @ritchie46 in https://github.com/pola-rs/polars/pull/3507
add apply to cookbooks by @ritchie46 in https://github.com/pola-rs/polars/pull/3504
support all arrow dictionary keys < 64 bit by @ritchie46 in https://github.com/pola-rs/polars/pull/3508
fix accidental quadratic behavior in rolling_groupby by @ritchie46 in https://github.com/pola-rs/polars/pull/3510
Fix some unit test deprecation warnings by @adamgreg in https://github.com/pola-rs/polars/pull/3503
Experimental Allow rolling_<agg> expressions to determine window size by another {Date, Datetime} series. by @ritchie46 in https://github.com/pola-rs/polars/pull/3514
use specialized kernels in rolling_groupby aggregation ~10x perf gain (window of 100 elements) by @ritchie46 in https://github.com/pola-rs/polars/pull/3515
reduce probability of quadratic behavior in min/max rolling by @ritchie46 in https://github.com/pola-rs/polars/pull/3516
adjust for kleene logic in drop_na by @ritchie46 in https://github.com/pola-rs/polars/pull/3529
fix aggregation of empty list by @ritchie46 in https://github.com/pola-rs/polars/pull/3527
fix sorting of chunked numeric arrays by @ritchie46 in https://github.com/pola-rs/polars/pull/3528
adjust for kleene logic in drop_na by @ritchie46 in https://github.com/pola-rs/polars/pull/3530
Improve rolling min max by @ritchie46 in https://github.com/pola-rs/polars/pull/3531
fix null aggregation edge case by @ritchie46 in https://github.com/pola-rs/polars/pull/3536
allow concat/append expressions by @ritchie46 in https://github.com/pola-rs/polars/pull/3541
make sort by multiple columns parallel by @ritchie46 in https://github.com/pola-rs/polars/pull/3549
allow more aggregations on dtype duration by @ritchie46 in https://github.com/pola-rs/polars/pull/3550
use first series to validate length by @ritchie46 in https://github.com/pola-rs/polars/pull/3551
Raise a more helpful TypeError when trying to subscript a LazyFrame. by @ghuls in https://github.com/pola-rs/polars/pull/3554
Readability Fixes r2 by @ryanrussell in https://github.com/pola-rs/polars/pull/3556
add count_match, extract_all to python ref guide by @ritchie46 in https://github.com/pola-rs/polars/pull/3558
fill_null limits by @ritchie46 in https://github.com/pola-rs/polars/pull/3559
test sortedness propagation by @ritchie46 in https://github.com/pola-rs/polars/pull/3560
update boolean aggregates and ensure they return IdxSize by @ritchie46 in https://github.com/pola-rs/polars/pull/3563
Improve parse_lines error message. by @ghuls in https://github.com/pola-rs/polars/pull/3569
sorted_merge_join by @ritchie46 in https://github.com/pola-rs/polars/pull/3505
Rust Readability Improvements by @ryanrussell in https://github.com/pola-rs/polars/pull/3573
fix invalid fast path of sorted joins and improve sortedness propagation by @ritchie46 in https://github.com/pola-rs/polars/pull/3577
prevent expensive type coercion in expression and fix when->then->oth… by @ritchie46 in https://github.com/pola-rs/polars/pull/3579
Updated the fmt feature flag error message by @TheDan64 in https://github.com/pola-rs/polars/pull/3586
Fix u16 Series formatting. by @ghuls in https://github.com/pola-rs/polars/pull/3584
update arrow to crates.io: ~2x json parsing improvement by @ritchie46 in https://github.com/pola-rs/polars/pull/3588

New Contributors

@kianmeng made their first contribution in https://github.com/pola-rs/polars/pull/3311
@briandk made their first contribution in https://github.com/pola-rs/polars/pull/3323
@EwoutH made their first contribution in https://github.com/pola-rs/polars/pull/3352
@adamgreg made their first contribution in https://github.com/pola-rs/polars/pull/3450
@ryanrussell made their first contribution in https://github.com/pola-rs/polars/pull/3488
@fsimkovic made their first contribution in https://github.com/pola-rs/polars/pull/3496
@chitralverma made their first contribution in https://github.com/pola-rs/polars/pull/3578
@TheDan64 made their first contribution in https://github.com/pola-rs/polars/pull/3586

Full Changelog: https://github.com/pola-rs/polars/compare/rust-polars-v0.21.1...rust-polars-v0.22.1

polars - Rust polars 0.21.1

Published by ritchie46 over 2 years ago

What's Changed

Remove crate num_cpus from polars by @dandxy89 in https://github.com/pola-rs/polars/pull/2890
temporarily pin crossbeam-epoch by @ritchie46 in https://github.com/pola-rs/polars/pull/2902
fix unique and drop by @ritchie46 in https://github.com/pola-rs/polars/pull/2908
fix explode of empty lists by @ritchie46 in https://github.com/pola-rs/polars/pull/2910
fix function input expansion by @ritchie46 in https://github.com/pola-rs/polars/pull/2913
fix compilation lazy + string by @ritchie46 in https://github.com/pola-rs/polars/pull/2914
respect dtype overwrite when schema is overwritten in lazy csv scanner by @ritchie46 in https://github.com/pola-rs/polars/pull/2915
deprecate to_ and string cache in lazy by @ritchie46 in https://github.com/pola-rs/polars/pull/2916
Refactor: move most temporal related code to polars-time. by @ritchie46 in https://github.com/pola-rs/polars/pull/2918
improve datetime inference by @ritchie46 in https://github.com/pola-rs/polars/pull/2923
rename distinct to unique by @ritchie46 in https://github.com/pola-rs/polars/pull/2926
fix some warning by @ritchie46 in https://github.com/pola-rs/polars/pull/2927
improve date/datetime inference by @ritchie46 in https://github.com/pola-rs/polars/pull/2925
fix fill_nan dtypes by @ritchie46 in https://github.com/pola-rs/polars/pull/2933
fix future calculation in groupby dynamic by @ritchie46 in https://github.com/pola-rs/polars/pull/2935
add tolerance to asof + by by @ritchie46 in https://github.com/pola-rs/polars/pull/2937
fix(scan_csv): handle empty csv file exception by @LuisCardosoOliveira in https://github.com/pola-rs/polars/pull/2934
handle Utf8Owned AnyValue for DataType by @cigrainger in https://github.com/pola-rs/polars/pull/2944
Fix argsort by @ritchie46 in https://github.com/pola-rs/polars/pull/2946
value_counts and unique_counts expression by @ritchie46 in https://github.com/pola-rs/polars/pull/2947
use schema in 'with_columns' to amortize lookups and fix bug in emptr… by @ritchie46 in https://github.com/pola-rs/polars/pull/2949
add native log and entropy expression by @ritchie46 in https://github.com/pola-rs/polars/pull/2952
csv parsing: skip whitespace on failed parse by @ritchie46 in https://github.com/pola-rs/polars/pull/2953
Literal in groupby context, arange and repeat by @ritchie46 in https://github.com/pola-rs/polars/pull/2958
Huge perf improvement of many expressions and ListChunked::from_iter perf by @ritchie46 in https://github.com/pola-rs/polars/pull/2962
update groups in count() agg and correctly update state by @ritchie46 in https://github.com/pola-rs/polars/pull/2963
add sign by @ritchie46 in https://github.com/pola-rs/polars/pull/2977
see kurtosis as aggregation by @ritchie46 in https://github.com/pola-rs/polars/pull/2993
fix groups state after apply by @ritchie46 in https://github.com/pola-rs/polars/pull/2992
Home directory support by @cjermain in https://github.com/pola-rs/polars/pull/2940
make sure that sort does not index empty list by @ritchie46 in https://github.com/pola-rs/polars/pull/2996
python: improve arithmetic consistency by @ritchie46 in https://github.com/pola-rs/polars/pull/3001
python: add apply on struct dtype by @ritchie46 in https://github.com/pola-rs/polars/pull/3003
fix null in non-fast-explode explode of numeric arrays by @ritchie46 in https://github.com/pola-rs/polars/pull/3006
also expand rename in filters by @ritchie46 in https://github.com/pola-rs/polars/pull/3008
fix when then with literal by @ritchie46 in https://github.com/pola-rs/polars/pull/3009
fix groups update to match exploded offsets by @ritchie46 in https://github.com/pola-rs/polars/pull/3010
add duration expression by @ritchie46 in https://github.com/pola-rs/polars/pull/3017
allow nested groupby in groupby_rolling by @ritchie46 in https://github.com/pola-rs/polars/pull/3018
Fix read_parquet with list having nested struct by @cjermain in https://github.com/pola-rs/polars/pull/2991
fix outer join schema by @ritchie46 in https://github.com/pola-rs/polars/pull/3021
lazy: fix drop all by @ritchie46 in https://github.com/pola-rs/polars/pull/3023
fix schemas of groupby rolling/dynamic by @ritchie46 in https://github.com/pola-rs/polars/pull/3028
fix div by zero by @ritchie46 in https://github.com/pola-rs/polars/pull/3031
fix incorrect match in agg_mean by @ritchie46 in https://github.com/pola-rs/polars/pull/3030
check alias in whole expr on opt by @ritchie46 in https://github.com/pola-rs/polars/pull/3032
align groups in binary when they do not align by @ritchie46 in https://github.com/pola-rs/polars/pull/3033
only expand function inputs if wildcard expansion allows it by @ritchie46 in https://github.com/pola-rs/polars/pull/3039
fix when_then_chain containing nulls by @ritchie46 in https://github.com/pola-rs/polars/pull/3040
fixed typo in format_path docstring by @cnpryer in https://github.com/pola-rs/polars/pull/3045
fix when-then-chain by @ritchie46 in https://github.com/pola-rs/polars/pull/3048
throw error on empty keyed groupby by @ritchie46 in https://github.com/pola-rs/polars/pull/3049
compare expand_cols by variant not exact datatype by @ritchie46 in https://github.com/pola-rs/polars/pull/3050
dot: use apply instead of map by @ritchie46 in https://github.com/pola-rs/polars/pull/3051
check output length of all 'map' expressions by @ritchie46 in https://github.com/pola-rs/polars/pull/3052
error on invalid asof_join by input by @ritchie46 in https://github.com/pola-rs/polars/pull/3053
improve performance of asof_join by equal or more than 2 keys by @ritchie46 in https://github.com/pola-rs/polars/pull/3055
remove unneeded expensive assert by @ritchie46 in https://github.com/pola-rs/polars/pull/3069
improve boolean null comparisons consistency by @ritchie46 in https://github.com/pola-rs/polars/pull/3068
fix entropy by @ritchie46 in https://github.com/pola-rs/polars/pull/3070
fix explode empty lists by @ritchie46 in https://github.com/pola-rs/polars/pull/3083
Lazy: update schema in explode op by @ritchie46 in https://github.com/pola-rs/polars/pull/3084
CSV datetime inference 3x performance improvement by @ritchie46 in https://github.com/pola-rs/polars/pull/2950
[polars-sql] Adding SQL Context, SELECT and GROUP BY by @potter420 in https://github.com/pola-rs/polars/pull/3024
Default sample n param to 1 by @cnpryer in https://github.com/pola-rs/polars/pull/3090
Expose 'rechunk' param from "read_ipc" for consistency (default behaviour unchanged) by @alexander-beedie in https://github.com/pola-rs/polars/pull/3088
Add optional seeding for sampling by @cnpryer in https://github.com/pola-rs/polars/pull/3080
default to native strptime by @ritchie46 in https://github.com/pola-rs/polars/pull/3093
Raise error in sample() if n and frac are both passed by @cnpryer in https://github.com/pola-rs/polars/pull/3091
split up planner by @ritchie46 in https://github.com/pola-rs/polars/pull/3095
add test for #3097 by @ritchie46 in https://github.com/pola-rs/polars/pull/3098
Initial support for serde/pickling expressions. by @ritchie46 in https://github.com/pola-rs/polars/pull/3096
Adding nested struct support by fixing ArrayRef determination by @cjermain in https://github.com/pola-rs/polars/pull/3103
Enhanced columns param for DataFrame init, additionally allowing for inline type specification by @alexander-beedie in https://github.com/pola-rs/polars/pull/3100
Improve rolling agg by @ritchie46 in https://github.com/pola-rs/polars/pull/3101
add estimate_size methods by @ritchie46 in https://github.com/pola-rs/polars/pull/3110
fix and test estimated_size by @ritchie46 in https://github.com/pola-rs/polars/pull/3113
remove unused datafusion integration by @ritchie46 in https://github.com/pola-rs/polars/pull/3115
Nodejs writejson fix & avro read/write by @universalmind303 in https://github.com/pola-rs/polars/pull/3116
Parquet statistics: don't panic by @ritchie46 in https://github.com/pola-rs/polars/pull/3127
lazy: expand cols in filter by @ritchie46 in https://github.com/pola-rs/polars/pull/3128
melt extra arguments by @ritchie46 in https://github.com/pola-rs/polars/pull/3133
Lazy: Don't materialize whole table in JOIN followed by SLICE by @ritchie46 in https://github.com/pola-rs/polars/pull/3136
Pushdown SLICE to GROUPBY nodes by @ritchie46 in https://github.com/pola-rs/polars/pull/3138
Switch from unmaintained jemallocator to maintained tikv-jemallocator. by @ghuls in https://github.com/pola-rs/polars/pull/3141
Polars vs Pivot: Round 3 🥊 ~2-25x improvement by @ritchie46 in https://github.com/pola-rs/polars/pull/3143
DataFrame::partition_by by @ritchie46 in https://github.com/pola-rs/polars/pull/3148
Add semi and anti joins. by @ritchie46 in https://github.com/pola-rs/polars/pull/3149
derive clone for lazy groupby by @elferherrera in https://github.com/pola-rs/polars/pull/3156
pushdown slice to sort nodes by @ritchie46 in https://github.com/pola-rs/polars/pull/3159
slice_pushdown projections by @ritchie46 in https://github.com/pola-rs/polars/pull/3160
lazy err on not found col by @ritchie46 in https://github.com/pola-rs/polars/pull/3169
improve inner join performance by @ritchie46 in https://github.com/pola-rs/polars/pull/3168
fix duration filters with different time units by @marcvanheerden in https://github.com/pola-rs/polars/pull/3179
fix overflow in agg_mean by @ritchie46 in https://github.com/pola-rs/polars/pull/3183
list eval expression by @ritchie46 in https://github.com/pola-rs/polars/pull/3185
Supporting Struct comparison and any/all API by @cjermain in https://github.com/pola-rs/polars/pull/3180
struct logical type arrow conversion by @ritchie46 in https://github.com/pola-rs/polars/pull/3193
make series comparisons fallible by @ritchie46 in https://github.com/pola-rs/polars/pull/3192
fix_pivot by @ritchie46 in https://github.com/pola-rs/polars/pull/3199
recursively convert arrow by @ritchie46 in https://github.com/pola-rs/polars/pull/3200
fix arr.eval type inference by @ritchie46 in https://github.com/pola-rs/polars/pull/3203
Improve Left join on chunked data by @ritchie46 in https://github.com/pola-rs/polars/pull/3177
polars-ops by @ritchie46 in https://github.com/pola-rs/polars/pull/3212
Fix tree traversal complexity by @ritchie46 in https://github.com/pola-rs/polars/pull/3213
Adding struct column tests by @ishmandoo in https://github.com/pola-rs/polars/pull/3209
struct: handle validity by @ritchie46 in https://github.com/pola-rs/polars/pull/3217
bug template bounce resolved bugs by @ritchie46 in https://github.com/pola-rs/polars/pull/3218
add duration minutes by @ritchie46 in https://github.com/pola-rs/polars/pull/3219
fix partition boundary by @ritchie46 in https://github.com/pola-rs/polars/pull/3223
Option to check column order when comparing polars dataframes by @physinet in https://github.com/pola-rs/polars/pull/3206
fix dispatch of quantile aggregations by @ritchie46 in https://github.com/pola-rs/polars/pull/3234
Improving array refs for to_list by @cjermain in https://github.com/pola-rs/polars/pull/3231
fix offsets in categorical merge by @ritchie46 in https://github.com/pola-rs/polars/pull/3242
Serialize/Deserialize LazyFrames/Logical plans by @ritchie46 in https://github.com/pola-rs/polars/pull/3244
setup serializable function + null_count expr by @ritchie46 in https://github.com/pola-rs/polars/pull/3247
improve ternary in groupby context by @ritchie46 in https://github.com/pola-rs/polars/pull/3248
fix skew autoexplode and add test by @marcvanheerden in https://github.com/pola-rs/polars/pull/3251
quantile agg; update grouptuples by @ritchie46 in https://github.com/pola-rs/polars/pull/3252
Only pass dtype to array, if not None: Fixes #3253 by @ghuls in https://github.com/pola-rs/polars/pull/3257
polars 0.21.0 by @ritchie46 in https://github.com/pola-rs/polars/pull/3258
do not write empty chunk to parquet by @ritchie46 in https://github.com/pola-rs/polars/pull/3259
Improve partitioned groupby by @ritchie46 in https://github.com/pola-rs/polars/pull/3263
improve sample_perf by @ritchie46 in https://github.com/pola-rs/polars/pull/3264
add iso strptime patterns by @ritchie46 in https://github.com/pola-rs/polars/pull/3265
add partial decompression in read_csv by @ritchie46 in https://github.com/pola-rs/polars/pull/3268
fix partitioned and error don't ignore errors by @ritchie46 in https://github.com/pola-rs/polars/pull/3273
fix row count for u64 idx by @ritchie46 in https://github.com/pola-rs/polars/pull/3285
Code coverage for Rust/Python by @cjermain in https://github.com/pola-rs/polars/pull/3278
Improve groupby states by @ritchie46 in https://github.com/pola-rs/polars/pull/3291
recursive list builder in rows by @ritchie46 in https://github.com/pola-rs/polars/pull/3293
Fix ipc_read_schema so Path() and filename which start with "~/" work. by @ghuls in https://github.com/pola-rs/polars/pull/3297

New Contributors

@LuisCardosoOliveira made their first contribution in https://github.com/pola-rs/polars/pull/2934
@keiv-fly made their first contribution in https://github.com/pola-rs/polars/pull/2930
@cigrainger made their first contribution in https://github.com/pola-rs/polars/pull/2944
@slonik-az made their first contribution in https://github.com/pola-rs/polars/pull/3124
@physinet made their first contribution in https://github.com/pola-rs/polars/pull/3215

Full Changelog: https://github.com/pola-rs/polars/compare/rust-polars-v0.20.0...rust-polars-v0.21.

polars - Rust polars 0.20.0

Published by ritchie46 over 2 years ago

New rust polars release! 🚀

This release of 286 commits is here thanks to the contributions of (in no specific order):

@moritzwilksch
@JakobGM
@illumination-k
@tamasfe
@ghuls
@alexander-beedie
@Maxyme
@universalmind303
@qiemem
@glennpierce
@nmandery
@ilsley
@marcvanheerden

Did I forget your contribution? Please ping me; I do this manually 🙈

Most notable changes are:

Many bug fixes.
Many performance improvements.

Features

Made representation of groups tuples more cache friendly #2431
Remove Seek requirement of readers
Add groupby_rolling as new entrance to expression API.
Improve CSV parser stability and performance in several places.
Horizontal aggregations are parallelized #2454
Reduce pivot code bloat and improve performance #2458
Struct data type added.
Extend methods that allow modification of the same memory if Arc::ref_count == 1
Avro readers and writers.
Improved rules of window expressions.
Support for the microsecond (us) time unit.
Parquet: use statistics in query optimizations.
Optimize projections in lazy computations. (Mostly useful when you deal with a large number of columns e.g. millions).
Improve performance and flexibility of melt operation #2799
new expressions
- str.split
- str.split_inclusive
- arr.join
- unique_stable
- str.split_exact
- count expression that does not require column names
- arr.arg_min
- arr.arg_max
- arr.diff
- arr.shift