Dataframes powered by a multithreaded, vectorized query engine, written in Rust
OTHER License
Published by ritchie46 5 months ago
rolling_*_by from rolling_*(..., by=...) in Rust (#16102)
CsvReader to CsvReadOptions (#16126)
CsvParserOptions to CsvReaderOptions, use in CsvReader (#15919)
LazyFrame conversion errors (#15761)
is_sorted for numeric data (#16333)
ctypes.util in CPU check script if possible (#16307)
concat (#16128)
to_datetime (#15826)
slope in interpolate (#15819)
by column in rolling_*_by operations (#16249)
struct.with_fields (#16305)
BETWEEN clause (#16279)
cs.by_index, allow multiple indices for nth (#16217)
GROUP BY ALL syntax and fix several issues with aliased group keys (#16179)
rolling_*_by from rolling_*(..., by=...) in Rust (#16102)
dd.mm.YYYY (#16045)
RLE_DICTIONARY encoder (#15959)
uint datatype support for the SQL interface (#15993)
by argument for Expr.top_k and Expr.bottom_k (#15468)
CsvParserOptions to CsvReaderOptions, use in CsvReader (#15919)
dt.round (#15861)
dt.truncate supports broadcasting lhs (#15768)
str.json_path_match (#15764)
LazyFrame conversion errors (#15761)
Series.reshape against invalid parameters (#16281)
IN clauses (#16101)
RLE_DICTIONARY encoder" (#16113)
typed_lit to help schema determination in SQL "extract" func (#15955)
is_not_nan (#15889)
shrink_dtype as non-streaming (#15828)
is_between pushdown to scan_pyarrow_dataset (#15769)
LazyFrame.sort doc example (#15658)
polars-expr README (#16316)
cls (not self) in classmethods (#16303)
ChunkedArray.chunk_id to chunk_lengths (#16273)
CsvReadOptions in LazyCsvReader (#16283)
Duration.is_zero instead of comparing Duration.duration_ns to 0 (#16195)
CsvReader to CsvReadOptions (#16126)
sccache action (#16088)
polars-io cleanup (#15885)
polars_io::parquet module (#15860)
polars_io::csv module (#15831)
polars-io (#15806)
ensure_is_constant_duration (#15733)
Thank you to all our contributors for making this release possible!
@CanglongCl, @JulianCologne, @KDruzhkin, @MarcoGorelli, @NedJWestern, @NexVeridian, @NickCondron, @Robinsane, @ShivMunagala, @TobiasDummschat, @YichiZhang0613, @alexander-beedie, @avimallu, @bertiewooster, @brandon-b-miller, @c-peters, @coastalwhite, @dangotbanned, @datenzauberai, @deanm0000, @dependabot, @dependabot[bot], @eitsupi, @gasmith, @haocheng6, @ion-elgreco, @itamarst, @janpipek, @jr200, @jrycw, @jsarbach, @luke396, @marenwestermann, @max-muoto, @mbuhidar, @nameexhaustion, @orlp, @pydanny, @r-brink, @reswqa, @ritchie46, @stinodego, @thalassemia, @tharunsuresh-code, @twoertwein, @wence- and @wsyxbcl
Published by ritchie46 11 months ago
Series.set_at_idx to scatter (#12540)
Series.view (#12539)
cumsum -> cum_sum and similar (#12513)
take to gather (#12528)
DataFrame (#12492)
take_every to gather_every (#12531)
Series.inner_dtype property (#12494)
parse_int in favor of to_integer (#12464)
is_not (#12458)
is_boolean and is_utf8 (#12457)
DataType.is_integer and other dtype groups (#12200)
~3x 0.19.13/ ~2x numpy (#12471)
~2x (#12412)
DataFrame (#12492)
write_csv and sink_csv (#12253)
DataType.is_integer and other dtype groups (#12200)
Decimal type to parquet (#12532)
Series comparison with timedelta matches that of other types (#12497)
map_dicts (#12436)
scan_csv error type (#12355)
\n when reading file-like object wi… (#12333)
PolarsInefficientMapWarning for lshift/rshift operations (#12385)
polars-ds to list of community plugins (#12527)
polars-hash reference (#12505)
polars-hash (#12496)
import polars timing test; now much more consistent/reliable (#12478)
.with_columns() in all .list namespace examples (#12475)
manylinux_2_17 for building x86-64 wheel (#12408)
Thank you to all our contributors for making this release possible!
@MarcoGorelli, @abstractqqq, @alexander-beedie, @c-peters, @cmdlineluser, @hirohira9119, @ion-elgreco, @jerome3o, @nameexhaustion, @reswqa, @ritchie46, @stinodego and @uchiiii
Published by stinodego 12 months ago
Thank you to all our contributors for making this release possible!
@ritchie46
Published by ritchie46 over 1 year ago
StringCache object as a function decorator (#9309)
Config object as a function decorator (#9307)
pydantic 2.x release (#9296)
Thank you to all our contributors for making this release possible!
@alexander-beedie, @magarick, @ritchie46, @stinodego and @thomascamminady
Published by ritchie46 almost 2 years ago
~-14% (#5841)
Thank you to all our contributors for making this release possible!
@chitralverma, @ghuls and @ritchie46
Published by ritchie46 almost 2 years ago
Thank you to all our contributors for making this release possible!
@alexander-beedie, @ghuls, @ritchie46 and @stinodego
Published by stinodego about 2 years ago
Published by stinodego about 2 years ago
Published by ritchie46 about 2 years ago
This is the release of rust polars 0.24.0. This release comes with a lot of bug fixes, performance improvements and added functionality. The changes that stand out are larger-than-RAM memory mapping of IPC files and a new common-subplan optimization that prunes duplicated sub-plans from the query plan and thereby potentially saves a lot of duplicated work.
See the 0.14.0 release for all upstream improvements.
Full Changelog: https://github.com/pola-rs/polars/compare/rust-polars-v0.23.0...rust-polars-v0.24.0
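The common-subplan optimization mentioned in the 0.24.0 notes can be illustrated with a small sketch. This is not the Polars optimizer; it assumes a toy plan representation (nested tuples) and hypothetical helper names, and only shows the idea of making equal sub-plans share a single node so the work is done once:

```python
def scan(path):
    # Hypothetical plan node: read a file.
    return ("scan", path)


def filter_(child, predicate):
    # Hypothetical plan node: filter the rows of a child plan.
    return ("filter", child, predicate)


def dedupe_subplans(plan, cache=None):
    """Return an equivalent plan in which equal sub-plans share one object."""
    if cache is None:
        cache = {}
    if not isinstance(plan, tuple):
        return plan  # leaf value such as a file name or predicate string
    rebuilt = tuple(dedupe_subplans(child, cache) for child in plan)
    # setdefault keeps the first occurrence and returns it for later duplicates.
    return cache.setdefault(rebuilt, rebuilt)


def count_unique_nodes(plan, seen=None):
    """Count distinct node objects, i.e. how many nodes must be executed."""
    seen = set() if seen is None else seen
    if not isinstance(plan, tuple) or id(plan) in seen:
        return 0
    seen.add(id(plan))
    return 1 + sum(count_unique_nodes(child, seen) for child in plan)


# A self-join where both inputs are the same scan-and-filter pipeline.
left = filter_(scan("data.ipc"), "x > 1")
right = filter_(scan("data.ipc"), "x > 1")
plan = ("join", left, right)
optimized = dedupe_subplans(plan)
```

After the rewrite both inputs of the join are the same object, so an executor that caches results per node would evaluate the shared pipeline only once.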
Published by ritchie46 about 2 years ago
ljust and rjust expressions by @ritchie46 in https://github.com/pola-rs/polars/pull/3603
scan_ipc/parquet can scan from fsspec sources e.g. s3. by @ritchie46 in https://github.com/pola-rs/polars/pull/3626
py-polars by @ryanrussell in https://github.com/pola-rs/polars/pull/3700
polars-lazy readability improvements by @ryanrussell in https://github.com/pola-rs/polars/pull/3701
DataFrame.hstack() by @adamgreg in https://github.com/pola-rs/polars/pull/3697
Series to DataFrame.with_columns() argument annotation by @adamgreg in https://github.com/pola-rs/polars/pull/3696
contains check that opts-in to contains_literal fast-path by @alexander-beedie in https://github.com/pola-rs/polars/pull/3736
arg_where expression by @ritchie46 in https://github.com/pola-rs/polars/pull/3757
(tpch 2/7) ~5% faster by @ritchie46 in https://github.com/pola-rs/polars/pull/3774
DataFrame, LazyFrame, and Series by @alexander-beedie in https://github.com/pola-rs/polars/pull/3791
date_range to produce date ranges as well as datetime by @alexander-beedie in https://github.com/pola-rs/polars/pull/3798
chunked_array readability improvements by @ryanrussell in https://github.com/pola-rs/polars/pull/3810
/polars/polars-core/src/frame/ readability by @ryanrussell in https://github.com/pola-rs/polars/pull/3813
~35-40% by @ritchie46 in https://github.com/pola-rs/polars/pull/3821
agg_list/not_aggregated combination by @ritchie46 in https://github.com/pola-rs/polars/pull/3835
null_probability functionality for dataframes/series test strategies. by @alexander-beedie in https://github.com/pola-rs/polars/pull/3860
clone ops by @alexander-beedie in https://github.com/pola-rs/polars/pull/3883
take_every by @alexander-beedie in https://github.com/pola-rs/polars/pull/3888
tests_parametric dir by @alexander-beedie in https://github.com/pola-rs/polars/pull/3899
__setitem__ and take by @stinodego in https://github.com/pola-rs/polars/pull/3910
~3x by @ritchie46 in https://github.com/pola-rs/polars/pull/3924
with_columns to allow **kwargs style named expressions by @alexander-beedie in https://github.com/pola-rs/polars/pull/3917
assert_frame_equal and assert_series_equal for NaN values by @alexander-beedie in https://github.com/pola-rs/polars/pull/3941
See Also docstring formatting, quietened the last warnings coming from doctests by @alexander-beedie in https://github.com/pola-rs/polars/pull/3932
LazyFrame (efficient computation paths only) by @alexander-beedie in https://github.com/pola-rs/polars/pull/3970
~-22% by @ritchie46 in https://github.com/pola-rs/polars/pull/4006
orient type hint by @stinodego in https://github.com/pola-rs/polars/pull/3961
literal param to string-replace functions, optimized replace performance in small-string regime (30-80% faster) by @alexander-beedie in https://github.com/pola-rs/polars/pull/4057
orient argument by @stinodego in https://github.com/pola-rs/polars/pull/4065
>4x performance improvement by @ritchie46 in https://github.com/pola-rs/polars/pull/4078
ChunkedArray. by @ritchie46 in https://github.com/pola-rs/polars/pull/4105
fill_nan preserve name by @ritchie46 in https://github.com/pola-rs/polars/pull/4119
is_in for categoricals by @ritchie46 in https://github.com/pola-rs/polars/pull/4153
pyproject.toml by @matteosantama in https://github.com/pola-rs/polars/pull/4211
Full Changelog: https://github.com/pola-rs/polars/compare/rust-polars-v0.22.1...rust-polars-v0.23.0
Published by ritchie46 over 2 years ago
PartitionedWriter for disk partitioning. by @illumination-k in https://github.com/pola-rs/polars/pull/3331
rolling_{min/max/sum/mean} performance ~3.4x by @ritchie46 in https://github.com/pola-rs/polars/pull/3444
rolling_var performance by @ritchie46 in https://github.com/pola-rs/polars/pull/3470
~15x perf gain. by @ritchie46 in https://github.com/pola-rs/polars/pull/3489
Experimental: Allow rolling_<agg> expressions to determine window size by another {Date, Datetime} series. by @ritchie46 in https://github.com/pola-rs/polars/pull/3514
~10x perf gain (window of 100 elements) by @ritchie46 in https://github.com/pola-rs/polars/pull/3515
sorted_merge_join by @ritchie46 in https://github.com/pola-rs/polars/pull/3505
~2x json parsing improvement by @ritchie46 in https://github.com/pola-rs/polars/pull/3588
Full Changelog: https://github.com/pola-rs/polars/compare/rust-polars-v0.21.1...rust-polars-v0.22.1
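The experimental entry in this release lets a rolling aggregation take its window from a separate Date or Datetime series instead of a fixed row count. A plain-Python sketch of what that means for a rolling sum; the function name is hypothetical, the window is assumed to be positive and closed on the right, `(t - window, t]`, and the timestamps are assumed sorted ascending. It does not reproduce the Polars implementation or its options:

```python
from collections import deque
from datetime import datetime, timedelta


def rolling_sum_by(values, times, window):
    """Sum of the values whose timestamp lies in (t - window, t] for each t.

    Assumes `times` is sorted ascending and aligned with `values`.
    """
    out = []
    in_window = deque()  # (timestamp, value) pairs currently inside the window
    running = 0
    for t, v in zip(times, values):
        in_window.append((t, v))
        running += v
        # Drop entries that fell out of the left (open) edge of the window.
        while in_window[0][0] <= t - window:
            running -= in_window.popleft()[1]
        out.append(running)
    return out


times = [datetime(2022, 1, d) for d in (1, 2, 3, 6)]
sums = rolling_sum_by([1, 2, 3, 4], times, timedelta(days=2))
# sums == [1, 3, 5, 4]
```

Because the window is defined in time rather than rows, the gap between 3 January and 6 January empties the window, which a fixed two-row window would not do.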
Published by ritchie46 over 2 years ago
num_cpus from polars by @dandxy89 in https://github.com/pola-rs/polars/pull/2890
polars-time. by @ritchie46 in https://github.com/pola-rs/polars/pull/2918
value_counts and unique_counts expression by @ritchie46 in https://github.com/pola-rs/polars/pull/2947
n param to 1 by @cnpryer in https://github.com/pola-rs/polars/pull/3090
n and frac are both passed by @cnpryer in https://github.com/pola-rs/polars/pull/3091
~2-25x improvement by @ritchie46 in https://github.com/pola-rs/polars/pull/3143
semi and anti joins. by @ritchie46 in https://github.com/pola-rs/polars/pull/3149
Full Changelog: https://github.com/pola-rs/polars/compare/rust-polars-v0.20.0...rust-polars-v0.21.
Published by ritchie46 over 2 years ago
This release of 286 commits is here thanks to the contributions of (in no specific order):
Did I forget your contribution? Please ping me, I do this manually.
Most notable changes are:
Made representation of groups tuples more cache friendly #2431
Remove Seek requirement of readers
Add groupby_rolling as new entrance to expression API.
Improve CSV parser stability and performance on several occasions
Horizontal aggregations are parallelized #2454
Reduce pivot code bloat and improve performance #2458
Struct data type added.
Extend methods that allow modification of the same memory if Arc::ref_count == 1
Avro readers and writers.
Improved rules of window expressions.
Support for us time unit.
Parquet uses statistics in query optimizations.
Optimize projections in lazy computations. (Mostly useful when you deal with a large number of columns, e.g. millions.)
Improve performance and flexibility of melt operation #2799
new expressions
See the 0.10.0 release for all upstream improvements.
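The note about Parquet statistics in query optimizations refers to using per-chunk metadata to avoid reading data that cannot match a filter. A rough sketch of that general technique, assuming a hypothetical `RowGroup` type that stores a min/max for one integer column; it is not the Polars reader and the names are illustrative only:

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Hypothetical row group carrying min/max statistics for one column."""
    min_value: int
    max_value: int
    rows: list


def scan_greater_than(row_groups, threshold):
    """Return rows > threshold, skipping groups whose max rules them out."""
    result, groups_read = [], 0
    for group in row_groups:
        if group.max_value <= threshold:
            continue  # statistics prove no row can match; skip decoding
        groups_read += 1
        result.extend(r for r in group.rows if r > threshold)
    return result, groups_read


groups = [
    RowGroup(1, 5, [1, 3, 5]),
    RowGroup(6, 9, [6, 9]),
    RowGroup(2, 12, [2, 12]),
]
matches, groups_read = scan_greater_than(groups, 5)
# matches == [6, 9, 12]; only 2 of the 3 groups were read
```

The saving comes from the skipped group never being decoded; groups whose range only partly overlaps the predicate still have to be read and filtered row by row.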