Dataframes powered by a multithreaded, vectorized query engine, written in Rust
OTHER License
These APIs were marked unstable and are allowed to change.
- DELTA_LENGTH_BYTE_ARRAY decoding (#18299)
- time/timedelta literals (#18223)
- ~40% (#18197)
- str.replace_many (#18214)
- read_database (#18277)
- explode as gather (#18431)
- scan_parquet(parallel='prefiltered') problems (#18278)
- upsample only have to be sorted within groups (#18264)
- hist when bin_count specified (#16942)
- SQL set op syntax (#18205)
- include_file_paths (#18255)
- eager=True (#18379)
- group_by_dynamic (#18415)
- Series methods to API reference (#18312)
- DataFrame.__getitem__ and Series.__getitem__ (#18309)
- coalesce behaviour in join_asof (#18273)
- Expr.shuffle differentiating from df method (#18266)
- bin.size expr docstring (#18222)
- DataFrame.map_rows (#18227)
- nightly-2024-08-26 (#18370)
- py-polars crate (#18204)
- test_read_database_cx_credentials (#18220)

Thank you to all our contributors for making this release possible!
@BartSchuurmans, @ChayimFriedman2, @MarcoGorelli, @StepfenShawn, @agossard, @alexander-beedie, @cgbur, @coastalwhite, @corwinjoy, @deanm0000, @henryharbeck, @ion-elgreco, @jqnatividad, @krasnobaev, @liufeimath, @markxwang, @mcrumiller, @nameexhaustion, @orlp, @ritchie46, @stinodego, @sunadase, @thomascamminady and @wence-
Published by github-actions[bot] 2 months ago
- Arc<Vec<_>> instead of Arc<[_]> for paths and hive partitions (#18066)
- FixedSizeBinary (#18059)
- FixedSizeBinary to BinaryView cast (#18043)
- read_excel and read_ods (#18078)
- Config state (#18151)
- filter=None (#18139)
- include_index=False (the default) (#18133)
- read_csv (#18131)
- to_titlecase was too narrowly defined (#18122)
- write_excel column totals, don't forget to include any row-total cols (#18042)
- select(len()) for compressed files (#18067)
- sink_ipc_cloud panicking with runtime error (#18091)
- CloudWriter to use buffer before making requests (#18027)
- cfg(feature) for shrink_dtype (#18038)
- lazy docstring (#18178)

Thank you to all our contributors for making this release possible!
@EricTulowetzke, @KDruzhkin, @MarcoGorelli, @Vincenthays, @alexander-beedie, @coastalwhite, @davanstrien, @deanm0000, @ember91, @kylebarron, @mcrumiller, @nameexhaustion, @orlp, @philss, @ritchie46 and @rosstitmarsh
Published by github-actions[bot] 2 months ago
- sort_by_exprs() (#17606)
- Arc<Vec<_>> instead of Arc<[_]> for paths and hive partitions (#18066)
- FixedSizeBinary (#18059)
- FixedSizeBinary to BinaryView cast (#18043)
- .dt.weekday 20x faster (#17992)
- MemSliceInner enum (#17991)
- MemSlice (#17983)
- MemReader to file buffer in Parquet reader (#17712)
- scan functions (#17616)
- ArrayChunks to optimize codegen of BatchDecoder (#17632)
- Series (#18166)
- size method to Expr and Series "bin" namespace (#17924)
- SQL interface support for PostgreSQL dollar-quoted string literals (#17940)
- apply_into_string_amortized instead of apply_to_buffer (#17903)
- SQL "INTERSECT" and "EXCEPT" set ops (#17835)
- to_string for Date dtype (#17670)
- is_in operation on decimal type (#17832)
- hf:// in read_(csv|ipc|ndjson) functions (#17785)
- hf:// (#17682)
- returns_scalar to map_elements (#17613)
- describe on decimal (#15092)
- scan_ipc (#17434)
- filter=None (#18139)
- include_index=False (the default) (#18133)
- read_csv (#18131)
- to_titlecase was too narrowly defined (#18122)
- select(len()) for compressed files (#18067)
- sink_ipc_cloud panicking with runtime error (#18091)
- CloudWriter to use buffer before making requests (#18027)
- cfg(feature) for shrink_dtype (#18038)
- read_csv schema to take unparsable types (#17765)
- sort method (#17947)
- COUNT(DISTINCT x) should not include NULL values (#17930)
- glob=False for cloud reads (#17860)
- from_arrow for struct type (#17839)
- NullArray in Parquet (#17807)
- write_ipc (#17752)
- pivot_schema (#17611)
- sort_by_exprs() (#17606)
- collect in file scan methods (#17532)
- .list.(get|gather) (#17511)
- slice length no longer allowing None (#17372)
- SchemaError exception message (#17350)
- str.lengths to str.len_bytes in description text (#11577) (#17626)
- polars.Expr.bin.decode (#17508)
- nightly-2024-07-26 (#17891)
- comfy-table version (#18028)
- str.contains_any and str.replace_many (#17961)
- typos command in make pre-commit for py-polars folder (#17897)
- typos configuration features (#17800)
- ComputeNode in new streaming engine (#17389)
- ArrayChunks to optimize codegen of BatchDecoder (#17632)
- utils to path_utils in polars-io (#17635)
- polars-io crate (#17521)

Thank you to all our contributors for making this release possible!
@5j9, @ByteNybbler, @EricTulowetzke, @JamesCE2001, @Julian-J-S, @KDruzhkin, @MarcoGorelli, @Object905, @SandroCasagrande, @Vincenthays, @alexander-beedie, @anergictcell, @arnabanimesh, @atigbadr, @brandon-b-miller, @brunobbaraujo, @cmdlineluser, @coastalwhite, @davanstrien, @deanm0000, @deepyaman, @delsner, @dependabot, @dependabot[bot], @diegoglozano, @eitsupi, @ember91, @flisky, @henryharbeck, @implicit-apparatus, @itamarst, @jonaylor89, @jparag, @knl, @kylebarron, @lukapeschke, @mcrumiller, @moritzwilksch, @nameexhaustion, @orlp, @phi-friday, @philss, @r-brink, @ragyabraham, @rcorty, @ritchie46, @rosstitmarsh, @ruihe774, @sherlockbeard, @stinodego, @szepeviktor, @tylerriccio33, @wangxiaoying and @wence-
Published by github-actions[bot] 3 months ago
- Worksheet objects to the write_excel method (#18031)
- comfy-table version (#18028)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @coastalwhite, @deanm0000, @nameexhaustion and @ritchie46
Published by github-actions[bot] 3 months ago
- .dt.weekday 20x faster (#17992)
- MemSliceInner enum (#17991)
- MemSlice (#17983)
- size method to Expr and Series "bin" namespace (#17924)
- SQL interface support for PostgreSQL dollar-quoted string literals (#17940)
- strict argument (#17990)
- sort method (#17947)
- COUNT(DISTINCT x) should not include NULL values (#17930)
- None in pycapsule interface export (#17922)
- last is never ambiguous with max (#17962)
- str.contains_any and str.replace_many (#17961)
- allow_null as replacement (#17969)

Thank you to all our contributors for making this release possible!
@JamesCE2001, @MarcoGorelli, @alexander-beedie, @coastalwhite, @deanm0000, @deepyaman, @dependabot, @dependabot[bot], @henryharbeck, @kylebarron, @nameexhaustion, @ritchie46 and @wangxiaoying
Published by github-actions[bot] 3 months ago
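The COUNT(DISTINCT x) entry above (#17930) refers to the usual SQL rule that aggregate counts over a column skip NULLs. The sketch below shows that rule using SQLite from the Python standard library, not the Polars SQL interface; the table name `t` and its data are made up for illustration, and it is an assumption that the Polars fix matches this behavior.

```python
import sqlite3

# In-memory SQLite database; this is NOT Polars, only a reference point
# for the standard SQL semantics of COUNT with NULL values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (1,), (2,), (None,)])

# COUNT(*) counts rows, COUNT(x) skips NULLs,
# COUNT(DISTINCT x) counts distinct non-NULL values.
total_rows, non_null, distinct_non_null = conn.execute(
    "SELECT COUNT(*), COUNT(x), COUNT(DISTINCT x) FROM t"
).fetchone()
conn.close()

print(total_rows, non_null, distinct_non_null)
```

Running the same query through the Polars SQL interface on a version that includes #17930 would be the direct way to confirm the behavior described by the entry.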
- MemReader to file buffer in Parquet reader (#17712)
- apply_into_string_amortized instead of apply_to_buffer (#17903)
- SQL "INTERSECT" and "EXCEPT" set ops (#17835)
- write_excel (#17757)
- to_string for Date dtype (#17670)
- is_in operation on decimal type (#17832)
- read_excel when using "calamine" engine with the latest fastexcel (#17735)
- hf:// in read_(csv|ipc|ndjson) functions (#17785)
- collect_schema (#17761)
- hf:// (#17682)
- get_column_index (#17868)
- glob=False for cloud reads (#17860)
- write_excel int/float format when using a dark "table_style" (#17869)
- from_arrow for struct type (#17839)
- write_excel (#17846)
- NullArray in Parquet (#17807)
- named_expr and schema in pl.struct (#17768)
- write_ipc (#17752)
- join types for clarity (#17843)
- read_* functions in Hugging Face section in user guide (#17799)
- Expr.map_batches (#17789)
- nightly-2024-07-26 (#17891)
- uv pip install to verbose (#17901)
- typos command in make pre-commit for py-polars folder (#17897)
- typos configuration features (#17800)
- setuptools (#17726)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @Object905, @SandroCasagrande, @alexander-beedie, @atigbadr, @coastalwhite, @deanm0000, @delsner, @dependabot, @dependabot[bot], @henryharbeck, @implicit-apparatus, @jparag, @knl, @kylebarron, @lukapeschke, @mcrumiller, @nameexhaustion, @orlp, @ritchie46, @ruihe774, @stinodego, @szepeviktor and @wence-
Published by github-actions[bot] 3 months ago
- __all__ (#17494)
- sink methods (#17698)
- read_database issue with batched reads from Snowflake (#17688)
- value_counts methods based on normalize parameter (#17685)
- setuptools to fix failing CI (#17695)
- sink methods (#17698)
- setuptools to fix failing CI (#17695)
- ComputeNode in new streaming engine (#17389)

Thank you to all our contributors for making this release possible!
@5j9, @ByteNybbler, @MarcoGorelli, @alexander-beedie, @coastalwhite, @diegoglozano, @eitsupi, @nameexhaustion, @orlp, @ragyabraham, @ritchie46 and @ruihe774
Published by github-actions[bot] 3 months ago
- scan functions (#17616)
- ArrayChunks to optimize codegen of BatchDecoder (#17632)
- infer_schema parameter to read_csv / scan_csv (#17617)
- returns_scalar to map_elements (#17613)
- describe on decimal (#15092)
- write_database (#17470)
- pivot_schema (#17611)
- sort_by_exprs() (#17606)
- O_CLOEXEC on duplicated file descriptor (#17537)
- collect in file scan methods (#17532)
- retries parameter in scan functions not taking effect when it was set to 0 (#17564)
- .list.(get|gather) (#17511)
- scan_ipc does not go through fsspec (#17495)
- sink_csv (#17476)
- plot docs to refer to docstrings (#17504)
- str.lengths to str.len_bytes in description text (#11577) (#17626)
- polars.Expr.bin.decode (#17508)
- read_database_uri docstring (#17536)
- DataFrame.melt and LazyFrame.melt (#17530)
- write_parquet_partitioned (#17488)
- ArrayChunks to optimize codegen of BatchDecoder (#17632)
- utils to path_utils in polars-io (#17635)
- with_column method of PyLazyFrame (#17607)
- style accessor to DataFrame (#17502)
- is_supported_cloud util (#17493)

Thank you to all our contributors for making this release possible!
@Julian-J-S, @MarcoGorelli, @alexander-beedie, @anergictcell, @arnabanimesh, @brandon-b-miller, @cmdlineluser, @coastalwhite, @deanm0000, @eitsupi, @flisky, @henryharbeck, @itamarst, @jonaylor89, @moritzwilksch, @nameexhaustion, @orlp, @phi-friday, @r-brink, @rcorty, @ritchie46, @ruihe774, @stinodego, @tylerriccio33 and @wence-
Published by github-actions[bot] 4 months ago
- scan_ipc (#17434)
- Series.__getitem__ (#17408)
- read_excel engines (#17448)
- from_pandas for string columns with missing values (#17397)
- SQL interface (#17400)
- slice length no longer allowing None (#17372)
- SchemaError exception message (#17350)
- partition_by docstring to match new behavior (#17394)
- GroupBy.__iter__ docstring to match new behavior (#17383)
- np.trapz in tests to prepare for NumPy 2.0 (#17387)
- sink_csv test (#17386)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @brunobbaraujo, @cmdlineluser, @coastalwhite, @dependabot, @dependabot[bot], @nameexhaustion, @orlp, @phi-friday, @ritchie46, @ruihe774, @sherlockbeard, @stinodego, @tylerriccio33 and @wence-
Published by github-actions[bot] 4 months ago
- unique performance by adding RangedUniqueKernel for primitive arrays (#17166)
- NATURAL joins and the COLUMNS function (#17295)
- str.extract_many expression (#17304)
- SQL Struct/JSON field access operators (#17226)
- ORDER BY ALL syntax (#17212)
- ^@ ("starts with"), and ~~, ~~*, !~~, !~~* ("like", "ilike") string-matching operators (#17251)
- SELECT * ILIKE wildcard syntax (#17169)
- SQL temporal functions STRFTIME and STRPTIME, and typed literal syntax (#17245)
- round/ceil/floor on integer types (#17241)
- write_csv/write_json (#14209)
- float_scientific option to write_csv/sink_csv (#17111)
- sink_csv fails (#17313)
- list.get for column index (#17276)
- list.get (#17262)
- nulls_last parameter in aggregate sort_by (#17249)
- DataFrame.top_k not handling nulls correctly (#17239)
- selector set ops (#17299)
- CAST and TRY_CAST functions (#17214)
- iter in list.get (#17286)
- select and with_columns to new streaming engine (#17185)
- chrono's ParseErrorKind is now public (#17201)

Thank you to all our contributors for making this release possible!
@IvanIsCoding, @JamesCE2001, @MarcoGorelli, @SeanTater, @adamreeve, @alexander-beedie, @coastalwhite, @datapythonista, @flisky, @itamarst, @jqnatividad, @lukeshingles, @mcrumiller, @nameexhaustion, @orlp, @ritchie46, @stinodego and @wence-
Published by github-actions[bot] 4 months ago
This is the first major release for Python Polars. Please check out the upgrade guide for help navigating the breaking changes when upgrading to this version.
- read_excel to "calamine" (#17263)
- pyproject.toml (#17168)
- read/scan_parquet to disable Hive partitioning by default for file inputs (#17106)
- replace functionality into two separate methods (#16921)
- compression argument as keyword-only (#17084)
- ModuleUpgradeRequired and PolarsPanicError error, remove InvalidAssert error (#17033)
- strict parameter in Series constructor (#16939)
- reshape to return Array types instead of List types (#16825)
- get/gather operations (#16841)
- selector XOR set operation, guarantee consistent selector column-order (#16833)
- infer_schema_length as keyword-only argument in str.json_decode (#16835)
- set_sorted to only accept a single column (#16800)
- Series.cut/qcut and update struct field names (#16741)
- offset in group_by_dynamic from 'negative every' to 'zero' (#16658)
- DataFrame.sql in favor of top-level pl.sql (#16598)
- Array type instead of List (#16710)
- clip to no longer propagate nulls in the given bounds (#14413)
- str.to_datetime to default to microsecond precision for format specifiers "%f" and "%.f" (#13597)
- pivot when pivoting by multiple values (#16439)
- ewm_mean, ewm_std, and ewm_var (#15503)
- pl.read_json and DataFrame.write_json (#16550)
- nth to allow positional input of indices, remove columns parameter (#16510)
- rle output to len/value and update data type of len field (#15249)
- check_names parameter to Series.equals and default to False (#16610)
- LazyFrame.fetch (#17278)
- size parameter in parametric testing strategies in favor of min_size/max_size (#17128)
- replace functionality into two separate methods (#16921)
- DataFrame.melt to unpivot and make parameters consistent with pivot (#17095)
- dt.mean/dt.median in favor of mean/median (#16888)
- LazyFrame.with_context in favor of horizontal concatenation (#16860)
- descending to reverse in top_k methods (#16817)
- str.concat to str.join and update default delimiter (#16790)
- arctan2d in favor of arctan2(...).degrees() (#16786)
- group_by iteration (#17302)
- unique performance by adding RangedUniqueKernel for primitive arrays (#17166)
- unique performance by creating UniqueKernel and improve bool implementation (#17160)
- compression argument as keyword-only (#17084)
- if-then-else view kernel (#16993)
- AND filter into multiple nodes (#16992)
- arg_sort of row-encoding (#16894)
- rle_id iteration performance and set sorted flags (#16893)
- sort for String and Binary types (#16871)
- split_at in split (#16865)
- split_at instead of double slice in chunk splits. (#16856)
- align_ if arrays are aligned (#16850)
- arg_sort (#16808)
- dt.offset_by 2x for constant durations (#16728)
- join if non-coalesced key isn't projected (#16677)
- dt.truncate 1.5x faster when every is just a single duration (and not an expression) (#16666)
- NATURAL joins and the COLUMNS function (#17295)
- str.extract_many expression (#17304)
- read_excel to "calamine" (#17263)
- LazyFrame.fetch (#17278)
- SQL Struct/JSON field access operators (#17226)
- ORDER BY ALL syntax (#17212)
- ^@ ("starts with"), and ~~, ~~*, !~~, !~~* ("like", "ilike") string-matching operators (#17251)
- SELECT * ILIKE wildcard syntax (#17169)
- SQL temporal functions STRFTIME and STRPTIME, and typed literal syntax (#17245)
- round/ceil/floor on integer types (#17241)
- write_csv/write_json (#14209)
- get_column DataFrame method (#17176)
- float_scientific option to write_csv/sink_csv (#17111)
- Struct field selection in the SQL engine, RENAME and REPLACE select wildcard options (#17109)
- DataFrame.pivot to allow index=None when values is set (#17126)
- read/scan_parquet to disable Hive partitioning by default for file inputs (#17106)
- replace functionality into two separate methods (#16921)
- DataFrame.melt to unpivot and make parameters consistent with pivot (#17095)
- explain and show_graph (#17074)
- pl.col autocompletion for iPython (#17080)
- read_ndjson (#17068)
- strict parameter to DataFrame/LazyFrame.drop and fix behavior to default to True (#17044)
- ModuleUpgradeRequired and PolarsPanicError error, remove InvalidAssert error (#17033)
- rechunk parameter to read_delta (#16991)
- json_normalize (#17015)
- AND filter into multiple nodes (#16992)
- strict parameter in Series constructor (#16939)
- INTERSECT and EXCEPT ops (#16960)
- PerformanceWarning to LazyFrame properties (#16964)
- collect_schema method to LazyFrame and DataFrame (#16929)
- lit (#16950)
- DataFrame.style namespace (#16809)
- Schema class (#16873)
- value_counts (#16917)
- read_csv SQL table reading function defaults (better handle dates) (#16866)
- VALUES clause and inline renaming of columns in CTE & derived table definitions (#16851)
- Enum values in lit (#16858)
- .str.to_datetime when values are offset-aware (#16742)
- reshape to return Array types instead of List types (#16825)
- get/gather operations (#16841)
- SQL "SELECT" with no tables, optimise registration of globals (#16836)
- selector XOR set operation, guarantee consistent selector column-order (#16833)
- EXTRACT and DATE_PART SQL part abbreviations (#16767)
- set_sorted to only accept a single column (#16800)
- group_by iteration and partition_by to always return tuple keys (#16793)
- read_database_uri passthrough from read_database (#16783)
- pyxlsb engine from read_excel (#16784)
- check_order parameter to assert_series_equal (#16778)
- scan_csv (#16674)
- INTERVAL handling and improve related error messages, update sqlparser-rs lib (#16744)
- ORDER BY clause (#16745)
- pandas and pyarrow objects (#16746)
- Series.cut/qcut and update struct field names (#16741)
- date_range to no longer produce datetime ranges (#16734)
- min_periods as keyword-only for rolling methods (#16738)
- top_k parameters nulls_last, maintain_order, and multithreaded (#16599)
- NULLS FIRST/LAST ordering (#16711)
- INTERVAL strings (#16732)
- offset arg in truncate and round (#16655)
- offset in group_by_dynamic from 'negative every' to 'zero' (#16658)
- DataFrame.sql in favor of top-level pl.sql (#16598)
- Array type instead of List (#16710)
- clip to no longer propagate nulls in the given bounds (#14413)
- str.to_datetime to default to microsecond precision for format specifiers "%f" and "%.f" (#13597)
- pivot when pivoting by multiple values (#16439)
- ewm_mean, ewm_std, and ewm_var (#15503)
- str.to_datetime (#16634)
- pl.read_json and DataFrame.write_json (#16550)
- nth to allow positional input of indices, remove columns parameter (#16510)
- rle output to len/value and update data type of len field (#15249)
- check_names parameter to Series.equals and default to False (#16610)
- SQLInterface and SQLSyntax errors (#16635)
- DIV function support to the SQL interface (#16678)
- sink_csv fails (#17313)
- adbc connections in write_database (#17298)
- list.get for column index (#17276)
- list.get (#17262)
- nulls_last parameter in aggregate sort_by (#17249)
- DataFrame.top_k not handling nulls correctly (#17239)
- lit to address spurious test failure (#17187)
- ChainedWhen should not inherit Expr (#17142)
- fold in certain situations (#17114)
- Series dunder method type signatures (#17053)
- sqlalchemy libraries (#17029)
- sort_by of unequal length (#17026)
- FAST_EXPLODE_LIST metadata (#16951)
- extend() (#16890)
- should_rechunk check (#16852)
- read_excel and read_ods return identical frames across all engines when given empty spreadsheet tables (#16802)
- read_excel (#16840)
- top_k/bottom_k and fix a variety of bugs (#16804)
- DATE_PART SQL syntax/parsing, improve some error messages (#16761)
- pl. qualifier for inner dtypes in to_init_repr (#16235)
- assert_series_equal when categorical_as_str=True (#16700)
- read_database check for SQLAlchemy async Session objects (#16680)
- selector set ops (#17299)
- CAST and TRY_CAST functions (#17214)
- plot namespace as unstable (#17205)
- concat_list (#17127)
- DataFrame.unique docstring (#17119)
- InProcessQuery in docs, mark as unstable (#17097)
- write_parquet docstring (#16909)
- select and with_columns to idiomatic form (#16801)
- DataFrame.limit (#16753)
- include_nulls in DataFrame.update docstring (#16701)
- DataFrame.rolling (#16600)
- Expr/Series.map_elements (#16079)
- polars.sql docs entry and small docstring update (#16656)
- pyproject.toml (#17168)
- < 2.0.0 for now (#17060)
- iter in list.get (#17286)
- type_aliases module to _typing (#17282)
- cargo.toml (#17145)
- concat_list (#17120)
- orient="row" in DataFrame constructor when applicable (#16977)
- Arc from FileCacheEntry (#16870)
- infer_schema_length as keyword-only argument in str.json_decode (#16835)
- ChunkedArray::from_chunks_and_dtype (#16697)
- 1.0.0 release (#16705)

Thank you to all our contributors for making this release possible!
@IvanIsCoding, @JamesCE2001, @JulianCologne, @KDruzhkin, @Kylea650, @MarcoGorelli, @Mottl, @Object905, @SeanTater, @adamreeve, @alexander-beedie, @bertiewooster, @borchero, @c-peters, @coastalwhite, @datapythonista, @datenzauberai, @dependabot, @dependabot[bot], @eitsupi, @flisky, @henryharbeck, @itamarst, @jqnatividad, @lukeshingles, @machow, @marenwestermann, @mcrumiller, @montanarograziano, @nameexhaustion, @orlp, @p3i0t, @ritchie46, @sherlockbeard, @stinodego, @tkellogg, @universalmind303 and @wence-
Published by github-actions[bot] 4 months ago
- hive_partitioning parameter default to None, which is automatically enabled for single directory inputs, and disabled otherwise (#17106)
- replace functionality into two separate functions (#16921)
- strict parameter to DataFrame/LazyFrame.drop and fix behavior to default to True (#17044)
- ModuleUpgradeRequired and PolarsPanicError error, remove InvalidAssert error (#17033)
- strict parameter in Series constructor (#16939)
- reshape to return Array types instead of List types (#16825)
- get/gather operations (#16841)
- selector XOR set operation, guarantee consistent selector column-order (#16833)
- infer_schema_length as keyword-only argument in str.json_decode (#16835)
- set_sorted to only accept a single column (#16800)
- Series.cut/qcut and update struct field names (#16741)
- offset in group_by_dynamic from 'negative every' to 'zero' (#16658)
- DataFrame.sql in favor of top-level pl.sql (#16598)
- Array type instead of List (#16710)
- clip to no longer propagate nulls in the given bounds (#14413)
- str.to_datetime to default to microsecond precision for format specifiers "%f" and "%.f" (#13597)
- pivot when pivoting by multiple values (#16439)
- ewm_mean, ewm_std, and ewm_var (#15503)
- pl.read_json and DataFrame.write_json (#16550)
- nth to allow positional input of indices, remove columns parameter (#16510)
- rle output to len/value and update data type of len field (#15249)
- check_names parameter to Series.equals and default to False (#16610)
- size parameter in parametric testing strategies in favor of min_size/max_size (#17128)
- replace functionality into two separate functions (#16921)
- DataFrame.melt to unpivot and make parameters consistent with pivot (#17095)
- dt.mean/dt.median in favor of mean/median (#16888)
- LazyFrame.with_context in favor of horizontal concatenation (#16860)
- descending to reverse in top_k methods (#16817)
- str.concat to str.join and update default delimiter (#16790)
- arctan2d in favor of arctan2(...).degrees() (#16786)
- AND filter into multiple nodes (#16992)
- split_at in split (#16865)
- split_at instead of double slice in chunk splits. (#16856)
- align_ if arrays are aligned (#16850)
- arg_sort (#16808)
- dt.offset_by 2x for constant durations (#16728)
- join if non-coalesced key isn't projected (#16677)
- dt.truncate 1.5x faster when every is just a single duration (and not an expression) (#16666)
- float_scientific option to write_csv/sink_csv (#17111)
- Struct field selection in the SQL engine, RENAME and REPLACE select wildcard options (#17109)
- DataFrame.pivot to allow index=None when values is set (#17126)
- hive_partitioning parameter default to None, which is automatically enabled for single directory inputs, and disabled otherwise (#17106)
- replace functionality into two separate functions (#16921)
- DataFrame.melt to unpivot and make parameters consistent with pivot (#17095)
- pl.col autocompletion for iPython (#17080)
- strict parameter to DataFrame/LazyFrame.drop and fix behavior to default to True (#17044)
- ModuleUpgradeRequired and PolarsPanicError error, remove InvalidAssert error (#17033)
- rechunk parameter to read_delta (#16991)
- json_normalize (#17015)
- AND filter into multiple nodes (#16992)
- strict parameter in Series constructor (#16939)
- INTERSECT and EXCEPT ops (#16960)
- PerformanceWarning to LazyFrame properties (#16964)
- collect_schema method to LazyFrame and DataFrame (#16929)
- lit (#16950)
- Schema class (#16873)
- value_counts (#16917)
- eq/ne for more FixedSizeLists (#16902)
- read_csv SQL table reading function defaults (better handle dates) (#16866)
- VALUES clause and inline renaming of columns in CTE & derived table definitions (#16851)
- Enum values in lit (#16858)
- .str.to_datetime when values are offset-aware (#16742)
- reshape to return Array types instead of List types (#16825)
- get/gather operations (#16841)
- SQL "SELECT" with no tables, optimise registration of globals (#16836)
- selector XOR set operation, guarantee consistent selector column-order (#16833)
- EXTRACT and DATE_PART SQL part abbreviations (#16767)
- set_sorted to only accept a single column (#16800)
- group_by iteration and partition_by to always return tuple keys (#16793)
- read_database_uri passthrough from read_database (#16783)
- pyxlsb engine from read_database (#16784)
- check_order parameter to assert_series_equal (#16778)
- scan_csv (#16674)
- INTERVAL handling and improve related error messages, update sqlparser-rs lib (#16744)
- ORDER BY clause (#16745)
- pandas and pyarrow objects (#16746)
- Series.cut/qcut and update struct field names (#16741)
- date_range to no longer produce datetime ranges (#16734)
- min_periods as keyword-only for rolling methods (#16738)
- top_k parameters nulls_last, maintain_order, and multithreaded (#16599)
- NULLS FIRST/LAST ordering (#16711)
- INTERVAL strings (#16732)
- offset arg in truncate and round (#16655)
- offset in group_by_dynamic from 'negative every' to 'zero' (#16658)
- DataFrame.sql in favor of top-level pl.sql (#16598)
- Array type instead of List (#16710)
- clip to no longer propagate nulls in the given bounds (#14413)
- str.to_datetime to default to microsecond precision for format specifiers "%f" and "%.f" (#13597)
- pivot when pivoting by multiple values (#16439)
- ewm_mean, ewm_std, and ewm_var (#15503)
- str.to_datetime (#16634)
- pl.read_json and DataFrame.write_json (#16550)
- nth to allow positional input of indices, remove columns parameter (#16510)
- rle output to len/value and update data type of len field (#15249)
- check_names parameter to Series.equals and default to False (#16610)
- SQLInterface and SQLSyntax errors (#16635)
- DIV function support to the SQL interface (#16678)
- ChainedWhen should not inherit Expr (#17142)
- GetOutput::get_field fallible (#17114)
- Series dunder method type signatures (#17053)
- sqlalchemy libraries (#17029)
- FAST_EXPLODE_LIST metadata (#16951)
- extend() (#16890)
- should_rechunk check (#16852)
- read_excel and read_ods return identical frames across all engines when given empty spreadsheet tables (#16802)
- read_excel (#16840)
- top_k/bottom_k and fix a variety of bugs (#16804)
- DATE_PART SQL syntax/parsing, improve some error messages (#16761)
- pl. qualifier for inner dtypes in to_init_repr (#16235)
- assert_series_equal when categorical_as_str=True (#16700)
- read_database check for SQLAlchemy async Session objects (#16680)
- concat_list (#17127)
- DataFrame.unique docstring (#17119)
- InProcessQuery in docs, mark as unstable (#17097)
- select and with_columns to idiomatic form (#16801)
- DataFrame.limit (#16753)
- include_nulls in DataFrame.update docstring (#16701)
- DataFrame.rolling (#16600)
- Expr/Series.map_elements (#16079)
- polars.sql docs entry and small docstring update (#16656)
- < 2.0.0 for now (#17060)
- cargo.toml (#17145)
- concat_list (#17120)
- orient="row" in DataFrame constructor when applicable (#16977)
- Arc from FileCacheEntry (#16870)
- infer_schema_length as keyword-only argument in str.json_decode (#16835)
- ChunkedArray::from_chunks_and_dtype (#16697)
- 1.0.0 release (#16705)

Thank you to all our contributors for making this release possible!
@JulianCologne, @KDruzhkin, @Kylea650, @MarcoGorelli, @Mottl, @Object905, @adamreeve, @alexander-beedie, @bertiewooster, @borchero, @c-peters, @coastalwhite, @datapythonista, @datenzauberai, @dependabot, @dependabot[bot], @eitsupi, @henryharbeck, @itamarst, @lukeshingles, @machow, @marenwestermann, @mcrumiller, @montanarograziano, @nameexhaustion, @orlp, @p3i0t, @ritchie46, @sherlockbeard, @stinodego, @tkellogg, @universalmind303 and @wence-
Published by github-actions[bot] 4 months ago
- Struct field selection in the SQL engine, RENAME and REPLACE select wildcard options (#17109)
- DataFrame.pivot to allow index=None when values is set (#17126)
- cargo.toml (#17145)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @coastalwhite, @datapythonista, @eitsupi, @mcrumiller, @nameexhaustion, @orlp, @ritchie46 and @stinodego
Published by github-actions[bot] 4 months ago
- hive_partitioning parameter default to None, which is automatically enabled for single directory inputs, and disabled otherwise (#17106)
- replace functionality into two separate functions (#16921)
- strict parameter to DataFrame/LazyFrame.drop and fix behavior to default to True (#17044)
- ModuleUpgradeRequired and PolarsPanicError error, remove InvalidAssert error (#17033)
- strict parameter in Series constructor (#16939)
- reshape to return Array types instead of List types (#16825)
- get/gather operations (#16841)
- selector XOR set operation, guarantee consistent selector column-order (#16833)
- infer_schema_length as keyword-only argument in str.json_decode (#16835)
- set_sorted to only accept a single column (#16800)
- Series.cut/qcut and update struct field names (#16741)
- offset in group_by_dynamic from 'negative every' to 'zero' (#16658)
- DataFrame.sql in favor of top-level pl.sql (#16598)
- Array type instead of List (#16710)
- clip to no longer propagate nulls in the given bounds (#14413)
- str.to_datetime to default to microsecond precision for format specifiers "%f" and "%.f" (#13597)
- pivot when pivoting by multiple values (#16439)
- ewm_mean, ewm_std, and ewm_var (#15503)
- pl.read_json and DataFrame.write_json (#16550)
- nth to allow positional input of indices, remove columns parameter (#16510)
- rle output to len/value and update data type of len field (#15249)
- check_names parameter to Series.equals and default to False (#16610)
- size parameter in parametric testing strategies in favor of min_size/max_size (#17128)
- replace functionality into two separate functions (#16921)
- DataFrame.melt to unpivot and make parameters consistent with pivot (#17095)
- dt.mean/dt.median in favor of mean/median (#16888)
- LazyFrame.with_context in favor of horizontal concatenation (#16860)
- descending to reverse in top_k methods (#16817)
- str.concat to str.join and update default delimiter (#16790)
- arctan2d in favor of arctan2(...).degrees() (#16786)
- AND filter into multiple nodes (#16992)
- split_at in split (#16865)
- split_at instead of double slice in chunk splits. (#16856)
- align_ if arrays are aligned (#16850)
- arg_sort (#16808)
- dt.offset_by 2x for constant durations (#16728)
- join if non-coalesced key isn't projected (#16677)
- dt.truncate 1.5x faster when every is just a single duration (and not an expression) (#16666)
- DataFrame.pivot to allow index=None when values is set (#17126)
- hive_partitioning parameter default to None, which is automatically enabled for single directory inputs, and disabled otherwise (#17106)
- replace functionality into two separate functions (#16921)
- DataFrame.melt to unpivot and make parameters consistent with pivot (#17095)
- pl.col autocompletion for iPython (#17080)
- strict parameter to DataFrame/LazyFrame.drop and fix behavior to default to True (#17044)
- ModuleUpgradeRequired and PolarsPanicError error, remove InvalidAssert error (#17033)
- rechunk parameter to read_delta (#16991)
- json_normalize (#17015)
- AND filter into multiple nodes (#16992)
- strict parameter in Series constructor (#16939)
- INTERSECT and EXCEPT ops (#16960)
- PerformanceWarning to LazyFrame properties (#16964)
- collect_schema method to LazyFrame and DataFrame (#16929)
- lit (#16950)
- Schema class (#16873)
- value_counts (#16917)
- eq/ne for more FixedSizeLists (#16902)
- read_csv SQL table reading function defaults (better handle dates) (#16866)
- VALUES clause and inline renaming of columns in CTE & derived table definitions (#16851)
- Enum values in lit (#16858)
- .str.to_datetime when values are offset-aware (#16742)
- reshape to return Array types instead of List types (#16825)
- get/gather operations (#16841)
- SQL "SELECT" with no tables, optimise registration of globals (#16836)
- selector XOR set operation, guarantee consistent selector column-order (#16833)
- EXTRACT and DATE_PART SQL part abbreviations (#16767)
- set_sorted to only accept a single column (#16800)
- group_by iteration and partition_by to always return tuple keys (#16793)
- read_database_uri passthrough from read_database (#16783)
- pyxlsb engine from read_database (#16784)
- check_order parameter to assert_series_equal (#16778)
- scan_csv (#16674)
- INTERVAL handling and improve related error messages, update sqlparser-rs lib (#16744)
- ORDER BY clause (#16745)
- pandas and pyarrow objects (#16746)
- Series.cut/qcut and update struct field names (#16741)
- date_range to no longer produce datetime ranges (#16734)
- min_periods as keyword-only for rolling methods (#16738)
- top_k parameters nulls_last, maintain_order, and multithreaded (#16599)
- NULLS FIRST/LAST ordering (#16711)
- INTERVAL strings (#16732)
- offset arg in truncate and round (#16655)
- offset in group_by_dynamic from 'negative every' to 'zero' (#16658)
- DataFrame.sql in favor of top-level pl.sql (#16598)
- Array type instead of List (#16710)
- clip to no longer propagate nulls in the given bounds (#14413)
- str.to_datetime to default to microsecond precision for format specifiers "%f" and "%.f" (#13597)
- pivot when pivoting by multiple values (#16439)
- ewm_mean, ewm_std, and ewm_var (#15503)
- str.to_datetime (#16634)
- pl.read_json and DataFrame.write_json (#16550)
- nth to allow positional input of indices, remove columns parameter (#16510)
- rle output to len/value and update data type of len field (#15249)
- check_names parameter to Series.equals and default to False (#16610)
- SQLInterface and SQLSyntax errors (#16635)
- DIV function support to the SQL interface (#16678)
- GetOutput::get_field fallible (#17114)
- Series dunder method type signatures (#17053)
- sqlalchemy libraries (#17029)
- FAST_EXPLODE_LIST metadata (#16951)
- extend() (#16890)
- should_rechunk check (#16852)
- read_excel and read_ods return identical frames across all engines when given empty spreadsheet tables (#16802)
- read_excel (#16840)
- top_k/bottom_k and fix a variety of bugs (#16804)
- DATE_PART SQL syntax/parsing, improve some error messages (#16761)
- pl.
qualifier for inner dtypes in to_init_repr
(#16235)assert_series_equal
when categorical_as_str=True
(#16700)read_database
check for SQLAlchemy async Session objects (#16680)concat_list
(#17127)DataFrame.unique
docstring (#17119)InProcessQuery
in docs, mark as unstable (#17097)select
and with_columns
to idiomatic form (#16801)DataFrame.limit
(#16753)include_nulls
in DataFrame.update
docstring (#16701)DataFrame.rolling
(#16600)Expr/Series.map_elements
(#16079)polars.sql
docs entry and small docstring update (#16656)< 2.0.0
for now (#17060)concat_list
(#17120)orient="row"
in DataFrame constructor when applicable (#16977)Arc
from FileCacheEntry
(#16870)infer_schema_length
as keyword-only argument in str.json_decode
(#16835)ChunkedArray::from_chunks_and_dtype
(#16697)1.0.0
release (#16705)Thank you to all our contributors for making this release possible!
@JulianCologne, @KDruzhkin, @Kylea650, @MarcoGorelli, @Mottl, @Object905, @alexander-beedie, @bertiewooster, @borchero, @c-peters, @coastalwhite, @datenzauberai, @dependabot, @dependabot[bot], @henryharbeck, @itamarst, @machow, @marenwestermann, @mcrumiller, @montanarograziano, @nameexhaustion, @orlp, @p3i0t, @ritchie46, @sherlockbeard, @stinodego, @tkellogg, @universalmind303 and @wence-
Published by github-actions[bot] 4 months ago
hive_partitioning parameter default to None, which is automatically enabled for single directory inputs, and disabled otherwise (#17106)
replace functionality into two separate functions (#16921)
DataFrame.melt to unpivot and make parameters consistent with pivot (#17095)
strict parameter to DataFrame/LazyFrame.drop and fix behavior to default to True (#17044)
selector XOR set operation, guarantee consistent selector column-order (#16833)
str.concat to str.join and update default delimiter (#16790)
Series.cut/qcut and update struct field names (#16741)
offset in group_by_dynamic from 'negative every' to 'zero' (#16658)
clip to no longer propagate nulls in the given bounds (#14413)
str.to_datetime to default to microsecond precision for format specifiers "%f" and "%.f" (#13597)
pivot when pivoting by multiple values (#16439)
ewm_mean, ewm_std, and ewm_var (#15503)
rle output to len/value and update data type of len field (#15249)
check_names parameter to Series.equals and default to False (#16610)
str.explode in favor of str.split("").explode() (#16508)
how="outer" join type in favour of how="full" (left/right are *also* outer joins) (#16417)
DataFrame.is_empty() to check height == 0 instead of width == 0 (#16351)
AND filter into multiple nodes (#16992)
split_at in split (#16865)
split_at instead of double slice in chunk splits. (#16856)
align_ if arrays are aligned (#16850)
arg_sort (#16808)
dt.offset_by 2x for constant durations (#16728)
join if non-coalesced key isn't projected (#16677)
dt.truncate 1.5x faster when every is just a single duration (and not an expression) (#16666)
hive_partitioning parameter default to None, which is automatically enabled for single directory inputs, and disabled otherwise (#17106)
replace functionality into two separate functions (#16921)
DataFrame.melt to unpivot and make parameters consistent with pivot (#17095)
pl.col autocompletion for iPython (#17080)
strict parameter to DataFrame/LazyFrame.drop and fix behavior to default to True (#17044)
AND filter into multiple nodes (#16992)
POLARS_METADATA_FLAGS=extensive (#16963)
INTERSECT and EXCEPT ops (#16960)
value_counts (#16917)
eq/ne for more FixedSizeLists (#16902)
read_csv SQL table reading function defaults (better handle dates) (#16866)
VALUES clause and inline renaming of columns in CTE & derived table definitions (#16851)
.str.to_datetime when values are offset-aware (#16742)
SQL "SELECT" with no tables, optimise registration of globals (#16836)
selector XOR set operation, guarantee consistent selector column-order (#16833)
EXTRACT and DATE_PART SQL part abbreviations (#16767)
scan_csv (#16674)
INTERVAL handling and improve related error messages, update sqlparser-rs lib (#16744)
ORDER BY clause (#16745)
pandas and pyarrow objects (#16746)
env locked metadata functions (#16719)
Series.cut/qcut and update struct field names (#16741)
date_range to no longer produce datetime ranges (#16734)
top_k parameters nulls_last, maintain_order, and multithreaded (#16599)
NULLS FIRST/LAST ordering (#16711)
INTERVAL strings (#16732)
offset arg in truncate and round (#16655)
offset in group_by_dynamic from 'negative every' to 'zero' (#16658)
clip to no longer propagate nulls in the given bounds (#14413)
str.to_datetime to default to microsecond precision for format specifiers "%f" and "%.f" (#13597)
pivot when pivoting by multiple values (#16439)
ewm_mean, ewm_std, and ewm_var (#15503)
str.to_datetime (#16634)
rle output to len/value and update data type of len field (#15249)
check_names parameter to Series.equals and default to False (#16610)
SQLInterface and SQLSyntax errors (#16635)
DIV function support to the SQL interface (#16678)
write_parquet::statistics parameter (#16575)
nulls_last on sort operations (#16639)
split_at method to arrow Array (#16620)
ARRAY literals and the UNNEST table function (#16330)
struct.with_fields in grouping (#16629)
TRY_CAST function (#16589)
group_by_dynamic, upsample, and rolling (#16494)
ChunkedArray (#16399)
is_column_selection() to expression meta, enhance expand_selector (#16479)
value_counts "count" column (#16434)
field expression as selector with a struct scope (#16402)
DataFrame.is_empty() to check height == 0 instead of width == 0 (#16351)
GetOutput::get_field fallible (#17114)
sqlalchemy libraries (#17029)
FAST_EXPLODE_LIST metadata (#16951)
extend() (#16890)
should_rechunk check (#16852)
describe / explain streaming plan (#16771)
top_k/bottom_k and fix a variety of bugs (#16804)
DATE_PART SQL syntax/parsing, improve some error messages (#16761)
read_database check for SQLAlchemy async Session objects (#16680)
ORDER BY should not cause reordering of SELECT cols (#16579)
Series in LazyFrame.select() (#16592)
floordiv (#16578)
JOIN issues (#16507)
cluster_with_columns, found with small fuzzer (#16562)
cluster_with_columns (#16548)
sum over a list of strs (#16521)
split_chunks for nested dtypes (#16493)
COUNT(*) in SQL GROUP BY operations (#16465)
nightly-2024-06-03 (#16669)
replace functionality into two separate functions (#16921)
DataFrame.melt to unpivot and make parameters consistent with pivot (#17095)
Arc from FileCacheEntry (#16870)
MutableBitmap.null_count method (#16797)
str.concat to str.join and update default delimiter (#16790)
ChunkedArray::from_chunks_and_dtype (#16697)
Aggregation evaluation PhysicalExpr from conversion (#16688)
1.0.0 release (#16705)
try_{add, mul, ...} ops in borrowed dispatch (#16580)
dates_times into separate date and time modules (#16667)
pushable and potential_pushable (#16626)
min_value, max_value and distinct_count (#16593)
str.explode in favor of str.split("").explode() (#16508)
Statistics into enum instead of trait (#16485)
pushable_set_bits and reserve space for input_exprs (#16468)
how="outer" join type in favour of how="full" (left/right are *also* outer joins) (#16417)
Thank you to all our contributors for making this release possible!
@BGR360, @JulianCologne, @KDruzhkin, @Kylea650, @MarcoGorelli, @Mottl, @Object905, @alexander-beedie, @ankane, @bertiewooster, @borchero, @c-peters, @cmdlineluser, @coastalwhite, @dangotbanned, @datenzauberai, @dependabot, @dependabot[bot], @hattajr, @henryharbeck, @itamarst, @machow, @marenwestermann, @mcrumiller, @mdavis-xyz, @messense, @montanarograziano, @nameexhaustion, @orlp, @p3i0t, @r-brink, @ritchie46, @siddharth-gulia, @stinodego, @tkellogg, @twoertwein, @universalmind303 and @wence-
Published by github-actions[bot] 4 months ago
strict parameter in Series constructor (#16939)
reshape to return Array types instead of List types (#16825)
get/gather operations (#16841)
selector XOR set operation, guarantee consistent selector column-order (#16833)
infer_schema_length as keyword-only argument in str.json_decode (#16835)
set_sorted to only accept a single column (#16800)
Series.cut/qcut and update struct field names (#16741)
offset in group_by_dynamic from 'negative every' to 'zero' (#16658)
DataFrame.sql in favor of top-level pl.sql (#16598)
Array type instead of List (#16710)
clip to no longer propagate nulls in the given bounds (#14413)
str.to_datetime to default to microsecond precision for format specifiers "%f" and "%.f" (#13597)
pivot when pivoting by multiple values (#16439)
ewm_mean, ewm_std, and ewm_var (#15503)
pl.read_json and DataFrame.write_json (#16550)
nth to allow positional input of indices, remove columns parameter (#16510)
rle output to len/value and update data type of len field (#15249)
check_names parameter to Series.equals and default to False (#16610)
dt.mean/dt.median in favor of mean/median (#16888)
LazyFrame.with_context in favor of horizontal concatenation (#16860)
descending to reverse in top_k methods (#16817)
str.concat to str.join and update default delimiter (#16790)
arctan2d in favor of arctan2(...).degrees() (#16786)
AND filter into multiple nodes (#16992)
split_at in split (#16865)
split_at instead of double slice in chunk splits. (#16856)
align_ if arrays are aligned (#16850)
arg_sort (#16808)
dt.offset_by 2x for constant durations (#16728)
join if non-coalesced key isn't projected (#16677)
dt.truncate 1.5x faster when every is just a single duration (and not an expression) (#16666)
json_normalize (#17015)
AND filter into multiple nodes (#16992)
strict parameter in Series constructor (#16939)
INTERSECT and EXCEPT ops (#16960)
PerformanceWarning to LazyFrame properties (#16964)
collect_schema method to LazyFrame and DataFrame (#16929)
lit (#16950)
Schema class (#16873)
value_counts (#16917)
eq/ne for more FixedSizeLists (#16902)
read_csv SQL table reading function defaults (better handle dates) (#16866)
VALUES clause and inline renaming of columns in CTE & derived table definitions (#16851)
Enum values in lit (#16858)
.str.to_datetime when values are offset-aware (#16742)
reshape to return Array types instead of List types (#16825)
get/gather operations (#16841)
SQL "SELECT" with no tables, optimise registration of globals (#16836)
selector XOR set operation, guarantee consistent selector column-order (#16833)
EXTRACT and DATE_PART SQL part abbreviations (#16767)
set_sorted to only accept a single column (#16800)
group_by iteration and partition_by to always return tuple keys (#16793)
read_database_uri passthrough from read_database (#16783)
pyxlsb engine from read_database (#16784)
check_order parameter to assert_series_equal (#16778)
scan_csv (#16674)
INTERVAL handling and improve related error messages, update sqlparser-rs lib (#16744)
ORDER BY clause (#16745)
pandas and pyarrow objects (#16746)
Series.cut/qcut and update struct field names (#16741)
date_range to no longer produce datetime ranges (#16734)
min_periods as keyword-only for rolling methods (#16738)
top_k parameters nulls_last, maintain_order, and multithreaded (#16599)
NULLS FIRST/LAST ordering (#16711)
INTERVAL strings (#16732)
offset arg in truncate and round (#16655)
offset in group_by_dynamic from 'negative every' to 'zero' (#16658)
DataFrame.sql in favor of top-level pl.sql (#16598)
Array type instead of List (#16710)
clip to no longer propagate nulls in the given bounds (#14413)
str.to_datetime to default to microsecond precision for format specifiers "%f" and "%.f" (#13597)
pivot when pivoting by multiple values (#16439)
ewm_mean, ewm_std, and ewm_var (#15503)
str.to_datetime (#16634)
pl.read_json and DataFrame.write_json (#16550)
nth to allow positional input of indices, remove columns parameter (#16510)
rle output to len/value and update data type of len field (#15249)
check_names parameter to Series.equals and default to False (#16610)
SQLInterface and SQLSyntax errors (#16635)
DIV function support to the SQL interface (#16678)
FAST_EXPLODE_LIST metadata (#16951)
extend() (#16890)
should_rechunk check (#16852)
read_excel and read_ods return identical frames across all engines when given empty spreadsheet tables (#16802)
read_excel (#16840)
top_k/bottom_k and fix a variety of bugs (#16804)
DATE_PART SQL syntax/parsing, improve some error messages (#16761)
pl. qualifier for inner dtypes in to_init_repr (#16235)
assert_series_equal when categorical_as_str=True (#16700)
read_database check for SQLAlchemy async Session objects (#16680)
select and with_columns to idiomatic form (#16801)
DataFrame.limit (#16753)
include_nulls in DataFrame.update docstring (#16701)
DataFrame.rolling (#16600)
Expr/Series.map_elements (#16079)
polars.sql docs entry and small docstring update (#16656)
orient="row" in DataFrame constructor when applicable (#16977)
Arc from FileCacheEntry (#16870)
infer_schema_length as keyword-only argument in str.json_decode (#16835)
ChunkedArray::from_chunks_and_dtype (#16697)
1.0.0 release (#16705)
Thank you to all our contributors for making this release possible!
@JulianCologne, @KDruzhkin, @MarcoGorelli, @Object905, @alexander-beedie, @bertiewooster, @borchero, @coastalwhite, @datenzauberai, @dependabot, @dependabot[bot], @henryharbeck, @itamarst, @machow, @marenwestermann, @mcrumiller, @montanarograziano, @nameexhaustion, @orlp, @ritchie46, @siddharth-gulia, @stinodego, @tkellogg, @universalmind303 and @wence-
Published by github-actions[bot] 4 months ago
reshape to return Array types instead of List types (#16825)
get/gather operations (#16841)
selector XOR set operation, guarantee consistent selector column-order (#16833)
infer_schema_length as keyword-only argument in str.json_decode (#16835)
set_sorted to only accept a single column (#16800)
group_by iteration and partition_by to always return tuple keys (#16793)
coalesce=False in left outer join (#16769)
pyxlsb engine from read_database (#16784)
Series.cut/qcut and update struct field names (#16741)
top_k parameters nulls_last, maintain_order, and multithreaded (#16599)
offset arg in truncate and round (#16655)
offset in group_by_dynamic from 'negative every' to 'zero' (#16658)
DataFrame.sql in favor of top-level pl.sql (#16598)
Array instead of List (#16710)
clip to no longer propagate nulls in the given bounds (#14413)
str.to_datetime to default to microsecond precision for format specifiers "%f" and "%.f" (#13597)
pivot when pivoting by multiple values (#16439)
ewm_mean, ewm_std, and ewm_var (#15503)
pl.read_json and DataFrame.write_json (#16550)
nth to allow positional input of indices, remove columns parameter (#16510)
rle output to len/value and update data type of len field (#15249)
check_names parameter to Series.equals and default to False (#16610)
LazyFrame.with_context (#16860)
descending to reverse in top_k methods (#16817)
str.concat to str.join (#16790)
arctan2d (#16786)
split_at in split (#16865)
split_at instead of double slice in chunk splits. (#16856)
align_ if arrays are aligned (#16850)
arg_sort (#16808)
dt.offset_by 2x for constant durations (#16728)
dt.truncate 1.5x faster when every is just a single duration (and not an expression) (#16666)
read_csv SQL table reading function defaults (better handle dates) (#16866)
VALUES clause and inline renaming of columns in CTE & derived table definitions (#16851)
Enum values in lit (#16858)
.str.to_datetime when values are offset-aware (#16742)
reshape to return Array types instead of List types (#16825)
get/gather operations (#16841)
SQL "SELECT" with no tables, optimise registration of globals (#16836)
selector XOR set operation, guarantee consistent selector column-order (#16833)
EXTRACT and DATE_PART SQL part abbreviations (#16767)
set_sorted (#16800)
coalesce=False in left outer join (#16769)
read_database_uri passthrough from read_database (#16783)
pyxlsb engine from read_database (#16784)
check_order parameter to assert_series_equal (#16778)
scan_csv (#16674)
INTERVAL handling and improve related error messages, update sqlparser-rs lib (#16744)
ORDER BY clause (#16745)
pandas and pyarrow objects (#16746)
Series.cut/qcut (#16741)
date_range to no longer produce datetime ranges (#16734)
min_periods as keyword-only for rolling methods (#16738)
top_k parameters (#16599)
NULLS FIRST/LAST ordering (#16711)
INTERVAL strings (#16732)
offset arg in truncate and round (#16655)
offset in group_by_dynamic from "negative every" to "zero" (#16658)
df.sql in favour of top-level pl.sql (#16598)
clip bounds (#14413)
.str.to_datetime to default to microsecond precision for format specifiers "%f" and "%.f" (#13597)
ewm_mean, ewm_std, and ewm_var (#15503)
str.to_datetime (#16634)
pl.read_json and DataFrame.write_json (#16550)
nth to allow positional input of indices, remove columns parameter (#16510)
rle output to len/value and update data type of len field (#15249)
check_names parameter to Series.equals and default to False (#16610)
SQLInterface and SQLSyntax errors (#16635)
DIV function support to the SQL interface (#16678)
should_rechunk check (#16852)
read_excel and read_ods return identical frames across all engines when given empty spreadsheet tables (#16802)
read_excel (#16840)
top_k/bottom_k and fix a variety of bugs (#16804)
DATE_PART SQL syntax/parsing, improve some error messages (#16761)
pl. qualifier for inner dtypes in to_init_repr (#16235)
assert_series_equal when categorical_as_str=True (#16700)
read_database check for SQLAlchemy async Session objects (#16680)
select and with_columns to idiomatic form (#16801)
DataFrame.limit (#16753)
include_nulls in DataFrame.update docstring (#16701)
DataFrame.rolling (#16600)
Expr/Series.map_elements (#16079)
polars.sql docs entry and small docstring update (#16656)
Arc from FileCacheEntry (#16870)
infer_schema_length as keyword-only for str.json_decode (#16835)
ChunkedArray::from_chunks_and_dtype (#16697)
1.0.0 release (#16705)
Thank you to all our contributors for making this release possible!
@JulianCologne, @KDruzhkin, @MarcoGorelli, @Object905, @alexander-beedie, @bertiewooster, @coastalwhite, @datenzauberai, @dependabot, @dependabot[bot], @henryharbeck, @marenwestermann, @mcrumiller, @montanarograziano, @nameexhaustion, @orlp, @ritchie46, @siddharth-gulia, @stinodego, @universalmind303 and @wence-
Published by github-actions[bot] 5 months ago
Important: The decision to change the default coalesce behavior of left join has been reversed. You can ignore the associated deprecation warning.
dtypes parameter to schema_overrides for read_csv/scan_csv/read_csv_batched (#16628)
nulls_last/maintain_order/multithreaded parameters for top_k methods (#16597)
SQLContext "eager_execution" param to "eager" (#16595)
Series.equals parameter strict to check_dtypes and rename assertion utils parameter check_dtype to check_dtypes (#16573)
DataFrame.serialize/deserialize (#16545)
str.explode in favor of str.split("").explode() (#16508)
nulls_last on sort operations (#16639)
ARRAY literals and the UNNEST table function (#16330)
struct.with_fields in grouping (#16629)
TRY_CAST function (#16589)
pl.sql function (#16528)
DataFrame.serialize/deserialize (#16545)
group_by_dynamic, upsample, and rolling (#16494)
ORDER BY should not cause reordering of SELECT cols (#16579)
shape in Array constructor and deprecate width parameter (#16567)
Series in LazyFrame.select() (#16592)
JOIN issues (#16507)
sum over a list of strs (#16521)
DataFrame.__getitem__ for empty list input - df[[]] (#16520)
DataFrame.__getitem__ with 2 column inputs (#16517)
LazyFrame properties may be expensive (#16618)
versionadded tags, and add is_column_selection to the Expr meta docs (#16590)
DataFrame.join docstring (#16576)
implode reference from the user guide section on window functions (#16544)
cargo update (#16574)
typing.no_type_check (#16497)
Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @coastalwhite, @hattajr, @itamarst, @mcrumiller, @nameexhaustion, @r-brink, @ritchie46, @stinodego, @twoertwein and @wence-
Published by github-actions[bot] 5 months ago
Series/Expr.has_nulls and deprecate Series.has_validity (#16488)
tree_format parameter for LazyFrame.explain in favor of format (#16486)
DataFrame.__getitem__ improvements (#16495)
is_column_selection() to expression meta, enhance expand_selector (#16479)
Series/Expr.has_nulls and deprecate Series.has_validity (#16488)
split_chunks for nested dtypes (#16493)
top_k/bottom_k (#16489)
COUNT(*) in SQL GROUP BY operations (#16465)
nan_to_null when using multi-thread in pl.from_pandas (#16459)
pl.field inside with_fields examples. (#16451)
cum_max (#16456)
Series/DataFrame.__getitem__ logic (#16482)
Thank you to all our contributors for making this release possible!
@BGR360, @alexander-beedie, @cmdlineluser, @coastalwhite, @itamarst, @marenwestermann, @mdavis-xyz, @messense, @orlp, @ritchie46 and @stinodego
Published by github-actions[bot] 5 months ago
how="outer" join type in favour of how="full" (left/right are *also* outer joins) (#16417)
DataFrame.to_numpy (#16429)
value_counts "count" column (#16434)
alpha and alphanumeric selectors, add "ascii_only" to digit (#16362)
__array__ method for Series and DataFrame to support copy parameter (#16401)
read_excel dtype inference of "calamine" int/float results that include NaN (#16400)
apply call in str_duration_ util. (#16412)
interpolate_by entry to rst files. (#16422)
Thank you to all our contributors for making this release possible!
@KDruzhkin, @alexander-beedie, @ankane, @cmdlineluser, @coastalwhite, @itamarst, @nameexhaustion, @ritchie46 and @stinodego