siuba

Python library for using dplyr-like syntax with pandas and SQL

MIT License

Downloads
24.6K
Stars
1.1K
Committers
10

siuba - v0.4.4: misc fixes, basic .str.cat sql translation (Latest Release)

Published by machow about 1 year ago

siuba - v0.4.3: compatibility with sqlalchemy v2, and pandas v2

Published by machow about 1 year ago

What's Changed

Full Changelog: https://github.com/machow/siuba/compare/v0.4.2...v0.4.3

siuba - v0.4.2: better handling of all NA grouping columns

Published by machow almost 2 years ago

What's Changed

Full Changelog: https://github.com/machow/siuba/compare/v0.4.1...v0.4.2

siuba - v0.4.1: fix across with grouped data and aggregations in mutate

Published by machow almost 2 years ago

What's Changed

Full Changelog: https://github.com/machow/siuba/compare/v0.4.0...v0.4.1

siuba - v0.4.0rc1

Published by machow about 2 years ago

This PR mainly implements the functions tbl(), across(), pivot_longer(), pivot_wider(). It ensures verbs work with grouped data. As part of implementing across(), it refactored tidyselection and SQL support.

What's Changed

New Contributors

Full Changelog: https://github.com/machow/siuba/compare/v0.3.0...v0.4.0rc1

siuba - v0.3.0: duckdb support, prepare for sql win_over

Published by machow over 2 years ago

What's Changed

New Contributors

Full Changelog: https://github.com/machow/siuba/compare/v0.2.3...v0.3.0

siuba - Fix incorrect numpy handling in previous

Published by machow over 2 years ago

Also adds more tests of calling numpy functions over symbolics

siuba - Support numpy functions (e.g. np.add(_.x, 1))

Published by machow over 2 years ago

What's Changed

Full Changelog: https://github.com/machow/siuba/compare/v0.2.1...v0.2.2

siuba - v0.2.1

Published by machow over 2 years ago

Fixes

Full Changelog: https://github.com/machow/siuba/compare/v0.2.0...v0.2.1

siuba - v0.2.0

Published by machow over 2 years ago

What's Changed

siuba - refactor siu dispatch

Published by machow over 2 years ago

This is a dev release to support...

  • enable creating a LazyTbl from a subquery (for dbcooper-py)
  • early preview of snowflake dialect

Fixes

  • pandas semi_join raised an error when no join cols were specified (#351; PR #374)

Misc

  • Add an extra test for fast_mutate (#355, #372)

siuba - Support pandas 1.3, fix broken windows install

Published by machow over 2 years ago

⚠️ Note: this is a re-release of alpha v1.0.0a3 to be v0.1.1. This will allow users to easily install these releases using pip. Changelog copied below.

Fixes

  • fix(ci): allow testing multiple bigquery test branches at the same time (#360)
  • fix(pandas): fast methods support DataFrame aggs like n() (#363)
  • fix(pandas): support pandas 1.3 (#366)
  • fix(install): fix windows install breaking due to utf-8 in README (#370)

⚠️ Note: this is a re-release of alpha v1.0.0a2 to be v0.1.0. This will allow users to easily install these releases using pip. Changelog copied below.

This is an alpha release for v1.0.0. There will likely be extensive changes through January as I work to refactor the core API.

Big changes

  • Added ops submodule (and removed spec submodule).
    • This contains generic functions for pandas methods.
    • Has method data needed to translate pandas expressions to SQL (e.g. whether something is a property, or uses an accessor).
  • Refactored SQL translation mechanism:
    • Base dialect that others extend.
    • Over clauses in translate.py now have a func class method. This generates a constructor for a specific sql translation.
    • Now uses a pandas translator from ops submodule.

Features:

siuba - Support pandas 1.3, fix broken windows install

Published by machow almost 3 years ago

Fixes

  • fix(ci): allow testing multiple bigquery test branches at the same time (#360)
  • fix(pandas): fast methods support DataFrame aggs like n() (#363)
  • fix(pandas): support pandas 1.3 (#366)
  • fix(install): fix windows install breaking due to utf-8 in README (#370)

This is an alpha release for v1.0.0. There will likely be extensive changes through January as I work to refactor the core API.

Big changes

  • Added ops submodule (and removed spec submodule).
    • This contains generic functions for pandas methods.
    • Has method data needed to translate pandas expressions to SQL (e.g. whether something is a property, or uses an accessor).
  • Refactored SQL translation mechanism:
    • Base dialect that others extend.
    • Over clauses in translate.py now have a func class method. This generates a constructor for a specific sql translation.
    • Now uses a pandas translator from ops submodule.

Features:

siuba - SQLAlchemy 1.4 support and slicing fix

Published by machow over 3 years ago

This release adds support for SQLAlchemy 1.4, does some light refactoring, and fixes a slicing issue.

Fixes

  • Make compatible with SQLAlchemy 1.4. #327
  • Properly represent symbolics when indexing with multiple slices (e.g. _[1:2, _.a:_.b]). #325

Features

  • Refactored fast group operations to use generic functions (regroup(), broadcast_agg(), is_compatible()). #310
  • Enable piping to attributes (e.g. pipe(_.some_col)). #325

QA

  • Migrated to GitHub Actions. #299

siuba - SQL and fast grouped method improvements

Published by machow about 4 years ago

Much of this release is setting up for...

  • fast methods to become drop-in replacements for their slow, reference implementation counterparts!
  • SQL fixes and features for funneljoin-py
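
The relationship between a fast method and its slower reference counterpart, as described in this release (run the fast path when possible, otherwise fall back with a warning, and treat a lambda as an explicit request for the reference route), might look roughly like the following. This is a plain-Python sketch under stated assumptions: `FAST_REDUCERS`, `REFERENCE_FUNCS`, `fast_agg`, `reference_agg`, and `grouped_agg` are hypothetical names, and none of this is siuba's code.

```python
import operator
import statistics
import warnings
from collections import defaultdict

# Hypothetical registries: the fast path only knows single-pass reducers,
# while the reference path can apply any function to a whole group.
FAST_REDUCERS = {"sum": operator.add, "max": max, "min": min}
REFERENCE_FUNCS = {"sum": sum, "max": max, "min": min, "median": statistics.median}


def reference_agg(values, keys, func):
    """Slow route: split into groups, then apply func to each group."""
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)
    return {key: func(vals) for key, vals in groups.items()}


def fast_agg(values, keys, name):
    """Fast route: one pass, folding each value into its group's running result."""
    reducer = FAST_REDUCERS[name]
    out = {}
    for key, value in zip(keys, values):
        out[key] = reducer(out[key], value) if key in out else value
    return out


def grouped_agg(values, keys, op):
    if callable(op):
        # A lambda explicitly asks for the reference route: no warning.
        return reference_agg(values, keys, op)
    if op in FAST_REDUCERS:
        return fast_agg(values, keys, op)
    warnings.warn(f"no fast method for {op!r}; using the slower reference route")
    return reference_agg(values, keys, REFERENCE_FUNCS[op])


values, keys = [1, 3, 5], ["a", "a", "b"]
print(grouped_agg(values, keys, "sum"))                 # {'a': 4, 'b': 5}
print(grouped_agg(values, keys, lambda vs: len(vs)))    # {'a': 2, 'b': 1}
```

Both routes return the same shape of result, which is what makes the fast one usable as a drop-in replacement in this sketch.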

Fixes

  • Pandas anti_join breaking when on argument is a string (#264)
  • Sql now resets order by after joins (#277)
  • Fixed major regression for sql where filter with multiple arguments did strange and terrible things (#277)
  • Sql summarize correctly makes a subquery when group_by references a recently created column (#278)
  • Sql right_join had to switch lhs and rhs, but was not correctly handling all implications of that, specifically how key columns are kept and which suffixes are used. (#279)

Features

  • Fast grouped methods determine whether they can run quickly and, if not, fall back to the slower route with a warning. Lambdas can be used to explicitly take the slower, reference implementation route (#268).
  • Fast grouped methods allow property operations that return dim 0 results, like dtype. (#269)
  • A leading - in sql arrange (e.g. arrange(-_.x)) now produces the correct DESC ordering, rather than being translated as an arithmetic - operation (#280)
  • pandas arrange now works on grouped data. i.e. DataFrameGroupBy. (#280)
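
To see why a descending sort is better expressed as DESC than as arithmetic negation, here is a small sketch using SQLite from the Python standard library. It does not use siuba or its SQL translation; the table `t` and column `name` are made up for illustration, and other SQL dialects may treat negated text differently.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (name TEXT)")
con.executemany("INSERT INTO t VALUES (?)", [("a",), ("c",), ("b",)])

# Arithmetic negation of non-numeric text: SQLite coerces each value to 0,
# so sorting by -name gives the database nothing to distinguish rows by.
negated = con.execute("SELECT -name FROM t").fetchall()
print(negated)       # [(0,), (0,), (0,)]

# DESC reverses the sort order itself, so it works for any sortable type.
descending = [row[0] for row in con.execute("SELECT name FROM t ORDER BY name DESC")]
print(descending)    # ['c', 'b', 'a']
```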

QA

  • a slew of tests!

siuba - Misc bug fixes

Published by machow about 4 years ago

Fixes

  • anti_join in pandas no longer breaks when on argument is a string. E.g. anti_join(df1, df2, on = "some_col") #264
  • fct_collapse now correctly keeps missing values (#262)
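
For readers unfamiliar with anti joins, the following plain-Python sketch shows the operation and one common pitfall when `on` may be either a string or a list of column names. It is an illustration only: it works on lists of dicts rather than DataFrames, the function below is not siuba's `anti_join`, and it makes no claim about the actual cause of the bug fixed in #264.

```python
def anti_join(left, right, on):
    """Keep the rows of `left` that have no match in `right` on the `on` column(s)."""
    # Normalize first: iterating a bare string would yield its characters,
    # so "some_col" must become ["some_col"].
    cols = [on] if isinstance(on, str) else list(on)
    seen = {tuple(row[c] for c in cols) for row in right}
    return [row for row in left if tuple(row[c] for c in cols) not in seen]


df1 = [{"some_col": 1, "x": "a"}, {"some_col": 2, "x": "b"}]
df2 = [{"some_col": 2}]
print(anti_join(df1, df2, on="some_col"))    # [{'some_col': 1, 'x': 'a'}]
```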

Package Rankings
Top 28.86% on Conda-forge.org
Top 2.83% on Pypi.org