python-bigquery-dataframes | Python Ecosystem Directory


python-bigquery-dataframes - v1.11.1 Latest Release

Published by release-please[bot] 3 months ago

1.11.1 (2024-07-08)

Documentation

Remove session and connection in llm notebook (#821) (74170da)
Remove the experimental flask icon from the public docs (#820) (067ff17)

python-bigquery-dataframes - v1.11.0

Published by release-please[bot] 4 months ago

1.11.0 (2024-07-01)

Features

Add .agg support for size (#792) (87e6018)
Add bigframes.bigquery.json_set (#782) (1b613e0)
Add bigframes.streaming.to_pubsub method to create continuous query that writes to Pub/Sub (#801) (b47f32d)
Add DataFrame.to_arrow to create Arrow Table from DataFrame (#807) (1e3feda)
Add PolynomialFeatures support to to_gbq and pipelines (#805) (57d98b9)
Add Series.peek to preview data efficiently (#727) (580e1b9)
Expose gcf memory param in remote_function (#803) (014765c)
More informative error when query plan too complex (#811) (136dc24)

Bug Fixes

Include internally required packages in remote_function hash (#799) (4b8fc15)
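
Why the package list belongs in the hash can be shown with a minimal sketch. This is not the library's implementation; `function_hash` and its parameters are hypothetical names used only for illustration. If the identity of a deployed function is derived only from the user's code, a change in required packages leaves the identity unchanged, so a stale deployment could be reused.

```python
import hashlib


def function_hash(source, packages):
    """Hypothetical helper: hash the source code together with its package requirements."""
    digest = hashlib.sha256()
    digest.update(source.encode("utf-8"))
    # Sort so that the order in which packages are listed does not matter.
    for requirement in sorted(packages):
        digest.update(b"\x00")  # separator between entries
        digest.update(requirement.encode("utf-8"))
    return digest.hexdigest()


code = "def add_one(x):\n    return x + 1\n"
old = function_hash(code, ["numpy==1.26.4"])
new = function_hash(code, ["numpy==1.26.4", "pandas==2.2.2"])
print(old != new)  # the added package changes the hash
```

The package pins above are placeholder values, not versions taken from the release.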

Documentation

Document dtype limitation on row processing remote_function (#800) (487dff6)

python-bigquery-dataframes - v1.10.0

Published by release-please[bot] 4 months ago

1.10.0 (2024-06-21)

Features

Add dataframe.insert (#770) (e8bab68)
Add groupby head API (#791) (44202bc)
Add ml.preprocessing.PolynomialFeatures class (#793) (b4fbb51)
bigframes.streaming module for continuous queries (#703) (0433a1c)
Include index columns in DataFrame.sql if they are named (#788) (c8d16c0)

Bug Fixes

Allow __repr__ to work with uninitialized DataFrame/Series/Index (#778) (e14c7a9)
Fix df.loc with a bigframes boolean Series as the 2nd input (#789) (a4ac82e)
Ensure numpy version matches in remote_function deployment (#798) (324d93c)
Fix temp table creation retries by not throwing if table already exists (#787) (0e57d1f)
Self-join optimization doesn't needlessly invalidate caching (#797) (1b96b80)

python-bigquery-dataframes - v1.9.0

Published by release-please[bot] 4 months ago

1.9.0 (2024-06-10)

Features

Allow functions returned from bpd.read_gbq_function to execute outside of apply (#706) (ad7d8ac)
Support bigquery.vector_search() (#736) (dad66fd)
Support score() in GeminiTextGenerator (#740) (b2c7d8b)
Support bytes type in remote_function (#761) (4915424)
Support fit() in GeminiTextGenerator (#758) (d751f5c)

Bug Fixes

ARIMAPlus loads auto_arima_min_order param (#752) (39d7013)
Improve to_pandas_batches for large results (#746) (61f18cb)
Resolve issue with unset thread-local options (#741) (d93dbaf)

Documentation

Fix ML.EVALUATE spelling (#749) (7899749)
Remove LogisticRegression normal_equation strategy (#753) (ea5d367)

python-bigquery-dataframes - v1.8.0

Published by release-please[bot] 5 months ago

1.8.0 (2024-05-31)

Features

merge only generates a default index if both inputs already have an index (#733) (25d049c)
Add +, - as unary ops, ^ binary op (#724) (968d825)
Add GroupBy.size() to get number of rows in each group (#479) (1fca588)
Add DataFrame ~ operator (#721) (354abc1)
Add GeminiText 1.5 Preview models (#737) (56cbd3b)
Add slot_millis and add stats to session object (#725) (72e9583)
Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings (#731) (f12c906)
Allow functions decorated with bpd.remote_function() to execute locally (#704) (d850da6)
Ensure "bigframes-api" label is always set on jobs, even if the API is unknown (#722) (1832778)
Support ml.SimpleImputer in bigframes (#708) (4c4415f)
Support type annotations to supply input and output types to bpd.remote_function() decorator (#717) (4a12e3c)
Support type annotations with bpd.remote_function() and axis=1 (a preview feature) (#730) (e5a2992)
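
How a decorator can take input and output types from annotations can be sketched in plain Python. This is an illustration of the general technique only, not the behavior of `bpd.remote_function()`; the decorator name `typed_function` and the attributes it sets are hypothetical.

```python
import typing


def typed_function(func):
    """Hypothetical decorator: record input and output types from annotations."""
    hints = typing.get_type_hints(func)
    # The "return" entry holds the output type; the rest are parameter types.
    func.output_type = hints.pop("return", None)
    func.input_types = list(hints.values())
    return func


@typed_function
def add_one(x: int) -> float:
    return x + 1.0


print(add_one.input_types, add_one.output_type)
```

With the types available from the signature, explicit type arguments become optional, which is the motivation for defaulting them to None.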

Bug Fixes

Correct index labels in multiple aggregations for DataFrameGroupBy (#723) (6a78c89)
Fix Null index assign series to column (#711) (ffb4b57)
Set bpd.remote_function()'s input_types and output_types default to None to allow omitting them when type annotations are present (#729) (0e25a3b)
Warn and disable time travel for linked datasets (#712) (085fa9d)

Performance Improvements

Optimize dataframe-series alignment on axis=1 (#732) (3d39221)

Documentation

Add examples to DataFrameGroupBy and SeriesGroupBy (#701) (e7da0f0)

python-bigquery-dataframes - v1.7.0

Published by release-please[bot] 5 months ago

1.7.0 (2024-05-20)

Features

read_gbq_query supports filters (9386373)
read_gbq suggests a correct column name when one is not found (9386373)
Add DefaultIndexKind.NULL to use as index_col in read_gbq*, creating an indexless DataFrame/Series (#662) (29e4886)
bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupBy) (#663) (412f28b)
to_datetime supports utc=False for string inputs (#579) (adf9889)

Bug Fixes

read_gbq_table respects primary keys even when filters are set (#689) (9386373)
Fix type error in test_cluster (#698) (14d81c1)
Improve escaping of literals and identifiers (#682) (da9b136)
Properly identify non-unique index in tables without primary keys (#699) (6e0f4d8)
Remove a usage of the resource package when not available, such as on Windows (#681) (96243f2)
Fix the error in the imported samples and use peek() (#688) (1a0b744)

Performance Improvements

Don't run query immediately from read_gbq_table if filters is set (9386373)
Use a LIMIT clause when max_results is set (9386373)

Documentation

Add code snippets for imported onnx tutorials (#684) (cb36e46)
Add code snippets for imported tensorflow model (#679) (b02c401)
Use class_weight="balanced" in the logistic regression prediction tutorial (#678) (b951549)

python-bigquery-dataframes - v1.6.0

Published by release-please[bot] 5 months ago

1.6.0 (2024-05-13)

Features

Add DataFrame.__delitem__ (#673) (2218c21)
Add Series.case_when() (#673) (2218c21)
Add strategy="quantile" in KBinsDiscretizer (#654) (c6c487f)
Add Series.combine (#680) (2fd1b81)
Series.str.split (#675) (6eb19a7)
Suggest correct options in bpd.options.bigquery.location (#666) (57ccabc)
Support axis=1 in df.apply for scalar outputs (#629) (f6bdc4a)
Support gcf vpc connector in remote_function (#677) (9ca92d0)
Warn with a more specific DefaultLocationWarning category when no location can be detected (#648) (e084e54)

Bug Fixes

Include index_col when selecting columns and filters in read_gbq_table (#648) (e084e54)

Dependencies

Add jellyfish as a dependency for spelling correction (57ccabc)

Documentation

Add code snippets for llm text generation (#669) (93416ed)
Add logistic regression samples (#673) (2218c21)
Address lint errors in code samples (#665) (4fc8964)
Document inlining of small data in read_* APIs (#670) (306953a)

python-bigquery-dataframes - v1.5.0

Published by release-please[bot] 5 months ago

1.5.0 (2024-05-07)

Features

bigframes.options and bigframes.option_context now use thread-local variables to prevent context managers in separate threads from affecting each other (#652) (651fd7d)
Add ARIMAPlus.coef_ property exposing ML.ARIMA_COEFFICIENTS functionality (#585) (81d1262)
Add a unique session_id to Session and allow cleaning up sessions (#553) (c8d4e23)
Add the bigframes.bigquery sub-package with a bigframes.bigquery.array_length function (#630) (9963f85)
Always do a query dry run when option.repr_mode == "deferred" (#652) (651fd7d)
Custom query labels for compute options (#638) (f561799)
Raise NoDefaultIndexError from read_gbq on clustered/partitioned tables with no index_col or filters set (#631) (73064dd)
Support index_col=False in read_csv and engine="bigquery" (73064dd)
Support gcf max instance count in remote_function (#657) (36578ab)
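
The thread-local options change can be illustrated with the standard library. This is a sketch of the general mechanism, not the bigframes implementation; the `Options` class is hypothetical and the option values are placeholders.

```python
import threading


class Options(threading.local):
    """Hypothetical options object; each thread sees its own attribute values."""

    def __init__(self):
        # Runs once per thread, on that thread's first access.
        self.repr_mode = "head"


options = Options()
seen = {}


def worker():
    # Changing the option here affects only the worker thread.
    options.repr_mode = "deferred"
    seen["worker"] = options.repr_mode


thread = threading.Thread(target=worker)
thread.start()
thread.join()
seen["main"] = options.repr_mode
print(seen)
```

The main thread keeps its own value even though the worker changed the option, which is the isolation the feature describes.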

Bug Fixes

Don't raise UnknownLocationWarning for US or EU multi-regions (#653) (8e4616b)
Downgrade NoDefaultIndexError to DefaultIndexWarning (#658) (2715d2b)
Fix bug with na in the column labels in stack (#659) (4a34293)
Use explicit session in PaLM2TextGenerator (#651) (e4f13c3)

Documentation

Add python code sample for multiple forecasting time series (#531) (16866d2)
Fix the Palm2TextGenerator output token size (#649) (c67e501)

python-bigquery-dataframes - v1.4.0

Published by release-please[bot] 6 months ago

1.4.0 (2024-04-29)

Features

Add .cache() method to persist intermediate dataframe (#626) (a5c94ec)
Add transpose support for small homogeneously typed DataFrames. (#621) (054075d)
Allow single input type in remote_function (#641) (3aa643f)
Expose gcf max timeout in remote_function (#639) (dfeaad0)
Series binary ops compatible with more types (#618) (518d315)
Support the score method for PaLM2TextGenerator (#634) (3ffc1d2)

Bug Fixes

Allow to_pandas to download more than 10GB (#637) (ce56495)
Extend row hash to 128 bits to guarantee unique row id (#632) (9005c6e)
LLM fine tuning tests (#627) (4724a1a)
LLM palm score tests (#643) (cf4ec3a)
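
The motivation for widening the row hash can be estimated with a birthday bound. The sketch below assumes an ideal, uniformly distributed hash (the release notes do not describe the hash function) and a hypothetical table of one billion rows.

```python
import math


def collision_probability(rows, bits):
    """Approximate chance that any two of `rows` random `bits`-bit hashes collide."""
    pairs = rows * (rows - 1) / 2
    # 1 - exp(-pairs / 2**bits), computed with expm1 to stay accurate for tiny values.
    return -math.expm1(-pairs / 2**bits)


rows = 10**9
p64 = collision_probability(rows, 64)    # about 0.027
p128 = collision_probability(rows, 128)  # on the order of 1e-21
print(p64, p128)
```

Under these assumptions a 64-bit hash has a small but real chance of a duplicate row id at this scale, while a 128-bit hash makes a collision negligible in practice.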

Performance Improvements

Automatically condense internal expression representation (#516) (03c1b0d)
Cache transpose to allow performant retranspose (#635) (44b738d)

Documentation

Add supported pandas apis on the main page (#628) (8d2a51c)
Add the first sample for the Single time-series forecasting from Google Analytics data tutorial (#623) (2b84c4f)
Address more technical writers' feedback (#640) (1e7793c)

python-bigquery-dataframes - v1.3.0

Published by release-please[bot] 6 months ago

1.3.0 (2024-04-22)

Features

Add Series.struct.dtypes property (#599) (d924ec2)
Add fine tuning fit() for Palm2TextGenerator (#616) (9c106bd)
Add quantile statistic (#613) (bc82804)
Expose max_batching_rows in remote_function (#622) (240a1ac)
Support primary key(s) in read_gbq by using as the index_col by default (#625) (75bb240)
Warn if location is set to unknown location (#609) (3706b4f)

Bug Fixes

Address technical writers' feedback (#611) (9f8f181)
Infer narrowest numeric type when combining numeric columns (#602) (8f9ece6)
Use exact median implementation by default (#619) (9d205ae)

Documentation

Fix rendering of examples for multiple apis (#620) (9665e39)
Set index_cols in read_gbq as a best practice (#624) (70015b7)

python-bigquery-dataframes - v1.2.0

Published by release-please[bot] 6 months ago

1.2.0 (2024-04-15)

Features

Add hasnans, combine_first, update to Series (#600) (86e0f38)
Add MultiIndex subclass. (#596) (5d0f149)
Add pivot_table for DataFrame. (#473) (5f1d670)
Add Series.autocorr (#605) (4ec8034)
Support list of numerics in pandas.cut (#580) (290f95d)

Bug Fixes

Address more technical writers' feedback (#581) (4b08d92)
Error for object dtype on read_pandas (#570) (8702dcf)
Inverting int now does bitwise inversion rather than sign flip (#574) (5f1db8b)
Loc setitem dtype issue. (#603) (b94bae9)
Toc menu missing plotting name (#591) (eed12c1)
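
The difference between the two behaviors named in the integer inversion fix, shown with plain Python integers (which use the same two's-complement rule for `~`):

```python
value = 5

bitwise_inverse = ~value  # flips every bit: equals -(value + 1)
sign_flip = -value        # the previous, incorrect result for inversion

print(bitwise_inverse, sign_flip)  # prints "-6 -5"
```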

Documentation

(Series|DataFrame).dtypes (#598) (edef48f)
Add code samples for str accessor methods (#594) (a557ea2)
Add docs for DataFrame and Series dunder methods (#562) (8fc26c4)
Add examples for at/iat (#582) (3be4a2e)

python-bigquery-dataframes - v1.1.0

Published by release-please[bot] 7 months ago

1.1.0 (2024-04-04)

Features

(Series|DataFrame).explode (#556) (9e32f57)
Add DataFrame.eval and DataFrame.query (#361) (5e28ebd)
Add ColumnTransformer save/load (#541) (9d8cf67)
Add ml.metrics.mean_squared_error (#559) (853c25e)
Add support for numpy expm1, log1p, floor, ceil, arctan2 ops (#505) (e8e66cf)
Add transformers save/load (#552) (d805241)
Allow DataFrame binary ops to align on either axis and with loc… (#544) (6d8f3af)
Expose DataFrame.bqclient to assist in integrations (#519) (0be8911)
read_pandas accepts pandas Series and Index objects (#573) (f8821fe)
Support ML.GENERATE_EMBEDDING in PaLM2TextEmbeddingGenerator (#539) (1156c1e)
Support max_columns in repr and make repr more efficient (#515) (54e49cf)

Bug Fixes

Assign NaN scalar to column error. (#513) (0a4153c)
Don't download 100gb onto local python machine in load test (#537) (082c58b)
Exclude list-like s parameter in plot.scatter (#568) (1caac27)
Fix case where df.peek would fail to execute even with force=True (#511) (8eca99a)
Fix error in Series.drop(0) (#575) (75dd786)
Include all names in MultiIndex repr (#564) (b188146)
Plot.scatter s parameter cannot accept float-like column (#563) (8d39187)
Product operation produces float result for all input types (#501) (6873b30)
Reloaded transformer .transform error (#569) (39fe474)
Rename PaLM2TextEmbeddingGenerator.predict output columns to be backward compatible (#561) (4995c00)
Respect hard stack size limit and swallow limit change exception. (#558) (4833908)
Restore string to date/time type coercion (#565) (4ae0262)
Sync the notebook with embedding changes (#550) (347f2dd)
Use bytes limit on frame inlining rather than element count (#576) (659a161)

Performance Improvements

Add multi-query execution capability for complex dataframes (#427) (d2d7e33)

Dependencies

Include pyarrow as a dependency (#529) (9b1525a)

Documentation

bigframes.options.bigquery.project and location are optional in some circumstances (#548) (90bcec5)
Add "Supported pandas APIs" reference to the documentation (#542) (74c3915)
Add General Availability banner to README (#507) (262ff59)
Add operations in API docs (#557) (ea95761)
Add progress_bar code sample (#508) (92a1af3)
Add the code samples for metrics{auc, roc_auc_score, roc_curve} (#520) (5f37b09)
Address more comments from technical writers to meet legal purposes (#571) (9084df3)
Fix docs of ARIMAPlus.predict (#512) (3b80f95)
Include Index in table-of-contents (#564) (b188146)
Mark Gemini model as Pre-GA (#543) (769868b)
Migrate the overview page to Bigframes official landing page (#536) (a0fb8bb)

python-bigquery-dataframes - v1.0.0

Published by release-please[bot] 7 months ago

1.0.0 (2024-03-25)

⚠ BREAKING CHANGES

rename model parameter min_rel_progress to tol
early_stop setting no longer supported, always uses True
rename model parameter n_parallell_trees to n_estimators
rename class_weights to class_weight
rename learn_rate to learning_rate
PCA n_components supports float value and None, default to None
rename various ml model parameters for consistency with sklearn (https://github.com/googleapis/python-bigquery-dataframes/pull/491)
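
For code written against earlier releases, the renames listed above can be applied mechanically. The helper below is a hypothetical migration aid, not part of bigframes; it uses only the old and new names given in this list, and `max_iterations` in the usage line is just a placeholder for a parameter that was not renamed.

```python
# Old parameter name -> new parameter name, as listed in the breaking changes.
RENAMED_PARAMS = {
    "min_rel_progress": "tol",
    "n_parallell_trees": "n_estimators",
    "class_weights": "class_weight",
    "learn_rate": "learning_rate",
}


def migrate_kwargs(kwargs):
    """Hypothetical helper: translate pre-1.0.0 model keyword arguments."""
    migrated = {}
    for name, value in kwargs.items():
        if name == "early_stop":
            # The setting is no longer supported and always behaves as True.
            if value is not True:
                raise ValueError("early_stop=False is no longer supported")
            continue
        migrated[RENAMED_PARAMS.get(name, name)] = value
    return migrated


new_kwargs = migrate_kwargs({"learn_rate": 0.1, "early_stop": True, "max_iterations": 20})
print(new_kwargs)  # {'learning_rate': 0.1, 'max_iterations': 20}
```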

Features

Add configuration option to read_gbq (#401) (85cede2)
Add ml ARIMAPlus model params (#488) (352cb85)
Add ml KMeans model params (#477) (23a8d9a)
Add ml LogisticRegression model params (#481) (f959b65)
Add ml PCA model params (#474) (fb5d83b)
Add params for LinearRegression model (#464) (21b2188)
Add support for Python 3.12 (#231) (df2976f)
Allow assigning directly to Series.name property (#495) (ad0e99e)
Ensure Series.str.len() can get length of array columns (#497) (10c0446)
Option to use bq connection without check (#460) (0b3f8e5)
PCA n_components supports float value and None, default to None (65c6f47)
Rename class_weights to class_weight (65c6f47)
Rename learn_rate to learning_rate (65c6f47)
Rename model parameter min_rel_progress to tol (65c6f47)
Rename model parameter n_parallell_trees to n_estimators (65c6f47)
Rename various ml model parameters for consistency with sklearn (https://github.com/googleapis/python-bigquery-dataframes/pull/491) (65c6f47)
Support BQ regional endpoints for europe-west9, europe-west3, us-east4, and us-west1 (#504) (fbada4a)
Support dataframe.cov (#498) (c4beafd)
Support Series.dt.floor (#493) (2dd01c2)
Support Series.dt.normalize (#483) (0bf1e91)
Update plot sample to 1000 rows (#458) (60d4a7b)

Bug Fixes

early_stop setting no longer supported, always uses True (65c6f47)
Fix -1 offset lookups failing (#463) (2dfb9c2)
Plot.scatter c argument functionalities (#494) (d6ee994)
Properly support format param for numerical input. (#486) (ae20c35)
Re-enable to_csv and to_json related tests (#468) (2b9a01d)
Sampling plot cannot preserve ordering if index is not ordered (#475) (a5345fe)
Use actual BigQuery types rather than ibis types in to_pandas (#500) (82b4f91)

Dependencies

Support pandas 2.2 (#492) (e2cf50e)

Documentation

Add code samples for metrics.{accuracy_score, confusion_matrix} (#478) (3e3329a)
Add code samples for metrics.{recall_score, precision_score, f1_score} (#502) (370fe90)
Improve API documentation (#489) (751266e)
Update bigquery connection documentation (#499) (4bfe094)
Update LLM + K-means notebook to handle partial failures (#496) (97afad9)

python-bigquery-dataframes - v0.26.0

Published by release-please[bot] 7 months ago

0.26.0 (2024-03-20)

⚠ BREAKING CHANGES

exclude remote models for .register() (#465)

Features

(Series|DataFrame).plot (#438) (1c3e668)
read_gbq_table supports LIKE as an operator in filters (#454) (d2d425a)
Add DataFrame.pipe() method (#421) (95f5a6e)
Set force=True by default in DataFrame.peek() (#469) (4e8e97d)
Support datetime related casting in (Series|DataFrame|Index).astype (#442) (fde339b)
Support Series.dt.strftime (#453) (8f6e955)

Bug Fixes

Any() on empty set now correctly returns False (#471) (f55680c)
Df.drop_na preserves columns dtype (#457) (3bab1a9)
Disable to_json and to_csv related tests (#462) (874026d)
Exclude remote models for .register() (#465) (73fe0f8)
Fix broken link in covid notebook (#450) (adadb06)
Fix broken multiindex loc cases (#467) (b519197)
Fix grouping series on multiple other series (#455) (3971bd2)
Groupby aggregates no longer check if grouping keys are numeric (#472) (4fbf938)
Raise ValueError when read_pandas() receives a bigframes DataFrame (#447) (b28f9fd)
Series.(to_csv|to_json) leverages bq export (#452) (718a00c)
Warn when read_gbq / read_gbq_table uses the snapshot time cache (#441) (e16a8c0)
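
The corrected `any()` result on empty input matches the convention of Python's built-in reductions, shown here as a point of comparison rather than as bigframes code:

```python
empty = []

result_any = any(empty)  # no element is truthy, so the result is False
result_all = all(empty)  # no element is falsy, so the result is True

print(result_any, result_all)  # prints "False True"
```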

Documentation

Add code samples for ml.metrics.r2_score (#459) (85fefa2)
Add the docs for loc and iloc indexers (#446) (14ab8d8)
Add the pages for at and iat indexers (#456) (340f0b5)
Add version information to bug template (#437) (91bd39e)
Indicate that project and location are optional in example notebooks (#451) (1df0140)

python-bigquery-dataframes - v0.25.0

Published by release-please[bot] 7 months ago

0.25.0 (2024-03-14)

Features

(Series|DataFrame).plot.(line|area|scatter) (#431) (0772510)
Support CMEK for remote_function cloud functions (#430) (2fd69f4)

python-bigquery-dataframes - v0.24.0

Published by release-please[bot] 7 months ago

0.24.0 (2024-03-12)

⚠ BREAKING CHANGES

read_parquet uses a "pandas" engine to parse files by default. Use engine="bigquery" for the previous behavior

Features

(Series|DataFrame).plot.hist() (#420) (4aadff4)
Add detect_anomalies to ml ARIMAPlus and KMeans models (#426) (6df28ed)
Add engine parameter to read_parquet (#413) (31325a1)
Add ml PCA.detect_anomalies method (#422) (8d82945)
Support BYOSA in remote_function (#407) (d92ced2)
Support CMEK for BQ tables (#403) (9a678e3)

Bug Fixes

Move third_party.bigframes_vendored to bigframes_vendored (#424) (763edeb)
Only do row identity based joins when joining by index (#356) (76b252f)
read_pandas inline respects location (#412) (ae0e3ea)

Documentation

Add predict sample to samples/snippets/bqml_getting_started_test.py (#388) (6a3b0cc)
Document minimum IAM requirement (#416) (36173b0)
Fix the note rendering for DataFrame methods: nlargest, nsmallest (#417) (38bd2ba)

python-bigquery-dataframes - v0.23.0

Published by release-please[bot] 8 months ago

0.23.0 (2024-03-05)

Features

Add ml.metrics.pairwise.euclidean_distance (#397) (1726588)
Add TextEmbedding model version support (#394) (e0f1ab0)

Bug Fixes

Code exception in remote_function now prevents retry and surfaces in the client (#387) (dd3643d)
Docs link for metrics.pairwise (#400) (a60aba7)

Dependencies

Update ibis to version 8.0.0 and refactor remote_function to use ibis UDF method (#277) (350499b)

Documentation

Update README to point to new summary pages (#402) (bfe2b23)

python-bigquery-dataframes - v0.22.0

Published by release-please[bot] 8 months ago

0.22.0 (2024-02-27)

⚠ BREAKING CHANGES

rename cosine_similarity to paired_cosine_distances (#393)
move model optional args to kwargs (#381)
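
The new name matches scikit-learn's `paired_cosine_distances`, which returns a distance (one minus the cosine similarity) for each pair of corresponding rows. The sketch below follows that scikit-learn definition in plain Python; whether the bigframes function matches it in every detail is an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def paired_cosine_distances(xs, ys):
    """Distance between xs[i] and ys[i] for each i: 1 - cosine similarity."""
    return [1.0 - cosine_similarity(x, y) for x, y in zip(xs, ys)]


distances = paired_cosine_distances(
    [[1.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [2.0, 2.0]],
)
print(distances)  # orthogonal pair gives 1.0; parallel pair gives approximately 0.0
```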

Features

Add DataFrame.corr() method (#379) (67fd434)
Add ml.metrics.pairwise.manhattan_distance (#392) (9d31865)
Enable regional endpoints for me-central2 (#386) (469674d)

Bug Fixes

Avoid ibis warning for "database" table() method argument (#390) (a0490a4)
Correct the numeric literal dtype (#365) (93b02cd)
Rename cosine_similarity to paired_cosine_distances (#393) (81ece46)

Performance Improvements

Inline read_pandas for small data (#383) (59b446b)

Dependencies

Add minimum version constraint for sqlglot to 19.9.0 (#389) (8b62d77)

Documentation

Add a code sample for creating a kmeans model (#267) (4291d65)
Fix bigframes.pandas.concat documentation (#382) (234b61c)

Miscellaneous Chores

Release 0.22.0 (#396) (8f73d9e)

Code Refactoring

Move model optional args to kwargs (#381) (4037992)

python-bigquery-dataframes - v0.21.0

Published by release-please[bot] 8 months ago

0.21.0 (2024-02-13)

Features

Add Series.cov method (#368) (443db22)
Add ml.llm.GeminiTextGenerator model (#370) (de1e0a4)
Add ml.metrics.pairwise.cosine_similarity function (#374) (126f566)
Add XGBoostModel (#363) (d5518b2)
Limited support of lambdas in Series.apply (#345) (208e081)
Support bigframes.pandas.to_datetime for scalars, iterables and series. (#372) (ffb0d15)
Support read_gbq wildcard table path (#377) (90caf86)