cudf

cuDF - GPU DataFrame Library

APACHE-2.0 License

Downloads
13.3K
Stars
7.2K
Committers
246

cudf - v21.08.01

Published by GPUtester about 3 years ago

v21.08.01

cudf - v21.08.00

Published by GPUtester about 3 years ago

🚨 Breaking Changes

  • Fix a crash in pack() when being handed tables with no columns. (#8697) @nvdbaranec
  • Remove unused cudf::strings::create_offsets (#8663) @davidwendt
  • Add delimiter parameter to cudf::strings::capitalize() (#8620) @davidwendt
  • Change default datetime index resolution to ns to match pandas (#8611) @vyasr
  • Add sequence_type parameter to cudf::strings::title function (#8602) @davidwendt
  • Add strings::repeat_strings API that can repeat each string a different number of times (#8561) @ttnghia
  • String-to-boolean conversion is different from Pandas (#8549) @skirui-source
  • Add accurate hash join size functions (#8453) @PointKernel
  • Expose a Decimal32Dtype in cuDF Python (#8438) @skirui-source
  • Update dask make_meta changes to be compatible with dask upstream (#8426) @galipremsagar
  • Adapt cudf::scalar classes to changes in rmm::device_scalar (#8411) @harrism
  • Remove special Index class from the general index class hierarchy (#8309) @vyasr
  • Add first-class dtype utilities (#8308) @vyasr
  • ORC - Support reading multiple orc files/buffers in a single operation (#8142) @jdye64
  • Upgrade arrow to 4.0.1 (#7495) @galipremsagar
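
As an illustration of the per-row form of strings::repeat_strings (#8561), here is a plain-Python sketch of repeating each string a different number of times. It is not libcudf code: the function name `repeat_each`, the use of `None` for missing entries, and the choice to map non-positive counts to an empty string are assumptions made for this example.

```python
from typing import List, Optional, Sequence


def repeat_each(
    strings: Sequence[Optional[str]],
    repeat_times: Sequence[Optional[int]],
) -> List[Optional[str]]:
    """Repeat strings[i] exactly repeat_times[i] times (hypothetical helper)."""
    if len(strings) != len(repeat_times):
        raise ValueError("strings and repeat_times must have the same length")

    result: List[Optional[str]] = []
    for text, count in zip(strings, repeat_times):
        if text is None or count is None:
            # Assumption: a missing string or a missing count yields a missing row.
            result.append(None)
        else:
            # Assumption: zero or negative counts produce an empty string.
            result.append(text * max(count, 0))
    return result


if __name__ == "__main__":
    print(repeat_each(["a", "bc", None, "d"], [2, 3, 1, 0]))
```

The single-count form is the special case where every entry of `repeat_times` is the same value.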

🐛 Bug Fixes

  • Fix contains check in string column (#8834) @galipremsagar
  • Remove unused variable from row_bit_count_test. (#8829) @mythrocks
  • Fixes issue with null struct columns in ORC reader (#8819) @rgsl888prabhu
  • Set CMake vars for python/parquet support in libarrow builds (#8808) @vyasr
  • Handle empty child columns in row_bit_count() (#8791) @mythrocks
  • Revert "Remove cudf unneeded build time requirement of the cuda driver" (#8784) @robertmaynard
  • Fix isort error in utils.pyx (#8771) @charlesbluca
  • Handle sliced struct/list columns properly in concatenate() bounds checking. (#8760) @nvdbaranec
  • Fix issues with _CPackedColumns.serialize() handling of host and device data (#8759) @charlesbluca
  • Fix issues with MultiIndex in dropna, stack & reset_index (#8753) @galipremsagar
  • Write pandas extension types to parquet file metadata (#8749) @devavret
  • Fix where to handle DataFrame & Series input combination (#8747) @galipremsagar
  • Fix replace to handle null values correctly (#8744) @galipremsagar
  • Handle sliced structs properly in pack/contiguous_split. (#8739) @nvdbaranec
  • Fix issue in slice() where columns with a positive offset were computing null counts incorrectly. (#8738) @nvdbaranec
  • Fix cudf.Series constructor to handle list of sequences (#8735) @galipremsagar
  • Fix min/max sorted groupby aggregation on string column with nulls (argmin, argmax sentinel value missing on nulls) (#8731) @karthikeyann
  • Fix orc reader assert on create data_type in debug (#8706) @davidwendt
  • Fix min/max inclusive cudf::scan for strings column (#8705) @davidwendt
  • JNI: Fix driver version assertion logic in testGetCudaRuntimeInfo (#8701) @sperlingxx
  • Adding fix for skip_rows and crash in orc reader (#8700) @rgsl888prabhu
  • Bug fix: replace_nulls_policy functor not returning correct indices for gathermap (#8699) @isVoid
  • Fix a crash in pack() when being handed tables with no columns. (#8697) @nvdbaranec
  • Add post-processing steps to dask_cudf.groupby.CudfSeriesGroupby.aggregate (#8694) @charlesbluca
  • JNI build no longer looks for Arrow in conda environment (#8686) @jlowe
  • Handle arbitrarily different data in null list column rows when checking for equivalency. (#8666) @nvdbaranec
  • Add ConfigureNVBench to avoid concurrent main() entry points (#8662) @PointKernel
  • Pin *arrow to use *cuda in run (#8651) @jakirkham
  • Add proper support for tolerances in testing methods. (#8649) @vyasr
  • Support multi-char case conversion in capitalize function (#8647) @davidwendt
  • Fix repeated mangled names in read_csv with duplicate column names (#8645) @karthikeyann
  • Temporarily disable libcudf example build tests (#8642) @isVoid
  • Use conda-sourced cudf artifacts for libcudf example in CI (#8638) @isVoid
  • Ensure dev environment uses Arrow GPU packages (#8637) @charlesbluca
  • Fix bug that columns only initialized once when specified columns and index in dataframe ctor (#8628) @isVoid
  • Propagate **kwargs through to as_*_column methods (#8618) @shwina
  • Fix orc_reader_benchmark.cpp compile error (#8609) @davidwendt
  • Fix missed renumbering of Aggregation values (#8600) @revans2
  • Update cmake to 3.20.5 in the Java Docker image (#8593) @NvTimLiu
  • Fix bug in replace_with_backrefs when group has greedy quantifier (#8575) @davidwendt
  • Apply metadata to keys before returning in Frame._encode (#8560) @charlesbluca
  • Fix for strings containing special JSON characters in get_json_object(). (#8556) @nvdbaranec
  • Fix debug compile error in gather_struct_tests.cpp (#8554) @davidwendt
  • String-to-boolean conversion is different from Pandas (#8549) @skirui-source
  • Fix __repr__ output with display.max_rows is None (#8547) @galipremsagar
  • Fix size passed to column constructors in _with_type_metadata (#8539) @shwina
  • Properly retrieve last column when -1 is specified for column index (#8529) @isVoid
  • Fix importing apply from dask (#8517) @galipremsagar
  • Fix offset of the string dictionary length stream (#8515) @vuule
  • Fix double counting of selected columns in CSV reader (#8508) @ochan1
  • Incorrect map size in scatter_to_gather corrupts struct columns (#8507) @gerashegalov
  • replace_nulls properly propagates memory resource to gather calls (#8500) @robertmaynard
  • Disallow groupby aggs for StructColumns (#8499) @charlesbluca
  • Fixes out-of-bounds access for small files in unzip (#8498) @elstehle
  • Adding support for writing empty dataframe (#8490) @shaneding
  • Fix exclusive scan when including nulls and improve testing (#8478) @harrism
  • Add workaround for crash in libcudf debug build using output_indexalator in thrust::lower_bound (#8432) @davidwendt
  • Install only the same Thrust files that Thrust itself installs (#8420) @robertmaynard
  • Add nightly version for ucx-py in ci script (#8419) @galipremsagar
  • Fix null_equality config of rolling_collect_set (#8415) @sperlingxx
  • CollectSetAggregation: implement RollingAggregation interface (#8406) @sperlingxx
  • Handle pre-sliced nested columns in contiguous_split. (#8391) @nvdbaranec
  • Fix bitmask_tests.cpp host accessing device memory (#8370) @davidwendt
  • Fix concurrent_unordered_map to prevent accessing padding bits in pair_type (#8348) @davidwendt
  • BUG FIX: Raise appropriate strings error when concatenating strings column (#8290) @skirui-source
  • Make gpuCI and pre-commit style configurations consistent (#8215) @charlesbluca
  • Add collect list to dask-cudf groupby aggregations (#8045) @charlesbluca

📖 Documentation

  • Update Python UDFs notebook (#8810) @brandon-b-miller
  • Fix dask.dataframe API docs links after reorg (#8772) @jsignell
  • Fix instructions for running cuDF/dask-cuDF tests in CONTRIBUTING.md (#8724) @shwina
  • Translate Markdown documentation to rST and remove recommonmark (#8698) @vyasr
  • Fixed spelling mistakes in libcudf documentation (#8664) @karthikeyann
  • Custom Sphinx Extension: PandasCompat (#8643) @isVoid
  • Fix README.md (#8535) @ajschmidt8
  • Change namespace contains_nulls to struct (#8523) @davidwendt
  • Add info about NVTX ranges to dev guide (#8461) @jrhemstad
  • Fixed documentation bug in groupby agg method (#8325) @ahmet-uyar

🚀 New Features

  • Fix concatenating structs (#8811) @shaneding
  • Implement JNI for groupby aggregations M2 and MERGE_M2 (#8763) @ttnghia
  • Bump isort to 5.6.4 and remove isort overrides made for 5.0.7 (#8755) @charlesbluca
  • Implement __setitem__ for StructColumn (#8737) @shaneding
  • Add is_leap_year to DateTimeProperties and DatetimeIndex (#8736) @isVoid
  • Add struct.explode() method (#8729) @shwina
  • Add DataFrame.to_struct() method to convert a DataFrame to a struct Series (#8728) @shwina
  • Add support for list type in ORC writer (#8723) @vuule
  • Fix slicing from struct columns and accessing struct columns (#8719) @shaneding
  • Add datetime::is_leap_year (#8711) @isVoid
  • Accessing struct columns from dask_cudf (#8675) @shaneding
  • Added pct_change to Series (#8650) @TravisHester
  • Add strings support to cudf::shift function (#8648) @davidwendt
  • Support Scatter struct_scalar (#8630) @isVoid
  • Struct scalar from host dictionary (#8629) @shaneding
  • Add dayofyear and day_of_year to Series, DatetimeColumn, and DatetimeIndex (#8626) @beckernick
  • JNI support for capitalize (#8624) @firestarman
  • Add delimiter parameter to cudf::strings::capitalize() (#8620) @davidwendt
  • Add NVBench in CMake (#8619) @PointKernel
  • Change default datetime index resolution to ns to match pandas (#8611) @vyasr
  • ListColumn __setitem__ (#8606) @brandon-b-miller
  • Implement groupby aggregations M2 and MERGE_M2 (#8605) @ttnghia
  • Add sequence_type parameter to cudf::strings::title function (#8602) @davidwendt
  • Adding support for list and struct type in ORC Reader (#8599) @rgsl888prabhu
  • Benchmark for strings::repeat_strings APIs (#8589) @ttnghia
  • Nested scalar support for copy if else (#8588) @gerashegalov
  • User specified decimal columns to float64 (#8587) @jdye64
  • Add get_element for struct column (#8578) @isVoid
  • Python changes for adding __getitem__ for struct (#8577) @shaneding
  • Add strings::repeat_strings API that can repeat each string a different number of times (#8561) @ttnghia
  • Refactor tests/iterator_utilities.hpp functions (#8540) @ttnghia
  • Support MERGE_LISTS and MERGE_SETS in Java package (#8516) @sperlingxx
  • Decimal support csv reader (#8511) @elstehle
  • Add column type tests (#8505) @isVoid
  • Warn when downscaling decimal columns (#8492) @ChrisJar
  • Add JNI for strings::repeat_strings (#8491) @ttnghia
  • Add Index.get_loc for Numerical, String Index support (#8489) @isVoid
  • Expose half_up rounding in cuDF (#8477) @shwina
  • Java APIs to fetch CUDA runtime info (#8465) @sperlingxx
  • Add str.edit_distance_matrix (#8463) @isVoid
  • Support constructing cudf.Scalar objects from host side lists (#8459) @brandon-b-miller
  • Add accurate hash join size functions (#8453) @PointKernel
  • Add cudf::strings::integer_to_hex convert API (#8450) @davidwendt
  • Create objects from iterables that contain cudf.NA (#8442) @brandon-b-miller
  • JNI bindings for sort_lists (#8439) @sperlingxx
  • Expose a Decimal32Dtype in cuDF Python (#8438) @skirui-source
  • Replace all_null() and all_valid() by iterator_all_nulls() and iterator_no_null() in tests (#8437) @ttnghia
  • Implement groupby MERGE_LISTS and MERGE_SETS aggregates (#8436) @ttnghia
  • Add public libcudf match_dictionaries API (#8429) @davidwendt
  • Add move constructors for string_scalar and struct_scalar (#8428) @ttnghia
  • Implement strings::repeat_strings (#8423) @ttnghia
  • STRUCT column support for cudf::merge. (#8422) @nvdbaranec
  • Implement reverse in libcudf (#8410) @shaneding
  • Support multiple input files/buffers for read_json (#8403) @jdye64
  • Improve test coverage for struct search (#8396) @ttnghia
  • Add groupby.fillna (#8362) @isVoid
  • Enable AST-based joining (#8214) @vyasr
  • Generalized null support in user defined functions (#8213) @brandon-b-miller
  • Add compiled binary operation (#8192) @karthikeyann
  • Implement .describe() for DataFrameGroupBy (#8179) @skirui-source
  • ORC - Support reading multiple orc files/buffers in a single operation (#8142) @jdye64
  • Add Python bindings for lists::concatenate_list_elements and expose them as .list.concat() (#8006) @shwina
  • Use Arrow URI FileSystem backed instance to retrieve remote files (#7709) @jdye64
  • Example to build custom application and link to libcudf (#7671) @isVoid
  • Upgrade arrow to 4.0.1 (#7495) @galipremsagar
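
The M2 and MERGE_M2 groupby aggregations (#8605) concern the sum of squared deviations from the mean and the merging of partial results. The sketch below shows one common way such partial results can be combined, the pairwise update formula usually attributed to Chan et al. It is a plain-Python illustration, not the libcudf implementation: the names `m2` and `merge_m2` and the `(count, mean, M2)` tuple layout are assumptions for this example.

```python
from typing import Sequence, Tuple

# Hypothetical layout of a partial result: (count, mean, M2).
Partial = Tuple[int, float, float]


def m2(values: Sequence[float]) -> Partial:
    """Compute count, mean, and the sum of squared deviations from the mean."""
    n = len(values)
    if n == 0:
        return (0, 0.0, 0.0)
    mean = sum(values) / n
    return (n, mean, sum((v - mean) ** 2 for v in values))


def merge_m2(a: Partial, b: Partial) -> Partial:
    """Combine two partial results without revisiting the original values."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return (0, 0.0, 0.0)
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    # The correction term accounts for the gap between the two partial means.
    merged = m2_a + m2_b + delta * delta * n_a * n_b / n
    return (n, mean, merged)


if __name__ == "__main__":
    left = m2([1.0, 2.0, 3.0])
    right = m2([4.0, 5.0, 6.0, 7.0])
    count, mean, total_m2 = merge_m2(left, right)
    # The sample variance follows from the merged M2.
    print(count, mean, total_m2 / (count - 1))
```

Merging in this way is what makes a variance-style aggregation practical when each partition of the data is aggregated separately first.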

🛠️ Improvements

  • Provide a better error message when CUDA::cuda_driver not found (#8794) @robertmaynard
  • Remove anonymous namespace from null_mask.cuh (#8786) @nvdbaranec
  • Allow cudf to be built without libcuda.so existing (#8751) @robertmaynard
  • Pin mimesis to <4.1 (#8745) @galipremsagar
  • Update conda environment name for CI (#8692) @ajschmidt8
  • Remove flatbuffers dependency (#8671) @Ethyling
  • Add options to build Arrow with Python and Parquet support (#8670) @trxcllnt
  • Remove unused cudf::strings::create_offsets (#8663) @davidwendt
  • Update GDS lib version to 1.0.0 (#8654) @pxLi
  • Support for groupby/scan rank and dense_rank aggregations (#8652) @rwlee
  • Fix usage of deprecated arrow ipc API (#8632) @revans2
  • Use absolute imports in cudf (#8631) @galipremsagar
  • ENH Add Java CI build script (#8627) @dillon-cullinan
  • Add DeprecationWarning to ser.str.subword_tokenize (#8603) @VibhuJawa
  • Rewrite binary operations for improved performance and additional type support (#8598) @vyasr
  • Fix mypy errors surfacing because of numpy-1.21.0 (#8595) @galipremsagar
  • Remove unneeded includes from cudf::string_view headers (#8594) @davidwendt
  • Use cmake 3.20.1 as it is now required by rmm (#8586) @robertmaynard
  • Remove device debug symbols from cmake CUDF_CUDA_FLAGS (#8584) @davidwendt
  • Dask-CuDF: use default Dask Dataframe optimizer (#8581) @madsbk
  • Remove checking if an unsigned value is less than zero (#8579) @robertmaynard
  • Remove strings_count parameter from cudf::strings::detail::create_chars_child_column (#8576) @davidwendt
  • Make cudf.api.types imports consistent (#8571) @galipremsagar
  • Modernize libcudf basic example CMakeFile; updates CI build tests (#8568) @isVoid
  • Rename concatenate_tests.cu to .cpp (#8555) @davidwendt
  • enable window lead/lag test on struct (#8548) @wbo4958
  • Add Java methods to split and write column views (#8546) @razajafri
  • Small cleanup (#8534) @codereport
  • Unpin dask version in CI (#8533) @galipremsagar
  • Added optional flag for building Arrow with S3 filesystem support (#8531) @jdye64
  • Minor clean up of various internal column and frame utilities (#8528) @vyasr
  • Rename some copying_test source files .cu to .cpp (#8527) @davidwendt
  • Correct the last warnings and issues when using newer cuda versions (#8525) @robertmaynard
  • Correct unused parameter warnings in transform and unary ops (#8521) @robertmaynard
  • Correct unused parameter warnings in string algorithms (#8509) @robertmaynard
  • Add in JNI APIs for scan, replace_nulls, group_by.scan, and group_by.replace_nulls (#8503) @revans2
  • Fix 21.08 forward-merge conflicts (#8502) @ajschmidt8
  • Fix Cython formatting command in Contributing.md. (#8496) @marlenezw
  • Bug/correct unused parameters in reshape and text (#8495) @robertmaynard
  • Correct unused parameter warnings in partitioning and stream compact (#8494) @robertmaynard
  • Correct unused parameter warnings in labelling and list algorithms (#8493) @robertmaynard
  • Refactor index construction (#8485) @vyasr
  • Correct unused parameter warnings in replace algorithms (#8483) @robertmaynard
  • Correct unused parameter warnings in reduction algorithms (#8481) @robertmaynard
  • Correct unused parameter warnings in io algorithms (#8480) @robertmaynard
  • Correct unused parameter warnings in interop algorithms (#8479) @robertmaynard
  • Correct unused parameter warnings in filling algorithms (#8468) @robertmaynard
  • Correct unused parameter warnings in groupby (#8467) @robertmaynard
  • use libcu++ time_point as timestamp (#8466) @karthikeyann
  • Modify reprog_device::extract to return groups in a single pass (#8460) @davidwendt
  • Update minimum Dask requirement to 2021.6.0 (#8458) @pentschev
  • Fix failures when performing binary operations on DataFrames with empty columns (#8452) @ChrisJar
  • Fix conflicts in 8447 (#8448) @ajschmidt8
  • Add serialization methods for List and StructDtype (#8441) @charlesbluca
  • Replace make_empty_strings_column with make_empty_column (#8435) @davidwendt
  • JNI bindings for get_element (#8433) @revans2
  • Update dask make_meta changes to be compatible with dask upstream (#8426) @galipremsagar
  • Unpin dask version on CI (#8425) @galipremsagar
  • Add benchmark for strings/fixed_point convert APIs (#8417) @davidwendt
  • Adapt cudf::scalar classes to changes in rmm::device_scalar (#8411) @harrism
  • Add benchmark for strings/integers convert APIs (#8402) @davidwendt
  • Enable multi-file partitioning in dask_cudf.read_parquet (#8393) @rjzamora
  • Correct unused parameter warnings in rolling algorithms (#8390) @robertmaynard
  • Correct unused parameters in column round and search (#8389) @robertmaynard
  • Add functionality to apply Dtype metadata to ColumnBase (#8373) @charlesbluca
  • Refactor setting stack size in regex code (#8358) @davidwendt
  • Update Java bindings to 21.08-SNAPSHOT (#8344) @pxLi
  • Replace remaining uses of device_vector (#8343) @harrism
  • Statically link libnvcomp into libcudfjni (#8334) @jlowe
  • Resolve auto merge conflicts for Branch 21.08 from branch 21.06 (#8329) @galipremsagar
  • Minor code refactor for sorted_order (#8326) @wbo4958
  • Remove special Index class from the general index class hierarchy (#8309) @vyasr
  • Add first-class dtype utilities (#8308) @vyasr
  • Add option to link Java bindings with Arrow dynamically (#8307) @jlowe
  • Refactor ColumnMethods and its subclasses to remove column argument and require parent argument (#8306) @shwina
  • Refactor scatter for list columns (#8255) @isVoid
  • Expose pack/unpack API to Python (#8153) @charlesbluca
  • Adding cudf.cut method (#8002) @marlenezw
  • Optimize string gather performance for large strings (#7980) @gaohao95
  • Add peak memory usage tracking to cuIO benchmarks (#7770) @devavret
  • Updating Clang Version to 11.0.0 (#6695) @codereport

cudf - v21.06.01

Published by GPUtester over 3 years ago

cudf - v21.06.00

Published by GPUtester over 3 years ago

🚨 Breaking Changes

  • Add support for make_meta_obj dispatch in dask-cudf (#8342) @galipremsagar
  • Add separator-on-null parameter to strings concatenate APIs (#8282) @davidwendt
  • Introduce a common parent class for NumericalColumn and DecimalColumn (#8278) @vyasr
  • Update ORC statistics API to use C++17 standard library (#8241) @vuule
  • Preserve column hierarchy when getting NULL row from LIST column (#8206) @isVoid
  • Groupby.shift c++ API refactor and python binding (#8131) @isVoid

🐛 Bug Fixes

  • Fix struct flattening to add a validity column only when the input column has null element (#8374) @ttnghia
  • Compilation fix: Remove redefinition for std::is_same_v() (#8369) @mythrocks
  • Add backward compatibility for dask-cudf to work with other versions of dask (#8368) @galipremsagar
  • Handle empty results with nested types in copy_if_else (#8359) @nvdbaranec
  • Handle nested column types properly for empty parquet files. (#8350) @nvdbaranec
  • Raise error when unsupported arguments are passed to dask_cudf.DataFrame.sort_values (#8349) @galipremsagar
  • Raise NotImplementedError for axis=1 in rank (#8347) @galipremsagar
  • Add support for make_meta_obj dispatch in dask-cudf (#8342) @galipremsagar
  • Update Java string concatenate test for single column (#8330) @tgravescs
  • Use empty_like in scatter (#8314) @revans2
  • Fix concatenate_lists_ignore_null on rows of all_nulls (#8312) @sperlingxx
  • Add separator-on-null parameter to strings concatenate APIs (#8282) @davidwendt
  • COLLECT_LIST support returning empty output columns. (#8279) @mythrocks
  • Update io util to convert path like object to string (#8275) @ayushdg
  • Fix result column types for empty inputs to rolling window (#8274) @mythrocks
  • Actually test equality in assert_groupby_results_equal (#8272) @shwina
  • CMake always explicitly specify a source files extension (#8270) @robertmaynard
  • Fix struct binary search and struct flattening (#8268) @ttnghia
  • Revert "patch thrust to fix intmax num elements limitation in scan_by_key" (#8263) @cwharris
  • upgrade dlpack to 0.5 (#8262) @cwharris
  • Fixes CSV-reader type inference for thousands separator and decimal point (#8261) @elstehle
  • Fix incorrect assertion in Java concat (#8258) @sperlingxx
  • Copy nested types upon construction (#8244) @isVoid
  • Preserve column hierarchy when getting NULL row from LIST column (#8206) @isVoid
  • Clip decimal binary op precision at max precision (#8194) @ChrisJar

📖 Documentation

  • Add docstring for dask_cudf.read_csv (#8355) @galipremsagar
  • Fix cudf release version in readme (#8331) @galipremsagar
  • Fix structs column description in dev docs (#8318) @isVoid
  • Update readme with correct CUDA versions (#8315) @raydouglass
  • Add description of the cuIO GDS integration (#8293) @vuule
  • Remove unused parameter from copy_partition kernel documentation (#8283) @robertmaynard

🚀 New Features

  • Add support merging b/w categorical data (#8332) @galipremsagar
  • Java: Support struct scalar (#8327) @sperlingxx
  • added _is_homogeneous property (#8299) @shaneding
  • Added decimal writing for CSV writer (#8296) @kaatish
  • Java: Support creating a scalar from utf8 string (#8294) @firestarman
  • Add Java API for Concatenate strings with separator (#8289) @tgravescs
  • strings::join_list_elements options for empty list inputs (#8285) @ttnghia
  • Return python lists for getitem calls to list type series (#8265) @brandon-b-miller
  • add unit tests for lead/lag on list for row window (#8259) @wbo4958
  • Create a String column from UTF8 String byte arrays (#8257) @firestarman
  • Support scattering list_scalar (#8256) @isVoid
  • Implement lists::concatenate_list_elements (#8231) @ttnghia
  • Support for struct scalars. (#8220) @nvdbaranec
  • Add support for decimal types in ORC writer (#8198) @vuule
  • Support create lists column from a list_scalar (#8185) @isVoid
  • Groupby.shift c++ API refactor and python binding (#8131) @isVoid
  • Add groupby::replace_nulls(replace_policy) api (#7118) @isVoid
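
To illustrate the idea behind groupby::replace_nulls(replace_policy) (#7118), the plain-Python sketch below fills each missing value with the nearest preceding or following non-missing value from the same group. It is only an approximation of the concept: the function name `group_replace_nulls`, the string policy names, and the choice to keep rows in their original order are assumptions of this example, not the libcudf API.

```python
from typing import Dict, Hashable, List, Optional, Sequence


def group_replace_nulls(
    keys: Sequence[Hashable],
    values: Sequence[Optional[float]],
    policy: str = "preceding",
) -> List[Optional[float]]:
    """Fill None entries from the nearest valid value within the same key group."""
    if policy not in ("preceding", "following"):
        raise ValueError("policy must be 'preceding' or 'following'")
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")

    n = len(values)
    result = list(values)
    # Walk forward for "preceding" and backward for "following".
    order = range(n) if policy == "preceding" else range(n - 1, -1, -1)
    last_valid: Dict[Hashable, float] = {}
    for i in order:
        key = keys[i]
        if result[i] is None:
            # Stays None when the group has no valid value in that direction.
            result[i] = last_valid.get(key)
        else:
            last_valid[key] = result[i]
    return result


if __name__ == "__main__":
    keys = ["a", "a", "b", "a", "b", "b"]
    values = [1, None, None, None, 5, None]
    print(group_replace_nulls(keys, values, "preceding"))
    print(group_replace_nulls(keys, values, "following"))
```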

🛠️ Improvements

  • Support Dask + Distributed 2021.05.1 (#8392) @jakirkham
  • Add aliases for string methods (#8353) @shwina
  • Update environment variable used to determine cuda_version (#8321) @ajschmidt8
  • JNI: Refactor the code of making column from scalar (#8310) @firestarman
  • Update CHANGELOG.md links for calver (#8303) @ajschmidt8
  • Merge branch-0.19 into branch-21.06 (#8302) @ajschmidt8
  • use address and length for GDS reads/writes (#8301) @rongou
  • Update cudfjni version to 21.06.0 (#8292) @pxLi
  • Update docs build script (#8284) @ajschmidt8
  • Make device_buffer streams explicit and enforce move construction (#8280) @harrism
  • Introduce a common parent class for NumericalColumn and DecimalColumn (#8278) @vyasr
  • Do not add nulls to the hash table when null_equality::NOT_EQUAL is passed to left_semi_join and left_anti_join (#8277) @nvdbaranec
  • Enable implicit casting when concatenating mixed types (#8276) @ChrisJar
  • Fix CMake FindPackage rmm, pin dev envs' dlpack to v0.3 (#8271) @trxcllnt
  • Update cudfjni version to 21.06 (#8267) @pxLi
  • support RMM aligned resource adapter in JNI (#8266) @rongou
  • Pass compiler environment variables to conda python build (#8260) @Ethyling
  • Remove abc inheritance from Serializable (#8254) @vyasr
  • Move more methods into SingleColumnFrame (#8253) @vyasr
  • Update ORC statistics API to use C++17 standard library (#8241) @vuule
  • Correct unused parameter warnings in dictionary algorithms (#8239) @robertmaynard
  • Correct unused parameters in the copying algorithms (#8232) @robertmaynard
  • IO statistics cleanup (#8191) @kaatish
  • Refactor of rolling_window implementation. (#8158) @nvdbaranec
  • Add a flag for allowing single quotes in JSON strings. (#8144) @nvdbaranec
  • Column refactoring 2 (#8130) @vyasr
  • support space in workspace (#7956) @jolorunyomi
  • Support collect_set on rolling window (#7881) @sperlingxx

cudf - v0.19.2

Published by GPUtester over 3 years ago

🚨 Breaking Changes

  • Allow hash_partition to take a seed value (#7771) @magnatelee
  • Allow merging index column with data column using keyword "on" (#7736) @skirui-source
  • Change JNI API to avoid loading native dependencies when creating sort order classes. (#7729) @revans2
  • Replace device_vector with device_uvector in null_mask (#7715) @harrism
  • Don't identify decimals as strings. (#7710) @vyasr
  • Fix Java Parquet write after writer API changes (#7655) @revans2
  • Convert cudf::concatenate APIs to use spans and device_uvector (#7621) @harrism
  • Update missing docstring examples in python public APIs (#7546) @galipremsagar
  • Remove unneeded step parameter from strings::detail::copy_slice (#7525) @davidwendt
  • Rename ARROW_STATIC_LIB because it conflicts with one in FindArrow.cmake (#7518) @trxcllnt
  • Match Pandas logic for comparing two objects with nulls (#7490) @brandon-b-miller
  • Add struct support to parquet writer (#7461) @devavret
  • Join APIs that return gathermaps (#7454) @shwina
  • fixed_point + cudf::binary_operation API Changes (#7435) @codereport
  • Fix BUG: Exception when PYTHONOPTIMIZE=2 (#7434) @skirui-source
  • Change nvtext::load_vocabulary_file to return a unique ptr (#7424) @davidwendt
  • Refactor strings column factories (#7397) @harrism
  • Use CMAKE_CUDA_ARCHITECTURES (#7391) @robertmaynard
  • Upgrade pandas to 1.2 (#7375) @galipremsagar
  • Rename logical_cast to bit_cast and allow additional conversions (#7373) @ttnghia
  • Rework libcudf CMakeLists.txt to export targets for CPM (#7107) @trxcllnt
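
As background for the hash_partition seed change (#7771), the sketch below shows in plain Python what a seeded hash partition does: each row goes to the partition chosen by a seeded hash of its key, so equal keys always land together and a different seed can produce a different layout. This is an assumption-laden illustration, not libcudf code: the function name and signature are hypothetical, and CRC32 from the standard library stands in for whatever hash function the library uses.

```python
import zlib
from typing import Hashable, List, Sequence


def hash_partition(
    keys: Sequence[Hashable],
    num_partitions: int,
    seed: int = 0,
) -> List[List[int]]:
    """Return, per partition, the row indices whose key hashes into it."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")

    partitions: List[List[int]] = [[] for _ in range(num_partitions)]
    for row, key in enumerate(keys):
        # CRC32 is a stand-in hash; the seed is used as its starting value.
        digest = zlib.crc32(str(key).encode("utf-8"), seed & 0xFFFFFFFF)
        partitions[digest % num_partitions].append(row)
    return partitions


if __name__ == "__main__":
    keys = ["x", "y", "x", "z", "y"]
    print(hash_partition(keys, 3, seed=0))
    print(hash_partition(keys, 3, seed=42))
```

A caller-supplied seed is useful when data is partitioned more than once, because reusing the same hash at each level would send every row of a partition to the same place again.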

🐛 Bug Fixes

  • unsnap: busy wait a number of cycles (#8073) @vuule
  • Fix returned column type when extracting from an empty list column (#8031) @jlowe
  • Don't reindex a new value on setitem if the original dataframe was empty (#8026) @vyasr
  • Fix a NameError in meta dispatch API (#7996) @galipremsagar
  • Reindex in DataFrame.__setitem__ (#7957) @galipremsagar
  • jitify direct-to-cubin compilation and caching. (#7919) @cwharris
  • Use dynamic cudart for nvcomp in java build (#7896) @abellina
  • fix "incompatible redefinition" warnings (#7894) @cwharris
  • cudf consistently specifies the cuda runtime (#7887) @robertmaynard
  • disable verbose output for jitify_preprocess (#7886) @cwharris
  • CMake jit_preprocess_files function only runs when needed (#7872) @robertmaynard
  • Push DeviceScalar construction into cython for list.contains (#7864) @brandon-b-miller
  • cudf now sets an install rpath of $ORIGIN (#7863) @robertmaynard
  • Don't install Thrust examples, tests, docs, and python files (#7811) @robertmaynard
  • Sort by index in groupby tests more consistently (#7802) @shwina
  • Revert "Update conda recipes pinning of repo dependencies (#7743)" (#7793) @raydouglass
  • Add decimal column handling in copy_type_metadata (#7788) @shwina
  • Add column names validation in parquet writer (#7786) @galipremsagar
  • Fix Java explode outer unit tests (#7782) @jlowe
  • Fix compiler warning about non-POD types passed through ellipsis (#7781) @jrhemstad
  • User resource fix for replace_nulls (#7769) @magnatelee
  • Fix type dispatch for columnar replace_nulls (#7768) @jlowe
  • Add ignore_order parameter to dask-cudf concat dispatch (#7765) @galipremsagar
  • Fix slicing and arrow representations of decimal columns (#7755) @vyasr
  • Fixing issue with explode_outer position not nulling position entries of null rows (#7754) @hyperbolic2346
  • Implement scatter for struct columns (#7752) @ttnghia
  • Fix data corruption in string columns (#7746) @galipremsagar
  • Fix string length in stripe dictionary building (#7744) @kaatish
  • Update conda recipes pinning of repo dependencies (#7743) @mike-wendt
  • Enable dask dispatch to cuDF's is_categorical_dtype for cuDF objects (#7740) @brandon-b-miller
  • Fix dictionary size computation in ORC writer (#7737) @vuule
  • Fix cudf::cast overflow for decimal64 to int32_t or smaller in certain cases (#7733) @codereport
  • Change JNI API to avoid loading native dependencies when creating sort order classes. (#7729) @revans2
  • Disable column_view data accessors for unsupported types (#7725) @jrhemstad
  • Materialize RangeIndex when index=True in parquet writer (#7711) @galipremsagar
  • Don't identify decimals as strings. (#7710) @vyasr
  • Fix return type of DataFrame.argsort (#7706) @galipremsagar
  • Fix/correct cudf installed package requirements (#7688) @robertmaynard
  • Fix SparkMurmurHash3_32 hash inconsistencies with Apache Spark (#7672) @jlowe
  • Fix ORC reader issue with reading empty string columns (#7656) @rgsl888prabhu
  • Fix Java Parquet write after writer API changes (#7655) @revans2
  • Fixing empty null lists throwing explode_outer for a loop. (#7649) @hyperbolic2346
  • Fix internal compiler error during JNI Docker build (#7645) @jlowe
  • Fix Debug build break with device_uvectors in grouped_rolling.cu (#7633) @mythrocks
  • Parquet reader: Fix issue when using skip_rows on non-nested columns containing nulls (#7627) @nvdbaranec
  • Fix ORC reader for empty DataFrame/Table (#7624) @rgsl888prabhu
  • Fix specifying GPU architecture in JNI build (#7612) @jlowe
  • Fix ORC writer OOM issue (#7605) @vuule
  • Fix 0.18 --> 0.19 automerge (#7589) @kkraus14
  • Fix ORC issue with incorrect timestamp nanosecond values (#7581) @vuule
  • Fix missing Dask imports (#7580) @kkraus14
  • CMAKE_CUDA_ARCHITECTURES doesn't change when build-system invokes cmake (#7579) @robertmaynard
  • Another fix for offsets_end() iterator in lists_column_view (#7575) @ttnghia
  • Fix ORC writer output corruption with string columns (#7565) @vuule
  • Fix cudf::lists::sort_lists failing for sliced column (#7564) @ttnghia
  • FIX Fix Anaconda upload args (#7558) @dillon-cullinan
  • Fix index mismatch issue in equality related APIs (#7555) @galipremsagar
  • FIX Revert gpuci_conda_retry on conda file output locations (#7552) @dillon-cullinan
  • Fix offset_end iterator for lists_column_view, which was not correctl… (#7551) @ttnghia
  • Fix no such file dlpack.h error when build libcudf (#7549) @chenrui17
  • Update missing docstring examples in python public APIs (#7546) @galipremsagar
  • Decimal32 Build Fix (#7544) @razajafri
  • FIX Retry conda output location (#7540) @dillon-cullinan
  • fix missing renames of dask git branches from master to main (#7535) @kkraus14
  • Remove detail from device_span (#7533) @rwlee
  • Change dask and distributed branch to main (#7532) @dantegd
  • Update JNI build to use CUDF_USE_ARROW_STATIC (#7526) @jlowe
  • Make sure rmm::rmm CMake target is visible to cudf users (#7524) @robertmaynard
  • Fix contiguous_split not properly handling output partitions > 2 GB. (#7515) @nvdbaranec
  • Change jit launch to safe_launch (#7510) @devavret
  • Fix comparison between Datetime/Timedelta columns and NULL scalars (#7504) @brandon-b-miller
  • Fix off-by-one error in char-parallel string scalar replace (#7502) @jlowe
  • Fix JNI deprecation of all, put it on the wrong version before (#7501) @revans2
  • Fix Series/Dataframe Mixed Arithmetic (#7491) @brandon-b-miller
  • Fix JNI build after removal of libcudf sub-libraries (#7486) @jlowe
  • Correctly compile benchmarks (#7485) @robertmaynard
  • Fix bool column corruption with ORC Reader (#7483) @rgsl888prabhu
  • Fix __repr__ for categorical dtype (#7476) @galipremsagar
  • Java cleaner synchronization (#7474) @abellina
  • Fix java float/double parsing tests (#7473) @revans2
  • Pass stream and user resource to make_default_constructed_scalar (#7469) @magnatelee
  • Improve stability of dask_cudf.DataFrame.var and dask_cudf.DataFrame.std (#7453) @rjzamora
  • Missing device_storage_dispatch change affecting cudf::gather (#7449) @codereport
  • fix cuFile JNI compile errors (#7445) @rongou
  • Support Series.__setitem__ with key to a new row (#7443) @isVoid
  • Fix BUG: Exception when PYTHONOPTIMIZE=2 (#7434) @skirui-source
  • Make inclusive scan safe for cases with leading nulls (#7432) @magnatelee
  • Fix typo in list_device_view::pair_rep_end() (#7423) @mythrocks
  • Fix string to double conversion and row equivalent comparison (#7410) @ttnghia
  • Fix thrust failure when transferring data from device_vector to host_vector with vectors of size 1 (#7382) @ttnghia
  • Fix std::exception catch-by-reference gcc9 compile error (#7380) @davidwendt
  • Fix skiprows issue with ORC Reader (#7359) @rgsl888prabhu
  • fix Arrow CMake file (#7358) @rongou
  • Fix lists::contains() for NaN and Decimals (#7349) @mythrocks
  • Handle cupy array in Dataframe.__setitem__ (#7340) @galipremsagar
  • Fix invalid-device-fn error in cudf::strings::replace_re with multiple regex's (#7336) @davidwendt
  • FIX Add codecov upload block to gpu script (#6860) @dillon-cullinan

📖 Documentation

  • Fix join API doxygen (#7890) @shwina
  • Add Resources to README. (#7697) @bdice
  • Add isin examples in Docstring (#7479) @galipremsagar
  • Resolving unlinked type shorthands in cudf doc (#7416) @isVoid
  • Fix typo in regex.md doc page (#7363) @davidwendt
  • Fix incorrect strings_column_view::chars_size documentation (#7360) @jlowe

🚀 New Features

  • Enable basic reductions for decimal columns (#7776) @ChrisJar
  • Enable join on decimal columns (#7764) @ChrisJar
  • Allow merging index column with data column using keyword "on" (#7736) @skirui-source
  • Implement DecimalColumn + Scalar and add cudf.Scalars of Decimal64Dtype (#7732) @brandon-b-miller
  • Add support for unique groupby aggregation (#7726) @shwina
  • Expose libcudf's label_bins function to cudf (#7724) @vyasr
  • Adding support for equi-join on struct (#7720) @hyperbolic2346
  • Add decimal column comparison operations (#7716) @isVoid
  • Implement scan operations for decimal columns (#7707) @ChrisJar
  • Enable typecasting between decimal and int (#7691) @ChrisJar
  • Enable decimal support in parquet writer (#7673) @devavret
  • Adds list.unique API (#7664) @isVoid
  • Fix NaN handling in drop_list_duplicates (#7662) @ttnghia
  • Add lists.sort_values API (#7657) @isVoid
  • Add is_integer API that can check for the validity of a string-to-integer conversion (#7642) @ttnghia
  • Adds explode API (#7607) @isVoid
  • Adds list.take, python binding for cudf::lists::segmented_gather (#7591) @isVoid
  • Implement cudf::label_bins() (#7554) @vyasr
  • Add Python bindings for lists::contains (#7547) @skirui-source
  • cudf::row_bit_count() support. (#7534) @nvdbaranec
  • Implement drop_list_duplicates (#7528) @ttnghia
  • Add Python bindings for lists::extract_lists_element (#7505) @skirui-source
  • Add explode_outer and explode_outer_position (#7499) @hyperbolic2346
  • Match Pandas logic for comparing two objects with nulls (#7490) @brandon-b-miller
  • Add struct support to parquet writer (#7461) @devavret
  • Enable type conversion from float to decimal type (#7450) @ChrisJar
  • Add cython for converting strings/fixed-point functions (#7429) @davidwendt
  • Add struct column support to cudf::sort and cudf::sorted_order (#7422) @karthikeyann
  • Implement groupby collect_set (#7420) @ttnghia
  • Merge branch-0.18 into branch-0.19 (#7411) @raydouglass
  • Refactor strings column factories (#7397) @harrism
  • Add groupby scan operations (sort groupby) (#7387) @karthikeyann
  • Add cudf::explode_position (#7376) @hyperbolic2346
  • Add string conversion to/from decimal values libcudf APIs (#7364) @davidwendt
  • Add groupby SUM_OF_SQUARES support (#7362) @karthikeyann
  • Add Series.drop api (#7304) @isVoid
  • get_json_object() implementation (#7286) @nvdbaranec
  • Python API for ListMethods.len() (#7283) @isVoid
  • Support null_policy::EXCLUDE for COLLECT rolling aggregation (#7264) @mythrocks
  • Add support for special tokens in nvtext::subword_tokenizer (#7254) @davidwendt
  • Fix inplace update of data and add Series.update (#7201) @galipremsagar
  • Implement cudf::group_by (hash) for decimal32 and decimal64 (#7190) @codereport
  • Adding support to specify "level" parameter for Dataframe.rename (#7135) @skirui-source

🛠️ Improvements

  • fix GDS include path for version 0.95 (#7877) @rongou
  • Update dask + distributed to 2021.4.0 (#7858) @jakirkham
  • Add ability to extract include dirs from CUDF_HOME (#7848) @galipremsagar
  • Add USE_GDS as an option in build script (#7833) @pxLi
  • add an allocate method with stream in java DeviceMemoryBuffer (#7826) @rongou
  • Constrain dask and distributed versions to 2021.3.1 (#7825) @shwina
  • Revert dask versioning of concat dispatch (#7823) @galipremsagar
  • add copy methods in Java memory buffer (#7791) @rongou
  • Update README and CONTRIBUTING for 0.19 (#7778) @robertmaynard
  • Allow hash_partition to take a seed value (#7771) @magnatelee
  • Turn on NVTX by default in java build (#7761) @tgravescs
  • Add Java bindings to join gather map APIs (#7751) @jlowe
  • Add replacements column support for Java replaceNulls (#7750) @jlowe
  • Add Java bindings for row_bit_count (#7749) @jlowe
  • Remove unused JVM array creation (#7748) @jlowe
  • Added JNI support for new is_integer (#7739) @revans2
  • Create and promote library aliases in libcudf installations (#7734) @trxcllnt
  • Support groupby operations for decimal dtypes (#7731) @vyasr
  • Memory map the input file only when GDS compatibility mode is not used (#7717) @vuule
  • Replace device_vector with device_uvector in null_mask (#7715) @harrism
  • Struct hashing support for SerialMurmur3 and SparkMurmur3 (#7714) @jlowe
  • Add gbenchmark for nvtext replace-tokens function (#7708) @davidwendt
  • Use stream in groupby calls (#7705) @karthikeyann
  • Update codeowners file (#7701) @ajschmidt8
  • Cleanup groupby to use host_span, device_span, device_uvector (#7698) @karthikeyann
  • Add gbenchmark for nvtext ngrams functions (#7693) @davidwendt
  • Misc Python/Cython optimizations (#7686) @shwina
  • Add gbenchmark for nvtext tokenize functions (#7684) @davidwendt
  • Add column_device_view to orc writer (#7676) @kaatish
  • cudf_kafka now uses cuDF CMake export targets (CPM) (#7674) @robertmaynard
  • Add gbenchmark for nvtext normalize functions (#7668) @davidwendt
  • Resolve unnecessary import of thrust/optional.hpp in types.hpp (#7667) @vyasr
  • Feature/optimize accessor copy (#7660) @vyasr
  • Fix find_package(cudf) (#7658) @trxcllnt
  • Work-around for gcc7 compile error on Centos7 (#7652) @davidwendt
  • Add in JNI support for count_elements (#7651) @revans2
  • Fix issues with building cudf in a non-conda environment (#7647) @galipremsagar
  • Refactor ConfigureCUDA to not conditionally insert compiler flags (#7643) @robertmaynard
  • Add gbenchmark for converting strings to/from timestamps (#7641) @davidwendt
  • Handle constructing a cudf.Scalar from a cudf.Scalar (#7639) @shwina
  • Add in JNI support for table partition (#7637) @revans2
  • Add explicit fixed_point merge test (#7635) @codereport
  • Add JNI support for IDENTITY hash partitioning (#7626) @revans2
  • Java support on explode_outer (#7625) @sperlingxx
  • Java support of casting string from/to decimal (#7623) @sperlingxx
  • Convert cudf::concatenate APIs to use spans and device_uvector (#7621) @harrism
  • Add gbenchmark for cudf::strings::translate function (#7617) @davidwendt
  • Use file(COPY ) over file(INSTALL ) so cmake output is reduced (#7616) @robertmaynard
  • Use rmm::device_uvector in place of rmm::device_vector for ORC reader/writer and cudf::io::column_buffer (#7614) @vuule
  • Refactor Java host-side buffer concatenation to expose separate steps (#7610) @jlowe
  • Add gbenchmarks for string substrings functions (#7603) @davidwendt
  • Refactor string conversion check (#7599) @ttnghia
  • JNI: Pass names of children struct columns to native Arrow IPC writer (#7598) @firestarman
  • Revert "ENH Fix stale GHA and prevent duplicates " (#7595) @mike-wendt
  • ENH Fix stale GHA and prevent duplicates (#7594) @mike-wendt
  • Fix auto-detecting GPU architectures (#7593) @trxcllnt
  • Reduce cudf library size (#7583) @robertmaynard
  • Optimize cudf::make_strings_column for long strings (#7576) @davidwendt
  • Always build and export the cudf::cudftestutil target (#7574) @trxcllnt
  • Eliminate literal parameters to uvector::set_element_async and device_scalar::set_value (#7563) @harrism
  • Add gbenchmark for strings::concatenate (#7560) @davidwendt
  • Update Changelog Link (#7550) @ajschmidt8
  • Add gbenchmarks for strings replace regex functions (#7541) @davidwendt
  • Add __repr__ for Column and ColumnAccessor (#7531) @shwina
  • Support Decimal DIV changes in cudf (#7527) @razajafri
  • Remove unneeded step parameter from strings::detail::copy_slice (#7525) @davidwendt
  • Use device_uvector, device_span in sort groupby (#7523) @karthikeyann
  • Add gbenchmarks for strings extract function (#7522) @davidwendt
  • Rename ARROW_STATIC_LIB because it conflicts with one in FindArrow.cmake (#7518) @trxcllnt
  • Reduce compile time/size for scan.cu (#7516) @davidwendt
  • Change device_vector to device_uvector in nvtext source files (#7512) @davidwendt
  • Removed unneeded includes from traits.hpp (#7509) @davidwendt
  • FIX Remove random build directory generation for ccache (#7508) @dillon-cullinan
  • xfail failing pytest in pandas 1.2.3 (#7507) @galipremsagar
  • JNI bit cast (#7493) @revans2
  • Combine rolling window function tests (#7480) @mythrocks
  • Prepare Changelog for Automation (#7477) @ajschmidt8
  • Java support for explode position (#7471) @sperlingxx
  • Update 0.18 changelog entry (#7463) @ajschmidt8
  • JNI: Support skipping nulls for collect aggregation (#7457) @firestarman
  • Join APIs that return gathermaps (#7454) @shwina
  • Remove dependence on managed memory for multimap test (#7451) @jrhemstad
  • Use cuFile for Parquet IO when available (#7444) @vuule
  • Statistics cleanup (#7439) @kaatish
  • Add gbenchmarks for strings filter functions (#7438) @davidwendt
  • fixed_point + cudf::binary_operation API Changes (#7435) @codereport
  • Improve string gather performance (#7433) @jlowe
  • Don't use user resource for a temporary allocation in sort_by_key (#7431) @magnatelee
  • Detail APIs for datetime functions (#7430) @magnatelee
  • Replace thrust::max_element with thrust::reduce in strings findall_re (#7428) @davidwendt
  • Add gbenchmark for strings split/split_record functions (#7427) @davidwendt
  • Update JNI build to use CMAKE_CUDA_ARCHITECTURES (#7425) @jlowe
  • Change nvtext::load_vocabulary_file to return a unique ptr (#7424) @davidwendt
  • Simplify type dispatch with device_storage_dispatch (#7419) @codereport
  • Java support for casting of nested child columns (#7417) @razajafri
  • Improve scalar string replace performance for long strings (#7415) @jlowe
  • Remove unneeded temporary device vector for strings scatter specialization (#7409) @davidwendt
  • bitmask_or implementation with bitmask refactor (#7406) @rwlee
  • Add other cudf::strings::replace functions to current strings replace gbenchmark (#7403) @davidwendt
  • Clean up included headers in device_operators.cuh (#7401) @codereport
  • Move nullable index iterator to indexalator factory (#7399) @davidwendt
  • ENH Pass ccache variables to conda recipe & use Ninja in CI (#7398) @Ethyling
  • upgrade maven-antrun-plugin to support maven parallel builds (#7393) @rongou
  • Add gbenchmark for strings find/contains functions (#7392) @davidwendt
  • Use CMAKE_CUDA_ARCHITECTURES (#7391) @robertmaynard
  • Refactor libcudf strings::replace to use make_strings_children utility (#7384) @davidwendt
  • Added in JNI support for out of core sort algorithm (#7381) @revans2
  • Upgrade pandas to 1.2 (#7375) @galipremsagar
  • Rename logical_cast to bit_cast and allow additional conversions (#7373) @ttnghia
  • jitify 2 support (#7372) @cwharris
  • compile_udf: Cache PTX for similar functions (#7371) @gmarkall
  • Add string scalar replace benchmark (#7369) @jlowe
  • Add gbenchmark for strings contains_re/count_re functions (#7366) @davidwendt
  • Update orc reader and writer fuzz tests (#7357) @galipremsagar
  • Improve url_decode performance for long strings (#7353) @jlowe
  • cudf::ast Small Refactorings (#7352) @codereport
  • Remove std::cout and print in the scatter test function EmptyListsOfNullableStrings. (#7342) @ttnghia
  • Use cudf::detail::make_counting_transform_iterator (#7338) @codereport
  • Change block size parameter from a global to a template param. (#7333) @nvdbaranec
  • Partial clean up of ORC writer (#7324) @vuule
  • Add gbenchmark for cudf::strings::to_lower (#7316) @davidwendt
  • Update Java bindings version to 0.19-SNAPSHOT (#7307) @pxLi
  • Move cudf::test::make_counting_transform_iterator to cudf/detail/iterator.cuh (#7306) @codereport
  • Use string literals in fixed_point release_asserts (#7303) @codereport
  • Fix merge conflicts for #7295 (#7297) @ajschmidt8
  • Add UTF-8 chars to create_random_column<string_view> benchmark utility (#7292) @davidwendt
  • Abstracting block reduce and block scan from cuIO kernels with cub apis (#7278) @rgsl888prabhu
  • Build.sh use cmake --build to drive build system invocation (#7270) @robertmaynard
  • Refactor dictionary support for reductions any/all (#7242) @davidwendt
  • Replace stream.value() with stream for stream_view args (#7236) @karthikeyann
  • Interval index and interval_range (#7182) @marlenezw
  • avro reader integration tests (#7156) @cwharris
  • Rework libcudf CMakeLists.txt to export targets for CPM (#7107) @trxcllnt
  • Adding Interval Dtype (#6984) @marlenezw
  • Cleaning up for loops with make_(counting_)transform_iterator (#6546) @codereport

cudf - v0.19.1

Published by GPUtester over 3 years ago

🚨 Breaking Changes

  • Allow hash_partition to take a seed value (#7771) @magnatelee
  • Allow merging index column with data column using keyword "on" (#7736) @skirui-source
  • Change JNI API to avoid loading native dependencies when creating sort order classes. (#7729) @revans2
  • Replace device_vector with device_uvector in null_mask (#7715) @harrism
  • Don't identify decimals as strings. (#7710) @vyasr
  • Fix Java Parquet write after writer API changes (#7655) @revans2
  • Convert cudf::concatenate APIs to use spans and device_uvector (#7621) @harrism
  • Update missing docstring examples in python public APIs (#7546) @galipremsagar
  • Remove unneeded step parameter from strings::detail::copy_slice (#7525) @davidwendt
  • Rename ARROW_STATIC_LIB because it conflicts with one in FindArrow.cmake (#7518) @trxcllnt
  • Match Pandas logic for comparing two objects with nulls (#7490) @brandon-b-miller
  • Add struct support to parquet writer (#7461) @devavret
  • Join APIs that return gathermaps (#7454) @shwina
  • fixed_point + cudf::binary_operation API Changes (#7435) @codereport
  • Fix BUG: Exception when PYTHONOPTIMIZE=2 (#7434) @skirui-source
  • Change nvtext::load_vocabulary_file to return a unique ptr (#7424) @davidwendt
  • Refactor strings column factories (#7397) @harrism
  • Use CMAKE_CUDA_ARCHITECTURES (#7391) @robertmaynard
  • Upgrade pandas to 1.2 (#7375) @galipremsagar
  • Rename logical_cast to bit_cast and allow additional conversions (#7373) @ttnghia
  • Rework libcudf CMakeLists.txt to export targets for CPM (#7107) @trxcllnt

🐛 Bug Fixes

  • Fix returned column type when extracting from an empty list column (#8031) @jlowe
  • Don't reindex a new value on setitem if the original dataframe was empty (#8026) @vyasr
  • Fix a NameError in meta dispatch API (#7996) @galipremsagar
  • Reindex in DataFrame.__setitem__ (#7957) @galipremsagar
  • jitify direct-to-cubin compilation and caching. (#7919) @cwharris
  • Use dynamic cudart for nvcomp in java build (#7896) @abellina
  • fix "incompatible redefinition" warnings (#7894) @cwharris
  • cudf consistently specifies the cuda runtime (#7887) @robertmaynard
  • disable verbose output for jitify_preprocess (#7886) @cwharris
  • CMake jit_preprocess_files function only runs when needed (#7872) @robertmaynard
  • Push DeviceScalar construction into cython for list.contains (#7864) @brandon-b-miller
  • cudf now sets an install rpath of $ORIGIN (#7863) @robertmaynard
  • Don't install Thrust examples, tests, docs, and python files (#7811) @robertmaynard
  • Sort by index in groupby tests more consistently (#7802) @shwina
  • Revert "Update conda recipes pinning of repo dependencies (#7743)" (#7793) @raydouglass
  • Add decimal column handling in copy_type_metadata (#7788) @shwina
  • Add column names validation in parquet writer (#7786) @galipremsagar
  • Fix Java explode outer unit tests (#7782) @jlowe
  • Fix compiler warning about non-POD types passed through ellipsis (#7781) @jrhemstad
  • User resource fix for replace_nulls (#7769) @magnatelee
  • Fix type dispatch for columnar replace_nulls (#7768) @jlowe
  • Add ignore_order parameter to dask-cudf concat dispatch (#7765) @galipremsagar
  • Fix slicing and arrow representations of decimal columns (#7755) @vyasr
  • Fixing issue with explode_outer position not nulling position entries of null rows (#7754) @hyperbolic2346
  • Implement scatter for struct columns (#7752) @ttnghia
  • Fix data corruption in string columns (#7746) @galipremsagar
  • Fix string length in stripe dictionary building (#7744) @kaatish
  • Update conda recipes pinning of repo dependencies (#7743) @mike-wendt
  • Enable dask dispatch to cuDF's is_categorical_dtype for cuDF objects (#7740) @brandon-b-miller
  • Fix dictionary size computation in ORC writer (#7737) @vuule
  • Fix cudf::cast overflow for decimal64 to int32_t or smaller in certain cases (#7733) @codereport
  • Change JNI API to avoid loading native dependencies when creating sort order classes. (#7729) @revans2
  • Disable column_view data accessors for unsupported types (#7725) @jrhemstad
  • Materialize RangeIndex when index=True in parquet writer (#7711) @galipremsagar
  • Don't identify decimals as strings. (#7710) @vyasr
  • Fix return type of DataFrame.argsort (#7706) @galipremsagar
  • Fix/correct cudf installed package requirements (#7688) @robertmaynard
  • Fix SparkMurmurHash3_32 hash inconsistencies with Apache Spark (#7672) @jlowe
  • Fix ORC reader issue with reading empty string columns (#7656) @rgsl888prabhu
  • Fix Java Parquet write after writer API changes (#7655) @revans2
  • Fixing empty null lists throwing explode_outer for a loop. (#7649) @hyperbolic2346
  • Fix internal compiler error during JNI Docker build (#7645) @jlowe
  • Fix Debug build break with device_uvectors in grouped_rolling.cu (#7633) @mythrocks
  • Parquet reader: Fix issue when using skip_rows on non-nested columns containing nulls (#7627) @nvdbaranec
  • Fix ORC reader for empty DataFrame/Table (#7624) @rgsl888prabhu
  • Fix specifying GPU architecture in JNI build (#7612) @jlowe
  • Fix ORC writer OOM issue (#7605) @vuule
  • Fix 0.18 --> 0.19 automerge (#7589) @kkraus14
  • Fix ORC issue with incorrect timestamp nanosecond values (#7581) @vuule
  • Fix missing Dask imports (#7580) @kkraus14
  • CMAKE_CUDA_ARCHITECTURES doesn't change when build-system invokes cmake (#7579) @robertmaynard
  • Another fix for offsets_end() iterator in lists_column_view (#7575) @ttnghia
  • Fix ORC writer output corruption with string columns (#7565) @vuule
  • Fix cudf::lists::sort_lists failing for sliced column (#7564) @ttnghia
  • FIX Fix Anaconda upload args (#7558) @dillon-cullinan
  • Fix index mismatch issue in equality related APIs (#7555) @galipremsagar
  • FIX Revert gpuci_conda_retry on conda file output locations (#7552) @dillon-cullinan
  • Fix offset_end iterator for lists_column_view, which was not correctl… (#7551) @ttnghia
  • Fix no such file dlpack.h error when build libcudf (#7549) @chenrui17
  • Update missing docstring examples in python public APIs (#7546) @galipremsagar
  • Decimal32 Build Fix (#7544) @razajafri
  • FIX Retry conda output location (#7540) @dillon-cullinan
  • fix missing renames of dask git branches from master to main (#7535) @kkraus14
  • Remove detail from device_span (#7533) @rwlee
  • Change dask and distributed branch to main (#7532) @dantegd
  • Update JNI build to use CUDF_USE_ARROW_STATIC (#7526) @jlowe
  • Make sure rmm::rmm CMake target is visible to cudf users (#7524) @robertmaynard
  • Fix contiguous_split not properly handling output partitions > 2 GB. (#7515) @nvdbaranec
  • Change jit launch to safe_launch (#7510) @devavret
  • Fix comparison between Datetime/Timedelta columns and NULL scalars (#7504) @brandon-b-miller
  • Fix off-by-one error in char-parallel string scalar replace (#7502) @jlowe
  • Fix JNI deprecation of all, put it on the wrong version before (#7501) @revans2
  • Fix Series/Dataframe Mixed Arithmetic (#7491) @brandon-b-miller
  • Fix JNI build after removal of libcudf sub-libraries (#7486) @jlowe
  • Correctly compile benchmarks (#7485) @robertmaynard
  • Fix bool column corruption with ORC Reader (#7483) @rgsl888prabhu
  • Fix __repr__ for categorical dtype (#7476) @galipremsagar
  • Java cleaner synchronization (#7474) @abellina
  • Fix java float/double parsing tests (#7473) @revans2
  • Pass stream and user resource to make_default_constructed_scalar (#7469) @magnatelee
  • Improve stability of dask_cudf.DataFrame.var and dask_cudf.DataFrame.std (#7453) @rjzamora
  • Missing device_storage_dispatch change affecting cudf::gather (#7449) @codereport
  • fix cuFile JNI compile errors (#7445) @rongou
  • Support Series.__setitem__ with key to a new row (#7443) @isVoid
  • Fix BUG: Exception when PYTHONOPTIMIZE=2 (#7434) @skirui-source
  • Make inclusive scan safe for cases with leading nulls (#7432) @magnatelee
  • Fix typo in list_device_view::pair_rep_end() (#7423) @mythrocks
  • Fix string to double conversion and row equivalent comparison (#7410) @ttnghia
  • Fix thrust failure when transferring data from device_vector to host_vector with vectors of size 1 (#7382) @ttnghia
  • Fix std::exception catch-by-reference gcc9 compile error (#7380) @davidwendt
  • Fix skiprows issue with ORC Reader (#7359) @rgsl888prabhu
  • fix Arrow CMake file (#7358) @rongou
  • Fix lists::contains() for NaN and Decimals (#7349) @mythrocks
  • Handle cupy array in Dataframe.__setitem__ (#7340) @galipremsagar
  • Fix invalid-device-fn error in cudf::strings::replace_re with multiple regex's (#7336) @davidwendt
  • FIX Add codecov upload block to gpu script (#6860) @dillon-cullinan

📖 Documentation

  • Fix join API doxygen (#7890) @shwina
  • Add Resources to README. (#7697) @bdice
  • Add isin examples in Docstring (#7479) @galipremsagar
  • Resolving unlinked type shorthands in cudf doc (#7416) @isVoid
  • Fix typo in regex.md doc page (#7363) @davidwendt
  • Fix incorrect strings_column_view::chars_size documentation (#7360) @jlowe

🚀 New Features

  • Enable basic reductions for decimal columns (#7776) @ChrisJar
  • Enable join on decimal columns (#7764) @ChrisJar
  • Allow merging index column with data column using keyword "on" (#7736) @skirui-source
  • Implement DecimalColumn + Scalar and add cudf.Scalars of Decimal64Dtype (#7732) @brandon-b-miller
  • Add support for unique groupby aggregation (#7726) @shwina
  • Expose libcudf's label_bins function to cudf (#7724) @vyasr
  • Adding support for equi-join on struct (#7720) @hyperbolic2346
  • Add decimal column comparison operations (#7716) @isVoid
  • Implement scan operations for decimal columns (#7707) @ChrisJar
  • Enable typecasting between decimal and int (#7691) @ChrisJar
  • Enable decimal support in parquet writer (#7673) @devavret
  • Adds list.unique API (#7664) @isVoid
  • Fix NaN handling in drop_list_duplicates (#7662) @ttnghia
  • Add lists.sort_values API (#7657) @isVoid
  • Add is_integer API that can check for the validity of a string-to-integer conversion (#7642) @ttnghia
  • Adds explode API (#7607) @isVoid
  • Adds list.take, python binding for cudf::lists::segmented_gather (#7591) @isVoid
  • Implement cudf::label_bins() (#7554) @vyasr
  • Add Python bindings for lists::contains (#7547) @skirui-source
  • cudf::row_bit_count() support. (#7534) @nvdbaranec
  • Implement drop_list_duplicates (#7528) @ttnghia
  • Add Python bindings for lists::extract_lists_element (#7505) @skirui-source
  • Add explode_outer and explode_outer_position (#7499) @hyperbolic2346
  • Match Pandas logic for comparing two objects with nulls (#7490) @brandon-b-miller
  • Add struct support to parquet writer (#7461) @devavret
  • Enable type conversion from float to decimal type (#7450) @ChrisJar
  • Add cython for converting strings/fixed-point functions (#7429) @davidwendt
  • Add struct column support to cudf::sort and cudf::sorted_order (#7422) @karthikeyann
  • Implement groupby collect_set (#7420) @ttnghia
  • Merge branch-0.18 into branch-0.19 (#7411) @raydouglass
  • Refactor strings column factories (#7397) @harrism
  • Add groupby scan operations (sort groupby) (#7387) @karthikeyann
  • Add cudf::explode_position (#7376) @hyperbolic2346
  • Add string conversion to/from decimal values libcudf APIs (#7364) @davidwendt
  • Add groupby SUM_OF_SQUARES support (#7362) @karthikeyann
  • Add Series.drop api (#7304) @isVoid
  • get_json_object() implementation (#7286) @nvdbaranec
  • Python API for ListMethods.len() (#7283) @isVoid
  • Support null_policy::EXCLUDE for COLLECT rolling aggregation (#7264) @mythrocks
  • Add support for special tokens in nvtext::subword_tokenizer (#7254) @davidwendt
  • Fix inplace update of data and add Series.update (#7201) @galipremsagar
  • Implement cudf::group_by (hash) for decimal32 and decimal64 (#7190) @codereport
  • Adding support to specify "level" parameter for Dataframe.rename (#7135) @skirui-source

🛠️ Improvements

  • fix GDS include path for version 0.95 (#7877) @rongou
  • Update dask + distributed to 2021.4.0 (#7858) @jakirkham
  • Add ability to extract include dirs from CUDF_HOME (#7848) @galipremsagar
  • Add USE_GDS as an option in build script (#7833) @pxLi
  • add an allocate method with stream in java DeviceMemoryBuffer (#7826) @rongou
  • Constrain dask and distributed versions to 2021.3.1 (#7825) @shwina
  • Revert dask versioning of concat dispatch (#7823) @galipremsagar
  • add copy methods in Java memory buffer (#7791) @rongou
  • Update README and CONTRIBUTING for 0.19 (#7778) @robertmaynard
  • Allow hash_partition to take a seed value (#7771) @magnatelee
  • Turn on NVTX by default in java build (#7761) @tgravescs
  • Add Java bindings to join gather map APIs (#7751) @jlowe
  • Add replacements column support for Java replaceNulls (#7750) @jlowe
  • Add Java bindings for row_bit_count (#7749) @jlowe
  • Remove unused JVM array creation (#7748) @jlowe
  • Added JNI support for new is_integer (#7739) @revans2
  • Create and promote library aliases in libcudf installations (#7734) @trxcllnt
  • Support groupby operations for decimal dtypes (#7731) @vyasr
  • Memory map the input file only when GDS compatibility mode is not used (#7717) @vuule
  • Replace device_vector with device_uvector in null_mask (#7715) @harrism
  • Struct hashing support for SerialMurmur3 and SparkMurmur3 (#7714) @jlowe
  • Add gbenchmark for nvtext replace-tokens function (#7708) @davidwendt
  • Use stream in groupby calls (#7705) @karthikeyann
  • Update codeowners file (#7701) @ajschmidt8
  • Cleanup groupby to use host_span, device_span, device_uvector (#7698) @karthikeyann
  • Add gbenchmark for nvtext ngrams functions (#7693) @davidwendt
  • Misc Python/Cython optimizations (#7686) @shwina
  • Add gbenchmark for nvtext tokenize functions (#7684) @davidwendt
  • Add column_device_view to orc writer (#7676) @kaatish
  • cudf_kafka now uses cuDF CMake export targets (CPM) (#7674) @robertmaynard
  • Add gbenchmark for nvtext normalize functions (#7668) @davidwendt
  • Resolve unnecessary import of thrust/optional.hpp in types.hpp (#7667) @vyasr
  • Feature/optimize accessor copy (#7660) @vyasr
  • Fix find_package(cudf) (#7658) @trxcllnt
  • Work-around for gcc7 compile error on Centos7 (#7652) @davidwendt
  • Add in JNI support for count_elements (#7651) @revans2
  • Fix issues with building cudf in a non-conda environment (#7647) @galipremsagar
  • Refactor ConfigureCUDA to not conditionally insert compiler flags (#7643) @robertmaynard
  • Add gbenchmark for converting strings to/from timestamps (#7641) @davidwendt
  • Handle constructing a cudf.Scalar from a cudf.Scalar (#7639) @shwina
  • Add in JNI support for table partition (#7637) @revans2
  • Add explicit fixed_point merge test (#7635) @codereport
  • Add JNI support for IDENTITY hash partitioning (#7626) @revans2
  • Java support on explode_outer (#7625) @sperlingxx
  • Java support of casting string from/to decimal (#7623) @sperlingxx
  • Convert cudf::concatenate APIs to use spans and device_uvector (#7621) @harrism
  • Add gbenchmark for cudf::strings::translate function (#7617) @davidwendt
  • Use file(COPY ) over file(INSTALL ) so cmake output is reduced (#7616) @robertmaynard
  • Use rmm::device_uvector in place of rmm::device_vector for ORC reader/writer and cudf::io::column_buffer (#7614) @vuule
  • Refactor Java host-side buffer concatenation to expose separate steps (#7610) @jlowe
  • Add gbenchmarks for string substrings functions (#7603) @davidwendt
  • Refactor string conversion check (#7599) @ttnghia
  • JNI: Pass names of children struct columns to native Arrow IPC writer (#7598) @firestarman
  • Revert "ENH Fix stale GHA and prevent duplicates " (#7595) @mike-wendt
  • ENH Fix stale GHA and prevent duplicates (#7594) @mike-wendt
  • Fix auto-detecting GPU architectures (#7593) @trxcllnt
  • Reduce cudf library size (#7583) @robertmaynard
  • Optimize cudf::make_strings_column for long strings (#7576) @davidwendt
  • Always build and export the cudf::cudftestutil target (#7574) @trxcllnt
  • Eliminate literal parameters to uvector::set_element_async and device_scalar::set_value (#7563) @harrism
  • Add gbenchmark for strings::concatenate (#7560) @davidwendt
  • Update Changelog Link (#7550) @ajschmidt8
  • Add gbenchmarks for strings replace regex functions (#7541) @davidwendt
  • Add __repr__ for Column and ColumnAccessor (#7531) @shwina
  • Support Decimal DIV changes in cudf (#7527) @razajafri
  • Remove unneeded step parameter from strings::detail::copy_slice (#7525) @davidwendt
  • Use device_uvector, device_span in sort groupby (#7523) @karthikeyann
  • Add gbenchmarks for strings extract function (#7522) @davidwendt
  • Rename ARROW_STATIC_LIB because it conflicts with one in FindArrow.cmake (#7518) @trxcllnt
  • Reduce compile time/size for scan.cu (#7516) @davidwendt
  • Change device_vector to device_uvector in nvtext source files (#7512) @davidwendt
  • Removed unneeded includes from traits.hpp (#7509) @davidwendt
  • FIX Remove random build directory generation for ccache (#7508) @dillon-cullinan
  • xfail failing pytest in pandas 1.2.3 (#7507) @galipremsagar
  • JNI bit cast (#7493) @revans2
  • Combine rolling window function tests (#7480) @mythrocks
  • Prepare Changelog for Automation (#7477) @ajschmidt8
  • Java support for explode position (#7471) @sperlingxx
  • Update 0.18 changelog entry (#7463) @ajschmidt8
  • JNI: Support skipping nulls for collect aggregation (#7457) @firestarman
  • Join APIs that return gathermaps (#7454) @shwina
  • Remove dependence on managed memory for multimap test (#7451) @jrhemstad
  • Use cuFile for Parquet IO when available (#7444) @vuule
  • Statistics cleanup (#7439) @kaatish
  • Add gbenchmarks for strings filter functions (#7438) @davidwendt
  • fixed_point + cudf::binary_operation API Changes (#7435) @codereport
  • Improve string gather performance (#7433) @jlowe
  • Don't use user resource for a temporary allocation in sort_by_key (#7431) @magnatelee
  • Detail APIs for datetime functions (#7430) @magnatelee
  • Replace thrust::max_element with thrust::reduce in strings findall_re (#7428) @davidwendt
  • Add gbenchmark for strings split/split_record functions (#7427) @davidwendt
  • Update JNI build to use CMAKE_CUDA_ARCHITECTURES (#7425) @jlowe
  • Change nvtext::load_vocabulary_file to return a unique ptr (#7424) @davidwendt
  • Simplify type dispatch with device_storage_dispatch (#7419) @codereport
  • Java support for casting of nested child columns (#7417) @razajafri
  • Improve scalar string replace performance for long strings (#7415) @jlowe
  • Remove unneeded temporary device vector for strings scatter specialization (#7409) @davidwendt
  • bitmask_or implementation with bitmask refactor (#7406) @rwlee
  • Add other cudf::strings::replace functions to current strings replace gbenchmark (#7403) @davidwendt
  • Clean up included headers in device_operators.cuh (#7401) @codereport
  • Move nullable index iterator to indexalator factory (#7399) @davidwendt
  • ENH Pass ccache variables to conda recipe & use Ninja in CI (#7398) @Ethyling
  • upgrade maven-antrun-plugin to support maven parallel builds (#7393) @rongou
  • Add gbenchmark for strings find/contains functions (#7392) @davidwendt
  • Use CMAKE_CUDA_ARCHITECTURES (#7391) @robertmaynard
  • Refactor libcudf strings::replace to use make_strings_children utility (#7384) @davidwendt
  • Added in JNI support for out of core sort algorithm (#7381) @revans2
  • Upgrade pandas to 1.2 (#7375) @galipremsagar
  • Rename logical_cast to bit_cast and allow additional conversions (#7373) @ttnghia
  • jitify 2 support (#7372) @cwharris
  • compile_udf: Cache PTX for similar functions (#7371) @gmarkall
  • Add string scalar replace benchmark (#7369) @jlowe
  • Add gbenchmark for strings contains_re/count_re functions (#7366) @davidwendt
  • Update orc reader and writer fuzz tests (#7357) @galipremsagar
  • Improve url_decode performance for long strings (#7353) @jlowe
  • cudf::ast Small Refactorings (#7352) @codereport
  • Remove std::cout and print in the scatter test function EmptyListsOfNullableStrings. (#7342) @ttnghia
  • Use cudf::detail::make_counting_transform_iterator (#7338) @codereport
  • Change block size parameter from a global to a template param. (#7333) @nvdbaranec
  • Partial clean up of ORC writer (#7324) @vuule
  • Add gbenchmark for cudf::strings::to_lower (#7316) @davidwendt
  • Update Java bindings version to 0.19-SNAPSHOT (#7307) @pxLi
  • Move cudf::test::make_counting_transform_iterator to cudf/detail/iterator.cuh (#7306) @codereport
  • Use string literals in fixed_point release_asserts (#7303) @codereport
  • Fix merge conflicts for #7295 (#7297) @ajschmidt8
  • Add UTF-8 chars to create_random_column<string_view> benchmark utility (#7292) @davidwendt
  • Abstracting block reduce and block scan from cuIO kernels with cub apis (#7278) @rgsl888prabhu
  • Build.sh use cmake --build to drive build system invocation (#7270) @robertmaynard
  • Refactor dictionary support for reductions any/all (#7242) @davidwendt
  • Replace stream.value() with stream for stream_view args (#7236) @karthikeyann
  • Interval index and interval_range (#7182) @marlenezw
  • avro reader integration tests (#7156) @cwharris
  • Rework libcudf CMakeLists.txt to export targets for CPM (#7107) @trxcllnt
  • Adding Interval Dtype (#6984) @marlenezw
  • Cleaning up for loops with make_(counting_)transform_iterator (#6546) @codereport

cudf - v0.19.0

Published by GPUtester over 3 years ago

🚨 Breaking Changes

  • Allow hash_partition to take a seed value (#7771) @magnatelee
  • Allow merging index column with data column using keyword "on" (#7736) @skirui-source
  • Change JNI API to avoid loading native dependencies when creating sort order classes. (#7729) @revans2
  • Replace device_vector with device_uvector in null_mask (#7715) @harrism
  • Don't identify decimals as strings. (#7710) @vyasr
  • Fix Java Parquet write after writer API changes (#7655) @revans2
  • Convert cudf::concatenate APIs to use spans and device_uvector (#7621) @harrism
  • Update missing docstring examples in python public APIs (#7546) @galipremsagar
  • Remove unneeded step parameter from strings::detail::copy_slice (#7525) @davidwendt
  • Rename ARROW_STATIC_LIB because it conflicts with one in FindArrow.cmake (#7518) @trxcllnt
  • Match Pandas logic for comparing two objects with nulls (#7490) @brandon-b-miller
  • Add struct support to parquet writer (#7461) @devavret
  • Join APIs that return gathermaps (#7454) @shwina
  • fixed_point + cudf::binary_operation API Changes (#7435) @codereport
  • Fix BUG: Exception when PYTHONOPTIMIZE=2 (#7434) @skirui-source
  • Change nvtext::load_vocabulary_file to return a unique ptr (#7424) @davidwendt
  • Refactor strings column factories (#7397) @harrism
  • Use CMAKE_CUDA_ARCHITECTURES (#7391) @robertmaynard
  • Upgrade pandas to 1.2 (#7375) @galipremsagar
  • Rename logical_cast to bit_cast and allow additional conversions (#7373) @ttnghia
  • Rework libcudf CMakeLists.txt to export targets for CPM (#7107) @trxcllnt

🐛 Bug Fixes

  • Fix a NameError in meta dispatch API (#7996) @galipremsagar
  • Reindex in DataFrame.__setitem__ (#7957) @galipremsagar
  • jitify direct-to-cubin compilation and caching. (#7919) @cwharris
  • Use dynamic cudart for nvcomp in java build (#7896) @abellina
  • fix "incompatible redefinition" warnings (#7894) @cwharris
  • cudf consistently specifies the cuda runtime (#7887) @robertmaynard
  • disable verbose output for jitify_preprocess (#7886) @cwharris
  • CMake jit_preprocess_files function only runs when needed (#7872) @robertmaynard
  • Push DeviceScalar construction into cython for list.contains (#7864) @brandon-b-miller
  • cudf now sets an install rpath of $ORIGIN (#7863) @robertmaynard
  • Don't install Thrust examples, tests, docs, and python files (#7811) @robertmaynard
  • Sort by index in groupby tests more consistently (#7802) @shwina
  • Revert "Update conda recipes pinning of repo dependencies (#7743)" (#7793) @raydouglass
  • Add decimal column handling in copy_type_metadata (#7788) @shwina
  • Add column names validation in parquet writer (#7786) @galipremsagar
  • Fix Java explode outer unit tests (#7782) @jlowe
  • Fix compiler warning about non-POD types passed through ellipsis (#7781) @jrhemstad
  • User resource fix for replace_nulls (#7769) @magnatelee
  • Fix type dispatch for columnar replace_nulls (#7768) @jlowe
  • Add ignore_order parameter to dask-cudf concat dispatch (#7765) @galipremsagar
  • Fix slicing and arrow representations of decimal columns (#7755) @vyasr
  • Fixing issue with explode_outer position not nulling position entries of null rows (#7754) @hyperbolic2346
  • Implement scatter for struct columns (#7752) @ttnghia
  • Fix data corruption in string columns (#7746) @galipremsagar
  • Fix string length in stripe dictionary building (#7744) @kaatish
  • Update conda recipes pinning of repo dependencies (#7743) @mike-wendt
  • Enable dask dispatch to cuDF's is_categorical_dtype for cuDF objects (#7740) @brandon-b-miller
  • Fix dictionary size computation in ORC writer (#7737) @vuule
  • Fix cudf::cast overflow for decimal64 to int32_t or smaller in certain cases (#7733) @codereport
  • Change JNI API to avoid loading native dependencies when creating sort order classes. (#7729) @revans2
  • Disable column_view data accessors for unsupported types (#7725) @jrhemstad
  • Materialize RangeIndex when index=True in parquet writer (#7711) @galipremsagar
  • Don't identify decimals as strings. (#7710) @vyasr
  • Fix return type of DataFrame.argsort (#7706) @galipremsagar
  • Fix/correct cudf installed package requirements (#7688) @robertmaynard
  • Fix SparkMurmurHash3_32 hash inconsistencies with Apache Spark (#7672) @jlowe
  • Fix ORC reader issue with reading empty string columns (#7656) @rgsl888prabhu
  • Fix Java Parquet write after writer API changes (#7655) @revans2
  • Fixing empty null lists throwing explode_outer for a loop. (#7649) @hyperbolic2346
  • Fix internal compiler error during JNI Docker build (#7645) @jlowe
  • Fix Debug build break with device_uvectors in grouped_rolling.cu (#7633) @mythrocks
  • Parquet reader: Fix issue when using skip_rows on non-nested columns containing nulls (#7627) @nvdbaranec
  • Fix ORC reader for empty DataFrame/Table (#7624) @rgsl888prabhu
  • Fix specifying GPU architecture in JNI build (#7612) @jlowe
  • Fix ORC writer OOM issue (#7605) @vuule
  • Fix 0.18 --> 0.19 automerge (#7589) @kkraus14
  • Fix ORC issue with incorrect timestamp nanosecond values (#7581) @vuule
  • Fix missing Dask imports (#7580) @kkraus14
  • CMAKE_CUDA_ARCHITECTURES doesn't change when build-system invokes cmake (#7579) @robertmaynard
  • Another fix for offsets_end() iterator in lists_column_view (#7575) @ttnghia
  • Fix ORC writer output corruption with string columns (#7565) @vuule
  • Fix cudf::lists::sort_lists failing for sliced column (#7564) @ttnghia
  • FIX Fix Anaconda upload args (#7558) @dillon-cullinan
  • Fix index mismatch issue in equality related APIs (#7555) @galipremsagar
  • FIX Revert gpuci_conda_retry on conda file output locations (#7552) @dillon-cullinan
  • Fix offset_end iterator for lists_column_view, which was not correctl… (#7551) @ttnghia
  • Fix no such file dlpack.h error when build libcudf (#7549) @chenrui17
  • Update missing docstring examples in python public APIs (#7546) @galipremsagar
  • Decimal32 Build Fix (#7544) @razajafri
  • FIX Retry conda output location (#7540) @dillon-cullinan
  • fix missing renames of dask git branches from master to main (#7535) @kkraus14
  • Remove detail from device_span (#7533) @rwlee
  • Change dask and distributed branch to main (#7532) @dantegd
  • Update JNI build to use CUDF_USE_ARROW_STATIC (#7526) @jlowe
  • Make sure rmm::rmm CMake target is visible to cudf users (#7524) @robertmaynard
  • Fix contiguous_split not properly handling output partitions > 2 GB. (#7515) @nvdbaranec
  • Change jit launch to safe_launch (#7510) @devavret
  • Fix comparison between Datetime/Timedelta columns and NULL scalars (#7504) @brandon-b-miller
  • Fix off-by-one error in char-parallel string scalar replace (#7502) @jlowe
  • Fix JNI deprecation of all, put it on the wrong version before (#7501) @revans2
  • Fix Series/Dataframe Mixed Arithmetic (#7491) @brandon-b-miller
  • Fix JNI build after removal of libcudf sub-libraries (#7486) @jlowe
  • Correctly compile benchmarks (#7485) @robertmaynard
  • Fix bool column corruption with ORC Reader (#7483) @rgsl888prabhu
  • Fix __repr__ for categorical dtype (#7476) @galipremsagar
  • Java cleaner synchronization (#7474) @abellina
  • Fix java float/double parsing tests (#7473) @revans2
  • Pass stream and user resource to make_default_constructed_scalar (#7469) @magnatelee
  • Improve stability of dask_cudf.DataFrame.var and dask_cudf.DataFrame.std (#7453) @rjzamora
  • Missing device_storage_dispatch change affecting cudf::gather (#7449) @codereport
  • fix cuFile JNI compile errors (#7445) @rongou
  • Support Series.__setitem__ with key to a new row (#7443) @isVoid
  • Fix BUG: Exception when PYTHONOPTIMIZE=2 (#7434) @skirui-source
  • Make inclusive scan safe for cases with leading nulls (#7432) @magnatelee
  • Fix typo in list_device_view::pair_rep_end() (#7423) @mythrocks
  • Fix string to double conversion and row equivalent comparison (#7410) @ttnghia
  • Fix thrust failure when transferring data from device_vector to host_vector with vectors of size 1 (#7382) @ttnghia
  • Fix std::exception catch-by-reference gcc9 compile error (#7380) @davidwendt
  • Fix skiprows issue with ORC Reader (#7359) @rgsl888prabhu
  • fix Arrow CMake file (#7358) @rongou
  • Fix lists::contains() for NaN and Decimals (#7349) @mythrocks
  • Handle cupy array in Dataframe.__setitem__ (#7340) @galipremsagar
  • Fix invalid-device-fn error in cudf::strings::replace_re with multiple regex's (#7336) @davidwendt
  • FIX Add codecov upload block to gpu script (#6860) @dillon-cullinan

📖 Documentation

  • Fix join API doxygen (#7890) @shwina
  • Add Resources to README. (#7697) @bdice
  • Add isin examples in Docstring (#7479) @galipremsagar
  • Resolving unlinked type shorthands in cudf doc (#7416) @isVoid
  • Fix typo in regex.md doc page (#7363) @davidwendt
  • Fix incorrect strings_column_view::chars_size documentation (#7360) @jlowe

🚀 New Features

  • Enable basic reductions for decimal columns (#7776) @ChrisJar
  • Enable join on decimal columns (#7764) @ChrisJar
  • Allow merging index column with data column using keyword "on" (#7736) @skirui-source
  • Implement DecimalColumn + Scalar and add cudf.Scalars of Decimal64Dtype (#7732) @brandon-b-miller
  • Add support for unique groupby aggregation (#7726) @shwina
  • Expose libcudf's label_bins function to cudf (#7724) @vyasr
  • Adding support for equi-join on struct (#7720) @hyperbolic2346
  • Add decimal column comparison operations (#7716) @isVoid
  • Implement scan operations for decimal columns (#7707) @ChrisJar
  • Enable typecasting between decimal and int (#7691) @ChrisJar
  • Enable decimal support in parquet writer (#7673) @devavret
  • Adds list.unique API (#7664) @isVoid
  • Fix NaN handling in drop_list_duplicates (#7662) @ttnghia
  • Add lists.sort_values API (#7657) @isVoid
  • Add is_integer API that can check for the validity of a string-to-integer conversion (#7642) @ttnghia
  • Adds explode API (#7607) @isVoid
  • Adds list.take, python binding for cudf::lists::segmented_gather (#7591) @isVoid
  • Implement cudf::label_bins() (#7554) @vyasr
  • Add Python bindings for lists::contains (#7547) @skirui-source
  • cudf::row_bit_count() support. (#7534) @nvdbaranec
  • Implement drop_list_duplicates (#7528) @ttnghia
  • Add Python bindings for lists::extract_lists_element (#7505) @skirui-source
  • Add explode_outer and explode_outer_position (#7499) @hyperbolic2346
  • Match Pandas logic for comparing two objects with nulls (#7490) @brandon-b-miller
  • Add struct support to parquet writer (#7461) @devavret
  • Enable type conversion from float to decimal type (#7450) @ChrisJar
  • Add cython for converting strings/fixed-point functions (#7429) @davidwendt
  • Add struct column support to cudf::sort and cudf::sorted_order (#7422) @karthikeyann
  • Implement groupby collect_set (#7420) @ttnghia
  • Merge branch-0.18 into branch-0.19 (#7411) @raydouglass
  • Refactor strings column factories (#7397) @harrism
  • Add groupby scan operations (sort groupby) (#7387) @karthikeyann
  • Add cudf::explode_position (#7376) @hyperbolic2346
  • Add string conversion to/from decimal values libcudf APIs (#7364) @davidwendt
  • Add groupby SUM_OF_SQUARES support (#7362) @karthikeyann
  • Add Series.drop api (#7304) @isVoid
  • get_json_object() implementation (#7286) @nvdbaranec
  • Python API for ListMethods.len() (#7283) @isVoid
  • Support null_policy::EXCLUDE for COLLECT rolling aggregation (#7264) @mythrocks
  • Add support for special tokens in nvtext::subword_tokenizer (#7254) @davidwendt
  • Fix inplace update of data and add Series.update (#7201) @galipremsagar
  • Implement cudf::group_by (hash) for decimal32 and decimal64 (#7190) @codereport
  • Adding support to specify "level" parameter for Dataframe.rename (#7135) @skirui-source

🛠️ Improvements

  • fix GDS include path for version 0.95 (#7877) @rongou
  • Update dask + distributed to 2021.4.0 (#7858) @jakirkham
  • Add ability to extract include dirs from CUDF_HOME (#7848) @galipremsagar
  • Add USE_GDS as an option in build script (#7833) @pxLi
  • add an allocate method with stream in java DeviceMemoryBuffer (#7826) @rongou
  • Constrain dask and distributed versions to 2021.3.1 (#7825) @shwina
  • Revert dask versioning of concat dispatch (#7823) @galipremsagar
  • add copy methods in Java memory buffer (#7791) @rongou
  • Update README and CONTRIBUTING for 0.19 (#7778) @robertmaynard
  • Allow hash_partition to take a seed value (#7771) @magnatelee
  • Turn on NVTX by default in java build (#7761) @tgravescs
  • Add Java bindings to join gather map APIs (#7751) @jlowe
  • Add replacements column support for Java replaceNulls (#7750) @jlowe
  • Add Java bindings for row_bit_count (#7749) @jlowe
  • Remove unused JVM array creation (#7748) @jlowe
  • Added JNI support for new is_integer (#7739) @revans2
  • Create and promote library aliases in libcudf installations (#7734) @trxcllnt
  • Support groupby operations for decimal dtypes (#7731) @vyasr
  • Memory map the input file only when GDS compatibility mode is not used (#7717) @vuule
  • Replace device_vector with device_uvector in null_mask (#7715) @harrism
  • Struct hashing support for SerialMurmur3 and SparkMurmur3 (#7714) @jlowe
  • Add gbenchmark for nvtext replace-tokens function (#7708) @davidwendt
  • Use stream in groupby calls (#7705) @karthikeyann
  • Update codeowners file (#7701) @ajschmidt8
  • Cleanup groupby to use host_span, device_span, device_uvector (#7698) @karthikeyann
  • Add gbenchmark for nvtext ngrams functions (#7693) @davidwendt
  • Misc Python/Cython optimizations (#7686) @shwina
  • Add gbenchmark for nvtext tokenize functions (#7684) @davidwendt
  • Add column_device_view to orc writer (#7676) @kaatish
  • cudf_kafka now uses cuDF CMake export targets (CPM) (#7674) @robertmaynard
  • Add gbenchmark for nvtext normalize functions (#7668) @davidwendt
  • Resolve unnecessary import of thrust/optional.hpp in types.hpp (#7667) @vyasr
  • Feature/optimize accessor copy (#7660) @vyasr
  • Fix find_package(cudf) (#7658) @trxcllnt
  • Work-around for gcc7 compile error on Centos7 (#7652) @davidwendt
  • Add in JNI support for count_elements (#7651) @revans2
  • Fix issues with building cudf in a non-conda environment (#7647) @galipremsagar
  • Refactor ConfigureCUDA to not conditionally insert compiler flags (#7643) @robertmaynard
  • Add gbenchmark for converting strings to/from timestamps (#7641) @davidwendt
  • Handle constructing a cudf.Scalar from a cudf.Scalar (#7639) @shwina
  • Add in JNI support for table partition (#7637) @revans2
  • Add explicit fixed_point merge test (#7635) @codereport
  • Add JNI support for IDENTITY hash partitioning (#7626) @revans2
  • Java support on explode_outer (#7625) @sperlingxx
  • Java support of casting string from/to decimal (#7623) @sperlingxx
  • Convert cudf::concatenate APIs to use spans and device_uvector (#7621) @harrism
  • Add gbenchmark for cudf::strings::translate function (#7617) @davidwendt
  • Use file(COPY ) over file(INSTALL ) so cmake output is reduced (#7616) @robertmaynard
  • Use rmm::device_uvector in place of rmm::device_vector for ORC reader/writer and cudf::io::column_buffer (#7614) @vuule
  • Refactor Java host-side buffer concatenation to expose separate steps (#7610) @jlowe
  • Add gbenchmarks for string substrings functions (#7603) @davidwendt
  • Refactor string conversion check (#7599) @ttnghia
  • JNI: Pass names of children struct columns to native Arrow IPC writer (#7598) @firestarman
  • Revert "ENH Fix stale GHA and prevent duplicates " (#7595) @mike-wendt
  • ENH Fix stale GHA and prevent duplicates (#7594) @mike-wendt
  • Fix auto-detecting GPU architectures (#7593) @trxcllnt
  • Reduce cudf library size (#7583) @robertmaynard
  • Optimize cudf::make_strings_column for long strings (#7576) @davidwendt
  • Always build and export the cudf::cudftestutil target (#7574) @trxcllnt
  • Eliminate literal parameters to uvector::set_element_async and device_scalar::set_value (#7563) @harrism
  • Add gbenchmark for strings::concatenate (#7560) @davidwendt
  • Update Changelog Link (#7550) @ajschmidt8
  • Add gbenchmarks for strings replace regex functions (#7541) @davidwendt
  • Add __repr__ for Column and ColumnAccessor (#7531) @shwina
  • Support Decimal DIV changes in cudf (#7527) @razajafri
  • Remove unneeded step parameter from strings::detail::copy_slice (#7525) @davidwendt
  • Use device_uvector, device_span in sort groupby (#7523) @karthikeyann
  • Add gbenchmarks for strings extract function (#7522) @davidwendt
  • Rename ARROW_STATIC_LIB because it conflicts with one in FindArrow.cmake (#7518) @trxcllnt
  • Reduce compile time/size for scan.cu (#7516) @davidwendt
  • Change device_vector to device_uvector in nvtext source files (#7512) @davidwendt
  • Removed unneeded includes from traits.hpp (#7509) @davidwendt
  • FIX Remove random build directory generation for ccache (#7508) @dillon-cullinan
  • xfail failing pytest in pandas 1.2.3 (#7507) @galipremsagar
  • JNI bit cast (#7493) @revans2
  • Combine rolling window function tests (#7480) @mythrocks
  • Prepare Changelog for Automation (#7477) @ajschmidt8
  • Java support for explode position (#7471) @sperlingxx
  • Update 0.18 changelog entry (#7463) @ajschmidt8
  • JNI: Support skipping nulls for collect aggregation (#7457) @firestarman
  • Join APIs that return gathermaps (#7454) @shwina
  • Remove dependence on managed memory for multimap test (#7451) @jrhemstad
  • Use cuFile for Parquet IO when available (#7444) @vuule
  • Statistics cleanup (#7439) @kaatish
  • Add gbenchmarks for strings filter functions (#7438) @davidwendt
  • fixed_point + cudf::binary_operation API Changes (#7435) @codereport
  • Improve string gather performance (#7433) @jlowe
  • Don't use user resource for a temporary allocation in sort_by_key (#7431) @magnatelee
  • Detail APIs for datetime functions (#7430) @magnatelee
  • Replace thrust::max_element with thrust::reduce in strings findall_re (#7428) @davidwendt
  • Add gbenchmark for strings split/split_record functions (#7427) @davidwendt
  • Update JNI build to use CMAKE_CUDA_ARCHITECTURES (#7425) @jlowe
  • Change nvtext::load_vocabulary_file to return a unique ptr (#7424) @davidwendt
  • Simplify type dispatch with device_storage_dispatch (#7419) @codereport
  • Java support for casting of nested child columns (#7417) @razajafri
  • Improve scalar string replace performance for long strings (#7415) @jlowe
  • Remove unneeded temporary device vector for strings scatter specialization (#7409) @davidwendt
  • bitmask_or implementation with bitmask refactor (#7406) @rwlee
  • Add other cudf::strings::replace functions to current strings replace gbenchmark (#7403) @davidwendt
  • Clean up included headers in device_operators.cuh (#7401) @codereport
  • Move nullable index iterator to indexalator factory (#7399) @davidwendt
  • ENH Pass ccache variables to conda recipe & use Ninja in CI (#7398) @Ethyling
  • upgrade maven-antrun-plugin to support maven parallel builds (#7393) @rongou
  • Add gbenchmark for strings find/contains functions (#7392) @davidwendt
  • Use CMAKE_CUDA_ARCHITECTURES (#7391) @robertmaynard
  • Refactor libcudf strings::replace to use make_strings_children utility (#7384) @davidwendt
  • Added in JNI support for out of core sort algorithm (#7381) @revans2
  • Upgrade pandas to 1.2 (#7375) @galipremsagar
  • Rename logical_cast to bit_cast and allow additional conversions (#7373) @ttnghia
  • jitify 2 support (#7372) @cwharris
  • compile_udf: Cache PTX for similar functions (#7371) @gmarkall
  • Add string scalar replace benchmark (#7369) @jlowe
  • Add gbenchmark for strings contains_re/count_re functions (#7366) @davidwendt
  • Update orc reader and writer fuzz tests (#7357) @galipremsagar
  • Improve url_decode performance for long strings (#7353) @jlowe
  • cudf::ast Small Refactorings (#7352) @codereport
  • Remove std::cout and print in the scatter test function EmptyListsOfNullableStrings. (#7342) @ttnghia
  • Use cudf::detail::make_counting_transform_iterator (#7338) @codereport
  • Change block size parameter from a global to a template param. (#7333) @nvdbaranec
  • Partial clean up of ORC writer (#7324) @vuule
  • Add gbenchmark for cudf::strings::to_lower (#7316) @davidwendt
  • Update Java bindings version to 0.19-SNAPSHOT (#7307) @pxLi
  • Move cudf::test::make_counting_transform_iterator to cudf/detail/iterator.cuh (#7306) @codereport
  • Use string literals in fixed_point release_asserts (#7303) @codereport
  • Fix merge conflicts for #7295 (#7297) @ajschmidt8
  • Add UTF-8 chars to create_random_column<string_view> benchmark utility (#7292) @davidwendt
  • Abstracting block reduce and block scan from cuIO kernels with cub apis (#7278) @rgsl888prabhu
  • Build.sh use cmake --build to drive build system invocation (#7270) @robertmaynard
  • Refactor dictionary support for reductions any/all (#7242) @davidwendt
  • Replace stream.value() with stream for stream_view args (#7236) @karthikeyann
  • Interval index and interval_range (#7182) @marlenezw
  • avro reader integration tests (#7156) @cwharris
  • Rework libcudf CMakeLists.txt to export targets for CPM (#7107) @trxcllnt
  • Adding Interval Dtype (#6984) @marlenezw
  • Cleaning up for loops with make_(counting_)transform_iterator (#6546) @codereport

cudf - v0.18.1

Published by GPUtester over 3 years ago

cudf - [NIGHTLY] v0.18.0

Published by rapids-bot[bot] over 3 years ago

🔗 Links

🚨 Breaking Changes

  • Default groupby to sort=False (#7180) @isVoid
  • Add libcudf API for parsing of ORC statistics (#7136) @vuule
  • Replace ORC writer api with class (#7099) @rgsl888prabhu
  • Pack/unpack functionality to convert tables to and from a serialized format. (#7096) @nvdbaranec
  • Replace parquet writer api with class (#7058) @rgsl888prabhu
  • Add days check to cudf::is_timestamp using cuda::std::chrono classes (#7028) @davidwendt
  • Fix default parameter values of write_csv and write_parquet (#6967) @vuule
  • Align Series.groupby API to match Pandas (#6964) @kkraus14
  • Share factorize implementation with Index and cudf module (#6885) @brandon-b-miller

🐛 Bug Fixes

  • Fix null-bounds calculation for ranged window queries (#7568) @mythrocks
  • Remove incorrect std::move call on return variable (#7319) @davidwendt
  • Fix failing CI ORC test (#7313) @vuule
  • Disallow constructing frames from a ColumnAccessor (#7298) @shwina
  • fix java cuFile tests (#7296) @rongou
  • Fix style issues related to NumPy (#7279) @shwina
  • Fix bug when iloc slice terminates at before-the-zero position (#7277) @isVoid
  • Fix copying dtype metadata after calling libcudf functions (#7271) @shwina
  • Move lists utility function definition out of header (#7266) @mythrocks
  • Throw if bool column would cause incorrect result when writing to ORC (#7261) @vuule
  • Use uvector in replace_nulls; Fix sort_helper::grouped_value doc (#7256) @isVoid
  • Remove floating point types from cudf::sort fast-path (#7250) @davidwendt
  • Disallow picking output columns from nested columns. (#7248) @devavret
  • Fix loc for Series with a MultiIndex (#7243) @shwina
  • Fix Arrow column test leaks (#7241) @tgravescs
  • Fix test column vector leak (#7238) @kuhushukla
  • Fix some bugs in java scalar support for decimal (#7237) @revans2
  • Improve assert_eq handling of scalar (#7220) @isVoid
  • Fix missing null_count() comparison in test framework and related failures (#7219) @nvdbaranec
  • Remove floating point types from radix sort fast-path (#7215) @davidwendt
  • Fixing parquet benchmarks (#7214) @rgsl888prabhu
  • Handle various parameter combinations in replace API (#7207) @galipremsagar
  • Export mock aws credentials for s3 tests (#7176) @ayushdg
  • Add MultiIndex.rename API (#7172) @isVoid
  • Fix importing list & struct types in from_arrow (#7162) @galipremsagar
  • Fixing parquet precision writing failing if scale is equal to precision (#7146) @hyperbolic2346
  • Update s3 tests to use moto_server (#7144) @ayushdg
  • Fix JIT cache multi-process test flakiness in slow drives (#7142) @devavret
  • Fix compilation errors in libcudf (#7138) @galipremsagar
  • Fix compilation failure caused by -Wall addition. (#7134) @codereport
  • Add informative error message for sep in CSV writer (#7095) @galipremsagar
  • Add JIT cache per compute capability (#7090) @devavret
  • Implement __hash__ method for ListDtype (#7081) @galipremsagar
  • Only upload packages that were built (#7077) @raydouglass
  • Fix comparisons between Series and cudf.NA (#7072) @brandon-b-miller
  • Handle nan values correctly in Series.one_hot_encoding (#7059) @galipremsagar
  • Add unstack() support for non-multiindexed dataframes (#7054) @isVoid
  • Fix read_orc for decimal type (#7034) @rgsl888prabhu
  • Fix backward compatibility of loading a 0.16 pkl file (#7033) @galipremsagar
  • Decimal casts in JNI became a NOOP (#7032) @revans2
  • Restore usual instance/subclass checking to cudf.DateOffset (#7029) @shwina
  • Add days check to cudf::is_timestamp using cuda::std::chrono classes (#7028) @davidwendt
  • Fix to_csv delimiter handling of timestamp format (#7023) @davidwendt
  • Pin librdkafka to gcc 7 compatible version (#7021) @raydouglass
  • Fix fillna & dropna to also consider np.nan as a missing value (#7019) @galipremsagar
  • Fix round operator's HALF_EVEN computation for negative integers (#7014) @nartal1
  • Skip Thrust sort patch if already applied (#7009) @harrism
  • Fix cudf::hash_partition for decimal32 and decimal64 (#7006) @codereport
  • Fix Thrust unroll patch command (#7002) @harrism
  • Fix loc behaviour when key of incorrect type is used (#6993) @shwina
  • Fix int to datetime conversion in csv_read (#6991) @kaatish
  • fix excluding cufile tests by default (#6988) @rongou
  • Fix java cufile tests when cufile is not installed (#6987) @revans2
  • Make cudf::round for fixed_point when scale = -decimal_places a no-op (#6975) @codereport
  • Fix type comparison for java (#6970) @revans2
  • Fix default parameter values of write_csv and write_parquet (#6967) @vuule
  • Align Series.groupby API to match Pandas (#6964) @kkraus14
  • Fix timestamp parsing in ORC reader for timezones without transitions (#6959) @vuule
  • Fix typo in numerical.py (#6957) @rgsl888prabhu
  • fixed_point_value double-shifts in fixed_point construction (#6950) @codereport
  • fix libcu++ include path for jni (#6948) @rongou
  • Fix groupby agg/apply behaviour when no key columns are provided (#6945) @shwina
  • Avoid inserting null elements into join hash table when nulls are treated as unequal (#6943) @hyperbolic2346
  • Fix cudf::merge gtest for dictionary columns (#6942) @davidwendt
  • Pass numeric scalars of the same dtype through numeric binops (#6938) @brandon-b-miller
  • Fix N/A detection for empty fields in CSV reader (#6922) @vuule
  • Fix rmm_mode=managed parameter for gtests (#6912) @davidwendt
  • Fix nullmask offset handling in parquet and orc writer (#6889) @kaatish
  • Correct the sampling range when sampling with replacement (#6884) @ChrisJar
  • Handle nested string columns with no children in contiguous_split. (#6864) @nvdbaranec
  • Fix columns & index handling in dataframe constructor (#6838) @galipremsagar

📖 Documentation

  • Update readme (#7318) @shwina
  • Fix typo in cudf.core.column.string.extract docs (#7253) @adelevie
  • Update doxyfile project number (#7161) @davidwendt
  • Update 10 minutes to cuDF and CuPy with new APIs (#7158) @ChrisJar
  • Cross link RMM & libcudf Doxygen docs (#7149) @ajschmidt8
  • Add documentation for support dtypes in all IO formats (#7139) @galipremsagar
  • Add groupby docs (#7100) @shwina
  • Update cudf python docstrings with new null representation (<NA>) (#7050) @galipremsagar
  • Make Doxygen comments formatting consistent (#7041) @vuule
  • Add docs for working with missing data (#7010) @galipremsagar
  • Remove warning in from_dlpack and to_dlpack methods (#7001) @miguelusque
  • libcudf Developer Guide (#6977) @harrism
  • Add JNI wrapper for the cuFile API (GDS) (#6940) @rongou

🚀 New Features

  • Support numeric_only field for rank() (#7213) @isVoid
  • Add support for cudf::binary_operation TRUE_DIV for decimal32 and decimal64 (#7198) @codereport
  • Implement COLLECT rolling window aggregation (#7189) @mythrocks
  • Add support for array-like inputs in cudf.get_dummies (#7181) @galipremsagar
  • Default groupby to sort=False (#7180) @isVoid
  • Add libcudf lists column count_elements API (#7173) @davidwendt
  • Implement cudf::group_by (sort) for decimal32 and decimal64 (#7169) @codereport
  • Add encoding and compression argument to CSV writer (#7168) @VibhuJawa
  • cudf::rolling_window SUM support for decimal32 and decimal64 (#7147) @codereport
  • Adding support for explode to cuDF (#7140) @hyperbolic2346
  • Add libcudf API for parsing of ORC statistics (#7136) @vuule
  • update GDS/cuFile location for 0.9 release (#7131) @rongou
  • Add Segmented sort (#7122) @karthikeyann
  • Add cudf::binary_operation NULL_MIN, NULL_MAX & NULL_EQUALS for decimal32 and decimal64 (#7119) @codereport
  • Add scale and value methods to fixed_point (#7109) @codereport
  • Replace ORC writer api with class (#7099) @rgsl888prabhu
  • Pack/unpack functionality to convert tables to and from a serialized format. (#7096) @nvdbaranec
  • Improve digitize API (#7071) @isVoid
  • Add List types support in data generator (#7064) @galipremsagar
  • cudf::scan support for decimal32 and decimal64 (#7063) @codereport
  • cudf::rolling ROW_NUMBER support for decimal32 and decimal64 (#7061) @codereport
  • Replace parquet writer api with class (#7058) @rgsl888prabhu
  • Support contains() on lists of primitives (#7039) @mythrocks
  • Implement cudf::rolling for decimal32 and decimal64 (#7037) @codereport
  • Add ffill and bfill to string columns (#7036) @isVoid
  • Enable round in cudf for DataFrame and Series (#7022) @ChrisJar
  • Extend replace_nulls_policy to string and dictionary type (#7004) @isVoid
  • Add segmented_gather(list_column, gather_list) (#7003) @karthikeyann
  • Add method field to fillna for fixed width columns (#6998) @isVoid
  • Manual merge of branch 0.17 into branch 0.18 (#6995) @shwina
  • Implement cudf::reduce for decimal32 and decimal64 (part 2) (#6980) @codereport
  • Add Ufunc alias look up for appropriate numpy ufunc dispatching (#6973) @VibhuJawa
  • Add pytest-xdist to dev environment.yml (#6958) @galipremsagar
  • Add Index.set_names api (#6929) @galipremsagar
  • Add replace_null API with replace_policy parameter, fixed_width column support (#6907) @isVoid
  • Share factorize implementation with Index and cudf module (#6885) @brandon-b-miller
  • Implement update() function (#6883) @skirui-source
  • Add groupby idxmin, idxmax aggregation (#6856) @karthikeyann
  • Implement cudf::reduce for decimal32 and decimal64 (part 1) (#6814) @codereport
  • Implement cudf.DateOffset for months (#6775) @brandon-b-miller
  • Add Python DecimalColumn (#6715) @shwina
  • Add dictionary support to libcudf groupby functions (#6585) @davidwendt

🛠️ Improvements

  • Update stale GHA with exemptions & new labels (#7395) @mike-wendt
  • Add GHA to mark issues/prs as stale/rotten (#7388) @Ethyling
  • Unpin from numpy < 1.20 (#7335) @shwina
  • Prepare Changelog for Automation (#7309) @galipremsagar
  • Prepare Changelog for Automation (#7272) @ajschmidt8
  • Add JNI support for converting Arrow buffers to CUDF ColumnVectors (#7222) @tgravescs
  • Add coverage for skiprows and num_rows in parquet reader fuzz testing (#7216) @galipremsagar
  • Define and implement more behavior for merging on categorical variables (#7209) @brandon-b-miller
  • Add CudfSeriesGroupBy to optimize dask_cudf groupby-mean (#7194) @rjzamora
  • Add dictionary column support to rolling_window (#7186) @davidwendt
  • Modify the semantics of end pointers in cuIO to match standard library (#7179) @vuule
  • Adding unit tests for fixed_point with extremely large scales (#7178) @codereport
  • Fast path single column sort (#7167) @davidwendt
  • Fix -Werror=sign-compare errors in device code (#7164) @trxcllnt
  • Refactor cudf::string_view host and device code (#7159) @davidwendt
  • Enable logic for GPU auto-detection in cudfjni (#7155) @gerashegalov
  • Java bindings for Fixed-point type support for Parquet (#7153) @razajafri
  • Add Java interface for the new API 'explode' (#7151) @firestarman
  • Replace offsets with iterators in cuIO utilities and CSV parser (#7150) @vuule
  • Add gbenchmarks for reduction aggregations any() and all() (#7129) @davidwendt
  • Update JNI for contiguous_split packed results (#7127) @jlowe
  • Add JNI and Java bindings for list_contains (#7125) @kuhushukla
  • Add Java unit tests for window aggregate 'collect' (#7121) @firestarman
  • verify window operations on decimal with java tests (#7120) @sperlingxx
  • Adds in JNI support for creating a list column from existing columns (#7112) @revans2
  • Build libcudf with -Wall (#7105) @trxcllnt
  • Add column_device_view pointers to EncColumnDesc (#7097) @kaatish
  • Add pyorc to dev environment (#7085) @galipremsagar
  • JNI support for creating struct column from existing columns and fixed bug in struct with no children (#7084) @revans2
  • Fastpath single strings column in cudf::sort (#7075) @davidwendt
  • Upgrade nvcomp to 1.2.1 (#7069) @rongou
  • Refactor ORC ProtobufReader to make it more extendable (#7055) @vuule
  • Add Java tests for decimal casts (#7051) @sperlingxx
  • Auto-label PRs based on their content (#7044) @jolorunyomi
  • Create sort gbenchmark for strings column (#7040) @davidwendt
  • Refactor io memory fetches to use hostdevice_vector methods (#7035) @ChrisJar
  • Spark Murmur3 hash functionality (#7024) @rwlee
  • Fix libcudf strings logic where size_type is used to access INT32 column data (#7020) @davidwendt
  • Adding decimal writing support to parquet (#7017) @hyperbolic2346
  • Add compression="infer" as default for dask_cudf.read_csv (#7013) @rjzamora
  • Correct ORC docstring; other minor cuIO improvements (#7012) @vuule
  • Reduce number of hostdevice_vector allocations in parquet reader (#7005) @devavret
  • Check output size overflow on strings gather (#6997) @davidwendt
  • Improve representation of MultiIndex (#6992) @galipremsagar
  • Disable some pragma unroll statements in thrust sort.h (#6982) @davidwendt
  • Minor cudf::round internal refactoring (#6976) @codereport
  • Add Java bindings for URL conversion (#6972) @jlowe
  • Enable strict_decimal_types in parquet reading (#6969) @sperlingxx
  • Add in basic support to JNI for logical_cast (#6954) @revans2
  • Remove duplicate file array_tests.cpp (#6953) @karthikeyann
  • Add null mask fixed_point_column_wrapper constructors (#6951) @codereport
  • Update Java bindings version to 0.18-SNAPSHOT (#6949) @jlowe
  • Use simplified rmm::exec_policy (#6939) @harrism
  • Add null count test for apply_boolean_mask (#6903) @harrism
  • Implement DataFrame.quantile for datetime and timedelta data types (#6902) @ChrisJar
  • Remove **kwargs from string/categorical methods (#6750) @shwina
  • Refactor rolling.cu to reduce compile time (#6512) @mythrocks
  • Add static type checking via Mypy (#6381) @shwina
  • Update to official libcu++ on Github (#6275) @trxcllnt
cudf - v0.18.0

Published by GPUtester over 3 years ago

Breaking Changes 🚨

  • Default groupby to sort=False (#7180) @isVoid
  • Add libcudf API for parsing of ORC statistics (#7136) @vuule
  • Replace ORC writer api with class (#7099) @rgsl888prabhu
  • Pack/unpack functionality to convert tables to and from a serialized format. (#7096) @nvdbaranec
  • Replace parquet writer api with class (#7058) @rgsl888prabhu
  • Add days check to cudf::is_timestamp using cuda::std::chrono classes (#7028) @davidwendt
  • Fix default parameter values of write_csv and write_parquet (#6967) @vuule
  • Align Series.groupby API to match Pandas (#6964) @kkraus14
  • Share factorize implementation with Index and cudf module (#6885) @brandon-b-miller

Bug Fixes 🐛

  • Remove incorrect std::move call on return variable (#7319) @davidwendt
  • Fix failing CI ORC test (#7313) @vuule
  • Disallow constructing frames from a ColumnAccessor (#7298) @shwina
  • fix java cuFile tests (#7296) @rongou
  • Fix style issues related to NumPy (#7279) @shwina
  • Fix bug when iloc slice terminates at before-the-zero position (#7277) @isVoid
  • Fix copying dtype metadata after calling libcudf functions (#7271) @shwina
  • Move lists utility function definition out of header (#7266) @mythrocks
  • Throw if bool column would cause incorrect result when writing to ORC (#7261) @vuule
  • Use uvector in replace_nulls; Fix sort_helper::grouped_value doc (#7256) @isVoid
  • Remove floating point types from cudf::sort fast-path (#7250) @davidwendt
  • Disallow picking output columns from nested columns. (#7248) @devavret
  • Fix loc for Series with a MultiIndex (#7243) @shwina
  • Fix Arrow column test leaks (#7241) @tgravescs
  • Fix test column vector leak (#7238) @kuhushukla
  • Fix some bugs in java scalar support for decimal (#7237) @revans2
  • Improve assert_eq handling of scalar (#7220) @isVoid
  • Fix missing null_count() comparison in test framework and related failures (#7219) @nvdbaranec
  • Remove floating point types from radix sort fast-path (#7215) @davidwendt
  • Fixing parquet benchmarks (#7214) @rgsl888prabhu
  • Handle various parameter combinations in replace API (#7207) @galipremsagar
  • Export mock aws credentials for s3 tests (#7176) @ayushdg
  • Add MultiIndex.rename API (#7172) @isVoid
  • Fix importing list & struct types in from_arrow (#7162) @galipremsagar
  • Fixing parquet precision writing failing if scale is equal to precision (#7146) @hyperbolic2346
  • Update s3 tests to use moto_server (#7144) @ayushdg
  • Fix JIT cache multi-process test flakiness in slow drives (#7142) @devavret
  • Fix compilation errors in libcudf (#7138) @galipremsagar
  • Fix compilation failure caused by -Wall addition. (#7134) @codereport
  • Add informative error message for sep in CSV writer (#7095) @galipremsagar
  • Add JIT cache per compute capability (#7090) @devavret
  • Implement __hash__ method for ListDtype (#7081) @galipremsagar
  • Only upload packages that were built (#7077) @raydouglass
  • Fix comparisons between Series and cudf.NA (#7072) @brandon-b-miller
  • Handle nan values correctly in Series.one_hot_encoding (#7059) @galipremsagar
  • Add unstack() support for non-multiindexed dataframes (#7054) @isVoid
  • Fix read_orc for decimal type (#7034) @rgsl888prabhu
  • Fix backward compatibility of loading a 0.16 pkl file (#7033) @galipremsagar
  • Decimal casts in JNI became a NOOP (#7032) @revans2
  • Restore usual instance/subclass checking to cudf.DateOffset (#7029) @shwina
  • Add days check to cudf::is_timestamp using cuda::std::chrono classes (#7028) @davidwendt
  • Fix to_csv delimiter handling of timestamp format (#7023) @davidwendt
  • Pin librdkafka to gcc 7 compatible version (#7021) @raydouglass
  • Fix fillna & dropna to also consider np.nan as a missing value (#7019) @galipremsagar
  • Fix round operator's HALF_EVEN computation for negative integers (#7014) @nartal1
  • Skip Thrust sort patch if already applied (#7009) @harrism
  • Fix cudf::hash_partition for decimal32 and decimal64 (#7006) @codereport
  • Fix Thrust unroll patch command (#7002) @harrism
  • Fix loc behaviour when key of incorrect type is used (#6993) @shwina
  • Fix int to datetime conversion in csv_read (#6991) @kaatish
  • fix excluding cufile tests by default (#6988) @rongou
  • Fix java cufile tests when cufile is not installed (#6987) @revans2
  • Make cudf::round for fixed_point when scale = -decimal_places a no-op (#6975) @codereport
  • Fix type comparison for java (#6970) @revans2
  • Fix default parameter values of write_csv and write_parquet (#6967) @vuule
  • Align Series.groupby API to match Pandas (#6964) @kkraus14
  • Fix timestamp parsing in ORC reader for timezones without transitions (#6959) @vuule
  • Fix typo in numerical.py (#6957) @rgsl888prabhu
  • fixed_point_value double-shifts in fixed_point construction (#6950) @codereport
  • fix libcu++ include path for jni (#6948) @rongou
  • Fix groupby agg/apply behaviour when no key columns are provided (#6945) @shwina
  • Avoid inserting null elements into join hash table when nulls are treated as unequal (#6943) @hyperbolic2346
  • Fix cudf::merge gtest for dictionary columns (#6942) @davidwendt
  • Pass numeric scalars of the same dtype through numeric binops (#6938) @brandon-b-miller
  • Fix N/A detection for empty fields in CSV reader (#6922) @vuule
  • Fix rmm_mode=managed parameter for gtests (#6912) @davidwendt
  • Fix nullmask offset handling in parquet and orc writer (#6889) @kaatish
  • Correct the sampling range when sampling with replacement (#6884) @ChrisJar
  • Handle nested string columns with no children in contiguous_split. (#6864) @nvdbaranec
  • Fix columns & index handling in dataframe constructor (#6838) @galipremsagar

Documentation 📖

  • Update readme (#7318) @shwina
  • Fix typo in cudf.core.column.string.extract docs (#7253) @adelevie
  • Update doxyfile project number (#7161) @davidwendt
  • Update 10 minutes to cuDF and CuPy with new APIs (#7158) @ChrisJar
  • Cross link RMM & libcudf Doxygen docs (#7149) @ajschmidt8
  • Add documentation for support dtypes in all IO formats (#7139) @galipremsagar
  • Add groupby docs (#7100) @shwina
  • Update cudf python docstrings with new null representation (<NA>) (#7050) @galipremsagar
  • Make Doxygen comments formatting consistent (#7041) @vuule
  • Add docs for working with missing data (#7010) @galipremsagar
  • Remove warning in from_dlpack and to_dlpack methods (#7001) @miguelusque
  • libcudf Developer Guide (#6977) @harrism
  • Add JNI wrapper for the cuFile API (GDS) (#6940) @rongou

New Features 🚀

  • Support numeric_only field for rank() (#7213) @isVoid
  • Add support for cudf::binary_operation TRUE_DIV for decimal32 and decimal64 (#7198) @codereport
  • Implement COLLECT rolling window aggregation (#7189) @mythrocks
  • Add support for array-like inputs in cudf.get_dummies (#7181) @galipremsagar
  • Default groupby to sort=False (#7180) @isVoid
  • Add libcudf lists column count_elements API (#7173) @davidwendt
  • Implement cudf::group_by (sort) for decimal32 and decimal64 (#7169) @codereport
  • Add encoding and compression argument to CSV writer (#7168) @VibhuJawa
  • cudf::rolling_window SUM support for decimal32 and decimal64 (#7147) @codereport
  • Adding support for explode to cuDF (#7140) @hyperbolic2346
  • Add libcudf API for parsing of ORC statistics (#7136) @vuule
  • update GDS/cuFile location for 0.9 release (#7131) @rongou
  • Add Segmented sort (#7122) @karthikeyann
  • Add cudf::binary_operation NULL_MIN, NULL_MAX & NULL_EQUALS for decimal32 and decimal64 (#7119) @codereport
  • Add scale and value methods to fixed_point (#7109) @codereport
  • Replace ORC writer api with class (#7099) @rgsl888prabhu
  • Pack/unpack functionality to convert tables to and from a serialized format. (#7096) @nvdbaranec
  • Improve digitize API (#7071) @isVoid
  • Add List types support in data generator (#7064) @galipremsagar
  • cudf::scan support for decimal32 and decimal64 (#7063) @codereport
  • cudf::rolling ROW_NUMBER support for decimal32 and decimal64 (#7061) @codereport
  • Replace parquet writer api with class (#7058) @rgsl888prabhu
  • Support contains() on lists of primitives (#7039) @mythrocks
  • Implement cudf::rolling for decimal32 and decimal64 (#7037) @codereport
  • Add ffill and bfill to string columns (#7036) @isVoid
  • Enable round in cudf for DataFrame and Series (#7022) @ChrisJar
  • Extend replace_nulls_policy to string and dictionary type (#7004) @isVoid
  • Add segmented_gather(list_column, gather_list) (#7003) @karthikeyann
  • Add method field to fillna for fixed width columns (#6998) @isVoid
  • Manual merge of branch 0.17 into branch 0.18 (#6995) @shwina
  • Implement cudf::reduce for decimal32 and decimal64 (part 2) (#6980) @codereport
  • Add Ufunc alias look up for appropriate numpy ufunc dispatching (#6973) @VibhuJawa
  • Add pytest-xdist to dev environment.yml (#6958) @galipremsagar
  • Add Index.set_names api (#6929) @galipremsagar
  • Add replace_null API with replace_policy parameter, fixed_width column support (#6907) @isVoid
  • Share factorize implementation with Index and cudf module (#6885) @brandon-b-miller
  • Implement update() function (#6883) @skirui-source
  • Add groupby idxmin, idxmax aggregation (#6856) @karthikeyann
  • Implement cudf::reduce for decimal32 and decimal64 (part 1) (#6814) @codereport
  • Implement cudf.DateOffset for months (#6775) @brandon-b-miller
  • Add Python DecimalColumn (#6715) @shwina
  • Add dictionary support to libcudf groupby functions (#6585) @davidwendt

Improvements 🛠️

  • Update stale GHA with exemptions & new labels (#7395) @mike-wendt
  • Add GHA to mark issues/prs as stale/rotten (#7388) @Ethyling
  • Unpin from numpy < 1.20 (#7335) @shwina
  • Prepare Changelog for Automation (#7309) @galipremsagar
  • Prepare Changelog for Automation (#7272) @ajschmidt8
  • Add JNI support for converting Arrow buffers to CUDF ColumnVectors (#7222) @tgravescs
  • Add coverage for skiprows and num_rows in parquet reader fuzz testing (#7216) @galipremsagar
  • Define and implement more behavior for merging on categorical variables (#7209) @brandon-b-miller
  • Add CudfSeriesGroupBy to optimize dask_cudf groupby-mean (#7194) @rjzamora
  • Add dictionary column support to rolling_window (#7186) @davidwendt
  • Modify the semantics of end pointers in cuIO to match standard library (#7179) @vuule
  • Adding unit tests for fixed_point with extremely large scales (#7178) @codereport
  • Fast path single column sort (#7167) @davidwendt
  • Fix -Werror=sign-compare errors in device code (#7164) @trxcllnt
  • Refactor cudf::string_view host and device code (#7159) @davidwendt
  • Enable logic for GPU auto-detection in cudfjni (#7155) @gerashegalov
  • Java bindings for Fixed-point type support for Parquet (#7153) @razajafri
  • Add Java interface for the new API 'explode' (#7151) @firestarman
  • Replace offsets with iterators in cuIO utilities and CSV parser (#7150) @vuule
  • Add gbenchmarks for reduction aggregations any() and all() (#7129) @davidwendt
  • Update JNI for contiguous_split packed results (#7127) @jlowe
  • Add JNI and Java bindings for list_contains (#7125) @kuhushukla
  • Add Java unit tests for window aggregate 'collect' (#7121) @firestarman
  • verify window operations on decimal with java tests (#7120) @sperlingxx
  • Adds in JNI support for creating a list column from existing columns (#7112) @revans2
  • Build libcudf with -Wall (#7105) @trxcllnt
  • Add column_device_view pointers to EncColumnDesc (#7097) @kaatish
  • Add pyorc to dev environment (#7085) @galipremsagar
  • JNI support for creating struct column from existing columns and fixed bug in struct with no children (#7084) @revans2
  • Fastpath single strings column in cudf::sort (#7075) @davidwendt
  • Upgrade nvcomp to 1.2.1 (#7069) @rongou
  • Refactor ORC ProtobufReader to make it more extendable (#7055) @vuule
  • Add Java tests for decimal casts (#7051) @sperlingxx
  • Auto-label PRs based on their content (#7044) @jolorunyomi
  • Create sort gbenchmark for strings column (#7040) @davidwendt
  • Refactor io memory fetches to use hostdevice_vector methods (#7035) @ChrisJar
  • Spark Murmur3 hash functionality (#7024) @rwlee
  • Fix libcudf strings logic where size_type is used to access INT32 column data (#7020) @davidwendt
  • Adding decimal writing support to parquet (#7017) @hyperbolic2346
  • Add compression="infer" as default for dask_cudf.read_csv (#7013) @rjzamora
  • Correct ORC docstring; other minor cuIO improvements (#7012) @vuule
  • Reduce number of hostdevice_vector allocations in parquet reader (#7005) @devavret
  • Check output size overflow on strings gather (#6997) @davidwendt
  • Improve representation of MultiIndex (#6992) @galipremsagar
  • Disable some pragma unroll statements in thrust sort.h (#6982) @davidwendt
  • Minor cudf::round internal refactoring (#6976) @codereport
  • Add Java bindings for URL conversion (#6972) @jlowe
  • Enable strict_decimal_types in parquet reading (#6969) @sperlingxx
  • Add in basic support to JNI for logical_cast (#6954) @revans2
  • Remove duplicate file array_tests.cpp (#6953) @karthikeyann
  • Add null mask fixed_point_column_wrapper constructors (#6951) @codereport
  • Update Java bindings version to 0.18-SNAPSHOT (#6949) @jlowe
  • Use simplified rmm::exec_policy (#6939) @harrism
  • Add null count test for apply_boolean_mask (#6903) @harrism
  • Implement DataFrame.quantile for datetime and timedelta data types (#6902) @ChrisJar
  • Remove **kwargs from string/categorical methods (#6750) @shwina
  • Refactor rolling.cu to reduce compile time (#6512) @mythrocks
  • Add static type checking via Mypy (#6381) @shwina
  • Update to official libcu++ on Github (#6275) @trxcllnt
cudf - v0.17.0

Published by GPUtester almost 4 years ago

v0.17.0 Release

cudf - v0.16.0

Published by GPUtester almost 4 years ago

v0.16.0 Release

cudf - v0.15.0

Published by raydouglass about 4 years ago

v0.15.0 Release

Package Rankings
Top 5.32% on Pypi.org
Top 8.17% on Proxy.golang.org
Top 4.8% on Repo1.maven.org