cupy

NumPy & SciPy for GPU

MIT License

cupy - v2.4.0

Published by mitmul over 6 years ago

This is the release note of v2.4.0. See here for the complete list of solved issues and merged PRs.

Documentation

  • Improve the document of rand/randn with examples (#911)
  • Fix typo (#947)

Bug fix

  • Fix array index overflow (#925)
  • Fix scalar broadcast (#936)

Installation

  • Exclude Cython files from sdist (#929)
cupy - v4.0.0b3

Published by kmaehashi over 6 years ago

This is the release of v4.0.0b3. See here for the complete list of solved issues and merged PRs.

Highlights

  • cupy.random.shuffle performance has been improved. See #603 for details.
  • More functions for sparse matrix have been added.

New Features

  • Fast shuffle (#603, thanks @anaruse!)
  • Implement sum for cupy.sparse (#712)
  • Implement conversion to complex scalar (#783, thanks @kohr-h!)
  • Support scalar values in atleast_nd (#819)
  • Implement size function (#827)
  • Implement eliminate_zeros for csc matrix (#831)
  • Implement rand for cupy.sparse (#836)
  • Implement coo initializer for SciPy sparse matrix (#857)

Improvements

  • Accept None in set_allocator and set_pinned_memory_allocator (#885)

Bug Fixes

  • Fix repeat that behaved differently from NumPy (#670)
  • Fix return type of linalg.norm for complex input (#781, thanks @kohr-h!)
  • Fix ctxGetCurrent (#837)
  • Make a copy when SVD is calculated (#844)
  • Fix memory pool to correctly detect being not established (#856)
  • Fix out argument in fusion (#868)
  • Fix cupy.size overflow in 32-bit (#883)
  • Fix downcast of size in CArray and CIndexer (#892)

Examples

  • Add an example of option pricing with multiple GPUs (#513)
  • Improve the conjugate gradient example (#852)
  • Fix typo (#858, thanks @juniorrojas!)

Documentation

  • Add reference to paper (#867)
  • Fix docs to use NumPy 1.14 textual representation (#876)
  • Fix memory allocator docs (#887)

Installation

  • Change NVCC compiler options for CUDA9 (#835, thanks @anaruse!)

Tests

  • Import testing/parameterized.py from Chainer (#784)
  • Remove unnecessary try catch (#815)
cupy - v2.3.0

Published by bkvogel over 6 years ago

This is the release of v2.3.0. See here for the complete list of solved issues and merged PRs.

New Features

  • Support None as an argument for clip method (#807)
  • Implement coo initializer for SciPy sparse matrix (#859)

Bug Fixes

  • Make order=None work like default order (#773, thanks @kohr-h!)
  • Fix repeat behaving differently from NumPy (#848)
  • Fix ctxGetCurrent not returning context ptr (#855)
  • Make a copy when SVD is calculated (#860)
  • Fix memory pool to correctly detect that memory pool is not established (#861)
  • Fix out argument in fusion (#871)
  • Fix downcast of size in CArray and CIndexer (#897)

Examples

  • Fix typo (#863, thanks @juniorrojas!)

Documentation

  • Fix docs to use NumPy 1.14 textual representation (#879)
  • Add reference to paper (#881)

Installation

  • Change nvcc compiler options for CUDA9 (#880, thanks @anaruse!)

Tests

  • Import testing/parameterized.py from Chainer (#841)
  • Remove unnecessary try catch (#862)
cupy - v4.0.0b2

Published by delta2323 almost 7 years ago

This is the release of v4.0.0b2. See here for the complete list of solved issues and merged PRs.

New Features

  • LSQR (#745, thanks @KotaroSetoyama!)
  • Incorporate persistent RNN functions of cuDNN v6 (#549, thanks @aonotas!)
  • cupy.moveaxis (#684, thanks @fukatani!)
  • default_casting option to ufuncs (#720)

Improvements

  • Support None as an argument for clip method (#802)

Bug Fixes

  • Fix matmul to raise ValueError on invalid shapes (#737)
  • Fix RandomState.choice reproducibility (#741)
  • Fix a bug in stack (#749)
  • Make order=None work like default order (#764, thanks @kohr-h!)
  • Fix to correctly display filename of CUDA dump (#777)
  • Fix a bug about grouped convolution (#785, thanks @anaruse!)
  • Fix order=None of unravel_index (#791, thanks @Hakuyume!)
  • Cast CUPY_SEED environment variable to uint64 (#805, thanks @toslunar!)

Tests

  • Add test cases for tie of argmin and argmax (#774)
  • Fix checks for exceptions with inheritance (#788)
  • Skip a test using numpy.stack in NumPy 1.9 (#803)
  • Skip a test using numpy.matmul in NumPy 1.9 (#817)

Documentation

  • Improve dependency description in documentation (#767)
  • Fix typo in environment variable reference (#769)
  • Document Compute Capability requirement (#772)
  • Improve documentation of ElementwiseKernel.__call__ (#786)
  • Add documentation for accept_error argument in testing.numpy_cupy_raises (#787)
  • Fix wrong documentation in testing.numpy_cupy_equal (#804)
  • Expose signatures of methods to reference (#808)

Others

  • Add pytest cache directory to gitignore (#797)
cupy - v2.2.0

Published by beam2d almost 7 years ago

This is the release of v2.2.0. See here for the complete list of solved issues and merged PRs.

New Features

  • Add default casting option to ufunc (#812)

Improvements

  • Remove cuDNN overhead (#748)

Bug Fixes

  • Fix RandomState.choice reproducibility (#775)
  • Fix to correctly display filename of CUDA dump (#778)
  • Fix stack function bug (#798)
  • Fix matmul to raise ValueError on invalid shapes (#816)
  • Cast CUPY_SEED environment variable to uint64 (#822, thanks @toslunar!)
  • Stop using dtype option of randint which is introduced in NumPy v1.11 (#830)

Documentation

  • Improve dependency description of docs (#770)
  • Fix typo in environment variable reference (#771)
  • Add docs for missing argument (#790)
  • Expose signatures to Reference (#809)
  • Fix ElementwiseKernel.__call__ docs (#810)
  • Fix wrong comment in test helper (#814)

Tests

  • Add test for argmin/argmax tie (#776)
  • Fix test to pass with NumPy 1.9 (#806)
  • Allow derived errors to pass equality tests (#811)
  • Fix matmul test for old NumPy (#818)
  • Add pytest cache directory to gitignore (#800)
cupy - v2.1.0.1

Published by beam2d almost 7 years ago

This is a hotfix for v2.1. It fixes issue #766, in which cupy.random functions raise an error when either CUPY_SEED or CHAINER_SEED is set. The issue was fixed in #805, which has been cherry-picked for this hotfix.

This release does not contain any other updates from v2.1.0.

cupy - v2.1.0

Published by hvy almost 7 years ago

This is the release of v2.1.0. See here for the complete list of solved issues and merged PRs.

New features

  • Add argpartition (#608)
  • Add window functions (#612, thanks @ishihara1989!)
    • blackman, hamming, hanning
  • Support sparse.coo_matrix initialization with other types of sparse matrices (#626)
  • Line memory profiler using memory hook and traceback (#630)
  • Support dtype argument in random.randint (#706)
  • cuDNN grouped convolution (#721, thanks @anaruse!)

Improvements

  • Performance improvements
    • Optimize sparse.csc_matrix.__mul__ (#625)
    • Cythonize memory hook (#728)
  • Support uint32 sampling up to 0xffffffff in random.RandomState.interval (#633)
  • Fix random.RandomState.seed to only accept integer types (#709)
  • Fix typo in IndexError error message (#683)
  • Fix interface for cuDNN find algorithm APIs (#664)

Bug fixes

  • Fix OverflowError passing large integer to elementwise operation (#615)
  • Fix indexing zero-dimensional array with boolean mask (#645)
  • Setup Python’s builtin random state in testing.fix_random (#648)
  • Use v6 RNN API when using cuDNN7 to avoid incompatibility (#665, thanks @anaruse!)
  • Set arch option for NVRTC, as the option is necessary on some GPUs (#696, thanks @grafi-tt!)
  • Fix memory pool for multi-threaded applications (#697)
  • Fix var and std to correctly handle ddof argument (#711, thanks @stevendbrown!)
  • Fix advanced indexing to not alter the indices (#723, thanks @yuyu2172!)

Documentation

  • Fix a link in README.md to the contribution guide (#629)
  • Remove unrelated “see also” from testing.numpy_cupy_raises (#637, thanks @Hakuyume!)
  • Write note about environment variables for installation (#641)
  • Fix reference page of linalg (#651)
  • Fix doctest for Python 3.5 (#663)
  • Add intersphinx mapping to Chainer (#666)
  • Fix typo and heading in documentation (#667)
  • Update testing section in the contribution guide (#716)
  • Fix a link in README.md to the forum (#754, thanks @muupan!)
  • Fix incorrect heading “CuPy” instead of “NumPy” in license page (#674)

Test

  • Use the latest Cython in Travis CI (#636)
  • Fix typo (#647, thanks @Hakuyume!)
  • Move to PyTest
    • Move to PyTest (#659)
    • Remove nose dependency in tests (#676)
    • Use pytest-warnings to check deprecated warnings (#729)
  • Fix doctest for Python 3.5 (#663)
  • Allow filtering test cases by number of GPUs with CUPY_TEST_GPU_LIMIT environment variable (#677)
  • Ignore ComplexWarning in numpy.pad for NumPy 1.11 or older (#690)
  • Fix NumPy warning for bool and complex operations (#708)
  • Fix test of where to use different seeds for different arrays (#710)
  • Skip some dtypes in test_einsum (#740)
  • Skip some tests for old NumPy (#746)

Others

  • Improve version embedding (#652)
cupy - v4.0.0b1

Published by kmaehashi almost 7 years ago

This is the release of v4.0.0b1. See here for the complete list of solved issues and merged PRs.

Announcements

As the version number indicates, we decided to name the next major version of CuPy v4 instead of v3 to align the versioning with Chainer.
From this version, you can install compatible versions of Chainer and CuPy by specifying the same version number for both.

New features

  • Add FFT functions under cupy.fft (#477)
    • Standard FFTs: fft, ifft, fft2, ifft2, fftn, ifftn
    • Real FFTs: rfft, irfft, rfft2, irfft2, rfftn, irfftn
    • Hermitian FFTs: hfft, ihfft
    • Helper routines: fftfreq, rfftfreq, fftshift, ifftshift
  • Add random.RandomState.tomaxint (#389)
  • Add sparse.csr_matrix.eliminate_zeros and sparse.coo_matrix.eliminate_zeros (#398)
  • Add linalg.tensorinv (#464)
  • Add unravel_index (#632, thanks @Hakuyume!)
  • Add percentile (#643)
  • Add random.set_random_state (#704)
  • Support ellipsis in einsum (#410, thanks @fukatani!)
  • Support dtype argument in random.randint (#567)
  • Support sparse.coo_matrix initialization with other types of sparse matrices (#573)
  • Better CUDA support
    • Change max dimension size of CUDA grid to make use of Compute Capability >= 3 (#616, thanks @anaruse!)
    • Support CUDA stream with stream memory pool (#306, #732)
    • cuDNN grouped convolution (#581, thanks @anaruse!)

Bug fixes

  • Fix indexing zero-dimensional array with boolean mask (#580)
  • Fix memory pool for multi-threaded applications (#606)
  • Setup Python’s builtin random state in testing.fix_random (#640)
  • Use v6 RNN API when using cuDNN7 to avoid incompatibility (#660, thanks @anaruse!)
  • Set arch option for NVRTC, as the option is necessary on some GPUs (#687, thanks @grafi-tt!)
  • Fix var and std to correctly handle ddof argument (#693, thanks @stevendbrown!)
  • Fix advanced indexing to not alter the indices (#713, thanks @yuyu2172!)
  • Fix bit-width issue in random.RandomState.tomaxint for Windows (#658)

Improvements

  • Performance improvements
    • Improved performance of concatenate by using continuous copies (#452, thanks @uchida!)
    • Optimize sparse.csc_matrix.__mul__ (#572)
    • Cythonize cuDNN wrapper (#512)
    • Cythonize memory hook (#722)
    • Avoid implicit conversion into PyInt in linear_launch (#673)
    • Eliminate a redundant check in memory pool (#731)
  • Support uint32 sampling up to 0xffffffff in random.RandomState.interval (#583)
  • Fix random.RandomState.seed to only accept integer types (#688)
  • Fix typo in IndexError error message (#681)
  • Fix interface for cuDNN find algorithm APIs (#624)

Examples

  • Add an example of option pricing using Monte-Carlo simulation (#493)

Documentation

  • Update testing section in the contribution guide (#671)
  • Write note about environment variables for installation (#534)
  • Remove unrelated “see also” from testing.numpy_cupy_raises (#634, thanks @Hakuyume!)
  • Fix reference page of linalg (#650)
  • Fix typo and heading in documentation (#654)
  • Add intersphinx mapping to Chainer (#655)
  • Fix a link in README.md to the contribution guide (#628)
  • Fix a link in README.md to the forum (#752, thanks @muupan!)
  • Fix incorrect heading “CuPy” instead of “NumPy” in license page (#656)

Test

  • Move to PyTest
    • Move to PyTest (#623)
    • Remove nose dependency in tests (#672)
    • Use pytest-warnings to check deprecated warnings (#675)
  • Fix NumPy warning for bool and complex operations (#496)
  • Use the latest Cython in Travis CI (#597)
  • Fix typo (#631, thanks @Hakuyume!)
  • Fix doctest for Python 3.5 (#644)
  • Allow filtering test cases by number of GPUs with CUPY_TEST_GPU_LIMIT environment variable (#662)
  • Fix test_einsum (#679)
  • Ignore ComplexWarning in numpy.pad for NumPy 1.11 or older (#689)
  • Fix test of where to use different seeds for different arrays (#703)
  • Avoid deprecation warnings (#718)
  • Skip some dtypes in test_einsum (#726)
  • Skip test_fft for NumPy 1.9 or older (#727)
  • Skip some tests for old NumPy (#744)

Others

  • Improve version embedding (#639)
cupy - v2.0.0

Published by bkvogel about 7 years ago

This is a major release of CuPy v2.0.0. All of the updates since the previous major version (v1.0.0) can be found in the release notes below:

Important Updates

Supports the latest versions of the following libraries

  • CUDA9 support (#353, thanks @anaruse!)
  • cuDNN7 support (#362, thanks @anaruse!)
  • NCCL2 support (#363, thanks @anaruse!)
  • NumPy 1.13 (#347)

In v2.0.0a1

  • We started using NVRTC instead of NVCC for kernel compilation. This change enables CuPy to run in an environment where CUDA is installed but NVCC is not available. Note that some features depending on Thrust (e.g. sorting functions) cannot be used if NVCC was not available at installation time.
  • Many functions for sorting, linear algebra, and others are added.
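
Whether Thrust-dependent features can be built therefore depends on NVCC being present when CuPy is installed. The following is a minimal sketch of such a check, not CuPy's actual setup logic. It assumes the compiler is looked up first through the NVCC environment variable (support for which is listed under Installation below) and then on PATH; the helper name find_nvcc is hypothetical.

```python
import os
import shutil


def find_nvcc(environ=None):
    """Return the path to an nvcc executable, or None if none is found.

    Hypothetical helper for illustration only; CuPy's real installer
    may resolve the compiler differently.
    """
    environ = os.environ if environ is None else environ
    # Assumption: an explicit NVCC environment variable takes priority
    # over a plain "nvcc" lookup on PATH.
    command = environ.get("NVCC") or "nvcc"
    return shutil.which(command, path=environ.get("PATH"))


if __name__ == "__main__":
    nvcc = find_nvcc()
    if nvcc is None:
        print("nvcc not found: Thrust-based features would be skipped")
    else:
        print("nvcc found at", nvcc)
```

A result of None corresponds to the situation described above: kernels can still be compiled at run time through NVRTC, but features that need NVCC at build time are unavailable.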

In v2.0.0b1

  • Sparse matrix. cupy.sparse is a module that implements scipy.sparse API using CUDA and cuSPARSE. We now have basic features for using sparse matrices on GPU.
  • New memory allocator (#168). The memory pool implementation is greatly updated. It is based on best-fit allocation with coalescing. When there are a large number of allocations with different sizes (e.g. NLP applications), the memory usage is improved and the number of re-allocations is reduced (which also reduces the running time).

In v2.0.0rc1

  • Complex numbers (#232)
  • Many new functions.

Bug fixes

  • Fix cupy.nonzero for corner cases (#504)
  • Fix simple reduction for corner cases (#505)
  • Fix multithread problem in PooledMemory (#507)
  • Resolve dealloc problem and multithread problem in PinnedMemory (#510)
  • Avoid using global state in RandomState.choice (#560)
  • Fix broadcast for corner cases (#577)
  • Fix csrmm2 to support transa (#601)
  • Fix csrmv (#607)

Improvements

  • Fix get_array_module to be aware of spmatrix (#586)
  • Show NVRTC error code (#538)
  • Optimize RandomState.interval (#585)
  • Fix random.normal double memory consumption (#592)
  • Check kernel name validity (#596)

Installation

  • Avoid Cython 0.27.0 (#579)
  • Import memory_hooks (#506)
  • Support NVCC environment variable (#537)

Documentation

  • Fix warnings (#535)
  • Add documentation of cupy.all and cupy.any function (#514)
  • Fix documentation of fusion functions (#517)
  • Treat sphinx warnings as errors (#519)
  • Correct URLs in documentation (#561, thanks @aonotas!)
  • Fix Cython requirement for documentation build (#566)

Tests

  • Fix doctest warnings (#500)
  • Use mock.patch instead of directly replacing function with Mock (#610)
  • Remove print() in tests (#509)
  • Travis fails with Cython 0.27. Use Cython 0.26.1 for a while (#539)
  • Add corner test cases for indexing (#576)
  • Add unit tests for csrgemm (#602)

Others

  • Avoid duplicate loop index (#520)
cupy - v3.0.0a1

Published by gwtnb about 7 years ago

This is the release of CuPy v3.0.0a1. See here for the complete list of solved issues and merged PRs.

New features

  • Memory pool is now used as the default allocator even if CuPy is used without Chainer (#472).
  • Add line memory profiler using memory hook and traceback (#265)
  • Add cuDNN support for dropout. (#479)
  • Add cudnnGetTensor4dDescriptor for fp16 BatchNormalization support in Chainer (#492, thanks @anaruse!)
  • Add Tensor-Core support (cuDNN and cuBLAS) (#494 and #495, thanks @anaruse!)
  • Add window functions (#555, thanks @ishihara1989!)
  • Add cupy.sparse.random (#557)
  • Add cupy.argpartition (#294)

Bug fixes

  • Fix multithread problem in PooledMemory (#480)
  • Resolve dealloc problem and multithread problem in PinnedMemory (#481)
  • Fix cupy.nonzero for corner cases (#498)
  • Fix simple reduction for corner cases (#499)
  • Fix broadcast for corner cases (#543)
  • Fix broadcast_arrays return type (#545)
  • Avoid using global state in RandomState.choice (#556)
  • Fix csrmm2 to support transa (#565)
  • Fix csrmv (#571)
  • Avoid using dtype option in numpy.random.randint which is introduced in NumPy v1.11 (#574)

Improvements

  • Fix get_array_module to be aware of spmatrix (#568)
  • Use vector to improve free memory searching in malloc (#476)
  • Fix Cython warning on variable declaration (#491)
  • Check kernel name validity (#522)
  • Show NVRTC error code (#531)
  • Optimize RandomState.interval (#559)
  • Fix random.normal double memory consumption (#562)

Installation

  • Import memory_hooks (#502)
  • Avoid Cython 0.27.0 (#550)
  • Change minimum Cython version to 0.26.1 (#365, #530, #548)
  • Support NVCC environment variable (#501)

Documentation

  • Fix documentation of fusion functions (#497)
  • Add documentation of cupy.all and cupy.any function (#511)
  • Correct URLs in documentation (#547)
  • Fix typo (#614, thanks @fukatani!)

Examples

  • Add an example of option pricing using Black-Scholes equation (#473)
cupy - v1.0.3

Published by mitmul about 7 years ago

This release includes bug fixes and improvements to the documentation and tests. See the list for the complete list of solved issues and merged PRs.

Bug fixes

  • Avoid decoding nvcc output with UTF-8 to remove UnicodeDecodeError. (#378, #379)
  • Fix bug in view with different itemsize. (#403, thanks @boeddeker!)
  • Avoid calling Python methods in __dealloc__; use __del__ instead. (#411)
  • Fix ndarray.view when the itemsize of the dtype changes. (#416)
  • Fix inconsistency of ndarray.diagonal between NumPy and CuPy. (#436)

Improvements

  • Make a compilation error readable. (#380)
  • Add semicolons to the reduction kernel template. (#396)

Documentation

  • Remove unsupported strides argument from docstring. (#366)
  • Hide source link for alien objects. (#373)
  • Fix the document of matmul. (#412)
  • Use double backslashes in str literals. (#429)
  • Clear doctest warnings. (#457)
  • Sort out navigation menu. (#460)
  • Fix a grammatical error in tutorial. (#463)

Tests

  • Use randint instead of random_integer that is deprecated. (#430)
  • Add testing.assert_warns and test deprecation warning of Memory.free_all_free. (#431)
  • Skip some tests for RandomState when NumPy < 1.11.0. (#438)
  • Loosen the tolerance of tests for binary operators. (#461)
  • Fix typo in test names. (#395)
cupy - v2.0.0rc1

Published by bkvogel about 7 years ago

This is the release of CuPy v2.0.0rc1. See here for the complete list of solved issues and merged PRs.

Changes that break compatibility

  • Change the default value of the order argument of copy from 'C' to 'K' (#159)
  • Add order and subok arguments to array (#167). It breaks the compatibility of positional arguments.
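
The practical effect of both changes can be sketched with NumPy, which runs without a GPU. This assumes CuPy follows NumPy's semantics for these arguments, which is the stated intent of the changes; it is an illustration, not a test of CuPy itself.

```python
import numpy as np

# A Fortran-ordered (column-major) source array.
a = np.asfortranarray(np.arange(6).reshape(2, 3))

# order="C" always produces a row-major copy.
c_copy = a.copy(order="C")
# order="K" keeps the memory layout of the source as closely as possible.
k_copy = a.copy(order="K")

print(c_copy.flags["C_CONTIGUOUS"], c_copy.flags["F_CONTIGUOUS"])
print(k_copy.flags["C_CONTIGUOUS"], k_copy.flags["F_CONTIGUOUS"])

# Passing everything after the data as keywords keeps a call to array()
# independent of where new parameters such as order and subok are inserted.
b = np.array([1, 2, 3], dtype=np.float32, copy=True, ndmin=2)
print(b.shape)
```

Code that relied on copy returning a C-contiguous array can pass order='C' explicitly, and calls to array that passed arguments positionally are safer when rewritten with keywords as above.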

New features

  • Complex numbers (#232)
  • Memory hook (#264). It can be used to observe the memory allocation/deallocation events.
  • New functions
    • Complex routines: angle, conj, imag, real (#232)
    • einsum (#199, thanks @fukatani!)
    • Linear algebra: linalg.solve (#207), linalg.tensorsolve (#215), linalg.inv (#441), linalg.pinv (#459)
    • Random numbers: random.shuffle (#216, thanks @KotaroSetoyama!)
    • Sorting: partition (#270)
  • New features in sparse matrices
    • Support dia_matrix (#313, #321, #320, #450)
    • Sparse matrix creation methods: eye (#399), spdiags (#388) and identity (#358)
    • csr_matrix and csc_matrix are improved: __mul__ (#239), __rmul__ (#300), __getitem__ (#240, #301, #302), dot (#351, #352)
    • Initializers of csr_matrix, csc_matrix, and coo_matrix support shape argument (#316, #375)
    • Sparse matrices can have duplicated elements (#326, #371)
    • order argument in toarray method of csc and csr (#311)
    • __pow__ (#359)
    • Conversion from a dense array to a sparse matrix (#335)
    • Support conversion from scipy.sparse matrix to cupy.sparse (#370)
  • Added support for new libraries
    • NumPy 1.13 (#347)
    • CUDA9 support (#353, thanks @anaruse!)
    • cuDNN7 support (#362, thanks @anaruse!)
    • NCCL2 support (#363, thanks @anaruse!)
  • argsort for arrays of rank two or more (#288)
  • Fix race-condition on memory pool (#382)
  • Implemented copy option of array conversion methods and wrote tests (#408)
  • Enable saving CUDA source with environment variable (#415)
  • Basic support of CUDA unified memory (#447)
  • Use original function name as fusion kernel name (#448)
  • Support replace=False in random.choice (#453)
  • Add a sync option to time_range (#474, thanks @anaruse!)

Bug fixes

  • Fix bug of empty coo_matrix (#328)
  • Fix default behavior of methods in spmatrix (#356)
  • Made dummy implementation to prevent infinite loop (#364)
  • Avoid calling Python methods in __dealloc__(); use __del__() instead (#381)
  • Fix race-condition on memory pool (#382)
  • Fix view when the itemsize of the dtype changes (#406, thanks @boeddeker!)
  • Use double backslash in str literal (#418)
  • Improved pow test (#421)
  • Use randint instead of random_integer, which is deprecated (#425)
  • Fix diagonal (#428, thanks @fukatani!)
  • Use six.assertRegex (#432)
  • Fix for NumPy 1.13 (#445)
  • Fix tocsc behavior for an empty dia matrix (#451)

Improvements

  • Tell the memory size when cudaErrorMemoryAllocation occurred (#314)
  • Simplify nogil (#164)
  • Skip cross compile on setup.py develop to build faster (#309)
  • Remove device memory allocation out of memory pool (#337)
  • Avoid importing NumPy docstring (#355)
  • Improve header handling (#367)
  • Remove redundant code in cupy_thrust.cu (#369)
  • Improve _tril() and _triu() with an ElementwiseKernel (#377)
  • Remove unnecessary condition (#383)
  • Add semicolons to the reduction kernel template (#386)
  • Remove redundant transpose (#390)
  • Fix usage about ElementwiseKernel (#391)
  • Remove duplicated preamble definition. (#402)
  • Fix cumsum (#414)
  • Use AxisError to maintain compatibility to multiple versions of NumPy (#437)
  • doc: Sort out navigation menu (#444)
  • Improve tensordot_core (#465)
  • Simplify flip (#468)
  • Use None instead of set() to improve memory allocation performance (#475)

Installation

  • Skip cross compile on setup.py develop to build faster (#309)
  • Fix double declaration of tuple_less (#368)
  • Made a customized version of sdist command to use Cython (#446)

Documentation

  • Fix a grammatical error in tutorial (#267)
  • Add cupy.sparse reference (#299, #303)
  • Cleanup README.md (#334)
  • Hide source link for alien objects (#354)
  • Avoid importing NumPy docstring (#355)
  • Remove unsupported strides argument from docstring (#361)
  • Fix matmul arguments (#384, thanks @hvy!)
  • Add link to our contribution guide (#392)
  • Update docstring of linalg.einsum (#405)
  • Write docstring of A property and its test (#407)
  • Use double backslash in str literal (#418)
  • Fix typo in sparse.spdiags docstring. (#426)
  • Remove "Edit on GitHub" link (#434)
  • Reorganize navigation menu (#444)
  • Clear doctest warnings (#455)
  • Add documents of linalg (#456)
  • Write docstring of sparse.issparse (#470)

Examples

  • Conjugate Gradient (#94, thanks @KotaroSetoyama!)

Tests

  • Example test (#297)
  • Add test for cuda.cusolver_enabled flag (#374)
  • Write tests for operators for sparse matrices (#401)
  • Write docstring of A property and its test (#407)
  • Fix test for random generator (#413)
  • Fix cumsum test (#414)
  • Add test for transpose when axes is not None (#420)
  • Improved pow test (#421)
  • Changed order argument for unknown order test as SciPy causes DeprecationWarning (#422)
  • Add tests for asfptype (#423)
  • Add assert_warns (#424)
  • Use randint instead of random_integer that is deprecated (#425)
  • Use six.assertRegex (#432)
  • Show error message when an error occurs on example test (#433)
  • Fix tests on Windows (#435)
  • Fix tolerance of arithmetic tests (#443)
  • Added test for __iter__ of csr_matrix (#449)
  • Fix tocsc behavior for an empty dia matrix (#451)
  • Fix test for tensorsolve (#454)
  • Skip NumPy clip tests in Windows (#467)
  • Fix typo in test function names (#394)

Others

  • Configure flake8 to ignore the .git directory (#339)
cupy - v2.0.0b1

Published by niboshi about 7 years ago

This is a minor release. See https://github.com/cupy/cupy/milestone/8?closed=1 for the complete list of solved issues and merged PRs.

New features

Sparse matrix

cupy.sparse is a module that implements scipy.sparse API using CUDA and cuSPARSE. We now have basic features for using sparse matrices on GPU.

  • CSR and CSC (#226)
  • COO matrix (#234)
  • Conversion method from compressed matrix (csr, csc) to coordinate format (coo) (#235)
  • CSR and CSC copy (#236)
  • __add__, __radd__, __sub__ and __rsub__ for CSR and CSC (#238)
  • Fix toarray in cupy.sparse.spmatrix (#312)
  • Return NotImplemented instead of NotImplementedError (#330)
  • Use csc2dense to convert csr-matrix to dense (#305)

We are planning to add more features to cupy.sparse in upcoming releases.
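
For readers unfamiliar with the formats named above, the following pure-Python sketch shows how the CSR and COO layouts store the same matrix and how a compressed matrix is expanded into coordinate form. It only illustrates the data layout; the function names are hypothetical and it does not use or reproduce the cupy.sparse implementation, which relies on cuSPARSE.

```python
def dense_to_csr(dense):
    """Convert a list-of-rows dense matrix to CSR arrays.

    Returns (data, indices, indptr): the nonzero values, their column
    indices, and for each row the start offset into data/indices.
    """
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)
                indices.append(col)
        indptr.append(len(data))
    return data, indices, indptr


def csr_to_coo(data, indices, indptr):
    """Expand CSR row pointers into explicit row indices (COO layout)."""
    rows = []
    for row in range(len(indptr) - 1):
        count = indptr[row + 1] - indptr[row]
        rows.extend([row] * count)
    return data, rows, indices


dense = [
    [1, 0, 2],
    [0, 0, 3],
    [4, 5, 0],
]
data, indices, indptr = dense_to_csr(dense)
coo_data, coo_rows, coo_cols = csr_to_coo(data, indices, indptr)
print(data, indices, indptr)
print(coo_rows, coo_cols)
```

CSC is the same idea with the roles of rows and columns swapped: the pointer array compresses columns and the index array holds row indices.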

New memory allocator (#168)

The memory pool implementation is greatly updated. It is based on best-fit allocation with coalescing. When there are a large number of allocations with different sizes (e.g. NLP applications), the memory usage is improved and the number of re-allocations is reduced (which also reduces the running time).

For example, the memory usage of the sequence-to-sequence code using Chainer (chainer/chainer#2070) is reduced from 12GiB (which means the process is using all of the available GPU memory) to 3GiB, and the number of memory reallocations from 20 times to 0 times.

It may increase the memory usage in some cases, although the amount of additional usage is small in practice (see the benchmark results in #168).

You can use this memory allocator by calling cupy.cuda.set_allocator(cupy.cuda.MemoryPool().malloc) (when using Chainer, it is called by default).
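
The idea of best-fit allocation with coalescing can be illustrated with a toy allocator over a single arena. This is a simplified sketch, not CuPy's memory pool: the class name ToyPool and its methods are hypothetical, offsets stand in for device pointers, and the 512-byte rounding unit is borrowed from the allocation unit size mentioned elsewhere in these notes.

```python
class ToyPool:
    """Toy best-fit allocator with coalescing (illustration only)."""

    def __init__(self, size, unit=512):
        self.unit = unit
        self.free_chunks = [(0, size)]  # sorted list of (offset, size)
        self.used = {}                  # offset -> size of live allocations

    def malloc(self, size):
        # Round the request up to a multiple of the allocation unit.
        size = -(-size // self.unit) * self.unit
        # Best fit: pick the smallest free chunk that is large enough.
        fits = [chunk for chunk in self.free_chunks if chunk[1] >= size]
        if not fits:
            raise MemoryError("out of memory in toy pool")
        offset, chunk_size = min(fits, key=lambda chunk: chunk[1])
        self.free_chunks.remove((offset, chunk_size))
        if chunk_size > size:
            # Split the chunk and keep the unused tail on the free list.
            self.free_chunks.append((offset + size, chunk_size - size))
            self.free_chunks.sort()
        self.used[offset] = size
        return offset

    def free(self, offset):
        size = self.used.pop(offset)
        self.free_chunks.append((offset, size))
        self.free_chunks.sort()
        # Coalesce: merge free chunks that are adjacent in the arena.
        merged = [self.free_chunks[0]]
        for off, sz in self.free_chunks[1:]:
            last_off, last_sz = merged[-1]
            if last_off + last_sz == off:
                merged[-1] = (last_off, last_sz + sz)
            else:
                merged.append((off, sz))
        self.free_chunks = merged


pool = ToyPool(4096)
a = pool.malloc(1000)  # rounded up to 1024 bytes
b = pool.malloc(512)
c = pool.malloc(100)   # rounded up to 512 bytes
pool.free(a)
pool.free(b)           # a and b are adjacent, so they merge into one chunk
d = pool.malloc(1500)  # rounded up to 1536; best fit reuses the merged chunk
print(a, b, c, d, pool.free_chunks)
```

Without coalescing, the two freed chunks would stay separate and the last request would have to be carved out of the larger remaining chunk, which is how varied allocation sizes lead to fragmentation and re-allocations.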

Other features

  • Implement cupy.linalg.det (#96)
  • Support cupy.sort to sort arrays along arbitrary axis (#229)
  • Implemented RangeStart and RangeEnd for NVIDIA visual profiler (nvvp) (#246)
  • Introduce cupy.is_available() which takes account of device availability (#247)
  • Implement cupy.msort (#251, #329)

Bug fixes

  • Fix cupy.copyto function to treat multiple GPUs correctly (#220)
  • Restore kernel type check (#253)
  • Fix deepcopy with multiple devices (#254)
  • Fix cupy.argsort for non-contiguous arrays (#284)
  • Fix ldexp on Windows (#278)

Improvements

  • Improve cupy.argsort performance (#285)

Installation

  • Remove old cuDNN support (#219)
  • Add compile options to build on Windows (#244)
  • Remove duplicated build options (#280)
  • Avoid creating garbage file on setup (#287)
  • Fix setup for cusolver (#292)
  • Use cupy.cuda.thrust_enabled to check Thrust enabled (#224)

Documentation

  • Updated difference with NumPy on reduction function behavior (#144)
  • Fix spelling in tutorial (#268)
  • Fix test instruction in README (#310)
  • Fix links to GitHub source pages (#332)

Examples

  • Add Gaussian Mixture Model (GMM) example (#29, thanks @KotaroSetoyama!)
  • Make grid size to integer for SGEMM example (#289, thanks @yuyu2172!)
  • Use absolute path in SGEMM example (#291)
  • Updated README for SGEMM example (#245, thanks @yuyu2172!)

Tests

  • Use cupy.testing.for_all_dtypes (#269)
  • Enable style check for Python code in Travis (#273)
  • Refactor cupy.argsort tests (#282)

Others

  • Small fixes for cupy.argsort (#223)
cupy - v1.0.2

Published by bkvogel about 7 years ago

This release includes bug fixes and improvements to the documentation and tests. See the list for the complete list of solved issues and merged PRs.

Enhancement

  • Change allocation_unit_size from 256 to 512 (#256)
  • Avoid synchronize in array function (#257)
  • Deterministic test (#217)
    • Note that this change includes an additional public function; we prioritized stabilizing tests more than keeping the rule of not introducing new features in stable updates.

Bug fixes

  • Fix out argument in fusion ufunc (#242)
  • Fix array method on multi GPU (#258)
  • Fix deepcopy with multiple devices (#263)
  • Fix multi-device copyto (#275)
  • Fix link args for cusolver (#315)

Installation

  • Add compile option to build on Windows (#279)
  • Do not create a.out on running python setup.py develop (#293)
  • Fix link args for cusolver (#315)

Documentation

  • Fix spelling in tutorial (#272)
  • Fix difference of reduction functions (#324)
  • Fix GitHub link (#333)

Tests

  • Make tests deterministic when possible (#217)
  • Add unit tests for cupy.array (#259)
  • Fix NumPy VisibleDeprecationWarning in indexing tests (#261)
  • Add retry to unit tests of decomposition functions (#262)
  • Fix travis test to enable style check for normal Python code (#290)
  • Skip bool unary negative test (#341)

Other

  • Add include option for coverage (#286)
  • Ignore generated reference (#318)
  • Add tags file to .gitignore (#325)
cupy - v1.0.1

Published by delta2323 over 7 years ago

This release includes bug fixes and improvements to the documentation and tests. See the list for the complete list of solved issues and merged PRs.

Release Notes

Enhancement

  • Workaround for the "No supported gcc/g++ host compiler found" error in Ubuntu 17.04 (#243)

Bug fixes

  • Make memory pool thread-safe (#109, thanks @kmaehashi!)
  • Fix fusion to reject NumPy arrays (#241)
  • Fix thread safety of cupy.random.get_random_state (#77, #99)

Documents

  • Fix markdown in the tutorial (#106, thanks @hvy!)
  • Write about advanced indexing support (#126, thanks @yuyu2172!)
  • Remove description about discrepancy with NumPy regarding exponential of boolean arrays, which was resolved in NumPy 1.13.0 (#146)
  • Fix typo in the tutorial (#153, thanks @ignisan!)
  • Other documentation improvements (#125, #189, #173, #210)

Examples

  • Fix color argument in the k-means example (#107)

Install

  • Skip installing thrust support in case nvcc not found in PATH. (#116)
  • Other installation improvements (#143)

Others

  • Improvement on tests (#81, #124, #178, #179, #211, #212, #217, #230)
  • Improvement on website (#149, #194, #195)
cupy - v2.0.0a1

Published by unnonouno over 7 years ago

This is the release of CuPy v2.0.0a1. See here for the complete list of solved issues and merged PRs.

Release Notes

Important updates

  • We started using NVRTC instead of NVCC for kernel compilation. This change enables CuPy to run in an environment where CUDA is installed but NVCC is not available. Note that some features depending on Thrust (e.g. sorting functions) cannot be used if NVCC was not available at installation time.
  • Many functions for sorting, linear algebra, and other areas were added.

New features

  • Use NVRTC instead of NVCC to compile kernels (#33, #62)
  • Sorting functions
    • cupy.msort (#150)
    • cupy.lexsort (#132)
    • cupy.argsort (#67)
    • cupy.sort for arrays of rank two or more, sorting along the last axis (#186, #187)
    • Make cupy.sort support arrays of rank two or more (#152)
  • Linear algebra functions
    • cupy.linalg.slogdet (#95)
    • cupy.linalg.matrix_rank (#97)
    • cupy.linalg.eigh and cupy.linalg.eigvalsh (#46)
  • Preliminary features to support sparse matrices
    • Note: sparse matrices themselves cannot be used in this version yet; we plan to make them usable in the next beta.
    • cupy.sparse.spmatrix, a base class of sparse matrices (#40)
    • Add cuSPARSE APIs (#39)
  • cupy.mgrid and cupy.ogrid (#145, thanks @iory!)
  • cupy.random.multinomial (#85)
  • cupy.cumprod (#110, thanks @ronekko!)
  • Support cuDNN v6 dilated convolution (#133, thanks @anaruse!)
  • Add total_bytes(), free_bytes(), and used_bytes() methods to memory pool (#184)
  • Support order option in astype (#111) and copy (#112)
  • cupy.fuse no longer requires parentheses (#43)
  • Add ndim to CArray and CIndexer (#160, #161)
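
The memory pool statistics methods listed above (total_bytes(), free_bytes(), and used_bytes()) can be read together as in the minimal sketch below. The helper names pool_report and report_default_pool are hypothetical and exist only for illustration. The sketch assumes the accessor cupy.get_default_memory_pool(), which exists in current CuPy releases; these notes do not say whether it was already available in v2.0.0a1.

```python
def pool_report(pool):
    """Collect the three byte counters from a memory pool object.

    `pool` is any object exposing used_bytes(), free_bytes() and
    total_bytes(), the methods named in the release note (#184).
    """
    return {
        "used": pool.used_bytes(),    # bytes currently handed out to arrays
        "free": pool.free_bytes(),    # bytes held by the pool but not in use
        "total": pool.total_bytes(),  # bytes the pool has acquired overall
    }


def report_default_pool():
    """Report on CuPy's default pool (requires CuPy and a CUDA device).

    Assumption: cupy.get_default_memory_pool() is available, as it is in
    current CuPy releases.
    """
    import cupy  # imported lazily so the helper above works without a GPU

    x = cupy.zeros(1000, dtype=cupy.float32)  # force at least one allocation
    report = pool_report(cupy.get_default_memory_pool())
    del x
    return report
```

On a machine with a GPU, calling report_default_pool() returns a dictionary of the three counters. Because the pool keeps freed blocks for reuse, "free" can be non-zero even after arrays have been deleted.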

Enhancement

  • Improve memory deallocation (#174)
  • Skip installing Thrust support when nvcc is not found in PATH (#91)
  • Improve asynchronous host to device copy (#123)
  • Change the allocation unit size from 256 to 512 (#176)
  • Work around the "No supported gcc/g++ host compiler found" error in Ubuntu 17.04 (#198)
  • Avoid synchronization in cupy.array for 0-dim values (#157)
  • Make cupy.count_nonzero return an array instead of int to avoid device-to-host synchronization (#154)
  • Check type in assert_array_list_equal (#205)
  • Improve performance (#169, #171, #172, #193, #206)
  • Improve testing utility (#218, #231)
  • Refactor cupy.atleast_nd functions (#142)
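
To make the allocation unit change (#176) concrete, here is a small pure-Python sketch of what moving from 256 to 512 would mean if requests are rounded up to the next multiple of the unit. That rounding rule is an assumption for illustration, since the release note gives only the two unit sizes. The function name round_up is hypothetical and is not part of CuPy's API.

```python
def round_up(nbytes, unit=512):
    """Round a requested size up to the next multiple of `unit` bytes.

    Assumption: the pool pads every request to a whole number of
    allocation units; the release note itself only states the unit size.
    """
    return ((nbytes + unit - 1) // unit) * unit


# Compare the padded size under the old (256) and new (512) unit.
for request in (1, 256, 300, 512, 513):
    print(request, round_up(request, 256), round_up(request, 512))
```

Under this assumption, a larger unit wastes a little more memory on small requests but makes more requests land on the same padded size, so freed blocks are more likely to be reusable.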

Bug fixes

  • Fix out argument in fusion (#209, #213)
  • Fix cupy.array on multiple GPU environment (#122, #135)
  • Fix usages of copy argument of ndarray.astype (#118, #121)
  • Make memory pool thread-safe (#105, thanks @kmaehashi!)
  • Fix fusion to reject NumPy arrays (#151)
  • Fix thread safety of cupy.random.get_random_state (#77, #78)

Documents

  • Fix tutorial (#93, thanks @hvy!)
  • Add links to GitHub source pages (#131)
  • Fix typo (#148, thanks @ignisan!)
  • Write about advanced indexing support (#88, thanks @yuyu2172!)
  • Remove description about discrepancy with NumPy regarding exponential of boolean arrays, which was resolved in NumPy 1.13.0 (#140)
  • Add missing documentation of cupy.cumsum (#90, thanks @ronekko!)
  • Add documentation of __getitem__ and __setitem__ for ndarray (#89, thanks @yuyu2172!)
  • Minor improvements to the README and the documentation (#45, #49, #117, #134, #138, #155, thanks @ClimbsRocks!, #165, #177, #166)

Examples

  • Add SGEMM example (#114, #188, thanks @yuyu2172!)
  • Fix color argument in the k-means example (#103)

Tests

  • Stabilize cupy.random.choice test (#98, #104)
  • Fix NumPy VisibleDeprecationWarning in indexing tests (#202)
  • Make random tests deterministic (#81, #82)
  • Retry unit tests of decomposition functions (#129)
  • Fix histogram bug in the RandomState.interval test (#175)

Others

  • Add SciPy license (#196)
  • Fix error message in setup script (#139)
cupy - v1.0.0

Published by bkvogel over 7 years ago

This is the release of CuPy v1.

This release also contains updates of CuPy included in Chainer v1.23.0 and v1.24.0. See the release note of Chainer v1.23.0 and the release note of Chainer v1.24.0 for the details.

Announcements

The set of supported CUDA and cuDNN versions has changed from Chainer v1.x as follows.

  • CUDA 7.0 and later
  • cuDNN 4.0 and later

Release Notes

Note: We had originally planned to include NVRTC support for the just-in-time compilation of kernels via pynvrtc, but we found that there is no guarantee on pynvrtc being compatible with old versions of CUDA, so we decided to make our own wrapper instead. Unfortunately, it cannot be included in this version. We are planning to add NVRTC support in the next version.

New features

  • Add cupy.sort function (#55, #66, #68)
  • 64-bit address support on CUDA (#31)
  • Support CUPY_SEED environment variable (#44)

Enhancement

  • Refactor carray.cuh file (#53, #56, #57)
  • Support lock-free cache of compiled nvcc binary (#37)
  • Allow cupy.copyto from Python scalar (#38)
  • Improve setup process (#65, #69, #70, #73, #76, #80)

Bug fixes

  • Fix cupy.random.choice (#84)

Documents

  • Update tutorial (#25)
  • Update installation guide (#60)
  • Many fixes (#20, #21, #22, #34, #47)

Examples

  • Add KMeans example (#35)

Tests

  • Improve test stability (#48, #50)
cupy - v1.0.0b1

Published by beam2d over 7 years ago

This is the beta release of CuPy v1.

This release only contains updates of Chainer v1.22.0 and minor updates of documentation and installation. See the release note of Chainer v1.22.0 for the details.

cupy - v1.0.0a1

Published by beam2d over 7 years ago

This is the first alpha of CuPy v1!

At the moment, the API and the implementation are equivalent to those of CuPy included in Chainer v1.21.0, while the installation step is slightly different (you need to install the cupy package instead of chainer). See our official documentation for the installation procedure.

Note that development is currently running in the pfnet/chainer repository. We will work on adding new features that are included only in this independent CuPy code base. We will also keep up with updates in the pfnet/chainer repository, so feel free to send any issues and patches for CuPy there.

Package Rankings
Top 0.96% on Pypi.org
Top 5.87% on Conda-forge.org
Top 8.17% on Proxy.golang.org
Top 19.57% on Anaconda.org