AMDGPU.jl

AMD GPU (ROCm) programming in Julia

License: Other
Stars: 281
Committers: 22


AMDGPU.jl - v0.8.4

Published by github-actions[bot] 9 months ago

AMDGPU v0.8.4

Diff since v0.8.3

Merged pull requests:

  • Adapt to GPUArrays@10 (#580) (@pxl-th)
AMDGPU.jl - v0.8.3

Published by github-actions[bot] 10 months ago

AMDGPU v0.8.3

Diff since v0.8.2

Merged pull requests:

  • [rocSPARSE] Update sv! and sm! (#567) (@amontoison)
  • Use correct warpId in device-side RNG (#568) (@pxl-th)
  • Initial ROCm 6 enablement (#572) (@pxl-th)
  • Update rocSPARSE to ROCm 6 (#573) (@pxl-th)
  • Use the stage preprocess in rocsparse_spmv (#574) (@amontoison)
  • Add a generator for ROCsolver (#575) (@amontoison)
  • Implement device side rng in RDNA3 plus fix it on julia master (#576) (@gbaraldi)
  • Fix repr test (#578) (@pxl-th)

Closed issues:

  • AMDGPU fails test and crashes when initialized (#570)
  • Update rocSPARSE to ROCm 6.0 (#571)
AMDGPU.jl - v0.8.2

Published by github-actions[bot] 11 months ago

AMDGPU v0.8.2

Diff since v0.8.1

Merged pull requests:

  • [rocSPARSE] Add a structure MatInfo for IC(0) and ILU(0) preconditioners (#558) (@amontoison)
  • Define comparison method for HIPContext (#561) (@pxl-th)
  • Improve type inference (#562) (@pxl-th)
  • Refactor alloc/retry (#563) (@pxl-th)
  • Fix functional (#565) (@pxl-th)
  • Use regular malloc/free (#566) (@pxl-th)

Closed issues:

  • has_rocm_gpu() fails (#564)
AMDGPU.jl - v0.8.1

Published by github-actions[bot] 11 months ago

AMDGPU v0.8.1

Diff since v0.8.0

Merged pull requests:

  • Implement device-side RNG (#380) (@utkarsh530)
  • Fix path detection in ubuntu like systems (#545) (@gbaraldi)
  • Simplify ROCm discovery (#548) (@pxl-th)
  • [rocSPARSE] Add new constructors (#550) (@amontoison)
  • Check context is valid before freeing streams, arrays. (#552) (@pxl-th)
  • [rocSPARSE] Update helpers.jl (#554) (@amontoison)
  • Use Atomix.jl for atomics (#555) (@pxl-th)
  • Reset exception holder immediately after exception (#556) (@pxl-th)
  • Fix exception reporting (#557) (@pxl-th)
  • Cleanup (#559) (@pxl-th)

Closed issues:

  • Implement sparse BLAS routines (#15)
  • Implement iterative solvers (#13)
  • Create a Docker image for AMDGPU.jl (#33)
  • Implement batched off-thread HSA signal waiting (#128)
  • HSA_STATUS_ERROR_INVALID_CODE_OBJECT on gfx803 (#192)
  • hsa_executable_freeze can hang during high GPU load (#208)
  • Implement copy!() (#218)
  • ROCM/Hip not downloading (?) when ]added (#230)
  • mapreducedim! is not implemented for AnyROCArray Types (#234)
  • Test of AMDGPU fails on 5900HX - hipErrorNoBinaryForGpu (#244)
  • Don't disable ROCm external library type definitions when non-functional (#350)
  • AMDGPU.jl doesn't seem to work with 7900 series GPUs (#371)
  • Support for rand from Julia Base on device code (#378)
  • Detect hardware queue limit and use to limit queue pool size (#403)
  • AMDGPU on windows (#465)
  • Rely on Atomix.jl for atomics (#547)
AMDGPU.jl - v0.8.0

Published by github-actions[bot] 11 months ago

AMDGPU v0.8.0

This release brings initial support for Windows (see requirements).
"Mixed mode" has been removed; ROCm discovery and library selection now happen automatically under the hood.
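
Because discovery is now automatic, a quick way to confirm that an installation was picked up is to query the package after loading it. The snippet below is only a minimal sketch, not part of the release itself: it assumes AMDGPU.jl is installed alongside a supported ROCm setup, and the array size and values are arbitrary illustrative choices.

```julia
using AMDGPU

# Print the ROCm libraries and devices that AMDGPU.jl discovered.
AMDGPU.versioninfo()

# `functional()` reports whether a usable ROCm setup was found.
if AMDGPU.functional()
    x = ROCArray(ones(Float32, 16))    # copy host data to the GPU
    y = x .+ 1f0                       # broadcast executes on the device
    @assert Array(y) == fill(2f0, 16)  # copy back and verify on the host
else
    @warn "AMDGPU.jl loaded, but no functional ROCm installation was detected"
end
```

If the warning branch is taken, the output of `AMDGPU.versioninfo()` is usually the first place to look for which component was not found.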

Diff since v0.7.4

Merged pull requests:

  • ROCm discovery for Windows (#542) (@pxl-th)
  • Fix kernel compilation on Windows (#543) (@pxl-th)
  • [Windows] Fix D2H memcopy & don't test unsupported functionality (#544) (@pxl-th)

Closed issues:

  • Fails to load on AMD Ryzen 9 7950X integrated graphics (#401)
  • Support for ROCm 5.7.1 (#522)
  • Mixed Device Libs Not Detected if Not in Project (#534)
AMDGPU.jl - v0.7.4

Published by github-actions[bot] 11 months ago

AMDGPU v0.7.4

Diff since v0.7.3

Merged pull requests:

  • Update preconditioners.jl (#533) (@amontoison)
  • [rocSPARSE] Interface the generic routines (#535) (@amontoison)
  • Defer freeing hostcall buffers & add 1.10 CI (#538) (@pxl-th)
  • Have separate free! method for hostcalls (#539) (@pxl-th)
  • Switch to artifact device libraries if ROCm 5.5+ is detected (#540) (@pxl-th)
  • Fix artifact discovery in global project (#541) (@pxl-th)

Closed issues:

  • Investigate GPUArrays tests suite error (#515)
  • Multiple workers hang test suite on Julia 1.10 (#521)
  • [rocSPARSE] ILU(0) and IC(0) preconditioners are not working (#532)
  • Hostcall tests hang (#537)
AMDGPU.jl - v0.7.3

Published by github-actions[bot] 12 months ago

AMDGPU v0.7.3

Diff since v0.7.2

Merged pull requests:

  • Fix ISA parsing (#531) (@pxl-th)

Closed issues:

  • AMDGPU 0.7.x target error on Frontier (#530)
AMDGPU.jl - v0.7.2

Published by github-actions[bot] 12 months ago

AMDGPU v0.7.2

Diff since v0.7.1

Merged pull requests:

  • Revert devlib linking opt (#529) (@pxl-th)
AMDGPU.jl - v0.7.1

Published by github-actions[bot] 12 months ago

AMDGPU v0.7.1

Diff since v0.7.0

Merged pull requests:

  • Fix initial device fetching (#528) (@pxl-th)

Closed issues:

  • Support for multi-GPU nodes broken in 0.7 (#527)
AMDGPU.jl - v0.7.0

Published by github-actions[bot] 12 months ago

AMDGPU v0.7.0

Diff since v0.6.1

Merged pull requests:

  • Enable 5.4 JLLs on LLVM <16 (#503) (@jpsamaroo)
  • Use refs instead of pointers to get a slightly friendlier abi (#504) (@gbaraldi)
  • Bump actions/checkout from 3 to 4 (#506) (@dependabot[bot])
  • Add ROCm mixed mode (#508) (@pxl-th)
  • Do runtime ROCm discovery (#509) (@pxl-th)
  • Switch tests to ReTestItems.jl (#511) (@pxl-th)
  • Use non-blocking synchronization by default (#512) (@pxl-th)
  • Bump GPUCompiler to 0.25 (#513) (@pxl-th)
  • Add a method for getrf! (#514) (@amontoison)
  • Use branches instead of 'ifelse' (#519) (@pxl-th)
  • Interface getrf_batched and getri_batched (#520) (@amontoison)
  • Bring back CI (#523) (@pxl-th)
  • Add workgroup synchronization primitives (#524) (@pxl-th)
  • Use HIP for retrieving GCN arch (#525) (@pxl-th)
  • Mention Julia 1.10+ requirement for Navi 3 (#526) (@pxl-th)

Closed issues:

  • Runtime Locking (#64)
  • 2x slower AMDGPU.jl kernel compared to HIP (#331)
  • sincos() x3.5 slower than separate sin()/cos() calls (#341)
  • HSA memory fault using AMDGPU.rand() on device ≠ 1 (#386)
  • WARNING: could not import AMDGPU.device_libs_path into Compiler (#434)
  • sincos intrinsic is broken with GPUCompiler 0.24 (#502)
  • Navi 3 causes malloc(): unsorted double linked list corrupted (#518)
AMDGPU.jl - v0.6.1

Published by github-actions[bot] about 1 year ago

AMDGPU v0.6.1

Diff since v0.6.0

Merged pull requests:

  • Fix rocrand rng offset (#493) (@tgymnich)
  • Bump GPUCompiler to 0.24 (#495) (@pxl-th)
  • [rocSOLVER] Add a method for geqrf! (#497) (@amontoison)
  • [rocSOLVER] Interface omgqr! (#498) (@amontoison)
  • Fix REPL display (#501) (@pxl-th)

Closed issues:

  • Precompilation fails (#499)
  • Synchronization in REPL (#500)
AMDGPU.jl - v0.6.0

Published by github-actions[bot] about 1 year ago

AMDGPU v0.6.0

Diff since v0.5.7

Closed issues:

  • Functions to map to/from HIP agent IDs (#5)
  • Use refcounting for memory management (#207)
  • Make unsafe_copy3d! TLS compatible (#421)

Merged pull requests:

  • Allow specifying buffer type in ctor (#486) (@pxl-th)
  • Remove default device stuff (#487) (@pxl-th)
  • [rocSPARSE] Support matrix-vector products with COO format (#488) (@amontoison)
  • Add more multi-gpu tests (#489) (@pxl-th)
  • SpMV supports CSC matrices (#490) (@amontoison)
  • Correctly switch to TLS context (#491) (@pxl-th)
  • Cleanup logging (#492) (@pxl-th)
AMDGPU.jl - v0.5.7

Published by github-actions[bot] about 1 year ago

AMDGPU v0.5.7

Diff since v0.5.6

Merged pull requests:

  • Add reverse kernels (#485) (@pxl-th)
AMDGPU.jl - v0.5.6

Published by github-actions[bot] about 1 year ago

AMDGPU v0.5.6

Diff since v0.5.5

Closed issues:

  • Implement exponential back-off for signal wait (#84)
  • Implement occupancy estimator (#112)
  • AMDGPU test errors on gfx908 (Ubuntu 20.04, ROCm 4.2, Julia 1.6.1) (#138)
  • randn(Float32, 111) and rand(Float32, 111) fail (#161)
  • Feature request: allow hsa_amd_memory_copy_async to pick a queue (#204)
  • HSA memory test hang the GPU in CI (#226)
  • AMDGPU.agents() doesn't see GPU (#236)

Merged pull requests:

  • enable dependabot for GitHub actions (#474) (@ranocha)
  • Bump actions/cache from 1 to 3 (#475) (@dependabot[bot])
  • Bump codecov/codecov-action from 1 to 3 (#476) (@dependabot[bot])
  • Bump actions/checkout from 2 to 3 (#477) (@dependabot[bot])
  • Fix typo in tests & bump GPUCompiler (#479) (@pxl-th)
  • Add sorting kernels (#480) (@pxl-th)
  • Switch to GPUArrays buffer management (#481) (@pxl-th)
  • Julia 1.10 enablement (#482) (@pxl-th)
  • Add rotate, reflect, axby functions (#484) (@pxl-th)
AMDGPU.jl - v0.5.5

Published by pxl-th about 1 year ago

Diff since v0.5.4

AMDGPU.jl - v0.5.4

Published by github-actions[bot] about 1 year ago

AMDGPU v0.5.4

Diff since v0.5.3

Closed issues:

  • accumulate function missing? (#317)

Merged pull requests:

  • Fix tests (#466) (@pxl-th)
  • Remove HSA refcounter (#467) (@pxl-th)
  • Move module declarations to their own files (#468) (@pxl-th)
  • Fix ROCm discovery (#469) (@pxl-th)
  • Implement accumulate kernel (#470) (@pxl-th)
AMDGPU.jl - v0.5.3

Published by github-actions[bot] about 1 year ago

AMDGPU v0.5.3

Diff since v0.5.2

Closed issues:

  • AMDGPU.jl master is broken on Julia 1.7 (#372)
  • Failure upon calling Enzyme autodiff_deferred (#444)
  • Segmentation fault on hipStreamDestroy (#449)
  • Setting HIP_VISIBLE_DEVICES to an invalid ID fails in an unhelpful way (#450)
  • hipErrorSharedObjectInitFailed (#451)
  • Unexpected error: ccall requires compiler when using QR (#461)

Merged pull requests:

  • Add AMDGPU.@sync macro (#454) (@luraess)
  • Add rocSOLVER routines (#456) (@pxl-th)
  • Add missing HIP error code (#457) (@pxl-th)
  • Add env variable if Navi 2 detected (#458) (@pxl-th)
  • Update docs (#459) (@pxl-th)
  • Update doc (#460) (@luraess)
  • blas: Improve error on missing rocBLAS (#462) (@jpsamaroo)
  • rocSPARSE support (#463) (@pxl-th)
  • Check libraries are functional once during init (#464) (@pxl-th)
AMDGPU.jl - v0.5.2

Published by github-actions[bot] about 1 year ago

AMDGPU v0.5.2

Diff since v0.5.1

Merged pull requests:

  • Show library versions in 'versioninfo()' (#452) (@pxl-th)
  • Optimize device lib linking (#453) (@pxl-th)
AMDGPU.jl - v0.5.1

Published by github-actions[bot] over 1 year ago

AMDGPU v0.5.1

Diff since v0.5.0

Closed issues:

  • Implement Neural Network primitives (#11)
  • [Mark/Wait] Use HIP events to do fine-grained sync (#127)
  • Implement memory reclaim mechanism similar to CUDA's (#134)
  • NNlibAMDGPU.jl ? (#143)
  • Deprecation warning unsafe_length() (#183)
  • Test suite failures due to segfaults on Julia 1.8 (#261)
  • HSA memory region query test fail (#275)
  • ROCBlas support for gfx1031, 1032, and 1033 (#314)

Merged pull requests:

  • Update rocFFT library (#443) (@pxl-th)
  • Allow specifying what tests to run (#445) (@pxl-th)
  • Check pointer is still pinned before unregistering it (#446) (@pxl-th)
  • Implement 3D async copy (#447) (@pxl-th)
  • Remove artifacts from dependencies (#448) (@pxl-th)
AMDGPU.jl - v0.5.0

Published by github-actions[bot] over 1 year ago

AMDGPU v0.5.0

Diff since v0.4.15

Closed issues:

  • Test failures locally on 1.9.0-beta4 -- Radeon 6800XT (#400)
  • Update HIP errors codes (#404)
  • Optimize wait! for HSA kernel launches (#405)
  • rocBLAS synchronization issue? (#418)
  • First install with JULIA_AMDGPU_DISABLE_ARTIFACTS leads to broken config (#424)
  • Cannot unsafe_wrap a device array if lock=false (#436)

Merged pull requests:

  • Use HIP as kernel backend instead of HSA (#423) (@pxl-th)
  • fix(docs): Wrong symbol in functional docs (#431) (@kunzaatko)
  • Update to GPUCompiler 0.21 & LLVM 6 (#437) (@pxl-th)
  • Fix docs for HIP (#439) (@luraess)
  • Run tests on multiple workers again (#441) (@pxl-th)
  • Specialize ROCArray on buffer type (#442) (@pxl-th)