Metal.jl

Metal programming in Julia

MIT License

Stars
322
Committers
13

Metal.jl - v1.3.0 Latest Release

Published by github-actions[bot] about 2 months ago

Metal v1.3.0

Diff since v1.2.0

Merged pull requests:

  • Fix typo in docs (#384) (@christiangnrd)
  • Bump minimal Julia requirement to v1.10. (#385) (@maleadt)
  • Remove Requires dependency (#386) (@christiangnrd)
  • Reflection: Figure out kernel names by looking at metallib section. (#390) (@maleadt)
  • Add tests for broadcasting minimum and maximum (#391) (@tgymnich)
  • Don't export MTL (#392) (@christiangnrd)
  • Add erfinv (#394) (@tgymnich)
  • Add expm1 (#395) (@tgymnich)
  • Cleanup some imports (#398) (@christiangnrd)
  • Remove type-pirated function (#399) (@christiangnrd)
  • Unexport some high-level MPS functionality from MPS (#400) (@christiangnrd)
  • Adapt to new REPL precompile changes (JuliaLang/julia#55210) (#403) (@christiangnrd)
  • Bump GPUCompiler. (#404) (@maleadt)
  • Bump LLVM compat (#407) (@maleadt)
  • Make 1.11 CI success mandatory. (#408) (@maleadt)

Closed issues:

  • Audit exports/public symbols (#359)
  • Compilation failure on 1.11 (#370)
  • MTLBinaryArchive (#387)
  • Metal.code_agx() failing in MacOS 15 Beta 3 (#388)
  • Test for min / max broadcasting issue (#389)
  • Type piracy (#396)
  • Potentially unused code in gpuarrays.jl (#397)
  • Shared vs SharedStorage in examples/unified_memory (#405)
  • Unsupported call to an unknown function when calling Distributions (#406)

Metal.jl - v1.2.0

Published by github-actions[bot] 3 months ago

Metal v1.2.0

Diff since v1.1.0

Merged pull requests:

  • Avoid constructing MulAddMuls on Julia v1.12+ (#295) (@dkarrasch)
  • Trigger the runtime profiler when a test times out. (#330) (@maleadt)
  • Add MPSMatrixSoftMax (#333) (@christiangnrd)
  • Reorganize and add some MPS tests (#335) (@christiangnrd)
  • Typo fix (#336) (#337) (@101001000)
  • Add error message for running Metal.jl under Rosetta (#339) (@tgymnich)
  • Add MPSCommandBuffer (#340) (@christiangnrd)
  • Bump julia-actions/setup-julia from 1 to 2 (#341) (@dependabot[bot])
  • Revert error message for Rosetta (#342) (@tgymnich)
  • Update to ObjectiveC.jl v3. (#343) (@maleadt)
  • Add autoreleasepools to MPS interface methods. (#344) (@maleadt)
  • Don't redundantly return the cmdbuf from commit methods. (#345) (@maleadt)
  • Whitespace fixes (#346) (@christiangnrd)
  • CompatHelper: bump compat for LLVM to 7, (keep existing compat) (#347) (@github-actions[bot])
  • CompatHelper: add new compat entry for SpecialFunctions in [weakdeps] at version 2, (keep existing compat) (#352) (@github-actions[bot])
  • [NFC] Fix indentation (#353) (@christiangnrd)
  • Bump LLVM downgrader (#354) (@maleadt)
  • Don't export non-existent contents (#356) (@christiangnrd)
  • Remove/fix unused exports (#357) (@christiangnrd)
  • Unexport SimpleVersion and AS (#360) (@christiangnrd)
  • Add support for opaque pointers (#361) (@maleadt)
  • Docstrings (#362) (@christiangnrd)
  • Initial MacOS 15 support (#365) (@christiangnrd)
  • Replace current_device() with device() (#366) (@christiangnrd)
  • Support reading metallib v1.2.8 files from macOS 15. (#367) (@maleadt)
  • Add metallib (dis)assembly helper scripts. (#368) (@maleadt)
  • Simplify testing of examples. (#369) (@maleadt)
  • Temporarily allow 1.11 to fail. (#371) (@maleadt)
  • CompatHelper: add new compat entry for PrecompileTools at version 1, (keep existing compat) (#372) (@github-actions[bot])
  • Define complex sqrt (#374) (@mtfishman)
  • Check the macOS version during initialization. (#375) (@maleadt)
  • CompatHelper: bump compat for LLVM to 8, (keep existing compat) (#376) (@github-actions[bot])
  • Add accumulate implementation (#377) (@chengchingwen)
  • fix derived device array (#378) (@chengchingwen)
  • avoid ReshapedArray using Int128 in metal kernel (#379) (@chengchingwen)
  • improve type stability of derived array (#380) (@chengchingwen)
  • add findall implementation (#382) (@zhenwu0728)
  • Bump version (#383) (@christiangnrd)

Closed issues:

  • Tests sporadically timing out on 1.11 (#329)
  • ReshapedArray indexing broken because of Int128 operation (#332)
  • KernelAbstractions copyto! typo (#336)
  • Segmentation Faults (#338)
  • Port accumulate! and findall from CUDA.jl (#348)
  • Tests failing with GPUCompiler v0.26.5 and LLVM v7.1 (#350)
  • downgrades LLVM (#355)
  • sqrt(::Complex) unsupported due to conversion exceptions (#364)

Metal.jl - v1.1.0

Published by github-actions[bot] 6 months ago

Metal v1.1.0

Diff since v1.0.0

Merged pull requests:

  • Add resize! (#279) (@mtfishman)
  • Initial MTLTexture support (#280) (@christiangnrd)
  • Avoid redundant pointer conversions for threadgroup memory. (#283) (@maleadt)
  • Re-implement metallib generation in Julia. (#284) (@maleadt)
  • CompatHelper: add new compat entry for SHA at version 0.7, (keep existing compat) (#286) (@github-actions[bot])
  • Support more of the metallib format (#288) (@maleadt)
  • Address potentially buggy mtl behaviour. (#290) (@christiangnrd)
  • CompatHelper: add new compat entry for CodecBzip2 at version 0.8, (keep existing compat) (#292) (@github-actions[bot])
  • Remove an unneeded pointer method. (#293) (@maleadt)
  • Use NSAutoreleasePool to clean up memory. (#294) (@maleadt)
  • adapt_storage-related improvements (#296) (@christiangnrd)
  • CompatHelper: bump compat for ObjectiveC to 2, (keep existing compat) (#297) (@github-actions[bot])
  • Add support for signposts (#300) (@maleadt)
  • Retain NSError we rethrow to avoid an UAF. (#302) (@maleadt)
  • Minor mapreduce improvements (#303) (@maleadt)
  • Specialize broadcast to avoid integer divisions. (#304) (@maleadt)
  • Better Support for Unified Memory (#305) (@tgymnich)
  • Add 1.11 CI (#306) (@christiangnrd)
  • Remove unused files (#307) (@tgymnich)
  • Skip profiling tests on macOS 14.4/M1. (#310) (@maleadt)
  • Increase test timeout limit to accommodate 1.8 (#311) (@christiangnrd)
  • Test all storage modes (#314) (@christiangnrd)
  • Fix doctests (#315) (@christiangnrd)
  • Fix KernelAbstractions for Unified Memory (#316) (@tgymnich)
  • CompatHelper: add new compat entry for Preferences at version 1, (keep existing compat) (#318) (@github-actions[bot])
  • Minor cleanup (#319) (@christiangnrd)
  • Create MtlArray using memory allocated by Array (#320) (@christiangnrd)
  • Re-enable profiling tests on M1/14.4 when using Xcode 15.3. (#322) (@maleadt)
  • Small typo and doc fixup (#325) (@christiangnrd)
  • BFloat16s.jl extension and related improvements (#326) (@christiangnrd)
  • Support for Julia 1.11 (#327) (@maleadt)

Closed issues:

  • Validation-related back-end crash on macOS Ventura (#34)
  • slow broadcast copy in 2D (#41)
  • Poor performance of mapreduce (#46)
  • Multiplication with SubArrays (#47)
  • Add support to creating MtlArray using a memory allocated by Array (#62)
  • Improve use of unified memory (#86)
  • Use Autoreleasepools with Metal (#103)
  • Unknown RFLT tag generated by macOS 13 Metal compiler (#167)
  • mapreduce allocates a lot on the CPU (#211)
  • Legalization errors with vectorized code (#257)
  • Compilation Failure due to undefined symbols (#276)
  • resize!, append! not defined (#277)
  • tag new version (#278)
  • Panic during profiling tests on 14.4 beta (#281)
  • M3 backend cannot handle atomics with complicated pointer conversions (#282)
  • Int128 does not compile (#287)
  • Two suspicious mtl-related behaviours (#289)
  • LU factorization: add allowsingular keyword argument (#299)
  • Autorelease changes lead to use after free with errors (#301)
  • Reductions don't work on Shared Arrays (#312)

Metal.jl - v1.0.0

Published by github-actions[bot] 9 months ago

Metal v1.0.0

Diff since v0.5.1

Merged pull requests:

  • Matrix batches (#158) (@tgymnich)
  • Add 1.10 CI. (#256) (@maleadt)
  • Update manifest (#258) (@github-actions[bot])
  • CompatHelper: bump compat for GPUCompiler to 0.25, (keep existing compat) (#259) (@github-actions[bot])
  • Bump actions/checkout from 3 to 4 (#260) (@dependabot[bot])
  • Update manifest (#261) (@github-actions[bot])
  • CompatHelper: bump compat for CEnum to 0.5, (keep existing compat) (#262) (@github-actions[bot])
  • Update manifest (#263) (@github-actions[bot])
  • CompatHelper: add new compat entry for Artifacts at version 1, (keep existing compat) (#264) (@github-actions[bot])
  • Reduce launch overhead by generating code to encode arguments. (#265) (@maleadt)
  • Remove unused function argument (#266) (@tgymnich)
  • Introduce application tracing profiler (#267) (@maleadt)
  • Remove content(::MTLBuffer), use convert instead. (#268) (@maleadt)
  • Allow more kwargs syntax with kernel launches (#269) (@maleadt)
  • Don't re-use the IO object when shelling out to Python. (#271) (@maleadt)
  • Preserve storage mode when broadcasting. (#273) (@maleadt)

Closed issues:

  • Support for macOS Sonoma (#201)
  • Error with Julia 1.10 (#274)

Metal.jl - v0.5.1

Published by github-actions[bot] about 1 year ago

Metal v0.5.1

Diff since v0.5.0

Merged pull requests:

  • MPSMatrix improvements (#157) (@tgymnich)
  • Update manifest (#221) (@github-actions[bot])
  • Update manifest (#222) (@github-actions[bot])
  • Update manifest (#224) (@github-actions[bot])
  • Update manifest (#227) (@github-actions[bot])
  • CompatHelper: bump compat for ObjectiveC to 1, (keep existing compat) (#228) (@github-actions[bot])
  • Update manifest (#230) (@github-actions[bot])
  • Fix argument types in sincos (#232) (@fjebaker)
  • Update manifest (#233) (@github-actions[bot])
  • Improve docs (#235) (@christiangnrd)
  • Remove linear algebra section of MPS docs (#237) (@christiangnrd)
  • CompatHelper: bump compat for GPUCompiler to 0.22, (keep existing compat) (#238) (@github-actions[bot])
  • Port openlibm log1pf as log1p (#239) (@sotlampr)
  • Port openlibm erf (#240) (@tgymnich)
  • Remove 1.6-era override mechanism. (#241) (@maleadt)
  • CompatHelper: add new compat entry for Requires at version 1, (keep existing compat) (#242) (@github-actions[bot])
  • Update manifest (#243) (@github-actions[bot])
  • enable dependabot for GitHub actions (#244) (@ranocha)
  • Bump actions/checkout from 2 to 3 (#245) (@dependabot[bot])
  • Bump peter-evans/create-pull-request from 3 to 5 (#246) (@dependabot[bot])
  • Show METAL_CAPTURE_ENABLED in Metal.versioninfo() when the environment variable is set (#248) (@christiangnrd)
  • Update manifest (#249) (@github-actions[bot])
  • Adapt to GPUCompiler.jl, and other small updates. (#250) (@maleadt)
  • Switch to GPUArrays buffer management. (#251) (@maleadt)
  • Update manifest (#252) (@github-actions[bot])
  • Update manifest (#253) (@github-actions[bot])
  • Bump GPUCompiler (#255) (@maleadt)

Closed issues:

  • Random access indexing into MtlArray views cause scalar indexing (#149)
  • Q: How to debug kernels - KA.@print? (#223)
  • Crash during MTLDispatchListApply (#225)
  • Unable to compile trig functions through ForwardDiff (#229)
  • symbol multiply defined! Bug/crash on Julia master, fine on 1.10 (#231)
  • log1p fails on MtlArray{Float32} (#234)
  • When precompiling, UndefVarError: CompilerConfig not defined (#247)

Metal.jl - v0.5.0

Published by github-actions[bot] over 1 year ago

Metal v0.5.0

Diff since v0.4.1

Metal.jl 0.5 is a feature release, bringing initial support for atomic operations (#168).
Low-level atomics that mimic Metal C are supported (atomic_store_explicit,
atomic_load_explicit, etc.), as well as a higher-level Metal.@atomic that can be used to
update array values similar to how CUDA.jl's @atomic works. This uses native atomics when
supported, and falls back to a compare-exchange loop otherwise.
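
The compare-exchange fallback mentioned above can be illustrated on the CPU. The following is only a sketch of the general technique using Julia's Base.Threads atomics; it is not Metal.jl's implementation and does not run on the GPU. The helper name cas_update! is hypothetical and exists only for this illustration.

```julia
using Base.Threads: Atomic, atomic_cas!

# Hypothetical helper (not part of Metal.jl): atomically replace the value in
# `cell` with `op(current, val)` by retrying a compare-exchange until it succeeds.
function cas_update!(op, cell::Atomic{T}, val::T) where {T}
    old = cell[]                            # snapshot of the current value
    while true
        new = op(old, val)                  # value we would like to store
        seen = atomic_cas!(cell, old, new)  # returns the value found in `cell`
        seen == old && return new           # nobody interfered: swap succeeded
        old = seen                          # another thread won: retry with its value
    end
end

# Emulate an "atomic max", an operation that may lack a native atomic instruction.
cell = Atomic{Int32}(0)
Threads.@threads for i in 1:1000
    cas_update!(max, cell, Int32(i))
end
println(cell[])
```

The loop only commits a new value when the cell still holds the value that was read, so concurrent updates are retried rather than lost. Any binary operation can be built this way, at the cost of retries under contention.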

Minor changes include an update for the @device_code_agx disassembler, the addition of a
type variable to MtlArray encoding the storage mode (#194), and support for MPSVector
(#199) which should accelerate matrix/vector multiplications.

Also note that Metal.jl now disallows the construction of Float64 arrays, as these are not
supported by the Metal libraries.

Closed issues:

  • Support for atomics (#79)
  • Make MtlArray storage mode a type parameter (#190)
  • Long stacktrace when trying to create Float64 rand arrays (#205)
  • allowscalar equivalent for Metal.jl (#206)
  • Define map! ? (#219)

Merged pull requests:

  • Implement atomics using compiler intrinsics (#168) (@maleadt)
  • Parameterize MtlArray storage mode (#194) (@christiangnrd)
  • Implement MPSVector (#199) (@tgymnich)
  • Update manifest (#200) (@github-actions[bot])
  • Add Metal 3.1 to MTLLanguageVersion (#202) (@christiangnrd)
  • Update manifest (#203) (@github-actions[bot])
  • CompatHelper: bump compat for GPUCompiler to 0.21, (keep existing compat) (#204) (@github-actions[bot])
  • Update manifest (#207) (@github-actions[bot])
  • Disallow Float64 arrays entirely. (#209) (@maleadt)
  • Adapt to LLVM.jl 6. (#213) (@maleadt)
  • Update manifest (#215) (@github-actions[bot])
  • Bump disassembler. (#216) (@maleadt)

Metal.jl - v0.4.1

Published by github-actions[bot] over 1 year ago

Metal v0.4.1

Diff since v0.4.0

Closed issues:

  • Command buffer callbacks can cause bus error during thread adoption (#138)
  • how to set up Project.toml (#185)
  • Metal.rand() creates a CPU array (#187)
  • fill! for Int8 errors when the value is negative (#192)

Merged pull requests:

  • Refactor matmatmul code for faster load time (#186) (@dkarrasch)
  • Add *.DS_Store to .gitignore (#188) (@christiangnrd)
  • Add GPUArrays out-of-place random methods (#189) (@tgymnich)
  • Revert "Don't rely on thread adoption for command buffer callbacks." (#191) (@maleadt)
  • Fix fill! with negative Int8 values (#193) (@christiangnrd)
  • disambiguate gemm_wrapper! with LinAlg.jl (#195) (@dkarrasch)
  • Add type annotations for character args in matmatmul (#196) (@dkarrasch)
  • Handle missing adjoint case. (#197) (@maleadt)
  • Fix transposed matmul. (#198) (@maleadt)

Metal.jl - v0.4.0

Published by github-actions[bot] over 1 year ago

Metal v0.4.0

Diff since v0.3.0

Closed issues:

  • Restore mtlcall (#17)
  • mapreduce has poor performance (#87)
  • Native code reflection (#95)
  • rand! with Bools sometimes fails in tests in 1.9 (#141)
  • LLVM assertion failures (#153)
  • Time macro similar to CUDA.@time (#160)
  • bug in rand!? (#162)
  • Why not support threadIdx().x, blockIdx().x, blockDim().x etc? (#163)
  • Incorrect(?) darwin version in 1.8 with Metal.versioninfo() (#179)

Merged pull requests:

  • Add native code reflection. (#96) (@maleadt)
  • Move MPSKernels into a dedicated file (#155) (@tgymnich)
  • [LU decomposition] Fix types (#156) (@tgymnich)
  • Update manifest (#161) (@github-actions[bot])
  • Implement Time macro (#164) (@christiangnrd)
  • Fix some references to CUDA (#165) (@christiangnrd)
  • Fix GPUArrays RNG interface implementation. (#166) (@maleadt)
  • Bump the LLVM back-end. (#169) (@maleadt)
  • Update manifest (#170) (@github-actions[bot])
  • Update manifest (#171) (@github-actions[bot])
  • Update manifest (#172) (@github-actions[bot])
  • Bump GPUCompiler to v0.20 (#173) (@christiangnrd)
  • Detect mapreduce threadgroup limits instead of guessing. (#176) (@maleadt)
  • Remove reference to no longer used library in README.md (#177) (@christiangnrd)
  • Report package versions as part of versioninfo() (#180) (@christiangnrd)
  • Fix Darwin version identification (#181) (@christiangnrd)
  • Topk for MPSMatrix (#182) (@christiangnrd)
  • Update manifest (#183) (@github-actions[bot])
  • Don't rely on thread adoption for command buffer callbacks. (#184) (@maleadt)

Metal.jl - v0.3.0

Published by github-actions[bot] over 1 year ago

Metal v0.3.0

Diff since v0.2.0

Closed issues:

  • Migrate to metal C++? (#2)
  • Improved errors when calling device functions on CPU (#90)
  • Improve Objective-C interfacing (#104)
  • Rename grid to groups (#116)
  • Add functionality check helper (#121)
  • inputing non-isbits types (#128)
  • @metal docstring out-of-date (#129)
  • mapreduce kernel uses too many threads (#132)
  • Powers don't work with complex floats (#142)

Merged pull requests:

  • Add contributing documentation (#93) (@max-Hawkins)
  • Reduce multiple consecutive values in each thread to improve efficiency (#112) (@maxwindiff)
  • Remove libcmt, use native ObjectiveC FFI (#117) (@maleadt)
  • Rename grid to groups (#119) (@habemus-papadum)
  • Audit MRR (#122) (@maleadt)
  • Faster in-place reduction by using broadcasting to initialize partial… (#123) (@maxwindiff)
  • Add MPS matrix decompositions (#124) (@tgymnich)
  • Minor documentation formatting (#125) (@asinghvi17)
  • Switch default mode to private storage (#126) (@christiangnrd)
  • Update manifest (#127) (@github-actions[bot])
  • Add some MtlArray docs (#130) (@christiangnrd)
  • Port MetalKernels (#131) (@maxwindiff)
  • Adapt to GPUCompiler 0.18. (#134) (@maleadt)
  • Support passing non-isbits arguments, as long as they're unused. (#135) (@maleadt)
  • Do not change grain size after pipeline creation (#136) (@maxwindiff)
  • Bump GPUArrays. (#137) (@maleadt)
  • Specialize GPUArrays' global_size query. (#139) (@maleadt)
  • Catch errors that happen during command buffer callbacks. (#140) (@maleadt)
  • Call the correct current_device() in reflection (#143) (@maxwindiff)
  • Error when calling device functions on CPU (#144) (@christiangnrd)
  • Implement MTLGPUFamily and use it to validate gpu (#146) (@christiangnrd)
  • Add functional() (#147) (@christiangnrd)
  • Update manifest (#148) (@github-actions[bot])
  • CompatHelper: add new compat entry for StaticArrays at version 1, (keep existing compat) (#151) (@github-actions[bot])
  • Update to LLVM.jl 5 and GPUCompiler 0.19. (#154) (@maleadt)

Metal.jl - v0.2.0

Published by github-actions[bot] over 1 year ago

Metal v0.2.0

Diff since v0.1.2

Closed issues:

  • Threadgroup memory breaks on small datatypes (#26)
  • Int64 not supported on AMD GPUs? (#38)
  • Base.unsafe_convert is ambiguous (#42)
  • Support for multiple devices (#44)
  • Add CITATION file (#55)
  • XGBoost on Metal.jl (#82)
  • first try at metal (#84)
  • Copysign intrinsic possibly wrong (#89)
  • Metal.jl fails to precompile on Linux (#97)
  • Silent failure with unsupported(?) Intel Iris Graphics (#109)
  • I have 2 question about Metal.jl and Flux.jl (#110)

Merged pull requests:

  • Update manifest (#57) (@github-actions[bot])
  • Add GPU profiling capabilities (#58) (@max-Hawkins)
  • Automatically detect if we need cmt build from source. (#59) (@maleadt)
  • Update manifest (#60) (@github-actions[bot])
  • Add queue kernel launch argument (#61) (@tgymnich)
  • Update manifest (#63) (@github-actions[bot])
  • Switch pipeline to juliaecosystem (#64) (@vchuravy)
  • Update manifest (#65) (@github-actions[bot])
  • Add a function for setting the current device (#66) (@maxwindiff)
  • Add documentation webpage (#67) (@max-Hawkins)
  • Wrap simdgroup matrix functions (#70) (@maxwindiff)
  • Support loading/saving simdgroup matrix from threadgroup memory (#71) (@maxwindiff)
  • Conditionalize the MtlDeviceArray element-type workaround. (#72) (@maleadt)
  • Add basic SIMD shuffle up/down (#73) (@max-Hawkins)
  • Update manifest (#74) (@github-actions[bot])
  • Optimize warp reduction for mapreduce (#75) (@max-Hawkins)
  • Specialize GPUArrays.global_index() to improve broadcast performance (#76) (@maxwindiff)
  • Update manifest (#78) (@github-actions[bot])
  • Add initial performance shader support (matmul) (#80) (@max-Hawkins)
  • Use Ninja to build cmt. (#81) (@maleadt)
  • Update manifest (#83) (@github-actions[bot])
  • Support Julia 1.9 (#85) (@maleadt)
  • Add queue parameter to unsafe_copyto (#88) (@tgymnich)
  • Update manifest (#91) (@github-actions[bot])
  • Add MPS tests. (#92) (@maleadt)
  • Support for writing binary archives (#94) (@maleadt)
  • Support precompilation and loading on non-Apple hardware (#98) (@maleadt)
  • Update manifest (#99) (@github-actions[bot])
  • Improve reduce performance by passing CartesianIndices and length statically (#100) (@maxwindiff)
  • Do not release objects that are autoreleased. (#102) (@habemus-papadum)
  • Fix path to cmt in Hacking Section of the Readme (#105) (@habemus-papadum)
  • Add example showing Metal and Gtk4 integration (#106) (@habemus-papadum)
  • Fix memory leak. (#107) (@habemus-papadum)
  • Add a mtl function for simple recursive data conversions. (#114) (@maleadt)
  • Write profile trace in the current folder. (#115) (@maleadt)

Metal.jl - v0.1.2

Published by github-actions[bot] about 2 years ago

Metal v0.1.2

Diff since v0.1.1

Closed issues:

  • installation issue (libz.1.dylib not found) [+workaround] (#51)
  • Optimally choosing threads and grid (#54)

Merged pull requests:

  • Use Base.active_project. (#43) (@maleadt)
  • Update manifest (#45) (@github-actions[bot])
  • Add aliases MtlVector and MtlMatrix (#48) (@amontoison)
  • Update manifest (#49) (@github-actions[bot])
  • Wrap at-metal's output in a let block. (#50) (@maleadt)
  • Update manifest (#52) (@github-actions[bot])
  • Update manifest (#56) (@github-actions[bot])

Metal.jl - v0.1.1

Published by github-actions[bot] over 2 years ago

Metal v0.1.1

Diff since v0.1.0

Closed issues:

  • Super slow broadcast (#39)

Merged pull requests:

  • Fix typos in unified memory example (#37) (@pitmonticone)
  • Fix the launch heuristic. (#40) (@maleadt)

Metal.jl - v0.1.0

Published by github-actions[bot] over 2 years ago

Metal v0.1.0

Diff since v0.0.1

Metal.jl - v0.0.1

Published by github-actions[bot] over 2 years ago

Metal v0.0.1

Closed issues:

  • error when using (#1)
  • Argument buffer encoding is fragile (#5)
  • LLVMType of MtlDeviceArray needs changing/manipulation (#6)
  • Errors running on M1 Max (#14)
  • I get this, my name isn't Tim (#16)
  • Thanks for the previous fix - had a go (#18)
  • Custom IR verification (#25)
  • cmt: Release build fails install (#27)

Merged pull requests:

  • Add device_code_metallib macro (#3) (@max-Hawkins)
  • Update README (#8) (@max-Hawkins)
  • Implement GPUArrays launch heuristic (#9) (@max-Hawkins)
  • Add docstrings (#12) (@max-Hawkins)
  • Rework metadata generation (#13) (@maleadt)
  • Add CI (#19) (@maleadt)
  • Use sw_vers to query the macOS version. (#20) (@maleadt)
  • Updates for macOS 13 (Ventura); use bindless argument buffers (#23) (@maleadt)
  • Enable the GPUArrays test suite (#24) (@maleadt)
  • Use cmt from pre-built JLL. (#28) (@maleadt)
  • Package updates (#29) (@maleadt)
  • First test with a locally-built cmt. (#30) (@maleadt)
  • Use labels to determine whether to build local deps. (#31) (@maleadt)
  • Bump GPUArrays. (#32) (@maleadt)
  • MTL wrapper clean-ups (#33) (@maleadt)