This project is a set of libraries designed to work with LDC to enable native execution of D on GPUs (and other, more exotic OpenCL targets such as FPGAs and DSPs; hereafter just 'GPUs') on the OpenCL and CUDA runtimes. As DCompute depends on developments in LDC for code generation, a relatively recent LDC is required: use 1.8.0 or newer.
There are four main parts:
Run dub test (see dub.json for the configuration used).
Kernel:
@kernel void saxpy(GlobalPointer!(float) res,
                   float alpha,
                   GlobalPointer!(float) x,
                   GlobalPointer!(float) y,
                   size_t N)
{
    auto i = GlobalIndex.x;
    if (i >= N) return;
    res[i] = alpha*x[i] + y[i];
}
Invoke with (CUDA):
q.enqueue!(saxpy)
    ([N,1,1],[1,1,1]) // Grid & block & optional shared memory
    (b_res,alpha,b_x,b_y, N); // kernel arguments
This is equivalent to the CUDA code:
saxpy<<<N,1,0,q>>>(b_res,alpha,b_x,b_y, N);
For more examples and the full code, see source/dcompute/tests.
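When experimenting with the saxpy kernel above, it can help to have a plain host-side version of the same computation to compare the GPU results against. The following is a minimal sketch in ordinary D that does not use DCompute at all; saxpyReference is a hypothetical helper name chosen for illustration, not part of the library.

```d
import std.stdio : writeln;

// Host-side reference for the saxpy kernel: res[i] = alpha*x[i] + y[i].
// saxpyReference is a hypothetical name used for illustration only;
// it is not part of DCompute.
float[] saxpyReference(float alpha, const(float)[] x, const(float)[] y)
{
    assert(x.length == y.length, "x and y must have the same length");
    auto res = new float[](x.length);
    foreach (i; 0 .. x.length)
        res[i] = alpha * x[i] + y[i];
    return res;
}

void main()
{
    float[] x = [1.0f, 2.0f, 3.0f];
    float[] y = [10.0f, 20.0f, 30.0f];
    // Print the reference result for a small input.
    writeln(saxpyReference(2.0f, x, y));
}
```

After copying the device buffer holding res back to the host, its contents could be compared element by element against the array returned here, ideally with a small tolerance, since floating-point results on the device may differ slightly from those on the CPU.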
To build DCompute you will need:
$ dub build
or add "dcompute": "~>0.1.0" to your dub.json, or dependency "dcompute" version="~>0.1.0" to your dub.sdl.
If you get an error like Error: unrecognized switch '-mdcompute-targets=cuda-210', make sure you are using LDC and not DMD: passing --compiler=/path/to/ldc2 to dub will force it to use /path/to/ldc2 as the D compiler.
A DMD-compatible D compiler (dmd, ldmd or gdmd; the latter two are available as part of LDC and GDC respectively) and CMake are also required if you need to build LDC yourself.
Please see the documentation.
Generate OpenCL builtins from here