According to the post “GPU code generation status: NVidia, OpenCL”, the GPU dialect is aimed at providing support for the host-device ABI and for launching kernels.
I’ve been playing with the GPU dialect, but I don’t understand how to lower it. Right now I have a pass that inserts a `gpu.alloc` op, like this:

```mlir
%memref = gpu.alloc () : memref<200x100x600xf32>
...
gpu.dealloc %memref : memref<200x100x600xf32>
```
Then I tried to lower this `gpu.alloc`/`gpu.dealloc` pair to the MLIR LLVM dialect using:

but it does nothing: the pass runs without errors, yet I get the same MLIR code back. Does that mean the GPU dialect cannot be converted to a lower-level dialect? It can’t be translated to LLVM IR directly either, because if I try that, I get:
```
cannot be converted to LLVM IR: missing `LLVMTranslationDialectInterface` registration for dialect for op: gpu.alloc
```
So how am I supposed to lower the GPU dialect?
PS: I expected `gpu.alloc` to be lowered to some `llvm.call` to `cudaMalloc` (allocating on the device). However, since the GPU dialect is supposed to work with AMD too, I don’t know where I should specify that I want to target NVIDIA.
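Concretely, this is roughly the lowered IR I imagined getting, purely as a hypothetical sketch: the `@cudaMalloc`/`@cudaFree` callees, the size constant, and the types here are my own guesses, not anything I have actually seen MLIR emit:

```mlir
// Hypothetical lowering I expected for the alloc/dealloc pair above.
// 200 * 100 * 600 elements * 4 bytes per f32 = 48000000 bytes.
%size = llvm.mlir.constant(48000000 : i64) : i64
%ptr = llvm.call @cudaMalloc(%size) : (i64) -> !llvm.ptr
...
llvm.call @cudaFree(%ptr) : (!llvm.ptr) -> ()
```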