GPU Compute Basic Algorithms

Restating what was said above. This is a really timely and important contribution. Two things I would request:

  1. It would be good to consider how these things would get lowered to SPIR-V. @antiagainst, @ThomasRaoux, @hanchung and I can help with that. As mentioned by everyone, having blocks that can be composed to implement the above algorithms would be super useful.
  2. Within IREE specifically, we are looking at targeting SPIR-V from the Vector/Linalg dialects. For the kernel-side code generation we can go to the GPU dialect and then SPIR-V too (which is in many ways preferable). So it would be really interesting to hook up these algorithms to the IREE codegen pipeline.

Cool! We have basic prototypes of Reduction and Convolution within IREE, and as Mahesh said, we’re happy to help with this. We can prototype the rest, and also get some improvements for convolution and reduction. We recently had a brief discussion about improving reduction in IREE. We can also start from that tree-based algorithm or see how to put these optimizations together.

Great!

One thing I hope we can discuss deeply is the tradeoffs that appear between the side-effecting memory world (memrefs of scalars) and the value world (n-D vectors).
We will need both in the longer term, but note that n-D vector SSA values allow designing abstractions that map without surprises to efficient HW primitives + unroll-and-jam.
Think of it as llvm.matrix intrinsics on steroids: it becomes quite feasible to create building blocks that operate at peak, and compose, on various HW (not just GPUs :slight_smile: ).
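To make the contrast concrete, here is a minimal sketch of an element-wise add in each world (syntax is approximate, roughly current upstream MLIR; the memrefs, bounds and constants are assumed to be defined elsewhere):

```mlir
// Memory world: scalar loads/stores on memrefs, ordered by side effects.
scf.for %i = %c0 to %n step %c1 {
  %a = memref.load %A[%i] : memref<?xf32>
  %b = memref.load %B[%i] : memref<?xf32>
  %s = arith.addf %a, %b : f32
  memref.store %s, %C[%i] : memref<?xf32>
}

// Value world: one SSA value per n-D vector, no side effects in between,
// which composes naturally with unroll-and-jam and HW primitives.
%va = vector.transfer_read %A[%c0], %f0 : memref<?xf32>, vector<128xf32>
%vb = vector.transfer_read %B[%c0], %f0 : memref<?xf32>, vector<128xf32>
%vs = arith.addf %va, %vb : vector<128xf32>
vector.transfer_write %vs, %C[%c0] : vector<128xf32>, memref<?xf32>
```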

First of all, thanks for the interest and the great discussion (you can find the slides here). I summarized the ideas and questions which arose:

  • Parametrization & Tuning: Can attributes be used to have generic implementations which support different trade-offs? This includes things like memory / register pressure vs. performance, or faster but redundant parallel execution vs. slower but more energy-efficient sequential execution. So far, this is the most uncertain point. It might even be a research topic.
  • Homogenization: Can the operations be defined in a uniform way which is agnostic to the hierarchy, maybe even across different dialects? Attributes should be able to do this; I will come back to this point in my next post.
  • Composition & Reusability: Many of the proposed operations depend on / are built from one another. Can we define them in a way that reflects this? The goal is to support higher-level optimization instead of having low-level-optimized black-box implementations. This can probably be done by lowering these operations progressively inside the same dialect, but without an implementation it is hard to tell.
  • Hardware Diversity: Is this limited to GPUs or can it be extended to CPUs and other types of hardware? The consensus was that it should be done in both the GPU and the Vector dialect, because both seem to be possible and it also avoids over-fitting the design.
  • Dialects & Libraries: How are these operations exported to the outside? They probably need to go through the entire stack, as they also require memory and buffer bindings as well as optimizations such as fusion. It is not clear yet if they would fit into the existing dialects or require their own for the higher levels. However, this might become more obvious once the lower levels (GPU and Vector) are implemented.

Please correct me if there are any errors or misunderstandings in the summary.


Some questions.

  1. How does MLIR plan to implement these ops? Is it that, as part of progressive lowering, we “decompose” ops like “scan” into micro-ops in a descendant dialect which eventually realizes the op?

  2. How do we target libraries in MLIR? Say I want to use a scan implementation from a specialized library. How would that work, if we intend to support it?

It doesn’t always have to be micro-ops in descendant dialects, but indeed, progressive lowering + transformations that break up an op into smaller ops seem to be a reasonable way to get started. You could look at how vector.contract gets progressively lowered to either:

  1. vector.matrix_multiply -> llvm.matrix_multiply
  2. vector.outerproduct + vector.transpose -> insert/extract/fma
  3. directly to insert/extract/fma

From any of these levels it would make sense to go to a HW-specific dialect, e.g. GPU.
However, there are also implications on chains of ops with sources / sinks to memory (e.g. load/store and vector.transfer); see e.g. @ThomasRaoux’s commits to IREE for a bit of the gradient here.
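For concreteness, a small illustrative `vector.contract` in matmul configuration (shapes made up, syntax approximate) that the lowering paths above would start from:

```mlir
#mA = affine_map<(m, n, k) -> (m, k)>
#mB = affine_map<(m, n, k) -> (k, n)>
#mC = affine_map<(m, n, k) -> (m, n)>
// 2-D vector SSA values; the contraction is later rewritten along one of
// the paths listed above.
%d = vector.contract {indexing_maps = [#mA, #mB, #mC],
                      iterator_types = ["parallel", "parallel", "reduction"]}
       %a, %b, %c : vector<4x8xf32>, vector<8x4xf32> into vector<4x4xf32>
```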

There is also some WIP that adds a lowering to vector.reduce.

A lot more work is needed to get a meaningful set of useful primitives such as described by @Lichtso .

For example, Linalg supports a primitive lowering to library calls for semantically named ops. This needs to be extended (e.g. as discussed here), but it shows that starting from high-level, semantically charged ops, we can mix transformations, codegen and library calls. This is also why I have been talking about ops whose semantics are captured by attributes: this allows building generic mechanisms to mix codegen + library calls.

I imagine similar mechanisms can be generalized and serve the same purpose. However, it is unclear at this point whether all/most of the ops discussed in the meeting have such attribute-based representations, but the less the compiler / analyses and transformations need to know about e.g. my_special_scan_and_conv_op, the better.

Thanks @Lichtso for this nice summary of the discussion!

Thanks Nicolas. That makes sense!

As promised, here is my take on the homogenization point from before. In my next post I want to talk more about composition and reusability.

Hierarchy Levels

I am not sure how to name these exactly, but they should definitely not be called thread, vector, lane, warp, wavefront, block, grid, etc. These terms are confusing as they can mean different things depending on the vendor, the technology, and whether they refer to CPUs or GPUs.

Nonetheless, it should be possible to define an attribute enum for these 4 levels, which can be used in the GPU and Vector dialects equally. This attribute would then be used to progressively lower the operations onto their own equivalent of the next lower level.

Single Element (Lowest Level)

  • terms (GPU): invocation (Vulkan), lane, thread
  • terms (CPU): vector element
  • Memory type: Register, temporary thus SSA-only
  • Dimensionless (always 1D)
  • Concurrency: Simultaneous / parallel and implicitly synchronized
  • Possible interactions: Bits inside the element

Single Processor

  • terms (GPU): subgroup (Vulkan), warp, wavefront
  • terms (CPU): SIMD vector, thread
  • Memory type: Register, temporary thus SSA-only
  • Dimensionless (always 1D)
  • Concurrency: Simultaneous / parallel and implicitly synchronized
  • Possible interactions: All elements in the same processor

Multi Processor

  • terms (GPU): workgroup (Vulkan), thread block
  • terms (CPU): (sliding) window
  • Memory type: Shared, temporary thus SSA-only
  • Dimensional (fixed to 3D on GPUs)
  • Concurrency: Simultaneous / parallel but requires explicit synchronization
  • Possible interactions: All elements which are currently being processed

Entire Array (Highest Level)

  • terms (GPU): global work (Vulkan), grid
  • terms (CPU): n-D vector (MLIR Vector Dialect), array
  • Memory type: Global / Buffer, persistable thus memref
  • Dimensional (fixed to 3D on GPUs)
  • Concurrency: Sequential / serial
  • Possible interactions: Only from past to future

Fusion

The temporal / sequential axis can be lowered from the array level down to the element level in some cases (such as a scan followed by some element-wise arithmetic): instead of doing one pass for the scan kernel, storing and loading to / from buffers, and then another pass for the arithmetic kernel, everything can be fused into one kernel and one pass. This lets the values stay in registers and avoids the slow access to global memory in the middle. But it also means that part of the temporal / sequential axis then happens inside each element.
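As a purely hypothetical illustration of this fusion, with invented `compute.*` ops written in MLIR’s generic syntax just to show the data flow (none of these ops exist):

```mlir
// Unfused: the scan result makes a round trip through global memory
// between the two kernels.
%tmp = "compute.scan"(%in) {kind = "add", inclusive = true}
         : (memref<1024xf32>) -> memref<1024xf32>
%out = "compute.map"(%tmp) {fn = "scale"} : (memref<1024xf32>) -> memref<1024xf32>

// Fused: one kernel; each element's scan result stays in a register and
// feeds the element-wise arithmetic directly, so part of the sequential
// axis now lives inside each element.
%fused = "compute.scan_map"(%in) {kind = "add", inclusive = true, fn = "scale"}
           : (memref<1024xf32>) -> memref<1024xf32>
```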

Fixed Dimensions in GPUs

The fixed dimensions for the two upper levels of the hierarchy are somewhat problematic:

  • They are fixed to exactly 3 axes, so they are a bad fit for generic n-D vectors / arrays.
  • The maximum number of elements along each axis can differ from the other two. This is especially problematic for transpose operations.
  • The y and z axes are sometimes almost unusable / ridiculously limited (on some mobile devices).

For these reasons it might be better to only use the x axis and simulate n-D instead.

The Multi Processor Level on CPUs

I could not find anything on multi-processor collaboration in the Vector dialect or anything else CPU-related. Is it just not planned / implemented, have I overlooked something, or is it infeasible?

Overall this looks good to me! It seems like it would be nice to have this as a doc somewhere (maybe here: https://mlir.llvm.org/docs/Dialects/GPU/ ?); I remember being confused by the Vulkan terminology coming from CUDA.

On the terminology, we should stay aligned with the GPU dialect. If we redefine the levels we should do it there accordingly.
Your single-processor / multi-processor levels are somewhat of a misnomer to me: a thread block runs on a single multiprocessor (in CUDA terminology), as far as I understand. Also, you mentioned for the multi-processor level “Possible interactions: All elements which are currently being processed”; it isn’t clear if this refers to all elements in the entire array: it should be all elements in the block.

What about LockStep and Processor for these two levels?

I am not sure what you meant by that.
Wouldn’t it be best to find names that match the GPU and CPU (Vector dialect) equally?

Lockstep goes in the right direction, but it is more of a phenomenon than a property. While the second-lowest level is definitely always in lockstep, the level above could also happen to be in lockstep, depending on the scheduling and synchronization. And yes, you are right that the term “processor” can mean totally different things depending on the vendor and implementation. So it is probably best not to use it at all.

I wanted to make it clear that the “block” / “work group” is not just a physical location but also has a temporal aspect. It logically traverses the array (at the highest level) but can stay at the same physical location. That is why I used the word “currently” in “All elements which are currently being processed”.

Maybe the perspective of scheduling (similar to your idea of lockstep) could be used for the naming. As far as I know, on GPUs the dynamic scheduling happens on the third (lowest) level. This would also be the same for the CPU side of things.

Trying to glob CPU/GPU together might not be the best way to implement the algorithms the original post described. To get efficient performance, the algorithms on CPU and GPU could vary quite significantly. So at this level (close to the GPU dialect) I would trade off portability for performance. I did see that the consensus was to do it in both, and I will go with the consensus, but my overall sense from the post is that the algorithms being implemented are GPU-first. Given that, I agree with Mehdi that it is better to keep the terminology closer to GPU / the GPU dialect. Since the GPU dialect is to map to both SPIR-V and the LLVM dialect (NVVM actually), OpenCL/Vulkan terminology is preferable (though there is some CUDA terminology in the GPU dialect like grid_dim and block_dim, which is legacy and no one has had the time to address it).

It would be better not to re-invent terminology. Vulkan terminology is agreed upon by all vendors that support Vulkan (which includes NVIDIA). We could go with workitem/subgroup/workgroup, which map to thread/warp/block in CUDA.

+1. I remembered we used to have such tables in some internal docs but they were not migrated out together with the open sourcing. Agreed that it would be nice to use this opportunity to have such a terminology map. And @Lichtso already did it nicely! :slight_smile:

+1. We already have 4+ columns (AMD, NVIDIA, OpenCL, Vulkan, …) in the terminology map; it would be nice to avoid creating another one, at least in the GPU domain. Since we are aiming to map to different lower-level IRs, settling on a set of terms that follows a standard would be preferable.

Ah, I didn’t get this angle, I see what you mean. That said, does the CPU/Vector dialect already have naming for this hierarchy?
The trade-off seems to be either introducing a new terminology, or using the GPU one and doing the mental mapping when targeting CPUs. I can see pros and cons for both.
I am fairly allergic to Vulkan terminology personally (“invocation” is a terrible term for a SIMD lane, as “invocation” is already used in many contexts, including “invoking a kernel” for example), but if we already use it extensively in the GPU dialect, it would lower the mental load for everyone to use it everywhere consistently, I think.

I have been prototyping these algorithms over the last few days to get the composition right. Apart from the 4 primitives which were already discussed (element-wise compute kernel, reduction, exclusive / inclusive scan), I found 2 additional ones to be very helpful: gather and scatter.

Gather is a permutation and scatter is an inverse permutation; each is the inverse of the other. Additionally, they can also duplicate or discard elements, not just reorder them. For gather this means that the index array can have fewer elements than the maximal index in it plus one, resulting in elements being discarded. Likewise, for scatter a condition mask can be supplied to discard some elements.

They could be implemented as:

  • register moves on the lowest hierarchy level
  • shuffle on the second level
  • load from (gather) / store to (scatter) shared memory on the third level
  • load from (gather) / store to (scatter) buffer / global memory on the fourth level

(I think the vector.shuffle is already a gather as I described it)

Gather and scatter together allow defining the exchange of information between elements without using memref, purely in SSA style. Is that what you meant by “the side-effecting memory world” and “the value world”?
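For illustration, a hedged sketch of what such value-level gather / scatter could look like, with invented `compute.*` ops in MLIR’s generic syntax (nothing here is an existing op):

```mlir
// Gather: result[i] = src[idx[i]]; the index vector may repeat or skip
// indices, so elements can be duplicated or dropped.
%g = "compute.gather"(%src, %idx)
       : (vector<8xf32>, vector<4xi32>) -> vector<4xf32>

// Scatter in value form: result = dst, with result[idx[i]] = src[i]
// wherever %mask[i] is set. No memref is involved, everything stays SSA.
%s = "compute.scatter"(%src, %idx, %mask, %dst)
       : (vector<4xf32>, vector<4xi32>, vector<4xi1>, vector<8xf32>) -> vector<8xf32>
```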

Thanks @Lichtso for digging deeper,

At this point I would be interested in seeing some of the prototypes you mention having in progress, and seeing how things would connect, from some level (it doesn’t have to be the top level; it is probably better to start with the lowest level) down to execution.

In particular, it seems that in your proposal the vector level is only at the highest level and that you have more specific entities at lower levels (new types?). Note that MLIR does not have any register guarantee; it has SSA values of different types, and memory. Any register promotion / allocation is done in LLVM, and one thing we can do from an MLIR perspective atm is to coarsen the granularity of these values (up to the point where LLVM would spill). A vector.contract that operates on 2-D vectors is an example: starting from 1x1 and coarsening yields better and better arithmetic intensity until it gets too big and starts spilling (e.g. 6x16x8xf32 on avx2 and 8x32x32 on avx512 when lowered to vector.outerproduct (i.e. vector.broadcast + vector.fma)).

I seem to understand that you are proposing ops with semantically bearing names + attributes that describe mapping to hardware as “the representation of these primitives”?

I’ll describe some of the things we have in flight to maybe help bridge the delta between what I think you propose and what we have today. As far as basic compute primitives go, I have been thinking in terms of target-independent ways of expressing the computation and how it breaks down into smaller pieces (both in some in-memory form and in SSA-values form). Then, mapping at some level of the hierarchy would be obtained by applying transformations and targeting a primitive of appropriate granularity (i.e. loops + processor ID information, memory accesses, SSA values and special instructions). Such transformations could be driven by special mapping attributes (which atm do not exist in core but may be close to what you are proposing).

As an example, let’s take the WIP flow for linalg.matmul to SPIRV. It resembles:

  1. linalg.matmul -> tile, promote (with padding) and assign loops to blocks -> loops + blockIdx... + copies to shared memory + linalg.matmul (this is done in a few steps but I cram everything in step 1.)
  2. rewrite as vectors into vector.transfer + vector.contract
  3. unroll vector operations to some good sizes for the HW (e.g. what comes in step 4.)
  4. map vector.contract to cooperative matrix instruction (e.g. 1 warp)

As these steps occur, canonicalizations, memory->value forwarding and injection of static behavior happen (e.g. the padding to fixed size which enables vectorization even on boundary tiles).
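To make the flow a bit more tangible, here is a very rough, hypothetical sketch (shapes, tile sizes and names invented; syntax approximate) of what the IR after step 2 might resemble, before the unrolling and mapping of steps 3 and 4:

```mlir
#mA = affine_map<(m, n, k) -> (m, k)>
#mB = affine_map<(m, n, k) -> (k, n)>
#mC = affine_map<(m, n, k) -> (m, n)>
// One 16x16 tile of C per block; the k-loop carries the accumulator as a
// vector SSA value, reading tiles of A and B via vector transfers.
%bx  = gpu.block_id x
%by  = gpu.block_id y
%row = arith.muli %by, %c16 : index
%col = arith.muli %bx, %c16 : index
%res = scf.for %k = %c0 to %c1024 step %c32
         iter_args(%acc = %init) -> (vector<16x16xf32>) {
  %lhs = vector.transfer_read %A[%row, %k], %f0 : memref<?x?xf32>, vector<16x32xf32>
  %rhs = vector.transfer_read %B[%k, %col], %f0 : memref<?x?xf32>, vector<32x16xf32>
  %new = vector.contract {indexing_maps = [#mA, #mB, #mC],
                          iterator_types = ["parallel", "parallel", "reduction"]}
           %lhs, %rhs, %acc : vector<16x32xf32>, vector<32x16xf32> into vector<16x16xf32>
  scf.yield %new : vector<16x16xf32>
}
vector.transfer_write %res, %C[%row, %col] : vector<16x16xf32>, memref<?x?xf32>
```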

In this scheme things are not yet connected in a generic fashion: mapping a vector.contract to threads has implications on how other dependent vector ops must be mapped (TL;DR, there seems to be a need to follow individual scalar SSA values within the vector all the way to memory). So the way this ends up materializing is that SPIR-V has a special type that propagates into the loads/stores, and at the memory level we can bitcast and “all is fine”. So what we have right now is quite a custom lowering to get started.

Note that `vector.contract` is more general because it has parallel/reduction iterators and a permutation to map vector dimensions to iterators. More generally, `vector.contract` could/should be augmented with a region, and then it becomes a "structured op on vectors", similar to linalg.generic which is a "structured op" on buffers. There are representational caveats however, related to SSA values and reductions, that also pop up in the "structured ops on tensors".

However, the general problem is by no means solved.
It seems that a “representation of mapping to resources in the type system” that informs transformations is something that would be needed and could match some of your proposal. Historically, using affine maps to represent such information has worked reasonably well.

The flow described above is “top-down” and relies on a lower-level representation for vector.contract which carries the mapping to threads / warps (i.e. the SPIRV CooperativeMatrix). That type/op “knows about” the mapping, but does not describe it in a generic way.

If I understand the way you naturally described the primitives, it seems to take a “bottom-up” approach. This is great for building unsurprising abstractions that work, and it seems like it would fit the bill for building more general representations for scan/reduce/FFT/transpose/cooperative-matrix type “mapped ops”. How the “mapping attribute” is represented, how it is used to connect to actual HW entities (I would think SSA values at some point in the process (e.g. thread/processor id)) and how it connects to the upper levels is still an open problem AFAIK. Do you already know of good representations that are known to work and compose?

In my experience iterations consisting of switching hats between “top-down” and “bottom-up” while trying to connect pieces that we understand well yields the most satisfying results by decomposing into smaller well-understood problems. Another way to view this is that top-down is transformation/decomposition-driven and bottom-up is maybe more metaprogramming/composition-driven. A lowering is something special that sits at the intersection (e.g. see how vector.contract lowers to vector.broadcast + vector.fma: it is applied as a transformation but it is really metaprogramming a contract with insert/extract/broadcast and fma).
The tradeoff seems to be that tiling, fusion, moving memory around seem more prone to generalizing in a transformation setting whereas mapping to some HW dimension, rewriting X as smaller Ys and data distribution “seem simpler” in a metaprogramming setting.

I’ll stop the digression here because it may be too disconnected from your proposal but I am looking forward to seeing the concrete examples that your lowest-level abstractions would take and how they connect to high-performance execution on HW.

Exciting stuff!

We definitely want gather and scatter; however, I am afraid vector.shuffle is not it. That instruction closely models LLVM’s, and the constraint is that the pattern is fully static.

This is one of the tricky disconnects and does relate to value vs. memory: values are generally indexed with static indices. The exception is (1-D)-flattened LLVM vectors, which can be insertelement/extractelement one scalar at a time. This relates to the deeper corners of the Vector dialect deeper dive.

I think the fundamental reason is that values boil down to registers for which one must know the number statically to emit assembly (e.g. %reg0, %reg1 and not reg[%reg0]) vs memory where load/store from a (sometimes aligned) pointer works.
You can see this materialize into spills in lmem in CUDA when “your implementation of blah” does not fully unroll. CUDA likes to give the impression of an array of registers per thread but will spill when register numbers can’t be determined statically.

The vast majority of gather/scatter semantics go through memory.
Something close is vector.transfer, which lowers to llvm.masked.load / llvm.masked.store intrinsics.
Similar instructions for gather / scatter that would lower to the proper llvm.masked.* intrinsics would be great to have.
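For reference, a minimal sketch of that existing value <-> memory bridge (syntax approximate; the memrefs, indices and padding scalar are assumed to be defined):

```mlir
// Transfer read/write with a padding value; when the bounds are not known to
// divide the vector size, the LLVM lowering can fall back to masked
// load/store style intrinsics.
%v = vector.transfer_read %A[%i], %f0 : memref<?xf32>, vector<8xf32>
vector.transfer_write %v, %B[%i] : vector<8xf32>, memref<?xf32>
```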

This is for CPU where I don’t think there is such a thing as a portable in-register gather/scatter (different ISAs may have their special intrinsics though).

For GPU, I imagine you are thinking about cross-warp + ballot at the lowest level of granularity?
Note that despite what it says, I think warp-level synchronization is just an illusion and still goes through the memory unit (at least it was when I used warp shuffles for FFT back in 2014…).

Maybe it’s all about giving the illusion of flexible memory accesses + register performance to a user (but it seems to scale to 32 for the past ~10 years?).
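(For reference, the GPU dialect already exposes a warp/subgroup-level shuffle roughly along these lines; the snippet below is only an approximate sketch of its shape:)

```mlir
// Butterfly exchange within a subgroup: each invocation reads the value held
// by the lane whose id differs by the bits of %offset; %valid flags whether
// the source lane was active.
%shfl, %valid = gpu.shuffle xor %val, %offset, %width : f32
```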

The idea is to have one abstract type (n-D vector / multidimensional array) and have it lowered to different memory / value types depending on its size, the hierarchy level it occurs in and the target architecture.

E.g:

  • a level 4 (grid) reduction operates on a memref array and decomposes into
  • a dynamic set of level 3 reductions which split the original array to operate on shared memory
  • a fixed set of level 2 reductions which split the shared memory to operate on SIMD-vectors
  • depending on the target architecture: a fixed set of level 1 reductions which split the SIMD-vectors to operate on registers, or just a single SIMD instruction.

All of these levels could still be talking about generic arrays, just different sizes of them to fit in their specific actual memory type. However, that requires marking either the values, types or operations as being on a specific level in the hierarchy. That might mean that we have to use multiple or polymorphic types to represent this marking / the attributes.
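A hedged sketch of what such level markings might look like, using invented op and attribute names in MLIR’s generic syntax (nothing here exists upstream; sizes are made up):

```mlir
// The same named op at each level, distinguished only by a "level" attribute
// that drives the progressive lowering to the next level down.
%sum  = "compute.reduce"(%buf)  {kind = "add", level = "grid"}
          : (memref<1048576xf32>) -> f32
// ...decomposes into workgroup-level reductions over shared-memory tiles...
%part = "compute.reduce"(%tile) {kind = "add", level = "workgroup"}
          : (memref<4096xf32, 3>) -> f32
// ...which decompose into subgroup-level reductions on vector SSA values.
%lane = "compute.reduce"(%vec)  {kind = "add", level = "subgroup"}
          : (vector<32xf32>) -> f32
```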

Yes, the operation name describes what it does and the attributes describe how it is done.

The way I am currently prototyping is: I started with a bunch of common high level problems and implemented them with the proposed primitives to get a decomposition. This way I hope to get the interfaces of the primitives and their expected behavior right first, and only then think about their realization on the hardware.
The primitives are inspired by this paper: Single-pass Parallel Prefix Scan with Decoupled Look-back. Take a look at section 5.2; there you will also find some of the algorithms which can be built using these primitives.

Ah, good to know! Indeed, gather and scatter only make sense if they can handle dynamic indices.

Actually, this would be the second level (sub groups) of the hierarchy.

Yes, to some degree. Often the e.g. 64 lanes are actually 16 4-lane SIMD vectors executed in sequence, or something like that. But ultimately it does not matter, as the ISA hides that away and lets them appear as happening in lockstep / at once.

I think what I want to find is a definition which enables the user to describe:

  • an abstract data flow
  • in a bundled manner (as multi-dimensional arrays)
  • and express lane-changes / swaps / swizzles / shuffles of these elements
  • without explicitly referring to memory (functional style).

Yup, naming is the #1 hard problem in computer science. :wink: OpenCL terminology is also a viable choice here, as invocation/thread/block/etc. are all very overloaded terms. I personally feel it’s clean and consistent. It’s also better established.

On the SPIR-V side we also have an unlimited number of virtual registers. The real register allocation happens in the GPU driver compilers. Generally we also don’t want to put too much pressure on VGPRs; otherwise it would hurt GPU occupancy and limit the GPU’s ability to leverage zero-cost context switches to hide memory latency.

Just wanted to say that this is not the only way to lower a tiled small matmul to SPIR-V at the subgroup level. As with many things in Vulkan/SPIR-V, cooperative matrix is an extension that requires special capabilities. It’s vendor-specific right now. We’d like to use it whenever possible, but for the cases where we don’t have it, we still need to leverage what we have at hand as much as possible. Subgroup operations are core to Vulkan 1.1 and are available on both desktop and mobile GPUs. Converting tiled small matmuls into these subgroup operations can potentially give us great performance. I believe these subgroup operations are what @Lichtso is trying to use. So this is an awesome path to build out in parallel to the cooperative matrix path.

Not sure I get this, but I might be missing something. :slight_smile: My impression is that at the subgroup level (level #2) we are modelling subgroups, so it should stay with whatever native subgroup size the hardware provides? (The size itself can be a parameter according to the hardware, for sure. In SPIR-V we have constructs allowing querying such information, and it is used to drive pattern application. We can have similar fields for subgroup size, for example, and that information is exposed by the hardware via Vulkan API calls.) Using a multiple of the native subgroup size seems to be an upper level’s concern. And the subgroup operations do have hardware support; for example, see AMD’s. IIRC, these originated from needs in graphics: it’s quite common for graphics to do sampling that needs to access neighboring pixels, so being fast really matters there.

That is correct.

I recently helped implement the VkPhysicalDeviceSubgroupProperties structure in MoltenVK. It turns out that while you can get the hardware width, there can also be varying sizes per dispatch: VK_EXT_subgroup_size_control.html.