[MLIR] Clarifications about memrefs / vectors / tensors

Hello everyone.
I’m quite new to the LLVM & MLIR world, and although I read the documentation multiple times I still have some doubts about vectors, tensors and memrefs. Is there any example demonstrating when each of them are best suited to be used? From what I’ve understood, memref is a generic reference to a buffer (allocated with alloc or alloca), but what about tensors and vectors? How can I instantiate them?
For example, if I want to declare a function, would it better to declare it as

func @foo(%arg0: vector<2xf32>) -> vector<2xf32> {
  return %arg0 : vector<2xf32>
}


func @foo(%arg0: memref<2xf32>) -> memref<2xf32> {
  return %arg0 : memref<2xf32>
}


In the first case, if I want to modify the vector elements, do I have to rely on the vector dialect? And if I want to return a new vector, how can I allocate and populate it (alloc and alloca both return memrefs)?
In the second case, instead, do I have to use loads / stores only?

To clarify more about my doubts: considering a C function like foo(float v[2]), emitting its LLVM IR gives

define void @foo(float* %0)

which is none of the previous cases. I suppose the vector case is the most similar one, but I’m not sure about this.

Forgive me the surely stupid question, but I’m getting headaches trying to find out when to use what.

Hi mscuttari,

The closest representation of your C code would be a memref. Memrefs represent buffers in memory, similar to a C array. You can load elements from memrefs and operate on them using e.g. the Standard dialect.

Vectors are useful if you want to perform the same operation on multiple data elements simultaneously (vectorization, although MLIR's n-D vectors can be more than simple SIMD). To load and store vectors, you would have to use operations from the vector dialect, and you can then operate on the data using operations from e.g. the Standard dialect.
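As a sketch of what that looks like (using current upstream spellings, where elementwise arithmetic lives in the arith dialect; op and dialect names have shifted across MLIR versions, so treat the exact names as assumptions):

```mlir
// Hypothetical example: load 4 contiguous f32 values as one vector,
// double all lanes in a single elementwise op, and store them back.
func @double(%buf: memref<8xf32>) {
  %c0 = arith.constant 0 : index
  %v = vector.load %buf[%c0] : memref<8xf32>, vector<4xf32>
  %twice = arith.addf %v, %v : vector<4xf32>
  vector.store %twice, %buf[%c0] : memref<8xf32>, vector<4xf32>
  return
}
```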

Tensors are a higher-level abstraction and are useful because they are very common in many computations (e.g. machine learning). Throughout the lowering process (e.g. towards LLVM IR), they get lowered to memrefs.

Also take a look at https://mlir.llvm.org/docs/ConversionToLLVMDialect/#calling-convention-for-ranked-memref to see how memrefs are lowered to LLVM IR.

In short, tensors are treated as single values that live in (virtual) registers and are single-assignment. That is, one cannot write into a tensor, only create a new tensor with some elements changed. This means all dependencies between tensors are visible through SSA use-def chains, and aliasing is not a concern.
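To illustrate the single-assignment property, here is a hedged sketch (assuming the current tensor dialect's tensor.insert op; spellings vary by MLIR version): "writing" an element does not mutate the tensor, it yields a fresh SSA value.

```mlir
// Hypothetical example: updating a tensor element produces a new tensor.
func @set(%t: tensor<4xf32>, %x: f32) -> tensor<4xf32> {
  %c0 = arith.constant 0 : index
  // %t is untouched; %t2 is a new value with element 0 replaced.
  %t2 = tensor.insert %x into %t[%c0] : tensor<4xf32>
  return %t2 : tensor<4xf32>
}
```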

Memrefs are memory buffers that can be read from and written into. This means there is a need for additional dependency analysis beyond use-def (a user of a memref may actually be writing into the memref as a side effect) and aliasing becomes a concern.
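A memref, by contrast, is written in place. A minimal sketch, assuming the current memref dialect ops (older MLIR spelled these std.load/std.store):

```mlir
// Hypothetical example: read-modify-write through a memref buffer.
func @bump(%m: memref<4xf32>, %x: f32) {
  %c0 = arith.constant 0 : index
  %old = memref.load %m[%c0] : memref<4xf32>
  %new = arith.addf %old, %x : f32
  // Side effect: visible to every other user (and alias) of %m.
  memref.store %new, %m[%c0] : memref<4xf32>
  return
}
```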

Vectors are meant for targeting hardware SIMD vectors; they are not equivalent to std::vector or arrays. Conceptually, they are similar to tensors in the sense that they cannot be written into. Unlike tensors, they are tied to some underlying storage and are not expected to have a non-trivial layout. For that reason we can have memrefs of vectors and tensors of vectors, but not memrefs of tensors.

There is no direct equivalent to C pointers at any level above the LLVM dialect in core. But MLIR allows one to define custom types if necessary. Similarly, one can define additional ops and dialects that work with types from any other dialect, so there is no requirement to use the vector dialect to work with vectors, although the dialect likely contains useful operations.

If you need load/store semantics, you likely want memref or a homegrown equivalent.

Thank you @sommerlukas and @ftynse, almost clear to me and I think I’ll use plain memrefs for now.
Just a final question: seeing that memrefs get lowered to a list of 5 elements, how can I easily pass a C array or a C++ std::vector to the ExecutionEngine? I can pass single values by pointer and it works well, but extracting those 5 elements every time seems like overkill. Is there any way to pass the array/vector and let the backend automatically convert it to the required struct?
I attach a small code example to better clarify my intentions:

auto maybeEngine = mlir::ExecutionEngine::create(module);
// ... check error ...
auto& engine = maybeEngine.get();
std::vector<float> myVector;
engine->invoke("foo", ??); // how to pass myVector?

Well, this is a low-level interface that you can wrap in whatever is convenient for your language. The interface is a bit more structured than C, in which one would typically pass in at least a pointer and a list of sizes. Supporting C++ is quite challenging given the complexity of C++ ABI with templates, overloads, multiple/virtual inheritance, etc.

One example is https://github.com/llvm/llvm-project/blob/71699a998d4f648396a1a12820c0f04cc61f8e19/mlir/include/mlir/ExecutionEngine/CRunnerUtils.h#L120 where we have a type compatible with memref descriptors and for which we do some additional mangling + processing: https://mlir.llvm.org/docs/ConversionToLLVMDialect/#c-compatible-wrapper-emission.

I’ve done some tests but it’s not clear to me how to use the StridedMemRefType when invoking the function.
This is the MLIR function taking an array of 2 integers:

func @main(%arg0: memref<2xi32>) -> ()

This works (though it is very verbose to write):

int x[2] = {23, 57};
int* bufferPtr = x;
int* alignedPtr = x;
long offset = 0;
long size = 2;
long stride = 1;

llvm::SmallVector<void*, 3> args;
args.push_back((void*) &bufferPtr);
args.push_back((void*) &alignedPtr);
args.push_back((void*) &offset);
args.push_back((void*) &size);
args.push_back((void*) &stride);
engine->invoke("main", args);

This does not:

int x[2] = {23, 57};
StridedMemRefType<int, 1> arr{x, x, 0, {2}, {1}};
llvm::SmallVector<void*, 3> args;
args.push_back((void*) &arr);
engine->invoke("main", args);

By the way, how can I get some debug output from the runtime execution? llvm::DebugFlag is set to true, but it doesn’t seem to output anything regarding the runtime execution.

Well, depending on whether you requested wrapper emission or not, you need to pass either a pointer to a StridedMemRefType (or an equivalent struct) or a list of independent arguments corresponding to the data it contains, respectively. Both can’t work for the same function signature.

I can’t entirely follow you.

The source MLIR is:

func @main(%arg0: memref<2xi32>) -> i32 {

Which leads to LLVM IR:

define i32 @main(i32* %0, i32* %1, i64 %2, i64 %3, i64 %4) !dbg !3 {

The documentation reports the following:

For each function, creates a wrapper function with the fixed interface
void _mlir_funcName(void **)
where the only argument is interpreted as a list of pointers to the actual
arguments of the function, followed by a pointer to the result.

The converted function is named “main” (and not “_mlir_main”), but apart from this behaviour (which is not yet clear to me), I can’t understand how to avoid that flattening and directly pass the StridedMemRefType. Could you please provide a code example of how to correctly invoke the function? The documentation seems pretty sparse on this.

You are confusing two independent signature rewrites:

  1. memref to LLVM types;
  2. JIT wrapping within LLVM IR.

The former is completely orthogonal to the JIT and ExecutionEngine; you can observe its results by running mlir-opt -convert-std-to-llvm. The latter takes a function with any signature in LLVM IR (it has no way of knowing whether it comes from a memref or not) and adds a wrapper that takes void **, extracts the individual arguments and passes them to the actual function. Finally, since you are calling invoke, it will also look for a function called "_mlir_" + the name passed to invoke (see llvm-project/ExecutionEngine.cpp at 0afdbb4d2dead42df14361ca9f5613d56667481c · llvm/llvm-project · GitHub). The fact that you don’t see it from the API does not mean that the binary function below isn’t named differently.

To avoid memref unpacking, you need to add the llvm.emit_c_interface attribute to the function. This will give you something like

func @_mlir_ciface_main(%arg0: !llvm.ptr<struct<(ptr<i32>, ptr<i32>, i64, array<1 x i64>, array<1 x i64>)>>) {
  // ...
  call @main(...)
}

after the first step. You can invoke this function, including the _mlir_ciface_ prefix in the name, from the execution engine. Note that since the function expects a pointer, and the execution will load each argument from the void ** list, you need to pass a pointer to pointer to memref struct.

Inside the engine (you don’t need to know what happens there to use it), this will first get converted to LLVM IR

define i32 @_mlir_ciface_main({i32*, i32*, i64, [1 x i64], [1 x i64]}*)
define i32 @main(i32*, i32*, i64, i64, i64)

and then both functions will get additional wrappers

define void @_mlir__mlir_ciface_main(i8** %0) {
  ; %2 = gep, bitcast to struct-type ptr ptr, load
  call i32 @_mlir_ciface_main({i32*, i32*, i64, [1 x i64], [1 x i64]}* %2)
  ; ...
}

define void @_mlir_main(i8** %0) {
  ; %2... = gep, bitcast to ptrs to the specific types, load
  call i32 @main(i32* %2, i32* %3, ...)
  ; ...
}

Thank you very much, this explanation made the whole process much clearer to me.
May I ask where such information can be found? I searched a lot but I didn’t find anything about what you explained. Maybe I’m missing some reference.

  1. Built-in Function and MemRef Calling Convention - MLIR
  2. Is derived from the documentation on the ExecutionEngine and ExecutionEngine::invoke, which arguably could be improved. I know the internals because I wrote most of them.