Can't convert strided memref to LLVM

Hello all!
I want to use strided memrefs to guarantee correct 256-bit memory alignment of the beginning of each matrix row (regardless of the row length, which may not be divisible by 256 bits), so that I can use fast aligned 256-bit vector loads/stores.
As I’ve read in the MLIR docs, I create an affine map that explicitly defines the strides of a 2-D memref:

#my_map = affine_map<(d0, d1) -> (d0 * 2048 + d1)>

Then I allocate a strided memref:

%B = alloc() {alignment = 32} : memref<2088x2048xf32, #my_map>

And after that I try to fill it with ones:

%cf1 = constant 1.00000e+00 : f32
linalg.fill(%B, %cf1) : memref<2088x2048xf32, #my_map>, f32

Then I process this MLIR code with the following command:

mlir-opt sgemm-tiled-benchmark-my-maps.mlir -convert-linalg-to-loops -lower-affine -convert-scf-to-std -convert-std-to-llvm -canonicalize > a1.llvm

Running this command produces the following error:

sgemm-tiled-benchmark-my-maps.mlir:12:3: error: 'std.store' op operand #2 must be index, but got '!llvm.i64'
linalg.fill(%A, %cf1) : memref<2088x2048xf32, #my_map>, f32
^
sgemm-tiled-benchmark-my-maps.mlir:12:3: note: see current operation: "std.store"(%0, %6, %9, %11) : (!llvm.float, memref<2088x2048xf32, affine_map<(d0, d1) -> (d0 * 2048 + d1)>>, !llvm.i64, !llvm.i64) -> ()

The problem occurs in the -convert-std-to-llvm pass. If I remove -convert-std-to-llvm from the pipeline, there are no errors, i.e. the other passes ‘think’ everything is okay.
What am I doing wrong? Why does the error happen?
If I don’t use #my_map with strides, everything works and there are no errors.
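
For reference, this is the identity-layout variant that lowers without errors (the same snippet with the layout map removed):

%B = alloc() {alignment = 32} : memref<2088x2048xf32>
%cf1 = constant 1.00000e+00 : f32
linalg.fill(%B, %cf1) : memref<2088x2048xf32>, f32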

Thank you and BR, Oleg

I am also interested in a follow-up to this. I stumbled on a similar issue when exploring affine_map layouts with memref types and alloc ops.

Furthermore, in the past, the test below used to check both an identity and a non-identity map during convert-std-to-llvm, but now it only checks the identity mapping with alloc:

Conversion to the LLVM dialect only supports alloc for memrefs with identity layout, that’s why the test was removed in [mlir] Require std.alloc() ops to have canonical layout during LLVM l… · llvm/llvm-project@04481f2 · GitHub. In the general case, it is impossible to compute the number of contiguous elements to allocate for a memref with an arbitrary affine layout. Strided memrefs can be obtained by allocating a contiguous memref with the sufficient number of elements and taking a view.

The conversion fails because it doesn’t convert the alloc but does convert the uses of the allocated memref. Arguably, it should inject casts instead.
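
To make the alloc-plus-view suggestion concrete, here is a minimal sketch (untested; the exact subview syntax and strided-layout notation vary across MLIR versions, names like %buf are illustrative, and the 2042-column shape is a hypothetical case where row padding is actually needed):

// Contiguous allocation with rows padded to 2048 elements
// (a multiple of 8 f32s = 256 bits), so with a 32-byte-aligned
// base pointer every row start stays 32-byte aligned.
%buf = alloc() {alignment = 32} : memref<2088x2048xf32>
// Take a strided view of the logical 2088x2042 matrix; the layout
// offset: 0, strides: [2048, 1] is equivalent to
// affine_map<(d0, d1) -> (d0 * 2048 + d1)>.
%B = subview %buf[0, 0] [2088, 2042] [1, 1]
    : memref<2088x2048xf32> to memref<2088x2042xf32, offset: 0, strides: [2048, 1]>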

Thank you for the reply!

This makes sense.
Are there any other targets taking advantage of it yet?
Would you be able to point to a test or code for a different conversion target that leverages a non-identity layout? I am trying to understand how to skip LLVM and map directly onto an accelerator that uses it.

Thank you for the answer, Alex.
I will try to implement this in the manner you have suggested.

There’s no in-tree code that uses it, as far as I know. Folks downstream seem to use it in some affine passes; see previous discussions:
