Hello all!
I want to use strided memrefs so that the beginning of every data row is aligned to 256 bits in memory (regardless of the input matrix row size, which may not be divisible by 256 bits). That would let me use fast aligned 256-bit vector loads/stores.
As I’ve read in the MLIR docs, I create an affine map that explicitly defines the strides of the 2D memref:
#my_map = affine_map<(d0, d1) -> (d0 * 2048 + d1)>
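To make the arithmetic behind this map concrete, here is a minimal Python sketch (not MLIR) of the addressing it describes. The helper names `padded_row_stride`, `linear_offset` and `row_start_is_aligned` are hypothetical and exist only for illustration. The sketch assumes f32 elements (4 bytes), a 32-byte (256-bit) alignment that is a multiple of the element size, and a base pointer that is itself 32-byte aligned.

```python
def padded_row_stride(cols, elem_bytes=4, align_bytes=32):
    """Smallest row stride (in elements) >= cols whose size in bytes
    is a multiple of align_bytes. Assumes align_bytes % elem_bytes == 0."""
    elems_per_block = align_bytes // elem_bytes
    # Ceiling division, then scale back up to a whole number of blocks.
    return -(-cols // elems_per_block) * elems_per_block


def linear_offset(d0, d1, row_stride):
    """Same formula as the affine map: (d0, d1) -> d0 * row_stride + d1."""
    return d0 * row_stride + d1


def row_start_is_aligned(d0, row_stride, elem_bytes=4, align_bytes=32):
    """True if row d0 starts on an align_bytes boundary,
    given that the base address is already aligned."""
    return (linear_offset(d0, 0, row_stride) * elem_bytes) % align_bytes == 0


if __name__ == "__main__":
    for cols in (2045, 2048, 2049):
        print(cols, "->", padded_row_stride(cols))
```

With 2048 columns the stride stays 2048, so every row start is already aligned; a row length such as 2049 would need the stride in the map padded up to the next multiple of 8 elements.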
Then I allocate a strided memref:
%B = alloc() {alignment = 32} : memref<2048x2048xf32, #my_map>
And after that I try to fill it with ones:
%cf1 = constant 1.00000e+00 : f32
linalg.fill(%B, %cf1) : memref<2088x2048xf32, #my_map>, f32
Then I process this MLIR code with the following command:
mlir-opt sgemm-tiled-benchmark-my-maps.mlir -convert-linalg-to-loops -lower-affine -convert-scf-to-std -convert-std-to-llvm -canonicalize > a1.llvm
Running this command produces the following error:
sgemm-tiled-benchmark-my-maps.mlir:12:3: error: 'std.store' op operand #2 must be index, but got '!llvm.i64'
linalg.fill(%A, %cf1) : memref<2088x2048xf32, #my_map>, f32
^
sgemm-tiled-benchmark-my-maps.mlir:12:3: note: see current operation: "std.store"(%0, %6, %9, %11) : (!llvm.float, memref<2088x2048xf32, affine_map<(d0, d1) -> (d0 * 2048 + d1)>>, !llvm.i64, !llvm.i64) -> ()
The problem occurs in the -convert-std-to-llvm pass. If I remove -convert-std-to-llvm from the command line, I see no errors, i.e. the other passes ‘think’ everything is okay.
What am I doing wrong? Why does this error happen?
If I don’t use my_map with strides, everything works and there are no errors.
Thank you and BR, Oleg