LLVM Discussion Forums

[RFC] Remove MemRefType element type check? Or add pointer support to ‘std’ dialect?

Hello all. I’m on an extended vacation and am playing around by rewriting a toy C frontend of mine to lower to MLIR instead of LLVM IR. I hit a snag when I tried to lower this C code:

int a;
int *b;
int **c;
int ***d;

When using LLVM IR, my frontend lowers this to:

%a = alloca i32
%b = alloca i32*
%c = alloca i32**
%d = alloca i32***

Using MLIR, I tried the following:

%0 = alloca() : memref<1xi32>
// Assertion in llvm-project/mlir/lib/IR/StandardTypes.cpp:353:
// "Failed to construct instance of MemRefType."
%1 = alloca() : memref<1xmemref<1xi32>>

The specific issue here is that MemRefType::getImpl checks that the element type is one of a very specific set of types (link to the code on GitHub):

// Check that memref is formed from allowed types.
if (!elementType.isIntOrFloat() && !elementType.isa<VectorType>() &&
    !elementType.isa<ComplexType>())
  return emitOptionalError(location, "invalid memref element type"),
         MemRefType();

I have some ideas on how to proceed, and I also received suggestions in the LLVM Discord’s #mlir channel:

  1. Use custom dialect types in my C dialect. Right now my dialect reuses MemRefType, and I think this is nice because it allows my dialect to interface with many other MLIR dialects. But if MemRefType can’t model pointers, I may need to move away from it.
  2. Use LLVMType directly in my C dialect. The LLVM dialect supports pointers, and I assume that by using its types, my dialect can as well. However, the downside to this approach is similar to (1): it diminishes my dialect’s ability to interface with other dialects.
  3. Simply remove the MemRefType element type check shown above (link to a diff on GitHub). In fact, all check-mlir tests still succeed even without the element type check. And without the check, my C frontend can generate arbitrarily nested memref types by calling MemRefType::get(1, MemRefType::get(1, <etc>)), and so, using just std.alloca, std.load, and std.store, it can generate MLIR for C code such as int **x = &y; **x = 42;.
  4. Add some sort of formal support for pointers to the std dialect. This was suggested by Mehdi in the #mlir channel of the LLVM Discord. At first I thought this would be the best way forward, but then I realized that (3) solves all my problems and requires zero work! :smiley:

I’m curious what all of you think. If (3) seems reasonable, I’ll send a diff up to Phabricator to remove the type check. If that’s not a reasonable solution, I’d like to learn more about why not. (Also, the fact that check-mlir succeeds even when that code is removed indicates to me that the test suite is missing some test cases. I could help add those.)

If you all would prefer formal support for pointers in ‘std’ as per (4), I’d like to discuss what that means concretely: a new kind of type? std.ptrload and std.ptrstore operations? Something else? I’m relatively new to MLIR, so I don’t have much to draw on for ideas here.

(I can share my C dialect and toy compiler if you all need to see the specific ways in which I’m std.alloca’ing MemRefType to model C pointers, but it’s a little embarrassing considering how few C features it supports at the moment — just two types (int and float), pointers, and a handful of expression/declaration kinds.)

@kern_h responded to this topic on the LLVM Discord #mlir channel:

@modocache i tried commenting out those checks too and was excited when all existing tests passed. it was soon pointed out to me though that there’s no code that knows how to handle something like memref<1xmemref<f32>> . even if parsing and construction succeeds, it will presumably fail during conversion.

I hadn’t realized this because my frontend is successfully using MLIR for the following C program:

// Compiles and executes to return an exit code of 9.

int main() {
  int a;
  a = 1;

  int *b;
  b = &a;

  int **c;
  c = &*&b;

  **c = 3;

  return a + *b + **c;
}

My frontend emits the following MLIR when converted to ‘std’:

module @"cl/test/CodeGen/pointer.c" {
  func @main() -> i32 {
    %0 = alloca() : memref<1xi32> cl/test/CodeGen/pointer.c:9:1
    %c0_i32 = constant 0 : i32 cl/test/CodeGen/pointer.c:9:1
    %c0 = constant 0 : index cl/test/CodeGen/pointer.c:9:1
    store %c0_i32, %0[%c0] : memref<1xi32> cl/test/CodeGen/pointer.c:9:1
    %1 = alloca() : memref<1xi32> cl/test/CodeGen/pointer.c:10:7
    %c1_i32 = constant 1 : i32 cl/test/CodeGen/pointer.c:11:7
    %c0_0 = constant 0 : index cl/test/CodeGen/pointer.c:11:3
    store %c1_i32, %1[%c0_0] : memref<1xi32> cl/test/CodeGen/pointer.c:11:3
    %2 = alloca() : memref<1xmemref<1xi32>> cl/test/CodeGen/pointer.c:13:8
    %c0_1 = constant 0 : index cl/test/CodeGen/pointer.c:14:3
    store %1, %2[%c0_1] : memref<1xmemref<1xi32>> cl/test/CodeGen/pointer.c:14:3
    %3 = alloca() : memref<1xmemref<1xmemref<1xi32>>> cl/test/CodeGen/pointer.c:16:9
    %c0_2 = constant 0 : index cl/test/CodeGen/pointer.c:17:3
    store %2, %3[%c0_2] : memref<1xmemref<1xmemref<1xi32>>> cl/test/CodeGen/pointer.c:17:3
    %c3_i32 = constant 3 : i32 cl/test/CodeGen/pointer.c:19:9
    %c0_3 = constant 0 : index cl/test/CodeGen/pointer.c:19:5
    %4 = load %3[%c0_3] : memref<1xmemref<1xmemref<1xi32>>> cl/test/CodeGen/pointer.c:19:5
    %c0_4 = constant 0 : index cl/test/CodeGen/pointer.c:19:4
    %5 = load %4[%c0_4] : memref<1xmemref<1xi32>> cl/test/CodeGen/pointer.c:19:4
    %c0_5 = constant 0 : index cl/test/CodeGen/pointer.c:19:3
    store %c3_i32, %5[%c0_5] : memref<1xi32> cl/test/CodeGen/pointer.c:19:3
    %c0_6 = constant 0 : index cl/test/CodeGen/pointer.c:21:10
    %6 = load %1[%c0_6] : memref<1xi32> cl/test/CodeGen/pointer.c:21:10
    %c0_7 = constant 0 : index cl/test/CodeGen/pointer.c:21:15
    %7 = load %2[%c0_7] : memref<1xmemref<1xi32>> cl/test/CodeGen/pointer.c:21:15
    %c0_8 = constant 0 : index cl/test/CodeGen/pointer.c:21:14
    %8 = load %7[%c0_8] : memref<1xi32> cl/test/CodeGen/pointer.c:21:14
    %9 = addi %6, %8 : i32 cl/test/CodeGen/pointer.c:21:10
    %c0_9 = constant 0 : index cl/test/CodeGen/pointer.c:21:21
    %10 = load %3[%c0_9] : memref<1xmemref<1xmemref<1xi32>>> cl/test/CodeGen/pointer.c:21:21
    %c0_10 = constant 0 : index cl/test/CodeGen/pointer.c:21:20
    %11 = load %10[%c0_10] : memref<1xmemref<1xi32>> cl/test/CodeGen/pointer.c:21:20
    %c0_11 = constant 0 : index cl/test/CodeGen/pointer.c:21:19
    %12 = load %11[%c0_11] : memref<1xi32> cl/test/CodeGen/pointer.c:21:19
    %13 = addi %9, %12 : i32 cl/test/CodeGen/pointer.c:21:10
    %c0_12 = constant 0 : index cl/test/CodeGen/pointer.c:21:3
    store %13, %0[%c0_12] : memref<1xi32> cl/test/CodeGen/pointer.c:21:3
    br ^bb1 cl/test/CodeGen/pointer.c:21:3
  ^bb1:	// pred: ^bb0
    %c0_13 = constant 0 : index cl/test/CodeGen/pointer.c:9:1
    %14 = load %0[%c0_13] : memref<1xi32> cl/test/CodeGen/pointer.c:9:1
    return %14 : i32 cl/test/CodeGen/pointer.c:9:1
  } cl/test/CodeGen/pointer.c:9:1
} cl/test/CodeGen/pointer.c:9:1

Lowered to LLVM IR this produces:

; ModuleID = 'LLVMDialectModule'
source_filename = "LLVMDialectModule"

declare i8* @malloc(i64)

declare void @free(i8*)

define i32 @main() !dbg !3 {
  %1 = alloca i32, i64 ptrtoint (i32* getelementptr (i32, i32* null, i64 1) to i64), align 4, !dbg !7
  %2 = insertvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } undef, i32* %1, 0, !dbg !7
  %3 = insertvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %2, i32* %1, 1, !dbg !7
  %4 = insertvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %3, i64 0, 2, !dbg !7
  %5 = insertvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %4, i64 1, 3, 0, !dbg !7
  %6 = insertvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %5, i64 1, 4, 0, !dbg !7
  %7 = extractvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %6, 1, !dbg !7
  %8 = getelementptr i32, i32* %7, i64 0, !dbg !7
  store i32 0, i32* %8, align 4, !dbg !7
  %9 = alloca i32, i64 ptrtoint (i32* getelementptr (i32, i32* null, i64 1) to i64), align 4, !dbg !9
  %10 = insertvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } undef, i32* %9, 0, !dbg !9
  %11 = insertvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %10, i32* %9, 1, !dbg !9
  %12 = insertvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %11, i64 0, 2, !dbg !9
  %13 = insertvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %12, i64 1, 3, 0, !dbg !9
  %14 = insertvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %13, i64 1, 4, 0, !dbg !9
  %15 = extractvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %14, 1, !dbg !10
  %16 = getelementptr i32, i32* %15, i64 0, !dbg !10
  store i32 1, i32* %16, align 4, !dbg !10
  %17 = alloca { i32*, i32*, i64, [1 x i64], [1 x i64] }, i64 ptrtoint ({ i32*, i32*, i64, [1 x i64], [1 x i64] }* getelementptr ({ i32*, i32*, i64, [1 x i64], [1 x i64] }, { i32*, i32*, i64, [1 x i64], [1 x i64] }* null, i64 1) to i64), align 8, !dbg !11
  %18 = insertvalue { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } undef, { i32*, i32*, i64, [1 x i64], [1 x i64] }* %17, 0, !dbg !11
  %19 = insertvalue { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %18, { i32*, i32*, i64, [1 x i64], [1 x i64] }* %17, 1, !dbg !11
  %20 = insertvalue { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %19, i64 0, 2, !dbg !11
  %21 = insertvalue { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %20, i64 1, 3, 0, !dbg !11
  %22 = insertvalue { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %21, i64 1, 4, 0, !dbg !11
  %23 = extractvalue { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %22, 1, !dbg !12
  %24 = getelementptr { i32*, i32*, i64, [1 x i64], [1 x i64] }, { i32*, i32*, i64, [1 x i64], [1 x i64] }* %23, i64 0, !dbg !12
  store { i32*, i32*, i64, [1 x i64], [1 x i64] } %14, { i32*, i32*, i64, [1 x i64], [1 x i64] }* %24, align 8, !dbg !12
  %25 = alloca { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }, i64 ptrtoint ({ { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }* getelementptr ({ { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }* null, i64 1) to i64), align 8, !dbg !13
  %26 = insertvalue { { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } undef, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }* %25, 0, !dbg !13
  %27 = insertvalue { { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %26, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }* %25, 1, !dbg !13
  %28 = insertvalue { { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %27, i64 0, 2, !dbg !13
  %29 = insertvalue { { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %28, i64 1, 3, 0, !dbg !13
  %30 = insertvalue { { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %29, i64 1, 4, 0, !dbg !13
  %31 = extractvalue { { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %30, 1, !dbg !14
  %32 = getelementptr { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }* %31, i64 0, !dbg !14
  store { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %22, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }* %32, align 8, !dbg !14
  %33 = extractvalue { { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %30, 1, !dbg !15
  %34 = getelementptr { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }* %33, i64 0, !dbg !15
  %35 = load { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }* %34, align 8, !dbg !15
  %36 = extractvalue { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %35, 1, !dbg !16
  %37 = getelementptr { i32*, i32*, i64, [1 x i64], [1 x i64] }, { i32*, i32*, i64, [1 x i64], [1 x i64] }* %36, i64 0, !dbg !16
  %38 = load { i32*, i32*, i64, [1 x i64], [1 x i64] }, { i32*, i32*, i64, [1 x i64], [1 x i64] }* %37, align 8, !dbg !16
  %39 = extractvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %38, 1, !dbg !17
  %40 = getelementptr i32, i32* %39, i64 0, !dbg !17
  store i32 3, i32* %40, align 4, !dbg !17
  %41 = extractvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %14, 1, !dbg !18
  %42 = getelementptr i32, i32* %41, i64 0, !dbg !18
  %43 = load i32, i32* %42, align 4, !dbg !18
  %44 = extractvalue { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %22, 1, !dbg !19
  %45 = getelementptr { i32*, i32*, i64, [1 x i64], [1 x i64] }, { i32*, i32*, i64, [1 x i64], [1 x i64] }* %44, i64 0, !dbg !19
  %46 = load { i32*, i32*, i64, [1 x i64], [1 x i64] }, { i32*, i32*, i64, [1 x i64], [1 x i64] }* %45, align 8, !dbg !19
  %47 = extractvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %46, 1, !dbg !20
  %48 = getelementptr i32, i32* %47, i64 0, !dbg !20
  %49 = load i32, i32* %48, align 4, !dbg !20
  %50 = add i32 %43, %49, !dbg !18
  %51 = extractvalue { { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %30, 1, !dbg !21
  %52 = getelementptr { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }* %51, i64 0, !dbg !21
  %53 = load { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }, { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] }* %52, align 8, !dbg !21
  %54 = extractvalue { { i32*, i32*, i64, [1 x i64], [1 x i64] }*, { i32*, i32*, i64, [1 x i64], [1 x i64] }*, i64, [1 x i64], [1 x i64] } %53, 1, !dbg !22
  %55 = getelementptr { i32*, i32*, i64, [1 x i64], [1 x i64] }, { i32*, i32*, i64, [1 x i64], [1 x i64] }* %54, i64 0, !dbg !22
  %56 = load { i32*, i32*, i64, [1 x i64], [1 x i64] }, { i32*, i32*, i64, [1 x i64], [1 x i64] }* %55, align 8, !dbg !22
  %57 = extractvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %56, 1, !dbg !23
  %58 = getelementptr i32, i32* %57, i64 0, !dbg !23
  %59 = load i32, i32* %58, align 4, !dbg !23
  %60 = add i32 %50, %59, !dbg !18
  %61 = extractvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %6, 1, !dbg !24
  %62 = getelementptr i32, i32* %61, i64 0, !dbg !24
  store i32 %60, i32* %62, align 4, !dbg !24
  br label %63, !dbg !24

63:                                               ; preds = %0
  %64 = extractvalue { i32*, i32*, i64, [1 x i64], [1 x i64] } %6, 1, !dbg !7
  %65 = getelementptr i32, i32* %64, i64 0, !dbg !7
  %66 = load i32, i32* %65, align 4, !dbg !7
  ret i32 %66, !dbg !7
}

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2}

!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "mlir", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
!1 = !DIFile(filename: "LLVMDialectModule", directory: "/")
!2 = !{i32 2, !"Debug Info Version", i32 3}
!3 = distinct !DISubprogram(name: "main", linkageName: "main", scope: null, file: !4, line: 9, type: !5, scopeLine: 9, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !6)
!4 = !DIFile(filename: "cl/test/CodeGen/pointer.c", directory: "/home/modocache/Source/llvm/git/dev/llvm-project")
!5 = !DISubroutineType(types: !6)
!6 = !{}
!7 = !DILocation(line: 9, column: 1, scope: !8)
!8 = !DILexicalBlockFile(scope: !3, file: !4, discriminator: 0)
!9 = !DILocation(line: 10, column: 7, scope: !8)
!10 = !DILocation(line: 11, column: 3, scope: !8)
!11 = !DILocation(line: 13, column: 8, scope: !8)
!12 = !DILocation(line: 14, column: 3, scope: !8)
!13 = !DILocation(line: 16, column: 9, scope: !8)
!14 = !DILocation(line: 17, column: 3, scope: !8)
!15 = !DILocation(line: 19, column: 5, scope: !8)
!16 = !DILocation(line: 19, column: 4, scope: !8)
!17 = !DILocation(line: 19, column: 3, scope: !8)
!18 = !DILocation(line: 21, column: 10, scope: !8)
!19 = !DILocation(line: 21, column: 15, scope: !8)
!20 = !DILocation(line: 21, column: 14, scope: !8)
!21 = !DILocation(line: 21, column: 21, scope: !8)
!22 = !DILocation(line: 21, column: 20, scope: !8)
!23 = !DILocation(line: 21, column: 19, scope: !8)
!24 = !DILocation(line: 21, column: 3, scope: !8)

So, while I can’t be certain the LLVM IR is exactly correct, it at least produces the correct result for my toy program. Anyway, I thought I’d share the specific code I’m generating, in case it’s helpful for this discussion.

Based on @kern_h’s comment it sounds like this has been discussed before, and suggestion (3) from my original post was not considered a viable approach. In that case, I’m wondering what (4) would look like: some sort of “support” for pointers in the ‘std’ dialect.

Hey Brian,

Cool project! I agree that memref of memrefs makes sense to support.

Unrelatedly, have you considered having your C frontend lower to an MLIR-based “C IR” which uses your existing type system? You’d lower the operations to something intended for high-level optimization, and expose control flow, etc.

That could then be lowered to the LLVM dialect. I don’t think there is any need to touch memrefs if you went this path.

-Chris


Thanks Chris! In fact the idea to rewrite my toy compiler to use MLIR was inspired by some of your ideas: for example, you mentioned on the mailing list that you thought Clang could introduce a C IR to incrementally replace/improve the CFG analyses done by libclangAnalysis (mailing list link).

For now I’ve been focusing on getting as much done as possible using just existing dialects and existing constructs such as MemRefType. My understanding, which is based on discussions in the LLVM Discord’s #mlir channel, is that doing so allows my dialect to play nicely with existing transforms and optimizations originally designed for other dialects.

That being said, yes, I do intend to build out a “C IR” dialect, and I think at least part of it will have to lower directly to LLVM dialect, even if doing so means some loss in interoperability with other dialects. In fact I already have at least one operation for which I was forced to do just that: c.fptosi, which lowers to llvm.fptosi. (Maybe this could be added to ‘std’? I haven’t explored enough of MLIR to understand whether it ought to be added or not! :sweat_smile:)

Very cool. I’d suggest introducing something like c.cast that maps onto the C cast operator “(T)x”, whose operands and results would be in the C type system (int, float, double, long long, etc).
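For instance, such an op might look roughly like this (the !c.* types and the c.cast op are all hypothetical names, sketched only to show the shape of the idea):

```mlir
// Hypothetical C-dialect IR; none of these types/ops exist upstream.
%d = c.cast %i : !c.int to !c.double         // (double)i
%n = c.cast %d : !c.double to !c.longlong    // (long long)d
```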

-Chris


I’m interested in having a “pointer” type in the standard dialect as an abstraction that always refers to a single object in memory (and not an array). A “pointer” would differ from memref<1xT> in its ABI, but I’d expect we could otherwise have explicit conversions between the two.


I just implemented a pointer type with the same behaviour in our dialect. I also saw something like that in FIR. So it seems common enough already.

The reason for this was to enforce an ABI where we pass around single pointers, avoiding the fat pointers that come with memrefs.


Hi Brian,

TL;DR: I think both memref-of-memrefs and raw pointers are valuable additions, but supporting the former requires careful consideration of how memref is used today, not just removing the check.

The idea of a C frontend for MLIR has been circulating for some time; how much of a toy is yours? :slight_smile:

Indeed, memref was specifically designed as a structured type that encapsulates the pointer and the size/indexing computation, the absence of which in lowered C programs significantly hampers optimization. This design decision makes it less expressive than nested C pointers by defining away some of the hard problems (recovering closed-form expressions for addresses, self-aliasing). At the time, we only considered how this would work with “simple” types (scalars, vectors) stored in memory, hence the element type restriction. (Actually, memref predates the ability to define custom types.) You can find some more information in the rationale document, even though parts of it are quite outdated by now. I would welcome a careful relaxation of memref to allow other element types, but it will require (a) maintaining the notion of pointing to memory, so that it only applies to types that can be represented in memory, and (b) updating the multiple analyses and passes that operate under the assumption that memref elements are only those listed in the current check. It would be generally helpful to restate those assumptions in terms of type properties that can then be used to specify which types are allowed as memref elements.

This also partly explains why the tests pass: the assumption is so “core” to the original IR design that almost everything relies on it being true. If you actually see an assert when parsing %1 = alloca() : memref<1xmemref<1xi32>>, that should be fixed ASAP to report an error instead, with an appropriate test, regardless of how this discussion ends.

Indeed, the type system is the actual barrier between dialects, more so than the ops themselves. This can be circumvented, however, by providing local “typecast” operations that convert values between compatible types from different dialects. We already do so between standard and LLVM dialect types. In particular, I think that having a !cdialect.ptr<> type and the corresponding load/store operations could be just fine and would not significantly reduce your ability to interact with ops on standard types. You would need a custom load/store pair (since you don’t need the whole complexity of memref load/store anyway), and cdialect.ptrtoint/cdialect.inttoptr operations that convert a pointer to an index or i64 and back, at which point standard operations become applicable for address arithmetic. Maybe even a cdialect.memrefcast that converts a single-element memref to a pointer and back. Operations are relatively cheap to define in MLIR.
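Put together, the dialect-local pointer type and its surrounding ops might look roughly like this (every !cdialect.* type and op below is a hypothetical name, sketched only to illustrate the idea):

```mlir
// Hypothetical C-dialect IR; none of these types/ops exist today.
// Dedicated load/store, simpler than memref load/store:
%v = cdialect.load %p : !cdialect.ptr<i32>
cdialect.store %c42, %p : !cdialect.ptr<i32>

// Escape hatch into standard ops for address arithmetic:
%addr  = cdialect.ptrtoint %p : !cdialect.ptr<i32> to index
%addr4 = addi %addr, %c4 : index
%q     = cdialect.inttoptr %addr4 : index to !cdialect.ptr<i32>

// Bridging with memref-based ops:
%m = cdialect.memrefcast %p : !cdialect.ptr<i32> to memref<1xi32>
```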

Actually, you may not want memref to model C code. It is a higher-level abstraction than what is possible in C. memref<1xf32> is actually struct {float*, float*, intptr_t, intptr_t[1], intptr_t[1]} (allocated pointer, aligned pointer, offset, sizes, strides) in the way it is implemented in the default convention (there is also a “bare pointer” convention). It can actually point to a different address than the allocated one to account for, e.g., alignment. Higher-dimensional memrefs are much more powerful and can point to non-contiguous blocks of data with non-trivial addressing schemes (not just C-style row-major computation).

I think this would be a useful addition indeed. This is not the first time I have heard of someone needing raw pointers. However, I would like to make sure it is only used when memref is not suitable because, as I said above, it significantly reduces the amount of analysis we can apply.

This is quite interesting! (It was likely I who pointed out the need to ensure the lowering is correct, in https://reviews.llvm.org/D74820.) Note that I never said it was not viable, just that we need to address it systematically (semantics; lowering of load/store/alloc/view; non-contiguity; supporting the two type conversion conventions we have; potential implicit conversions) and that just removing the check would not suffice.


Thank you for the thorough reply, @ftynse!

the idea of a C frontend for MLIR has been circulating for some time, how much toy is yours? :slight_smile:

Heh, well, the stakes are very low for me – I’ll consider my dialect a success as long as I have fun writing it and I learn something by doing so – but at the very least I do have a month or so of vacation to work on it. It would be fun if it ended up being generally useful, though, so I’ll share it once I think it could be! :smiley: If in the meantime someone publishes a better one, that’s fine too – that doesn’t interfere with my two goals :stuck_out_tongue:

Indeed, type system is the actual barrier between dialects, more than ops themselves. […] In particular, I think that having !cdialect.ptr<> type and the corresponding load/store operations could be just fine and would not significantly decrease your possibility to interact with standard type-based ops. Actually, you may not want memref to model C code. It is a higher-level abstraction than what is possible in C.

These are all very interesting observations, thank you! At the moment I’m wiring up my dialect to use its own types and implementing a type converter between it and LLVM. This goes a little beyond the MLIR toy tutorial, but the std-to-LLVM conversion is giving me very good examples of how this is done.

This is not the first time I hear about someone needing raw pointers. However, I would like to make sure it is only used when memref is not suitable because, like I said above, it significantly reduces the amount of analysis we can apply.

I see that now, thanks for the detailed explanation! For the time being I think I’d like to focus on my dialect, then – I don’t have much experience in the MLIR codebase, so I think it’d be difficult for me to add a new core type like this. But between FIR, the dialect @lorrden mentioned, and hopefully mine once I share it, maybe there’ll soon be enough examples that a pointer type could take shape.

I’m curious: is there any type that doesn’t make sense to support as a memref element type? LLVM types should probably be allowed in some form. And a memref of memrefs isn’t much different from a memref of strings or tensors.

We probably want to revisit the rationale for disallowing IndexType as a memref element type, since memref-of-memrefs seems to contradict it (a memref descriptor will typically contain at least one subobject of index type).

I think that memref<index> makes sense to support as well. The reason we didn’t do this is that we wanted the indexing to be computable without target information. I don’t think this design will really work out in practice, though; we should support this, and those queries should return an llvm::Optional or something.

It would be fun if it ended up being generally useful, though, so I’ll share it if once I think it could be! :smiley: If in the meantime someone publishes a better one, that’s fine too – that doesn’t interfere with my two goals :stuck_out_tongue:

Please do! I am particularly interested in a C subset that covers (affine) loops around arithmetic expressions with arrays (but no pointers) :wink:

Design, and getting people to agree on yours, is actually much harder than code. MLIR is implemented in such a way that there is (almost) no difference between “core” types and dialect types. Porting a dialect type should be straightforward.

Anything that does not (cleanly) represent a value in memory, e.g. void or unit types, and possibly some token types.

Indexing is fine IMO; it is expressed in terms of elements, not bytes. The two issues we had were related to:

  • allocation (we need to know how many bytes to allocate when converting memref<42 x index>);
  • storing attributes of this type (which is not related to memref).

I could argue that the allocation part is not a memref problem but a lowering problem, and that the std->llvm lowering should just refuse such allocations and fail gracefully. Storage is trickier, since we need to allocate enough space to store and unique the contents of, e.g., an index-typed attribute, but this might have been solved by the opaque constant interface now.

Or we could just introduce some sort of DataLayout description.