LLVM Discussion Forums

Record types?

Hello everybody,

We are currently working on a (dataflow) dialect.
As part of the lowering pass, we need to express and manipulate C-like
record types (and pointers to them) to represent structured states.

This is standard compiler work, so I wonder if someone has already done
it in MLIR.

Best,
Dumitru

@ftynse wrote a proposal to support mutually recursive structs here: [RFC] First-Party Support for LLVM Types.
There is also an early effort to do the same in the SPIR-V dialect. You can find my dev branch here: https://github.com/KareemErgawy/llvm-project/tree/mlir/spirv/support_recursive_structs. It is still at a very early stage, but we are trying to adopt the proposal linked above so that the approach is as unified as possible.

The main difference between the LLVM dialect and the SPIR-V one (at least the main difference I am aware of so far) is type closedness. The LLVM dialect is proposed to be closed, in the sense that it does not reuse even standard types; duplication is accepted in order to make things like type printing easier to handle.

On the other hand, this is not the case for SPIR-V where standard types are reused and therefore more substantial infrastructure changes might be needed in MLIR.

There’s also an implementation available.

Just to clarify, LLVM types are sufficiently different from standard types to mandate separate modeling. For example, standard integers can be signless, signed or unsigned, whereas LLVM integers can only be signless. This is clearly stated in the proposal. I’m not doing this because I’m too lazy to write a more complicated parser. This is proper separation of concerns that makes sure the LLVM dialect is tied to LLVM IR, not to the choices MLIR may want to make about standard types in the future.

Going back to the original question, if this is something that is being done for the purpose of lowering to LLVM IR and the structures are exactly C-like, you may want to consider using LLVM types.

Thanks for the clarification and sorry for any newbie inaccuracies :).

Dumitru,

CIRCT has a handshake dialect: https://github.com/llvm/circt/tree/master/include/circt/Dialect/Handshake. I’d be interested in understanding how what you’re doing is different, or if we can extend the handshake dialect to cover your use cases.

Steve

Dear Stephen,

I looked into your link, and there’s only source code, with little documentation.
From what I read (but recall that I’m not completely fluent in MLIR):

  • It seems that you want to represent some form of handshake logic (hence the point-to-point variables with only one user) plus some extensions, such as non-deterministic merges and load/store. By comparison, our objective is to stick 100% to the static single assignment credo of synchronous approaches. Our values can be read by multiple destinations, and everything is deterministic.
  • Our dialect has fewer operations than yours (node, instance, when, merge, fby and a form of yield, all of them deterministic), but we assume that we can use constants, function calls and function call-like operations (e.g. addi) from other dialects. You define new constructs for these (to keep the dialect self-sufficient?).
  • As always in synchronous languages, the difficulty is not (only) in defining the language constructs, but in analysis (dependence analysis and the so-called clock analysis, which ensures that initialization and activation are consistent). This is covered by dedicated verification passes. In handshake logic, if you don’t explicitly use non-deterministic constructs, determinism is ensured by construction; it is liveness that poses problems. I didn’t see liveness verification code in your sources. How do you cover correctness?
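
For readers less familiar with synchronous dataflow, the three stream operations named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the usual Lustre-style reading of `fby`, `when` and `merge`, represents a stream as a finite list with one entry per base-clock tick, and uses `None` as a hypothetical "absent" marker. It is not the dialect's actual definition, and the check inside `merge` is only a toy stand-in for what a real clock analysis establishes statically.

```python
# Sketch only: a stream is a finite list, one entry per tick of the base clock.
ABSENT = None  # assumed marker for "no value at this tick"; real values must not be None


def fby(init, stream):
    """Initialized delay: emit `init` first, then the previous values of `stream`."""
    return ([init] + stream)[:len(stream)]


def when(stream, clock):
    """Sub-sampling: keep a value only at the ticks where `clock` is true."""
    return [v if c else ABSENT for v, c in zip(stream, clock)]


def merge(clock, on_true, on_false):
    """Deterministic merge of two streams that live on complementary clocks."""
    out = []
    for c, t, f in zip(clock, on_true, on_false):
        selected, other = (t, f) if c else (f, t)
        # Toy clock check: the selected branch must be present, the other absent.
        if selected is ABSENT or other is not ABSENT:
            raise ValueError("inputs are not on complementary clocks")
        out.append(selected)
    return out


xs = [1, 2, 3, 4]
clk = [True, False, True, False]
not_clk = [not c for c in clk]

delayed = fby(0, xs)                 # [0, 1, 2, 3]
on_clk = when(xs, clk)               # [1, None, 3, None]
off_clk = when(delayed, not_clk)     # [None, 1, None, 3]
merged = merge(clk, on_clk, off_clk) # [1, 1, 3, 3]
```

Every intermediate stream here can be read by several consumers (`xs` feeds both `fby` and `when`), and the result is fully determined by the inputs, which matches the multiple-reader, deterministic setting described above.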

To conclude:

  • My main interest is enabling a native MLIR representation of real-time code, but other things are possible, more related to efficient HPC code generation (we interact with @albertcohen and @Ulysse).
  • The lowering process we envision has several steps, and two of them are currently only partially implemented. Once they are complete, we will make the whole code open. Of the parts we have, dependence and clock analysis are the most important, followed by the first steps of lowering, which move from implicit clocks to explicit activation conditions (needed to generate sequential code in the end).

If you are interested, I can continue with the explanations, and we can discuss, possibly also with @albertcohen and @Ulysse, whom we have known for a long time.

Best regards,
Dumitru

Actually, the handshake runner is currently limited to single-reader, single-writer channels, but this limitation should be removed. The model can be more general, but converting to single reader, single writer before lowering can simplify some lowerings.
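
As a rough illustration of that conversion (a sketch, not CIRCT's implementation), a value with several readers can be routed through an explicit fork node whose outputs each have exactly one reader. The `(value, reader)` pair representation, the `insert_forks` helper and the `fork(...)` / `#i` naming are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def insert_forks(uses):
    """Rewrite (value, reader) pairs so that every value has exactly one reader.

    A value with several readers is consumed once by a new fork node; the fork
    then produces one fresh output value per original reader.
    """
    readers = defaultdict(list)
    for value, reader in uses:
        readers[value].append(reader)

    rewritten = []
    for value, consumers in readers.items():
        if len(consumers) == 1:
            # Already single-reader: keep the use unchanged.
            rewritten.append((value, consumers[0]))
            continue
        # Hypothetical naming: the fork node reads the original value...
        rewritten.append((value, f"fork({value})"))
        # ...and each original reader gets its own fork output.
        for i, consumer in enumerate(consumers):
            rewritten.append((f"{value}#{i}", consumer))
    return rewritten


uses = [("x", "add"), ("x", "mul"), ("y", "add")]
single_reader = insert_forks(uses)
# [("x", "fork(x)"), ("x#0", "add"), ("x#1", "mul"), ("y", "add")]
```

After the rewrite each value name appears in exactly one pair, so a later lowering step can assume point-to-point channels.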

OK, I understand synchronous languages. The handshake dialect we built is intended to model semantics closer to Kahn Process Networks, so there is no clock, and no concept of absent (although non-deterministic merges blur this distinction).

This would be very cool. I think there are other interesting places where this could interplay with CIRCT.