[RFC] Handshake top-level memory handling

This isn’t necessarily a full proposal, but it is a request for comments.

We have an open issue about the handling of top-level memories in Handshake: Standard-To-Handshake + handshake runner doesn't handle toplevel memories well. · Issue #68 · llvm/circt · GitHub. These are memrefs passed as arguments to the top-level handshake.func op.

There is ongoing work to improve how this works with handshake-runner: [Handshake] Added temporary support for memory data with handshake-runner by JianyiCheng · Pull Request #745 · llvm/circt · GitHub. This led to some discussion about the longer term plans, which is what this thread is for.

Here’s what @JianyiCheng proposed in the PR:

  1. If a memory op is external, include the corresponding function argument (memref) as one of its operands. This use of the argument can be further lowered down to a proper memory interface.
  2. If a memory op is local, we can keep the alloca op and use it as one of the operands. The alloca can be lowered down to BRAMs.
func(%arg0 : memref<>, ...) {
   %0 = alloca ... : memref<>
   %1 = handshake.memory(%arg0, ...) // external RAM; %arg0 - %1 is the external memory interface
   %2 = handshake.memory(%0, ...)    // internal RAM; %0 - %2 is the local memory interface, %0 is lowered to a BRAM block
}

I think this proposal is a good start at a high level.

I don’t have much to propose myself, other than eliciting comments, but I do have one specific request that I think could be a first step: can we remove the logic in StandardToHandshake that is sinking the top-level handshake.func input memrefs and replacing them with internal memories?

This might work for handshake-runner, but it makes it tough to do anything else with handshake.funcs which take input memrefs. I am asking because I’m experimenting with connecting Handshake to ESI, and in this case I don’t necessarily want the inputs sunk and replaced with internal memories.
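
To make this concrete, here is a rough before/after sketch of the behavior I mean. This is pseudo-IR rather than exact syntax, and the function name and memref shape are made up purely for illustration:

// Before StandardToHandshake: the memory comes in as a memref argument.
func @kernel(%mem : memref<64xi32>, ...) {
  // loads/stores against %mem
}

// After StandardToHandshake today (as I understand the current logic): the
// memref argument is sunk and replaced by an internal handshake.memory, so
// the caller no longer supplies the storage.
handshake.func @kernel(...) {
  %1 = handshake.memory(...) // internal memory; no memref argument left
}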

My proposal is to remove this behavior from StandardToHandshake, and instead do this in the new HandshakeCleanFuncArg pass. That way, we can run the new pass before handshake-runner to get the memrefs into an internal form that works for handshake-runner, but we are free to do other things as well. I’m not sure how much work this is going to be in practice…

Something else worth considering is how the recent HIR work handles memrefs; Section 4.4 is relevant. My understanding of HIR is that each underlying tensor is treated as a memory, and each memref is treated as a port. This seems in line with MLIR’s notion of a memref being a “memory reference”. In CIRCT today, each handshake.memory is treated as a memory in the FIRRTL lowering, and each load/store is treated as a port. I think that’s a little different, so I’m curious how we might align the two.
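
To spell out that contrast as I understand it, here is a rough pseudo-IR sketch; the op names and shapes are illustrative, not the exact syntax of either dialect:

// HIR view (Section 4.4), as I read it: the underlying tensor is the memory,
// and every memref referring to it is a port onto that memory.
//   %tensor                  -> one physical memory
//   %memref_rd / %memref_wr  -> read / write ports into %tensor

// Handshake in CIRCT today: each handshake.memory becomes a memory in the
// FIRRTL lowering, and each load/store wired to it becomes a port.
handshake.func @f(%arg0 : memref<>, ...) {
  %1 = handshake.memory(%arg0, ...) // the memory
  // each load/store connected to %1 -> one port each
}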

This sounds reasonable to me, though I’m not a handshake expert.