LLVM Discussion Forums

Representing device data

Hey all-

This is, for now, a purely academic question: does it make sense to represent device data in MLIR, and if so, how? I’m defining ‘device data’ as information about the physical structure of a device: LUTs, memories, DSPs, PCIe MACs, DRAM controllers, their location(s) on the chip, routing resources, timing information, etc. The stuff contained in the vendors’ device databases.

We could use this information to floorplan. For instance with ESI: to communicate from the PCIe interface to the HBM controller on the opposite side of the chip, you’ll probably need a pretty deep pipeline. Knowing the physical locations is the first step to estimating the number of stages necessary.
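To make the "locations first, then stage count" idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration only: the tile coordinates, the per-tile wire delay, and the function name are made up, and a real estimate would come from the vendor's timing data rather than a flat Manhattan-distance model.

```python
def estimate_pipeline_stages(src, dst, delay_per_tile_ps, clock_period_ps):
    """Estimate how many pipeline registers a long route needs.

    src, dst: hypothetical (x, y) tile coordinates of the two endpoints.
    delay_per_tile_ps: assumed wire delay per tile crossed, in picoseconds.
    clock_period_ps: target clock period, in picoseconds.
    """
    # Manhattan distance as a crude stand-in for routed wire length.
    distance = abs(src[0] - dst[0]) + abs(src[1] - dst[1])
    wire_delay_ps = distance * delay_per_tile_ps
    # Number of clock periods needed to cover the delay (integer ceiling).
    cycles = -(-wire_delay_ps // clock_period_ps)
    # N cycles of wire delay need N - 1 intermediate registers.
    return max(0, cycles - 1)


# Hypothetical PCIe and HBM controller locations on opposite sides of a chip.
pcie_location = (0, 0)
hbm_location = (100, 20)
stages = estimate_pipeline_stages(pcie_location, hbm_location,
                                  delay_per_tile_ps=50, clock_period_ps=2000)
```

With these made-up numbers the route spans 120 tiles and 6000 ps of wire delay, so it needs two intermediate registers at a 2000 ps clock.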

I don’t think it makes sense to encode this information in an MLIR IR, but I may just not be creative enough. I also don’t see any specific advantages of doing this, though I’m not familiar enough with the graph algorithms present in MLIR to say. Maybe there’s some advantage of having a way to encode physical locations in MLIR IR for placement optimizations, but encoding the device data itself…?

Personally, I think this makes a lot of sense, and we’ve been doing exactly this for some of our next-gen devices with hundreds of Vector/VLIW cores. MLIR provides a nice structure for mixing the programming of processors, DMA engines, other IP cores, and programmable logic. At some level the ‘program’ ends up describing aspects of a device (like how many tiles there are) and aspects of the program (like how those tiles are programmed). I don’t know if it’s the easiest way to handle your particular example (using floorplanning information to pipeline long paths), but I think it’s certainly possible. You could, for instance, start with a design without floorplanning information and build up a hierarchical structure to represent different regions on a device. One thing you run into quickly is that values don’t escape regions, so accessing components in those regions becomes tricky without explicitly pushing all the signals through the boundary of each region, which is cumbersome.
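A toy Python model of that last point, in case it helps. This is not MLIR and does not use any MLIR API; the Region class, the region names, and the helper are all hypothetical. It only counts which region boundaries a signal would have to be threaded through when values cannot escape their enclosing region.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Region:
    """A hypothetical floorplan region nested inside an optional parent."""
    name: str
    parent: Optional["Region"] = None

    def enclosing_chain(self) -> List["Region"]:
        """This region followed by each enclosing region, innermost first."""
        chain, node = [], self
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain


def boundaries_to_cross(src: Region, dst: Region) -> List[str]:
    """Names of the regions whose boundary a signal must pass through."""
    src_chain = src.enclosing_chain()
    dst_chain = dst.enclosing_chain()
    dst_ids = {id(region) for region in dst_chain}
    # Walk outward from the source until reaching a region that also
    # encloses the destination (the lowest common enclosing region).
    exits = []
    common = None
    for region in src_chain:
        if id(region) in dst_ids:
            common = region
            break
        exits.append(region.name)
    if common is None:
        raise ValueError("regions share no common enclosing region")
    # Then walk inward to the destination.
    entries = []
    for region in dst_chain:
        if region is common:
            break
        entries.append(region.name)
    return exits + entries[::-1]


# Made-up hierarchy: a chip split into two halves, one tile in each.
chip = Region("chip")
left = Region("left_half", chip)
right = Region("right_half", chip)
pcie_tile = Region("pcie_tile", left)
hbm_tile = Region("hbm_tile", right)
crossings = boundaries_to_cross(pcie_tile, hbm_tile)
```

Here a single signal from the PCIe tile to the HBM tile has to be pushed through four boundaries (out of pcie_tile and left_half, into right_half and hbm_tile), and each one needs an explicit port in a structure where values cannot escape.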

Yep. I suspect MLIR could be used to model anything hierarchical or linear in nature. (Since the basic data structure is a hierarchy with lists at the leaves.) Physical spaces are neither, so you’d be fighting the basic data structure; you mention one of the symptoms.

There are actually two separate but related issues here as I see it:

  • Hierarchy: it’s not clear which property (location, type of widget, routing of some sort) one should build the hierarchy on. The choice will affect the “value escape” issue you mention, and it’s almost certainly application-dependent.
  • Multidimensionality: in forcing a 2D structure into a hierarchy one makes it more difficult to reason about locality – two adjacent structures may be very distant in terms of hierarchy edges. (This is actually a problem for any tree-based data structure leading to the – now outdated – notion of data structure threads, links between the leaves in the tree.)
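A small Python sketch of the locality problem in the second bullet, assuming (purely for illustration) that the 2D tile grid is forced into a quadtree. The grid size, the quadtree choice, and the function names are mine, not anything from a real device database.

```python
def quadtree_path(x, y, depth):
    """Quadrant chosen at each level, root first, for tile (x, y)
    in a 2**depth by 2**depth grid."""
    return [((x >> bit) & 1, (y >> bit) & 1)
            for bit in range(depth - 1, -1, -1)]


def hierarchy_distance(a, b, depth):
    """Number of tree edges between the leaves holding tiles a and b."""
    path_a = quadtree_path(a[0], a[1], depth)
    path_b = quadtree_path(b[0], b[1], depth)
    common = 0
    for quad_a, quad_b in zip(path_a, path_b):
        if quad_a != quad_b:
            break
        common += 1
    # Climb from one leaf to the shared ancestor, then descend to the other.
    return 2 * (depth - common)


def physical_distance(a, b):
    """Manhattan distance between two tiles."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


DEPTH = 3  # an 8x8 grid of tiles

# Two pairs of physically adjacent tiles.
same_quadrant = ((0, 0), (1, 0))
across_midline = ((3, 0), (4, 0))
```

Both pairs are one tile apart physically, but the pair that straddles the chip's midline is six tree edges apart (all the way up to the root and back down), while the other pair is only two edges apart.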

It does absolutely make sense to me to link your design hierarchy into some sort of more appropriate device data structure at map/place/route time. It intuitively makes a ton of sense to me to store routing information in the design hierarchy edges, though I’m not sure how much that would buy you in terms of timing estimation.

You doubtless have much more experience with device databases than I do, so I’m largely speaking from intuition rather than experience.