ReturnAddressLocation

Context

I’m currently working on a DSL embedded in Swift which constructs MLIR. Swift has some nice EDSL features, but unfortunately some of them do not make it easy to access the caller’s location information. For functions, there are magic keywords (#file, #line, etc.) that are similar to the __FILE__ and __LINE__ macros in C. Swift also has operators (x ^ y), literals (let x: MyCustomType = 42) and computed properties (var x: Int { return 42 }), none of which have a mechanism for accessing their associated source locations. Some of this could be addressed by improving the language (operators, for instance, could be enhanced to provide this information), but I don’t see a straightforward way to add this information to literals or properties. I’m not an expert, but I could imagine other languages (even C++) might face similar challenges accessing location data everywhere, potentially due to things like callbacks, inheritance or ABI concerns.
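To make the gap concrete, here is a minimal sketch. The names SourceLocation, Value and constant are hypothetical and exist only for illustration; they are not part of any MLIR binding. The function picks up the caller's location through default arguments, while the operator has no parameter slot to do the same.

```swift
// Hypothetical location record, used only for this illustration.
struct SourceLocation: Equatable {
    let file: String
    let line: Int
}

// Hypothetical DSL value that remembers where it was created, when it can.
struct Value {
    let location: SourceLocation?

    // A function can capture the caller's location through default arguments,
    // which are evaluated at the call site.
    static func constant(_ x: Int, file: String = #file, line: Int = #line) -> Value {
        return Value(location: SourceLocation(file: file, line: line))
    }

    // An infix operator function takes exactly its two operands, so there is
    // no extra parameter that could carry #file or #line.
    static func ^ (lhs: Value, rhs: Value) -> Value {
        return Value(location: nil)
    }
}

let a = Value.constant(1)
let b = Value.constant(2)
let c = a ^ b

print(a.location?.line ?? -1, b.location?.line ?? -1, c.location == nil)
```

The two constants report consecutive line numbers, while the result of the operator carries no location at all.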
As I was thinking about this, I realized that there is an alternative mechanism for accessing source location information: debug symbols. I’ve cooked up a proof-of-concept in both Swift and C. Both of these examples are far from robust (the most pressing issue I have at the moment is selecting which dyld image slide to use when calculating the symbol address), and both are completely platform-specific.
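The capture step might look roughly like the sketch below. This is not the proof-of-concept mentioned above; it assumes Foundation's Thread.callStackReturnAddresses and the dladdr loader call are available on the target platform, and the names ImageRelativeAddress, imageRelative and captureReturnAddresses are hypothetical. It expresses each return address as an offset from the load address of the image that contains it, which may be one way to avoid picking a slide by hand. Turning that offset into a file and line still requires a separate lookup in the debug symbols.

```swift
import Foundation

// Hypothetical record: a return address relative to the image that contains it.
struct ImageRelativeAddress {
    let imagePath: String
    let offset: UInt
}

// Asks the dynamic loader which loaded image contains `address`.
// Returns nil when no image claims the address.
func imageRelative(_ address: UInt) -> ImageRelativeAddress? {
    var info = Dl_info()
    guard dladdr(UnsafeRawPointer(bitPattern: address), &info) != 0,
          let base = info.dli_fbase,
          let path = info.dli_fname else { return nil }
    return ImageRelativeAddress(imagePath: String(cString: path),
                                offset: address - UInt(bitPattern: base))
}

// Captures the current stack's return addresses as plain integers.
// Which frames appear, and in what detail, depends on platform and inlining.
@inline(never)
func captureReturnAddresses() -> [UInt] {
    return Thread.callStackReturnAddresses.map { $0.uintValue }
}

let frames = captureReturnAddresses()
for frame in frames.prefix(3) {
    if let relative = imageRelative(frame) {
        print(relative.imagePath, String(relative.offset, radix: 16))
    }
}
```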

Pitch

I can implement this completely in the Swift bindings using OpaqueLocation, but this seems sub-optimal. As far as I understand, OpaqueLocation will not be serialized in the textual representation of MLIR (it requires a fallback location which will be used in its place). As such, I’m wondering whether it makes sense to introduce a new location attribute type, ReturnAddressLocation, which takes a single memory address as an argument and has a simple textual representation. We could also then provide some platform-specific utilities, such as building a chain of call-site locations from a portion of the backtrace, and scaffolding for a pass to symbolicate ReturnAddressLocations into file/line/column locations (and maybe even provide platform-specific implementations of that pass). Luckily the dependencies for this type of approach (dwarfdump and dsymutil) are parts of LLVM.
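The two utilities could be sketched as follows. Everything here is a hypothetical model written in plain Swift: Location, callSiteChain and symbolicate are illustrative names rather than MLIR's actual types or API, and the resolver closure stands in for a real debug-symbol lookup.

```swift
// Hypothetical model of the location kinds discussed here; not MLIR's real types.
indirect enum Location: Equatable {
    case unknown
    case fileLineCol(file: String, line: Int, column: Int)
    case returnAddress(UInt)
    case callSite(callee: Location, caller: Location)
}

// Builds a call-site chain from return addresses ordered innermost frame first.
func callSiteChain(_ addresses: [UInt]) -> Location {
    var chain: Location? = nil
    for address in addresses.reversed() {
        let here = Location.returnAddress(address)
        if let outer = chain {
            chain = .callSite(callee: here, caller: outer)
        } else {
            chain = here
        }
    }
    return chain ?? .unknown
}

// Stand-in for a platform-specific lookup from address to file/line/column.
typealias Resolver = (UInt) -> Location?

// Replaces every return address the resolver knows about and leaves the rest as-is.
func symbolicate(_ location: Location, using resolve: Resolver) -> Location {
    switch location {
    case .returnAddress(let address):
        return resolve(address) ?? location
    case .callSite(let callee, let caller):
        return .callSite(callee: symbolicate(callee, using: resolve),
                         caller: symbolicate(caller, using: resolve))
    case .unknown, .fileLineCol:
        return location
    }
}

// Made-up addresses and file name, purely for demonstration.
let table: [UInt: Location] = [0x1000: .fileLineCol(file: "dsl.swift", line: 3, column: 9)]
let chain = callSiteChain([0x1000, 0x2000])
let resolved = symbolicate(chain) { table[$0] }
```

Addresses the resolver cannot map stay as return addresses, so a partially symbolicated chain is still well-formed.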

Questions

  • Overall, does this approach seem too crazy and destined for pain? I know symbolication is notoriously difficult to get right, though we do seem to have pretty good infrastructure for it in LLVM.
  • Is there some place in the LLVM codebase that does something similar that I can leverage to make this simpler (other than dsymutil and dwarfdump)? I know LLDB does this type of thing for the image lookup command, but I haven’t quite been able to parse that code yet.
  • Is this something that makes sense to include in MLIR? Or is it specific to my use-case?

Hey George,

This is really cool, but direct support for this strikes me as something that is currently too exotic for core MLIR.

I’m not sure how a “pointer to DWARF” could ever be reasonably serialized, even ignoring current infra issues. However, does MLIR let an OpaqueLocation serialize itself as other locations? The nice thing to do here would be to lazily convert the stack trace information into FileLineCol records if and only if a .mlir file is emitted with locations on.

This sort of thing would also be useful for the SIL style of location information, where you have a pointer to the AST node as your location info when coming out of the parser (very useful for generating syntactic diagnostics in the early passes), but serializing to a SIL bytecode file drops to file/line/col.

-Chris
