Context
I’m currently working on a DSL embedded in Swift that constructs MLIR. Swift has some nice EDSL features, but unfortunately some of them do not make it easy to access the caller’s location information. For functions, there are magic keywords (`#file`, `#line`, etc.) that are similar to the `__FILE__` and `__LINE__` macros in C. Swift also has operators (`x ^ y`), literals (`let x: MyCustomType = 42`) and computed properties (`var x: Int { return 42 }`), none of which have a mechanism to access their associated source locations. Some of this could be addressed by improving the language (operators, for instance, could be enhanced to provide this information), but I don’t see a straightforward way to add this information to literals or properties. I’m not an expert, but I could imagine other languages (even C++) might face similar challenges accessing location data everywhere, potentially due to things like callbacks, inheritance or ABI concerns.
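To make the gap concrete, here is a minimal sketch of the four cases. `SourceLocation`, `Value`, `constant` and `negated` are hypothetical names invented for illustration and are not part of any real MLIR binding; the point is only where a defaulted `#file`/`#line` parameter can and cannot go.

```swift
// Hypothetical types for illustration only; not a real MLIR binding.
struct SourceLocation: Equatable {
    let file: String
    let line: Int
}

struct Value {
    let location: SourceLocation?
}

// Functions can capture the call site through defaulted parameters.
func constant(_ value: Int, file: String = #file, line: Int = #line) -> Value {
    Value(location: SourceLocation(file: file, line: line))
}

// An operator function takes exactly its operands, so there is no slot
// for a defaulted #file/#line parameter.
func ^ (lhs: Value, rhs: Value) -> Value {
    Value(location: nil)
}

// The literal initializer's signature is fixed by the protocol requirement.
extension Value: ExpressibleByIntegerLiteral {
    init(integerLiteral value: Int) {
        self.init(location: nil)
    }
}

// A computed property takes no parameters at all.
extension Value {
    var negated: Value { Value(location: nil) }
}

let a = constant(1)   // location records this file and line
let b: Value = 2      // literal: no location available
let c = a ^ b         // operator: no location available
let d = a.negated     // computed property: no location available
```

Only `a` ends up with a location; the other three have nowhere to receive one from the caller.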
As I was thinking about this, I realized that there is an alternative mechanism for accessing source location information: debug symbols. I’ve cooked up a proof-of-concept in both Swift and C. Both of these examples are far from robust (the most pressing issue I have at the moment is selecting which `dyld` image slide to use when calculating the symbol address) and are completely platform-specific.
Pitch
I can implement this completely in the Swift bindings using `OpaqueLocation`, but this seems sub-optimal. As far as I understand, `OpaqueLocation` will not be serialized in the textual representation of MLIR (it requires a fallback location, which is used in its place). As such, I’m wondering whether it makes sense to introduce a new location attribute type, `ReturnAddressLocation`, which takes a single memory address as an argument and has a simple textual representation. We could then also provide some platform-specific utilities, such as building a chain of call-site locations from a portion of the backtrace, and scaffolding for a pass that symbolicates `ReturnAddressLocation`s into file/line/column locations (and maybe even platform-specific implementations of that pass). Luckily, the dependencies for this type of approach (`dwarfdump` and `dsymutil`) are parts of LLVM.
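As a rough sketch of what I have in mind, here is a toy Swift model of the proposed location next to two existing MLIR location kinds. Everything here is an assumption for illustration: the `Location` enum, `callSiteChain`, `symbolicate`, and especially the `return_address(0x...)` spelling are made up and are not existing MLIR syntax or API. The file/line/column and `callsite(... at ...)` forms mirror how MLIR prints those locations today. The `lookup` closure stands in for whatever platform-specific symbolication would actually be used.

```swift
// Toy model only; none of these names exist in MLIR or in any Swift binding.
indirect enum Location: Equatable {
    case fileLineCol(file: String, line: Int, column: Int)
    case callSite(callee: Location, caller: Location)
    case returnAddress(UInt64)  // the proposed kind; spelling is hypothetical

    // Text that would appear inside `loc(...)`.
    var text: String {
        switch self {
        case let .fileLineCol(file, line, column):
            return "\"\(file)\":\(line):\(column)"
        case let .callSite(callee, caller):
            return "callsite(\(callee.text) at \(caller.text))"
        case let .returnAddress(address):
            return "return_address(0x" + String(address, radix: 16) + ")"
        }
    }
}

// Builds nested call-site locations from return addresses ordered
// innermost frame first. Returns nil for an empty backtrace.
func callSiteChain(_ backtrace: [UInt64]) -> Location? {
    guard let outermost = backtrace.last else { return nil }
    var chain = Location.returnAddress(outermost)
    for address in backtrace.dropLast().reversed() {
        chain = .callSite(callee: .returnAddress(address), caller: chain)
    }
    return chain
}

// Replaces every address that `lookup` can resolve and leaves the rest
// untouched, so partially symbolicated output still round-trips.
func symbolicate(_ location: Location, lookup: (UInt64) -> Location?) -> Location {
    switch location {
    case let .returnAddress(address):
        return lookup(address) ?? location
    case let .callSite(callee, caller):
        return .callSite(callee: symbolicate(callee, lookup: lookup),
                         caller: symbolicate(caller, lookup: lookup))
    case .fileLineCol:
        return location
    }
}

let chain = callSiteChain([0x10, 0x20])!
print("loc(\(chain.text))")
```

Keeping unresolved addresses in place, rather than dropping them, means the symbolication pass could run later or on a different machine that has the debug symbols.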
Questions
- Overall, does this approach seem too crazy and destined for pain? I know symbolication is notoriously difficult to get right, though we do seem to have pretty good infrastructure for it in LLVM.
- Is there some place in the LLVM codebase that does something similar that I can leverage to make this simpler (other than `dsymutil` and `dwarfdump`)? I know LLDB does this type of thing for the `image lookup` command, but I haven’t quite been able to parse that code yet.
- Is this something that makes sense to include in MLIR, or is it specific to my use case?