Just an FYI that I added some basic TOSA support in [Add a basic TOSA E2E backend (llvm/torch-mlir#355)](https://github.com/llvm/torch-mlir/pull/355).
Now we just need to add more patterns in TorchToTosa.cpp.
Two points of discussion:
I was first going to implement torch.aten.mm in terms of MATMUL, but I found that MATMUL takes a leading batch dimension. I couldn't find a way in TOSA to do a dynamically shaped "insert a leading 1 dimension" operation (i.e., a rank-expanding reshape whose result shape is not statically known), so I ended up just going with TANH for the first op.
It's fine if we want to limit this to static shapes, but since we seem to aspire for TosaToLinalg to support dynamic shapes (Rob said that many ops already do), we probably want this layer to support them as well.
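For the static-shape case, the lowering I had in mind would look roughly like this (a sketch from memory, using the generic op syntax; shapes are illustrative and not from the PR):

```mlir
// Hypothetical static-shape lowering of torch.aten.mm:
// insert a leading batch dim of 1, do the batched matmul, then drop it.
func @mm(%lhs: tensor<4x8xf32>, %rhs: tensor<8x16xf32>) -> tensor<4x16xf32> {
  %lhs3 = "tosa.reshape"(%lhs) {new_shape = [1, 4, 8]}
      : (tensor<4x8xf32>) -> tensor<1x4x8xf32>
  %rhs3 = "tosa.reshape"(%rhs) {new_shape = [1, 8, 16]}
      : (tensor<8x16xf32>) -> tensor<1x8x16xf32>
  %mm = "tosa.matmul"(%lhs3, %rhs3)
      : (tensor<1x4x8xf32>, tensor<1x8x16xf32>) -> tensor<1x4x16xf32>
  %result = "tosa.reshape"(%mm) {new_shape = [4, 16]}
      : (tensor<1x4x16xf32>) -> tensor<4x16xf32>
  return %result : tensor<4x16xf32>
}
```

The sticking point is that tosa.reshape's `new_shape` is a static attribute, so this pattern has no obvious analog when the input shape is dynamic.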
This is largely related to dynamic shapes as well. Consider a matmul with dynamic shapes where the K dimensions mismatch at runtime. What is the TOSA program specced to do? I see two options:
A. The program must safely report an error and return control to the caller
B. Undefined behavior (it might scribble over random memory, format your hard drive, etc.)
Linalg-on-tensors has behavior B. So if we want TOSA to have behavior A, then TosaToLinalg needs to insert error guards (which it doesn't currently do).
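Concretely, behavior A would mean TosaToLinalg emitting something like the following guard before the matmul (a sketch only, using current upstream op names; the exact assert mechanism is an assumption):

```mlir
// Hypothetical error guard for behavior A: check that the contracting
// dimensions agree at runtime before executing the matmul.
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%k_lhs = tensor.dim %lhs, %c2 : tensor<1x?x?xf32>
%k_rhs = tensor.dim %rhs, %c1 : tensor<1x?x?xf32>
%ok = arith.cmpi eq, %k_lhs, %k_rhs : index
cf.assert %ok, "mismatched contracting dimension for tosa.matmul"
```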
We have to broach a similar topic with respect to broadcasting behavior for elementwise ops (when a `?` turns out to be 1 at runtime and broadcasts against the opposite dimension). In TorchToLinalg and MHLO-to-Linalg, we use a strategy where any dynamic size-1 broadcast is a runtime error, but cases where the dimension is statically 1 are handled (the code in TorchToLinalg is here). This is not super principled (type inference can convert a program from a runtime error to a success), but it has worked well in practice and seems reasonable to adopt for TOSA.
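Under that strategy, the check emitted for a pair of dynamic dimensions looks roughly like this (a sketch; op names assumed, not the actual TorchToLinalg code):

```mlir
// Hypothetical check for the "dynamic size-1 broadcast is an error" strategy:
// dynamic dims must match exactly at runtime; a dynamic dim that turns out
// to be 1 and would need broadcasting is reported as an error instead.
%c0 = arith.constant 0 : index
%d_lhs = tensor.dim %lhs, %c0 : tensor<?xf32>
%d_rhs = tensor.dim %rhs, %c0 : tensor<?xf32>
%same = arith.cmpi eq, %d_lhs, %d_rhs : index
cf.assert %same, "invalid broadcast: dynamic dimensions must match exactly"
```

Statically-1 dimensions never reach this check, since the lowering handles them at compile time; that asymmetry is exactly the "not super principled" part mentioned above.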