Opening this thread to discuss technical details around the development of a path to perform TOSA functional verification through the MLIR EmitC dialect.
This was briefly discussed on Discord, starting with a request with @marbre, but I'm bringing it here for broader visibility.
The TOSA reference model (reference_model.git) consumes TOSA in flatbuffers form. It takes the model and the network inputs and emits the functional network output. It is aligned to the TOSA specification.
TOSA dialect MLIR can be converted to flatbuffers using the following MLIR pass: tosa_mlir_translator.git. It can be integrated into projects as a CMake submodule, though we also integrate it into TensorFlow Bazel builds using a custom build rule. Details are in [RFC] Tosa import/export tool - #33 by sjarus.
The idea is to drive the TOSA reference model by generating C API calls to the reference model from the TOSA MLIR form, using the EmitC dialect. The reference model would then be built as a library rather than a binary.
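To make the shape of this concrete, here is a minimal sketch of what emitted C could look like for a two-op graph. Everything in it is an assumption for illustration: `ref_model_add_i32` is a hypothetical stand-in for a library entry point (the reference model does not expose this function today), and `generated_forward` is only my guess at the style of code an EmitC-based path might produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL library entry point, not the real reference model API.
 * Stands in for a bit-accurate elementwise ADD on int32 tensors. */
void ref_model_add_i32(const int32_t *lhs, const int32_t *rhs,
                       int32_t *out, size_t count) {
    assert(lhs != NULL && rhs != NULL && out != NULL);
    for (size_t i = 0; i < count; ++i) {
        out[i] = lhs[i] + rhs[i];
    }
}

/* Illustrative "generated" code for a graph of the form:
 *   %0 = add(%arg0, %arg1)
 *   %1 = add(%0, %arg1)
 * Each op becomes one call into the (hypothetical) library. */
enum { SKETCH_MAX_ELEMS = 16 };

void generated_forward(const int32_t *arg0, const int32_t *arg1,
                       int32_t *result, size_t count) {
    int32_t tmp[SKETCH_MAX_ELEMS]; /* buffer for the intermediate value %0 */
    assert(count <= SKETCH_MAX_ELEMS);
    ref_model_add_i32(arg0, arg1, tmp, count);
    ref_model_add_i32(tmp, arg1, result, count);
}
```

A test harness would link this against the library, feed it the network inputs, and compare `result` against the frontend reference output.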
Conversation So Far
On our part, when we implemented the reference model, we considered offering a mechanism like this, but there was no immediate use case then. Adding @jsmolens and @eric-k here as additional involved folks at Arm.
The reference model isn’t a performance-focused runtime - its focus is on precision and bit accuracy for comparison to frontend reference output, and it serves as a critical part of HW/SW co-design efforts.
There are some design considerations around this proposal, e.g.
- Passing parameter and datatype information in a manner that is easy to parse on the reference model side.
- How to properly convey the optional quantization information to the reference model interface? A cleanup of the dialect interface is planned for the next TOSA minor version update (v0.24) that should significantly simplify this, but that update is only intended to happen in January.
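As a strawman for both points above, one option is a small self-describing tensor descriptor passed across the C boundary. This is purely a sketch: none of these type or function names exist in the reference model, and the zero-point fields are an assumption about what the quantization information would need to carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are HYPOTHETICAL, for discussion only. */
typedef enum {
    SKETCH_DTYPE_INT8,
    SKETCH_DTYPE_INT32,
    SKETCH_DTYPE_FP32
} sketch_dtype;

/* Assumed shape of the optional quantization information. */
typedef struct {
    int32_t input_zp;
    int32_t output_zp;
} sketch_quant_info;

#define SKETCH_MAX_RANK 6

typedef struct {
    sketch_dtype dtype;              /* explicit tag, no string parsing */
    uint32_t rank;
    int64_t shape[SKETCH_MAX_RANK];
    const sketch_quant_info *quant;  /* NULL when no quantization info */
    void *data;
} sketch_tensor;

/* Element size in bytes for each dtype tag. */
size_t sketch_dtype_size(sketch_dtype dtype) {
    switch (dtype) {
    case SKETCH_DTYPE_INT8:  return 1;
    case SKETCH_DTYPE_INT32: return 4;
    case SKETCH_DTYPE_FP32:  return 4;
    }
    return 0; /* unknown tag */
}

/* Total payload size the library side should expect for this tensor. */
size_t sketch_tensor_bytes(const sketch_tensor *t) {
    assert(t != NULL && t->rank <= SKETCH_MAX_RANK);
    size_t elems = 1;
    for (uint32_t i = 0; i < t->rank; ++i) {
        assert(t->shape[i] >= 0);
        elems *= (size_t)t->shape[i];
    }
    return elems * sketch_dtype_size(t->dtype);
}
```

Using a nullable pointer for the quantization information keeps the optional case explicit: the library can check `quant` for `NULL` instead of guessing from sentinel values.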