I am playing with the conversion between Keras, TF and MLIR to see what the generated code looks like (and possibly optimize it). The code generated for RNNs seems particularly bloated, and while looking into the causes, I stumbled upon a conversion problem that might explain it.
As far as I can tell, there is no way to execute a stateful model step by step (sample by sample); I can only execute one batch at a time. So, to get step-by-step stateful execution, I have to set the “stateful=True” flag in the RNN definition and then feed each sample as its own single-step input batch.
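To make the workaround concrete, here is a minimal sketch of what I mean. It assumes the `tf.keras` API; the sizes `FEATURES`, `UNITS` and `STEPS`, the random input and the choice of an LSTM layer are only illustrative, not my actual model.

```python
import numpy as np
import tensorflow as tf

FEATURES = 4  # hypothetical input width
UNITS = 8     # hypothetical LSTM size
STEPS = 5     # number of samples fed one by one

# A stateful RNN needs a fully known batch shape: (batch=1, time=1, features).
inputs = tf.keras.Input(batch_shape=(1, 1, FEATURES))
lstm = tf.keras.layers.LSTM(UNITS, stateful=True)
model = tf.keras.Model(inputs, lstm(inputs))

# Illustrative data: one sequence of STEPS samples.
sequence = np.random.rand(STEPS, FEATURES).astype("float32")

lstm.reset_states()  # start the sequence from a zero state
step_outputs = []
for sample in sequence:
    # Wrap a single sample as a batch holding one sequence of length one.
    x = sample.reshape(1, 1, FEATURES)
    # The layer keeps its state between calls because stateful=True.
    step_outputs.append(model.predict_on_batch(x))

step_outputs = np.concatenate(step_outputs, axis=0)  # shape (STEPS, UNITS)
```

The Python `for` loop here is the extra outer loop I am referring to: every call still goes through the batch machinery for a batch of size one and a sequence of length one.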
This execution mechanism is very inefficient, since it adds a useless extra loop. Do you happen to know if there’s a way to natively execute batches step by step?