Add erf to math dialect

I want to add the erf function as an operation in the math dialect, together with its lowering.
Is this the appropriate place? Is erf significant enough to belong in math?


Yes, we should have this (and a lowering path, such as lowering it to arith ops or other math ops that lower to arith ops).

The integral in the erf formula has no closed-form solution, so it has to be approximated. I am not sure which approximation is best. On most platforms C’s erf is a library call. I found the implementation of erf in libm, which is part of glibc. It is probably a good approximation.
Other ops in math have conversions to their LLVM counterparts, SPIR-V and libm.
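
For reference, the integral mentioned above is the standard definition of the error function:

$$
\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt
$$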

Some math functions have approximations that do not rely on library calls: llvm-project/PolynomialApproximation.cpp at main · llvm/llvm-project · GitHub

I will add approximation(s) for erf. Different approximations are possible depending on the accuracy required. They may use only polynomials, or they may also use the exp function. Approximations with exp are better suited to hardware that has an instruction for it. Where there is no exp instruction, MLIR already has a polynomial approximation for math.exp. The implementation in libm is the most accurate. It even goes up to float128 with an accuracy of around 10^−35. I don’t think I want to go there.
I plan to start with an approximation from Abramowitz and Stegun that is also shown in Error function - Wikipedia. It has a maximum error of 1.5 × 10^−7. It uses exp and a polynomial of degree 5. The constants there are given to up to 10 digits, so the accuracy in float32 will be lower.
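
To make the proposal concrete, here is a minimal Python sketch of that approximation (Abramowitz and Stegun formula 7.1.26). It is only a scalar reference for checking values, not the MLIR lowering. The function name `erf_approx` is made up for illustration, and extending the formula to negative inputs through erf(−x) = −erf(x) is an assumption of this sketch, since the formula itself is stated for x ≥ 0.

```python
import math

# Constants of Abramowitz and Stegun, formula 7.1.26 (valid for x >= 0).
P = 0.3275911
A = (0.254829592, -0.284496736, 1.421413741, -1.453152027, 1.061405429)


def erf_approx(x: float) -> float:
    """Approximate erf(x) as 1 - (a1*t + ... + a5*t^5) * exp(-x^2), t = 1/(1 + p*|x|)."""
    ax = abs(x)
    t = 1.0 / (1.0 + P * ax)
    # Horner evaluation of a1*t + a2*t^2 + ... + a5*t^5.
    poly = 0.0
    for a in reversed(A):
        poly = (poly + a) * t
    y = 1.0 - poly * math.exp(-ax * ax)
    # erf is odd, so mirror the result for negative inputs.
    return y if x >= 0 else -y


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 1.0, 3.0):
        print(x, erf_approx(x), math.erf(x))
```

This runs in double precision. In float32 the rounded constants and intermediate rounding would add to the error, as noted above.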


On another communication channel @_sean_silva suggested using Sleef’s erf implementation as a reference. It uses a complicated representation of floating point values as pairs of numbers to reach a low maximum error of 1 ULP. Polynomial approximations for other operations in the math dialect don’t use such high-precision approaches. I don’t know if it is worth following this approach.
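
To illustrate what “representing a value with 2 numbers” means, here is a small sketch of the classic error-free addition (Knuth’s TwoSum), which is the basic building block of such double-word arithmetic. This is not SLEEF’s code, and the name `two_sum` is just for illustration. It only shows how a high part and a low part together carry more precision than a single float.

```python
def two_sum(a: float, b: float) -> tuple[float, float]:
    """Return (s, err) with s = fl(a + b) and s + err == a + b exactly (barring overflow)."""
    s = a + b
    bb = s - a                       # the part of b that actually made it into s
    err = (a - (s - bb)) + (b - bb)  # rounding error lost in the addition
    return s, err


if __name__ == "__main__":
    # 1e-17 is too small to change 1.0 in double precision, so it ends up in the low part.
    print(two_sum(1.0, 1e-17))
```

Every operation on such pairs costs several ordinary floating point operations, which is the price paid for the tighter error bound.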

I think we can play with the performance/accuracy tradeoffs later. For now we just need something that works and is reasonably easy to implement. So using SLEEF’s approach seems fine to me for now. The code you linked seems to have a lot of control flow… we probably want to base ours on a vectorized version if possible, like this one: llvm-project/ at ac0561ebb734e6241d76c4661507960e8a6dfb20 · llvm/llvm-project · GitHub