Carry-less multiplication[1] instructions exist (at least optionally) on many architectures: armv8, RISC-V, x86_64, POWER, SPARC, C64x, and possibly more.

This proposal is to add a `llvm.clmul`

instruction. Or if that is contentious, `llvm.experimental.bitmanip.clmul`

instruction. It takes two integer operands of the same width, and returns an integer with twice the width of the operands. (Is there a good reason to make these the same width, as all the other operations do even when it doesn’t really make sense for the mathematical operation–like multiplication or ctpop/ctlz/cttz?)

If the CPU does not have a dedication clmul operation, it can be lowered to regular multiplication, by using holes to avoid carrys.

==Where is clmul used?==

While somewhat specialized, the RISC-V manual documents many uses: [2]

The classic applications forclmulare Cyclic Redundancy Check (CRC) [11, 26]

and Galois/CounterMode (GCM), but more applications exist, including the following examples.There are obvious applications in hashing and pseudo random number generations. For exam-ple, it has been reported that hashes based on carry-less multiplications can outperform Google’sCityHash [17].

clmulof a number with itself inserts zeroes between each input bit. This can be useful for generatingMorton code [23].

clmulof a number with -1 calculates the prefix XOR operation. This can be useful for decodinggray codes.Another application of XOR prefix sums calculated withclmulis branchless tracking of quotedstrings in high-performance parsers. [16]

Carry-less multiply can also be used to implement Erasure code efficiently. [14]

==clmul lowering without hardware support==

A 8x8=>16 clmul can also be lowered to a 32x32=>64 multiplication when there is no specialized instruction (also 15x15=>30, to a 60x60=>120, or if bitreverse is available 16x16=>32 to TWO 64x64=>64 multiplications)[3].

[1] https://en.wikipedia.org/wiki/Carry-less_product

[2] (page 30) https://raw.githubusercontent.com/riscv/riscv-bitmanip/master/bitmanip-0.92.pdf

[3] https://www.bearssl.org/constanttime.html