[llvm] CodeGen: Add -denormal-fp-math-bf16 flag (PR #90425)
Andy Kaylor via llvm-commits
llvm-commits at lists.llvm.org
Thu May 9 16:12:39 PDT 2024
andykaylor wrote:
> > I'm afraid so. BF16 instructions fall under two CPUID feature flags, AVX512_BF16 and AVX_NE_CONVERT. E.g.
>
> Is it all the operations, or just these specific dot products? Are there specific bf16<->float conversions that have the same issue? What about basic arithmetic operations?
Is this really a case of "defective instructions", or is it just a difference in how Intel processors treat the bfloat16 type compared to other architectures? The Intel white paper on bfloat16 (https://www.intel.com/content/www/us/en/content-details/671279/bfloat16-hardware-numerics-definition.html) says, "There is no need to support denormals; FP32, and therefore also BF16, offer more than enough range for deep learning training tasks."
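For what it's worth, the range argument follows from the format itself: BF16 keeps FP32's 8-bit exponent (with a 7-bit mantissa), so its smallest positive normal value is 2^-126, roughly 1.2e-38, and subnormals only extend that down to 2^-133. Flushing them gives up very little dynamic range.

As a rough sketch of how a per-type control could surface at the IR level, by analogy with the existing "denormal-fp-math" and "denormal-fp-math-f32" function attributes (the attribute spelling below is my assumption, not necessarily what this patch lands on):

  ; Hypothetical: bf16 arithmetic flushes denormals to zero, while the
  ; function's other FP types keep full IEEE subnormal behavior.
  define bfloat @scale(bfloat %x, bfloat %y) #0 {
    %r = fmul bfloat %x, %y
    ret bfloat %r
  }

  attributes #0 = {
    "denormal-fp-math"="ieee,ieee"
    "denormal-fp-math-bf16"="preserve-sign,preserve-sign"
  }

A per-type attribute along these lines would let the backend assume DAZ/FTZ semantics only for the type where the hardware actually behaves that way, instead of pessimizing every floating-point type in the function.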
https://github.com/llvm/llvm-project/pull/90425