[llvm] AMDGPU: Correct cycle counts for f64 mfma on gfx940 (PR #83782)

Stanislav Mekhanoshin via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 4 11:36:01 PST 2024


https://github.com/rampitec requested changes to this pull request.

Indeed, the number of cycles for v_mfma_f64_16x16* is incorrect. However, the hazard recognizer tests are correct. The hazard recognizer uses TargetSchedModel and the number of cycles to insert the correct number of waits (because that is the proper modeling of the underlying hazard). The code related to 8-pass DGEMM now needs to be moved from 'case 16' to 'case 8' in `GCNHazardRecognizer::checkMAIHazards90A` and `GCNHazardRecognizer::checkMAIVALUHazards`.
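To illustrate why the move is needed, here is a minimal, self-contained sketch of a switch keyed on the pass count. It is not the actual LLVM code: `WaitClass`, `classifyMAIHazard`, and the enumerator names are hypothetical, and real wait-state counts are deliberately left out. The only point is that once the scheduling model reports 8 passes for the f64 MFMA, a DGEMM branch left under 'case 16' is no longer reached.

```cpp
// Hypothetical categories standing in for the real wait-state counts,
// which are intentionally not reproduced here.
enum class WaitClass {
  Dgemm8Pass,    // 8-pass DGEMM handling (the code that has to move)
  Generic8Pass,  // any other 8-pass MFMA
  Generic16Pass, // any 16-pass MFMA
  Other          // pass counts this sketch does not model
};

// Hypothetical helper: picks a hazard category from the pass count that
// the scheduling model reports for the MFMA, plus a DGEMM flag.
WaitClass classifyMAIHazard(int NumPasses, bool IsDGEMM) {
  switch (NumPasses) {
  case 8:
    // After the cycle-count fix the f64 MFMA reports 8 passes, so the
    // DGEMM-specific handling has to live under this case.
    return IsDGEMM ? WaitClass::Dgemm8Pass : WaitClass::Generic8Pass;
  case 16:
    // Before the fix the DGEMM handling sat here. With the corrected
    // model a DGEMM never arrives with 16 passes, so it would be skipped.
    return WaitClass::Generic16Pass;
  default:
    return WaitClass::Other;
  }
}
```

With this layout an 8-pass DGEMM gets its dedicated handling, and the 16-pass case is left with non-DGEMM instructions only.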

No, DGEMM does not use XDL; it uses the VALU pipeline. That is correct.

https://github.com/llvm/llvm-project/pull/83782