[PATCH] D115385: [CostModel][AMDGPU] Fix intrinsics costs estimations.

Daniil Fukalov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 8 14:15:01 PST 2021


dfukalov added inline comments.


================
Comment at: llvm/test/Analysis/CostModel/AMDGPU/fma.ll:85
+; NOPACKEDF32-LABEL: 'fma_f64'
+; NOPACKEDF32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %f64 = call double @llvm.fma.f64(double undef, double undef, double undef) #2
+; NOPACKEDF32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v2f64 = call <2 x double> @llvm.fma.v2f64(<2 x double> undef, <2 x double> undef, <2 x double> undef) #2
----------------
rampitec wrote:
> That has nothing to do with packed f32. It is just because f64 is full rate on this target.
The `PACKEDF32` and `NOPACKEDF32` prefixes only distinguish gfx90a from the group of (gfx900, gfx1010) targets. They are not tied to f64 or any other type; they exist to minimize the number of `-check-prefixes` subsets. Please check the `RUN:` lines: I added one for the gfx1010 target. It has the same costs as gfx900, so I renamed (GFX900|GFX90A) to (NOPACKEDF32|PACKEDF32).
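
As a rough illustration of the grouping (a sketch only, not the exact `RUN:` lines from the patch; the triple and the omission of any other prefixes or `-mattr` options are assumptions), gfx900 and gfx1010 would share one prefix while gfx90a gets its own:

```
; Illustrative sketch: gfx900 and gfx1010 have the same costs, so they share a prefix.
; RUN: opt -cost-model -analyze -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx900 < %s | FileCheck -check-prefixes=NOPACKEDF32 %s
; RUN: opt -cost-model -analyze -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx1010 < %s | FileCheck -check-prefixes=NOPACKEDF32 %s
; gfx90a is checked separately.
; RUN: opt -cost-model -analyze -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a < %s | FileCheck -check-prefixes=PACKEDF32 %s
```

With this layout, a `NOPACKEDF32-NEXT:` line is verified for both the gfx900 and the gfx1010 runs, so identical expectations are written once.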


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115385/new/

https://reviews.llvm.org/D115385


