[PATCH] D115385: [CostModel][AMDGPU] Fix intrinsics costs estimations.

Thu Dec 9 12:31:53 PST 2021

rampitec added inline comments.

================
Comment at: llvm/test/Analysis/CostModel/AMDGPU/fma.ll:85
+; NOPACKEDF32-LABEL: 'fma_f64'
+; NOPACKEDF32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %f64 = call double @llvm.fma.f64(double undef, double undef, double undef) #2
+; NOPACKEDF32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v2f64 = call <2 x double> @llvm.fma.v2f64(<2 x double> undef, <2 x double> undef, <2 x double> undef) #2
----------------
dfukalov wrote:
> rampitec wrote:
> > dfukalov wrote:
> > > rampitec wrote:
> > > > That has nothing to do with packed f32. It is just because f64 is full rate on this target.
> > > These `PACKEDF32` and `NOPACKEDF32` prefixes only distinguish gfx90a from the group of (gfx900, gfx1010) targets. They are not related to f64 or other types; they are used to minimize the number of `-check-prefixes` subsets. Please check the `RUN:` lines: I added one for the gfx1010 target. It has the same costs as gfx900, so I renamed (GFX900|GFX90A) to (NOPACKEDF32|PACKEDF32).
> > I understand, but it is misleading.
> So what names do you suggest for the groups (gfx90a) and (gfx900|gfx1010) in the test?
This should get its own FASTF64/SLOWF64 check.
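
As a rough sketch of what such a split could look like (the `RUN:` flags, the triple, and the `ALL` prefix below are illustrative assumptions, not copied from the patch, and the costs are matched with a regex rather than pinned, since the right numbers depend on the final cost model):

```
; Hypothetical RUN lines: gfx900 and gfx1010 share SLOWF64, gfx90a gets FASTF64.
; RUN: opt -cost-model -analyze -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx900 < %s | FileCheck -check-prefixes=ALL,SLOWF64 %s
; RUN: opt -cost-model -analyze -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx1010 < %s | FileCheck -check-prefixes=ALL,SLOWF64 %s
; RUN: opt -cost-model -analyze -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a < %s | FileCheck -check-prefixes=ALL,FASTF64 %s

; ALL-LABEL: 'fma_f64'
; The {{[0-9]+}} patterns are placeholders; a real test would pin the cost per prefix.
; SLOWF64: Cost Model: Found an estimated cost of {{[0-9]+}} for instruction: %f64 = call double @llvm.fma.f64
; FASTF64: Cost Model: Found an estimated cost of {{[0-9]+}} for instruction: %f64 = call double @llvm.fma.f64
define void @fma_f64() {
  %f64 = call double @llvm.fma.f64(double undef, double undef, double undef)
  ret void
}

declare double @llvm.fma.f64(double, double, double)
```

With f64 checks under their own prefixes, the `PACKEDF32`/`NOPACKEDF32` names would only need to cover the f32 cases they actually describe.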

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115385/new/

https://reviews.llvm.org/D115385