[PATCH] D115385: [CostModel][AMDGPU] Fix intrinsics costs estimations.
Daniil Fukalov via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 9 12:01:44 PST 2021
dfukalov added inline comments.
================
Comment at: llvm/test/Analysis/CostModel/AMDGPU/fma.ll:85
+; NOPACKEDF32-LABEL: 'fma_f64'
+; NOPACKEDF32-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %f64 = call double @llvm.fma.f64(double undef, double undef, double undef) #2
+; NOPACKEDF32-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %v2f64 = call <2 x double> @llvm.fma.v2f64(<2 x double> undef, <2 x double> undef, <2 x double> undef) #2
----------------
rampitec wrote:
> dfukalov wrote:
> > rampitec wrote:
> > > That has nothing to do with packed f32. It is just because f64 is full rate on this target.
> > The `PACKEDF32` and `NOPACKEDF32` prefixes only distinguish gfx90a from the group of (gfx900, gfx1010) targets. They are not related to f64 or any other type; they are there to minimize the number of `-check-prefixes` subsets. Please check the `RUN:` lines: I added one for the gfx1010 target. It has the same costs as gfx900, so I renamed (GFX900|GFX90A) to (NOPACKEDF32|PACKEDF32).
> I understand, but it is misleading.
So what names do you suggest for the (gfx90a) and (gfx900|gfx1010) groups in the test?
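For reference, here is a simplified sketch of how the `RUN:` lines map targets to the two prefix groups. The `opt` invocation and the triple are assumptions for illustration and are not copied from the patch; only the target names and the prefix names come from the discussion above.

```
; Hypothetical, simplified RUN lines: gfx900 and gfx1010 share one prefix
; because their costs are the same; gfx90a gets its own prefix.
; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx900 < %s | FileCheck -check-prefixes=NOPACKEDF32 %s
; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx1010 < %s | FileCheck -check-prefixes=NOPACKEDF32 %s
; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=amdgcn-unknown-amdhsa -mcpu=gfx90a < %s | FileCheck -check-prefixes=PACKEDF32 %s
```

Renaming the groups would only require changing the `-check-prefixes` values and the matching `-LABEL:`/`-NEXT:` lines; the checked costs would stay the same.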
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115385/new/
https://reviews.llvm.org/D115385