[llvm] AMDGPU: Improve cost handling of canonicalize (PR #101479)
Christudasan Devadasan via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 1 07:47:20 PDT 2024
================
@@ -7,10 +7,10 @@
; Simple 3-pair chain with loads and stores
define amdgpu_kernel void @test1_as_3_3_3_v2f16(ptr addrspace(3) %a, ptr addrspace(3) %b, ptr addrspace(3) %c) {
; GCN-LABEL: @test1_as_3_3_3_v2f16(
-; GCN-NEXT: [[TMP2:%.*]] = load <2 x half>, ptr addrspace(3) [[A:%.*]], align 2
-; GCN-NEXT: [[TMP4:%.*]] = load <2 x half>, ptr addrspace(3) [[B:%.*]], align 2
-; GCN-NEXT: [[TMP5:%.*]] = fmul <2 x half> [[TMP2]], [[TMP4]]
-; GCN-NEXT: store <2 x half> [[TMP5]], ptr addrspace(3) [[C:%.*]], align 2
+; GCN-NEXT: [[TMP1:%.*]] = load <2 x half>, ptr addrspace(3) [[A:%.*]], align 2
----------------
cdevadas wrote:
Nit: The only diff in this test is the change in TMP variable indices. Maybe pre-commit them?
Several other tests in this file show the same index-only changes; the last test, `canonicalize_v2f16`, is the exception.
https://github.com/llvm/llvm-project/pull/101479