[llvm] [AMDGPU] Fix error in #88512. (PR #92770)

Mon May 20 08:22:09 PDT 2024

https://github.com/PeddleSpam created https://github.com/llvm/llvm-project/pull/92770

- Reapply "[ctx_profile] Integration test (#92456)"
- [Github] Revert accidental changes to dependabot config
- Fix: remove wrongly pushed etime-function.mlir at toplevel (#92634)
- [MCAsmParser] .macro/.rept/.irp/.irpc: remove excess \n after expansion
- [flang][OpenMP] Re-enable tests when building OpenMP as a runtime (#89046)
- [flang][OpenMP] Try to unify induction var privatization for OMP regions. (#91116)
- [MCAsmParser] Improve .rept/.irp tests
- [clang][ThreadSafety] Skip past implicit cast in `translateAttrExpr`
- [clang][NFC] Further improvements to const-correctness
- [GlobalIsel] Combine select to integer min max more (#92570)
- [X86][CodeGen] Support flags copy lowering for CCMP/CTEST (#91849)
- [mlir] Add operator<< for printing `Block` (#92550)
- [flang][cuf] Add attr gen dependency to fix #92635
- [nfc][ctx_profile] Fix printf - related `-Wformat-pedantic`
- [NVPTX] support immediate values in st.param instructions (#91523)
- [VPlan] Remove unused removeLastOperand (NFC).
- [dsymutil] Use operator==(StringRef, StringRef) (NFC)
- [DWARFLinker] Use an implicit conversion of SmallString to StringRef (NFC)
- [DXIL] Use consistent SmallVector parameters
- [DAG] Use copysign in frem power-2 fold. (#91751)
- [VectorCombine] Don't transform single shuffles in shuffleToIdentity
- update_test_checks: match IR basic block labels (#88979)
- [ThinLTO]Sort imported GUIDs before cache key update (#92622)
- [nfc][InstrFDO]Encapsulate header writes in a class member function (#90142)
- Reformat
- Quick fix for a waning in clang_rt.ctx_profile [-Wgnu-anonymous-struct]
- [NewPM][AMDGPU] Add CodeGenPassBuilder (#91040)
- [gn build] Port b4ba3fe0068b
- [GISel][RISCV] Legalize G_CONSTANT_FOLD_BARRIER (#89960)
- [VectorCombine] Additional extend tests for shuffleToIdentity. NFC
- [DAG] canCreateUndefOrPoison - merge INSERT_VECTOR_ELT/EXTRACT_VECTOR_ELT cases. NFC.
- [ctx_profile] Pass lib path into test
- [DAG] canCreateUndefOrPoison - only compute extract/index vector elt index knownbits when not poison
- [DAG] visitAVG - rewrite "fold (avgfloor x, 0) -> x >> 1" to use SDPatternMatch
- [DAG] visitABD - rewrite "(abs x, 0)" folds to use SDPatternMatch
- Revert "[Bounds-Safety] Temporarily relax a `counted_by` attribute restriction on flexible array members"
- Revert "[BoundsSafety] Allow 'counted_by' attribute on pointers in structs in C (#90786)"
- Revert "[Bounds-Safety] Fix `pragma-attribute-supported-attributes-list.test`"
- [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (#88182)
- [CodeGen][SDAG] Skip preferred extend at O0 (#92643)
- [CodeGen][SDAG] Track returntwice in lowering info (#92640)
- [llvm] Add KnownBits implementations for avgFloor and avgCeil (#86445)
- SimplifyLibCalls: Permit pow(2, x) -> ldexp(1, x) fold for vectors (#92532)
- [VPlan] Simplify (X && Y) || (X && !Y) -> X. (#89386)
- HLSL availability diagnostics design doc (#92207)
- [DOCS] ORCv2.rst Typo (#89482)
- [Clang][HLSL] Add environment parameter to availability attribute (#89809)
- ValueTracking: Correct undef handling for constant FP vectors (#92557)
- [BOLT] Fix preserved offset in fixDoubleJumps (#92485)
- [AMDGPU] Use LSH for lowering ctlz_zero_undef.i8/i16 (#88512)
- [TableGen] Avoid std::string copy. NFC
- Update llvm-bugs.yml (#77243)
- [llvm] Use operator==(StringRef, StringRef) (NFC) (#92705)
- [clang-format][NFC] Clean up SortIncludesTest.cpp
- [mlir] Use operator==(StringRef, StringRef) (NFC) (#92706)
- [CallPromotionUtils]Implement conditional indirect call promotion with vtable-based comparison (#81378)
- [clang] Use operator==(StringRef, StringRef) (NFC) (#92708)
- [SDAG][X86] Extend SplitVecOp_VSETCC for STRICT_FSETCC. (#92509)
- [llvm] Use StringRef::contains (NFC) (#92710)
- [Serialization] Read the initializer for interesting static variables before consuming it (#92353)
- [BOLT][NFC] Don't assign YAML profile to functions with no CFG (#92487)
- [InstCombine] Fold pointer adding in integer to arithmetic add (#91596)
- [AMDGPU] Use removeFnAttrFromReachable in lower-module-lds pass. (#92686)
- [AMDGPU] Fix kernarg preloading crash with some types and alignments (#91625)
- [ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option (#88024)
- [NFC] Remove unused ASTWriter::getTypeID
- [SCEV] Don't use non-deterministic constant folding for trip counts (#90942)
- Revert "[ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option" (#92715)
- [llvm] Use SmallString::str (NFC) (#92712)
- [AMDGPU] Only set Info.memVT when not later overridden (#92670)
- [MC] Make UseAssemblerInfoForParsing mostly true
- MIPS: Support '%w' token in inline asm template for MSA (#91920)
- Clang/MIPS: Add +fp64 if MSA and no explicit -mfp option (#91949)
- MIPS/Clang: Use FP32 by default if CPU is mips1 (#92122)
- [ELF] Support high address DW_EH_sdata4 for ELFCLASS32
- [PowerPC]perform bitcast lowering only at 64 bit
- [LoongArch] Select {DIV,MOD}.{W,WU} instruction to eliminate explicit sign extension (#92205)
- [Clang] Fix __is_array returning true for zero-sized arrays (#86652)
- [OpenCL] Add cl_khr_kernel_clock builtins (#91950)
- [clang][ExtractAPI] Remove symbols defined in categories to external types unless requested (#92522)
- [RISCV][CostModel] Remove cost of icmp inst in icmp+select with SFB. (#91158)
- [DebugInfo][GVNSink] Fix #77415: GVNSink fails to optimize LLVM IR with debug info (#77602)
- [AArch64] Add PreTest for optimizing `MOV` to `ORR`
- [Driver][PS5] Set visibility option defaults (#92091)
- [AArch64] Optimize `MOV` to `ORR` when load symmetric constants (#86249)
- [Coverage] Rework !SystemHeadersCoverage (#91446)
- [lldb][Windows] Fixed LibcxxChronoTimePointSecondsSummaryProvider() (#92701)
- [ConstantFolding] Canonicalize constexpr GEPs to i8 (#89872)
- InstSimplify: increase shufflevector test coverage (#92407)
- [flang][HLFIR] Adapt SimplifyHLFIRIntrinsics to run on all top level ops (#92573)
- movimm-expand-ldst.mir (d3d6565c2453) requires asserts
- [SLP] NFC. Use TreeEntry::getOperand if setOperandsInOrder is called (#92727)
- [MLIR][OpenMP] NFC: Split OpenMP dialect definitions (#91741)
- [mlir][irdl] Fix missing verifier in irdl.parametric (#92700)
- [VPlan] Add commutative binary OR matcher, use in transform. (#92539)
- [CloneFunction] Remove check that is no longer necessary (#92577)
- [ValueTracking] Fix incorrect inferrence about the signbit of sqrt (#92510)
- [LAA] Add tests with invariant accesses using vector types.
- [clang] CTAD alias: Fix missing template arg packs during the transformation (#92535)
- [TableGen] HasOneUse builtin predicate on PatFrags (#91578)
- [clang] Make PS template DLL attribute propagation the same as MSVC (#92549)
- [DebugInfo][NaryReassociate] Fix missing debug location updates (#92545)
- [clang] Use SmallString::str (NFC) (#92717)
- [libcxx] locale.cpp: Move build_name helper into unnamed namespace (#92461)
- [Offload] Remove unused version script for plugins
- [AMDGPU] Fix error in #88512.


From 6daf86e03cbf5d65971d9be31ca5a187c550b219 Mon Sep 17 00:00:00 2001
From: Leon Clark <leoclark at amd.com>
Date: Mon, 20 May 2024 16:19:27 +0100
Subject: [PATCH] [AMDGPU] Fix error in #88512.

---
 llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 15a4b6796880f..3523fcc7dbd50 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -4168,7 +4168,7 @@ bool AMDGPULegalizerInfo::legalizeCTLZ_ZERO_UNDEF(MachineInstr &MI,
 
   auto ShiftAmt = B.buildConstant(S32, 32u - NumBits);
   auto Extend = B.buildAnyExt(S32, {Src}).getReg(0u);
-  auto Shift = B.buildLShr(S32, {Extend}, ShiftAmt);
+  auto Shift = B.buildShl(S32, {Extend}, ShiftAmt);
   auto Ctlz = B.buildInstr(AMDGPU::G_AMDGPU_FFBH_U32, {S32}, {Shift});
   B.buildTrunc(Dst, Ctlz);
   MI.eraseFromParent();
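
The one-line change above can be illustrated outside the compiler. The following is a minimal sketch on plain integers, not the legalizer code itself: `clz32`, `ctlzNarrowShl`, and `ctlzNarrowLShr` are hypothetical helper names, and `clz32` returning 32 for a zero input is a choice made for this model only, not a statement about `G_AMDGPU_FFBH_U32`. The point it shows is that an any-extended i8/i16 value carries unspecified upper bits, so the source bits must be moved to the top of the 32-bit register with a left shift before the leading zeros are counted; a logical right shift by the same amount discards the source bits instead.

```cpp
#include <cstdint>

// Reference count-leading-zeros on 32 bits. Returns 32 for zero so the
// model is total; the real operation treats a zero source as undefined.
inline unsigned clz32(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Mask = 0x80000000u; Mask != 0 && (V & Mask) == 0; Mask >>= 1)
    ++N;
  return N;
}

// Model of the fixed sequence: any-extend, shift LEFT by (32 - NumBits),
// then count leading zeros. The left shift pushes the unspecified upper
// bits of the any-extended value out of the register.
inline unsigned ctlzNarrowShl(uint32_t AnyExt, unsigned NumBits) {
  return clz32(AnyExt << (32u - NumBits));
}

// Model of the pre-fix sequence: a logical shift RIGHT by the same amount.
// This keeps the unspecified upper bits and drops the source bits.
inline unsigned ctlzNarrowLShr(uint32_t AnyExt, unsigned NumBits) {
  return clz32(AnyExt >> (32u - NumBits));
}
```

In this model, an i8 source of `0x01` any-extended to `0xABCDEF01` (upper bits arbitrary) gives 7 through `ctlzNarrowShl`, matching the 7 leading zeros of the 8-bit value, while `ctlzNarrowLShr` gives 24 because it counts leading zeros of the leftover upper bits `0xAB`.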