[all-commits] [llvm/llvm-project] fab2bb: Add llvm::min/max_element and use it in llvm/ and ...

Mon Mar 11 10:40:35 PDT 2024

  Branch: refs/heads/users/paschalis-mpeis/frem-slp-vectorization
  Home:   https://github.com/llvm/llvm-project
  Commit: fab2bb8bfda865bd438dee981d7be7df8017b76d
      https://github.com/llvm/llvm-project/commit/fab2bb8bfda865bd438dee981d7be7df8017b76d
  Author: Justin Lebar <justin.lebar at gmail.com>
  Date:   2024-03-10 (Sun, 10 Mar 2024)

  Changed paths:
    M llvm/include/llvm/ADT/STLExtras.h
    M llvm/include/llvm/CodeGen/RegAllocPBQP.h
    M llvm/lib/Analysis/ScalarEvolution.cpp
    M llvm/lib/DebugInfo/PDB/Native/PDBFile.cpp
    M llvm/lib/IR/DataLayout.cpp
    M llvm/lib/ObjCopy/MachO/MachOWriter.cpp
    M llvm/lib/ProfileData/GCOV.cpp
    M llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp
    M llvm/lib/Target/AMDGPU/GCNILPSched.cpp
    M llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
    M llvm/lib/Target/AMDGPU/SIMachineScheduler.cpp
    M llvm/lib/Target/Hexagon/HexagonCommonGEP.cpp
    M llvm/lib/Target/Hexagon/HexagonConstExtenders.cpp
    M llvm/lib/Target/Hexagon/HexagonGenInsert.cpp
    M llvm/lib/Target/Hexagon/HexagonVectorCombine.cpp
    M llvm/lib/Transforms/Scalar/GVNSink.cpp
    M llvm/lib/Transforms/Scalar/JumpThreading.cpp
    M llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
    M llvm/lib/Transforms/Utils/SimplifyCFG.cpp
    M llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
    M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
    M llvm/tools/llvm-exegesis/lib/LatencyBenchmarkRunner.cpp
    M llvm/tools/llvm-mca/Views/BottleneckAnalysis.cpp
    M llvm/tools/llvm-mca/Views/SchedulerStatistics.cpp
    M llvm/tools/llvm-pdbutil/DumpOutputStyle.cpp
    M llvm/tools/llvm-pdbutil/MinimalTypeDumper.cpp
    M llvm/tools/llvm-rc/ResourceFileWriter.cpp
    M llvm/utils/FileCheck/FileCheck.cpp
    M llvm/utils/TableGen/CodeGenSchedule.cpp
    M llvm/utils/TableGen/RegisterInfoEmitter.cpp
    M mlir/examples/toy/Ch5/mlir/LowerToAffineLoops.cpp
    M mlir/examples/toy/Ch6/mlir/LowerToAffineLoops.cpp
    M mlir/examples/toy/Ch7/mlir/LowerToAffineLoops.cpp
    M mlir/lib/Conversion/PDLToPDLInterp/PredicateTree.cpp
    M mlir/lib/Dialect/Affine/IR/AffineOps.cpp
    M mlir/lib/Dialect/Affine/Utils/LoopUtils.cpp
    M mlir/lib/Dialect/GPU/TransformOps/GPUTransformOps.cpp
    M mlir/lib/Dialect/Linalg/Transforms/ConvertToDestinationStyle.cpp
    M mlir/lib/Dialect/SPIRV/Transforms/UnifyAliasedResourcePass.cpp
    M mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp
    M mlir/lib/IR/AffineMap.cpp
    M mlir/lib/Reducer/ReductionNode.cpp

  Log Message:
  -----------
  Add llvm::min/max_element and use it in llvm/ and mlir/ directories. (#84678)

For some reason this was missing from STLExtras.
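
As a rough illustration of what such range-based helpers look like, here
is a minimal sketch. It is not the code added to STLExtras.h: the
`sketch` namespace and the `largest` helper are assumptions made for this
example, and the wrappers simply forward to the iterator-pair standard
algorithms.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

namespace sketch {
// Hypothetical range-based wrappers: forward the range's begin/end to the
// iterator-pair standard algorithms and return the resulting iterator.
template <typename R> auto min_element(R &&Range) {
  return std::min_element(std::begin(Range), std::end(Range));
}
template <typename R> auto max_element(R &&Range) {
  return std::max_element(std::begin(Range), std::end(Range));
}
} // namespace sketch

// Illustrative caller: one argument instead of a begin/end pair.
inline int largest(const std::vector<int> &V) {
  assert(!V.empty() && "max of an empty range is not dereferenceable");
  return *sketch::max_element(V);
}
```

The call sites touched by a change like this would typically shrink from
`std::max_element(V.begin(), V.end())` to a single-argument call.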

  Commit: 7dfa8398354e435cdee5a8ea6d6b17d1e4557733
      https://github.com/llvm/llvm-project/commit/7dfa8398354e435cdee5a8ea6d6b17d1e4557733
  Author: Amirreza Ashouri <ar.ashouri999 at gmail.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M clang/docs/ReleaseNotes.rst
    M clang/lib/AST/Type.cpp
    M clang/test/SemaCXX/type-traits.cpp

  Log Message:
  -----------
  [clang] Fix behavior of `__is_trivially_relocatable(volatile int)` (#77092)

Consistent with `__is_trivially_copyable(volatile int) == true` and
`__is_trivially_relocatable(volatile Trivial) == true`,
`__is_trivially_relocatable(volatile int)` should also be `true`.

Fixes https://github.com/llvm/llvm-project/issues/77091

[clang] [test] New tests for __is_trivially_relocatable(cv-qualified
type)

  Commit: 099be86433a69f264aeb70e512ba1bbd0c7aefd7
      https://github.com/llvm/llvm-project/commit/099be86433a69f264aeb70e512ba1bbd0c7aefd7
  Author: Justin Lebar <justin.lebar at gmail.com>
  Date:   2024-03-10 (Sun, 10 Mar 2024)

  Changed paths:
    M llvm/tools/llvm-pdbutil/DumpOutputStyle.cpp

  Log Message:
  -----------
  Fix broken build after https://github.com/llvm/llvm-project/pull/84678 (sorry).

  Commit: 3f6bc1adf805681293c2ef0b93b708ff52244c00
      https://github.com/llvm/llvm-project/commit/3f6bc1adf805681293c2ef0b93b708ff52244c00
  Author: Chuanqi Xu <yedeng.yd at linux.alibaba.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M clang/include/clang/AST/DeclBase.h
    M clang/include/clang/Serialization/ASTReader.h
    M clang/lib/AST/Decl.cpp
    M clang/lib/AST/DeclBase.cpp
    M clang/lib/Serialization/ASTReader.cpp
    M clang/lib/Serialization/ASTReaderDecl.cpp
    M clang/lib/Serialization/ASTWriter.cpp
    M clang/lib/Serialization/ASTWriterDecl.cpp
    A clang/test/Modules/hashing-decls-in-exprs-from-gmf.cppm

  Log Message:
  -----------
  [C++20] [Modules] Avoid computing odr hash for functions from comparing constraint expression

Previously we disabled computing the ODR hash for declarations from the
global module fragment (GMF). However, we missed the case where a
function appears in a concept's requirements (see the attached test file
for an example), and the resulting mismatch can cause a crash.

After we deserialize a function we mark its body as lazy and only load
the body when needed. However, loading the body is not allowed during
deserialization. So it is problematic to mark the body as lazy first and
then compute the hash value of the function, which requires
deserializing its body. That is where the crash occurs.

This patch tries to solve the issue by not taking the body of a function
from the GMF. Note that we can't simply skip comparing constraint
expressions from the GMF, since they are a key part of function
selection; this may be the reason why we can't return 0 directly for
`FunctionDecl::getODRHash()` for functions from the GMF.

  Commit: d8d2dea7fc6f452ac6a24948fe3ff99920f81c99
      https://github.com/llvm/llvm-project/commit/d8d2dea7fc6f452ac6a24948fe3ff99920f81c99
  Author: Craig Topper <craig.topper at sifive.com>
  Date:   2024-03-10 (Sun, 10 Mar 2024)

  Changed paths:
    M llvm/lib/Target/RISCV/RISCVISelLowering.cpp
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-store-asm.ll

  Log Message:
  -----------
  [RISCV] Handle FP riscv_masked_strided_load with 0 stride. (#84576)

Previously, we tried to create an integer extending load. We need a
non-extending FP load instead.

Fixes #84541.

  Commit: d9e6aa70484955c9f581577c3b93efc1d277fa46
      https://github.com/llvm/llvm-project/commit/d9e6aa70484955c9f581577c3b93efc1d277fa46
  Author: Carl Ritson <carl.ritson at amd.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
    M llvm/test/CodeGen/AMDGPU/acc-ldst.ll
    M llvm/test/CodeGen/AMDGPU/mfma-no-register-aliasing.ll

  Log Message:
  -----------
  [AMDGPU] Update LiveInterval def index for early-clobber (#79285)

On converting an instruction to an early-clobber definition in
convertToThreeAddress, we must also update live intervals for the
register to start at the early-clobber index.

  Commit: b7f97d3661814c4ae11b8772f8a27c029d01648b
      https://github.com/llvm/llvm-project/commit/b7f97d3661814c4ae11b8772f8a27c029d01648b
  Author: Kito Cheng <kito.cheng at sifive.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/lib/Target/RISCV/RISCVTargetObjectFile.cpp
    M llvm/lib/Target/RISCV/RISCVTargetObjectFile.h
    A llvm/test/CodeGen/RISCV/srodata.ll

  Log Message:
  -----------
  [RISCV] Place mergeable small read only data into srodata section (#82214)

Small mergeable read-only data was previously placed in the sdata
section, which meant it lost its mergeable property and, with it, some
code size optimization opportunities at link time.

  Commit: f6455606bbbb02bbc155a713ae07eab1c7419041
      https://github.com/llvm/llvm-project/commit/f6455606bbbb02bbc155a713ae07eab1c7419041
  Author: Fangrui Song <i at maskray.me>
  Date:   2024-03-10 (Sun, 10 Mar 2024)

  Changed paths:
    M lld/ELF/Arch/AArch64.cpp
    M lld/ELF/Arch/PPC64.cpp
    M lld/ELF/Driver.cpp
    M lld/ELF/ICF.cpp
    M lld/ELF/InputFiles.h
    M lld/ELF/InputSection.cpp
    M lld/ELF/MarkLive.cpp
    M lld/ELF/SyntheticSections.cpp

  Log Message:
  -----------
  [ELF] Move getSymbol/getRelocTargetSym from ObjFile<ELFT> to InputFile. NFC

This removes lots of unneeded `template getFile<ELFT>()`.

  Commit: 4a21e3afa29521192ce686605eb945495455ca5e
      https://github.com/llvm/llvm-project/commit/4a21e3afa29521192ce686605eb945495455ca5e
  Author: Carl Ritson <carl.ritson at amd.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/lib/CodeGen/LiveIntervals.cpp
    M llvm/test/CodeGen/AMDGPU/lds-misaligned-bug.ll

  Log Message:
  -----------
  [LiveIntervals] repairIntervalsInRange: recompute width changes (#78564)

Extend repairIntervalsInRange to completely recompute the interval for a
register if subregister defs exist without precise subrange matches
(LaneMask exactly matching subregister).
This occurs when register sequences are lowered to copies such that the
sizes of the copies do not match any uses of the subregisters formed
(i.e. during twoaddressinstruction).

The subranges without this change are probably legal, but do not match
those generated by live interval computation. This creates problems with
other code that assumes subranges precisely cover all subregisters
defined, e.g. shrinkToUses().

  Commit: cf1319f9c6561afea381bbfc1a18f5c1fb7b46b0
      https://github.com/llvm/llvm-project/commit/cf1319f9c6561afea381bbfc1a18f5c1fb7b46b0
  Author: Pavel Labath <pavel at labath.sk>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc
    M compiler-rt/test/tsan/pthread_atfork_deadlock3.c
    A compiler-rt/test/tsan/signal_in_read.c

  Log Message:
  -----------
  [compiler-rt] Mark more calls as blocking (#77789)

If we're in a blocking call, we need to run the signal immediately, as
the call may not return for a very long time (if ever). Not running the
handler can cause deadlocks if the rest of the program waits (in one way
or another) for the signal handler to execute.

I've gone through the list of functions in
sanitizer_common_interceptors and marked as blocking those that I know
can block, but I don't claim the list to be exhaustive. In particular, I
did not mark libc FILE* functions as blocking, because these can end up
calling user functions. To do that correctly, /I think/ it would be
necessary to clear the "is in blocking call" flag inside the fopencookie
wrappers.

The test for the bug (deadlock) uses the read call (which is the one
that I ran into originally), but the same kind of test could be written
for any other blocking syscall.

  Commit: 4e0e9b17c6cacdc3b1ea3a43f85ae443cb146af8
      https://github.com/llvm/llvm-project/commit/4e0e9b17c6cacdc3b1ea3a43f85ae443cb146af8
  Author: AtariDreams <83477269+AtariDreams at users.noreply.github.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h
    M llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
    M llvm/test/CodeGen/AMDGPU/add.ll
    M llvm/test/CodeGen/AMDGPU/ctpop16.ll
    M llvm/test/CodeGen/AMDGPU/ctpop64.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.set.inactive.ll
    M llvm/test/CodeGen/AMDGPU/mul.ll

  Log Message:
  -----------
  [SelectionDAG] Switch to LiveRegUnits (#84197)

  Commit: 561ddb1687c21b82feb92890762a85c2ae1f6e0c
      https://github.com/llvm/llvm-project/commit/561ddb1687c21b82feb92890762a85c2ae1f6e0c
  Author: Craig Topper <craig.topper at sifive.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/lib/CodeGen/TypePromotion.cpp
    M llvm/test/CodeGen/AArch64/and-mask-removal.ll
    M llvm/test/CodeGen/AArch64/signed-truncation-check.ll
    M llvm/test/CodeGen/AArch64/typepromotion-overflow.ll
    M llvm/test/CodeGen/RISCV/typepromotion-overflow.ll
    M llvm/test/Transforms/TypePromotion/ARM/icmps.ll
    M llvm/test/Transforms/TypePromotion/ARM/wrapping.ll

  Log Message:
  -----------
  Revert "[TypePromotion] Support positive addition amounts in isSafeWrap. (#81690)"

This reverts commit 0813b90ff5d195d8a40c280f6b745f1cc43e087a.

Fixes miscompile reported in #84718.

  Commit: 3093d731dff93df02899dcc62f5e7ba02461ff2a
      https://github.com/llvm/llvm-project/commit/3093d731dff93df02899dcc62f5e7ba02461ff2a
  Author: Nathan Ridge <zeratul976 at hotmail.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M clang-tools-extra/clangd/ClangdServer.cpp
    M clang-tools-extra/clangd/CodeComplete.cpp
    M clang-tools-extra/clangd/IncludeCleaner.cpp
    M clang-tools-extra/clangd/ParsedAST.cpp
    M clang-tools-extra/clangd/SourceCode.cpp
    M clang-tools-extra/clangd/SourceCode.h
    M clang-tools-extra/clangd/tool/Check.cpp
    M clang-tools-extra/clangd/unittests/SourceCodeTests.cpp

  Log Message:
  -----------
  [clangd] Avoid libFormat's objective-c guessing heuristic where possible (#84133)

This avoids a known libFormat bug where the heuristic can OOM on certain
large files (particularly single-header libraries such as miniaudio.h).

The OOM will still happen on affected files if you actually try to
format them (this is harder to avoid since the underlying issue affects
the actual formatting logic, not just the language-guessing heuristic),
but at least it's avoided during non-modifying operations like hover,
and modifying operations that do local formatting like code completion.

Fixes https://github.com/clangd/clangd/issues/719
Fixes https://github.com/clangd/clangd/issues/1384
Fixes https://github.com/llvm/llvm-project/issues/70945

  Commit: d4569d42b5cb8ba076f0115d3d21d89f68e6ce9d
      https://github.com/llvm/llvm-project/commit/d4569d42b5cb8ba076f0115d3d21d89f68e6ce9d
  Author: Pierre van Houtryve <pierre.vanhoutryve at amd.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
    R llvm/test/CodeGen/AMDGPU/lds-reject-absolute-addresses.ll
    A llvm/test/CodeGen/AMDGPU/lds-reject-mixed-absolute-addresses.ll
    A llvm/test/CodeGen/AMDGPU/lds-run-twice-absolute-md.ll
    A llvm/test/CodeGen/AMDGPU/lds-run-twice.ll

  Log Message:
  -----------
  [AMDGPU] Let LowerModuleLDS run twice on the same module (#81729)

If all variables in the module are absolute, this means we're running
the pass again on an already lowered module, and that works.
If none of them are absolute, lowering can proceed as usual.
Only diagnose cases where we have a mix of absolute/non-absolute GVs,
which means we added LDS GVs after lowering, which is broken.

See #81491
Split from #75333
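
The three cases described above can be sketched as a small classifier.
This is only an illustration of the decision logic, not the pass itself:
`LDSGlobal`, `LDSState`, `classify` and `needsLowering` are hypothetical
names, and treating a module with no LDS variables as "lower as usual" is
an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for an LDS global variable; only the property the
// commit message talks about is modeled.
struct LDSGlobal {
  bool HasAbsoluteAddress;
};

enum class LDSState { AlreadyLowered, NeedsLowering, MixedError };

// Classify a module by whether its LDS variables are all absolute, none
// absolute, or a mix of both.
inline LDSState classify(const std::vector<LDSGlobal> &GVs) {
  auto IsAbs = [](const LDSGlobal &G) { return G.HasAbsoluteAddress; };
  if (std::none_of(GVs.begin(), GVs.end(), IsAbs))
    return LDSState::NeedsLowering; // also covers an empty list (assumption)
  if (std::all_of(GVs.begin(), GVs.end(), IsAbs))
    return LDSState::AlreadyLowered; // second run on a lowered module
  return LDSState::MixedError; // LDS GVs were added after lowering
}

// Returns true when lowering work remains. Callers are expected to have
// reported MixedError as a diagnostic before getting here.
inline bool needsLowering(LDSState S) {
  assert(S != LDSState::MixedError && "mixed modules must be diagnosed first");
  return S == LDSState::NeedsLowering;
}
```

Only the mixed case is an error; the other two let the pass be scheduled
more than once without special handling by the caller.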

  Commit: f1aa7837884c745ede497e365cc75d5581ecc714
      https://github.com/llvm/llvm-project/commit/f1aa7837884c745ede497e365cc75d5581ecc714
  Author: Matthias Springer <me at m-sp.org>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M mlir/include/mlir/IR/PatternMatch.h
    M mlir/lib/Dialect/Linalg/Transforms/DecomposeLinalgOps.cpp
    M mlir/lib/IR/PatternMatch.cpp
    M mlir/lib/Transforms/Utils/RegionUtils.cpp

  Log Message:
  -----------
  [mlir][IR] Fix overload resolution on MSVC build (#84589)

#82629 added additional overloads to `replaceAllUsesWith` and
`replaceUsesWithIf`. This caused a build breakage with MSVC when called
with ops that can implicitly convert to `Value`.

```
external/llvm-project/mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp(881): error C2666: 'mlir::RewriterBase::replaceAllUsesWith': 2 overloads have similar conversions
external/llvm-project/mlir/include\mlir/IR/PatternMatch.h(631): note: could be 'void mlir::RewriterBase::replaceAllUsesWith(mlir::Operation *,mlir::ValueRange)'
external/llvm-project/mlir/include\mlir/IR/PatternMatch.h(626): note: or       'void mlir::RewriterBase::replaceAllUsesWith(mlir::ValueRange,mlir::ValueRange)'
external/llvm-project/mlir/include\mlir/IR/PatternMatch.h(616): note: or       'void mlir::RewriterBase::replaceAllUsesWith(mlir::Value,mlir::Value)'
external/llvm-project/mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp(882): note: while trying to match the argument list '(mlir::tensor::ExtractSliceOp, T)'
        with
        [
            T=mlir::Value
        ]
```

Note: The LLVM build bots (Linux and Windows) did not break; this seems
to be an issue with `Tools\MSVC\14.29.30133\bin\HostX64\x64\cl.exe`.

This change renames the newly added overloads to `replaceAllOpUsesWith`
and `replaceOpUsesWithIf`.
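
A minimal sketch of why the rename sidesteps the problem follows. All
types here are hypothetical stand-ins, not the MLIR classes: `SliceOp`
models an op that converts implicitly to both `Value` and `Operation *`,
and `Rewriter` models only the overload shapes named in the error output.
With the op-based entry point under its own name, the call no longer
depends on how a given compiler ranks the competing user-defined
conversions.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the types mentioned in the error message.
struct Value { int Id; };
struct ValueRange {
  std::vector<Value> Vals;
  ValueRange(Value V) : Vals{V} {}
  ValueRange(std::vector<Value> Vs) : Vals(std::move(Vs)) {}
};
struct Operation { std::vector<Value> Results; };

// An op handle that converts implicitly to its single result and to the
// underlying operation pointer.
struct SliceOp {
  Operation *Op;
  operator Value() const {
    assert(!Op->Results.empty() && "op must have a result");
    return Op->Results.front();
  }
  operator Operation *() const { return Op; }
};

struct Rewriter {
  std::string Last; // records which entry point was chosen
  void replaceAllUsesWith(Value, Value) { Last = "value"; }
  void replaceAllUsesWith(ValueRange, ValueRange) { Last = "range"; }
  // Distinct name: never competes with the overloads above.
  void replaceAllOpUsesWith(Operation *, ValueRange) { Last = "op"; }
};
```

A caller holding a `SliceOp` then states its intent through the name it
picks: `replaceAllUsesWith(op, v)` for the result value, or
`replaceAllOpUsesWith(op, v)` for the operation as a whole.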

  Commit: c9465e4771c93adfbc99ffca5963a48a5334d98d
      https://github.com/llvm/llvm-project/commit/c9465e4771c93adfbc99ffca5963a48a5334d98d
  Author: Jeremy Morse <jeremy.morse at sony.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/lib/IR/Instruction.cpp

  Log Message:
  -----------
  [DebugInfo][RemoveDIs] Assert if we mix PHIs and debug-info (#84054)

A potentially erroneous code construction with the work we've done to
remove debug intrinsics, is inserting PHIs into blocks when the position
hasn't been "sourced correctly". Specifically, if you have:

    %foo = PHI
    #dbg_value
    %bar = add i32...

And plan on inserting a new PHI, you have to use the iterator form of
`getFirstNonPHI` or getFirstInsertionPt (or begin()) to acquire an
iterator that tells the debug-info maintenance code "this is supposed to
be at the start of the block, put it in front of #dbg_value". We can
detect call-sites that aren't doing this at runtime, and should do so with
this assertion. It might invalidate code that's doing something very
unexpected, like walking backwards to find a PHI, then going forwards,
then inserting: however that's just an inefficient way of calling
`getFirstNonPHI`.

  Commit: 0f501c30b9601627c236f9abca8a3befba5dc161
      https://github.com/llvm/llvm-project/commit/0f501c30b9601627c236f9abca8a3befba5dc161
  Author: Chuanqi Xu <yedeng.yd at linux.alibaba.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M clang/include/clang/Sema/Sema.h
    M clang/lib/Sema/SemaCoroutine.cpp
    M clang/lib/Sema/SemaExprCXX.cpp
    R clang/test/SemaCXX/gh84064-1.cpp
    R clang/test/SemaCXX/gh84064-2.cpp

  Log Message:
  -----------
  Revert "[C++20][Coroutines] Lambda-coroutine with operator new in promise_type (#84193)"

This reverts commit 35d3b33ba5c9b90443ac985f2521b78f84b611fe.

See the comments in https://github.com/llvm/llvm-project/pull/84193 for
details

  Commit: 3b30559c088d679ca8fe491158e6c32db630f223
      https://github.com/llvm/llvm-project/commit/3b30559c088d679ca8fe491158e6c32db630f223
  Author: Kareem Ergawy <kareem.ergawy at amd.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M flang/include/flang/Optimizer/Builder/HLFIRTools.h
    M flang/lib/Lower/Bridge.cpp
    M flang/lib/Optimizer/Builder/HLFIRTools.cpp
    M flang/test/Lower/OpenMP/parallel-private-clause-str.f90
    M flang/test/Lower/OpenMP/parallel-private-clause.f90

  Log Message:
  -----------
  [flang][OpenMP] Only use HLFIR base in privatization logic (#84123)

Modifies the privatization logic so that the emitted code only uses the
HLFIR base (i.e. SSA value `#0` returned from `hlfir.declare`). Before
that, the emitted privatization logic used a mix of `#0` and `#1`,
which led to some difficulties in moving to delayed privatization
(see the discussion on #84033).

  Commit: 718962f53bfc610f670f1674457a426e01117097
      https://github.com/llvm/llvm-project/commit/718962f53bfc610f670f1674457a426e01117097
  Author: Dominik Steenken <dost at de.ibm.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp
    A llvm/test/Analysis/CostModel/SystemZ/reduce-add.ll

  Log Message:
  -----------
  [SystemZ] Provide improved cost estimates (#83873)

This commit provides better cost estimates for
the llvm.vector.reduce.add intrinsic on SystemZ. These apply to all
vector lengths and integer types up to i128. For integer types larger
than i128, we fall back to the default cost estimate.

This has the effect of lowering the estimated costs of most common
instances of the intrinsic. The expected performance impact of this is
minimal with a tendency to slightly improve performance of some
benchmarks.

This commit also provides a test to check the proper computation of the
new estimates, as well as the fallback for types larger than i128.

  Commit: 58dd59a28293432171c0439eb1ae082f6ea9962f
      https://github.com/llvm/llvm-project/commit/58dd59a28293432171c0439eb1ae082f6ea9962f
  Author: Luke Lau <luke at igalia.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/lib/Target/RISCV/RISCVISelLowering.cpp

  Log Message:
  -----------
  [RISCV] Don't run combineBinOp_VLToVWBinOp_VL until after legalize types. NFCI (#84125)

I noticed this from a discrepancy in fillUpExtensionSupport between how
we apparently need to check for legal types for ISD::{ZERO,SIGN}_EXTEND,
but we don't need to for RISCVISD::V{Z,S}EXT_VL.

Prior to #72340, combineBinOp_VLToVWBinOp_VL only ran after type
legalization because it only operated on _VL nodes. _VL nodes are only
emitted during op legalization, which takes place **after** type
legalization, which is presumably why the existing code didn't need to
check for legal types.

After #72340 we now handle generic ops like ISD::ADD that exist before
op legalization and thus **before** type legalization. This meant that
we needed to add extra checks that the narrow type was legal in #76785.

I think the easiest thing to do here is to just maintain the invariant
that the types are legal and only run the combine after type
legalization.

  Commit: d3ec8c2a25f43225efe997569925aa57324db0dd
      https://github.com/llvm/llvm-project/commit/d3ec8c2a25f43225efe997569925aa57324db0dd
  Author: Hans Wennborg <hans at chromium.org>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M compiler-rt/lib/profile/InstrProfilingPlatformWindows.c
    M llvm/unittests/Analysis/LazyCallGraphTest.cpp

  Log Message:
  -----------
  Typo: ponit

  Commit: 0ef61ed54dca2e974928c55b2144b57d4c4ff621
      https://github.com/llvm/llvm-project/commit/0ef61ed54dca2e974928c55b2144b57d4c4ff621
  Author: Luke Lau <luke at igalia.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/lib/Target/RISCV/RISCVISelLowering.cpp

  Log Message:
  -----------
  [RISCV] Move NodeExtensionHelper assert to getOrCreateExtendedOp. NFC

Move the narrow types assert from the ZERO_EXTEND/SIGN_EXTEND case in
fillUpExtensionSupport to getOrCreateExtendedOp so we check the other nodes
too.

  Commit: 9277a32305c1083653ffaa7955cd26deffc10988
      https://github.com/llvm/llvm-project/commit/9277a32305c1083653ffaa7955cd26deffc10988
  Author: Florian Hahn <flo at fhahn.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

  Log Message:
  -----------
  [VPlan] Funnel recipe insert* through VPBasicBlock::insert (NFCI).

This allows relying on VPBasicBlock::insert to make sure insertion is
well formed, i.e. by updating the recipe's parent as well as other
potential invariants in the future.

  Commit: abe1b4e71e9fe57be4a3962e81c58ce22e313024
      https://github.com/llvm/llvm-project/commit/abe1b4e71e9fe57be4a3962e81c58ce22e313024
  Author: Paschalis Mpeis <Paschalis.Mpeis at arm.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    A llvm/test/Transforms/SLPVectorizer/AArch64/slp-frem.ll

  Log Message:
  -----------
  SLP cannot vectorize frem calls in AArch64.

It needs updated costs when there are available vector library functions
given the VF and type.

  Commit: 36ce5eb8f1d26f984e46c9da930a1c15085e1dd9
      https://github.com/llvm/llvm-project/commit/36ce5eb8f1d26f984e46c9da930a1c15085e1dd9
  Author: Paschalis Mpeis <Paschalis.Mpeis at arm.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
    M llvm/test/Transforms/SLPVectorizer/AArch64/slp-frem.ll

  Log Message:
  -----------
  [AArch64] SLP can vectorize frem

When vector library calls are available for frem, given its type and
vector length, the SLP vectorizer uses updated costs that amount to a
call, matching LoopVectorizer's functionality.

This allows 'superword-level' vectorization, which can be converted to
a vector lib call by later passes.

Add tests that vectorize code that contains 2x double and 4x float frem
instructions.

  Commit: 3ed8acc7591ee4a52fa39e54c6013ae6da12e807
      https://github.com/llvm/llvm-project/commit/3ed8acc7591ee4a52fa39e54c6013ae6da12e807
  Author: Paschalis Mpeis <Paschalis.Mpeis at arm.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/include/llvm/Analysis/VectorUtils.h
    M llvm/lib/Analysis/VectorUtils.cpp
    M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

  Log Message:
  -----------
  Addressing reviewers

  Commit: 34bbbf876b23bd55212c895b13c458b969fb2170
      https://github.com/llvm/llvm-project/commit/34bbbf876b23bd55212c895b13c458b969fb2170
  Author: Paschalis Mpeis <Paschalis.Mpeis at arm.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/include/llvm/Analysis/TargetTransformInfo.h
    M llvm/include/llvm/Analysis/VectorUtils.h
    M llvm/lib/Analysis/TargetTransformInfo.cpp
    M llvm/lib/Analysis/VectorUtils.cpp
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

  Log Message:
  -----------
  [LV][SLP] Vectorizers now use getFRemInstrCost for frem costs

SLP vectorization for frem now happens when vector library calls are
available, given its type and vector length. This is due to using the
updated cost that amounts to a call.

Add tests that do SLP vectorization for code that contains 2x double and
4x float frem instructions.

LoopVectorizer now also uses getFRemInstrCost.

  Commit: ecd7da705ab614914b0a5f1afd092f0530369617
      https://github.com/llvm/llvm-project/commit/ecd7da705ab614914b0a5f1afd092f0530369617
  Author: Paschalis Mpeis <Paschalis.Mpeis at arm.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/include/llvm/Analysis/TargetTransformInfo.h
    M llvm/lib/Analysis/TargetTransformInfo.cpp
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

  Log Message:
  -----------
  Addressing reviewers (2)

  Commit: 6d508fb09fb3a90aa323772e919713b37223ec08
      https://github.com/llvm/llvm-project/commit/6d508fb09fb3a90aa323772e919713b37223ec08
  Author: Paschalis Mpeis <Paschalis.Mpeis at arm.com>
  Date:   2024-03-11 (Mon, 11 Mar 2024)

  Changed paths:
    M llvm/include/llvm/Analysis/TargetTransformInfo.h
    M llvm/lib/Analysis/TargetTransformInfo.cpp
    M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
    M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

  Log Message:
  -----------
  [AArch64][LV][SLP] Vectorizers use call cost for vectorized frem

getArithmeticInstrCost is used by both LoopVectorizer and SLPVectorizer
to compute the cost of frem, which becomes a call cost on AArch64 when
TLI has a vector library function.

Add tests that do SLP vectorization for code that contains 2x double and
4x float frem instructions.

Compare: https://github.com/llvm/llvm-project/compare/5f1335a2a629...6d508fb09fb3
