[all-commits] [llvm/llvm-project] bdacd5: [flang][CodeGen] add nsw to address calculations (...

Fri Dec 8 04:22:32 PST 2023

  Branch: refs/heads/users/fhahn/main.vplan-initial-modeling-of-runtime-vf-uf-as-vpvalue-1
  Home:   https://github.com/llvm/llvm-project
  Commit: bdacd56fd1f4825cfe19cf8de0cf24a3d1ff18fa
      https://github.com/llvm/llvm-project/commit/bdacd56fd1f4825cfe19cf8de0cf24a3d1ff18fa
  Author: Tom Eccles <tom.eccles at arm.com>
  Date:   2023-12-08 (Fri, 08 Dec 2023)

  Changed paths:
    M flang/lib/Optimizer/CodeGen/CodeGen.cpp
    M flang/test/Fir/array-coor.fir
    M flang/test/Fir/arrexp.fir
    M flang/test/Fir/convert-to-llvm.fir
    M flang/test/Fir/coordinateof.fir
    M flang/test/Fir/tbaa.fir

  Log Message:
  -----------
  [flang][CodeGen] add nsw to address calculations (#74709)

`nsw` is a flag for LLVM arithmetic operations meaning "no signed wrap".
If this keyword is present, the result of the operation is a poison
value if overflow occurs. Adding this keyword permits LLVM to re-order
integer arithmetic more aggressively.
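
To illustrate why the flag matters, here is a minimal sketch (not taken from the patch) that models a 32-bit add without `nsw` using wrapping arithmetic. Without a no-overflow guarantee, widening an index after the add and widening it before the add can disagree, so the compiler may not swap one form for the other. With `nsw`, the overflowing case yields poison and the rewrite becomes legal. The function names are hypothetical, and the narrowing cast is assumed to wrap (guaranteed since C++20, and the behaviour of mainstream compilers before that).

```cpp
#include <cstdint>

// Models an LLVM `add i32` WITHOUT nsw: the result wraps modulo 2^32.
// Unsigned arithmetic is used so the C++ itself has no undefined behaviour.
inline std::int32_t wrapping_add(std::int32_t a, std::int32_t b) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(a) +
                                   static_cast<std::uint32_t>(b));
}

// sext(i + 1): add in 32 bits, then widen to a 64-bit address offset.
inline std::int64_t widen_after_add(std::int32_t i) {
  return static_cast<std::int64_t>(wrapping_add(i, 1));
}

// sext(i) + 1: widen first, then add in 64 bits.
inline std::int64_t widen_before_add(std::int32_t i) {
  return static_cast<std::int64_t>(i) + 1;
}

// True when the two forms are interchangeable for this index.
inline bool forms_agree(std::int32_t i) {
  return widen_after_add(i) == widen_before_add(i);
}
```

The two forms agree for ordinary indices but differ when `i` is the largest 32-bit value, which is exactly the case `nsw` lets the optimizer rule out.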

In

https://discourse.llvm.org/t/rfc-changes-to-fircg-xarray-coor-codegen-to-allow-better-hoisting/75257/16
@vzakhari observed that adding nsw is useful to enable hoisting of
address calculations after some loops (or is at least a step in that
direction).

Classic flang also adds nsw to address calculations.

  Commit: faecc736e2ac3cd8c77bebf41b1ed2e2d8cb575f
      https://github.com/llvm/llvm-project/commit/faecc736e2ac3cd8c77bebf41b1ed2e2d8cb575f
  Author: Simon Pilgrim <RKSimon at users.noreply.github.com>
  Date:   2023-12-08 (Fri, 08 Dec 2023)

  Changed paths:
    M llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
    M llvm/test/CodeGen/ARM/vector-store.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll
    M llvm/test/CodeGen/X86/fold-pcmpeqd-2.ll
    M llvm/test/CodeGen/X86/var-permute-256.ll
    M llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-4.ll
    M llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-6.ll
    M llvm/test/CodeGen/X86/vector-shift-ashr-128.ll
    M llvm/test/CodeGen/X86/vector-shift-ashr-256.ll
    M llvm/test/CodeGen/X86/vector-shift-lshr-128.ll
    M llvm/test/CodeGen/X86/vector-shift-lshr-256.ll
    M llvm/test/CodeGen/X86/vector-shift-shl-128.ll
    M llvm/test/CodeGen/X86/vector-shift-shl-256.ll

  Log Message:
  -----------
  [DAG] isSplatValue - node is a splat if all demanded elts have the same whole constant value (#74443)

  Commit: c90cb6eee8296953c097fcc9fc6e61f739c0dad3
      https://github.com/llvm/llvm-project/commit/c90cb6eee8296953c097fcc9fc6e61f739c0dad3
  Author: taalhaataahir0102 <77788288+taalhaataahir0102 at users.noreply.github.com>
  Date:   2023-12-08 (Fri, 08 Dec 2023)

  Changed paths:
    M lldb/include/lldb/Core/Address.h
    M lldb/include/lldb/Core/Debugger.h
    M lldb/include/lldb/Symbol/Symbol.h
    M lldb/include/lldb/Symbol/SymbolContext.h
    M lldb/include/lldb/Utility/Stream.h
    M lldb/source/Commands/CommandObjectTarget.cpp
    M lldb/source/Core/Address.cpp
    M lldb/source/Core/CoreProperties.td
    M lldb/source/Core/Debugger.cpp
    M lldb/source/Symbol/Symbol.cpp
    M lldb/source/Symbol/SymbolContext.cpp
    M lldb/source/Utility/Stream.cpp
    A lldb/test/Shell/Commands/command-image-lookup-color.test

  Log Message:
  -----------
  [lldb] colorize symbols in image lookup with a regex pattern (#69422)

Fixes https://github.com/llvm/llvm-project/issues/57372

Some work had already been done on this previously. A patch was posted
but it remained in review:
https://reviews.llvm.org/D136462

In short, the previous approach was as follows:
Change the symbol names (colorizing the searched part) ->
print them -> restore the symbol names to their original form.

The reviewers suggested that instead of changing the symbol table, this
colorization should be done in the dump functions themselves. Our strategy
involves passing the searched regex pattern to the existing dump
functions responsible for printing information about the searched
symbol. This pattern is propagated until it reaches the line in the dump
functions responsible for displaying symbol information on screen.

At this point, we've introduced a new function called
"PutCStringColorHighlighted", which takes the text, the searched pattern,
and a prefix and suffix, and applies colorization to highlight the pattern
in the output. This approach aims to improve the readability of search
results.
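
The highlighting step described above can be sketched roughly as follows. This is an illustration using `std::regex`, not the LLDB implementation. The function name `highlight_matches` and its parameter order are hypothetical, and the prefix and suffix would typically be ANSI escape sequences such as `"\x1b[31m"` and `"\x1b[0m"`.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Wraps every non-empty match of `pattern` in `text` with `prefix` and
// `suffix`, leaving the rest of the text untouched.
std::string highlight_matches(const std::string &text,
                              const std::string &pattern,
                              const std::string &prefix,
                              const std::string &suffix) {
  if (pattern.empty())
    return text; // Nothing to highlight.

  const std::regex re(pattern);
  std::string out;
  std::size_t last = 0; // End of the previously copied region.

  for (auto it = std::sregex_iterator(text.begin(), text.end(), re);
       it != std::sregex_iterator(); ++it) {
    const std::smatch &m = *it;
    if (m.length(0) == 0)
      continue; // Skip empty matches (e.g. from "x*").
    const std::size_t pos = static_cast<std::size_t>(m.position(0));
    out.append(text, last, pos - last); // Unmatched text before the match.
    out += prefix;
    out += m.str(0);
    out += suffix;
    last = pos + static_cast<std::size_t>(m.length(0));
  }
  out.append(text, last, std::string::npos); // Trailing unmatched text.
  return out;
}
```

Doing the work at print time keeps the stored symbol names unchanged, which is the point of the reviewers' suggestion.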

Co-authored-by: José L. Junior <josejunior at 10xengineers.ai>

  Commit: ffd61c1e96e9c8a472f305585930b45be0d639d3
      https://github.com/llvm/llvm-project/commit/ffd61c1e96e9c8a472f305585930b45be0d639d3
  Author: David Spickett <david.spickett at linaro.org>
  Date:   2023-12-08 (Fri, 08 Dec 2023)

  Changed paths:
    M lldb/source/Core/Address.cpp
    M lldb/source/Symbol/Symbol.cpp

  Log Message:
  -----------
  [lldb] Add missing nullptr checks when colouring symbol output

This adds some checks missed by c90cb6eee8296953c097fcc9fc6e61f739c0dad3,
probably because some tests only run on certain platforms.

  Commit: 5f91335a55cd65dda8351f85b93eeaa7493e06c4
      https://github.com/llvm/llvm-project/commit/5f91335a55cd65dda8351f85b93eeaa7493e06c4
  Author: Simon Pilgrim <llvm-dev at redking.me.uk>
  Date:   2023-12-08 (Fri, 08 Dec 2023)

  Changed paths:
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    M llvm/test/CodeGen/X86/avx512fp16-arith.ll
    M llvm/test/CodeGen/X86/gfni-funnel-shifts.ll
    M llvm/test/CodeGen/X86/gfni-rotates.ll
    M llvm/test/CodeGen/X86/min-legal-vector-width.ll
    M llvm/test/CodeGen/X86/vec_fcopysign.ll
    M llvm/test/CodeGen/X86/vector-fshl-128.ll
    M llvm/test/CodeGen/X86/vector-fshl-256.ll
    M llvm/test/CodeGen/X86/vector-fshl-512.ll
    M llvm/test/CodeGen/X86/vector-fshl-rot-128.ll
    M llvm/test/CodeGen/X86/vector-fshl-rot-256.ll
    M llvm/test/CodeGen/X86/vector-fshl-rot-512.ll
    M llvm/test/CodeGen/X86/vector-fshr-128.ll
    M llvm/test/CodeGen/X86/vector-fshr-256.ll
    M llvm/test/CodeGen/X86/vector-fshr-512.ll
    M llvm/test/CodeGen/X86/vector-fshr-rot-128.ll
    M llvm/test/CodeGen/X86/vector-fshr-rot-256.ll
    M llvm/test/CodeGen/X86/vector-fshr-rot-512.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-3.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-5.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-6.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-7.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-5.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-6.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-7.ll
    M llvm/test/CodeGen/X86/vector-rotate-128.ll
    M llvm/test/CodeGen/X86/vector-rotate-256.ll
    M llvm/test/CodeGen/X86/vector-rotate-512.ll
    M llvm/test/CodeGen/X86/vector-shuffle-v192.ll

  Log Message:
  -----------
  [X86] canonicalizeBitSelect - always use VPTERNLOGD for sub-32bit types

We were using VPTERNLOGQ for everything but i32 types, which made broadcasts wider than necessary.

Noticed in #73509
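
As background, a bit select is a purely bitwise operation, so the element width chosen for the instruction does not change the result. It only affects things like the width of a broadcast constant. The sketch below is a scalar model, not code from the patch. `ternlog` emulates a three-input truth-table operation, the immediate `0xCA` encodes "a ? b : c", and the operand order is an assumption made for illustration. All function names are hypothetical.

```cpp
#include <cstdint>

// Scalar model of a three-input bitwise logic op: at each bit position the
// bits of (a, b, c) form a 3-bit index into the 8-bit truth table `imm`.
inline std::uint64_t ternlog(std::uint64_t a, std::uint64_t b,
                             std::uint64_t c, std::uint8_t imm) {
  std::uint64_t result = 0;
  for (int bit = 0; bit < 64; ++bit) {
    const unsigned index = static_cast<unsigned>(
        (((a >> bit) & 1u) << 2) | (((b >> bit) & 1u) << 1) |
        ((c >> bit) & 1u));
    result |= static_cast<std::uint64_t>((imm >> index) & 1u) << bit;
  }
  return result;
}

// Reference bit select: take bits of x where mask is 1, bits of y elsewhere.
inline std::uint64_t bit_select(std::uint64_t mask, std::uint64_t x,
                                std::uint64_t y) {
  return (x & mask) | (y & ~mask);
}

// The same select performed as two independent 32-bit lanes.
inline std::uint64_t bit_select_two_32bit_lanes(std::uint64_t mask,
                                                std::uint64_t x,
                                                std::uint64_t y) {
  const std::uint32_t lo = static_cast<std::uint32_t>(
      bit_select(mask & 0xFFFFFFFFu, x & 0xFFFFFFFFu, y & 0xFFFFFFFFu));
  const std::uint32_t hi = static_cast<std::uint32_t>(
      bit_select(mask >> 32, x >> 32, y >> 32));
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
```

Under these assumptions, `ternlog(mask, x, y, 0xCA)` matches `bit_select(mask, x, y)`, and splitting the value into 32-bit lanes gives the same bits as treating it as one 64-bit lane.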

  Commit: 1d6a678591076f316bfcaa03a55beba20406dc00
      https://github.com/llvm/llvm-project/commit/1d6a678591076f316bfcaa03a55beba20406dc00
  Author: XiangZhang <xiang.zhang at iluvatar.com>
  Date:   2023-12-08 (Fri, 08 Dec 2023)

  Changed paths:
    M llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
    A llvm/test/Transforms/LoopUnroll/loop-branch-folding.ll

  Log Message:
  -----------
  [LoopUnroll] Make use of MaxTripCount for loops with "#pragma unroll" (#74703)

Fix a loop unroll failure caused by branch folding.

For example:
SimplifyCFG folds the loop branches, which then causes unrolling to fail for a "#pragma unroll" loop.
```
#pragma unroll
for (int I = 0; I < ConstNum; ++I) { // folding "I < ConstNum" and "Cond2"
  if (Cond2) {
    break;
  }
  // loop body
}
```

The pragma unroll metadata only takes effect if there is an exact trip
count, but not if there is an upper bound trip count. This patch makes it
work with an upper bound trip count as well in shouldPragmaUnroll().

Loop unrolling is important on devices where stack usage is costly (e.g.
GPUs, which is why a lot of GPU code marks loops with "#pragma unroll").
It usually greatly simplifies the address (offset) calculations in the
unrolled iterations, which then enables many other optimizations, e.g. SROA,
on these simplified addresses (avoiding allocas for whole aggregates).
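
As a hand-written illustration (not compiler output, and not taken from the patch), the sketch below shows what unrolling by an upper bound trip count amounts to. The loop runs at most 4 times but may exit early, so there is no exact trip count. The unrolled form keeps the early exits and uses only constant indices, which is what makes the later address simplification possible. The function names and the bound of 4 are assumptions for the example.

```cpp
// Loop with a known upper bound (4) but an early exit, so the exact trip
// count is unknown at compile time.
inline int sum_until_negative_loop(const int (&v)[4]) {
  int sum = 0;
  for (int i = 0; i < 4; ++i) {
    if (v[i] < 0)
      break;
    sum += v[i];
  }
  return sum;
}

// The same computation fully unrolled by the upper bound: every copy of the
// body keeps its exit check, and all indices are now constants.
inline int sum_until_negative_unrolled(const int (&v)[4]) {
  int sum = 0;
  if (v[0] < 0) return sum;
  sum += v[0];
  if (v[1] < 0) return sum;
  sum += v[1];
  if (v[2] < 0) return sum;
  sum += v[2];
  if (v[3] < 0) return sum;
  sum += v[3];
  return sum;
}
```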

  Commit: 9017229ecda119e7977739dcab125e455289ade6
      https://github.com/llvm/llvm-project/commit/9017229ecda119e7977739dcab125e455289ade6
  Author: Clement Courbet <courbet at google.com>
  Date:   2023-12-08 (Fri, 08 Dec 2023)

  Changed paths:
    M llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
    M llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h
    M llvm/tools/llvm-exegesis/llvm-exegesis.cpp

  Log Message:
  -----------
  [llvm-exegesis]Allow clients to do their own snippet running error ha… (#74711)

…ndling.

Returns an error *and* a benchmark rather than an error *or* a
benchmark. This allows users to have custom error handling while still
being able to inspect the benchmark.
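
The shape of such an API can be sketched as below. This is a simplified stand-in, not the llvm-exegesis interface: the `Benchmark` struct, the `run_snippet` function, and the use of an empty string to mean "no error" are all assumptions made for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified benchmark record.
struct Benchmark {
  std::string key;                  // Identifies the snippet that was run.
  std::vector<double> measurements; // Empty if the run failed.
};

// Returns an error message AND a benchmark. An empty message means success.
// On failure the benchmark is still returned, so the caller can inspect it
// and decide how to handle the error.
inline std::pair<std::string, Benchmark> run_snippet(const std::string &key,
                                                     bool snippet_crashes) {
  Benchmark bench;
  bench.key = key;
  if (snippet_crashes)
    return std::make_pair(std::string("snippet crashed"), bench);
  bench.measurements.push_back(1.0); // Placeholder measurement.
  return std::make_pair(std::string(), bench);
}
```

With an error-or-benchmark return type, the caller of a failed run would only see the error. Here the partially filled record survives alongside it.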

Apart from this small API change, this is an NFC.

This is an alternative to #74211.

  Commit: 69a0a3be0185ce3bc0458b0047795e8ebfe95abd
      https://github.com/llvm/llvm-project/commit/69a0a3be0185ce3bc0458b0047795e8ebfe95abd
  Author: Mehdi Amini <joker.eph at gmail.com>
  Date:   2023-12-08 (Fri, 08 Dec 2023)

  Changed paths:
    M mlir/cmake/modules/MLIRConfig.cmake.in

  Log Message:
  -----------
  [mlir] Add missing MLIR_ENABLE_EXECUTION_ENGINE option to MLIRConfig.cmake.in

This is the kind of option that downstream consumers of preconfigured MLIR
packages can check to see whether the execution engine is available.

  Commit: 5ea6a3fc6d64d593f447e306c3a9d39e9924ea58
      https://github.com/llvm/llvm-project/commit/5ea6a3fc6d64d593f447e306c3a9d39e9924ea58
  Author: Florian Hahn <flo at fhahn.com>
  Date:   2023-12-08 (Fri, 08 Dec 2023)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
    M llvm/test/Transforms/LoopVectorize/AArch64/eliminate-tail-predication.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/gather-do-not-vectorize-addressing.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/masked-call.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/outer_loop_prefer_scalable.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/pr60831-sve-inv-store-crash.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/scalable-avoid-scalarization.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/scalable-reduction-inloop-cond.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-cond-inv-loads.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect-inloop-reductions.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect-reductions.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect-strict-reductions.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-fneg.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-gather-scatter.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-inductions-unusual-types.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-inductions.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-masked-accesses.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-inv-store.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-live-out-pointer-induction.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-low-trip-count.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-multiexit.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-runtime-check-size-based-threshold.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-forced.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-optsize.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-overflow-checks.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-reductions.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding-unroll.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-tail-folding.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-vfabi.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-phi.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/tail-folding-styles.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/type-shrinkage-zext-costs.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/uniform-args-call-variants.ll
    M llvm/test/Transforms/LoopVectorize/AArch64/wider-VF-for-callinst.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/defaults.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/divrem.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/inloop-reduction.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/interleaved-accesses.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/lmul.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/low-trip-count.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/mask-index-type.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/masked_gather_scatter.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/safe-dep-distance.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/scalable-basics.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/scalable-tailfold.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/select-cmp-reduction.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/short-trip-count.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/strided-accesses.ll
    M llvm/test/Transforms/LoopVectorize/RISCV/uniform-load-store.ll
    M llvm/test/Transforms/LoopVectorize/outer_loop_scalable.ll
    M llvm/test/Transforms/LoopVectorize/scalable-inductions.ll
    M llvm/test/Transforms/LoopVectorize/scalable-lifetime.ll
    M llvm/test/Transforms/LoopVectorize/scalable-loop-unpredicated-body-scalar-tail.ll
    M llvm/test/Transforms/LoopVectorize/scalable-reduction-inloop.ll
    M llvm/test/Transforms/LoopVectorize/scalable-trunc-min-bitwidth.ll

  Log Message:
  -----------
  [VPlan] Compute scalable VF in preheader for induction increment. (#74762)

UF * VF is loop invariant and can be computed directly in the preheader.
This prepares the code for #74761 and reduces the test changes.
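
The idea can be shown with a small scalar sketch (not the generated IR, and not code from the patch). For a scalable vector, VF is a runtime quantity (here modelled as `vscale * min_elts`), but it does not change inside the loop, so the induction step `UF * VF` can be computed once before the loop. All names are hypothetical.

```cpp
#include <cstdint>

// Recomputes the induction step UF * VF on every iteration.
inline std::uint64_t count_iterations_step_in_loop(std::uint64_t trip_count,
                                                   std::uint64_t vscale,
                                                   std::uint64_t min_elts,
                                                   std::uint64_t uf) {
  if (uf * vscale * min_elts == 0)
    return 0; // Guard against a zero step.
  std::uint64_t iterations = 0;
  std::uint64_t index = 0;
  while (index + uf * (vscale * min_elts) <= trip_count) {
    index += uf * (vscale * min_elts); // Loop-invariant work done each time.
    ++iterations;
  }
  return iterations;
}

// Computes the step once, in what would be the loop preheader.
inline std::uint64_t count_iterations_step_hoisted(std::uint64_t trip_count,
                                                   std::uint64_t vscale,
                                                   std::uint64_t min_elts,
                                                   std::uint64_t uf) {
  const std::uint64_t step = uf * (vscale * min_elts); // "Preheader".
  if (step == 0)
    return 0; // Guard against a zero step.
  std::uint64_t iterations = 0;
  for (std::uint64_t index = 0; index + step <= trip_count; index += step)
    ++iterations;
  return iterations;
}
```

Both functions visit the same indices. The hoisted form simply avoids repeating the multiplication inside the loop body.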

  Commit: 4421cbef172e94ffa3571e60bcd21592b29573af
      https://github.com/llvm/llvm-project/commit/4421cbef172e94ffa3571e60bcd21592b29573af
  Author: Florian Hahn <flo at fhahn.com>
  Date:   2023-12-08 (Fri, 08 Dec 2023)

  Changed paths:
    M flang/lib/Optimizer/CodeGen/CodeGen.cpp
    M flang/test/Fir/array-coor.fir
    M flang/test/Fir/arrexp.fir
    M flang/test/Fir/convert-to-llvm.fir
    M flang/test/Fir/coordinateof.fir
    M flang/test/Fir/tbaa.fir
    M lldb/include/lldb/Core/Address.h
    M lldb/include/lldb/Core/Debugger.h
    M lldb/include/lldb/Symbol/Symbol.h
    M lldb/include/lldb/Symbol/SymbolContext.h
    M lldb/include/lldb/Utility/Stream.h
    M lldb/source/Commands/CommandObjectTarget.cpp
    M lldb/source/Core/Address.cpp
    M lldb/source/Core/CoreProperties.td
    M lldb/source/Core/Debugger.cpp
    M lldb/source/Symbol/Symbol.cpp
    M lldb/source/Symbol/SymbolContext.cpp
    M lldb/source/Utility/Stream.cpp
    A lldb/test/Shell/Commands/command-image-lookup-color.test
    M llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    M llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
    M llvm/test/CodeGen/ARM/vector-store.ll
    M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll
    M llvm/test/CodeGen/X86/avx512fp16-arith.ll
    M llvm/test/CodeGen/X86/fold-pcmpeqd-2.ll
    M llvm/test/CodeGen/X86/gfni-funnel-shifts.ll
    M llvm/test/CodeGen/X86/gfni-rotates.ll
    M llvm/test/CodeGen/X86/min-legal-vector-width.ll
    M llvm/test/CodeGen/X86/var-permute-256.ll
    M llvm/test/CodeGen/X86/vec_fcopysign.ll
    M llvm/test/CodeGen/X86/vector-fshl-128.ll
    M llvm/test/CodeGen/X86/vector-fshl-256.ll
    M llvm/test/CodeGen/X86/vector-fshl-512.ll
    M llvm/test/CodeGen/X86/vector-fshl-rot-128.ll
    M llvm/test/CodeGen/X86/vector-fshl-rot-256.ll
    M llvm/test/CodeGen/X86/vector-fshl-rot-512.ll
    M llvm/test/CodeGen/X86/vector-fshr-128.ll
    M llvm/test/CodeGen/X86/vector-fshr-256.ll
    M llvm/test/CodeGen/X86/vector-fshr-512.ll
    M llvm/test/CodeGen/X86/vector-fshr-rot-128.ll
    M llvm/test/CodeGen/X86/vector-fshr-rot-256.ll
    M llvm/test/CodeGen/X86/vector-fshr-rot-512.ll
    M llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-4.ll
    M llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-6.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-3.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-5.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-6.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-7.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-5.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-6.ll
    M llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-7.ll
    M llvm/test/CodeGen/X86/vector-rotate-128.ll
    M llvm/test/CodeGen/X86/vector-rotate-256.ll
    M llvm/test/CodeGen/X86/vector-rotate-512.ll
    M llvm/test/CodeGen/X86/vector-shift-ashr-128.ll
    M llvm/test/CodeGen/X86/vector-shift-ashr-256.ll
    M llvm/test/CodeGen/X86/vector-shift-lshr-128.ll
    M llvm/test/CodeGen/X86/vector-shift-lshr-256.ll
    M llvm/test/CodeGen/X86/vector-shift-shl-128.ll
    M llvm/test/CodeGen/X86/vector-shift-shl-256.ll
    M llvm/test/CodeGen/X86/vector-shuffle-v192.ll
    A llvm/test/Transforms/LoopUnroll/loop-branch-folding.ll
    M llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
    M llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h
    M llvm/tools/llvm-exegesis/llvm-exegesis.cpp
    M mlir/cmake/modules/MLIRConfig.cmake.in

  Log Message:
  -----------
  [𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.4

[skip ci]

Compare: https://github.com/llvm/llvm-project/compare/a87250af08c7...4421cbef172e