[all-commits] [llvm/llvm-project] 606937: [SDAG] Remove IndexType manipulation in getUniform...
Alexey Bataev via All-commits
all-commits at lists.llvm.org
Fri Aug 15 12:50:24 PDT 2025
Branch: refs/heads/users/alexey-bataev/spr/slp-prefer-copyable-vectorization-over-alternate-opcodes
Home: https://github.com/llvm/llvm-project
Commit: 606937474e552f0a5d620f67f19947c96cfa9d2a
https://github.com/llvm/llvm-project/commit/606937474e552f0a5d620f67f19947c96cfa9d2a
Author: Philip Reames <preames at rivosinc.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
Log Message:
-----------
[SDAG] Remove IndexType manipulation in getUniformBase and callers (#151578)
All paths set it to the same value, so just propagate that value to the
consumer.
Commit: 11c22400493a2be9b3b0a01c53860bd4ffc2396b
https://github.com/llvm/llvm-project/commit/11c22400493a2be9b3b0a01c53860bd4ffc2396b
Author: Nikita Popov <npopov at redhat.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
Log Message:
-----------
[SDAGBuilder] Rename RetTys -> RetVTs (NFC)
Make it clearer that this is a vector of EVTs, not IR types.
Based on:
https://github.com/llvm/llvm-project/pull/153798#discussion_r2279066696
Commit: 6d3ad9d9fd830eef0ac8a9d558e826b8b624e17d
https://github.com/llvm/llvm-project/commit/6d3ad9d9fd830eef0ac8a9d558e826b8b624e17d
Author: cmtice <cmtice at google.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M lldb/source/ValueObject/DILEval.cpp
M lldb/test/API/commands/frame/var-dil/basics/ArraySubscript/TestFrameVarDILArraySubscript.py
M lldb/test/API/commands/frame/var-dil/basics/ArraySubscript/main.cpp
A lldb/test/API/commands/frame/var-dil/basics/ArraySubscript/myArraySynthProvider.py
Log Message:
-----------
[LLDB] Update DIL handling of array subscripting. (#151605)
This updates the DIL code for handling array subscripting to more
closely match and handle all the cases from the original 'frame var'
implementation. Also updates the DIL array subscripting test. This
particularly fixes some issues with handling synthetic children, objc
pointers, and accessing specific bits within scalar data types.
Commit: 0b04168948d00baf9c656ce02a85dc5cf6703581
https://github.com/llvm/llvm-project/commit/0b04168948d00baf9c656ce02a85dc5cf6703581
Author: Aiden Grossman <aidengrossman at google.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
A .github/workflows/bazel-checks.yml
Log Message:
-----------
[CI] Add Basic Bazel Checks (#153740)
Having basic checks (like running buildifier) on the upstream bazel
files would be helpful for contributors maintaining the bazel build. Add
basic checks (currently just buildifier) to a workflow that runs
whenever the bazel build files change.
Commit: f279c47cb3e7191a22703b837e006eb7dd591de7
https://github.com/llvm/llvm-project/commit/f279c47cb3e7191a22703b837e006eb7dd591de7
Author: Tim Renouf <tim.renouf at amd.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/docs/AMDGPUUsage.rst
M llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
M llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.h
A llvm/test/CodeGen/AMDGPU/dvgpr_sym.ll
A llvm/test/CodeGen/AMDGPU/dvgpr_sym_fail_too_many_block_size_16.ll
A llvm/test/CodeGen/AMDGPU/dvgpr_sym_fail_too_many_block_size_16_anon.ll
Log Message:
-----------
AMDGPU gfx12: Add _dvgpr$ symbols for dynamic VGPRs (#148251)
For each function with the AMDGPU_CS_Chain calling convention, with
dynamic VGPRs enabled, add a _dvgpr$ symbol, with the value of the
function symbol, plus an offset encoding one less than the number of
VGPR blocks used by the function (16 VGPRs per block, no more than 128)
in bits 5..3 of the symbol value. This is used by a front-end to have
functions that are chained rather than called, and a dispatcher that
dynamically resizes the VGPR count before dispatching to a function.
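As a rough illustration of the encoding described above (a sketch, not the AsmPrinter code from this commit), the offset added to the function symbol value could be computed as below. The helper name `dvgprSymbolOffset`, the `-1` error convention, and the rejection of a zero VGPR count are assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical helper, not part of LLVM: returns the offset that would be
// added to the function symbol value to form the _dvgpr$ symbol value, or -1
// if the VGPR count cannot be encoded.
int64_t dvgprSymbolOffset(unsigned NumVGPRs) {
  constexpr unsigned BlockSize = 16; // 16 VGPRs per block
  constexpr unsigned MaxVGPRs = 128; // at most 8 blocks
  if (NumVGPRs == 0 || NumVGPRs > MaxVGPRs)
    return -1;
  // Round the VGPR count up to whole blocks.
  unsigned Blocks = (NumVGPRs + BlockSize - 1) / BlockSize;
  // Encode (Blocks - 1) in bits 5..3 of the value.
  return static_cast<int64_t>(Blocks - 1) << 3;
}
```

With at most 8 blocks, `Blocks - 1` fits in the three bits 5..3, which is why the count is stored as one less than the number of blocks.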
Commit: 7df862818edbb570cb73888aa6a41a15a53eaf82
https://github.com/llvm/llvm-project/commit/7df862818edbb570cb73888aa6a41a15a53eaf82
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M clang/test/CodeGen/X86/avx512vbmi-builtins.c
M clang/test/CodeGen/X86/avx512vbmivl-builtin.c
Log Message:
-----------
[X86] avx512vbmi-builtins.c / avx512vbmivl-builtin.c - add C/C++ and 32/64-bit test coverage
Commit: 38eb14f27c2700718adcc9175656ed52f4528703
https://github.com/llvm/llvm-project/commit/38eb14f27c2700718adcc9175656ed52f4528703
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M clang/test/CodeGen/X86/avx512vbmi2-builtins.c
M clang/test/CodeGen/X86/avx512vlvbmi2-builtins.c
Log Message:
-----------
[X86] avx512vbmi2-builtins.c / avx512vlvbmi2-builtins.c - add C/C++ and 32/64-bit test coverage
Commit: ffaba758fb4ff98820c0a9ae15733863d1c5be37
https://github.com/llvm/llvm-project/commit/ffaba758fb4ff98820c0a9ae15733863d1c5be37
Author: Tim Gymnich <tim at gymni.ch>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
M mlir/test/Dialect/LLVMIR/rocdl.mlir
M mlir/test/Target/LLVMIR/rocdl.mlir
Log Message:
-----------
[MLIR][ROCDL] Add permlane16.swap and permanlane32.swap (#153804)
Add rocdl.permlane16.swap and rocdl.permlane32.swap.
Commit: ae7e1b82fe97f184fdc042f339784a64f28d5c08
https://github.com/llvm/llvm-project/commit/ae7e1b82fe97f184fdc042f339784a64f28d5c08
Author: Dave Lee <davelee.com at gmail.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M lldb/include/lldb/DataFormatters/DumpValueObjectOptions.h
M lldb/include/lldb/DataFormatters/ValueObjectPrinter.h
M lldb/include/lldb/Interpreter/OptionGroupValueObjectDisplay.h
M lldb/source/Commands/CommandObjectDWIMPrint.cpp
M lldb/source/Commands/CommandObjectExpression.cpp
M lldb/source/DataFormatters/DumpValueObjectOptions.cpp
M lldb/source/DataFormatters/ValueObjectPrinter.cpp
M lldb/source/Expression/REPL.cpp
M lldb/source/Interpreter/OptionGroupValueObjectDisplay.cpp
A lldb/test/API/lang/objc/failing-description/Makefile
A lldb/test/API/lang/objc/failing-description/TestObjCFailingDescription.py
A lldb/test/API/lang/objc/failing-description/main.m
A lldb/test/API/lang/objc/struct-description/Makefile
A lldb/test/API/lang/objc/struct-description/TestObjCStructDescription.py
A lldb/test/API/lang/objc/struct-description/main.m
Log Message:
-----------
[lldb] Print ValueObject when GetObjectDescription fails (#152417)
This fixes a few bugs, effectively through a fallback to `p` when `po` fails.
The motivating bug this fixes is when an error within the compiler causes `po` to fail.
Previously when that happened, only its value (typically an object's address) was
printed – and problematically, no compiler diagnostics were shown. With this change,
compiler diagnostics are shown, _and_ the object is fully printed (i.e. `p`).
Another bug this fixes is when `po` is used on a type that doesn't provide an object
description (such as a struct). Again, the normal `ValueObject` printing is used.
Additionally, this also improves how lldb handles an object description method that
fails in some way. Now an error will be shown (it wasn't before), and the value will be
printed normally.
Commit: 868efdcf381d28d6b5e273e6fb8704637436856e
https://github.com/llvm/llvm-project/commit/868efdcf381d28d6b5e273e6fb8704637436856e
Author: Shafik Yaghmour <shafik.yaghmour at intel.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M clang/lib/AST/ByteCode/InterpBuiltin.cpp
Log Message:
-----------
[Clang][Bytecode][NFC] Move Result into APSInt constructor (#153664)
Static analysis flagged this line because we are copying Result instead
of moving it.
Commit: f34326dac8e6903e0621dd87505928756f860d6d
https://github.com/llvm/llvm-project/commit/f34326dac8e6903e0621dd87505928756f860d6d
Author: Ramkumar Ramachandra <ramkumar.ramachandra at codasip.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
M llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
M llvm/lib/Transforms/Vectorize/VPlanUtils.h
Log Message:
-----------
[VPlan] Introduce vputils::onlyScalarValuesUsed (NFC) (#153577)
Commit: 08ff017fb0c9c7c3c91858023ea45149449fbbfc
https://github.com/llvm/llvm-project/commit/08ff017fb0c9c7c3c91858023ea45149449fbbfc
Author: Leandro Lacerda <leandrolcampos at yahoo.com.br>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M libc/benchmarks/gpu/CMakeLists.txt
M libc/benchmarks/gpu/LibcGpuBenchmark.cpp
M libc/benchmarks/gpu/LibcGpuBenchmark.h
M libc/benchmarks/gpu/src/math/CMakeLists.txt
M libc/benchmarks/gpu/src/math/atan2_benchmark.cpp
M libc/benchmarks/gpu/src/math/sin_benchmark.cpp
M libc/benchmarks/gpu/timing/amdgpu/CMakeLists.txt
M libc/benchmarks/gpu/timing/amdgpu/timing.h
M libc/benchmarks/gpu/timing/nvptx/CMakeLists.txt
M libc/benchmarks/gpu/timing/nvptx/timing.h
Log Message:
-----------
[libc] Improve GPU benchmarking (#153512)
This patch improves GPU benchmarking in the following ways:
* Replace `rand`/`srand` with a deterministic per-thread RNG seeded by
`call_index`: reproducible, apples-to-apples libc vs vendor comparisons.
* Fix input generation: sample the unbiased exponent uniformly in
`[min_exp, max_exp]`, clamp bounds, and skip `Inf`, `NaN`, `-0.0`, and
`+0.0`.
* Fix standard deviation: use an explicit estimator from sums and
sums-of-squares (`sqrt(E[x^2] − E[x]^2)`) across samples.
* Fix throughput overhead: subtract a loop-only baseline inside
NVPTX/AMDGPU timing backends so `benchmark()` gets cycles-per-call
already corrected (no `overhead()` call).
* Adapt existing math benchmarks to the new RNG/timing plumbing (plumb
`call_index`, drop `rand/srand`, clean includes).
* Correct inter-thread aggregation: use iteration-weighted pooling to
compute the global mean/variance, ensuring statistically sound `Cycles
(Mean)` and `Stddev`.
* Remove `Time / Iteration` column from the results table: it reported
per-thread convergence time (not per-call latency) and was
redundant/misleading next to `Cycles (Mean)`.
* Remove unused `BenchmarkLogger` files: dead code that added
maintenance and cognitive overhead without providing functionality.
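
The standard deviation estimator and the iteration-weighted pooling mentioned in the list above can be sketched as follows. This is a minimal host-side illustration, not the code from `LibcGpuBenchmark.cpp`; the `ThreadStats` struct and both function names are hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical per-thread summary; field names are illustrative only.
struct ThreadStats {
  uint64_t Iterations; // number of samples this thread collected
  double Mean;         // per-thread mean cycles per call
  double Variance;     // per-thread population variance
};

// Standard deviation from running sums: sqrt(E[x^2] - E[x]^2).
double stddevFromSums(double Sum, double SumSq, uint64_t N) {
  if (N == 0)
    return 0.0;
  double Mean = Sum / N;
  double Var = SumSq / N - Mean * Mean;
  return std::sqrt(Var > 0.0 ? Var : 0.0); // clamp rounding noise
}

// Iteration-weighted pooling of per-thread results into a global mean and
// variance. Each thread contributes in proportion to its iteration count.
ThreadStats pool(const std::vector<ThreadStats> &Threads) {
  uint64_t Total = 0;
  double Sum = 0.0, SumSq = 0.0;
  for (const ThreadStats &T : Threads) {
    Total += T.Iterations;
    Sum += T.Mean * T.Iterations;
    // E[x^2] for one thread is Variance + Mean^2.
    SumSq += (T.Variance + T.Mean * T.Mean) * T.Iterations;
  }
  if (Total == 0)
    return {0, 0.0, 0.0};
  double Mean = Sum / Total;
  double Var = SumSq / Total - Mean * Mean;
  return {Total, Mean, Var > 0.0 ? Var : 0.0};
}
```

Pooling through the second moment means the global variance also accounts for differences between per-thread means, which a plain average of per-thread variances would miss.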
---
## TODO (before merge)
* [ ] Investigate compiler warnings and address their root causes.
* [x] Review how per-thread results are aggregated into the overall
result.
## Follow-ups (future PRs)
* Add support to run throughput benchmarks with uniform (linear) input
distributions, alongside the current log2-uniform scheme.
* Review/adjust the configuration and coverage of existing math
benchmarks.
* Add more math benchmarks (e.g., `exp`/`expf`, others).
Commit: 1d1e52e614f95eed5ee440b43fa1992e46976629
https://github.com/llvm/llvm-project/commit/1d1e52e614f95eed5ee440b43fa1992e46976629
Author: Daniel Paoliello <danpao at microsoft.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Target/X86/X86WinEHUnwindV2.cpp
M llvm/test/CodeGen/X86/win64-eh-unwindv2-errors.mir
A llvm/test/CodeGen/X86/win64-eh-unwindv2-push-pop-stack-alloc.mir
Log Message:
-----------
[win][x64] Allow push/pop for stack alloc when unwind v2 is required (#153621)
While attempting to enable Windows x64 unwind v2, compilation failed
with the following error:
```
fatal error: error in backend: Windows x64 Unwind v2 is required, but LLVM has generated incompatible code in function '<redacted>': Cannot pop registers before the stack allocation has been deallocated
```
I traced this down to an optimization in `X86FrameLowering`:
<https://github.com/llvm/llvm-project/blob/6961139ce9154d03c88b8d46c8742a1eaa569cd9/llvm/lib/Target/X86/X86FrameLowering.cpp#L324-L340>
Technically, using `push`/`pop` to adjust the stack is permitted under
unwind v2: the requirement for a "canonical" epilog is that the stack is
fully adjusted before the registers listed as pushed in the unwind table
are popped. So, as long as the `.seh_unwindv2start` pseudo is after the
pops that adjust the stack, then everything will work correctly.
One other side effect of this change is that the stack is now allowed to
be adjusted across multiple instructions, which would be needed for
extremely large stack frames.
Commit: 01bc7421855889dcc3b10a131928e3a4a8e4b38c
https://github.com/llvm/llvm-project/commit/01bc7421855889dcc3b10a131928e3a4a8e4b38c
Author: Nikita Popov <npopov at redhat.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/include/llvm/CodeGen/TargetLowering.h
M llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
M llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
M llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
M llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
M llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
M llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
M llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
M llvm/lib/Target/AArch64/AArch64FastISel.cpp
M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
M llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp
M llvm/lib/Target/ARM/ARMISelLowering.cpp
M llvm/lib/Target/ARM/ARMSelectionDAGInfo.cpp
M llvm/lib/Target/AVR/AVRISelLowering.cpp
M llvm/lib/Target/CSKY/CSKYISelLowering.cpp
M llvm/lib/Target/Hexagon/HexagonSelectionDAGInfo.cpp
M llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
M llvm/lib/Target/M68k/M68kISelLowering.cpp
M llvm/lib/Target/Mips/MipsISelLowering.cpp
M llvm/lib/Target/PowerPC/PPCISelLowering.cpp
M llvm/lib/Target/RISCV/RISCVISelLowering.cpp
M llvm/lib/Target/Sparc/SparcISelLowering.cpp
M llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
M llvm/lib/Target/VE/VEISelLowering.cpp
M llvm/lib/Target/X86/X86ISelLowering.cpp
M llvm/lib/Target/XCore/XCoreISelLowering.cpp
M llvm/lib/Target/XCore/XCoreSelectionDAGInfo.cpp
Log Message:
-----------
[CodeGen] Give ArgListEntry a proper constructor (NFC) (#153817)
This ensures that the required fields are set, and also makes the
construction more convenient.
Commit: c10766cf49b797a8227c165721e7466a61596729
https://github.com/llvm/llvm-project/commit/c10766cf49b797a8227c165721e7466a61596729
Author: George Burgess IV <george.burgess.iv at gmail.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/utils/revert_checker.py
M llvm/utils/revert_checker_test.py
Log Message:
-----------
[utils] add `stop_at_sha` to revert_checker's API (#152011)
This is useful for downstream consumers of this as a module. It's
unclear if interactive use wants this lever, but support can easily be
added if so.
Commit: 853094fd813f773326b452ec5f3360cc5f2be0f7
https://github.com/llvm/llvm-project/commit/853094fd813f773326b452ec5f3360cc5f2be0f7
Author: Craig Topper <craig.topper at sifive.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/CodeGen/VirtRegMap.cpp
Log Message:
-----------
[VirtRegMap] Use TRI member variable. NFC
Commit: cd0bf2735bcd1e9a21dd10169782060a3702c447
https://github.com/llvm/llvm-project/commit/cd0bf2735bcd1e9a21dd10169782060a3702c447
Author: Shubham Sandeep Rastogi <srastogi22 at apple.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M lldb/source/ValueObject/DILEval.cpp
M lldb/test/API/commands/frame/var-dil/basics/ArraySubscript/TestFrameVarDILArraySubscript.py
M lldb/test/API/commands/frame/var-dil/basics/ArraySubscript/main.cpp
R lldb/test/API/commands/frame/var-dil/basics/ArraySubscript/myArraySynthProvider.py
Log Message:
-----------
Revert "[LLDB] Update DIL handling of array subscripting. (#151605)"
This reverts commit 6d3ad9d9fd830eef0ac8a9d558e826b8b624e17d.
This was reverted because it broke the LLDB greendragon bot.
Commit: b0d2b57f7e4726abc5fb6152f151c0c24625e4bf
https://github.com/llvm/llvm-project/commit/b0d2b57f7e4726abc5fb6152f151c0c24625e4bf
Author: Phoebe Wang <phoebe.wang at intel.com>
Date: 2025-08-16 (Sat, 16 Aug 2025)
Changed paths:
M clang/lib/Headers/emmintrin.h
M clang/lib/Headers/xmmintrin.h
Log Message:
-----------
[Headers][X86] Remove more duplicated typedefs (#153820)
They are defined in mmintrin.h
Commit: 0e9b6d6c8a111e214a3907fe97ccadf8f438d854
https://github.com/llvm/llvm-project/commit/0e9b6d6c8a111e214a3907fe97ccadf8f438d854
Author: Min-Yih Hsu <min.hsu at sifive.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/CodeGen/InterleavedAccessPass.cpp
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll
Log Message:
-----------
[IA][RISCV] Detecting gap mask from a mask assembled by interleaveN intrinsics (#153510)
If the mask of a (fixed-vector) deinterleaved load is assembled by
`vector.interleaveN` intrinsic, any intrinsic arguments that are
all-zeros are regarded as gaps.
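A simplified model of this rule, using plain C++ containers rather than LLVM IR values, might look like the following. The function name and the bit convention (bit `I` set means field `I` is live, cleared means it is a gap) are assumptions made for the sketch, not taken from `InterleavedAccessPass.cpp`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Each inner vector models one argument of the interleave: the per-lane mask
// for one field. The result has bit I set when field I is live, and cleared
// when argument I is all-zeros (a gap). Handles up to 32 fields.
uint32_t gapMaskFromInterleaveArgs(const std::vector<std::vector<bool>> &Args) {
  uint32_t Mask = 0;
  for (std::size_t I = 0; I < Args.size() && I < 32; ++I) {
    bool AllZeros = std::none_of(Args[I].begin(), Args[I].end(),
                                 [](bool B) { return B; });
    if (!AllZeros)
      Mask |= uint32_t(1) << I;
  }
  return Mask;
}
```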
Commit: dfa1335db1fe5e884207b0be375c038c61129a62
https://github.com/llvm/llvm-project/commit/dfa1335db1fe5e884207b0be375c038c61129a62
Author: Andrey Timonin <timonina1909 at gmail.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
M mlir/lib/Dialect/EmitC/IR/EmitC.cpp
M mlir/test/Dialect/EmitC/invalid_ops.mlir
Log Message:
-----------
[mlir][emitc] Add verification for the emitc.get_field op (#152577)
This MR adds a `verifier` for the `emitc.get_field` op.
- The `verifier` checks that the `emitc.get_field` operation is nested
inside an `emitc.class` op.
- Additionally, appropriate tests for erroneous cases were added for
class-related operations in `invalid_ops.mlir`.
Commit: 583499a8cf1df76a5439958ffc95d9c04808bcfc
https://github.com/llvm/llvm-project/commit/583499a8cf1df76a5439958ffc95d9c04808bcfc
Author: Valentin Clement (バレンタイン クレメン) <clementval at gmail.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M flang/module/cudadevice.f90
M flang/test/Lower/CUDA/cuda-libdevice.cuf
Log Message:
-----------
[flang][cuda] Add missing bind name for __hiloint2double, __double2loint and __double2hiint (#153713)
Commit: 3a8f579a23d0362f77152085846e8e3d80df6b09
https://github.com/llvm/llvm-project/commit/3a8f579a23d0362f77152085846e8e3d80df6b09
Author: CatherineMoore <catmoore at amd.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M openmp/libompd/gdb-plugin/ompdModule.c
Log Message:
-----------
[OpenMP] Update printf statement with missing argument. (#153704)
Commit: 2c20a9bfb3ce7a22b040b4a2694f19beeb616cd0
https://github.com/llvm/llvm-project/commit/2c20a9bfb3ce7a22b040b4a2694f19beeb616cd0
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M clang/test/CodeGen/X86/avx512bf16-builtins.c
M clang/test/CodeGen/X86/avx512vlbf16-builtins.c
Log Message:
-----------
[X86] avx512bf16-builtins.c / avx512vlbf16-builtins.c - add C/C++ and 32/64-bit test coverage
Commit: af96ed6bf6e6a1ba0cbb36cb3925dd44f41c301e
https://github.com/llvm/llvm-project/commit/af96ed6bf6e6a1ba0cbb36cb3925dd44f41c301e
Author: keinflue <keinflue at posteo.de>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M clang/docs/ReleaseNotes.rst
M clang/lib/Sema/SemaDecl.cpp
M clang/lib/Sema/SemaDeclCXX.cpp
M clang/test/CXX/class/class.union/class.union.anon/p4.cpp
Log Message:
-----------
[clang] Inject IndirectFieldDecl even if name conflicts. (#153140)
This modifies InjectAnonymousStructOrUnionMembers to inject an
IndirectFieldDecl and mark it invalid even if its name conflicts with
another name in the scope.
This resolves a crash on a further diagnostic
diag::err_multiple_mem_union_initialization which via
findDefaultInitializer relies on these declarations being present.
Fixes #149985
Commit: a8d25683eec612f180215f446397f39a53c5c416
https://github.com/llvm/llvm-project/commit/a8d25683eec612f180215f446397f39a53c5c416
Author: zGoldthorpe <zgoldtho at ualberta.ca>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/include/llvm/IR/PatternMatch.h
M llvm/lib/Target/Hexagon/HexagonVectorCombine.cpp
M llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
M llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
M llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
M llvm/lib/Transforms/Vectorize/VectorCombine.cpp
Log Message:
-----------
[PatternMatch] Allow `m_ConstantInt` to match integer splats (#153692)
When matching integers, `m_ConstantInt` is a convenient alternative to
`m_APInt` for matching unsigned 64-bit integers, allowing one to
simplify
```cpp
const APInt *IntC;
if (match(V, m_APInt(IntC))) {
if (IntC->ule(UINT64_MAX)) {
uint64_t Int = IntC->getZExtValue();
// ...
}
}
```
to
```cpp
uint64_t Int;
if (match(V, m_ConstantInt(Int))) {
// ...
}
```
However, this simplification is only valid if `V` is a scalar type.
Specifically, `m_APInt` also matches integer splats, but `m_ConstantInt`
does not.
This patch ensures that the matching behaviour of `m_ConstantInt`
parallels that of `m_APInt`, and also incorporates it in some obvious
places.
Commit: b045729eb4d66ff76df469e1a995cea4e4f383ba
https://github.com/llvm/llvm-project/commit/b045729eb4d66ff76df469e1a995cea4e4f383ba
Author: asraa <asraa at google.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M mlir/include/mlir/Analysis/Presburger/IntegerRelation.h
M mlir/lib/Analysis/Presburger/IntegerRelation.cpp
M mlir/unittests/Analysis/Presburger/IntegerRelationTest.cpp
Log Message:
-----------
[mlir][presburger] add functionality to compute local mod in IntegerRelation (#153614)
Similar to `IntegerRelation::addLocalFloorDiv`, this adds a utility
`IntegerRelation::addLocalModulo` that adds a local variable equal to an
affine function of the variables modulo some constant modulus. The
function returns the absolute index of the new var in the relation.
This is computed by first finding the floordiv of `exprs // modulus = q`
and then computing the remainder `result = exprs - q * modulus`.
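On plain integers, the same two-step computation can be sketched as below. This only illustrates the arithmetic, not the Presburger API. The function names are hypothetical and the modulus is assumed to be positive.

```cpp
#include <cstdint>

// Floor division: rounds toward negative infinity, unlike C++ '/', which
// truncates toward zero.
int64_t floorDiv(int64_t Expr, int64_t Modulus) {
  int64_t Q = Expr / Modulus;
  if ((Expr % Modulus != 0) && ((Expr < 0) != (Modulus < 0)))
    --Q;
  return Q;
}

// Remainder computed as described: result = expr - q * modulus.
int64_t floorMod(int64_t Expr, int64_t Modulus) {
  int64_t Q = floorDiv(Expr, Modulus);
  return Expr - Q * Modulus;
}
```

Using the floor quotient keeps the result in `[0, modulus)` even for negative values of the expression.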
Signed-off-by: Asra Ali <asraa at google.com>
Commit: 92cb0414ca419212bb54ac9af99407bd444fb3f4
https://github.com/llvm/llvm-project/commit/92cb0414ca419212bb54ac9af99407bd444fb3f4
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M clang/test/CodeGen/X86/avx512vlvnni-builtins.c
M clang/test/CodeGen/X86/avx512vnni-builtins.c
Log Message:
-----------
[X86] avx512vnni-builtins.c / avx512vlvnni-builtins.c - add C/C++ and 32/64-bit test coverage
Commit: fd3f052aeb59c1672db7f72169ea4b03c73c62d7
https://github.com/llvm/llvm-project/commit/fd3f052aeb59c1672db7f72169ea4b03c73c62d7
Author: Valentin Clement (バレンタイン クレメン) <clementval at gmail.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M flang/module/cudadevice.f90
M flang/test/Lower/CUDA/cuda-libdevice.cuf
Log Message:
-----------
[flang][cuda] Add interfaces for int_as_float and float_as_int (#153716)
Commit: bc773632355b3cebde350b0341624e88be40b744
https://github.com/llvm/llvm-project/commit/bc773632355b3cebde350b0341624e88be40b744
Author: Alex MacLean <amaclean at nvidia.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
A llvm/test/CodeGen/NVPTX/cse-mov-sym.ll
Log Message:
-----------
[NVPTX] Do not mark move of global address as cheap enabling more CSE (#153730)
Commit: 0e8c964c2180921e1464ba68f7f7f864257cbdfb
https://github.com/llvm/llvm-project/commit/0e8c964c2180921e1464ba68f7f7f864257cbdfb
Author: Valentin Clement (バレンタイン クレメン) <clementval at gmail.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M flang/module/cudadevice.f90
M flang/test/Lower/CUDA/cuda-libdevice.cuf
Log Message:
-----------
[flang][cuda] Add interfaces for double_as_longlong and longlong_as_double (#153719)
Commit: 0e4af726cb4d60072dcabf161fcc3c9e3a31cf2a
https://github.com/llvm/llvm-project/commit/0e4af726cb4d60072dcabf161fcc3c9e3a31cf2a
Author: Valentin Clement (バレンタイン クレメン) <clementval at gmail.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M flang/module/cudadevice.f90
M flang/test/Lower/CUDA/cuda-libdevice.cuf
Log Message:
-----------
[flang][cuda] Add interface for __fdividef (#153742)
Commit: 115f8160697815dedab89e67f6211322ca6d43d9
https://github.com/llvm/llvm-project/commit/115f8160697815dedab89e67f6211322ca6d43d9
Author: Valentin Clement (バレンタイン クレメン) <clementval at gmail.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M flang/module/cudadevice.f90
M flang/test/Lower/CUDA/cuda-libdevice.cuf
Log Message:
-----------
[flang][cuda] Add missing bind name for __int2double_rn (#153720)
Commit: 069f8121e0257e5271961ab7deb77497da6b3495
https://github.com/llvm/llvm-project/commit/069f8121e0257e5271961ab7deb77497da6b3495
Author: Aiden Grossman <aidengrossman at google.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Target/X86/X86SchedSkylakeClient.td
M llvm/lib/Target/X86/X86SchedSkylakeServer.td
M llvm/test/tools/llvm-mca/X86/SkylakeClient/zero-idioms.s
M llvm/test/tools/llvm-mca/X86/SkylakeServer/zero-idioms.s
Log Message:
-----------
[X86] Add RCU for Skylake Models (#153832)
We cannot actually retire an infinite number of uops per cycle. This
patch adds an RCU to the Skylake scheduling model to fix this. I'm
purposefully using a loose upper bound here. We're unlikely to actually
get four fused uops per cycle, but this is better than not setting
anything. Most realistic code I've put through uiCA will retire up to ~6
uops per cycle.
Information taken from
https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(client).
This requires modification of the two zero idiom tests because we do not
currently model the CPU frontend which would likely be the actual
bottleneck in that case.
Related to #153747.
Commit: 0bb1af478a5c8957b7a0b8464bd7c1855b9b5b12
https://github.com/llvm/llvm-project/commit/0bb1af478a5c8957b7a0b8464bd7c1855b9b5b12
Author: Kaitlin Peng <kaitlinpeng at microsoft.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/include/llvm/InitializePasses.h
M llvm/include/llvm/LinkAllPasses.h
M llvm/include/llvm/Transforms/IPO/GlobalDCE.h
M llvm/lib/Target/DirectX/CMakeLists.txt
M llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp
M llvm/lib/Target/DirectX/DirectXPassRegistry.def
M llvm/lib/Target/DirectX/DirectXTargetMachine.cpp
M llvm/lib/Transforms/IPO/GlobalDCE.cpp
M llvm/test/CodeGen/DirectX/finalize_linkage.ll
M llvm/test/CodeGen/DirectX/llc-pipeline.ll
M llvm/test/CodeGen/DirectX/scalar-data.ll
M llvm/test/tools/dxil-dis/opaque-value_as_metadata.ll
Log Message:
-----------
[DirectX] Add GlobalDCE pass after finalize linkage pass in DirectX backend (#151071)
Fixes #139023.
This PR essentially removes unused global variables:
- Restores the `GlobalDCE` Legacy pass and adds it to the DirectX
backend after the finalize linkage pass
- Converts external global variables with no usage to internal linkage
in the finalize linkage pass
- (so they can be removed by `GlobalDCE`)
- Makes the `dxil-finalize-linkage` pass usable using the new pass
manager flag syntax
- Adds tests to `finalize_linkage.ll` that make sure unused global
variables are removed
- Adds a use for variable `@CBV` in `opaque-value_as_metadata.ll` so it
isn't removed
- Changes the `scalar-data.ll` run command to avoid removing its global
variables
---------
Co-authored-by: Farzon Lotfi <farzonlotfi at microsoft.com>
Commit: ed6d505fabcb53f02b68efc30aca15bddf823578
https://github.com/llvm/llvm-project/commit/ed6d505fabcb53f02b68efc30aca15bddf823578
Author: Aaron Ballman <aaron at aaronballman.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M clang/docs/LanguageExtensions.rst
Log Message:
-----------
[C][Docs] Add backported language features (#153837)
We've backported many more features from newer C standards to previous C
standards than we were documenting. I took a pass over the c_status page
for Clang and pulled more entries to add to our documentation.
Commit: 3720d8b52d664c7e3620404d1a2d12cee13677f3
https://github.com/llvm/llvm-project/commit/3720d8b52d664c7e3620404d1a2d12cee13677f3
Author: Valentin Clement (バレンタイン クレメン) <clementval at gmail.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M flang/module/cudadevice.f90
M flang/test/Lower/CUDA/cuda-device-proc.cuf
M flang/test/Lower/CUDA/cuda-libdevice.cuf
Log Message:
-----------
[flang][cuda] Update some bind name to fast version and add __sincosf (#153744)
Use the fast version in the bind name and reorder these fast math
functions. Add missing __sincosf interface.
Commit: 5d28284dbb1f4e5c60f96399f8075e8e6a4a2440
https://github.com/llvm/llvm-project/commit/5d28284dbb1f4e5c60f96399f8075e8e6a4a2440
Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Target/AMDGPU/GCNSubtarget.h
Log Message:
-----------
[AMDGPU] gfx1250 does not need nop before VGPR dealloc (#153844)
This has no impact as the dealloc is now practically disabled.
Commit: 1dc0005d6d23f36b80358abad6590886c8eed32a
https://github.com/llvm/llvm-project/commit/1dc0005d6d23f36b80358abad6590886c8eed32a
Author: Dave Lee <davelee.com at gmail.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M lldb/source/Commands/CommandObjectDWIMPrint.cpp
Log Message:
-----------
Revert "[lldb] Fallback to expression eval when Dump of variable fails in dwim-print" (#153824)
Reverts llvm/llvm-project#151374
Superseded by https://github.com/llvm/llvm-project/pull/152417
Commit: 3a4a60deffdf5bbe622326b2813583acc37cccce
https://github.com/llvm/llvm-project/commit/3a4a60deffdf5bbe622326b2813583acc37cccce
Author: XChy <xxs_chy at outlook.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Transforms/Vectorize/VectorCombine.cpp
M llvm/test/Transforms/VectorCombine/X86/intrinsic-scalarize.ll
M llvm/test/Transforms/VectorCombine/binop-scalarize.ll
M llvm/test/Transforms/VectorCombine/intrinsic-scalarize.ll
Log Message:
-----------
[VectorCombine] Apply InstSimplify in scalarizeOpOrCmp to avoid infinite loop (#153069)
Fixes #153012
As we tolerate unfoldable constant expressions in `scalarizeOpOrCmp`, we
may fold
```llvm
define void @bug(ptr %ptr1, ptr %ptr2, i64 %idx) #0 {
entry:
%158 = insertelement <2 x i64> <i64 5, i64 ptrtoint (ptr @val to i64)>, i64 %idx, i32 0
%159 = or disjoint <2 x i64> splat (i64 2), %158
store <2 x i64> %159, ptr %ptr2
ret void
}
```
to
```llvm
define void @bug(ptr %ptr1, ptr %ptr2, i64 %idx) {
entry:
%.scalar = or disjoint i64 2, %idx
%0 = or <2 x i64> splat (i64 2), <i64 5, i64 ptrtoint (ptr @val to i64)>
%1 = insertelement <2 x i64> %0, i64 %.scalar, i64 0
store <2 x i64> %1, ptr %ptr2, align 16
ret void
}
```
And it would be folded back in `foldInsExtBinop`, resulting in an
infinite loop.
This patch forces scalarization iff InstSimplify can fold the constant
expression.
Commit: 29976f2e58a3700ebedcaaa5692dcd5befd0cab2
https://github.com/llvm/llvm-project/commit/29976f2e58a3700ebedcaaa5692dcd5befd0cab2
Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
M llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
M llvm/lib/Target/AMDGPU/GCNSubtarget.h
A llvm/test/CodeGen/AMDGPU/hazard-getreg-waitalu.mir
Log Message:
-----------
[AMDGPU] Handle S_GETREG_B32 hazard on gfx1250 (#153848)
GFX1250 SPG says: S_GETREG_B32 does not wait for idle before executing.
The user must S_WAIT_ALU 0 before S_GETREG_B32 on:
STATUS, STATE_PRIV, EXCP_FLAG_PRIV, or EXCP_FLAG_USER.
Commit: dcdbd5b55db818424ce034285eb8482b26f43d73
https://github.com/llvm/llvm-project/commit/dcdbd5b55db818424ce034285eb8482b26f43d73
Author: Erich Keane <ekeane at nvidia.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M clang/lib/Sema/SemaOpenACC.cpp
Log Message:
-----------
[OpenACC][NFCI] Implement 'recipe' generation for firstprivate copy (#153622)
The 'firstprivate' clause requires that we do a 'copy' operation, so
this patch creates some AST nodes from which we can generate the copy
operation, including a 'temporary' and array init. For the most part
this is pretty similar to what 'private' does, other than the fact that
the source is a copy (and not default init!), and that there is a
temporary from which to copy.
---------
Co-authored-by: Andy Kaylor <akaylor at nvidia.com>
Commit: 758c6852c3ffe6b5e259cafadd811e60d8c276fb
https://github.com/llvm/llvm-project/commit/758c6852c3ffe6b5e259cafadd811e60d8c276fb
Author: Alexey Bataev <a.bataev at outlook.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
A llvm/test/Transforms/SLPVectorizer/X86/schedule-same-user-with-copyable.ll
Log Message:
-----------
[SLP]Do not include copyable data to the same user twice
If the copyable schedule data is created and the user is used several
times in the user node, there is no need to count the same data for the
same user several times; it needs to be included only once.
Fixes #153754
Commit: 82caa251d4e145b54ea76236213617076f254c2b
https://github.com/llvm/llvm-project/commit/82caa251d4e145b54ea76236213617076f254c2b
Author: zGoldthorpe <Zach.Goldthorpe at amd.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
A llvm/test/Transforms/InstCombine/repack-ints-thru-zext.ll
Log Message:
-----------
[InstCombine] Fold integer unpack/repack patterns through ZExt (#153583)
This patch explicitly enables the InstCombiner to fold integer
unpack/repack patterns such as
```llvm
define i64 @src_combine(i32 %lower, i32 %upper) {
%base = zext i32 %lower to i64
%u.0 = and i32 %upper, u0xff
%z.0 = zext i32 %u.0 to i64
%s.0 = shl i64 %z.0, 32
%o.0 = or i64 %base, %s.0
%r.1 = lshr i32 %upper, 8
%u.1 = and i32 %r.1, u0xff
%z.1 = zext i32 %u.1 to i64
%s.1 = shl i64 %z.1, 40
%o.1 = or i64 %o.0, %s.1
%r.2 = lshr i32 %upper, 16
%u.2 = and i32 %r.2, u0xff
%z.2 = zext i32 %u.2 to i64
%s.2 = shl i64 %z.2, 48
%o.2 = or i64 %o.1, %s.2
%r.3 = lshr i32 %upper, 24
%u.3 = and i32 %r.3, u0xff
%z.3 = zext i32 %u.3 to i64
%s.3 = shl i64 %z.3, 56
%o.3 = or i64 %o.2, %s.3
ret i64 %o.3
}
; =>
define i64 @tgt_combine(i32 %lower, i32 %upper) {
%base = zext i32 %lower to i64
%upper.zext = zext i32 %upper to i64
%s.0 = shl nuw i64 %upper.zext, 32
%o.3 = or disjoint i64 %s.0, %base
ret i64 %o.3
}
```
Alive2 proofs: [YAy7ny](https://alive2.llvm.org/ce/z/YAy7ny)
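As an informal sanity check alongside the Alive2 proof, the two functions can be modeled in Python and compared on sample inputs. The function names mirror the IR above; the masking constants and the random sampling are illustrative assumptions and are not part of the patch.

```python
import random

MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def src_combine(lower, upper):
    """Byte-by-byte repack, mirroring @src_combine."""
    result = lower & MASK32  # %base = zext i32 %lower to i64
    for i in range(4):
        byte = ((upper & MASK32) >> (8 * i)) & 0xFF  # lshr + and
        result |= (byte << (32 + 8 * i)) & MASK64    # zext + shl + or
    return result

def tgt_combine(lower, upper):
    """Single zext/shl/or, mirroring @tgt_combine."""
    return (((upper & MASK32) << 32) & MASK64) | (lower & MASK32)

# Compare both models on random inputs plus a few edge cases.
rng = random.Random(0)
samples = [(rng.getrandbits(32), rng.getrandbits(32)) for _ in range(1000)]
samples += [(0, 0), (MASK32, MASK32), (0, MASK32), (MASK32, 0)]
all_equal = all(src_combine(l, u) == tgt_combine(l, u) for l, u in samples)
```

Random testing like this only raises confidence; the Alive2 link remains the actual proof of equivalence.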
Commit: 79cf877627ec341c62f64e25a44f3ba340edad1e
https://github.com/llvm/llvm-project/commit/79cf877627ec341c62f64e25a44f3ba340edad1e
Author: Abhinav Gaba <abhinav.gaba at intel.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M offload/plugins-nextgen/amdgpu/src/rtl.cpp
M offload/plugins-nextgen/common/include/PluginInterface.h
M offload/plugins-nextgen/common/src/PluginInterface.cpp
M offload/plugins-nextgen/cuda/src/rtl.cpp
M offload/plugins-nextgen/host/src/rtl.cpp
Log Message:
-----------
[Offload] Introduce dataFence plugin interface. (#153793)
The purpose of this fence is to ensure that any `dataSubmit`s inserted
into a queue before a `dataFence` finish before any `dataSubmit`s
inserted after it begin.
This is a no-op for most queues, since they are in-order, and by design
any operations inserted into them occur in order.
But the interface is supposed to be functional for out-of-order queues.
The addition of the interface means that any operations that rely on
such ordering (like ATTACH map-type support in #149036) can invoke it,
without worrying about whether the underlying queue is in-order or
out-of-order.
Once a plugin supports out-of-order queues, the plugin can implement
this function, without requiring any change at the libomptarget level.
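The ordering guarantee described above can be sketched with a toy queue model in Python. Everything here (`ToyQueue`, `data_submit`, `data_fence`, the shuffle used to mimic out-of-order completion) is hypothetical and does not reflect the real `PluginInterface` API; it only shows why the fence is a no-op for in-order queues and a barrier for out-of-order ones.

```python
import random

class ToyQueue:
    """Hypothetical queue model; not the PluginInterface API."""

    def __init__(self, in_order, seed=0):
        self.in_order = in_order
        self.batches = [[]]  # operations grouped between fences
        self.rng = random.Random(seed)

    def data_submit(self, name):
        self.batches[-1].append(name)

    def data_fence(self):
        # No-op for in-order queues: insertion order is already guaranteed.
        if not self.in_order and self.batches[-1]:
            self.batches.append([])

    def run(self):
        executed = []
        for batch in self.batches:
            ops = list(batch)
            if not self.in_order:
                self.rng.shuffle(ops)  # may complete in any order within a batch
            executed.extend(ops)
        return executed

def submit_with_fence(queue):
    queue.data_submit("a")
    queue.data_submit("b")
    queue.data_fence()
    queue.data_submit("c")
    queue.data_submit("d")
    return queue.run()

in_order_result = submit_with_fence(ToyQueue(in_order=True))
out_of_order_result = submit_with_fence(ToyQueue(in_order=False))
```

In the out-of-order case, `a` and `b` may swap with each other, but neither can run after `c` or `d`.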
---------
Co-authored-by: Alex Duran <alejandro.duran at intel.com>
Commit: d7a29e5d5605f277d991b03a3923597a033d73ed
https://github.com/llvm/llvm-project/commit/d7a29e5d5605f277d991b03a3923597a033d73ed
Author: Jasmine Tang <jjasmine at igalia.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
M llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp
M llvm/test/CodeGen/WebAssembly/memcmp-expand.ll
A llvm/test/CodeGen/WebAssembly/simd-setcc.ll
Log Message:
-----------
[WebAssembly] Reapply #149461 with correct CondCode in combine of SETCC (#153703)
This PR reapplies https://github.com/llvm/llvm-project/pull/149461
In the original `combineVectorSizedSetCCEquality`, the result of setcc
is being negated by returning setcc with the same cond code, leading to
wrong logic.
For example, with
```llvm
%cmp_16 = call i32 @memcmp(ptr %a, ptr %b, i32 16)
%res = icmp eq i32 %cmp_16, 0
```
the original PR produces all_true and then also compares the result
equal to 0 (using the same SETEQ in the returning setcc), meaning that
semantically, it is effectively computing icmp ne.
Instead, the PR should have used SETNE in the returning setcc; this way,
all_true returns 1, which is then compared ne 0, which is equivalent
to icmp eq.
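The condition-code mix-up can be replayed with a small Python model. This is a sketch under stated assumptions: `all_true` below is a byte-wise stand-in for the SIMD reduction, and the two `combined_with_*` helpers are hypothetical names for the buggy and corrected forms of the returned setcc.

```python
def all_true(a: bytes, b: bytes) -> int:
    """Toy stand-in for the SIMD all_true reduction: 1 if every byte lane matches."""
    return int(all(x == y for x, y in zip(a, b)))

def memcmp_eq_zero(a: bytes, b: bytes) -> bool:
    """Reference semantics of `icmp eq (memcmp a, b, 16), 0` for equal-length inputs."""
    return a == b

def combined_with_seteq(a: bytes, b: bytes) -> bool:
    # Buggy form: reuses SETEQ, so it is true exactly when the buffers differ.
    return all_true(a, b) == 0

def combined_with_setne(a: bytes, b: bytes) -> bool:
    # Corrected form: SETNE against 0 is true exactly when the buffers match.
    return all_true(a, b) != 0

buf = bytes(range(16))
same = bytes(range(16))
different = bytes(range(1, 17))
```

With these inputs, the SETNE form agrees with the reference on both the matching and the differing buffer, while the SETEQ form gives the opposite answer on each.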
Commit: 09f5b9ab0a40b7905701f05094b19964d16cc183
https://github.com/llvm/llvm-project/commit/09f5b9ab0a40b7905701f05094b19964d16cc183
Author: Alexey Bataev <a.bataev at outlook.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
R llvm/test/Transforms/SLPVectorizer/X86/schedule-same-user-with-copyable.ll
Log Message:
-----------
Revert "[SLP]Do not include copyable data to the same user twice"
This reverts commit 758c6852c3ffe6b5e259cafadd811e60d8c276fb to fix
buildbot https://lab.llvm.org/buildbot/#/builders/195/builds/13298
Commit: 139bde203535a89aa975047d496392931bc972b4
https://github.com/llvm/llvm-project/commit/139bde203535a89aa975047d496392931bc972b4
Author: Bill Wendling <morbo at google.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M .gitignore
Log Message:
-----------
[llvm] Ignore coding assistant artifacts (#153853)
Now that "vibe coding" is a thing, ignore the documentation artifacts
that coding assistants, like Claude and Gemini, use to retain coding
workflows and other metadata.
Commit: c6ea7d72d12073c63681bca998a87b4a436a9dff
https://github.com/llvm/llvm-project/commit/c6ea7d72d12073c63681bca998a87b4a436a9dff
Author: Augusto Noronha <anoronha at apple.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
M lldb/unittests/Language/CPlusPlus/CPlusPlusLanguageTest.cpp
Log Message:
-----------
[lldb] Fix CXX's SymbolNameFitsToLanguage matching other languages (#153685)
The current implementation of
CPlusPlusLanguage::SymbolNameFitsToLanguage will return true if the
symbol is mangled for any language that lldb knows about.
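The over-matching described above can be illustrated with a simplified prefix check in Python. This is a sketch only: the helper names are hypothetical, the prefix lists are deliberately incomplete (real Itanium detection also accepts extra leading underscores, for example), and the commit message does not spell out the fix, so the "C++ only" variant is an assumption about its intent.

```python
CXX_PREFIXES = ("_Z", "?")                  # Itanium and MSVC C++ manglings
OTHER_PREFIXES = ("_R", "_D", "$s", "_$s")  # Rust v0, D, Swift (simplified)

def fits_any_language(name: str) -> bool:
    """Models the behaviour described above: true for any known mangling scheme."""
    return name.startswith(CXX_PREFIXES + OTHER_PREFIXES)

def fits_cxx_only(name: str) -> bool:
    """Assumed intent of the fix: accept only C++ mangling schemes."""
    return name.startswith(CXX_PREFIXES)

cxx_symbol = "_ZN3foo3barEv"         # Itanium-mangled foo::bar()
rust_symbol = "_RNvCs1234_3foo3bar"  # Rust v0-style name (illustrative)
plain_symbol = "main"
```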
Commit: 49e28d77b8df2ee2a7f97d0f685a3ccbf3360050
https://github.com/llvm/llvm-project/commit/49e28d77b8df2ee2a7f97d0f685a3ccbf3360050
Author: CatherineMoore <catmoore at amd.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M openmp/libompd/gdb-plugin/ompdModule.c
Log Message:
-----------
[OpenMP] Update ompdModule.c printf to match argument type (#152785)
Update printf format string to match argument list
---------
Co-authored-by: Joachim <protze at rz.rwth-aachen.de>
Co-authored-by: Joachim Jenke <jenke at itc.rwth-aachen.de>
Commit: b3e3a2090b7307c7efbfbc7cee9d9573f2226d3b
https://github.com/llvm/llvm-project/commit/b3e3a2090b7307c7efbfbc7cee9d9573f2226d3b
Author: Chenguang Wang <w3cing at gmail.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M utils/bazel/llvm-project-overlay/mlir/test/Target/BUILD.bazel
Log Message:
-----------
[bazel] Add missing test inputs inclusion on mlir/test/Target. (#153854)
https://github.com/llvm/llvm-project/pull/152131 added a few tests that
depend on `mlir/test/Target/Wasm/inputs/*`, e.g.
`mlir/test/Target/Wasm/import.mlir` reads `inputs/import.yaml.wasm`.
These inputs should be included as data dependency.
Commit: 2ed727f3f6eedaff061cb38a2404beff970a0243
https://github.com/llvm/llvm-project/commit/2ed727f3f6eedaff061cb38a2404beff970a0243
Author: Florian Hahn <flo at fhahn.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Log Message:
-----------
[VPlan] Move SCEV invalidation to ::executePlan. (NFCI)
Move SCEV invalidation from legacy ILV code-path directly to ::executePlan.
Commit: 732eb5427cfcb103710b21ca6f2de8dbacaec215
https://github.com/llvm/llvm-project/commit/732eb5427cfcb103710b21ca6f2de8dbacaec215
Author: David Green <david.green at arm.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Target/AArch64/AArch64InstrFormats.td
M llvm/lib/Target/AArch64/AArch64InstrInfo.td
Log Message:
-----------
[AArch64] Replace SIMDLongThreeVectorBHSabd with SIMDLongThreeVectorBHS. (#152987)
We just need to use a BinOpFrag to share the patterns. This also moves
UABDL to where it belongs in with similar instructions, and removes some
patterns that are now handled by abd nodes. This is mostly NFC except
for GISel, which will catch back up when it handles abd nodes in the
same way.
Commit: b157599156942de04d1174a5dbf5d07ca81256d7
https://github.com/llvm/llvm-project/commit/b157599156942de04d1174a5dbf5d07ca81256d7
Author: Alexey Bataev <a.bataev at outlook.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
A llvm/test/Transforms/SLPVectorizer/X86/schedule-same-user-with-copyable.ll
Log Message:
-----------
[SLP]Do not include copyable data to the same user twice
If the copyable schedule data is created and the user is used several
times in the user node, there is no need to count the same data for the
same user several times; it needs to be included only once.
Fixes #153754
Commit: cce60ce18a513838c79667129ffec8bd51b0db07
https://github.com/llvm/llvm-project/commit/cce60ce18a513838c79667129ffec8bd51b0db07
Author: Alexey Bataev <a.bataev at outlook.com>
Date: 2025-08-15 (Fri, 15 Aug 2025)
Changed paths:
A .github/workflows/bazel-checks.yml
M .gitignore
M clang/docs/LanguageExtensions.rst
M clang/docs/ReleaseNotes.rst
M clang/lib/AST/ByteCode/InterpBuiltin.cpp
M clang/lib/Headers/emmintrin.h
M clang/lib/Headers/xmmintrin.h
M clang/lib/Sema/SemaDecl.cpp
M clang/lib/Sema/SemaDeclCXX.cpp
M clang/lib/Sema/SemaOpenACC.cpp
M clang/test/CXX/class/class.union/class.union.anon/p4.cpp
M clang/test/CodeGen/X86/avx512bf16-builtins.c
M clang/test/CodeGen/X86/avx512vbmi-builtins.c
M clang/test/CodeGen/X86/avx512vbmi2-builtins.c
M clang/test/CodeGen/X86/avx512vbmivl-builtin.c
M clang/test/CodeGen/X86/avx512vlbf16-builtins.c
M clang/test/CodeGen/X86/avx512vlvbmi2-builtins.c
M clang/test/CodeGen/X86/avx512vlvnni-builtins.c
M clang/test/CodeGen/X86/avx512vnni-builtins.c
M flang/module/cudadevice.f90
M flang/test/Lower/CUDA/cuda-device-proc.cuf
M flang/test/Lower/CUDA/cuda-libdevice.cuf
M libc/benchmarks/gpu/CMakeLists.txt
M libc/benchmarks/gpu/LibcGpuBenchmark.cpp
M libc/benchmarks/gpu/LibcGpuBenchmark.h
M libc/benchmarks/gpu/src/math/CMakeLists.txt
M libc/benchmarks/gpu/src/math/atan2_benchmark.cpp
M libc/benchmarks/gpu/src/math/sin_benchmark.cpp
M libc/benchmarks/gpu/timing/amdgpu/CMakeLists.txt
M libc/benchmarks/gpu/timing/amdgpu/timing.h
M libc/benchmarks/gpu/timing/nvptx/CMakeLists.txt
M libc/benchmarks/gpu/timing/nvptx/timing.h
M lldb/include/lldb/DataFormatters/DumpValueObjectOptions.h
M lldb/include/lldb/DataFormatters/ValueObjectPrinter.h
M lldb/include/lldb/Interpreter/OptionGroupValueObjectDisplay.h
M lldb/source/Commands/CommandObjectDWIMPrint.cpp
M lldb/source/Commands/CommandObjectExpression.cpp
M lldb/source/DataFormatters/DumpValueObjectOptions.cpp
M lldb/source/DataFormatters/ValueObjectPrinter.cpp
M lldb/source/Expression/REPL.cpp
M lldb/source/Interpreter/OptionGroupValueObjectDisplay.cpp
M lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
A lldb/test/API/lang/objc/failing-description/Makefile
A lldb/test/API/lang/objc/failing-description/TestObjCFailingDescription.py
A lldb/test/API/lang/objc/failing-description/main.m
A lldb/test/API/lang/objc/struct-description/Makefile
A lldb/test/API/lang/objc/struct-description/TestObjCStructDescription.py
A lldb/test/API/lang/objc/struct-description/main.m
M lldb/unittests/Language/CPlusPlus/CPlusPlusLanguageTest.cpp
M llvm/docs/AMDGPUUsage.rst
M llvm/include/llvm/CodeGen/TargetLowering.h
M llvm/include/llvm/IR/PatternMatch.h
M llvm/include/llvm/InitializePasses.h
M llvm/include/llvm/LinkAllPasses.h
M llvm/include/llvm/Transforms/IPO/GlobalDCE.h
M llvm/lib/CodeGen/InterleavedAccessPass.cpp
M llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
M llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
M llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
M llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
M llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
M llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
M llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
M llvm/lib/CodeGen/VirtRegMap.cpp
M llvm/lib/Target/AArch64/AArch64FastISel.cpp
M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
M llvm/lib/Target/AArch64/AArch64InstrFormats.td
M llvm/lib/Target/AArch64/AArch64InstrInfo.td
M llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp
M llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
M llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.h
M llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
M llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
M llvm/lib/Target/AMDGPU/GCNSubtarget.h
M llvm/lib/Target/ARM/ARMISelLowering.cpp
M llvm/lib/Target/ARM/ARMSelectionDAGInfo.cpp
M llvm/lib/Target/AVR/AVRISelLowering.cpp
M llvm/lib/Target/CSKY/CSKYISelLowering.cpp
M llvm/lib/Target/DirectX/CMakeLists.txt
M llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp
M llvm/lib/Target/DirectX/DirectXPassRegistry.def
M llvm/lib/Target/DirectX/DirectXTargetMachine.cpp
M llvm/lib/Target/Hexagon/HexagonSelectionDAGInfo.cpp
M llvm/lib/Target/Hexagon/HexagonVectorCombine.cpp
M llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
M llvm/lib/Target/M68k/M68kISelLowering.cpp
M llvm/lib/Target/Mips/MipsISelLowering.cpp
M llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
M llvm/lib/Target/PowerPC/PPCISelLowering.cpp
M llvm/lib/Target/RISCV/RISCVISelLowering.cpp
M llvm/lib/Target/Sparc/SparcISelLowering.cpp
M llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
M llvm/lib/Target/VE/VEISelLowering.cpp
M llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
M llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp
M llvm/lib/Target/X86/X86ISelLowering.cpp
M llvm/lib/Target/X86/X86SchedSkylakeClient.td
M llvm/lib/Target/X86/X86SchedSkylakeServer.td
M llvm/lib/Target/X86/X86WinEHUnwindV2.cpp
M llvm/lib/Target/XCore/XCoreISelLowering.cpp
M llvm/lib/Target/XCore/XCoreSelectionDAGInfo.cpp
M llvm/lib/Transforms/IPO/GlobalDCE.cpp
M llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
M llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
M llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
M llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
M llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
M llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
M llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
M llvm/lib/Transforms/Vectorize/VPlanUtils.h
M llvm/lib/Transforms/Vectorize/VectorCombine.cpp
A llvm/test/CodeGen/AMDGPU/dvgpr_sym.ll
A llvm/test/CodeGen/AMDGPU/dvgpr_sym_fail_too_many_block_size_16.ll
A llvm/test/CodeGen/AMDGPU/dvgpr_sym_fail_too_many_block_size_16_anon.ll
A llvm/test/CodeGen/AMDGPU/hazard-getreg-waitalu.mir
M llvm/test/CodeGen/DirectX/finalize_linkage.ll
M llvm/test/CodeGen/DirectX/llc-pipeline.ll
M llvm/test/CodeGen/DirectX/scalar-data.ll
A llvm/test/CodeGen/NVPTX/cse-mov-sym.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll
M llvm/test/CodeGen/WebAssembly/memcmp-expand.ll
A llvm/test/CodeGen/WebAssembly/simd-setcc.ll
M llvm/test/CodeGen/X86/win64-eh-unwindv2-errors.mir
A llvm/test/CodeGen/X86/win64-eh-unwindv2-push-pop-stack-alloc.mir
A llvm/test/Transforms/InstCombine/repack-ints-thru-zext.ll
M llvm/test/Transforms/SLPVectorizer/X86/matched-shuffled-entries.ll
A llvm/test/Transforms/SLPVectorizer/X86/schedule-same-user-with-copyable.ll
M llvm/test/Transforms/VectorCombine/X86/intrinsic-scalarize.ll
M llvm/test/Transforms/VectorCombine/binop-scalarize.ll
M llvm/test/Transforms/VectorCombine/intrinsic-scalarize.ll
M llvm/test/tools/dxil-dis/opaque-value_as_metadata.ll
M llvm/test/tools/llvm-mca/X86/SkylakeClient/zero-idioms.s
M llvm/test/tools/llvm-mca/X86/SkylakeServer/zero-idioms.s
M llvm/utils/revert_checker.py
M llvm/utils/revert_checker_test.py
M mlir/include/mlir/Analysis/Presburger/IntegerRelation.h
M mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
M mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
M mlir/lib/Analysis/Presburger/IntegerRelation.cpp
M mlir/lib/Dialect/EmitC/IR/EmitC.cpp
M mlir/test/Dialect/EmitC/invalid_ops.mlir
M mlir/test/Dialect/LLVMIR/rocdl.mlir
M mlir/test/Target/LLVMIR/rocdl.mlir
M mlir/unittests/Analysis/Presburger/IntegerRelationTest.cpp
M offload/plugins-nextgen/amdgpu/src/rtl.cpp
M offload/plugins-nextgen/common/include/PluginInterface.h
M offload/plugins-nextgen/common/src/PluginInterface.cpp
M offload/plugins-nextgen/cuda/src/rtl.cpp
M offload/plugins-nextgen/host/src/rtl.cpp
M openmp/libompd/gdb-plugin/ompdModule.c
M utils/bazel/llvm-project-overlay/mlir/test/Target/BUILD.bazel
Log Message:
-----------
Rebase
Created using spr 1.3.5
Compare: https://github.com/llvm/llvm-project/compare/efe2e4d48475...cce60ce18a51