[all-commits] [llvm/llvm-project] 5ae9ff: [RISCV] Address review comment from 88062
darkbuck via All-commits
all-commits at lists.llvm.org
Wed Apr 10 15:52:09 PDT 2024
Branch: refs/heads/users/darkbuck/spr/globalisel-handle-more-commutable-instructions-in-commute_constant_to_rhs
Home: https://github.com/llvm/llvm-project
Commit: 5ae9ffbd18fd93edbbc8efebe140aeb24cd763c2
https://github.com/llvm/llvm-project/commit/5ae9ffbd18fd93edbbc8efebe140aeb24cd763c2
Author: Philip Reames <preames at rivosinc.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
Log Message:
-----------
[RISCV] Address review comment from 88062
As pointed out by Fraser, KillSrcReg is always false at this point in
the code, and having the inconsistency on whether we check the flag
between the if and else blocks is confusing.
Commit: a8f9f85ab0114deb0f6adae2b578bc39c62c19b3
https://github.com/llvm/llvm-project/commit/a8f9f85ab0114deb0f6adae2b578bc39c62c19b3
Author: Joseph Huber <huberjn at outlook.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
M openmp/libomptarget/src/interface.cpp
Log Message:
-----------
[Libomptarget][NFC] Fix unused variable warnings
Summary:
This patch fixes a few warnings that would show up while building.
Commit: 2bf48892ab0ce5d53126c7b114070bba18521501
https://github.com/llvm/llvm-project/commit/2bf48892ab0ce5d53126c7b114070bba18521501
Author: Yaxun (Sam) Liu <yaxun.liu at amd.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/docs/HIPSupport.rst
Log Message:
-----------
[HIP] document difference with CUDA (#86838)
Commit: 6ca5a410d26262f06f954e91200eefe0cbfb7fb8
https://github.com/llvm/llvm-project/commit/6ca5a410d26262f06f954e91200eefe0cbfb7fb8
Author: Alexey Bataev <a.bataev at outlook.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
A llvm/test/Transforms/SLPVectorizer/X86/extractlements-gathered-first-node.ll
Log Message:
-----------
[SLP]Fix PR87358: broken module, Instruction does not dominate all uses.
If the first node is a gather node with extractelement instructions,
still need to put the vector value after all instructions, not after the
very first one.
Commit: 7f1b9adfc8d86c77ee87a268b3d30e0eda8ed493
https://github.com/llvm/llvm-project/commit/7f1b9adfc8d86c77ee87a268b3d30e0eda8ed493
Author: Craig Topper <craig.topper at sifive.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/include/llvm/CodeGen/MachineCombinerPattern.h
M llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
M llvm/test/CodeGen/RISCV/rv64zba.ll
Log Message:
-----------
[RISCV] Add MachineCombiner to fold (sh3add Z, (add X, (slli Y, 6))) -> (sh3add (sh3add Y, Z), X). (#87884)
This improves a pattern that occurs in 531.deepsjeng_r, reducing the
dynamic instruction count by 0.5%.
This may be possible to improve in SelectionDAG, but given the special
cases around shXadd formation, it's not obvious it can be done in a
robust way without adding multiple special cases.
I've used a GEP with 2 indices because that most closely resembles the
motivating case. Most of the test cases are the simplest GEP case. One
test has a logical right shift on an index which is closer to the
deepsjeng code. This requires special handling in isel to reverse a
DAGCombiner canonicalization that turns a pair of shifts into (srl (and
X, C1), C2).
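The fold in the subject line can be checked arithmetically. The following is a minimal sketch, assuming 64-bit wrapping registers and the Zba definition of sh3add (rd = rs2 + (rs1 << 3)); the helper names `before` and `after` are made up for illustration and do not come from the patch.

```python
MASK64 = (1 << 64) - 1  # model RV64 registers that wrap at 64 bits

def slli(x, shamt):
    # Logical shift left by an immediate, truncated to 64 bits.
    return (x << shamt) & MASK64

def add(a, b):
    # Wrapping 64-bit add.
    return (a + b) & MASK64

def sh3add(rs1, rs2):
    # Zba sh3add: rs2 + (rs1 << 3).
    return add(rs2, slli(rs1, 3))

def before(x, y, z):
    # (sh3add Z, (add X, (slli Y, 6))): three instructions.
    return sh3add(z, add(x, slli(y, 6)))

def after(x, y, z):
    # (sh3add (sh3add Y, Z), X): two instructions.
    return sh3add(sh3add(y, z), x)
```

Both forms compute X + (Z << 3) + (Y << 6) modulo 2^64, because shifting (Z + (Y << 3)) left by 3 distributes over the sum.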
Commit: f9f4aba547f50e6dcb2d9345b51fe4883bb64d8d
https://github.com/llvm/llvm-project/commit/f9f4aba547f50e6dcb2d9345b51fe4883bb64d8d
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
A llvm/test/Transforms/InstCombine/vector-reduce-min-max-known.ll
Log Message:
-----------
[InstCombine] Add tests for non-zero/knownbits of `vector_reduce_{s,u}{min,max}`; NFC
Commit: 77d668451ad2e6370eb595c171779429e9becdf2
https://github.com/llvm/llvm-project/commit/77d668451ad2e6370eb595c171779429e9becdf2
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Analysis/ValueTracking.cpp
M llvm/test/Transforms/InstCombine/vector-reduce-min-max-known.ll
Log Message:
-----------
[ValueTracking] Add support for `vector_reduce_{s,u}{min,max}` in `isKnownNonZero`
Previously missing, proofs for all implementations:
https://alive2.llvm.org/ce/z/G8wpmG
Commit: 41c52217b003ce9435ae534251b0d0d035495262
https://github.com/llvm/llvm-project/commit/41c52217b003ce9435ae534251b0d0d035495262
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Analysis/ValueTracking.cpp
M llvm/test/Transforms/InstCombine/vector-reduce-min-max-known.ll
Log Message:
-----------
[ValueTracking] Add support for `vector_reduce_{s,u}{min,max}` in `computeKnownBits`
Previously missing. We compute by just applying the reduce function on
the knownbits of each element.
Closes #88169
Commit: a02b3c01820090d4208146b51372587251fdce61
https://github.com/llvm/llvm-project/commit/a02b3c01820090d4208146b51372587251fdce61
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/test/Transforms/InstCombine/known-bits.ll
Log Message:
-----------
[ValueTracking] Add tests for overflow detection functions in `isKnownNonZero`; NFC
Commit: f0a487d7e2085e21f3691393070f54110d889fb6
https://github.com/llvm/llvm-project/commit/f0a487d7e2085e21f3691393070f54110d889fb6
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Analysis/ValueTracking.cpp
Log Message:
-----------
[ValueTracking] Split `isNonZero(mul)` logic to a helper; NFC
Commit: 37ca6fa1e26e86c85c544023b18695be420e80dd
https://github.com/llvm/llvm-project/commit/37ca6fa1e26e86c85c544023b18695be420e80dd
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Analysis/ValueTracking.cpp
M llvm/test/Transforms/InstCombine/known-bits.ll
Log Message:
-----------
[ValueTracking] Add support for overflow detection functions in `isKnownNonZero`
Adds support for: `{s,u}{add,sub,mul}.with.overflow`
The logic is identical to the non-overflow binops; we were just
missing the cases.
Closes #87701
Commit: 2ff82c2c6490a1478e4311f60f1ce80af0957403
https://github.com/llvm/llvm-project/commit/2ff82c2c6490a1478e4311f60f1ce80af0957403
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/test/Transforms/InstSimplify/known-non-zero.ll
Log Message:
-----------
[ValueTracking] Add tests for improving `isKnownNonZero` of `smax`; NFC
Commit: f1ee458ddb45c9887b3df583ce9a4ba12aae8b3b
https://github.com/llvm/llvm-project/commit/f1ee458ddb45c9887b3df583ce9a4ba12aae8b3b
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Analysis/ValueTracking.cpp
M llvm/test/Transforms/InstSimplify/known-non-zero.ll
Log Message:
-----------
[ValueTracking] improve `isKnownNonZero` precision for `smax`
Instead of relying on known-bits for strictly positive, use the
`isKnownPositive` API. This will use `isKnownNonZero` which is more
accurate.
Closes #88170
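A small sanity check of the underlying fact, as a sketch rather than the ValueTracking code: if either operand of `smax` is strictly positive, the result cannot be zero. The 8-bit range and the function names here are assumptions chosen to keep the search exhaustive.

```python
SIGNED_8BIT = range(-128, 128)  # every i8 value, interpreted as signed

def smax(a, b):
    # Signed maximum, matching the semantics of llvm.smax on signed integers.
    return max(a, b)

def smax_zero_counterexamples():
    # Collect every pair where one operand is strictly positive
    # yet smax still yields zero. The list is expected to be empty.
    return [(x, y)
            for x in SIGNED_8BIT
            for y in SIGNED_8BIT
            if (x > 0 or y > 0) and smax(x, y) == 0]
```

The search finds no counterexample: smax(X, Y) >= X > 0 whenever X is strictly positive, and the same holds with the operands swapped.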
Commit: 7d60232b38b66138dae1b31027d73ee5b9df5c58
https://github.com/llvm/llvm-project/commit/7d60232b38b66138dae1b31027d73ee5b9df5c58
Author: Krzysztof Parzyszek <Krzysztof.Parzyszek at amd.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/docs/tools/clang-formatted-files.txt
M clang/include/clang/Driver/Options.td
M clang/lib/Driver/ToolChains/Flang.cpp
M flang/include/flang/Frontend/PreprocessorOptions.h
M flang/include/flang/Parser/parsing.h
A flang/include/flang/Parser/preprocessor.h
A flang/include/flang/Parser/token-sequence.h
M flang/lib/Frontend/CompilerInvocation.cpp
M flang/lib/Frontend/FrontendActions.cpp
M flang/lib/Parser/parsing.cpp
M flang/lib/Parser/preprocessor.cpp
R flang/lib/Parser/preprocessor.h
M flang/lib/Parser/prescan.cpp
M flang/lib/Parser/prescan.h
M flang/lib/Parser/token-sequence.cpp
R flang/lib/Parser/token-sequence.h
M flang/test/Driver/driver-help-hidden.f90
M flang/test/Driver/driver-help.f90
A flang/test/Preprocessing/show-macros1.F90
A flang/test/Preprocessing/show-macros2.F90
A flang/test/Preprocessing/show-macros3.F90
Log Message:
-----------
[flang][Frontend] Implement printing defined macros via -dM (#87627)
This should work the same way as in clang.
Commit: 52aaa8a87960a7d342c5e6b7d5af82c76c8cc45d
https://github.com/llvm/llvm-project/commit/52aaa8a87960a7d342c5e6b7d5af82c76c8cc45d
Author: Jordan Rupprecht <rupprecht at google.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/test/Driver/lld-repro.c
Log Message:
-----------
[clang][test] Avoid writing to a potentially write-protected dir (#88258)
This test just checks for the stdout/stderr of clang, but it
incidentally tries to write to `a.out` in the current directory, which
may be write protected. Typically one would write `clang -o %t.o` for a
writeable dir, but since we only care about stdout/stderr, throw away
the object file and just write to /dev/null instead.
Commit: 0ad663ead1242e908a8c5005f35e72747d136a3b
https://github.com/llvm/llvm-project/commit/0ad663ead1242e908a8c5005f35e72747d136a3b
Author: Mark de Wever <koraq at xs4all.nl>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M libcxx/include/__algorithm/simd_utils.h
M libcxx/include/__config
M libcxx/test/libcxx/gdb/gdb_pretty_printer_test.sh.cpp
M libcxx/test/std/ranges/range.utility/range.utility.conv/to_deduction.pass.cpp
M libcxx/test/std/utilities/format/format.arguments/format.arg/visit.pass.cpp
M libcxx/test/std/utilities/format/format.arguments/format.arg/visit.return_type.pass.cpp
M libcxx/test/std/utilities/format/format.arguments/format.arg/visit_format_arg.deprecated.verify.cpp
M libcxx/test/std/utilities/variant/variant.visit.member/robust_against_adl.pass.cpp
M libcxx/test/std/utilities/variant/variant.visit.member/visit.pass.cpp
M libcxx/test/std/utilities/variant/variant.visit.member/visit_return_type.pass.cpp
Log Message:
-----------
[libc++] Removes Clang-16 support. (#87810)
With the release of Clang-18 we no longer officially support Clang-16.
Commit: fc3dff9b4637bb5960fe70add90cd27e6842d58b
https://github.com/llvm/llvm-project/commit/fc3dff9b4637bb5960fe70add90cd27e6842d58b
Author: Jan Svoboda <jan_svoboda at apple.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/lib/Serialization/ASTReader.cpp
A clang/test/Modules/home-is-cwd-search-paths.c
Log Message:
-----------
[clang][modules] Stop eagerly reading files with diagnostic pragmas (#87442)
This makes it so that the importer doesn't need to stat all input files
of a module that contain diagnostic pragmas, reducing file system
traffic.
Commit: 51786eb5bfc30e7eff998323a9ce433ec4620383
https://github.com/llvm/llvm-project/commit/51786eb5bfc30e7eff998323a9ce433ec4620383
Author: Jan Svoboda <jan_svoboda at apple.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/lib/Serialization/ASTWriter.cpp
Log Message:
-----------
[clang][modules] Only compute affecting module maps with implicit search (#87849)
When writing out a PCM, we compute the set of module maps that did
affect the compilation and we strip the rest to make the output
independent of them. The most common way to read a module map that is
not affecting is with implicit module map search. The other option is to
pass a bunch of unnecessary `-fmodule-map-file=<path>` arguments on the
command-line, in which case the client should probably not give those to
Clang anyway.
This makes serialization of explicit modules faster, mostly due to
reduced file system traffic.
Commit: 323d3ab2574ba9d371926bb1b5c67dbe7b2b4ec3
https://github.com/llvm/llvm-project/commit/323d3ab2574ba9d371926bb1b5c67dbe7b2b4ec3
Author: Craig Topper <craig.topper at sifive.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Target/RISCV/RISCVISelLowering.cpp
M llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll
Log Message:
-----------
[RISCV] Optimize undef Even vector in getWideningInterleave. (#88221)
We recently optimized the code when the Odd vector was undef to fix a
poison bug.
There are additional optimizations we can do if the even vector is
undef. With Zvbb, we can use a single vwsll. Without Zvbb, we can use a
vzext.vf2 and a vsll.
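A rough model of why the shift-based sequence works, assuming 8-bit source elements and that an undef even lane may hold any value (zero here). The function names are hypothetical and this is not the lowering code itself.

```python
SEW = 8                     # assumed source element width in bits
ELT_MASK = (1 << SEW) - 1

def widening_interleave(even, odd):
    # General case: each 2*SEW-bit lane holds the even element in its
    # low half and the odd element in its high half.
    return [((o & ELT_MASK) << SEW) | (e & ELT_MASK) for e, o in zip(even, odd)]

def interleave_undef_even(odd):
    # Even lanes are undef, so zero is an acceptable value for them.
    widened = [o & ELT_MASK for o in odd]   # models vzext.vf2
    return [w << SEW for w in widened]      # models vsll by SEW (one vwsll with Zvbb)

def as_narrow_lanes(wide):
    # Reinterpret the wide vector as SEW-bit lanes, low half first.
    lanes = []
    for w in wide:
        lanes += [w & ELT_MASK, w >> SEW]
    return lanes
```

Reading the result back as narrow lanes puts the odd elements at the odd positions, which is all the interleave has to guarantee when the even input is undef.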
Commit: e72c949c15208ba3dd53a9cebfee02734965a678
https://github.com/llvm/llvm-project/commit/e72c949c15208ba3dd53a9cebfee02734965a678
Author: Evgenii Stepanov <eugeni.stepanov at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
M llvm/test/Instrumentation/MemorySanitizer/overflow.ll
Log Message:
-----------
[msan] Overflow intrinsics. (#88210)
Commit: 43b2b2ebce635bec1e3c060092ea75db858ee3fd
https://github.com/llvm/llvm-project/commit/43b2b2ebce635bec1e3c060092ea75db858ee3fd
Author: Mehdi Amini <joker.eph at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M mlir/lib/Conversion/ComplexToStandard/ComplexToStandard.cpp
M mlir/test/Conversion/ComplexToStandard/convert-to-standard.mlir
Log Message:
-----------
Revert "Fix complex log1p accuracy with large abs values." (#88290)
Reverts llvm/llvm-project#88260
The test fails on the GCC7 buildbot.
Commit: 48c5c70fdd3bec2929e2e903e3bf4494a65f7a92
https://github.com/llvm/llvm-project/commit/48c5c70fdd3bec2929e2e903e3bf4494a65f7a92
Author: erichkeane <ekeane at nvidia.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/lib/Sema/SemaOpenACC.cpp
Log Message:
-----------
[NFC] Update SemaRef.Diag to just Diag in OpenACC implementation
I missed these two in my last patch as the two patches crossed in
review, so correct this now.
Commit: 3d468566eb395995ac54fcf90d3afb9b9f822eb3
https://github.com/llvm/llvm-project/commit/3d468566eb395995ac54fcf90d3afb9b9f822eb3
Author: erichkeane <ekeane at nvidia.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/lib/Serialization/ASTReader.cpp
Log Message:
-----------
[NFC] Remove unneeded 'maybe_unused' attributes
This was added while we only had a partial implementation of clauses, so
we don't need these anymore.
Commit: f388a3a446ef2566d73b6a73ba300738f8c2c002
https://github.com/llvm/llvm-project/commit/f388a3a446ef2566d73b6a73ba300738f8c2c002
Author: Aart Bik <ajcbik at google.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
M mlir/test/Dialect/SparseTensor/invalid.mlir
M mlir/test/Dialect/SparseTensor/roundtrip.mlir
M mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pack.mlir
Log Message:
-----------
[mlir][sparse] update doc and examples of the [dis]assemble operations (#88213)
The doc and examples of the [dis]assemble operations did not reflect all
the recent changes on order of the operands. Also clarified some of the
text.
Commit: 798e04f93769318db857b27f51020e7115e00301
https://github.com/llvm/llvm-project/commit/798e04f93769318db857b27f51020e7115e00301
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/include/clang/Basic/OpenACCKinds.h
Log Message:
-----------
Fix MSVC "not all control paths return a value" warning. NFC.
Commit: 335d5d5f47b883055e676ffe5f981469a5f5f4f6
https://github.com/llvm/llvm-project/commit/335d5d5f47b883055e676ffe5f981469a5f5f4f6
Author: Vyacheslav Levytskyy <vyacheslav.levytskyy at intel.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Target/SPIRV/SPIRVUtils.cpp
M llvm/test/CodeGen/SPIRV/SampledImageRetType.ll
Log Message:
-----------
[SPIRV] Tweak parsing of base type name in builtins (#88255)
This PR is a small improvement to the parsing of base type names in
builtins, allowing it to understand `unsigned ...` types. The test case
that fails without the fix is attached.
Commit: 4dcf33b6c2806216dfe8c5e1e3582a45516dbc69
https://github.com/llvm/llvm-project/commit/4dcf33b6c2806216dfe8c5e1e3582a45516dbc69
Author: David Green <david.green at arm.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/test/CodeGen/AArch64/lrint-conv.ll
M llvm/test/CodeGen/AArch64/vector-lrint.ll
Log Message:
-----------
[AArch64] Cleanup and GISel coverage for lrint tests. NFC
Commit: 04bf1a4090c535e3a1033ab9a8ef92068166461f
https://github.com/llvm/llvm-project/commit/04bf1a4090c535e3a1033ab9a8ef92068166461f
Author: Kojo Acquah <KoolJBlack at users.noreply.github.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M mlir/lib/Dialect/ArmNeon/Transforms/LowerContractionToSMMLAPattern.cpp
M mlir/test/Dialect/ArmNeon/lower-to-arm-neon.mlir
Log Message:
-----------
Update `LowerContractionToSMMLAPattern` to ignore matvec (#88288)
Patterns in `LowerContractionToSMMLAPattern` are designed to handle
vector-to-matrix multiplication but not matrix-to-vector. This leads to
the following error when processing `rhs` with rank < 2:
```
iree-compile: /usr/local/google/home/kooljblack/code/iree-build/llvm-project/tools/mlir/include/mlir/IR/BuiltinTypeInterfaces.h.inc:268: int64_t mlir::detail::ShapedTypeTrait<mlir::VectorType>::getDimSize(unsigned int) const [ConcreteType = mlir::VectorType]: Assertion `idx < getRank() && "invalid index for shaped type"' failed.
```
Updates the pattern to explicitly check the rhs rank and fail on cases
it cannot process.
Commit: c54afe5c33ca6159841d909fb8fe20e5d4e0069b
https://github.com/llvm/llvm-project/commit/c54afe5c33ca6159841d909fb8fe20e5d4e0069b
Author: higher-performance <113926381+higher-performance at users.noreply.github.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/docs/ReleaseNotes.rst
M clang/lib/AST/ParentMapContext.cpp
Log Message:
-----------
Fix quadratic slowdown in AST matcher parent map generation (#87824)
Avoids the need to linearly re-scan all seen parent nodes to check for
duplicates, which previously caused a slowdown for ancestry checks in
Clang AST matchers.
Fixes: #86881
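The general idea behind this kind of fix can be sketched as follows. This is an illustration of replacing a linear duplicate scan with a hash-set lookup, not the ParentMapContext code; `add_parent_linear` and `ParentList` are made-up names.

```python
def add_parent_linear(parents, node):
    # Old-style approach: scan every recorded parent before appending.
    # Each call is O(n), so recording n parents costs O(n^2) overall.
    if node not in parents:
        parents.append(node)

class ParentList:
    # Keeps insertion order in a list and membership in a set, so the
    # duplicate check is O(1) on average instead of a linear scan.
    def __init__(self):
        self.items = []
        self.seen = set()

    def add(self, node):
        if node in self.seen:
            return False
        self.seen.add(node)
        self.items.append(node)
        return True
```

Both versions produce the same ordered, duplicate-free list; only the cost of the duplicate check differs.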
Commit: f27f3697108470c3e995cf3cb454641c22ec1fa9
https://github.com/llvm/llvm-project/commit/f27f3697108470c3e995cf3cb454641c22ec1fa9
Author: Craig Topper <craig.topper at sifive.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
M llvm/test/CodeGen/RISCV/interrupt-attr-nocall.ll
M llvm/test/CodeGen/RISCV/interrupt-attr.ll
Log Message:
-----------
[RISCV] Remove interrupt handler special case from RISCVFrameLowering::determineCalleeSaves. (#88069)
This code was trying to save temporary argument registers in interrupt
handler functions that contain calls, with the exception that all FP
registers were saved, including the normally callee saved registers.
If all of the callees use an FP ABI and the interrupt handler doesn't
touch the normally callee saved FP registers, we don't need to save
them.
It doesn't appear that we need to special case functions with calls. The
normal callee saved register handling will already check each of the calls
and consider a register clobbered if the call doesn't explicitly say it is preserved.
All of the test changes are from the removal of the FP callee saved
registers. There are tests for interrupt handlers with F and D extension
that use ilp32 or lp64 ABIs that are not affected by this change. They
still save the FP callee saved registers as they should.
gcc appears to have a bug where the D extension being enabled with the
ilp32f or lp64f ABI does not save the FP callee saved regs. The callee
would only save/restore the lower 32 bits and clobber the upper bits.
LLVM saves the FP callee saved regs in this case and there is an
unchanged test for it.
The unnecessary save/restore was raised in this thread
https://discourse.llvm.org/t/has-bugs-when-optimizing-save-restore-csrs-by-changing-csr-xlen-f32-interrupt/78200/1
Commit: 86842e1f724fba5abae50ce438553895e69b8141
https://github.com/llvm/llvm-project/commit/86842e1f724fba5abae50ce438553895e69b8141
Author: Jun Wang <jwang86 at yahoo.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/include/clang/Driver/Options.td
M clang/lib/Driver/ToolChains/AMDGPU.cpp
M clang/test/Driver/amdgpu-features.c
M llvm/lib/Target/AMDGPU/AMDGPU.td
M llvm/lib/Target/AMDGPU/GCNSubtarget.h
M llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
A llvm/test/CodeGen/AMDGPU/insert_waitcnt_for_precise_memory.ll
Log Message:
-----------
[AMDGPU] New clang option for emitting a waitcnt instruction after each memory instruction (#79236)
This patch introduces a new command-line option for clang, namely,
amdgpu-precise-mem-op (or precise-memory in the backend). When this option is specified, a waitcnt
instruction is generated after each memory load/store instruction. The
counter values are always 0, but which counters are involved depends on
the memory instruction.
---------
Co-authored-by: Jun Wang <jun.wang7 at amd.com>
Commit: 4d80dff819d1164775d0d55fc68bffedb90ba53c
https://github.com/llvm/llvm-project/commit/4d80dff819d1164775d0d55fc68bffedb90ba53c
Author: Aaron Ballman <aaron at aaronballman.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/lib/AST/Interp/FunctionPointer.h
Log Message:
-----------
int -> uintptr_t to silence diagnostics
'int' may not be sufficiently large to store a pointer representation
anyway, so this is also a correctness fix.
Commit: 21009f466ece9f21b18e1bb03bd74b566188bae5
https://github.com/llvm/llvm-project/commit/21009f466ece9f21b18e1bb03bd74b566188bae5
Author: martinboehme <mboehme at google.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
M clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
M clang/lib/Analysis/FlowSensitive/Transfer.cpp
M clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
M clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
M clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
Log Message:
-----------
[clang][dataflow] Propagate locations from result objects to initializers. (#87320)
Previously, we were propagating storage locations the other way around,
i.e. from initializers to result objects, using `RecordValue::getLoc()`.
This gave the wrong behavior in some cases -- see the newly added or
fixed tests in this patch.
In addition, this patch now unblocks removing the `RecordValue` class
entirely, as we no longer need `RecordValue::getLoc()`.
With this patch, the test `TransferTest.DifferentReferenceLocInJoin`
started to fail because the framework now always uses the same storage
location for a `MaterializeTemporaryExpr`, meaning that the code under
test no longer set up the desired state where a variable of reference
type is mapped to two different storage locations in environments being
joined. Rather than trying to modify this test to set up the test
condition again, I have chosen to replace the test with an equivalent
test in DataflowEnvironmentTest.cpp that sets up the test condition
directly; because this test is more direct, it will also be less brittle
in the face of future changes.
Commit: b9a3551c905573df456ee52fa1051e49fa956c65
https://github.com/llvm/llvm-project/commit/b9a3551c905573df456ee52fa1051e49fa956c65
Author: Kevin P. Neal <kevin.neal at sas.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/unittests/Bitcode/BitReaderTest.cpp
Log Message:
-----------
[FPEnv][BitcodeReader] Correct strictfp test.
Correct a strictfp test to follow the rules documented in the LangRef:
https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics
This test needed the strictfp attribute added to a function definition.
Test changes verified with D146845.
Commit: c1d3f39ae98535777c957aab3611d2abc97b2815
https://github.com/llvm/llvm-project/commit/c1d3f39ae98535777c957aab3611d2abc97b2815
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/test/Transforms/InstSimplify/known-non-zero.ll
Log Message:
-----------
[ValueTracking] Add tests for `shufflevector` in `isKnownNonZero`
Commit: 87528bfefbb50ed6560b9b8482fc7c9f86ca34cd
https://github.com/llvm/llvm-project/commit/87528bfefbb50ed6560b9b8482fc7c9f86ca34cd
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Analysis/ValueTracking.cpp
M llvm/test/Transforms/InstSimplify/known-non-zero.ll
Log Message:
-----------
[ValueTracking] Add support for `shufflevector` in `isKnownNonZero`
Shuffles don't modify the data, so if all elements that end up in the
destination are non-zero the result is non-zero.
Closes #87702
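The rule stated above can be sketched with per-element flags. This is a simplified model that assumes a shuffle mask with only valid indices (no undef or poison mask elements); the function names are hypothetical and not the ValueTracking API.

```python
def shuffle(lhs, rhs, mask):
    # Model of shufflevector: indices address the concatenation lhs + rhs.
    src = lhs + rhs
    return [src[i] for i in mask]

def shuffle_known_nonzero(lhs_nonzero, rhs_nonzero, mask):
    # Each flag is True when that source element is proven non-zero.
    # The result is known non-zero only if every selected element is.
    flags = lhs_nonzero + rhs_nonzero
    return all(flags[i] for i in mask)
```

Elements that the mask never selects do not matter, so a source vector with a possibly-zero lane can still yield a known non-zero result.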
Commit: 8a28b9b8ec1686426a4b43c8431570eaa1da77d9
https://github.com/llvm/llvm-project/commit/8a28b9b8ec1686426a4b43c8431570eaa1da77d9
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/test/Transforms/InstSimplify/known-non-zero.ll
Log Message:
-----------
[ValueTracking] Add tests for `insertelement` in `isKnownNonZero`; NFC
Commit: 9c545a14c09051b011358854655c1f466d656e79
https://github.com/llvm/llvm-project/commit/9c545a14c09051b011358854655c1f466d656e79
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Analysis/ValueTracking.cpp
M llvm/test/Transforms/InstSimplify/known-non-zero.ll
Log Message:
-----------
[ValueTracking] Add support for `insertelement` in `isKnownNonZero`
Inserts don't modify the data, so if all elements that end up in the
destination are non-zero the result is non-zero.
Closes #87703
Commit: 195d278d502308655edb1e9ff1c6f0c9256d0d15
https://github.com/llvm/llvm-project/commit/195d278d502308655edb1e9ff1c6f0c9256d0d15
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/test/Transforms/InstSimplify/icmp.ll
Log Message:
-----------
[ValueTracking] Add tests for `xor`/`disjoint or` in `getInvertibleOperands`; NFC
Commit: 0c57a2e4b4e5a6e5dda78a313fc8d8e3c91797f5
https://github.com/llvm/llvm-project/commit/0c57a2e4b4e5a6e5dda78a313fc8d8e3c91797f5
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Analysis/ValueTracking.cpp
M llvm/test/Transforms/InstSimplify/icmp.ll
Log Message:
-----------
[ValueTracking] Add support for `xor`/`disjoint or` in `getInvertibleOperands`
This strengthens our `isKnownNonEqual` logic with some fairly
trivial cases.
Proofs: https://alive2.llvm.org/ce/z/4pxRTj
Closes #87705
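The property being relied on can be checked by brute force on small bit widths. The sketch below assumes that "invertible" means: with one operand shared, the results are equal exactly when the other operands are equal. The 5-bit width and function names are arbitrary choices for illustration.

```python
BITS = 5
VALUES = range(1 << BITS)

def xor_is_invertible():
    # X ^ A == X ^ B exactly when A == B.
    return all(((x ^ a) == (x ^ b)) == (a == b)
               for x in VALUES for a in VALUES for b in VALUES)

def disjoint_or_is_invertible():
    # Same claim for `or`, restricted to operands sharing no set bits with X.
    return all(((x | a) == (x | b)) == (a == b)
               for x in VALUES for a in VALUES for b in VALUES
               if x & a == 0 and x & b == 0)

def plain_or_is_invertible():
    # Without the disjoint restriction the claim fails, e.g. 1 | 0 == 1 | 1.
    return all(((x | a) == (x | b)) == (a == b)
               for x in VALUES for a in VALUES for b in VALUES)
```

The last function returns False, which shows why the `disjoint` flag is needed for the `or` case.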
Commit: 2646790155f73d6cfb28ec0ee472056740e4658e
https://github.com/llvm/llvm-project/commit/2646790155f73d6cfb28ec0ee472056740e4658e
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/test/Transforms/InstSimplify/icmp.ll
Log Message:
-----------
[ValueTracking] Add tests for `xor`/`disjoint or` in `isKnownNonZero`; NFC
Commit: 81cdd35c0c8db22bfdd1f06cb2118d17fd99fc07
https://github.com/llvm/llvm-project/commit/81cdd35c0c8db22bfdd1f06cb2118d17fd99fc07
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Analysis/ValueTracking.cpp
M llvm/test/Transforms/InstSimplify/icmp.ll
Log Message:
-----------
[ValueTracking] Add support for `xor`/`disjoint or` in `isKnownNonZero`
Handles cases like `X ^ Y == X` / `X disjoint| Y == X`.
Both of these cases have identical logic to the existing `add` case,
so just converting the `add` code to a more general helper.
Proofs: https://alive2.llvm.org/ce/z/Htm7pe
Closes #87706
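One reading of the examples above is that, when Y is known non-zero, `X ^ Y` and `X disjoint| Y` can never equal X, just as `X + Y` cannot. The following sketch checks that reading exhaustively on 6-bit values; the width and function names are assumptions made for illustration.

```python
BITS = 6
MASK = (1 << BITS) - 1
ALL = range(1 << BITS)
NONZERO = range(1, 1 << BITS)

def add_differs_from_x():
    # Existing case: wrapping X + Y != X whenever Y != 0.
    return all(((x + y) & MASK) != x for x in ALL for y in NONZERO)

def xor_differs_from_x():
    # New case: X ^ Y != X whenever Y != 0.
    return all((x ^ y) != x for x in ALL for y in NONZERO)

def disjoint_or_differs_from_x():
    # New case: X | Y != X whenever Y != 0 and X, Y share no set bits.
    return all((x | y) != x for x in ALL for y in NONZERO if x & y == 0)
```

All three checks succeed, which matches the message's point that the `add` logic generalizes to a shared helper.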
Commit: 2b00a73f62605fcaeaedd358ba8b55fad06571aa
https://github.com/llvm/llvm-project/commit/2b00a73f62605fcaeaedd358ba8b55fad06571aa
Author: Alexey Bataev <a.bataev at outlook.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
M llvm/test/Transforms/SLPVectorizer/AArch64/extractelements-to-shuffle.ll
M llvm/test/Transforms/SLPVectorizer/X86/ext-int-reduced-not-operand.ll
M llvm/test/Transforms/SLPVectorizer/X86/gather-move-out-of-loop.ll
M llvm/test/Transforms/SLPVectorizer/X86/gathered-delayed-nodes-with-reused-user.ll
M llvm/test/Transforms/SLPVectorizer/X86/non-scheduled-inst-reused-as-last-inst.ll
M llvm/test/Transforms/SLPVectorizer/X86/reorder_with_external_users.ll
M llvm/test/Transforms/SLPVectorizer/alternate-non-profitable.ll
Log Message:
-----------
[SLP]Buildvector for alternate instructions with non-profitable gather operands.
If the operands of the potentially alternate node are going to produce
buildvector sequences, which result in more instructions than the
original code, then such instructions should not be vectorized as an
alternate node; it is better to end up with the buildvector node.
Left column - experimental, Right - reference.
Metric: size..text
Program results results0 diff
test-suite :: SingleSource/Benchmarks/Adobe-C++/loop_unroll.test 413680.00 416272.00 0.6%
test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 12351788.00 12354844.00 0.0%
test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test 664901.00 664949.00 0.0%
test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test 664901.00 664949.00 0.0%
test-suite :: External/SPEC/CFP2017rate/511.povray_r/511.povray_r.test 1171371.00 1171355.00 -0.0%
test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test 1036396.00 1036284.00 -0.0%
test-suite :: MultiSource/Benchmarks/MiBench/consumer-jpeg/consumer-jpeg.test 111280.00 111248.00 -0.0%
test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test 1392113.00 1391361.00 -0.1%
test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test 1392113.00 1391361.00 -0.1%
test-suite :: MultiSource/Benchmarks/Prolangs-C/TimberWolfMC/timberwolfmc.test 281676.00 281452.00 -0.1%
test-suite :: MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes.test 3025.00 3019.00 -0.2%
test-suite :: MultiSource/Benchmarks/Prolangs-C/plot2fig/plot2fig.test 6351.00 6335.00 -0.3%
Metric: SLP.NumVectorInstructions
Program results results0 diff
test-suite :: MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes.test 15.00 16.00 6.7%
test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test 1703.00 1707.00 0.2%
test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test 1703.00 1707.00 0.2%
test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 26241.00 26239.00 -0.0%
test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test 11761.00 11754.00 -0.1%
test-suite :: MultiSource/Benchmarks/Prolangs-C/TimberWolfMC/timberwolfmc.test 824.00 822.00 -0.2%
test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test 5668.00 5654.00 -0.2%
test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test 5668.00 5654.00 -0.2%
test-suite :: External/SPEC/CINT2017rate/502.gcc_r/502.gcc_r.test 792.00 790.00 -0.3%
test-suite :: External/SPEC/CINT2017speed/602.gcc_s/602.gcc_s.test 792.00 790.00 -0.3%
test-suite :: MultiSource/Benchmarks/FreeBench/pifft/pifft.test 1389.00 1384.00 -0.4%
test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test 596.00 590.00 -1.0%
test-suite :: MultiSource/Benchmarks/Prolangs-C/plot2fig/plot2fig.test 6.00 5.00 -16.7%
Metric: exec_time
Program results results0 diff
test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 99.14 100.00 0.9%
Other changes are not significant (less than 0.1%, with exec time under
5 secs).
SingleSource/Benchmarks/Adobe-C++/loop_unroll - some small patterns
remain scalar, smaller code.
External/SPEC/CFP2017rate/526.blender_r/526.blender_r - many small
changes, some extra stores get vectorized.
External/SPEC/CINT2017speed/625.x264_s/625.x264_s
External/SPEC/CINT2017rate/525.x264_r/525.x264_r
x264 has one change in a loop body, in function ssim_end4: some code
remains scalar, resulting in smaller code size.
External/SPEC/CFP2017rate/511.povray_r/511.povray_r - some extra code
gets vectorized, looks like some other patterns were matched.
MultiSource/Benchmarks/7zip/7zip-benchmark - extra stores were
vectorized (looks like the graphs become profitable)
MultiSource/Benchmarks/MiBench/consumer-jpeg/consumer-jpeg - small
changes in vectorized code (some small part remains scalar).
External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r
External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s
Many changes caused by the fact that the code of one function becomes
smaller (ConvertLCHabToRGB) and this function gets inlined after that.
MultiSource/Benchmarks/Prolangs-C/TimberWolfMC/timberwolfmc - some small
changes here and there, some extra code is vectorized, some remains
scalar (2 x vectors)
MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes - emits 2 scalars
+ 2 insertelems instead of insert, broadcast, alt code (3 instructions,
total 5 insts)
MultiSource/Benchmarks/Prolangs-C/plot2fig/plot2fig - small graph
becomes profitable and gets vectorized.
External/SPEC/CINT2017rate/502.gcc_r/502.gcc_r
External/SPEC/CINT2017speed/602.gcc_s/602.gcc_s
Some small graph becomes profitable and gets vectorized.
MultiSource/Benchmarks/FreeBench/pifft/pifft - no changes in final code.
Reviewers: RKSimon, dtcxzyw
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/84978
Commit: 0a1317564a6b437760d96f0a227a3c910875428d
https://github.com/llvm/llvm-project/commit/0a1317564a6b437760d96f0a227a3c910875428d
Author: Mark de Wever <koraq at xs4all.nl>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M libcxx/include/CMakeLists.txt
M libcxx/include/__chrono/leap_second.h
M libcxx/include/__chrono/time_zone_link.h
M libcxx/include/__locale
M libcxx/include/__stop_token/stop_callback.h
A libcxx/include/__utility/private_constructor_tag.h
M libcxx/include/module.modulemap
M libcxx/src/CMakeLists.txt
R libcxx/src/include/tzdb/leap_second_private.h
R libcxx/src/include/tzdb/time_zone_link_private.h
M libcxx/src/locale.cpp
M libcxx/src/tzdb.cpp
A libcxx/test/libcxx/utilities/utility/private_constructor_tag.compile.pass.cpp
M libcxx/test/std/time/time.zone/time.zone.leap/assign.copy.pass.cpp
M libcxx/test/std/time/time.zone/time.zone.leap/cons.copy.pass.cpp
M libcxx/test/std/time/time.zone/time.zone.leap/members/date.pass.cpp
M libcxx/test/std/time/time.zone/time.zone.leap/members/value.pass.cpp
M libcxx/test/std/time/time.zone/time.zone.leap/nonmembers/comparison.pass.cpp
M libcxx/test/support/test_chrono_leap_second.h
M libcxx/utils/generate_iwyu_mapping.py
Log Message:
-----------
[libc++] Adds a global private constructor tag. (#87920)
This removes the similar tags used in the chrono tzdb implementation.
Fixes: https://github.com/llvm/llvm-project/issues/85432
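A minimal sketch of the private-constructor-tag idiom the title refers to, for readers unfamiliar with it. The commit message does not show the tag's definition, so every name here (`detail::private_constructor_tag`, `leap_entry`, `make_leap_entry`) is hypothetical and not the libc++ spelling.

```cpp
#include <cassert>
#include <type_traits>

namespace detail {
// Hypothetical stand-in for a shared tag type. User code is not meant to
// name it, so a constructor taking it is effectively private.
struct private_constructor_tag {};
} // namespace detail

class leap_entry {
public:
  // Public, but only callable by code that can spell the tag type.
  explicit constexpr leap_entry(detail::private_constructor_tag, int value)
      : value_(value) {}
  constexpr int value() const { return value_; }

private:
  int value_;
};

// Library-side factory: the only intended place where the tag is created.
constexpr leap_entry make_leap_entry(int value) {
  return leap_entry(detail::private_constructor_tag{}, value);
}

// Without the tag the class cannot be constructed from its payload alone.
static_assert(!std::is_constructible<leap_entry, int>::value,
              "leap_entry must not be constructible without the tag");

// Small usage check.
void check_private_constructor_tag() {
  assert(make_leap_entry(27).value() == 27);
}
```

Sharing one tag type across classes, rather than declaring one per class, is presumably what lets the per-class tag headers listed under "Changed paths" be removed.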
Commit: f81879c0f70ee5a1cf1d5b716dfd49d1a271cc2d
https://github.com/llvm/llvm-project/commit/f81879c0f70ee5a1cf1d5b716dfd49d1a271cc2d
Author: Joseph Huber <huberjn at outlook.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M openmp/libomptarget/DeviceRTL/CMakeLists.txt
M openmp/libomptarget/DeviceRTL/src/LibC.cpp
Log Message:
-----------
[Libomptarget] Add RPC-based printf implementation for OpenMP #85638
Summary:
Relanding after reverting, only applies to AMDGPU for now.
This patch adds an implementation of printf that's provided by the GPU
C library runtime. This printf is currently implemented using the same
wrapper handling that OpenMP sets up. This will be removed once we have
proper varargs support.
This printf differs from the one CUDA offers in that it is synchronous
and uses a finite size. Additionally we support pretty much every
format specifier except the %n option.
Depends on #85331
Commit: fad14707b73d6387e6276507e1c5726e67f08cd6
https://github.com/llvm/llvm-project/commit/fad14707b73d6387e6276507e1c5726e67f08cd6
Author: Joseph Huber <huberjn at outlook.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M libc/docs/gpu/building.rst
Log Message:
-----------
[libc] Add note to use `LIBC_GPU_BUILD=ON` as another form
Summary:
This is a shorthand to enable GPU support so it should be listed in the
docs.
Commit: ca6b8469c16edfe1713e9050dca3cd68bd585410
https://github.com/llvm/llvm-project/commit/ca6b8469c16edfe1713e9050dca3cd68bd585410
Author: Fangrui Song <i at maskray.me>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M lld/ELF/SyntheticSections.cpp
Log Message:
-----------
[ELF] Avoid unneeded config->isLE and config->wordsize. NFC
Commit: e3ef4612c18845876cda9a13c3435e102f74a3aa
https://github.com/llvm/llvm-project/commit/e3ef4612c18845876cda9a13c3435e102f74a3aa
Author: shamithoke <152091883+shamithoke at users.noreply.github.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Target/X86/X86ISelLowering.cpp
M llvm/test/CodeGen/X86/bitreverse.ll
M llvm/test/CodeGen/X86/vector-bitreverse.ll
Log Message:
-----------
Perform bitreverse using AVX512 GFNI for i32 and i64. (#81764)
Currently, the lowering operation for bitreverse using Intel AVX512 GFNI only supports byte vectors.
Extend the operation to i32 and i64.
---------
Co-authored-by: shami <shami_thoke at yahoo.com>
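As a scalar illustration of why a byte-granular bit reversal extends naturally to wider integers: reversing an i32 or i64 can be decomposed into reversing the bits inside each byte and then reversing the byte order. This is one common decomposition, assumed here for illustration; the message does not state the exact instruction sequence the patch emits, and the function names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the bits inside a single byte (the per-byte step).
inline std::uint8_t reverse_byte(std::uint8_t b) {
  unsigned v = b;
  v = ((v & 0xF0u) >> 4) | ((v & 0x0Fu) << 4); // swap nibbles
  v = ((v & 0xCCu) >> 2) | ((v & 0x33u) << 2); // swap bit pairs
  v = ((v & 0xAAu) >> 1) | ((v & 0x55u) << 1); // swap adjacent bits
  return static_cast<std::uint8_t>(v);
}

// Reverse all 32 bits: reverse each byte, then mirror the byte order.
inline std::uint32_t bitreverse32(std::uint32_t x) {
  std::uint32_t result = 0;
  for (int i = 0; i < 4; ++i) {
    std::uint8_t byte = static_cast<std::uint8_t>(x >> (8 * i));
    result |= static_cast<std::uint32_t>(reverse_byte(byte)) << (8 * (3 - i));
  }
  return result;
}

// Same idea for 64 bits, with eight bytes.
inline std::uint64_t bitreverse64(std::uint64_t x) {
  std::uint64_t result = 0;
  for (int i = 0; i < 8; ++i) {
    std::uint8_t byte = static_cast<std::uint8_t>(x >> (8 * i));
    result |= static_cast<std::uint64_t>(reverse_byte(byte)) << (8 * (7 - i));
  }
  return result;
}

// Small usage check: bit 0 moves to the top bit, and reversing twice is a no-op.
void check_bitreverse() {
  assert(bitreverse32(1u) == 0x80000000u);
  assert(bitreverse32(bitreverse32(0x12345678u)) == 0x12345678u);
}
```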
Commit: 7549b45825a05fc24fcdbacf006461165aa042cb
https://github.com/llvm/llvm-project/commit/7549b45825a05fc24fcdbacf006461165aa042cb
Author: martinboehme <mboehme at google.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
M clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
M clang/lib/Analysis/FlowSensitive/Transfer.cpp
M clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
M clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
M clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
Log Message:
-----------
Revert "[clang][dataflow] Propagate locations from result objects to initializers." (#88315)
Reverts llvm/llvm-project#87320
This is causing buildbots to fail because
`isOriginalRecordConstructor()` is now unused.
Commit: a6d1366b736cad85b3bb9fbdda340e07488d6cde
https://github.com/llvm/llvm-project/commit/a6d1366b736cad85b3bb9fbdda340e07488d6cde
Author: erichkeane <ekeane at nvidia.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/lib/Parse/ParseOpenACC.cpp
Log Message:
-----------
[NFC] Remove a pair of incorrect comments from ParseOpenACC
We attempt to continue parsing, but the comment says the opposite. Just
remove the inaccurate comments in this patch.
Commit: b3792ae42a4adda5cb51d53f3d6a4b9b025b11fd
https://github.com/llvm/llvm-project/commit/b3792ae42a4adda5cb51d53f3d6a4b9b025b11fd
Author: Xing Xue <xingxue at outlook.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M openmp/runtime/test/lit.cfg
Log Message:
-----------
[OpenMP][AIX] Fix test config for AIX (#88272)
This patch fixes the test config so that it works for
`tasking/omp50_taskdep_depobj.c` which uses different flags to test with
compiler's `omp.h`.
* set test environment variable `OBJECT_MODE` to `64` if it is set
explicitly to `64` in the AIX environment. `OBJECT_MODE` defaults to
`32` and is recognized by AIX compilers and toolchain. In this way, we
don't need to set `-m64` for all compiler flags for 64-bit mode
* add option `-Wl,-bmaxdata` to 32-bit `test_openmp_flags` used by
`tasking/omp50_taskdep_depobj.c`
Commit: a12836647e08c4ad203b9834ac55892fa0b9f2d3
https://github.com/llvm/llvm-project/commit/a12836647e08c4ad203b9834ac55892fa0b9f2d3
Author: David Pagan <dave.pagan at amd.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/include/clang/AST/StmtOpenMP.h
M clang/lib/AST/StmtOpenMP.cpp
M clang/lib/CodeGen/CGOpenMPRuntime.cpp
M clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
M clang/lib/CodeGen/CGStmtOpenMP.cpp
M clang/lib/Sema/SemaOpenMP.cpp
M clang/lib/Serialization/ASTReaderStmt.cpp
M clang/lib/Serialization/ASTWriterStmt.cpp
M clang/test/OpenMP/nvptx_target_teams_generic_loop_codegen.cpp
M clang/test/OpenMP/nvptx_target_teams_generic_loop_generic_mode_codegen.cpp
M clang/test/OpenMP/target_teams_generic_loop_codegen.cpp
A clang/test/OpenMP/target_teams_generic_loop_codegen_as_distribute.cpp
A clang/test/OpenMP/target_teams_generic_loop_codegen_as_parallel_for.cpp
M clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp
M clang/test/OpenMP/target_teams_generic_loop_private_codegen.cpp
M clang/test/OpenMP/teams_generic_loop_codegen-1.cpp
M clang/test/OpenMP/teams_generic_loop_codegen.cpp
M clang/test/OpenMP/teams_generic_loop_collapse_codegen.cpp
M clang/test/OpenMP/teams_generic_loop_private_codegen.cpp
M clang/test/OpenMP/teams_generic_loop_reduction_codegen.cpp
Log Message:
-----------
[OpenMP][CodeGen] Improved codegen for combined loop directives (#87278)
IR for 'target teams loop' is now dependent on suitability of associated
loop-nest.
If a loop-nest:
- does not contain a function call, or
- the -fopenmp-assume-no-nested-parallelism has been specified,
- or the call is to an OpenMP API AND
- does not contain nested loop bind(parallel) directives
then it can be emitted as 'target teams distribute parallel for', which
is the current default. Otherwise, it is emitted as 'target teams
distribute'.
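The decision described above can be read as a small predicate. The sketch below encodes one reading of the list: a function call blocks the parallel form unless it is an OpenMP API call or the assume-no-nested-parallelism flag is given, and a nested `loop bind(parallel)` always blocks it. The struct, enum, and function names are hypothetical and are not taken from the clang sources.

```cpp
#include <cassert>

// Hypothetical summary of what was found in the loop nest.
struct LoopNestInfo {
  bool hasFunctionCall;           // any call inside the loop nest
  bool allCallsAreOpenMPApi;      // every such call targets the OpenMP API
  bool hasNestedLoopBindParallel; // nested 'loop bind(parallel)' present
};

enum class TeamsLoopLowering { DistributeParallelFor, Distribute };

// One reading of the rule in the message, under the assumptions stated above.
TeamsLoopLowering chooseLowering(const LoopNestInfo &nest,
                                 bool assumeNoNestedParallelism) {
  bool callsAreHarmless = !nest.hasFunctionCall ||
                          assumeNoNestedParallelism ||
                          nest.allCallsAreOpenMPApi;
  if (callsAreHarmless && !nest.hasNestedLoopBindParallel)
    return TeamsLoopLowering::DistributeParallelFor;
  return TeamsLoopLowering::Distribute;
}

// Small usage check: a call-free nest takes the parallel form.
void check_choose_lowering() {
  LoopNestInfo plain{false, false, false};
  assert(chooseLowering(plain, false) ==
         TeamsLoopLowering::DistributeParallelFor);
}
```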
Added debug output indicating how 'target teams loop' was emitted. Flag
is -mllvm -debug-only=target-teams-loop-codegen
Added LIT tests explicitly verifying 'target teams loop' emitted as a
parallel loop and a distribute loop.
Updated other 'loop' related tests as needed to reflect change in IR.
- These updates account for most of the changed files and
additions/deletions.
Commit: d347235bddbeba2a72d94ebe9d8f98dc675c3776
https://github.com/llvm/llvm-project/commit/d347235bddbeba2a72d94ebe9d8f98dc675c3776
Author: Christopher Di Bella <cjdb at google.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/lib/Driver/ToolChains/Flang.cpp
Log Message:
-----------
[Flang] responds to Clang Tidy feedback (#87847)
Line 267: performance-unnecessary-copy-initialization
Line 592: readability-container-size-empty
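For context, the two clang-tidy checks named above target patterns like the following. The actual lines 267 and 592 of Flang.cpp are not shown in the message, so this is a generic before/after sketch with hypothetical function names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper that returns a const reference.
const std::string &first_name(const std::vector<std::string> &names) {
  return names.front();
}

// Before: the kind of code the two checks complain about.
bool has_first_name_before(const std::vector<std::string> &names) {
  if (names.size() == 0) // readability-container-size-empty
    return false;
  // performance-unnecessary-copy-initialization: copies a string that is
  // only read afterwards.
  const std::string first = first_name(names);
  return first.size() > 0; // readability-container-size-empty
}

// After: bind a const reference and use empty().
bool has_first_name_after(const std::vector<std::string> &names) {
  if (names.empty())
    return false;
  const std::string &first = first_name(names);
  return !first.empty();
}

// Small usage check: both versions agree, only the copy and spelling differ.
void check_tidy_rewrite() {
  std::vector<std::string> names{"flang"};
  assert(has_first_name_before(names) == has_first_name_after(names));
}
```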
Commit: 05093e243859a371f96ffa1c320a4b51579c3da7
https://github.com/llvm/llvm-project/commit/05093e243859a371f96ffa1c320a4b51579c3da7
Author: Farzon Lotfi <1802579+farzonl at users.noreply.github.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.cpp
M llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.h
M llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
A llvm/test/CodeGen/SPIRV/hlsl-intrinsics/all.ll
Log Message:
-----------
[Spirv][HLSL] Add OpAll lowering and float vec support (#87952)
The main point of this change was to add support for HLSL's all
intrinsic.
In the process of doing that I found a few issues around creating an
`OpConstantComposite` via `buildZerosVal`.
First the current code didn't support floats so the process of adding
`buildZerosValF` meant I needed a
float version of `getOrCreateIntConstVector`. After doing so I renamed
both versions to `getOrCreateConstVector`. That meant I needed to create
a float type version of `getOrCreateIntCompositeOrNull`. Luckily the
type information was low for this function so I was able to split it out
into a helper and rename `getOrCreateIntCompositeOrNull` to
`getOrCreateCompositeOrNull`. With the exception of type handling
differences of the code and Null vs 0 Constant Op codes these functions
should be identical.
To handle scalar floats I could not use `buildConstantFP` like this PR
did:
https://github.com/llvm/llvm-project/commit/0a2aaab5aba46#diff-733a189c5a8c3211f3a04fd6e719952a3fa231eadd8a7f11e6ecf1e584d57411R1603
because that would create too many superfluous registers (which causes
problems in the validator). Instead, I had to create a float version of
`getOrCreateConstInt` which I called `getOrCreateConstFP`.
There were similar problems with doing it like this:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/SPIRV/SPIRVBuiltins.cpp#L1540.
`buildZerosValF` also has a use of a function `getZeroFP`. This is
because half, float, and double scalar values of 0 would collide in
`SPIRVDuplicatesTracker<Constant> CT` if you use `APFloat(0.0f)`.
`getOrCreateConstFP` needed its own version of `getOrCreateConstIntReg`
which I called `getOrCreateConstFloatReg`. The one difference in this
function is `getOrCreateConstFloatReg` returns a bit width so we don't
have to call `getScalarOrVectorBitWidth` twice, i.e. when it is used again
in `getOrCreateConstFP` for `OpConstantF` `addNumImm`.
`getOrCreateConstFloatReg` needed an `assignFloatTypeToVReg` helper
which called a `getOrCreateSPIRVFloatType` helper. There was no
equivalent IntegerType::get for floats so I handled this with a switch
statement on bit widths to get the right LLVM float type.
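The bit-width switch mentioned above might look roughly like this. It is a standalone sketch that uses a hypothetical `FloatKind` enum in place of real LLVM types, and it assumes only the 16, 32, and 64-bit widths (half, float, double) named earlier in the message.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the LLVM float types (half, float, double).
enum class FloatKind { Half, Float, Double };

// Map a bit width to a float kind; any other width is rejected.
FloatKind floatKindForBitWidth(unsigned bitWidth) {
  switch (bitWidth) {
  case 16:
    return FloatKind::Half;
  case 32:
    return FloatKind::Float;
  case 64:
    return FloatKind::Double;
  default:
    throw std::invalid_argument("unsupported float bit width");
  }
}

// Small usage check.
void check_float_kind() {
  assert(floatKindForBitWidth(32) == FloatKind::Float);
}
```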
Finally, there is the use of `bool ZeroAsNull = STI.isOpenCLEnv();` This
is partly a cosmetic change. When Zeros are treated as nulls, we don't
create `OpConstantComposite` vectors which is something we do in the
DXC's SPIRV backend. The DXC SPIRV backend also does not use
`OpConstantNull`. Finally, I needed a means to test the behavior of the
OpConstantNull and `OpConstantComposite` changes and this was one way I
could do that via the same tests.
Commit: c258f573981336cd9f87f89e59c6c2117e5d44ec
https://github.com/llvm/llvm-project/commit/c258f573981336cd9f87f89e59c6c2117e5d44ec
Author: Fangrui Song <i at maskray.me>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M lld/ELF/SyntheticSections.cpp
M lld/ELF/SyntheticSections.h
M lld/ELF/Writer.cpp
M lld/ELF/Writer.h
Log Message:
-----------
[ELF] Move createSyntheticSections from Writer.cpp to SyntheticSections.cpp. NFC
SyntheticSections.cpp is more appropriate. This change enables
elimination of many explicit template instantiations.
Due to `make<SymbolTableSection<ELFT>>(*strtab)` in Arch/ARM.cpp,
we do not remove explicit template instantiations for SymbolTableSection.
Commit: 8cfa72ade9f2f7df81a008efea84f833b73494b9
https://github.com/llvm/llvm-project/commit/8cfa72ade9f2f7df81a008efea84f833b73494b9
Author: Nick Desaulniers <ndesaulniers at google.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M libc/hdr/CMakeLists.txt
Log Message:
-----------
[libc] fix typo in hdr/CMakeLists
Fixes #87896
Commit: fb771fe315654231f613a5501ebd538f036c78b6
https://github.com/llvm/llvm-project/commit/fb771fe315654231f613a5501ebd538f036c78b6
Author: Jeff Niu <jeff at modular.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M mlir/lib/Bytecode/Writer/IRNumbering.cpp
Log Message:
-----------
[mlir] Slightly optimize bytecode op numbering (#88310)
If the bytecode encoding supports properties, then the dictionary
attribute is always the raw dictionary attribute of the operation,
regardless of what it contains. Otherwise, get the dictionary attribute
from the op: if the op does not have properties, then it returns the raw
dictionary, otherwise it returns the combined inherent and discardable
attributes.
Commit: af7c196fb8d10f58a704b5a8d142feacf2f0236d
https://github.com/llvm/llvm-project/commit/af7c196fb8d10f58a704b5a8d142feacf2f0236d
Author: Chelsea Cassanova <chelsea_cassanova at apple.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M lldb/include/lldb/API/SBDebugger.h
M lldb/include/lldb/lldb-enumerations.h
M lldb/test/API/functionalities/diagnostic_reporting/TestDiagnosticReporting.py
M lldb/test/API/functionalities/progress_reporting/TestProgressReporting.py
M lldb/test/API/functionalities/progress_reporting/clang_modules/TestClangModuleBuildProgress.py
M lldb/test/API/macosx/rosetta/TestRosetta.py
Log Message:
-----------
[lldb][sbdebugger] Move SBDebugger Broadcast bit enum into lldb-enumerations.h (#87409)
When the `eBroadcastBitProgressCategory` bit was originally added to
Debugger.h and SBDebugger.h, each corresponding bit was added in order
of the other bits that were previously there. Since `Debugger.h` has an
enum bit that `SBDebugger.h` does not, this meant that their offsets did
not match.
Instead of trying to keep the bit offsets in sync between the two, it's
preferable to just move SBDebugger's enum into the main enumerations
header and use the bits from there. This also requires that API tests using the bits from SBDebugger update their usage.
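To illustrate the offset mismatch described above: when one header carries an extra enumerator, every bit declared after it lands at a different position in the two headers. The enumerator names and bit positions below are hypothetical and are not the real lldb values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical internal header: has one extra bit that the API header lacks.
namespace internal_header {
enum : std::uint32_t {
  eBitProgress = 1u << 0,
  eBitWarning = 1u << 1,
  eBitInternalOnly = 1u << 2,
  eBitProgressCategory = 1u << 3,
};
} // namespace internal_header

// Hypothetical API header: same names, declared "in order", so the last bit
// ends up one position lower.
namespace api_header {
enum : std::uint32_t {
  eBitProgress = 1u << 0,
  eBitWarning = 1u << 1,
  eBitProgressCategory = 1u << 2,
};
} // namespace api_header

// True if the event mask contains the given bit.
inline bool listens_for(std::uint32_t event_mask, std::uint32_t bit) {
  return (event_mask & bit) != 0;
}

// Small usage check: a client using the API value misses the internal event
// and matches an unrelated bit instead.
void check_bit_mismatch() {
  assert(!listens_for(internal_header::eBitProgressCategory,
                      api_header::eBitProgressCategory));
  assert(listens_for(internal_header::eBitInternalOnly,
                     api_header::eBitProgressCategory));
}
```

Keeping a single definition in one shared header removes the possibility of the two drifting apart, which is the approach the message describes.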
Commit: 2fdfea088c8d78119b74116b94bc6729ce0e3efe
https://github.com/llvm/llvm-project/commit/2fdfea088c8d78119b74116b94bc6729ce0e3efe
Author: Stanislav Mekhanoshin <rampitec at users.noreply.github.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Target/AMDGPU/SIInstructions.td
M llvm/lib/Target/AMDGPU/SIRegisterInfo.td
Log Message:
-----------
[AMDGPU] Add v2i32 to the VS_64 types. NFCI. (#88318)
I am trying to use VOP3Inst with an intrinsic taking a v2i32 operand and it
fails to create a pattern without it.
Commit: 9f6d08f2566a26144ea1753f80aebb1f2ecfdc63
https://github.com/llvm/llvm-project/commit/9f6d08f2566a26144ea1753f80aebb1f2ecfdc63
Author: Chelsea Cassanova <chelsea_cassanova at apple.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M lldb/include/lldb/API/SBDebugger.h
M lldb/include/lldb/lldb-enumerations.h
M lldb/test/API/functionalities/diagnostic_reporting/TestDiagnosticReporting.py
M lldb/test/API/functionalities/progress_reporting/TestProgressReporting.py
M lldb/test/API/functionalities/progress_reporting/clang_modules/TestClangModuleBuildProgress.py
M lldb/test/API/macosx/rosetta/TestRosetta.py
Log Message:
-----------
Revert "[lldb][sbdebugger] Move SBDebugger Broadcast bit enum into lldb-enumerations.h" (#88324)
Reverts llvm/llvm-project#87409 due to a missed update to the broadcast bit
causing a build failure on the x86_64 Debian buildbot.
Commit: d8f1e5d2894f7f4edc2e85e63def456c7f430f34
https://github.com/llvm/llvm-project/commit/d8f1e5d2894f7f4edc2e85e63def456c7f430f34
Author: Craig Topper <craig.topper at sifive.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Support/APInt.cpp
Log Message:
-----------
[APInt] Remove accumulator initialization from tcMultiply and tcFullMultiply. NFCI (#88202)
The tcMultiplyPart routine has a flag that says whether to add to the
accumulator or overwrite it. By using the overwrite mode on the first
iteration we don't need to initialize the accumulator to zero.
Note, the initialization in tcFullMultiply was only initializing the
first rhsParts of dst. tcMultiplyPart always overwrites the rhsParts+1
part that just contains the last carry. The first write to each part of
dst past rhsParts is a carry write so that's how the upper part of dst
is initialized.
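A simplified model of the idea, assuming 32-bit words and a helper that either overwrites or accumulates into the destination. The names `multiply_part` and `full_multiply` are hypothetical and this is not the APInt code; the destination is deliberately pre-filled with garbage to show that no zero-initialization is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Word = std::uint32_t;

// dst[0..n) = src[0..n) * multiplier        (add == false, overwrite mode)
// dst[0..n) += src[0..n) * multiplier       (add == true)
// The final carry is always written (not added) to dst[n].
void multiply_part(Word *dst, const Word *src, Word multiplier, std::size_t n,
                   bool add) {
  std::uint64_t carry = 0;
  for (std::size_t i = 0; i < n; ++i) {
    // Cannot overflow 64 bits: (2^32-1)^2 + 2*(2^32-1) == 2^64 - 1.
    std::uint64_t acc = static_cast<std::uint64_t>(src[i]) * multiplier + carry;
    if (add)
      acc += dst[i];
    dst[i] = static_cast<Word>(acc);
    carry = acc >> 32;
  }
  dst[n] = static_cast<Word>(carry);
}

// Full product of two little-endian word vectors.
std::vector<Word> full_multiply(const std::vector<Word> &lhs,
                                const std::vector<Word> &rhs) {
  if (lhs.empty() || rhs.empty())
    return std::vector<Word>(lhs.size() + rhs.size(), 0);
  // Garbage fill: every word is written before it is read.
  std::vector<Word> dst(lhs.size() + rhs.size(), 0xDEADBEEFu);
  for (std::size_t i = 0; i < lhs.size(); ++i) {
    // First iteration overwrites; later ones accumulate into words that an
    // earlier iteration already wrote (the top one as a carry).
    multiply_part(dst.data() + i, rhs.data(), lhs[i], rhs.size(),
                  /*add=*/i != 0);
  }
  return dst;
}

// Small usage check: 3 * 5 == 15 with a zero high word.
void check_full_multiply() {
  std::vector<Word> product = full_multiply({3u}, {5u});
  assert(product.size() == 2 && product[0] == 15u && product[1] == 0u);
}
```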
Commit: a9d4ddd98a0bc495126027122fdca751b6841ceb
https://github.com/llvm/llvm-project/commit/a9d4ddd98a0bc495126027122fdca751b6841ceb
Author: Oskar Wirga <oskar.wirga at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M llvm/lib/Transforms/IPO/MergeFunctions.cpp
M llvm/test/Transforms/MergeFunc/cfi-thunk-merging.ll
Log Message:
-----------
[MergeFuncs/CFI] Ensure all type metadata is propagated for CFI (#88218)
I noticed that we weren't propagating ALL type metadata that was
attached to CFI functions:
# BEFORE
```
; Function Attrs: minsize nounwind optsize ssp uwtable(sync)
define internal void @foo(ptr nocapture noundef readonly %0) #0 !dbg !62311 !type !34028 !type !34029 !type !34030
... fn merging
; Function Attrs: minsize nounwind optsize ssp uwtable(sync)
define internal void @foo(ptr nocapture noundef readonly %0) #0 !type !34028
```
# AFTER
```
; Function Attrs: minsize nounwind optsize ssp uwtable(sync)
define internal void @foo(ptr nocapture noundef readonly %0) #0 !dbg !62311 !type !34028 !type !34029 !type !34030
... fn merging
; Function Attrs: minsize nounwind optsize ssp uwtable(sync)
define internal void @foo(ptr nocapture noundef readonly %0) #0 !type !34028 !type !34029 !type !34030
```
This patch makes sure that the entire vector of metadata is copied over.
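The difference between the BEFORE and AFTER listings comes down to copying one attachment of a kind versus all of them. A toy model follows, with hypothetical types and function names that do not use the LLVM metadata API:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy model: a function carries (kind, node id) metadata attachments.
using Attachment = std::pair<std::string, int>;
struct ToyFunction {
  std::vector<Attachment> metadata;
};

// Collect the node ids of every attachment of the given kind.
std::vector<int> ids_of_kind(const ToyFunction &fn, const std::string &kind) {
  std::vector<int> ids;
  for (const Attachment &a : fn.metadata)
    if (a.first == kind)
      ids.push_back(a.second);
  return ids;
}

// Old behaviour in this model: only the first attachment of the kind survives.
void copy_first_of_kind(const ToyFunction &from, ToyFunction &to,
                        const std::string &kind) {
  for (const Attachment &a : from.metadata) {
    if (a.first == kind) {
      to.metadata.push_back(a);
      return;
    }
  }
}

// New behaviour in this model: the whole vector of that kind is copied.
void copy_all_of_kind(const ToyFunction &from, ToyFunction &to,
                      const std::string &kind) {
  for (const Attachment &a : from.metadata)
    if (a.first == kind)
      to.metadata.push_back(a);
}

// Small usage check mirroring the ids in the listings.
void check_metadata_copy() {
  ToyFunction original{{{"dbg", 62311}, {"type", 34028}, {"type", 34029},
                        {"type", 34030}}};
  ToyFunction merged;
  copy_all_of_kind(original, merged, "type");
  assert(ids_of_kind(merged, "type").size() == 3);
}
```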
Commit: fba9084977171a89248b257dbba4f47ad4b0814c
https://github.com/llvm/llvm-project/commit/fba9084977171a89248b257dbba4f47ad4b0814c
Author: Michael Liao <michael.hliao at gmail.com>
Date: 2024-04-10 (Wed, 10 Apr 2024)
Changed paths:
M clang/docs/HIPSupport.rst
M clang/docs/ReleaseNotes.rst
M clang/docs/tools/clang-formatted-files.txt
M clang/include/clang/AST/StmtOpenMP.h
M clang/include/clang/Basic/OpenACCKinds.h
M clang/include/clang/Driver/Options.td
M clang/lib/AST/Interp/FunctionPointer.h
M clang/lib/AST/ParentMapContext.cpp
M clang/lib/AST/StmtOpenMP.cpp
M clang/lib/CodeGen/CGOpenMPRuntime.cpp
M clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
M clang/lib/CodeGen/CGStmtOpenMP.cpp
M clang/lib/Driver/ToolChains/AMDGPU.cpp
M clang/lib/Driver/ToolChains/Flang.cpp
M clang/lib/Parse/ParseOpenACC.cpp
M clang/lib/Sema/SemaOpenACC.cpp
M clang/lib/Sema/SemaOpenMP.cpp
M clang/lib/Serialization/ASTReader.cpp
M clang/lib/Serialization/ASTReaderStmt.cpp
M clang/lib/Serialization/ASTWriter.cpp
M clang/lib/Serialization/ASTWriterStmt.cpp
M clang/test/Driver/amdgpu-features.c
M clang/test/Driver/lld-repro.c
A clang/test/Modules/home-is-cwd-search-paths.c
M clang/test/OpenMP/nvptx_target_teams_generic_loop_codegen.cpp
M clang/test/OpenMP/nvptx_target_teams_generic_loop_generic_mode_codegen.cpp
M clang/test/OpenMP/target_teams_generic_loop_codegen.cpp
A clang/test/OpenMP/target_teams_generic_loop_codegen_as_distribute.cpp
A clang/test/OpenMP/target_teams_generic_loop_codegen_as_parallel_for.cpp
M clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp
M clang/test/OpenMP/target_teams_generic_loop_private_codegen.cpp
M clang/test/OpenMP/teams_generic_loop_codegen-1.cpp
M clang/test/OpenMP/teams_generic_loop_codegen.cpp
M clang/test/OpenMP/teams_generic_loop_collapse_codegen.cpp
M clang/test/OpenMP/teams_generic_loop_private_codegen.cpp
M clang/test/OpenMP/teams_generic_loop_reduction_codegen.cpp
M flang/include/flang/Frontend/PreprocessorOptions.h
M flang/include/flang/Parser/parsing.h
A flang/include/flang/Parser/preprocessor.h
A flang/include/flang/Parser/token-sequence.h
M flang/lib/Frontend/CompilerInvocation.cpp
M flang/lib/Frontend/FrontendActions.cpp
M flang/lib/Parser/parsing.cpp
M flang/lib/Parser/preprocessor.cpp
R flang/lib/Parser/preprocessor.h
M flang/lib/Parser/prescan.cpp
M flang/lib/Parser/prescan.h
M flang/lib/Parser/token-sequence.cpp
R flang/lib/Parser/token-sequence.h
M flang/test/Driver/driver-help-hidden.f90
M flang/test/Driver/driver-help.f90
A flang/test/Preprocessing/show-macros1.F90
A flang/test/Preprocessing/show-macros2.F90
A flang/test/Preprocessing/show-macros3.F90
M libc/docs/gpu/building.rst
M libc/hdr/CMakeLists.txt
M libcxx/include/CMakeLists.txt
M libcxx/include/__algorithm/simd_utils.h
M libcxx/include/__chrono/leap_second.h
M libcxx/include/__chrono/time_zone_link.h
M libcxx/include/__config
M libcxx/include/__locale
M libcxx/include/__stop_token/stop_callback.h
A libcxx/include/__utility/private_constructor_tag.h
M libcxx/include/module.modulemap
M libcxx/src/CMakeLists.txt
R libcxx/src/include/tzdb/leap_second_private.h
R libcxx/src/include/tzdb/time_zone_link_private.h
M libcxx/src/locale.cpp
M libcxx/src/tzdb.cpp
M libcxx/test/libcxx/gdb/gdb_pretty_printer_test.sh.cpp
A libcxx/test/libcxx/utilities/utility/private_constructor_tag.compile.pass.cpp
M libcxx/test/std/ranges/range.utility/range.utility.conv/to_deduction.pass.cpp
M libcxx/test/std/time/time.zone/time.zone.leap/assign.copy.pass.cpp
M libcxx/test/std/time/time.zone/time.zone.leap/cons.copy.pass.cpp
M libcxx/test/std/time/time.zone/time.zone.leap/members/date.pass.cpp
M libcxx/test/std/time/time.zone/time.zone.leap/members/value.pass.cpp
M libcxx/test/std/time/time.zone/time.zone.leap/nonmembers/comparison.pass.cpp
M libcxx/test/std/utilities/format/format.arguments/format.arg/visit.pass.cpp
M libcxx/test/std/utilities/format/format.arguments/format.arg/visit.return_type.pass.cpp
M libcxx/test/std/utilities/format/format.arguments/format.arg/visit_format_arg.deprecated.verify.cpp
M libcxx/test/std/utilities/variant/variant.visit.member/robust_against_adl.pass.cpp
M libcxx/test/std/utilities/variant/variant.visit.member/visit.pass.cpp
M libcxx/test/std/utilities/variant/variant.visit.member/visit_return_type.pass.cpp
M libcxx/test/support/test_chrono_leap_second.h
M libcxx/utils/generate_iwyu_mapping.py
M lld/ELF/SyntheticSections.cpp
M lld/ELF/SyntheticSections.h
M lld/ELF/Writer.cpp
M lld/ELF/Writer.h
M llvm/include/llvm/CodeGen/MachineCombinerPattern.h
M llvm/lib/Analysis/ValueTracking.cpp
M llvm/lib/Support/APInt.cpp
M llvm/lib/Target/AMDGPU/AMDGPU.td
M llvm/lib/Target/AMDGPU/GCNSubtarget.h
M llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
M llvm/lib/Target/AMDGPU/SIInstructions.td
M llvm/lib/Target/AMDGPU/SIRegisterInfo.td
M llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
M llvm/lib/Target/RISCV/RISCVISelLowering.cpp
M llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
M llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
M llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.cpp
M llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.h
M llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
M llvm/lib/Target/SPIRV/SPIRVUtils.cpp
M llvm/lib/Target/X86/X86ISelLowering.cpp
M llvm/lib/Transforms/IPO/MergeFunctions.cpp
M llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
M llvm/test/CodeGen/AArch64/lrint-conv.ll
M llvm/test/CodeGen/AArch64/vector-lrint.ll
A llvm/test/CodeGen/AMDGPU/insert_waitcnt_for_precise_memory.ll
M llvm/test/CodeGen/RISCV/interrupt-attr-nocall.ll
M llvm/test/CodeGen/RISCV/interrupt-attr.ll
M llvm/test/CodeGen/RISCV/rv64zba.ll
M llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll
M llvm/test/CodeGen/SPIRV/SampledImageRetType.ll
A llvm/test/CodeGen/SPIRV/hlsl-intrinsics/all.ll
M llvm/test/CodeGen/X86/bitreverse.ll
M llvm/test/CodeGen/X86/vector-bitreverse.ll
M llvm/test/Instrumentation/MemorySanitizer/overflow.ll
M llvm/test/Transforms/InstCombine/known-bits.ll
A llvm/test/Transforms/InstCombine/vector-reduce-min-max-known.ll
M llvm/test/Transforms/InstSimplify/icmp.ll
M llvm/test/Transforms/InstSimplify/known-non-zero.ll
M llvm/test/Transforms/MergeFunc/cfi-thunk-merging.ll
M llvm/test/Transforms/SLPVectorizer/AArch64/extractelements-to-shuffle.ll
M llvm/test/Transforms/SLPVectorizer/X86/ext-int-reduced-not-operand.ll
A llvm/test/Transforms/SLPVectorizer/X86/extractlements-gathered-first-node.ll
M llvm/test/Transforms/SLPVectorizer/X86/gather-move-out-of-loop.ll
M llvm/test/Transforms/SLPVectorizer/X86/gathered-delayed-nodes-with-reused-user.ll
M llvm/test/Transforms/SLPVectorizer/X86/non-scheduled-inst-reused-as-last-inst.ll
M llvm/test/Transforms/SLPVectorizer/X86/reorder_with_external_users.ll
M llvm/test/Transforms/SLPVectorizer/alternate-non-profitable.ll
M llvm/unittests/Bitcode/BitReaderTest.cpp
M mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
M mlir/lib/Bytecode/Writer/IRNumbering.cpp
M mlir/lib/Conversion/ComplexToStandard/ComplexToStandard.cpp
M mlir/lib/Dialect/ArmNeon/Transforms/LowerContractionToSMMLAPattern.cpp
M mlir/test/Conversion/ComplexToStandard/convert-to-standard.mlir
M mlir/test/Dialect/ArmNeon/lower-to-arm-neon.mlir
M mlir/test/Dialect/SparseTensor/invalid.mlir
M mlir/test/Dialect/SparseTensor/roundtrip.mlir
M mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pack.mlir
M openmp/libomptarget/DeviceRTL/CMakeLists.txt
M openmp/libomptarget/DeviceRTL/src/LibC.cpp
M openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
M openmp/libomptarget/src/interface.cpp
M openmp/runtime/test/lit.cfg
Log Message:
-----------
rebase
Created using spr 1.3.4
Compare: https://github.com/llvm/llvm-project/compare/2823d5ed097d...fba908497717