[all-commits] [llvm/llvm-project] e72c71: [AccelTable][nfc] Add helper function to cast Acce...

Mon Jan 8 17:14:27 PST 2024

  Branch: refs/heads/users/vitalybuka/spr/msan-unwind-stack-before-fatal-reports
  Home:   https://github.com/llvm/llvm-project
  Commit: e72c71671e044aa30ca35bed9e20da771ae216b5
      https://github.com/llvm/llvm-project/commit/e72c71671e044aa30ca35bed9e20da771ae216b5
  Author: Felipe de Azevedo Piovezan <fpiovezan at apple.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M llvm/include/llvm/CodeGen/AccelTable.h
    M llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp

  Log Message:
  -----------
  [AccelTable][nfc] Add helper function to cast AccelTableData (#77100)

Specializations of AccelTableBase are always interested in accessing the
derived versions of their data classes (e.g. DWARF5AccelTableData). They
do so by sprinkling `static_casts` all over the code.

This commit adds a helper function to simplify this process, reducing
the number of casts that have to be made in the middle of code, making
it easier to read.
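
A minimal sketch of the pattern described above, not the code from the patch. The names `EntryBase`, `Dwarf5Entry`, `HashEntry`, `getValues`, and `sumOffsets` are hypothetical stand-ins for the real AccelTable classes; the point is only that the downcast lives in one helper instead of at every use site.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for the base data class.
struct EntryBase {
  virtual ~EntryBase() = default;
};

// Hypothetical stand-in for a derived data class.
struct Dwarf5Entry : EntryBase {
  explicit Dwarf5Entry(unsigned Offset) : DieOffset(Offset) {}
  unsigned DieOffset;
};

struct HashEntry {
  std::vector<std::unique_ptr<EntryBase>> Values;

  // The only place that performs the downcast. Callers name the derived
  // type once and must know that every stored value really has that type.
  template <typename T> std::vector<T *> getValues() const {
    std::vector<T *> Result;
    Result.reserve(Values.size());
    for (const auto &V : Values)
      Result.push_back(static_cast<T *>(V.get()));
    return Result;
  }
};

// A caller that no longer needs a static_cast in its loop body.
unsigned sumOffsets(const HashEntry &E) {
  unsigned Sum = 0;
  for (const Dwarf5Entry *D : E.getValues<Dwarf5Entry>())
    Sum += D->DieOffset;
  return Sum;
}
```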

  Commit: 87f67c2599410786ea3600d388fd1d2df13e60af
      https://github.com/llvm/llvm-project/commit/87f67c2599410786ea3600d388fd1d2df13e60af
  Author: erichkeane <ekeane at nvidia.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M clang/include/clang/Basic/OpenACCKinds.h
    M clang/lib/Parse/ParseOpenACC.cpp
    M clang/test/ParserOpenACC/parse-clauses.c

  Log Message:
  -----------
  [OpenACC] Implement 'self' clause parsing

The 'self' clause takes an optional 'condition' expression, same as the
non-optional expression taken by the 'if' clause.  This patch extracts
the 'condition' expression to a separate function, and implements the
'optional parens' infrastructure for clauses, then implements 'self'
parsing.

  Commit: 22a73e7c4616e0405db85598c049a7ca70cca7cc
      https://github.com/llvm/llvm-project/commit/22a73e7c4616e0405db85598c049a7ca70cca7cc
  Author: carlobertolli <carlo.bertolli at amd.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M openmp/libomptarget/include/Shared/PluginAPI.h
    M openmp/libomptarget/include/Shared/PluginAPI.inc
    M openmp/libomptarget/include/Shared/Requirements.h
    M openmp/libomptarget/include/device.h
    M openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
    M openmp/libomptarget/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h
    M openmp/libomptarget/plugins-nextgen/common/include/PluginInterface.h
    M openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp
    M openmp/libomptarget/src/OpenMP/Mapping.cpp
    M openmp/libomptarget/src/PluginManager.cpp
    M openmp/libomptarget/src/device.cpp
    A openmp/libomptarget/test/mapping/auto_zero_copy.cpp

  Log Message:
  -----------
  [OpenMP][libomptarget] Enable automatic unified shared memory executi… (#75999)

…on (zero-copy) on MI300A.

This patch enables applications that did not request OpenMP
unified_shared_memory to run with the same zero-copy behavior, where
mapped memory does not result in extra memory allocations and memory
copies, but CPU-allocated memory is accessed from the device. The name
for this behavior is "automatic zero-copy" and it relies on detecting:
that the runtime is running on a MI300A, that the user did not select
unified_shared_memory in their program, and that XNACK (unified memory
support) is enabled in the current GPU configuration. If all these
conditions are met, then automatic zero-copy is triggered.

This patch is still missing support for global variables, which will be
provided in a subsequent patch.

Co-authored-by: Thorsten Blass <thorsten.blass at amd.com>

  Commit: 6684a09ca84b44f320052a77cb01cb4216e6511b
      https://github.com/llvm/llvm-project/commit/6684a09ca84b44f320052a77cb01cb4216e6511b
  Author: Tom Stellard <tstellar at redhat.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M clang/include/clang/Driver/Options.td
    M clang/lib/Driver/ToolChains/Gnu.cpp
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtbegin.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtend.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crti.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtn.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtbegin.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtend.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crti.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtn.o
    A clang/test/Driver/gcc-triple.cpp

  Log Message:
  -----------
  [Driver] Add the --gcc-triple option (#73214)

When --gcc-triple is used, the driver will search for the 'best' gcc
installation that has the given triple. This is useful for distributions
that want clang to use a specific gcc triple, but do not want to pin to
a specific version as would be required by using --gcc-install-dir.
Having clang linked to a specific gcc version can cause clang to stop
working when the version of gcc installed on the system gets updated.

  Commit: ce4144406c94c3b9cf44bcf2997bae80debc6681
      https://github.com/llvm/llvm-project/commit/ce4144406c94c3b9cf44bcf2997bae80debc6681
  Author: carlobertolli <carlo.bertolli at amd.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M openmp/libomptarget/include/Shared/PluginAPI.h
    M openmp/libomptarget/include/Shared/PluginAPI.inc
    M openmp/libomptarget/include/Shared/Requirements.h
    M openmp/libomptarget/include/device.h
    M openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
    M openmp/libomptarget/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h
    M openmp/libomptarget/plugins-nextgen/common/include/PluginInterface.h
    M openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp
    M openmp/libomptarget/src/OpenMP/Mapping.cpp
    M openmp/libomptarget/src/PluginManager.cpp
    M openmp/libomptarget/src/device.cpp
    R openmp/libomptarget/test/mapping/auto_zero_copy.cpp

  Log Message:
  -----------
  Revert "[OpenMP][libomptarget] Enable automatic unified shared memory executi…" (#77371)

Reverts llvm/llvm-project#75999

lit test is failing.

  Commit: ce1305a3cea42dad8dd6ee5606dd4259e8632953
      https://github.com/llvm/llvm-project/commit/ce1305a3cea42dad8dd6ee5606dd4259e8632953
  Author: Nick Desaulniers <nickdesaulniers at users.noreply.github.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M libc/include/llvm-libc-types/off_t.h

  Log Message:
  -----------
  [libc] make off_t 32b for 32b arm (#77350)

Fixes the following diagnostic:

    llvm-project/libc/src/sys/mman/linux/mmap.cpp:44:59: error: implicit
    conversion loses integer precision: 'off_t' (aka 'long long') to 'long'
    [-Werror,-Wshorten-64-to-32]
     size, prot, flags, fd, offset);
                            ^~~~~~

It looks like off_t is a curious type across platforms. FWICT, it's 32b
on arm (at least for arm-linux-gnueabi) but 64b elsewhere (including 32b
riscv32-linux-gnu).
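
A rough sketch of how such a platform-dependent width could be expressed. The alias name `sketch_off_t` and the exact preprocessor condition are assumptions made for illustration; they are not taken from `off_t.h`.

```cpp
#include <cstdint>

// Assumed condition: 64-bit offsets on LP64 targets and on RISC-V
// (including 32-bit RISC-V), 32-bit offsets on other targets such as
// 32-bit arm.
#if defined(__LP64__) || defined(__riscv)
using sketch_off_t = std::int64_t;
#else
using sketch_off_t = std::int32_t;
#endif

// Whichever branch is taken, the alias is one of the two widths above.
static_assert(sizeof(sketch_off_t) == 4 || sizeof(sketch_off_t) == 8,
              "unexpected offset width");
```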

  Commit: 4435ced94998c00a6589c3500822015b6341c9e3
      https://github.com/llvm/llvm-project/commit/4435ced94998c00a6589c3500822015b6341c9e3
  Author: MaheshRavishankar <1663364+MaheshRavishankar at users.noreply.github.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M mlir/include/mlir/Dialect/SCF/Transforms/TileUsingInterface.h
    M mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp
    M mlir/test/lib/Interfaces/TilingInterface/TestTilingInterface.cpp

  Log Message:
  -----------
  [mlir][TilingInterface] Allow controlling what fusion is done within tile and fuse (#76871)

Currently the `tileConsumerAndFuseProducerGreedilyUsingSCFFor` method
greedily fuses through all slices that are generated during the tile and
fuse flow. That is not the normal use case. Ideally the caller would
like to control which slices get fused and which don't. This patch
introduces a new field to the `SCFTileAndFuseOptions` to specify this
control.

The control function also allows the caller to specify if the replacement
for the fused producer needs to be yielded from within the tiled
computation. This allows replacing the fused producers in case they have
other uses. Without this, the original producers still survive, negating
the utility of the fusion.
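
A simplified model of such a control callback, written without MLIR so it stands alone. `FusionDecision`, `FusionControlFn`, and `selectFused` are hypothetical names, and producers are represented by plain strings; the actual field added to `SCFTileAndFuseOptions` may have a different signature.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical result of asking the caller about one candidate producer.
struct FusionDecision {
  bool Fuse;             // fuse this producer into the tiled loop nest?
  bool YieldReplacement; // also yield its replacement from the tiled code?
};

using FusionControlFn =
    std::function<FusionDecision(const std::string &Producer)>;

// Instead of greedily fusing every candidate, keep only the producers
// that the caller-supplied callback accepts.
std::vector<std::string> selectFused(const std::vector<std::string> &Candidates,
                                     const FusionControlFn &Control) {
  std::vector<std::string> Fused;
  for (const std::string &P : Candidates)
    if (Control(P).Fuse)
      Fused.push_back(P);
  return Fused;
}
```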

The change here also means that the name of the function
`tileConsumerAndFuseProducerGreedily...` can be updated. Deferring that
to a later stage to reduce the churn of API changes.

  Commit: 7ab64b3266c580f946b3b65992030c3f68cbe392
      https://github.com/llvm/llvm-project/commit/7ab64b3266c580f946b3b65992030c3f68cbe392
  Author: Craig Topper <craig.topper at sifive.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M llvm/lib/Target/RISCV/RISCVRegisterInfo.td

  Log Message:
  -----------
  [RISCV] Remove tab character from RISCVRegisterInfo.td. NFC

  Commit: 09e32ab75076a1f2270d37343922c86c12bdd047
      https://github.com/llvm/llvm-project/commit/09e32ab75076a1f2270d37343922c86c12bdd047
  Author: Alex Langford <alangford at apple.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M lldb/include/lldb/API/SBBreakpoint.h

  Log Message:
  -----------
  [lldb] Deprecate SBBreakpoint::AddName in favor of AddNameWithErrorHandling (#71228)

AddName gives no feedback other than if it succeeded whereas
AddNameWithErrorHandling gives you back an SBError object. I would like
to mark AddName as deprecated and direct folks to use
AddNameWithErrorHandling instead.

---------

Co-authored-by: Med Ismail Bennani <ismail at bennani.ma>

  Commit: 16b8a0dc6885dea0882887a6e642a504fd1e193c
      https://github.com/llvm/llvm-project/commit/16b8a0dc6885dea0882887a6e642a504fd1e193c
  Author: Alex Langford <alangford at apple.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M lldb/include/lldb/Utility/StructuredData.h
    M lldb/source/Breakpoint/BreakpointResolverName.cpp
    M lldb/source/Plugins/InstrumentationRuntime/TSan/InstrumentationRuntimeTSan.cpp
    M lldb/source/Target/DynamicRegisterInfo.cpp

  Log Message:
  -----------
  [lldb] Change interface of StructuredData::Array::GetItemAtIndexAsInteger (#71993)

This is a follow-up to (#71613) and (#71961).

  Commit: f700d748f0447b6a761eb9d42575b28e0af98708
      https://github.com/llvm/llvm-project/commit/f700d748f0447b6a761eb9d42575b28e0af98708
  Author: Nick Desaulniers <nickdesaulniers at users.noreply.github.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M libc/src/__support/HashTable/sse2/bitmask_impl.inc

  Log Message:
  -----------
  [libc] fix more -Wmissing-brace (#77382)

Similar to #77345, the buildbots are observing similar warnings for the
sse2 implementation.

llvm-project/libc/src/__support/HashTable/sse2/bitmask_impl.inc:36:13:
    error: suggest braces around initialization of subobject
    [-Werror,-Wmissing-braces]
    return {bitmask};
            ^~~~~~~
            {      }
llvm-project/libc/src/__support/HashTable/sse2/bitmask_impl.inc:45:13:
    error: suggest braces around initialization of subobject
    [-Werror,-Wmissing-braces]
    return {static_cast<uint16_t>(~mask_available().word)};
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            {                                            }

Link:
https://lab.llvm.org/buildbot/#/builders/163/builds/49350/steps/8/logs/stdio
Link: https://github.com/llvm/llvm-project/pull/74506
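
For readers unfamiliar with the warning, here is a small stand-alone illustration. `Word`, `Mask`, and `makeMask` are hypothetical types and functions, not the ones in `bitmask_impl.inc`; the sketch only shows the general shape of the fix suggested by the diagnostic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical inner aggregate.
struct Word {
  std::uint16_t word;
};

// Hypothetical outer aggregate whose only member is itself an aggregate.
struct Mask {
  Word inner;
};

// Writing `return {bits};` would rely on brace elision for `inner`, which
// is what -Wmissing-braces reports. The doubled braces initialize the
// subobject explicitly: the outer pair is for Mask, the inner for Word.
Mask makeMask(std::uint16_t bits) { return {{bits}}; }
```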

  Commit: f84bfa2f92d2aa3329bc06902a12c0f4c54d7297
      https://github.com/llvm/llvm-project/commit/f84bfa2f92d2aa3329bc06902a12c0f4c54d7297
  Author: Martin Storsjö <martin at martin.st>
  Date:   2024-01-09 (Tue, 09 Jan 2024)

  Changed paths:
    M lld/MinGW/Options.td

  Log Message:
  -----------
  [LLD] [MinGW] Sync --thinlto-cache-dir option details with ELF (#77010)

Disallow using the form with a separate argument,
"--thinlto-cache-dir dir", allow only the one with equals,
"--thinlto-cache-dir=dir". This is the only form that actually was
tested when this was added in
f794808bb9ec06966a67fe33d41a13b9601768f8, and matches the ELF side,
where only the form with an equals is supported (and this was also the
case at the time when this option was added to the MinGW linker).

  Commit: b2ea9ec7fcf37ca01979c11c5b2b1cab0e1ae212
      https://github.com/llvm/llvm-project/commit/b2ea9ec7fcf37ca01979c11c5b2b1cab0e1ae212
  Author: Igor Kudrin <ikudrin at accesssoftek.com>
  Date:   2024-01-09 (Tue, 09 Jan 2024)

  Changed paths:
    M llvm/docs/CommandLine.rst
    M llvm/lib/Support/CommandLine.cpp
    M llvm/test/tools/llvm-debuginfo-analyzer/cmdline.test
    M llvm/unittests/Support/CommandLineTest.cpp

  Log Message:
  -----------
  [CommandLine] Do not print empty categories with '--help-hidden' (#77043)

If a category has no options associated with it, the `--help-hidden`
command still shows that category with the annotation "This option
category has no options", and this is how it was implemented from the
beginning when the categories were introduced, see commit 0537a98878. A
feature to hide unrelated options was added later, in
https://reviews.llvm.org/D7100. Now, if a tool needs to hide unrelated
options that are associated with categories, leaving some of them empty,
those categories will still be visible on the `--help-hidden` output,
even if they have no use for the tool; see the changes in
`llvm/test/tools/llvm-debuginfo-analyzer/cmdline.test` for an example.

The patch ensures that only categories with options are shown on both
main and hidden help output.
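
The intended behavior can be modeled in a few lines. This is a sketch with hypothetical names (`CategoryMap`, `categoriesToPrint`) and a plain map in place of the real option registry; it is not the code in `CommandLine.cpp`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: category name mapped to the options shown under it.
using CategoryMap = std::map<std::string, std::vector<std::string>>;

// Returns the categories that should get a heading in the help output.
// Categories with no options are skipped rather than printed with a
// "no options" note.
std::vector<std::string> categoriesToPrint(const CategoryMap &Categories) {
  std::vector<std::string> Result;
  for (const auto &Entry : Categories)
    if (!Entry.second.empty())
      Result.push_back(Entry.first);
  return Result;
}
```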

  Commit: d5f84e6121f0d0cc8984dccc1774ce9ddb7168c4
      https://github.com/llvm/llvm-project/commit/d5f84e6121f0d0cc8984dccc1774ce9ddb7168c4
  Author: Iain Sandoe <iain at sandoe.co.uk>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M libcxxabi/src/private_typeinfo.cpp
    M libcxxabi/src/private_typeinfo.h
    A libcxxabi/test/catch_null_pointer_to_object_pr64953.pass.cpp

  Log Message:
  -----------
  [libc++abi] Handle catch null pointer-to-object (#68076)

This addresses cases (currently failing) where we throw a null
pointer-to-object and fixes #64953.

We are trying to satisfy the following bullet from the C++ ABI 15.3:

* the handler is of type cv1 T* cv2 and E is a pointer type that can be
converted to the type of the handler by either or both of:

  - a standard pointer conversion (4.10 [conv.ptr]) not involving
    conversions to private or protected or ambiguous classes.

  - a qualification conversion.

The existing implementation assesses the ambiguity of bases by computing
the offsets to them; ambiguous cases are then when the same base appears
at different offsets. The computation of offset includes indirecting
through the vtables to find the offsets to virtual bases.

When the thrown pointer points to a real object, this is quite efficient
since, if the base is found, and it is not ambiguous and on a public
path, the offset is needed to return the adjusted pointer (and the
indirections are not particularly expensive to compute).

However, when we throw a null pointer-to-object, this scheme is no
longer applicable (and the code currently bypasses the relevant
computations, leading to the incorrect catches reported in the issue).

-----

The solution proposed here takes a composite approach:

1. When the pointer-to-object points to a real instance (well, at least,
it is determined to be non-null), we use the existing scheme.

2. When the pointer-to-object is null:

  * We note that there is no real object.
  * When we are processing non-virtual bases, we continue to compute the
    offsets, but for a notional dummy object based at 0. This is OK, since
    we never need to access the object content for non-virtual bases.
  * When we are processing a path with one or more virtual bases, we
    remember a cookie corresponding to the inner-most virtual base found so
    far (and set the notional offset to 0). Offsets to inner non-virtual
    bases are then computed as normal.

A base is then ambiguous iff:
* There is a recorded virtual base cookie and that is different from the
  current one or,
* The non-virtual base offsets differ.
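
The ambiguity rule above can be written as a small predicate. This is a sketch only: `FoundBase` and `isAmbiguous` are hypothetical names, and the real bookkeeping in `private_typeinfo.cpp` may be structured differently.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record of where a base was found during the search.
struct FoundBase {
  const void *VBaseCookie; // nullptr when no virtual base is on the path
  std::ptrdiff_t Offset;   // offset relative to the notional object at 0
};

// A base is ambiguous if a virtual base cookie was recorded and differs
// from the current one, or if the non-virtual base offsets differ.
bool isAmbiguous(const FoundBase &Recorded, const FoundBase &Current) {
  bool CookieDiffers = Recorded.VBaseCookie != nullptr &&
                       Recorded.VBaseCookie != Current.VBaseCookie;
  return CookieDiffers || Recorded.Offset != Current.Offset;
}
```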

When a handler for a pointer succeeds in catching a base pointer for a
thrown null pointer-to-object, we still return a nullptr (so the
adjustment to the pointer is not required and need not be computed).

Since we noted that there was no object when starting the search for
ambiguous bases, we know that we can skip the pointer adjustment.

This was originally uploaded as https://reviews.llvm.org/D158769.
Fixes #64953

  Commit: 0fe86f9c518fb1296bba8d66ce495f9dfff2c435
      https://github.com/llvm/llvm-project/commit/0fe86f9c518fb1296bba8d66ce495f9dfff2c435
  Author: Joseph Huber <huberjn at outlook.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M openmp/libomptarget/include/DeviceImage.h
    M openmp/libomptarget/include/OffloadEntry.h
    M openmp/libomptarget/include/device.h
    M openmp/libomptarget/src/DeviceImage.cpp
    M openmp/libomptarget/src/PluginManager.cpp
    M openmp/libomptarget/src/device.cpp

  Log Message:
  -----------
  [Libomptarget] Remove extra cache for offloading entries (#77012)

Summary:
The offloading entries right now are assumed to be baked into the binary
itself, and thus always valid whenever the library is executing. This
means that we don't need to copy them to additional storage and can
instead simply pass around references to it.

This is not likely to change in the expected operation of the OpenMP
library. Additionally, the indirection for the offload entry struct is
simply two pointers, so moving it by value is trivial.

  Commit: 6e90f13cc9bc9dbc5c2c248d95c6e18a5fb021b4
      https://github.com/llvm/llvm-project/commit/6e90f13cc9bc9dbc5c2c248d95c6e18a5fb021b4
  Author: Jakub Kuderski <jakub at nod-labs.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M mlir/include/mlir/Conversion/GPUToSPIRV/GPUToSPIRV.h
    M mlir/include/mlir/Conversion/Passes.td
    M mlir/include/mlir/Dialect/SPIRV/IR/SPIRVBase.td
    M mlir/include/mlir/Dialect/SPIRV/IR/SPIRVCooperativeMatrixOps.td
    M mlir/include/mlir/Dialect/SPIRV/IR/SPIRVTypes.h
    M mlir/lib/Conversion/GPUToSPIRV/GPUToSPIRVPass.cpp
    M mlir/lib/Conversion/GPUToSPIRV/WmmaOpsToSPIRV.cpp
    M mlir/lib/Dialect/SPIRV/IR/CastOps.cpp
    M mlir/lib/Dialect/SPIRV/IR/CooperativeMatrixOps.cpp
    M mlir/lib/Dialect/SPIRV/IR/SPIRVDialect.cpp
    M mlir/lib/Dialect/SPIRV/IR/SPIRVOps.cpp
    M mlir/lib/Dialect/SPIRV/IR/SPIRVTypes.cpp
    M mlir/lib/Target/SPIRV/Deserialization/DeserializeOps.cpp
    M mlir/lib/Target/SPIRV/Deserialization/Deserializer.cpp
    M mlir/lib/Target/SPIRV/Serialization/Serializer.cpp
    M mlir/test/Conversion/GPUToSPIRV/wmma-ops-to-spirv-khr-coop-matrix.mlir
    R mlir/test/Conversion/GPUToSPIRV/wmma-ops-to-spirv-nv-coop-matrix.mlir
    M mlir/test/Dialect/SPIRV/IR/cast-ops.mlir
    M mlir/test/Dialect/SPIRV/IR/composite-ops.mlir
    M mlir/test/Dialect/SPIRV/IR/khr-cooperative-matrix-ops.mlir
    M mlir/test/Dialect/SPIRV/IR/matrix-ops.mlir
    R mlir/test/Dialect/SPIRV/IR/nv-cooperative-matrix-ops.mlir
    M mlir/test/Dialect/SPIRV/IR/structure-ops.mlir
    M mlir/test/Dialect/SPIRV/IR/types.mlir
    M mlir/test/Target/SPIRV/matrix.mlir
    R mlir/test/Target/SPIRV/nv-cooperative-matrix-ops.mlir

  Log Message:
  -----------
  [mlir][spirv] Drop support for SPV_NV_cooperative_matrix (#76782)

This extension has been superseded by SPV_KHR_cooperative_matrix, which
is supported on GPUs from major vendors such as Nvidia, AMD, and Intel.

Given that the KHR version has been supported for nearly half a year,
drop the NV-specific extension to reduce the maintenance burden and code
duplication.

  Commit: 6eab9dd7f01e6cad9f1a93bd52e4c6e7b4c3c1fa
      https://github.com/llvm/llvm-project/commit/6eab9dd7f01e6cad9f1a93bd52e4c6e7b4c3c1fa
  Author: Alex MacLean <amaclean at nvidia.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp
    M llvm/test/Transforms/InstCombine/NVPTX/nvvm-intrins.ll

  Log Message:
  -----------
  [NVPTX] remove incorrect NVPTX intrinsic transformations (#76870)

`nvvm_fabs_f`
`nvvm_fabs_ftz_f`

Unfortunately, llvm fabs is not equivalent to these intrinsics since
llvm fabs is defined to only set the sign bit to zero while these can
also flush subnormal inputs and modify NaNs.
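
To make the difference concrete, the following sketch operates on raw IEEE-754 binary32 bit patterns. `fabsBits` models "only set the sign bit to zero"; `fabsFtzBits` is a hypothetical model of a variant that additionally flushes subnormal inputs to zero, and is not a specification of the NVVM intrinsics.

```cpp
#include <cassert>
#include <cstdint>

// Clears the sign bit and leaves every other bit untouched, so subnormal
// values and NaN payloads pass through unchanged.
std::uint32_t fabsBits(std::uint32_t Bits) { return Bits & 0x7FFFFFFFu; }

// Hypothetical flush-to-zero model: a subnormal input (zero exponent,
// non-zero mantissa) becomes +0; everything else only loses its sign.
std::uint32_t fabsFtzBits(std::uint32_t Bits) {
  std::uint32_t Abs = Bits & 0x7FFFFFFFu;
  bool IsSubnormal = (Abs & 0x7F800000u) == 0 && (Abs & 0x007FFFFFu) != 0;
  return IsSubnormal ? 0u : Abs;
}
```

For the smallest negative subnormal, `0x80000001`, the first function yields `0x00000001` while the second yields `0`, so replacing one with the other is observable.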

`nvvm_round_d`
`nvvm_round_f`
`nvvm_round_ftz_f`

llvm.nvvm.round uses RNI, while llvm.round codegens to RZI. LLVM defines
llvm.round to use the same rounding as libm `round[f]()`, which is not
necessarily the same as how we define llvm.nvvm.round.
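
The gap between libm-style rounding and round-to-nearest-even shows up on halfway cases. The sketch below uses only the C++ standard library and assumes that RNI means "round to nearest integer, ties to even"; the wrapper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// libm round(): halfway cases are rounded away from zero.
double roundHalfAway(double X) { return std::round(X); }

// Round to nearest integer with ties to even, assuming the default
// floating-point rounding mode (FE_TONEAREST) is in effect.
double roundHalfEven(double X) { return std::nearbyint(X); }
```

With these definitions, 2.5 rounds to 3.0 under the first function and to 2.0 under the second.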

`nvvm_sqrt_rn_f`
`nvvm_sqrt_rn_ftz_f`

sqrt may be lowered to a less precise version of sqrt, such as
sqrt.approx in NVPTX depending on factors such as the value of
-nvptx-prec-sqrtf32. These intrinsics should always become the
corresponding NVPTX instructions.

`nvvm_add_rn_d`
`nvvm_add_rn_f`
`nvvm_add_rn_ftz_f`
`nvvm_mul_rn_d`
`nvvm_mul_rn_f`
`nvvm_mul_rn_ftz_f`

These nvvm intrinsics have an explicitly specified rounding mode (.rn).
They should always be lowered to a PTX instruction with the same
explicit rounding mode. Converting to fmul and fadd instructions results
in PTX instructions without rounding modes specified. This can cause
issues because:

> An add [or mul] instruction with no rounding modifier defaults to
round-to-nearest-even and may be optimized aggressively by the code
optimizer. In particular, mul/add sequences with no rounding modifiers
may be optimized to use fused-multiply-add instructions on the target
device.

`nvvm_div_rn_f`
`nvvm_div_rn_ftz_f`
`nvvm_rcp_rn_f`
`nvvm_rcp_rn_ftz_f`

fdiv may be lowered to a less precise version of div, such as div.full
in NVPTX depending on factors such as the value of -nvptx-prec-divf32.
These intrinsics should always become the corresponding NVPTX
instructions.

  Commit: f5145f4dc819d73ff8bebcfba3779533b150884e
      https://github.com/llvm/llvm-project/commit/f5145f4dc819d73ff8bebcfba3779533b150884e
  Author: Krystian Stasiowski <sdkrystian at gmail.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M clang/include/clang/Analysis/CFG.h

  Log Message:
  -----------
  [Clang][NFC] Fix out-of-bounds access (#77193)

The changes to tablegen made by
https://github.com/llvm/llvm-project/pull/76825 result in
`StmtClass::lastStmtConstant` changing from `StmtClass::WhileStmtClass`
to `StmtClass::GCCAsmStmtClass`. Since `CFG::BuildOptions::alwaysAdd` is
never called with a `WhileStmt`, this has flown under the radar until
now.

One such test in which an out-of-bounds access occurs is
`test/Sema/inline-asm-validate.c`, among many others.

  Commit: faa326de97bf6119dcc42806b07f3523c521ae96
      https://github.com/llvm/llvm-project/commit/faa326de97bf6119dcc42806b07f3523c521ae96
  Author: Craig Topper <craig.topper at sifive.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
    M llvm/lib/Target/RISCV/RISCVFeatures.td
    M llvm/lib/Target/RISCV/RISCVISelLowering.cpp
    M llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
    M llvm/lib/Target/RISCV/RISCVInstrInfo.td
    M llvm/lib/Target/RISCV/RISCVProcessors.td
    M llvm/lib/Target/RISCV/RISCVSubtarget.h
    A llvm/test/CodeGen/RISCV/cmov-branch-opt.ll

  Log Message:
  -----------
  [RISCV] Add branch+c.mv macrofusion for sifive-p450. (#76169)

sifive-p450 supports a very restricted version of the short forward
branch optimization from the sifive-7-series.

For sifive-p450, a branch over a single c.mv can be macrofused as a
conditional move operation. Due to encoding restrictions on c.mv, we
can't conditionally move from X0. That would require c.li instead.

  Commit: 1ea7a56057492d9da1124787a9855cc2edca7df9
      https://github.com/llvm/llvm-project/commit/1ea7a56057492d9da1124787a9855cc2edca7df9
  Author: Advenam Tacet <advenam.tacet at trailofbits.com>
  Date:   2024-01-09 (Tue, 09 Jan 2024)

  Changed paths:
    M libcxx/include/string

  Log Message:
  -----------
  Revert "[ASan][libc++] String annotations optimizations fix with lambda (#76200)"

This reverts commit c68a9d25e99a096f6862fc4b57dd380a21245d31.

  Commit: ac8b4f874945f83eec8c8f56d9fc80093e02a7b2
      https://github.com/llvm/llvm-project/commit/ac8b4f874945f83eec8c8f56d9fc80093e02a7b2
  Author: Usman Nadeem <mnadeem at quicinc.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
    A llvm/test/CodeGen/AArch64/sve2-bcax.ll

  Log Message:
  -----------
  [AArch64][SVE2] Add pattern for BCAX (#77159)

Bitwise clear and exclusive or
Add pattern for:
    xor x, (and y, not(z)) -> bcax x, y, z
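
As a scalar reference for the pattern above (a sketch; the function name `bcax` is used here only for illustration and operates on one 64-bit word rather than an SVE vector):

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of the matched expression: xor x, (and y, not(z)).
// Bits of Y that are set in Z are cleared, then the result is XORed
// into X.
std::uint64_t bcax(std::uint64_t X, std::uint64_t Y, std::uint64_t Z) {
  return X ^ (Y & ~Z);
}
```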

  Commit: a0ae5258065a856d5f8d9f8dcb12e9d8394f789f
      https://github.com/llvm/llvm-project/commit/a0ae5258065a856d5f8d9f8dcb12e9d8394f789f
  Author: Congcong Cai <congcongcai0907 at 163.com>
  Date:   2024-01-09 (Tue, 09 Jan 2024)

  Changed paths:
    M clang-tools-extra/clang-tidy/misc/UnusedUsingDeclsCheck.h
    M clang-tools-extra/docs/ReleaseNotes.rst

  Log Message:
  -----------
  [clang-tidy]unused using decls only check cpp files (#77335)

  Commit: 6958986f77bdbedd6ba571af7b546018f9108067
      https://github.com/llvm/llvm-project/commit/6958986f77bdbedd6ba571af7b546018f9108067
  Author: Nick Desaulniers <nickdesaulniers at users.noreply.github.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M libc/src/string/memory_utils/op_x86.h

  Log Message:
  -----------
  [libc] fix -Wconversion (#77384)

Fixes the following from GCC:

    llvm-project/libc/src/string/memory_utils/op_x86.h:236:24: error:
    conversion from ‘long unsigned int’ to ‘uint32_t’ {aka ‘unsigned int’} may
    change value [-Werror=conversion]
      236 |   return (xored >> 32) | (xored & 0xFFFFFFFF);
          |          ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~

Link:
https://lab.llvm.org/buildbot/#/builders/250/builds/16236/steps/8/logs/stdio
Link: https://github.com/llvm/llvm-project/pull/74506
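
One common way to silence this kind of diagnostic is an explicit cast on the narrowing expression. The sketch below is an assumption about the general shape of such a fix, with a hypothetical function name `fold64`; it is not the change made in `op_x86.h`.

```cpp
#include <cassert>
#include <cstdint>

// Folds a 64-bit value into 32 bits by ORing its halves. The OR is
// computed in 64 bits, so the static_cast states that dropping the
// (always zero) upper half of the result is intended.
std::uint32_t fold64(std::uint64_t Xored) {
  return static_cast<std::uint32_t>((Xored >> 32) | (Xored & 0xFFFFFFFF));
}
```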

  Commit: 7c89b20e02ff079ec84fc54880dbc6c063d8c915
      https://github.com/llvm/llvm-project/commit/7c89b20e02ff079ec84fc54880dbc6c063d8c915
  Author: Fangrui Song <i at maskray.me>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M lld/ELF/ScriptParser.cpp
    M lld/test/ELF/linkerscript/overlay.test

  Log Message:
  -----------
  [ELF] OVERLAY: support optional start address and LMA

https://reviews.llvm.org/D44780 implemented rudimentary support for
OVERLAY. The start address and `AT(ldaddr)` in `OVERLAY [start] :
[NOCROSSREFS] [AT ( ldaddr )]` are not optional.

In addition, there are two issues:

* When the start address is `.`, subsequent sections don't share the
  address of the first overlay section.
* When the first overlay section is empty and discardable, `p_paddr` is
  incorrectly zero. This is because a discarded section has a zero
  address, causing `prev->getLMA() + prev->size` where `prev` refers to
  the first section to evaluate to zero.

This patch supports optional start address and LMA and fixes the issues.
Close #77265

Pull Request: https://github.com/llvm/llvm-project/pull/77272

  Commit: 1689bbea17683129f41246110af1ebd32b98362f
      https://github.com/llvm/llvm-project/commit/1689bbea17683129f41246110af1ebd32b98362f
  Author: Nick Desaulniers <ndesaulniers at google.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M libc/src/string/memory_utils/op_x86.h

  Log Message:
  -----------
  [libc] fix up #77384

  Commit: 70cea91e0fc93db618069588e6a06314b2b0e2d3
      https://github.com/llvm/llvm-project/commit/70cea91e0fc93db618069588e6a06314b2b0e2d3
  Author: Nick Desaulniers <nickdesaulniers at users.noreply.github.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M libc/src/sys/mman/linux/CMakeLists.txt

  Log Message:
  -----------
  [libc] temporarily set -Wno-shorten-64-to-32 (#77396)

This is still broken after #77350. Disable the warning for now, and fix
properly once the buildbot is back to green.

Link: https://github.com/llvm/llvm-project/issues/77395

  Commit: eee71ed3f7d0abe40f7c54166421421362a8ac46
      https://github.com/llvm/llvm-project/commit/eee71ed3f7d0abe40f7c54166421421362a8ac46
  Author: Kai Sasaki <lewuathe at gmail.com>
  Date:   2024-01-09 (Tue, 09 Jan 2024)

  Changed paths:
    M mlir/lib/Conversion/ComplexToStandard/ComplexToStandard.cpp
    M mlir/test/Conversion/ComplexToStandard/convert-to-standard.mlir

  Log Message:
  -----------
  [mlir][complex] Support Fastmath flag for complex.mulf (#74554)

Support fast math flag in the conversion of `complex.mulf` op to
standard dialect.

See:
https://discourse.llvm.org/t/rfc-fastmath-flags-support-in-complex-dialect/71981

  Commit: 4147b72301bf77ad63793e1dcefefe8d37e69a37
      https://github.com/llvm/llvm-project/commit/4147b72301bf77ad63793e1dcefefe8d37e69a37
  Author: HaohaiWen <haohai.wen at intel.com>
  Date:   2024-01-09 (Tue, 09 Jan 2024)

  Changed paths:
    M llvm/lib/Target/X86/X86TargetTransformInfo.cpp
    M llvm/test/Analysis/CostModel/X86/cast.ll

  Log Message:
  -----------
  [CostModel][X86] Fix fpext conversion cost for 16 elements (#76278)

The fpext conversion cost for 16 elements should be 4 from Znver4.

  Commit: 8d982e509bf61fab1df58eaf3582138fc3c331b2
      https://github.com/llvm/llvm-project/commit/8d982e509bf61fab1df58eaf3582138fc3c331b2
  Author: Vitaly Buka <vitalybuka at google.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M compiler-rt/test/hwasan/TestCases/Linux/aligned_alloc-alignment.cpp
    M compiler-rt/test/hwasan/TestCases/Linux/pvalloc-overflow.cpp
    M compiler-rt/test/hwasan/TestCases/Posix/posix_memalign-alignment.cpp
    M compiler-rt/test/hwasan/TestCases/allocator_returns_null.cpp

  Log Message:
  -----------
  [test][hwasan] Test function name in summaries #77391 (#77397)

Push #77391 into the main.

  Commit: 8b1c4de0da36b9f20a7f130ec26ec9fc3e38b274
      https://github.com/llvm/llvm-project/commit/8b1c4de0da36b9f20a7f130ec26ec9fc3e38b274
  Author: Vitaly Buka <vitalybuka at google.com>
  Date:   2024-01-08 (Mon, 08 Jan 2024)

  Changed paths:
    M clang-tools-extra/clang-tidy/misc/UnusedUsingDeclsCheck.h
    M clang-tools-extra/docs/ReleaseNotes.rst
    M clang/include/clang/Analysis/CFG.h
    M clang/include/clang/Basic/OpenACCKinds.h
    M clang/include/clang/Driver/Options.td
    M clang/lib/Driver/ToolChains/Gnu.cpp
    M clang/lib/Parse/ParseOpenACC.cpp
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtbegin.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtend.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crti.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtn.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtbegin.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtend.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crti.o
    A clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtn.o
    A clang/test/Driver/gcc-triple.cpp
    M clang/test/ParserOpenACC/parse-clauses.c
    M compiler-rt/lib/msan/msan.h
    M compiler-rt/lib/msan/msan_allocator.cpp
    M compiler-rt/lib/msan/msan_new_delete.cpp
    M compiler-rt/test/hwasan/TestCases/Linux/aligned_alloc-alignment.cpp
    M compiler-rt/test/hwasan/TestCases/Linux/pvalloc-overflow.cpp
    M compiler-rt/test/hwasan/TestCases/Posix/posix_memalign-alignment.cpp
    M compiler-rt/test/hwasan/TestCases/allocator_returns_null.cpp
    M libc/include/llvm-libc-types/off_t.h
    M libc/src/__support/HashTable/sse2/bitmask_impl.inc
    M libc/src/string/memory_utils/op_x86.h
    M libc/src/sys/mman/linux/CMakeLists.txt
    M libcxx/include/string
    M libcxxabi/src/private_typeinfo.cpp
    M libcxxabi/src/private_typeinfo.h
    A libcxxabi/test/catch_null_pointer_to_object_pr64953.pass.cpp
    M lld/ELF/ScriptParser.cpp
    M lld/MinGW/Options.td
    M lld/test/ELF/linkerscript/overlay.test
    M lldb/include/lldb/API/SBBreakpoint.h
    M lldb/include/lldb/Utility/StructuredData.h
    M lldb/source/Breakpoint/BreakpointResolverName.cpp
    M lldb/source/Plugins/InstrumentationRuntime/TSan/InstrumentationRuntimeTSan.cpp
    M lldb/source/Target/DynamicRegisterInfo.cpp
    M llvm/docs/CommandLine.rst
    M llvm/include/llvm/CodeGen/AccelTable.h
    M llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
    M llvm/lib/Support/CommandLine.cpp
    M llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
    M llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp
    M llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
    M llvm/lib/Target/RISCV/RISCVFeatures.td
    M llvm/lib/Target/RISCV/RISCVISelLowering.cpp
    M llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
    M llvm/lib/Target/RISCV/RISCVInstrInfo.td
    M llvm/lib/Target/RISCV/RISCVProcessors.td
    M llvm/lib/Target/RISCV/RISCVRegisterInfo.td
    M llvm/lib/Target/RISCV/RISCVSubtarget.h
    M llvm/lib/Target/X86/X86TargetTransformInfo.cpp
    M llvm/test/Analysis/CostModel/X86/cast.ll
    A llvm/test/CodeGen/AArch64/sve2-bcax.ll
    A llvm/test/CodeGen/RISCV/cmov-branch-opt.ll
    M llvm/test/Transforms/InstCombine/NVPTX/nvvm-intrins.ll
    M llvm/test/tools/llvm-debuginfo-analyzer/cmdline.test
    M llvm/unittests/Support/CommandLineTest.cpp
    M mlir/include/mlir/Conversion/GPUToSPIRV/GPUToSPIRV.h
    M mlir/include/mlir/Conversion/Passes.td
    M mlir/include/mlir/Dialect/SCF/Transforms/TileUsingInterface.h
    M mlir/include/mlir/Dialect/SPIRV/IR/SPIRVBase.td
    M mlir/include/mlir/Dialect/SPIRV/IR/SPIRVCooperativeMatrixOps.td
    M mlir/include/mlir/Dialect/SPIRV/IR/SPIRVTypes.h
    M mlir/lib/Conversion/ComplexToStandard/ComplexToStandard.cpp
    M mlir/lib/Conversion/GPUToSPIRV/GPUToSPIRVPass.cpp
    M mlir/lib/Conversion/GPUToSPIRV/WmmaOpsToSPIRV.cpp
    M mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp
    M mlir/lib/Dialect/SPIRV/IR/CastOps.cpp
    M mlir/lib/Dialect/SPIRV/IR/CooperativeMatrixOps.cpp
    M mlir/lib/Dialect/SPIRV/IR/SPIRVDialect.cpp
    M mlir/lib/Dialect/SPIRV/IR/SPIRVOps.cpp
    M mlir/lib/Dialect/SPIRV/IR/SPIRVTypes.cpp
    M mlir/lib/Target/SPIRV/Deserialization/DeserializeOps.cpp
    M mlir/lib/Target/SPIRV/Deserialization/Deserializer.cpp
    M mlir/lib/Target/SPIRV/Serialization/Serializer.cpp
    M mlir/test/Conversion/ComplexToStandard/convert-to-standard.mlir
    M mlir/test/Conversion/GPUToSPIRV/wmma-ops-to-spirv-khr-coop-matrix.mlir
    R mlir/test/Conversion/GPUToSPIRV/wmma-ops-to-spirv-nv-coop-matrix.mlir
    M mlir/test/Dialect/SPIRV/IR/cast-ops.mlir
    M mlir/test/Dialect/SPIRV/IR/composite-ops.mlir
    M mlir/test/Dialect/SPIRV/IR/khr-cooperative-matrix-ops.mlir
    M mlir/test/Dialect/SPIRV/IR/matrix-ops.mlir
    R mlir/test/Dialect/SPIRV/IR/nv-cooperative-matrix-ops.mlir
    M mlir/test/Dialect/SPIRV/IR/structure-ops.mlir
    M mlir/test/Dialect/SPIRV/IR/types.mlir
    M mlir/test/Target/SPIRV/matrix.mlir
    R mlir/test/Target/SPIRV/nv-cooperative-matrix-ops.mlir
    M mlir/test/lib/Interfaces/TilingInterface/TestTilingInterface.cpp
    M openmp/libomptarget/include/DeviceImage.h
    M openmp/libomptarget/include/OffloadEntry.h
    M openmp/libomptarget/include/device.h
    M openmp/libomptarget/src/DeviceImage.cpp
    M openmp/libomptarget/src/PluginManager.cpp
    M openmp/libomptarget/src/device.cpp

  Log Message:
  -----------
  fix name

Created using spr 1.3.4

Compare: https://github.com/llvm/llvm-project/compare/552d53ea25bb...8b1c4de0da36