[all-commits] [llvm/llvm-project] bda8c5: [Driver] Use llvm::is_contained (NFC) (#140310)

Fri May 16 23:49:56 PDT 2025

  Branch: refs/heads/users/el-ev/05-17-_clang_nfc_use_llvm_sort_
  Home:   https://github.com/llvm/llvm-project
  Commit: bda8c502bffa4f526bc3a7d22179ebfe398351c7
      https://github.com/llvm/llvm-project/commit/bda8c502bffa4f526bc3a7d22179ebfe398351c7
  Author: Kazu Hirata <kazu at google.com>
  Date:   2025-05-16 (Fri, 16 May 2025)

  Changed paths:
    M clang/lib/Driver/ToolChains/MSVC.cpp

  Log Message:
  -----------
  [Driver] Use llvm::is_contained (NFC) (#140310)

  Commit: dfac0445d0813abe2260ffdc9eeb23671cefd812
      https://github.com/llvm/llvm-project/commit/dfac0445d0813abe2260ffdc9eeb23671cefd812
  Author: Kazu Hirata <kazu at google.com>
  Date:   2025-05-16 (Fri, 16 May 2025)

  Changed paths:
    M lldb/tools/lldb-dap/EventHelper.cpp
    M lldb/tools/lldb-dap/Handler/EvaluateRequestHandler.cpp
    M lldb/tools/lldb-dap/Handler/ExceptionInfoRequestHandler.cpp
    M lldb/tools/lldb-dap/JSONUtils.cpp

  Log Message:
  -----------
  [lldb-dap] Avoid creating temporary instances of std::string (NFC) (#140325)

EmplaceSafeString accepts StringRef for the last parameter, str, and
then internally creates a copy of str via StringRef::str or
llvm::json::fixUTF8, so callers do not need to create their own
temporary instances of std::string.
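
A minimal sketch of the idea, not the lldb-dap code: `std::string_view`
stands in for `llvm::StringRef`, and `Object` and `emplaceString` are
hypothetical stand-ins for `llvm::json::Object` and `EmplaceSafeString`.
The callee makes the single owning copy, so the caller can pass a view
directly.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <string_view>

// Hypothetical stand-in for a JSON object (the real code uses
// llvm::json::Object).
using Object = std::map<std::string, std::string>;

// Hypothetical analogue of EmplaceSafeString: accepts a non-owning view and
// creates the one owning copy internally.
void emplaceString(Object &Obj, std::string_view Key, std::string_view Str) {
  assert(!Key.empty() && "expected a non-empty key");
  Obj.insert_or_assign(std::string(Key), std::string(Str));
}
```

With this shape, a call such as `emplaceString(Obj, "message", View)`
copies the characters once, whereas wrapping the argument in
`std::string(View)` at the call site would copy them twice.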

  Commit: 6963309af12f8d1a688fa2c42019d83e78a0024c
      https://github.com/llvm/llvm-project/commit/6963309af12f8d1a688fa2c42019d83e78a0024c
  Author: Kazu Hirata <kazu at google.com>
  Date:   2025-05-16 (Fri, 16 May 2025)

  Changed paths:
    M clang/lib/Frontend/CompilerInvocation.cpp

  Log Message:
  -----------
  [Frontend] Avoid creating a temporary instance of std::string (NFC) (#140326)

Since getLastArgValue returns StringRef, and the constructor of
SmallString accepts StringRef, we do not need to go through a
temporary instance of std::string.

  Commit: 6d9ce6767d259a5231ae312a19459f8fea3bd0ca
      https://github.com/llvm/llvm-project/commit/6d9ce6767d259a5231ae312a19459f8fea3bd0ca
  Author: Jeremy Kun <jkun at google.com>
  Date:   2025-05-16 (Fri, 16 May 2025)

  Changed paths:
    M mlir/lib/Dialect/Tensor/IR/TensorDialect.cpp
    M mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp
    M mlir/test/Dialect/Tensor/bufferize.mlir

  Log Message:
  -----------
  [mlir][bufferization] implement BufferizableOpInterface for concat op (#140171)

Lowers `tensor.concat` to an alloc with a series of `memref.copy` ops to
copy the operands to the alloc.

Example:

```mlir
func.func @tensor.concat(%f: tensor<8xf32>) -> tensor<16xf32> {
  %t = tensor.concat dim(0) %f, %f : (tensor<8xf32>, tensor<8xf32>) -> tensor<16xf32>
  return %t : tensor<16xf32>
}
```

Produces

```mlir
module {
  func.func @tensor.concat(%arg0: tensor<8xf32>) -> tensor<16xf32> {
    // initialization
    %0 = bufferization.to_memref %arg0 : tensor<8xf32> to memref<8xf32>
    %alloc = memref.alloc() {alignment = 64 : i64} : memref<8xf32>
    memref.copy %0, %alloc : memref<8xf32> to memref<8xf32>
    %alloc_0 = memref.alloc() {alignment = 64 : i64} : memref<8xf32>
    memref.copy %0, %alloc_0 : memref<8xf32> to memref<8xf32>
    %alloc_1 = memref.alloc() {alignment = 64 : i64} : memref<16xf32>

    // one copy for each operand
    %subview = memref.subview %alloc_1[0] [8] [1] : memref<16xf32> to memref<8xf32, strided<[1]>>
    memref.copy %alloc, %subview : memref<8xf32> to memref<8xf32, strided<[1]>>
    %subview_2 = memref.subview %alloc_1[8] [8] [1] : memref<16xf32> to memref<8xf32, strided<[1], offset: 8>>
    memref.copy %alloc_0, %subview_2 : memref<8xf32> to memref<8xf32, strided<[1], offset: 8>>
    %1 = bufferization.to_tensor %alloc_1 : memref<16xf32> to tensor<16xf32>
    return %1 : tensor<16xf32>
  }
}
```

This is my first time implementing BufferizableOpInterface, so I'm
looking for some advice on how I can:

1. Clean up my implementation.
2. Avoid duplicate `memref.copy` ops in the `// initialization` section
above when handling duplicate `tensor.concat` operands.

---------

Co-authored-by: Jeremy Kun <j2kun at users.noreply.github.com>

  Commit: 7b8bc1b3d1ae99894b4c7741e08a0b9bfb2ffb80
      https://github.com/llvm/llvm-project/commit/7b8bc1b3d1ae99894b4c7741e08a0b9bfb2ffb80
  Author: Kazu Hirata <kazu at google.com>
  Date:   2025-05-16 (Fri, 16 May 2025)

  Changed paths:
    M mlir/lib/Dialect/Tensor/IR/TensorDialect.cpp
    M mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp
    M mlir/test/Dialect/Tensor/bufferize.mlir

  Log Message:
  -----------
  Revert "[mlir][bufferization] implement BufferizableOpInterface for concat op (#140171)"

This reverts commit 6d9ce6767d259a5231ae312a19459f8fea3bd0ca.

Multiple buildbot failures have been reported:
https://github.com/llvm/llvm-project/pull/140171

  Commit: 952306226b5d9279ad3049baa8f10082e12a635a
      https://github.com/llvm/llvm-project/commit/952306226b5d9279ad3049baa8f10082e12a635a
  Author: Kazu Hirata <kazu at google.com>
  Date:   2025-05-16 (Fri, 16 May 2025)

  Changed paths:
    M bolt/lib/Passes/PettisAndHansen.cpp

  Log Message:
  -----------
  [BOLT] Use llvm::max_element (NFC) (#140342)

  Commit: e66cecd8d56f4bb62e01e47830327f28dcd7ac66
      https://github.com/llvm/llvm-project/commit/e66cecd8d56f4bb62e01e47830327f28dcd7ac66
  Author: Timm Baeder <tbaeder at redhat.com>
  Date:   2025-05-17 (Sat, 17 May 2025)

  Changed paths:
    M clang/lib/AST/Expr.cpp

  Log Message:
  -----------
  [clang][NFC] Clean up Expr::isTemporaryObject() (#140229)

  Commit: 578741b5e85110565b9b2de84d93b2c993ac0b79
      https://github.com/llvm/llvm-project/commit/578741b5e85110565b9b2de84d93b2c993ac0b79
  Author: Shilei Tian <i at tianshilei.me>
  Date:   2025-05-17 (Sat, 17 May 2025)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
    M llvm/test/CodeGen/AMDGPU/addrspacecast-constantexpr.ll
    M llvm/test/CodeGen/AMDGPU/amdgpu-attributor-no-agpr.ll
    M llvm/test/CodeGen/AMDGPU/annotate-existing-abi-attributes.ll
    M llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa-call.ll
    M llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa.ll
    M llvm/test/CodeGen/AMDGPU/annotate-kernel-features.ll
    M llvm/test/CodeGen/AMDGPU/attr-amdgpu-max-num-workgroups-propagate.ll
    M llvm/test/CodeGen/AMDGPU/attributor-flatscratchinit-undefined-behavior.ll
    M llvm/test/CodeGen/AMDGPU/attributor-flatscratchinit.ll
    M llvm/test/CodeGen/AMDGPU/attributor-loop-issue-58639.ll
    M llvm/test/CodeGen/AMDGPU/direct-indirect-call.ll
    M llvm/test/CodeGen/AMDGPU/duplicate-attribute-indirect.ll
    M llvm/test/CodeGen/AMDGPU/implicitarg-offset-attributes.ll
    M llvm/test/CodeGen/AMDGPU/indirect-call-set-from-other-function.ll
    M llvm/test/CodeGen/AMDGPU/inline-attr.ll
    M llvm/test/CodeGen/AMDGPU/issue120256-annotate-constexpr-addrspacecast.ll
    M llvm/test/CodeGen/AMDGPU/pal-simple-indirect-call.ll
    M llvm/test/CodeGen/AMDGPU/propagate-flat-work-group-size.ll
    M llvm/test/CodeGen/AMDGPU/propagate-waves-per-eu.ll
    M llvm/test/CodeGen/AMDGPU/recursive_global_initializer.ll
    M llvm/test/CodeGen/AMDGPU/remove-no-kernel-id-attribute.ll
    M llvm/test/CodeGen/AMDGPU/simple-indirect-call-2.ll
    M llvm/test/CodeGen/AMDGPU/simple-indirect-call.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-attribute-missing.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-multistep.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-nested-function-calls.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-prevent-attribute-propagation.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-propagate-attribute.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-recursion-test.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-test.ll

  Log Message:
  -----------
  [AMDGPU][Attributor] Rework update of `AAAMDWavesPerEU` (#123995)

Currently, we use `AAAMDWavesPerEU` to iteratively update values based
on attributes from the associated function, potentially propagating
user-annotated values. Similarly, we have `AAAMDFlatWorkGroupSize`.
However, since the value calculated through the flat workgroup size
always dominates the user annotation (i.e., the attribute), running
`AAAMDWavesPerEU` iteratively is unnecessary if no user-annotated value
exists.

This PR completely rewrites how the `amdgpu-waves-per-eu` attribute is
handled in `AMDGPUAttributor`. The key changes are as follows:

- `AAAMDFlatWorkGroupSize` remains unchanged.
- `AAAMDWavesPerEU` now only propagates user-annotated values.
- A new function is added to check and update `amdgpu-waves-per-eu`
  based on the following rules:
  - No waves per eu, no flat workgroup size: Assume a flat workgroup size
    of `1,1024` and compute waves per eu based on this.
  - No waves per eu, flat workgroup size exists: Use the provided flat
    workgroup size to compute waves-per-eu.
  - Waves per eu exists, no flat workgroup size: This is a tricky case. In
    this PR, we assume a flat workgroup size of `1,1024`, but this can be
    adjusted if a different approach is preferred. Alternatively, we could
    directly use the user-annotated value.
  - Both waves per eu and flat workgroup size exist: If there’s a
    conflict, the value derived from the flat workgroup size takes
    precedence over waves per eu.
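
Across the four cases above, the flat workgroup size used for the
computation is either the annotated one or the assumed `1,1024`. A
minimal sketch of that selection, where `Range` and
`flatWorkGroupSizeForWavesPerEU` are hypothetical names rather than the
code in `AMDGPUAttributor`, and the waves-per-eu computation itself is
not modeled:

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical (min, max) pair for a flat workgroup size.
using Range = std::pair<unsigned, unsigned>;

// Pick the flat workgroup size that the waves-per-eu computation is based
// on: the annotated value if present, otherwise the assumed 1,1024.
Range flatWorkGroupSizeForWavesPerEU(std::optional<Range> Annotated) {
  Range R = Annotated.value_or(Range(1u, 1024u));
  assert(R.first <= R.second && "expected min <= max");
  return R;
}
```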

This PR also updates the logic for merging two waves per eu pairs. The
current implementation, which uses `clampStateAndIndicateChange` to
compute a union, might not be ideal. From a resource-allocation
perspective, for instance, if one pair specifies a
minimum of 2 waves per eu, and another specifies a minimum of 4, we
should guarantee that 4 waves per eu can be supported, as failing to do
so could result in excessive resource allocation per wave. A similar
principle applies to the upper bound. Thus, the PR uses the following
approach for merging two pairs, `lo_a,up_a` and `lo_b,up_b`: `max(lo_a,
lo_b), max(up_a, up_b)`. This ensures that resource allocation adheres
to the stricter constraints from both inputs.
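
The merge rule can be sketched as follows. This is an illustration
only: `WavesPerEU` and `mergeWavesPerEU` are hypothetical names, not the
implementation in the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical (lower, upper) bound pair for waves per eu.
using WavesPerEU = std::pair<unsigned, unsigned>;

// Merge two pairs as described above:
// max(lo_a, lo_b), max(up_a, up_b).
WavesPerEU mergeWavesPerEU(WavesPerEU A, WavesPerEU B) {
  assert(A.first <= A.second && B.first <= B.second &&
         "expected lower <= upper for both inputs");
  return {std::max(A.first, B.first), std::max(A.second, B.second)};
}
```

For example, merging `2,8` with `4,6` under this rule gives `4,8`: the
lower bound honors the pair that asks for at least 4 waves per eu.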

Fix #123092.

  Commit: 4ddab1252fe6a90111a034cef184549882aaba2b
      https://github.com/llvm/llvm-project/commit/4ddab1252fe6a90111a034cef184549882aaba2b
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2025-05-17 (Sat, 17 May 2025)

  Changed paths:
    M llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
    M llvm/test/CodeGen/AMDGPU/packed-fp32.ll

  Log Message:
  -----------
  AMDGPU: Move reg_sequence splat handling (#140313)

This code clunkily tried to find a splat reg_sequence by
looking at every use of the reg_sequence, and then looking
back at the reg_sequence to see if it's a splat. Extract this
into a separate helper function to help clean this up. This now
parses whether the reg_sequence forms a splat once, and defers the
legal inline immediate check to the use check (which is really
use-context dependent).

The one regression is in globalisel, which has an extra
copy that should have been separately folded out. It was getting
dealt with by the handling of foldable copies in tryToFoldACImm.

This is preparation for #139908 and #139317

  Commit: aaaae99663dbb220c6c27fa9cacf93fcb8f20e7c
      https://github.com/llvm/llvm-project/commit/aaaae99663dbb220c6c27fa9cacf93fcb8f20e7c
  Author: Craig Topper <craig.topper at sifive.com>
  Date:   2025-05-16 (Fri, 16 May 2025)

  Changed paths:
    M llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

  Log Message:
  -----------
  [SelectionDAG] Use getInsertSubvector/VectorElt and getExtractSubvector/VectorElt in LegalizeVectorTypes. NFC

  Commit: 654b3ab9e326feec2865087cf9fa244e35ee1450
      https://github.com/llvm/llvm-project/commit/654b3ab9e326feec2865087cf9fa244e35ee1450
  Author: Iris Shi <0.0 at owo.li>
  Date:   2025-05-17 (Sat, 17 May 2025)

  Changed paths:
    M bolt/lib/Passes/PettisAndHansen.cpp
    M clang/lib/AST/Expr.cpp
    M clang/lib/Driver/ToolChains/MSVC.cpp
    M clang/lib/Frontend/CompilerInvocation.cpp
    M lldb/tools/lldb-dap/EventHelper.cpp
    M lldb/tools/lldb-dap/Handler/EvaluateRequestHandler.cpp
    M lldb/tools/lldb-dap/Handler/ExceptionInfoRequestHandler.cpp
    M lldb/tools/lldb-dap/JSONUtils.cpp
    M llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
    M llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
    M llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
    M llvm/test/CodeGen/AMDGPU/addrspacecast-constantexpr.ll
    M llvm/test/CodeGen/AMDGPU/amdgpu-attributor-no-agpr.ll
    M llvm/test/CodeGen/AMDGPU/annotate-existing-abi-attributes.ll
    M llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa-call.ll
    M llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa.ll
    M llvm/test/CodeGen/AMDGPU/annotate-kernel-features.ll
    M llvm/test/CodeGen/AMDGPU/attr-amdgpu-max-num-workgroups-propagate.ll
    M llvm/test/CodeGen/AMDGPU/attributor-flatscratchinit-undefined-behavior.ll
    M llvm/test/CodeGen/AMDGPU/attributor-flatscratchinit.ll
    M llvm/test/CodeGen/AMDGPU/attributor-loop-issue-58639.ll
    M llvm/test/CodeGen/AMDGPU/direct-indirect-call.ll
    M llvm/test/CodeGen/AMDGPU/duplicate-attribute-indirect.ll
    M llvm/test/CodeGen/AMDGPU/implicitarg-offset-attributes.ll
    M llvm/test/CodeGen/AMDGPU/indirect-call-set-from-other-function.ll
    M llvm/test/CodeGen/AMDGPU/inline-attr.ll
    M llvm/test/CodeGen/AMDGPU/issue120256-annotate-constexpr-addrspacecast.ll
    M llvm/test/CodeGen/AMDGPU/packed-fp32.ll
    M llvm/test/CodeGen/AMDGPU/pal-simple-indirect-call.ll
    M llvm/test/CodeGen/AMDGPU/propagate-flat-work-group-size.ll
    M llvm/test/CodeGen/AMDGPU/propagate-waves-per-eu.ll
    M llvm/test/CodeGen/AMDGPU/recursive_global_initializer.ll
    M llvm/test/CodeGen/AMDGPU/remove-no-kernel-id-attribute.ll
    M llvm/test/CodeGen/AMDGPU/simple-indirect-call-2.ll
    M llvm/test/CodeGen/AMDGPU/simple-indirect-call.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-attribute-missing.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-multistep.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-nested-function-calls.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-prevent-attribute-propagation.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-propagate-attribute.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-recursion-test.ll
    M llvm/test/CodeGen/AMDGPU/uniform-work-group-test.ll

  Log Message:
  -----------
  Merge branch 'main' into users/el-ev/05-17-_clang_nfc_use_llvm_sort_

Compare: https://github.com/llvm/llvm-project/compare/935633e4e6b2...654b3ab9e326
