[all-commits] [llvm/llvm-project] 0d0eed: [AMDGPU][Legalizer] Widen i16 G_SEXT_INREG (#131308)

Wed May 7 06:32:20 PDT 2025

  Branch: refs/heads/users/el-ev/fix-frexp-fold
  Home:   https://github.com/llvm/llvm-project
  Commit: 0d0eed419fa362e1932b694e01534f4012dcea97
      https://github.com/llvm/llvm-project/commit/0d0eed419fa362e1932b694e01534f4012dcea97
  Author: Pierre van Houtryve <pierre.vanhoutryve at amd.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/ashr.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-abs.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ashr.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sext-inreg.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sext.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-smax.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-smin.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-smulh.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.abs.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/sext_inreg.ll
    M llvm/test/CodeGen/AMDGPU/vector-reduce-smax.ll
    M llvm/test/CodeGen/AMDGPU/vector-reduce-smin.ll

  Log Message:
  -----------
  [AMDGPU][Legalizer] Widen i16 G_SEXT_INREG (#131308)

It's better to widen these to avoid them being lowered into a G_ASHR + G_SHL. With this change we just extend to i32, then truncate the result.
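
As a rough illustration of why the two forms are interchangeable, the sketch below models both on plain Python integers. It is a toy model under stated assumptions, not the legalizer's code: the function names are hypothetical, and the 32-bit step is assumed to be a sign-extend-in-register performed at the wider width.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sext_inreg_via_shifts(value: int, from_bits: int, width: int = 16) -> int:
    """Lowered form: shift left, then arithmetic shift right, both at `width` bits."""
    mask = (1 << width) - 1
    shift = width - from_bits
    shifted = (value << shift) & mask                   # models G_SHL
    return (to_signed(shifted, width) >> shift) & mask  # models G_ASHR


def sext_inreg_widened(value: int, from_bits: int) -> int:
    """Widened form: extend to 32 bits, sign-extend in register, truncate to 16."""
    wide = value & 0xFFFF                           # extend i16 -> i32 (upper bits unused)
    wide = to_signed(wide, from_bits) & 0xFFFFFFFF  # sign-extend from `from_bits` at i32
    return wide & 0xFFFF                            # truncate back to i16


# 0x80 sign-extended from 8 bits inside a 16-bit register is 0xFF80 either way.
assert sext_inreg_via_shifts(0x80, 8) == sext_inreg_widened(0x80, 8) == 0xFF80
```

Both functions agree for every 16-bit input; the widened form simply avoids the shift pair at the narrow width.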

  Commit: 74c3025dd518aae01db5fbbd06b81c8ad272f959
      https://github.com/llvm/llvm-project/commit/74c3025dd518aae01db5fbbd06b81c8ad272f959
  Author: Orlando Cazalet-Hyams <orlando.hyams at sony.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M llvm/lib/Transforms/Utils/SimplifyCFG.cpp
    A llvm/test/DebugInfo/KeyInstructions/Generic/simplifycfg-thread-phi.ll

  Log Message:
  -----------
  [KeyInstr][SimplifyCFG] Remap atoms after duplication for threading (#133484)

Given the same branch condition in `a` and `c` SimplifyCFG converts:

        +> b -+
        |     v
    --> a --> c --> e -->
              |     ^
              +> d -+
into:

        +--> bcd ---+
        |           v
    --> a --> c --> e -->

Remap source atoms on instructions duplicated from `c` into `bcd`.

RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

  Commit: a13c0b67708173b8033a53ff6ae4c46c5b80bb2b
      https://github.com/llvm/llvm-project/commit/a13c0b67708173b8033a53ff6ae4c46c5b80bb2b
  Author: Kiran Chandramohan <kiran.chandramohan at arm.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M flang/include/flang/Parser/dump-parse-tree.h
    M flang/include/flang/Parser/parse-tree.h
    M flang/lib/Lower/OpenMP/OpenMP.cpp
    M flang/lib/Parser/openmp-parsers.cpp
    M flang/lib/Parser/unparse.cpp
    M flang/lib/Semantics/check-omp-structure.cpp
    M flang/lib/Semantics/check-omp-structure.h
    M flang/lib/Semantics/resolve-names.cpp
    A flang/test/Lower/OpenMP/Todo/declare-variant.f90
    A flang/test/Parser/OpenMP/declare-variant.f90
    A flang/test/Semantics/OpenMP/declare-variant.f90
    M llvm/include/llvm/Frontend/OpenMP/OMP.td

  Log Message:
  -----------
  [Flang][OpenMP] Add frontend support for declare variant (#130578)

Support is added for parsing. Basic semantics support is added to
forward the code to Lowering. Lowering will emit a TODO error. Detailed
semantic checks and lowering are future work.

  Commit: b643a529dcd2b1b2e4e81c3be427edfcadc6d8fa
      https://github.com/llvm/llvm-project/commit/b643a529dcd2b1b2e4e81c3be427edfcadc6d8fa
  Author: Pavel Labath <pavel at labath.sk>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M lldb/tools/debugserver/source/DNBTimer.h

  Log Message:
  -----------
  [lldb][debugserver] Add missing include to DNBTimer.h

  Commit: 47c7e73e5763f81f218cc4e1eae306d0427aa42d
      https://github.com/llvm/llvm-project/commit/47c7e73e5763f81f218cc4e1eae306d0427aa42d
  Author: David Spickett <david.spickett at linaro.org>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
    M lldb/packages/Python/lldbsuite/test/tools/lldb-dap/lldbdap_testcase.py
    M lldb/test/API/tools/lldb-dap/attach/TestDAP_attach.py
    M lldb/test/API/tools/lldb-dap/attach/TestDAP_attachByPortNum.py
    M lldb/test/API/tools/lldb-dap/breakpoint-events/TestDAP_breakpointEvents.py
    M lldb/test/API/tools/lldb-dap/completions/TestDAP_completions.py
    M lldb/test/API/tools/lldb-dap/console/TestDAP_console.py
    M lldb/test/API/tools/lldb-dap/disconnect/TestDAP_disconnect.py
    M lldb/test/API/tools/lldb-dap/evaluate/TestDAP_evaluate.py
    M lldb/test/API/tools/lldb-dap/launch/TestDAP_launch.py
    M lldb/test/API/tools/lldb-dap/progress/TestDAP_Progress.py
    M lldb/test/API/tools/lldb-dap/repl-mode/TestDAP_repl_mode_detection.py
    M lldb/test/API/tools/lldb-dap/restart/TestDAP_restart.py
    M lldb/test/API/tools/lldb-dap/restart/TestDAP_restart_runInTerminal.py
    M lldb/test/API/tools/lldb-dap/stop-hooks/TestDAP_stop_hooks.py
    M lldb/tools/lldb-dap/DAP.cpp
    M lldb/tools/lldb-dap/DAP.h
    M lldb/tools/lldb-dap/EventHelper.cpp
    M lldb/tools/lldb-dap/Handler/AttachRequestHandler.cpp
    M lldb/tools/lldb-dap/Handler/ConfigurationDoneRequestHandler.cpp
    M lldb/tools/lldb-dap/Handler/InitializeRequestHandler.cpp
    M lldb/tools/lldb-dap/Handler/LaunchRequestHandler.cpp
    M lldb/tools/lldb-dap/Handler/RequestHandler.cpp
    M lldb/tools/lldb-dap/Handler/RequestHandler.h

  Log Message:
  -----------
  Revert "[lldb-dap] Change the launch sequence (#138219)"

This reverts commit ba29e60f9a2222bd5e883579bb78db13fc5a7588.

As it broke tests on Windows on Arm: https://lab.llvm.org/buildbot/#/builders/141/builds/8500

********************
Unresolved Tests (2):
  lldb-api :: tools/lldb-dap/completions/TestDAP_completions.py
  lldb-api :: tools/lldb-dap/startDebugging/TestDAP_startDebugging.py
********************
Timed Out Tests (1):
  lldb-api :: tools/lldb-dap/send-event/TestDAP_sendEvent.py
********************
Failed Tests (6):
  lldb-api :: tools/lldb-dap/console/TestDAP_console.py
  lldb-api :: tools/lldb-dap/console/TestDAP_redirection_to_console.py
  lldb-api :: tools/lldb-dap/launch/TestDAP_launch.py
  lldb-api :: tools/lldb-dap/stackTrace/TestDAP_stackTrace.py
  lldb-api :: tools/lldb-dap/stackTraceDisassemblyDisplay/TestDAP_stackTraceDisassemblyDisplay.py
  lldb-api :: tools/lldb-dap/variables/children/TestDAP_variables_children.py

  Commit: 18c5ad5c6c178365d270439742863e14c8981ea3
      https://github.com/llvm/llvm-project/commit/18c5ad5c6c178365d270439742863e14c8981ea3
  Author: Pavel Labath <pavel at labath.sk>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M lldb/include/lldb/Symbol/Block.h
    M lldb/source/Symbol/Block.cpp
    A lldb/test/Shell/Commands/command-disassemble-sections.s

  Log Message:
  -----------
  [lldb] Fix block address resolution for functions in multiple sections (#137955)

Continuing the theme from #116777 and #124931, this patch ensures we
compute the correct address when a function is spread across multiple
sections. Due to this, it's not sufficient to adjust the offset in the
section+offset pair (Address::Slide). We must actually slide the file
offset and then recompute the section using the result.
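
The difference between the two approaches can be sketched with a toy section table. This is only an illustration under assumptions: `Section`, `resolve`, and both slide helpers are hypothetical names, not lldb APIs, and the addresses are made up.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Section:
    name: str
    file_addr: int  # file address of the first byte
    size: int


def resolve(sections: List[Section], file_addr: int) -> Optional[Tuple[Section, int]]:
    """Map a file address to a (section, offset) pair, or None if unmapped."""
    for section in sections:
        if section.file_addr <= file_addr < section.file_addr + section.size:
            return section, file_addr - section.file_addr
    return None


def slide_offset_only(section: Section, offset: int, delta: int) -> Tuple[Section, int]:
    """Naive slide: keep the section and adjust only the offset."""
    return section, offset + delta


def slide_file_address(sections, section, offset, delta):
    """Slide the file address, then recompute the section from the result."""
    return resolve(sections, section.file_addr + offset + delta)


text = Section(".text", 0x1000, 0x100)
cold = Section(".text.cold", 0x2000, 0x100)

# A function starts at .text+0x10; one of its blocks lives 0x1010 bytes later.
naive_section, naive_offset = slide_offset_only(text, 0x10, 0x1010)
assert naive_offset >= naive_section.size  # offset runs past the end of .text
assert slide_file_address([text, cold], text, 0x10, 0x1010) == (cold, 0x20)
```

The naive result still names `.text` with an out-of-range offset, while the recomputed result lands in the section that actually contains the block.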

I found this out due to a failure to disassemble some parts of the
function, so I'm testing with that, although it's likely there are other
things that were broken due to this.

  Commit: 75e5643abf6b59db8dfae6b524e9c3c2ec0ffc29
      https://github.com/llvm/llvm-project/commit/75e5643abf6b59db8dfae6b524e9c3c2ec0ffc29
  Author: Tom Eccles <tom.eccles at arm.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M flang/docs/OpenMPSupport.md
    M flang/include/flang/Lower/ConvertVariable.h
    M flang/lib/Lower/ConvertVariable.cpp
    M flang/lib/Lower/OpenMP/OpenMP.cpp
    M flang/test/Lower/OpenMP/omp-declare-target-program-var.f90
    M flang/test/Lower/OpenMP/threadprivate-host-association-2.f90
    M flang/test/Lower/OpenMP/threadprivate-host-association-3.f90
    A flang/test/Lower/OpenMP/threadprivate-lenparams.f90
    M flang/test/Lower/OpenMP/threadprivate-non-global.f90

  Log Message:
  -----------
  [flang][OpenMP] share global variable initialization code (#138672)

Fixes #108136

In #108136 (the new testcase), flang was missing the length parameter
required for the variable length string when boxing the global variable.
The code that is initializing global variables for OpenMP did not
support types with length parameters.

Instead of duplicating this initialization logic in OpenMP, I decided to
use the exact same initialization as is used in the base language
because this will already be well tested and will be updated for any new
types. The difference for OpenMP is that the global variables will be
zero initialized instead of left undefined.

Previously `Fortran::lower::createGlobalInitialization` was used to
share a smaller amount of the logic with the base language lowering. I
think this bug has demonstrated that helper was too low level to be
helpful, and it was only used in OpenMP so I have made it static inside
of ConvertVariable.cpp.

  Commit: e3ee6bbd384ef4c583b9f7bca4253ae0fba90a70
      https://github.com/llvm/llvm-project/commit/e3ee6bbd384ef4c583b9f7bca4253ae0fba90a70
  Author: Lang Hames <lhames at gmail.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M llvm/lib/ExecutionEngine/JITLink/ELF_i386.cpp

  Log Message:
  -----------
  [JITLink][i386] Make ELFLinkGraphBuilder_i386 a regular (non-template) class.

The ELF type for i386 is always ELF32LE so we can pass ELF32LE directly to the
base class template (ELFLinkGraphBuilder).

  Commit: 01813e89295b9229760bc9a62926e04bfbe866c2
      https://github.com/llvm/llvm-project/commit/01813e89295b9229760bc9a62926e04bfbe866c2
  Author: Paul Walker <paul.walker at arm.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M clang/lib/Driver/ToolChains/CommonArgs.cpp
    M clang/test/Driver/fveclib.c
    M llvm/include/llvm/Analysis/TargetLibraryInfo.h
    M llvm/lib/Analysis/TargetLibraryInfo.cpp
    M llvm/lib/Frontend/Driver/CodeGenOptions.cpp
    M llvm/test/CodeGen/Generic/replace-intrinsics-with-veclib.ll
    M llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-VF2-VF8.ll
    M llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-finite.ll
    M llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls.ll
    M llvm/test/Transforms/Util/add-TLI-mappings.ll

  Log Message:
  -----------
  [LLVM][VecLib] Refactor LIBMVEC integration to be target neutral. (#138262)

Renames LIBMVEC-X86 to LIBMVEC and updates TLI to only add the existing
x86 specific mapping when targeting x86.

  Commit: 62385b848757f2dc35070eadb2ccd921508497dc
      https://github.com/llvm/llvm-project/commit/62385b848757f2dc35070eadb2ccd921508497dc
  Author: David Spickett <david.spickett at linaro.org>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M lldb/docs/resources/debugging.rst

  Log Message:
  -----------
  [lldb][docs] Correct spelling in debugging doc

  Commit: c3ce5684a8b408220eed983d065edba0e6ed5016
      https://github.com/llvm/llvm-project/commit/c3ce5684a8b408220eed983d065edba0e6ed5016
  Author: Aniket Lal <lalaniket8 at gmail.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M clang/lib/CodeGen/CodeGenModule.cpp
    M clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
    M clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl
    M clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl
    M clang/test/CodeGenOpenCL/convergent.cl
    M clang/test/CodeGenOpenCL/enqueue-kernel-non-entry-block.cl

  Log Message:
  -----------
  [Clang][OpenCL][AMDGPU]  OpenCL Kernel stubs should be assigned alwaysinline attribute (#137769)

The body of an OpenCL kernel is emitted as a stub, and the kernel itself is
emitted as a call to that stub
(https://github.com/llvm/llvm-project/pull/115821).
The stub function should be marked alwaysinline, since the call to the stub
can cause a performance drop.

Co-authored-by: anikelal <anikelal at amd.com>

  Commit: 2fb288d4b8e0fb6c08a1a72b64cbf6a0752fdac7
      https://github.com/llvm/llvm-project/commit/2fb288d4b8e0fb6c08a1a72b64cbf6a0752fdac7
  Author: Kareem Ergawy <kareem.ergawy at amd.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M flang/lib/Lower/Bridge.cpp
    M flang/lib/Optimizer/Builder/FIRBuilder.cpp
    M flang/test/Lower/do_concurrent.f90
    M flang/test/Lower/do_concurrent_local_default_init.f90
    M flang/test/Lower/loops.f90
    M flang/test/Lower/loops3.f90
    M flang/test/Lower/nsw.f90
    M flang/test/Transforms/DoConcurrent/basic_host.f90
    M flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90
    M flang/test/Transforms/DoConcurrent/loop_nest_test.f90
    M flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90
    M flang/test/Transforms/DoConcurrent/non_const_bounds.f90
    M flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90

  Log Message:
  -----------
  [flang][fir] Lower `do concurrent` loop nests to `fir.do_concurrent` (#137928)

Adds support for lowering `do concurrent` nests from PFT to the new
`fir.do_concurrent` MLIR op as well as its special terminator
`fir.do_concurrent.loop` which models the actual loop nest.

To that end, this PR emits the allocations for the iteration variables
within the block of the `fir.do_concurrent` op and creates a region for
the `fir.do_concurrent.loop` op that accepts arguments equal in number
to the number of the input `do concurrent` iteration ranges.

For example, given the following input:
```fortran
   do concurrent(i=1:10, j=11:20)
   end do
```
the changes in this PR emit the following MLIR:
```mlir
    fir.do_concurrent {
      %22 = fir.alloca i32 {bindc_name = "i"}
      %23:2 = hlfir.declare %22 {uniq_name = "_QFsub1Ei"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
      %24 = fir.alloca i32 {bindc_name = "j"}
      %25:2 = hlfir.declare %24 {uniq_name = "_QFsub1Ej"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
      fir.do_concurrent.loop (%arg1, %arg2) = (%18, %20) to (%19, %21) step (%c1, %c1_0) {
        %26 = fir.convert %arg1 : (index) -> i32
        fir.store %26 to %23#0 : !fir.ref<i32>
        %27 = fir.convert %arg2 : (index) -> i32
        fir.store %27 to %25#0 : !fir.ref<i32>
      }
    }
```

  Commit: 5be080edf73abd9d980ced8a432aaf2861d4445e
      https://github.com/llvm/llvm-project/commit/5be080edf73abd9d980ced8a432aaf2861d4445e
  Author: Orlando Cazalet-Hyams <orlando.hyams at sony.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M llvm/include/llvm/IR/DebugInfoMetadata.h
    M llvm/lib/Transforms/Utils/InlineFunction.cpp
    A llvm/test/DebugInfo/KeyInstructions/Generic/inline-nodbg.ll

  Log Message:
  -----------
  [KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (#133485)

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

  Commit: 2f877c2722e882fe6aaaab44d25b7a49ba0612e1
      https://github.com/llvm/llvm-project/commit/2f877c2722e882fe6aaaab44d25b7a49ba0612e1
  Author: Mehdi Amini <joker.eph at gmail.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M mlir/test/IR/invalid-custom-print-parse.mlir
    M mlir/tools/mlir-tblgen/OpFormatGen.cpp

  Log Message:
  -----------
  [MLIR] Check that the prop-dict dictionary does not have extra unknown entries (#138668)

At the moment we would just ignore them, which can be surprising and is
error prone (a typo for a unit attribute flag for example).

  Commit: c02aa91939d174a1efda934706d7b523b2fb7e31
      https://github.com/llvm/llvm-project/commit/c02aa91939d174a1efda934706d7b523b2fb7e31
  Author: Mehdi Amini <joker.eph at gmail.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M mlir/include/mlir/IR/BuiltinTypes.td
    M mlir/lib/Dialect/Affine/Utils/LoopUtils.cpp
    M mlir/lib/IR/BuiltinTypes.cpp
    M mlir/lib/IR/TypeDetail.h
    M mlir/test/Dialect/Vector/vector-warp-distribute.mlir
    M mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp

  Log Message:
  -----------
  Revert "[mlir][MemRef] Remove integer address space builders" (#138853)

Reverts llvm/llvm-project#138579

An integration test is broken on the mlir-nvidia* bots.

  Commit: 7157228667396f1c113a96e9e9ecb9f0ca82a645
      https://github.com/llvm/llvm-project/commit/7157228667396f1c113a96e9e9ecb9f0ca82a645
  Author: Tomohiro Kashiwada <kikairoya at gmail.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
    M llvm/test/CodeGen/X86/mingw-comdats-xdata.ll
    M llvm/test/CodeGen/X86/mingw-comdats.ll

  Log Message:
  -----------
  [Cygwin] Emit COMDAT name correctly for Cygwin (#138621)

Cygwin-gcc emits COMDAT in the same format as MinGW-gcc.

  Commit: a83bb35e9989f9d27bb6c0578caa4183b8cbefdc
      https://github.com/llvm/llvm-project/commit/a83bb35e9989f9d27bb6c0578caa4183b8cbefdc
  Author: Kareem Ergawy <kareem.ergawy at amd.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M flang/include/flang/Optimizer/Dialect/FIRAttr.td
    M flang/include/flang/Optimizer/Dialect/FIROps.td
    M flang/lib/Optimizer/Dialect/FIROps.cpp
    M flang/test/Fir/do_concurrent.fir
    M flang/test/Fir/invalid.fir

  Log Message:
  -----------
  [flang][fir] Add `fir.local` op for locality specifiers (#138505)

Adds a new `fir.local` op to model `local` and `local_init` locality
specifiers. This op is a clone of `omp.private`. In particular, this new
op also models the privatization/localization logic of an SSA value in
the `fir` dialect just like `omp.private` does for OpenMP.

PR stack:
- https://github.com/llvm/llvm-project/pull/137928
- https://github.com/llvm/llvm-project/pull/138505 (this PR)
- https://github.com/llvm/llvm-project/pull/138506
- https://github.com/llvm/llvm-project/pull/138512
- https://github.com/llvm/llvm-project/pull/138534
- https://github.com/llvm/llvm-project/pull/138816

  Commit: c3a638caabf96fedce09f4b58b4ba550a015e150
      https://github.com/llvm/llvm-project/commit/c3a638caabf96fedce09f4b58b4ba550a015e150
  Author: Pierre van Houtryve <pierre.vanhoutryve at amd.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
    A llvm/test/CodeGen/AMDGPU/GlobalISel/selected-inst-flags.mir

  Log Message:
  -----------
  [GlobalISel] Fix silently dropped MIFlags on selected instructions (#138851)

We used uint16 for flags but flags now go up to 24 bits, so all flags in bits 16-24 were lost.

Fixes #110801
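
The loss is ordinary integer truncation. A minimal sketch, assuming a hypothetical flag at bit 20 (the bit positions here are illustrative, not the real MIFlag values):

```python
def store_flags(flags: int, storage_bits: int) -> int:
    """Model storing a flag mask in an unsigned field of `storage_bits` bits."""
    return flags & ((1 << storage_bits) - 1)


LOW_FLAG = 1 << 3    # hypothetical flag that fits in 16 bits
HIGH_FLAG = 1 << 20  # hypothetical flag above bit 15

flags = LOW_FLAG | HIGH_FLAG
assert store_flags(flags, 16) == LOW_FLAG  # the high flag is silently dropped
assert store_flags(flags, 32) == flags     # a wider field keeps both
```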

  Commit: c22081c320340d0e7542b247ee093ca515509b52
      https://github.com/llvm/llvm-project/commit/c22081c320340d0e7542b247ee093ca515509b52
  Author: Pierre van Houtryve <pierre.vanhoutryve at amd.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M llvm/lib/CodeGen/GlobalISel/InlineAsmLowering.cpp
    M llvm/test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll
    M llvm/test/CodeGen/AArch64/arm64-preserve-all.ll
    M llvm/test/CodeGen/AArch64/arm64-preserve-most.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/inline-asm-lowering-errors.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/inline-asm-mismatched-size.ll

  Log Message:
  -----------
  [GlobalISel] Diagnose inline assembly constraint lowering errors (#135782)

Instead of printing something to dbgs (which is not visible to all users),
emit a diagnostic like the DAG does. We still crash later because we fail to
select the inline assembly, but at least now users will know why it's crashing.

In a future patch we could also recover from the error like the DAG does, so the
lowering can keep going until it either crashes or gives a different error later.

  Commit: 17b2b6ddef4b1dc74a4b459d06510c25fa883329
      https://github.com/llvm/llvm-project/commit/17b2b6ddef4b1dc74a4b459d06510c25fa883329
  Author: Aniket Lal <lalaniket8 at gmail.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M clang/test/CodeGenOpenCL/opencl-kernel-call.cl

  Log Message:
  -----------
  [Clang][OpenCL][AMDGPU] Add tests for optnone attribute assigned to OpenCL Kernels (#138849)

OpenCL kernel stubs should always be inlined
(https://github.com/llvm/llvm-project/pull/137769).
If optnone is assigned to a kernel, the respective stub should not be
assigned alwaysinline; this patch adds a test for that case.

Co-authored-by: anikelal <anikelal at amd.com>

  Commit: c7b2d98c934c9578dd880370905b5abafdeccbe3
      https://github.com/llvm/llvm-project/commit/c7b2d98c934c9578dd880370905b5abafdeccbe3
  Author: Kees Cook <kees at kernel.org>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M clang/docs/SanitizerCoverage.rst
    M clang/include/clang/Basic/CodeGenOptions.def
    M clang/include/clang/Driver/Options.td
    M clang/include/clang/Driver/SanitizerArgs.h
    M clang/lib/CodeGen/BackendUtil.cpp
    M clang/lib/Driver/SanitizerArgs.cpp
    M clang/test/Driver/fsanitize-coverage.c
    M llvm/include/llvm/Transforms/Utils/Instrumentation.h
    M llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
    A llvm/test/Instrumentation/SanitizerCoverage/stack-depth-callback.ll

  Log Message:
  -----------
  [sancov] Introduce optional callback for stack-depth tracking (#138323)

Normally -fsanitize-coverage=stack-depth inserts inline arithmetic to
update thread_local __sancov_lowest_stack. To support stack depth
tracking in the Linux kernel, which does not implement traditional
thread_local storage, provide the option to call a function instead.

This matches the existing "stackleak" implementation that is supported
in Linux via a GCC plugin. To make this coverage more performant, a
minimum estimated stack depth can be chosen to enable the callback mode,
skipping instrumentation of functions with smaller stacks.

With -fsanitize-coverage-stack-depth-callback-min set greater than 0,
the __sanitize_cov_stack_depth() callback will be injected when the
estimated stack depth is greater than or equal to the given minimum.
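
Read literally, the description above amounts to a three-way choice per function. The sketch below is one interpretation of that text, not the pass's implementation; `StackDepthMode` and `choose_mode` are hypothetical names.

```python
from enum import Enum


class StackDepthMode(Enum):
    INLINE = "inline update of __sancov_lowest_stack"
    CALLBACK = "call to __sanitize_cov_stack_depth()"
    SKIP = "no stack-depth instrumentation"


def choose_mode(estimated_stack_depth: int, callback_min: int) -> StackDepthMode:
    """Pick the instrumentation for one function.

    `callback_min` stands for -fsanitize-coverage-stack-depth-callback-min.
    """
    if callback_min <= 0:
        return StackDepthMode.INLINE    # default behaviour: inline arithmetic
    if estimated_stack_depth >= callback_min:
        return StackDepthMode.CALLBACK  # large enough stack: inject the callback
    return StackDepthMode.SKIP          # smaller stacks are not instrumented


assert choose_mode(64, 0) is StackDepthMode.INLINE
assert choose_mode(128, 100) is StackDepthMode.CALLBACK
assert choose_mode(64, 100) is StackDepthMode.SKIP
```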

  Commit: 0db040576d4ccb313fc58a90e1b4149f7589cc8c
      https://github.com/llvm/llvm-project/commit/0db040576d4ccb313fc58a90e1b4149f7589cc8c
  Author: pvanhout <pierre.vanhoutryve at amd.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M llvm/lib/CodeGen/GlobalISel/InlineAsmLowering.cpp
    M llvm/test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll
    M llvm/test/CodeGen/AArch64/arm64-preserve-all.ll
    M llvm/test/CodeGen/AArch64/arm64-preserve-most.ll
    R llvm/test/CodeGen/AMDGPU/GlobalISel/inline-asm-lowering-errors.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/inline-asm-mismatched-size.ll

  Log Message:
  -----------
  Revert "[GlobalISel] Diagnose inline assembly constraint lowering errors (#135782)"

This reverts commit c22081c320340d0e7542b247ee093ca515509b52.

  Commit: 21501d1cf290a63760904fb125e77b432db49933
      https://github.com/llvm/llvm-project/commit/21501d1cf290a63760904fb125e77b432db49933
  Author: Pavel Labath <pavel at labath.sk>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M lldb/include/lldb/Target/Target.h
    M lldb/source/Plugins/LanguageRuntime/CPlusPlus/ItaniumABI/ItaniumABILanguageRuntime.cpp
    M lldb/source/Target/Target.cpp
    M lldb/test/API/lang/cpp/dynamic-value/TestDynamicValue.py

  Log Message:
  -----------
  [lldb] Fix dynamic type resolutions for core files (#138698)

We're reading from the object's vtable to determine the pointer to the
full object. The vtable is normally in the "rodata" section of the
executable, which is often not included in the core file because it's
not supposed to change and the debugger can extrapolate its contents
from the executable file. We weren't doing that.

This patch changes the read operation to use the target class (which
falls back onto the executable module as expected) and adds the missing
ReadSignedIntegerFromMemory API. The fix is tested by creating a core
(minidump) file which deliberately omits the vtable pointer.
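
The fallback can be pictured with a toy memory model. Everything here is an assumption made for illustration: the `Target` class, `read_region`, the addresses, and the stored value are hypothetical and do not mirror lldb's classes.

```python
from typing import Dict, Optional


def read_region(regions: Dict[int, bytes], addr: int, size: int) -> Optional[bytes]:
    """Return `size` bytes at `addr` if one region fully covers them."""
    for base, data in regions.items():
        if base <= addr and addr + size <= base + len(data):
            return data[addr - base : addr - base + size]
    return None


class Target:
    """Toy target: memory from the core file first, then the executable."""

    def __init__(self, core_regions: Dict[int, bytes], executable_sections: Dict[int, bytes]):
        self.core_regions = core_regions
        self.executable_sections = executable_sections

    def read_memory(self, addr: int, size: int) -> Optional[bytes]:
        data = read_region(self.core_regions, addr, size)
        if data is None:  # not captured in the core file: fall back to the file
            data = read_region(self.executable_sections, addr, size)
        return data

    def read_signed_integer(self, addr: int, size: int) -> Optional[int]:
        data = self.read_memory(addr, size)
        return None if data is None else int.from_bytes(data, "little", signed=True)


# The core file holds the heap object but not the read-only vtable data.
rodata = {0x4000: (-16).to_bytes(8, "little", signed=True)}
core = {0x9000: bytes(32)}
target = Target(core, rodata)

assert read_region(core, 0x4000, 8) is None          # core-only read fails
assert target.read_signed_integer(0x4000, 8) == -16  # target-level read succeeds
```

The signed read matters because the value taken from the vtable can be negative.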

  Commit: 7c5f5f3ef83b1d1d43d63862a8431af3dded15bb
      https://github.com/llvm/llvm-project/commit/7c5f5f3ef83b1d1d43d63862a8431af3dded15bb
  Author: Pavel Labath <pavel at labath.sk>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M lldb/source/Host/windows/PipeWindows.cpp
    M lldb/source/Host/windows/ProcessLauncherWindows.cpp
    M lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunication.cpp
    M lldb/tools/lldb-server/lldb-platform.cpp
    M lldb/unittests/Host/HostTest.cpp

  Log Message:
  -----------
  [lldb] Inherit DuplicateFileAction(HANDLE, HANDLE) handles on windows (#137978)

This is a follow-up to https://github.com/llvm/llvm-project/pull/126935,
which enables passing handles to a child
process on windows systems. Unlike on unix-like systems, the handles
need to be created with the "inheritable" flag because there's no way to
change the flag value after it has been created. This is why I don't
respect the child_process_inherit flag but rather always set the flag to
true. (My next step is to delete the flag entirely.)

This does mean that a pipe may be created as inheritable even if it's not
necessary, but I think this is offset by the fact that windows (unlike
unixes, which pass all ~O_CLOEXEC descriptors through execve and *all*
descriptors through fork) has a way to specify the precise set of
handles to pass to a specific child process.

If this turns out to be insufficient, instead of a constructor flag, I'd
rather go with creating a separate api to create an inheritable copy of
a handle (as typically, you only want to inherit one end of the pipe).

  Commit: d865f32fe820f543f0a53bfeba08774f2c270589
      https://github.com/llvm/llvm-project/commit/d865f32fe820f543f0a53bfeba08774f2c270589
  Author: Pavel Labath <pavel at labath.sk>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M lldb/include/lldb/Symbol/DWARFCallFrameInfo.h
    M lldb/source/Symbol/DWARFCallFrameInfo.cpp
    M lldb/source/Symbol/FuncUnwinders.cpp
    M lldb/source/Symbol/UnwindTable.cpp
    M lldb/source/Target/RegisterContextUnwind.cpp
    M lldb/test/Shell/Unwind/Inputs/basic-block-sections-with-dwarf.s
    M lldb/test/Shell/Unwind/basic-block-sections-with-dwarf-static.test
    M lldb/unittests/Symbol/TestDWARFCallFrameInfo.cpp

  Log Message:
  -----------
  [lldb] Parse DWARF CFI for discontinuous functions (#137006)

This patch uses the previously built infrastructure to parse multiple
FDE entries into a single unwind plan. There is one catch though: we
parse only one FDE entry per unwind range. This is not fully correct
because lldb coalesces adjacent address ranges, which means that
something that originally looked like two separate address ranges (and
two FDE entries) may get merged into one if the linker decides
to put the two ranges next to each other. In this case, we will ignore
the second FDE entry.
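
The catch can be reproduced with a small model of range coalescing. This is a simplified sketch with hypothetical helper names and made-up addresses; FDEs are represented only by their `[start, end)` ranges, and the lookup rule (pick the FDE covering the start of each range) is an assumption.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end)


def coalesce(ranges: List[Range]) -> List[Range]:
    """Merge ranges that touch end-to-start."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


def fdes_parsed(function_ranges: List[Range], fdes: List[Range]) -> List[Range]:
    """Parse one FDE per coalesced range: the one covering the range's start."""
    picked = []
    for start, _end in coalesce(function_ranges):
        for fde in fdes:
            if fde[0] <= start < fde[1]:
                picked.append(fde)
                break
    return picked


# Three function parts, each with its own FDE; the linker placed two side by side.
parts = [(0x1000, 0x1010), (0x1010, 0x1020), (0x3000, 0x3010)]

assert coalesce(parts) == [(0x1000, 0x1020), (0x3000, 0x3010)]
assert fdes_parsed(parts, fdes=parts) == [(0x1000, 0x1010), (0x3000, 0x3010)]
```

The FDE for `[0x1010, 0x1020)` is never consulted, because its range was merged into the preceding one.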

It would be more correct to try to parse another entry when the one we
found turns out to be short, but I'm not doing this (yet), because:
- this is how we've done things so far (although, monolithic functions
are unlikely to have more than one FDE entry)
- in cases where we don't have debug info or (full) symbol tables, we
can end up with "symbols" which appear to span many megabytes
(potentially, the whole module). If we tried to fill short FDE entries,
we could end up parsing the entire eh_frame section in a single go. In a
way, this would be more correct, but it would also probably be very
slow.

I haven't quite decided what to do about this case yet, though it's not
particularly likely to happen in the "production" cases as typically the
functions are split into two parts (hot/cold) instead of one part per
basic block.

  Commit: 5dd1421da6c60700f2cb81a13fb5231bb965f0a6
      https://github.com/llvm/llvm-project/commit/5dd1421da6c60700f2cb81a13fb5231bb965f0a6
  Author: Shilei Tian <i at tianshilei.me>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp

  Log Message:
  -----------
  [NFC] Fix a compile warning of comparison of integers of different signs

  Commit: 3feb8b42e973f935883bc9e779645ecdae1a586d
      https://github.com/llvm/llvm-project/commit/3feb8b42e973f935883bc9e779645ecdae1a586d
  Author: Alex Voicu <alexandru.voicu at amd.com>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M clang/docs/HIPSupport.rst
    M clang/lib/Frontend/InitPreprocessor.cpp
    M clang/test/Preprocessor/predefined-macros.c
    M llvm/lib/Transforms/HipStdPar/HipStdPar.cpp
    M llvm/test/Transforms/HipStdPar/allocation-interposition.ll

  Log Message:
  -----------
  [HIP][HIPSTDPAR] Re-work allocation interposition for `hipstdpar` (#138790)

The allocation interposition mode had a number of issues, which are
primarily addressed in the library component via
<https://github.com/ROCm/rocThrust/pull/543>. However, it is necessary
to interpose some additional symbols, which this patch does.
Furthermore, to implement this in a compatible way, we guard the new
implementation under a V1 macro, which is defined in addition to the
existing `__HIPSTDPAR_INTERPOSE_ALLOC__` one.

  Commit: 4ad60d538e92d575e5b882003da7df3b06c6fdd6
      https://github.com/llvm/llvm-project/commit/4ad60d538e92d575e5b882003da7df3b06c6fdd6
  Author: Iris Shi <0.0 at owo.li>
  Date:   2025-05-07 (Wed, 07 May 2025)

  Changed paths:
    M clang/docs/HIPSupport.rst
    M clang/docs/SanitizerCoverage.rst
    M clang/include/clang/Basic/CodeGenOptions.def
    M clang/include/clang/Driver/Options.td
    M clang/include/clang/Driver/SanitizerArgs.h
    M clang/lib/CodeGen/BackendUtil.cpp
    M clang/lib/CodeGen/CodeGenModule.cpp
    M clang/lib/Driver/SanitizerArgs.cpp
    M clang/lib/Driver/ToolChains/CommonArgs.cpp
    M clang/lib/Frontend/InitPreprocessor.cpp
    M clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
    M clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl
    M clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl
    M clang/test/CodeGenOpenCL/convergent.cl
    M clang/test/CodeGenOpenCL/enqueue-kernel-non-entry-block.cl
    M clang/test/CodeGenOpenCL/opencl-kernel-call.cl
    M clang/test/Driver/fsanitize-coverage.c
    M clang/test/Driver/fveclib.c
    M clang/test/Preprocessor/predefined-macros.c
    M flang/docs/OpenMPSupport.md
    M flang/include/flang/Lower/ConvertVariable.h
    M flang/include/flang/Optimizer/Dialect/FIRAttr.td
    M flang/include/flang/Optimizer/Dialect/FIROps.td
    M flang/include/flang/Parser/dump-parse-tree.h
    M flang/include/flang/Parser/parse-tree.h
    M flang/lib/Lower/Bridge.cpp
    M flang/lib/Lower/ConvertVariable.cpp
    M flang/lib/Lower/OpenMP/OpenMP.cpp
    M flang/lib/Optimizer/Builder/FIRBuilder.cpp
    M flang/lib/Optimizer/Dialect/FIROps.cpp
    M flang/lib/Parser/openmp-parsers.cpp
    M flang/lib/Parser/unparse.cpp
    M flang/lib/Semantics/check-omp-structure.cpp
    M flang/lib/Semantics/check-omp-structure.h
    M flang/lib/Semantics/resolve-names.cpp
    M flang/test/Fir/do_concurrent.fir
    M flang/test/Fir/invalid.fir
    A flang/test/Lower/OpenMP/Todo/declare-variant.f90
    M flang/test/Lower/OpenMP/omp-declare-target-program-var.f90
    M flang/test/Lower/OpenMP/threadprivate-host-association-2.f90
    M flang/test/Lower/OpenMP/threadprivate-host-association-3.f90
    A flang/test/Lower/OpenMP/threadprivate-lenparams.f90
    M flang/test/Lower/OpenMP/threadprivate-non-global.f90
    M flang/test/Lower/do_concurrent.f90
    M flang/test/Lower/do_concurrent_local_default_init.f90
    M flang/test/Lower/loops.f90
    M flang/test/Lower/loops3.f90
    M flang/test/Lower/nsw.f90
    A flang/test/Parser/OpenMP/declare-variant.f90
    A flang/test/Semantics/OpenMP/declare-variant.f90
    M flang/test/Transforms/DoConcurrent/basic_host.f90
    M flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90
    M flang/test/Transforms/DoConcurrent/loop_nest_test.f90
    M flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90
    M flang/test/Transforms/DoConcurrent/non_const_bounds.f90
    M flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90
    M lldb/docs/resources/debugging.rst
    M lldb/include/lldb/Symbol/Block.h
    M lldb/include/lldb/Symbol/DWARFCallFrameInfo.h
    M lldb/include/lldb/Target/Target.h
    M lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
    M lldb/packages/Python/lldbsuite/test/tools/lldb-dap/lldbdap_testcase.py
    M lldb/source/Host/windows/PipeWindows.cpp
    M lldb/source/Host/windows/ProcessLauncherWindows.cpp
    M lldb/source/Plugins/LanguageRuntime/CPlusPlus/ItaniumABI/ItaniumABILanguageRuntime.cpp
    M lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunication.cpp
    M lldb/source/Symbol/Block.cpp
    M lldb/source/Symbol/DWARFCallFrameInfo.cpp
    M lldb/source/Symbol/FuncUnwinders.cpp
    M lldb/source/Symbol/UnwindTable.cpp
    M lldb/source/Target/RegisterContextUnwind.cpp
    M lldb/source/Target/Target.cpp
    M lldb/test/API/lang/cpp/dynamic-value/TestDynamicValue.py
    M lldb/test/API/tools/lldb-dap/attach/TestDAP_attach.py
    M lldb/test/API/tools/lldb-dap/attach/TestDAP_attachByPortNum.py
    M lldb/test/API/tools/lldb-dap/breakpoint-events/TestDAP_breakpointEvents.py
    M lldb/test/API/tools/lldb-dap/completions/TestDAP_completions.py
    M lldb/test/API/tools/lldb-dap/console/TestDAP_console.py
    M lldb/test/API/tools/lldb-dap/disconnect/TestDAP_disconnect.py
    M lldb/test/API/tools/lldb-dap/evaluate/TestDAP_evaluate.py
    M lldb/test/API/tools/lldb-dap/launch/TestDAP_launch.py
    M lldb/test/API/tools/lldb-dap/progress/TestDAP_Progress.py
    M lldb/test/API/tools/lldb-dap/repl-mode/TestDAP_repl_mode_detection.py
    M lldb/test/API/tools/lldb-dap/restart/TestDAP_restart.py
    M lldb/test/API/tools/lldb-dap/restart/TestDAP_restart_runInTerminal.py
    M lldb/test/API/tools/lldb-dap/stop-hooks/TestDAP_stop_hooks.py
    A lldb/test/Shell/Commands/command-disassemble-sections.s
    M lldb/test/Shell/Unwind/Inputs/basic-block-sections-with-dwarf.s
    M lldb/test/Shell/Unwind/basic-block-sections-with-dwarf-static.test
    M lldb/tools/debugserver/source/DNBTimer.h
    M lldb/tools/lldb-dap/DAP.cpp
    M lldb/tools/lldb-dap/DAP.h
    M lldb/tools/lldb-dap/EventHelper.cpp
    M lldb/tools/lldb-dap/Handler/AttachRequestHandler.cpp
    M lldb/tools/lldb-dap/Handler/ConfigurationDoneRequestHandler.cpp
    M lldb/tools/lldb-dap/Handler/InitializeRequestHandler.cpp
    M lldb/tools/lldb-dap/Handler/LaunchRequestHandler.cpp
    M lldb/tools/lldb-dap/Handler/RequestHandler.cpp
    M lldb/tools/lldb-dap/Handler/RequestHandler.h
    M lldb/tools/lldb-server/lldb-platform.cpp
    M lldb/unittests/Host/HostTest.cpp
    M lldb/unittests/Symbol/TestDWARFCallFrameInfo.cpp
    M llvm/include/llvm/Analysis/TargetLibraryInfo.h
    M llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
    M llvm/include/llvm/Frontend/OpenMP/OMP.td
    M llvm/include/llvm/IR/DebugInfoMetadata.h
    M llvm/include/llvm/Transforms/Utils/Instrumentation.h
    M llvm/lib/Analysis/TargetLibraryInfo.cpp
    M llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
    M llvm/lib/ExecutionEngine/JITLink/ELF_i386.cpp
    M llvm/lib/Frontend/Driver/CodeGenOptions.cpp
    M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
    M llvm/lib/Transforms/HipStdPar/HipStdPar.cpp
    M llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
    M llvm/lib/Transforms/Utils/InlineFunction.cpp
    M llvm/lib/Transforms/Utils/SimplifyCFG.cpp
    M llvm/test/CodeGen/AMDGPU/GlobalISel/ashr.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-abs.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ashr.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sext-inreg.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sext.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-smax.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-smin.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-smulh.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.abs.ll
    A llvm/test/CodeGen/AMDGPU/GlobalISel/selected-inst-flags.mir
    M llvm/test/CodeGen/AMDGPU/GlobalISel/sext_inreg.ll
    M llvm/test/CodeGen/AMDGPU/vector-reduce-smax.ll
    M llvm/test/CodeGen/AMDGPU/vector-reduce-smin.ll
    M llvm/test/CodeGen/Generic/replace-intrinsics-with-veclib.ll
    M llvm/test/CodeGen/X86/mingw-comdats-xdata.ll
    M llvm/test/CodeGen/X86/mingw-comdats.ll
    A llvm/test/DebugInfo/KeyInstructions/Generic/inline-nodbg.ll
    A llvm/test/DebugInfo/KeyInstructions/Generic/simplifycfg-thread-phi.ll
    A llvm/test/Instrumentation/SanitizerCoverage/stack-depth-callback.ll
    M llvm/test/Transforms/HipStdPar/allocation-interposition.ll
    M llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-VF2-VF8.ll
    M llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-finite.ll
    M llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls.ll
    M llvm/test/Transforms/Util/add-TLI-mappings.ll
    M mlir/include/mlir/IR/BuiltinTypes.td
    M mlir/lib/Dialect/Affine/Utils/LoopUtils.cpp
    M mlir/lib/IR/BuiltinTypes.cpp
    M mlir/lib/IR/TypeDetail.h
    M mlir/test/Dialect/Vector/vector-warp-distribute.mlir
    M mlir/test/IR/invalid-custom-print-parse.mlir
    M mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp
    M mlir/tools/mlir-tblgen/OpFormatGen.cpp

  Log Message:
  -----------
  Merge branch 'main' into users/el-ev/fix-frexp-fold

Compare: https://github.com/llvm/llvm-project/compare/d83bf26d9ba1...4ad60d538e92
