[all-commits] [llvm/llvm-project] 83b01a: [LoopIdiom] Support 'shift until less-than' idiom ...

Alexey Bataev via All-commits all-commits at lists.llvm.org
Mon Jul 8 08:42:22 PDT 2024


  Branch: refs/heads/users/alexey-bataev/spr/slpkeep-the-original-order-in-the-reductions
  Home:   https://github.com/llvm/llvm-project
  Commit: 83b01aaf51072a07261ee2e5fc14102f71273bc0
      https://github.com/llvm/llvm-project/commit/83b01aaf51072a07261ee2e5fc14102f71273bc0
  Author: Hari Limaye <hari.limaye at arm.com>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
    A llvm/test/Transforms/LoopIdiom/AArch64/ctlz.ll

  Log Message:
  -----------
  [LoopIdiom] Support 'shift until less-than' idiom (#95002)

The current loop idiom code for recognising and inserting a CTLZ
intrinsic does not support loops where the loopback control is based on
an unsigned less-than condition. This patch adds support for recognising
these loops and inserting a CTLZ intrinsic.

Fixes the missed optimization cases in #51064

---------

Co-authored-by: David Sherwood <david.sherwood at arm.com>


  Commit: da827d0896e5e66fe9130f8f4479537d3bbee1da
      https://github.com/llvm/llvm-project/commit/da827d0896e5e66fe9130f8f4479537d3bbee1da
  Author: Michael Buch <michaelbuch12 at gmail.com>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M lldb/source/Plugins/Language/CPlusPlus/LibCxxUnorderedMap.cpp
    M lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/iterator/TestDataFormatterLibccIterator.py
    M lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/iterator/main.cpp

  Log Message:
  -----------
  [lldb][DataFormatter] Simplify std::unordered_map::iterator formatter (#97754)

Depends on https://github.com/llvm/llvm-project/pull/97752

This patch changes the way we retrieve the key/value pair in the
`std::unordered_map::iterator` formatter (similar to how we are changing
it for `std::map::iterator` in
https://github.com/llvm/llvm-project/pull/97713, the motivations being
the same).

The old logic was not very easy to follow, and encoded the libc++ layout
in non-obvious ways. But mainly it was also fragile to alignment
miscalculations (https://github.com/llvm/llvm-project/pull/97443); this
would break once the new layout of `std::unordered_map` landed as part
of https://github.com/llvm/llvm-project/issues/93069.

Instead, this patch simply casts the `__hash_iterator` to a
`__node_pointer` (which is what libc++ does too) and uses a
straightforward `GetChildMemberWithName("__value_")` to get to the
key/value we care about.

The `std::unordered_map` already does it this way, so we align the
iterator counterpart to do the same. We can eventually re-use the
core-part of the `std::unordered_map` and `std::unordered_map::iterator`
formatters. But it will be an easier to change to review once both
simplifications landed.


  Commit: f4f5e25467263adf60289ed0725d696368230b71
      https://github.com/llvm/llvm-project/commit/f4f5e25467263adf60289ed0725d696368230b71
  Author: Nikita Popov <npopov at redhat.com>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M llvm/include/llvm/Analysis/ValueLattice.h

  Log Message:
  -----------
  [ValueLattice] Add missing const qualifier (NFC)


  Commit: 5c40e561bbc102f47553732fcebc0876b45d68b2
      https://github.com/llvm/llvm-project/commit/5c40e561bbc102f47553732fcebc0876b45d68b2
  Author: Joseph Huber <huberjn at outlook.com>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M libc/include/llvm-libc-macros/math-macros.h
    M libc/src/math/amdgpu/CMakeLists.txt
    M libc/src/math/nvptx/CMakeLists.txt
    M libc/src/math/nvptx/llrint.cpp
    M libc/src/math/nvptx/llrintf.cpp
    M libc/src/math/nvptx/lrint.cpp

  Log Message:
  -----------
  [libc] Make GPU `libm` use generic implementations (#98014)

Summary:
This patch moves a lot of the old vendor implementations to the new
generic math functions. Previously a lot of these were done through the
vendor functions, but the long term goal is to completely phase these
out. In order to make the tests pass I had to disable exceptions so they
only perform functional tests.


  Commit: 7e054c33d42b0a6bc8f6b168dab688f8e7762ef0
      https://github.com/llvm/llvm-project/commit/7e054c33d42b0a6bc8f6b168dab688f8e7762ef0
  Author: Simon Pilgrim <llvm-dev at redking.me.uk>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/VectorCombine.cpp
    M llvm/test/Transforms/VectorCombine/X86/shuffle-of-casts.ll

  Log Message:
  -----------
  [VectorCombine] foldShuffleOfCastops - don't restrict to oneuse but compare total costs instead

Some casts (especially bitcasts but others as well) are incredibly cheap (or free), so don't limit the shuffle(cast(x),cast(y)) -> cast(shuffle(x,y)) to oneuse cases, but instead compare the total before/after costs of possibly repeating some casts.


  Commit: c9ee6b1977e7dc88e3bd89b5e361c703721711fd
      https://github.com/llvm/llvm-project/commit/c9ee6b1977e7dc88e3bd89b5e361c703721711fd
  Author: lntue <35648136+lntue at users.noreply.github.com>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M libc/config/darwin/arm/entrypoints.txt
    M libc/config/gpu/entrypoints.txt
    M libc/config/linux/aarch64/entrypoints.txt
    M libc/config/linux/arm/entrypoints.txt
    M libc/config/linux/riscv/entrypoints.txt
    M libc/config/linux/x86_64/entrypoints.txt
    M libc/config/windows/entrypoints.txt
    M libc/docs/math/index.rst
    M libc/spec/stdc.td
    M libc/src/__support/FPUtil/CMakeLists.txt
    M libc/src/__support/FPUtil/FEnvImpl.h
    M libc/src/math/CMakeLists.txt
    A libc/src/math/cbrtf.h
    M libc/src/math/generic/CMakeLists.txt
    A libc/src/math/generic/cbrtf.cpp
    M libc/test/src/math/CMakeLists.txt
    A libc/test/src/math/cbrtf_test.cpp
    M libc/test/src/math/exhaustive/CMakeLists.txt
    A libc/test/src/math/exhaustive/cbrtf_test.cpp
    M libc/test/src/math/exhaustive/exhaustive_test.h
    M libc/test/src/math/smoke/CMakeLists.txt
    A libc/test/src/math/smoke/cbrtf_test.cpp
    M libc/utils/MPFRWrapper/MPFRUtils.cpp
    M libc/utils/MPFRWrapper/MPFRUtils.h

  Log Message:
  -----------
  [libc][math] Implement cbrtf function correctly rounded to all rounding modes. (#97936)

Fixes https://github.com/llvm/llvm-project/issues/92874

Algorithm: Let `x = (-1)^s * 2^e * (1 + m)`.
- Step 1: Range reduction: reduce the exponent with:
```
  y = cbrt(x) = (-1)^s * 2^(floor(e/3)) * 2^((e % 3)/3) * (1 + m)^(1/3)
```
- Step 2: Use the first 4 bit fractional bits of `m` to look up for a
degree-7 polynomial approximation to:
```
  (1 + m)^(1/3) ~ 1 + m * P(m).
```
- Step 3: Perform the multiplication:
```
  2^((e % 3)/3) * (1 + m)^(1/3).
```
- Step 4: Check for exact cases to prevent rounding and clear
`FE_INEXACT` floating point exception.
- Step 5: Combine with the exponent and sign before converting down to
`float` and return.


  Commit: d3a5589684baed14e82e57ca31fd4e988cb1436d
      https://github.com/llvm/llvm-project/commit/d3a5589684baed14e82e57ca31fd4e988cb1436d
  Author: Joseph Huber <huberjn at outlook.com>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M libc/src/math/generic/tan.cpp

  Log Message:
  -----------
  [libc] Add maybe_unused for functions only used on slow path (#98024)

Summary:
When the fast math options are enabled these functions are uncalled,
which makes it error.


  Commit: 12d6832d86156904aecc10e8612bd77b66aeef51
      https://github.com/llvm/llvm-project/commit/12d6832d86156904aecc10e8612bd77b66aeef51
  Author: Nikita Popov <npopov at redhat.com>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M llvm/lib/Transforms/Utils/SCCPSolver.cpp

  Log Message:
  -----------
  [SCCP] Skip bitcasts entirely

The only bitcasts the existing code might be able to handle are
bitcasts between iN and <1 x iN>. Don't bother.


  Commit: 727ecaf7d16d44ecf434284c60ad65b22d814092
      https://github.com/llvm/llvm-project/commit/727ecaf7d16d44ecf434284c60ad65b22d814092
  Author: jeanPerier <jperier at nvidia.com>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M flang/include/flang/Optimizer/Builder/IntrinsicCall.h
    M flang/lib/Lower/ConvertCall.cpp
    M flang/lib/Optimizer/Builder/IntrinsicCall.cpp

  Log Message:
  -----------
  [flang] allow intrinsic module procedures to be implemented in Fortran (#97743)

Currently, all procedures from intrinsic modules that are not BIND(C)
are expected to be intercepted by the compiler in lowering and to have a
handler in IntrinsicCall.cpp.

As more "intrinsic" modules are being added (OpenMP, OpenACC, CUF, ...),
this requirement is preventing seamless implementation of intrinsic
modules in Fortran. Procedures from intrinsic modules are different from
generic intrinsics defined in section 16 of the standard. They are
declared in Fortran file seating in the intrinsic module directory and
inside the compiler they look like regular user call except for the
INTRINSIC attribute set on their module. So an easy implementation is
just to have the implementation done in Fortran and linked to the
runtime without any need for the compiler to necessarily understand and
handle these calls in special ways.

This patch splits the lookup and generation part of IntrinsicCall.cpp so
that it can be allowed to only intercept calls to procedure from
intrinsic module if they have a handler. Otherwise, the assumption is
that they should be implemented in Fortran.

Add explicit TODOs handler for the IEEE procedure that are known to not
yet been implemented and won't be implemented via Fortran code so that
this patch is an NFC for what is currently supported.

This patch also prevents doing two lookups in the intrinsic table (There
was one to get argument lowering rules, and another one to generate the
code).


  Commit: a5bfe20f6fd8957f870f8ffcf836d7737864e063
      https://github.com/llvm/llvm-project/commit/a5bfe20f6fd8957f870f8ffcf836d7737864e063
  Author: Jay Foad <jay.foad at amd.com>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M llvm/lib/Target/AMDGPU/MIMGInstructions.td

  Log Message:
  -----------
  [AMDGPU] Comment MIMGEncGfx12


  Commit: 9dca3ac2efb180398ef8e84bfa9f0ef283d0e6fd
      https://github.com/llvm/llvm-project/commit/9dca3ac2efb180398ef8e84bfa9f0ef283d0e6fd
  Author: Michael Buch <michaelbuch12 at gmail.com>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M lldb/source/Plugins/Language/CPlusPlus/LibCxxMap.cpp
    M lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/iterator/TestDataFormatterLibccIterator.py

  Log Message:
  -----------
  [lldb][DataFormatter] Simplify libc++ std::map::iterator formatter (#97713)

Depends on https://github.com/llvm/llvm-project/pull/97687

Similar to https://github.com/llvm/llvm-project/pull/97579, this patch
simplifies the way in which we retrieve the key/value pair of a
`std::map` (in this case of the `std::map::iterator`).

We do this for the same reason: not only was the old logic hard to
follow, and encoded the libc++ layout in non-obvious ways, it was also
fragile to alignment miscalculations
(https://github.com/llvm/llvm-project/pull/97443); this would break once
the new layout of std::map landed as part of
https://github.com/llvm/llvm-project/issues/93069.

Instead, this patch simply casts the `__iter_pointer` to the
`__node_pointer` and uses a straightforward
`GetChildMemberWithName("__value_")` to get to the key/value we care
about.

We can eventually re-use the core-part of the `std::map` and
`std::map::iterator` formatters. But it will be an easier to change to
review once both simplifications landed.


  Commit: 385118644ccabe27a634804c7db60734746c170f
      https://github.com/llvm/llvm-project/commit/385118644ccabe27a634804c7db60734746c170f
  Author: Alexey Bataev <a.bataev at outlook.com>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
    M llvm/test/DebugInfo/Generic/assignment-tracking/slp-vectorizer/merge-scalars.ll
    M llvm/test/Transforms/SLPVectorizer/AArch64/spillcost-di.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-add-ssat.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-add-usat.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-add.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-fix.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-fshl-rot.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-fshl.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-fshr-rot.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-fshr.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-mul.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-smax.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-smin.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-sub-ssat.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-sub-usat.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-sub.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-umax.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-umin.ll
    M llvm/test/Transforms/SLPVectorizer/X86/horizontal-list.ll
    M llvm/test/Transforms/SLPVectorizer/X86/shift-ashr.ll
    M llvm/test/Transforms/SLPVectorizer/X86/shift-lshr.ll
    M llvm/test/Transforms/SLPVectorizer/X86/shift-shl.ll

  Log Message:
  -----------
  [SLP]Remove operands upon marking instruction for deletion.

If the instruction is marked for deletion, better to drop all its
operands and mark them for deletion too (if allowed). It allows to have
more vectorizable patterns and generate less useless extractelement
instructions.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/97409


  Commit: 135fc8af27a99391972a6639f47e3943e0770e68
      https://github.com/llvm/llvm-project/commit/135fc8af27a99391972a6639f47e3943e0770e68
  Author: Alexey Bataev <a.bataev at outlook.com>
  Date:   2024-07-08 (Mon, 08 Jul 2024)

  Changed paths:
    M flang/include/flang/Optimizer/Builder/IntrinsicCall.h
    M flang/lib/Lower/ConvertCall.cpp
    M flang/lib/Optimizer/Builder/IntrinsicCall.cpp
    M libc/config/darwin/arm/entrypoints.txt
    M libc/config/gpu/entrypoints.txt
    M libc/config/linux/aarch64/entrypoints.txt
    M libc/config/linux/arm/entrypoints.txt
    M libc/config/linux/riscv/entrypoints.txt
    M libc/config/linux/x86_64/entrypoints.txt
    M libc/config/windows/entrypoints.txt
    M libc/docs/math/index.rst
    M libc/include/llvm-libc-macros/math-macros.h
    M libc/spec/stdc.td
    M libc/src/__support/FPUtil/CMakeLists.txt
    M libc/src/__support/FPUtil/FEnvImpl.h
    M libc/src/math/CMakeLists.txt
    M libc/src/math/amdgpu/CMakeLists.txt
    A libc/src/math/cbrtf.h
    M libc/src/math/generic/CMakeLists.txt
    A libc/src/math/generic/cbrtf.cpp
    M libc/src/math/generic/tan.cpp
    M libc/src/math/nvptx/CMakeLists.txt
    M libc/src/math/nvptx/llrint.cpp
    M libc/src/math/nvptx/llrintf.cpp
    M libc/src/math/nvptx/lrint.cpp
    M libc/test/src/math/CMakeLists.txt
    A libc/test/src/math/cbrtf_test.cpp
    M libc/test/src/math/exhaustive/CMakeLists.txt
    A libc/test/src/math/exhaustive/cbrtf_test.cpp
    M libc/test/src/math/exhaustive/exhaustive_test.h
    M libc/test/src/math/smoke/CMakeLists.txt
    A libc/test/src/math/smoke/cbrtf_test.cpp
    M libc/utils/MPFRWrapper/MPFRUtils.cpp
    M libc/utils/MPFRWrapper/MPFRUtils.h
    M lldb/source/Plugins/Language/CPlusPlus/LibCxxMap.cpp
    M lldb/source/Plugins/Language/CPlusPlus/LibCxxUnorderedMap.cpp
    M lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/iterator/TestDataFormatterLibccIterator.py
    M lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx/iterator/main.cpp
    M llvm/include/llvm/Analysis/ValueLattice.h
    M llvm/lib/Target/AMDGPU/MIMGInstructions.td
    M llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
    M llvm/lib/Transforms/Utils/SCCPSolver.cpp
    M llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
    M llvm/lib/Transforms/Vectorize/VectorCombine.cpp
    M llvm/test/DebugInfo/Generic/assignment-tracking/slp-vectorizer/merge-scalars.ll
    A llvm/test/Transforms/LoopIdiom/AArch64/ctlz.ll
    M llvm/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll
    M llvm/test/Transforms/SLPVectorizer/AArch64/spillcost-di.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-add-ssat.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-add-usat.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-add.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-fix.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-fshl-rot.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-fshl.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-fshr-rot.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-fshr.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-mul.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-smax.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-smin.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-sub-ssat.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-sub-usat.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-sub.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-umax.ll
    M llvm/test/Transforms/SLPVectorizer/X86/arith-umin.ll
    M llvm/test/Transforms/SLPVectorizer/X86/shift-ashr.ll
    M llvm/test/Transforms/SLPVectorizer/X86/shift-lshr.ll
    M llvm/test/Transforms/SLPVectorizer/X86/shift-shl.ll
    M llvm/test/Transforms/VectorCombine/X86/shuffle-of-casts.ll

  Log Message:
  -----------
  Rebase

Created using spr 1.3.5


Compare: https://github.com/llvm/llvm-project/compare/24233ed3ac1b...135fc8af27a9

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list