[llvm] [MLIR] Apply clang-tidy fixes for readability-container-size-empty in VectorTransforms.cpp (NFC) (PR #156358)

Mon Sep 1 11:05:45 PDT 2025

https://github.com/s-barannikov created https://github.com/llvm/llvm-project/pull/156358

[MLIR] Apply clang-tidy fixes for readability-container-size-empty in VectorTransforms.cpp (NFC)

[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in TensorOps.cpp (NFC)

[mlir][Transforms] Allow RemoveDeadValues to process a function whose the last block is not the exit. (#156123)

'processFuncOp' queries the number of returned values of a function
using the terminator of the last block's getNumOperands(). It presumes
the last block is the exit. It is not always the case. 
This patch fixes the bug by querying from FunctionInterfaceOp directly.

[TargetLowering] Only freeze LHS and RHS if they are used multiple times in expandABD (#156193)

Not all paths in expandABD are using LHS and RHS twice.

[MLIR] Apply clang-tidy fixes for bugprone-argument-comment in ArmSMEToSCF.cpp (NFC)

[NFC][ARM][MC] Rearrange decoder functions 2/N (#155464)

Move some of the non-static-decode functions to the end of the file.
Note: moving `ARMDisassembler::AddThumbPredicate` the same way causes
the diff to be non-trivial, so not doing that here.

[AArch64][SDAG] Add f16 -> i16 rounding NEON conversion intrinsics (#155851)

Add dedicated .i16.f16 formats for rounding NEON conversion intrinsics
in order to avoid issues with incorrect overflow behaviour caused by
using .i32.f16 formats to perform the same conversions.

Added intrinsic formats:
i16 @llvm.aarch64.neon.fcvtzs.i16.f16(half)
i16 @llvm.aarch64.neon.fcvtzu.i16.f16(half)
i16 @llvm.aarch64.neon.fcvtas.i16.f16(half)
i16 @llvm.aarch64.neon.fcvtau.i16.f16(half)
i16 @llvm.aarch64.neon.fcvtms.i16.f16(half)
i16 @llvm.aarch64.neon.fcvtmu.i16.f16(half)
i16 @llvm.aarch64.neon.fcvtns.i16.f16(half)
i16 @llvm.aarch64.neon.fcvtnu.i16.f16(half)
i16 @llvm.aarch64.neon.fcvtps.i16.f16(half)
i16 @llvm.aarch64.neon.fcvtpu.i16.f16(half)

Backend side of the solution to
https://github.com/llvm/llvm-project/issues/154343

---------

Signed-off-by: Kajetan Puchalski <kajetan.puchalski at arm.com>

[NFC][MC][DecoderEmitter] Simplify loop to find the best filter (#156237)

We can just use `max_element` on the array of filters.

[LV] Emit remarks for vectorization decision before execute (NFCI).

Move the emission of remarks for the vectorization decision before
executing the plan, in preparation for
https://github.com/llvm/llvm-project/pull/154510.

[MLIR] Apply clang-tidy fixes for readability-simplify-boolean-expr in LowerContractToNeonPatterns.cpp (NFC)

[AArch64] Add patterns for sub from add negative immediates (#156024)

`sub 3` will be canonicalized in llvm to `add -3`. This adds some
tablegen patterns for add from a negative immediate so that we can still
generate sub imm SVE instructions.

The alternative is to add a isel combine, which seemed to work but
created problems for mad and index patterns. This version does still
need to add a lower-than-default Complexity to the ComplexPatterns to
ensure that index was selected over sub-imm + index, as the default
Complexity on ComplexPatterns is quite high.

Fixes #155928

[LV] Correctly cost chains of replicating calls in legacy CM.

Check for scalarized calls in needsExtract to fix a divergence between
legacy and VPlan-based cost model.

The legacy cost model was missing a check for scalarized calls in
needsExtract, which meant if incorrectly assumed the result of a
scalarized call needs extracting.

Exposed by https://github.com/llvm/llvm-project/pull/154617.

Fixes https://github.com/llvm/llvm-project/issues/156091.

[mlir] Fix warnings

This patch fixes:

  mlir/lib/Dialect/Tensor/IR/TensorOps.cpp:3205:13: error: unused
  function 'printInferType' [-Werror,-Wunused-function]

  mlir/lib/Dialect/Tensor/IR/TensorOps.cpp:3210:1: error: unused
  function 'parseInferType' [-Werror,-Wunused-function]

[llvm] Proofread LangRef.rst (#156204)

[memprof] Make the AllocSites and CallSites sections optional in YAML (#156226)

This patch makes the AllocSites and CallSites sections optional in the
YAML format.  This is useful for situations where a function has only
one section.

[ARM] Simplify LowerCMP (NFC) (#156198)

Pass the opcode directly.

[InstCombine] Slightly optimize visitFcmp (NFC) (#156097)

Studying the code related to float found a slightly optimal sequence of
actions.

feat: Add AVX512 support for constant interpreter test (#156227)

Adds test coverage for `fexperimental-new-constant-interpreter` to
`avx512cd-builtins.c` and `avx512vlcd-builtins.c` as part of the work to
ensure all x86 intrinsics are handled correctly by the new interpreter.
Part of #155814.

[SLP]Do not remove reduced value, if it is a copyable

If the value is checked for the reduction and it is a copyable element
in a root node, it should not be deleted, since it may still be used
after vectorization.

Fixes #155512

[VPlan] Unconditionally run attachRuntimeChecks (NFCI).

Instead of conditionally executing attachRuntimeChecks, directly check
if the blocks to be added are still disconnected.

[SLP]Do not to try to revectorize previously vectorized phis in loops

No need to try to revectorize previously vectorized phis in loops, it leads to
a compile time blow-up.

Fixes #155998

[CodeGen] Drop disjoint flag when reassociating (#156218)

This fixes a latent miscompile. To understand why the flag can't be
preserved, consider the case where a0=0, a1=0, a2=-1, and s3=-1.

[SelectionDAG] Return std::optional<unsigned> from getValidShiftAmount and friends. NFC (#156224)

Instead of std::optional<uint64_t>. Shift amounts must be less than or
equal to our maximum supported bit widths which fit in unsigned. Most of
the callers already assumed it fit in unsigned.

Revert "[VPlan] Support plans with vector pointers in narrowInterleaveGroups."

This reverts commit 806a797c52d8018639f5cdcce5eb375b17c87f5e as it
introduced a miscompile.

[Mips][XCore] Use MCRegisterClass::getRegister() instead of begin()+RegNo. NFC

[MLIR] Add --allow-unregistered-dialect to mlir-reduce (#156245)

Fixes #155544

[VPlan] Support scalable VFs in narrowInterleaveGroups. (#154842)

Update narrowInterleaveGroups to support scalable VFs. After the
transform, the vector loop will process a single iteration of the
original vector loop for fixed-width vectors and vscale iterations for
scalable vectors.

[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in Var.cpp (NFC)

[MLIR] Apply clang-tidy fixes for llvm-else-after-return in OpenMPToLLVMIRTranslation.cpp (NFC)

[lldb][test] Mark TestGetBaseName.py as expected failure on Windows

TestGetBaseName.py introduced in PR #155939 is failing on windows
LLDB bots. This patch adds @expectedFailureAll(oslist=["windows"])
decorator to mark it as an expected failure on Windows to make
buildbots green while the underlying issue is investigated.

(see: https://lab.llvm.org/buildbot/#/builders/141/builds/11176).

[polly][CMake] Replace `elseif ()` with `else ()`

The no-argument elseif resulted in the warning below:

CMake Warning (dev) at polly/lib/External/CMakeLists.txt:30 (elseif):
  ELSEIF called with no arguments, it will be skipped.

[clang-tidy] Teach `readability-uppercase-literal-suffix` about C++23 and C23 suffixes (#148275)

Clang doesn't actually support any of the new floating point types yet
(except for `f16`). I've decided to add disabled tests for them, so that
when the support comes, we can flip the switch and support them with no
delay.

[PseudoProbe] Support emitting to COFF object (#123870)

RFC:
https://discourse.llvm.org/t/rfc-support-pseudo-probe-for-windows-coff/83820
Support emitting pseudo probe to .pseudo_probe and .pseudo_probe_desc
COFF sections.

[libunwind] fix pc range condition check bug (#154902)

There is an off-by-one error with current condition check for PC fallen
into the range or not. There is another check within libunwind that use
the correct checks in
https://github.com/llvm/llvm-project/blob/5050da7ba18fc876f80fbeaaca3564d3b4483bb8/libunwind/src/UnwindCursor.hpp#L2757
```
      if ((fdeInfo.pcStart <= pc) && (pc < fdeInfo.pcEnd))
```

[ADT] Refactor StringMap iterators (NFC) (#156137)

StringMap has four iterator classes:

- StringMapIterBase
- StringMapIterator
- StringMapConstIterator
- StringMapKeyIterator

This patch consolidates the first three into one class, namely
StringMapIterBase, adds a boolean template parameter to indicate
desired constness, and then use "using" directives to specialize the
common class:

  using const_iterator = StringMapIterBase<ValueTy, true>;
  using iterator = StringMapIterBase<ValueTy, false>;

just like how we simplified DenseMapIterator.

Remarks:

- This patch drops CRTP and iterator_facade_base for simplicity.  For
  fairly simple forward iterators, iterator_facade_base doesn't buy us
  much.  We just have to write a few "using" directives and operator!=
  manually.

- StringMapIterBase has a SFINAE-based constructor to construct a
  const iterator from a non-const one just like DenseMapIterator.

- We now rely on compiler-generated copy and assignment operators.

[ADT] Remove Mask in PointerEmbedded (#156201)

Mask, a private enum, isn't used anywhere in the class.

[RISCV][NFC] Simplify some rvv regbankselect cases (#155961)

[libclc] update __clc_mem_fence: add MemorySemantic arg and use __builtin_amdgcn_fence for AMDGPU (#152275)

It is necessary to add MemorySemantic argument for AMDGPU which means
the memory or address space to which the memory ordering is applied.

The MemorySemantic is also necessary for implementing the SPIR-V
MemoryBarrier instruction. Additionally, the implementation of
__clc_mem_fence on Intel GPUs requires the MemorySemantic argument.

Using __builtin_amdgcn_fence for AMDGPU is follow-up of
https://github.com/llvm/llvm-project/pull/151446#discussion_r2254006508

llvm-diff shows no change to nvptx64--nvidiacl.bc.

[ProfCheck] Exclude some more tests

These tests are currently showing up as red on the buildbot. We have not
gotten to any of these passes yet, so add them to the exclude list for now.

[PowerPC] Exploit xxeval instruction for operations of the form ternary(A,X,B) and ternary(A,X,C). (#152956)

Adds support for ternary equivalent operations of the form `ternary(A,
X, B)` and `ternary(A, X, C)` where `X=[and(B,C)| nor(B,C)| eqv(B,C)|
nand(B,C)]`.

The following are the patterns involved and the imm values:

| **Operation**              | **Immediate Value** |
|----------------------------|---------------------|
| ternary(A,  and(B,C),   B) | 49                  |
| ternary(A,  nor(B,C),   B) | 56                  |
| ternary(A,  eqv(B,C),   B) | 57                  |
| ternary(A,  nand(B,C),  B) | 62                  |
|                            |                     |
| ternary(A,  and(B,C),   C) | 81                  |
| ternary(A,  nor(B,C),   C) | 88                  |
| ternary(A,  eqv(B,C),   C) | 89                  |
| ternary(A,  nand(B,C),  C) | 94                  |

eg.  `xxeval XT, XA, XB, XC, 49` 
- performs `XA ? and(XB, XC) : B`and places the result in `XT`.

This is the continuation of [[PowerPC] Exploit xxeval instruction for
ternary patterns - ternary(A, X,
and(B,C))](https://github.com/llvm/llvm-project/pull/141733#top).

---------

Co-authored-by: Tony Varghese <tony.varghese at ibm.com>

[PowerPC] Merge vsr(vsro(input, byte_shift), bit_shift) to vsrq(input, res_bit_shift) (#154388)

This change implements a patfrag based pattern matching ~dag combiner~
that combines consecutive `VSRO (Vector Shift Right Octet)` and `VSR
(Vector Shift Right)` instructions into a single `VSRQ (Vector Shift
Right Quadword)` instruction on Power10+ processors.

Vector right shift operations like `vec_srl(vec_sro(input, byte_shift),
bit_shift)` generate two separate instructions `(VSRO + VSR)` when they
could be optimised into a single `VSRQ `instruction that performs the
equivalent operation.

```
vsr(vsro (input, vsro_byte_shift), vsr_bit_shift) to vsrq(input, vsrq_bit_shift) 
where vsrq_bit_shift = (vsro_byte_shift * 8) + vsr_bit_shift
```

Note:
```
 vsro : Vector Shift Right by Octet VX-form
- vsro VRT, VRA, VRB
- The contents of VSR[VRA+32] are shifted right by the number of bytes specified in bits 121:124 of VSR[VRB+32].
	- Bytes shifted out of byte 15 are lost. 
	- Zeros are supplied to the vacated bytes on the left.
- The result is placed into VSR[VRT+32].

vsr : Vector Shift Right VX-form
- vsr VRT, VRA, VRB
- The contents of VSR[VRA+32] are shifted right by the number of bits specified in bits 125:127 of VSR[VRB+32]. 3 bits.
	- Bits shifted out of bit 127 are lost.
	- Zeros are supplied to the vacated bits on the left.
- The result is place into VSR[VRT+32], except if, for any byte element in VSR[VRB+32], the low-order 3 bits are not equal to the shift amount, then VSR[VRT+32] is undefined.

vsrq : Vector Shift Right Quadword VX-form
- vsrq VRT,VRA,VRB 
- Let src1 be the contents of VSR[VRA+32]. Let src2 be the contents of VSR[VRB+32]. 
- src1 is shifted right by the number of bits specified in the low-order 7 bits of src2.
	- Bits shifted out the least-significant bit are lost. 
	- Zeros are supplied to the vacated bits on the left. 
	- The result is placed into VSR[VRT+32].
```

---------

Co-authored-by: Tony Varghese <tony.varghese at ibm.com>

[NFC][PowerPC] adding the arguments for register names and VSR to VR (#155991)

NFC patch to add the flags `-ppc-asm-full-reg-names
--ppc-vsr-nums-as-vr` to the test file
`llvm/test/CodeGen/PowerPC/check-zero-vector.ll`.

Created this PR based on this discussion:
https://github.com/llvm/llvm-project/pull/151971#issuecomment-3234090675

Co-authored-by: himadhith <himadhith.v at ibm.com>
Co-authored-by: Lei Huang <lei at ca.ibm.com>

[RISCV] Split the attribute test for THead to attributes-thead.ll. NFC.

[ARM] Use t2LDRLIT_ga_pcrel for loading stack guards with no-movt in PIC mode. (#156208)

When using no-movt we don't use the pcrel version of the literal load.
This change also unifies logic with the ARM version of this function as
well,
which has:

```
  if (!Subtarget.useMovt() || ForceELFGOTPIC) {
    // For ELF non-PIC, use GOT PIC code sequence as well because R_ARM_GOT_ABS
    // does not have assembler support.
    if (TM.isPositionIndependent() || ForceELFGOTPIC)
      expandLoadStackGuardBase(MI, ARM::LDRLIT_ga_pcrel, ARM::LDRi12);
    else
      expandLoadStackGuardBase(MI, ARM::LDRLIT_ga_abs, ARM::LDRi12);
    return;
  }
```

rdar://138334512

[mlir][emitc] Only mark operator with fundamental type have no side effect (#144990)

[ADT] Mark BitVector::find_prev_unset const (NFC) (#156272)

find_prev_unset calls find_last_unset_in, a const method, but
find_prev_unset itself isn't marked const.

[ADT] Remove BitVector::next_unset_in_word (#156273)

This patch removes BitVector::next_unset_in_word as the private method
doesn't seem to be used anywhere.

[llvm] Proofread CMake.rst (#156276)

[VPlan] Fold common edges away in convertPhisToBlends (#150368)

If a phi is widened with tail folding, all of its predecessors will have
a mask of the form

    %x = logical-and %active-lane-mask, %foo
    %y = logical-and %active-lane-mask, %bar
    %z = logical-and %active-lane-mask, %baz
    ...

We can remove the common %active-lane-mask from all of these edge masks,
which in turn allows us to simplify a lot of VPBlendRecipes.

In particular, it allows the header mask to be removed in selects with
EVL tail folding, improving RISC-V codegen on SPEC CPU 2017 for
525.x264_r, and supersedes #147243.

This also allows us to remove VPBlendRecipe and directly emit
VPInstruction::Select in another patch.

[RISCV] Compress shxadd to qc.c.muliadd when rd = rs2 (#155843)

Do this when Zba and Xqciac are both enabled.

[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in mlir-linalg-ods-yaml-gen.cpp (NFC)

[MLIR] Apply clang-tidy fixes for readability-identifier-naming in LowerContractToSVEPatterns.cpp (NFC)

[LoongArch] Enable vector tests for 32-bit target (#152094)

[LoongArch] Perform SELECT_CC combine (#155994)

Fold `((srl (and X, 1<<C), C), 0, eq/ne)` -> `((shl X, GRLen-1-C), 0, ge/lt)`

[MemoryBuiltins] Add getBaseObjectSize() (NFCI) (#155911)

getObjectSize() is based on ObjectSizeOffsetVisitor, which has become
very expensive over time. The implementation is geared towards computing
as-good-as-possible results for the objectsize intrinsics and similar.
However, we also use it in BasicAA, which is very hot, and really only
cares about the base cases like alloca/malloc/global, not any of the
analysis for GEPs, phis, or loads.

Add a new getBaseObjectSize() API for this use case, which only handles
the non-recursive cases. As a bonus, this API can easily return a
TypeSize and thus support scalable vectors. For now, I'm explicitly
discarding the scalable sizes in BasicAA just to avoid unnecessary
behavior changes during this refactor.

[clang-reorder-fields] Support designated initializers (#142150)

Initializer lists with designators, missing elements or omitted braces
can now be rewritten. Any missing designators are added and they get
sorted according to the new order.

```
struct Foo {
  int a;
  int b;
  int c;
};
struct Foo foo = { .a = 1, 2, 3 }
```

when reordering elements to "b,a,c" becomes:

```
struct Foo {
  int b;
  int a;
  int c;
};
struct Foo foo = { .b = 2, .a = 1, .c = 3 }
```

[RelLookupTableConverter] Make GEP type independent (#155404)

This makes the RelLookupTableConverter independent of the type used in
the GEP. In particular, it removes the requirement to have a leading
zero index.

[SLP][NFC] Refactor `if`s into `&&` (#156278)

[InstCombine] Strip leading zero indices from GEP (#155415)

GEPs are often in the form `gep [N x %T], ptr %p, i64 0, i64 %idx`.
Canonicalize these to `gep %T, ptr %p, i64 %idx`.

This enables transforms that only support one GEP index to work and
improves CSE.

Various transforms were recently hardened to make sure they still work
without the leading index.

[clang-reorder-fields] Fix unused private field warning (NFC)

Fixes:
```
In file included from /home/gha/actions-runner/_work/llvm-project/llvm-project/clang-tools-extra/clang-reorder-fields/Designator.cpp:15:
/home/gha/actions-runner/_work/llvm-project/llvm-project/clang-tools-extra/clang-reorder-fields/Designator.h:157:23: error: private field 'ILE' is not used [-Werror,-Wunused-private-field]
157 |   const InitListExpr *ILE;
```

[VPlan] Use IsaPred to improve code (NFC) (#156037)

[AMDGPU] Expand scratch atomics to flat atomics if GAS is enabled (#154710)

[LV] Improve code around operands-iterator (NFC) (#156016)

[Coroutines] Remove assert about a promise being present (#156007)

This commit removes an assert in the generation of debug info for a
coroutine frame. This assert checked if a promise alloca is present,
even though it's not used. While this might always be the case when the
coroutine was produced by clang++, this doesn't hold in the general
case.

Note: We generate coroutine intrinsics from downstream passes. In our
case, there is no guarantee that a coroutine has any promise, but they
can originate from some non-coro C++ code.

[Coroutines] Enhance DILabel generation with support for inlined locs (#155989)

This commit fixes an issue in the generation of DILabels. The previous
code did not cover cases where the suspend intrinsic had an inlined
location. Because of this, it took an incorrect DIScope, that broke an
internal pre-condition of `DIBuilder::insertLabel`.

This has been addressed by taking the DIScope of the "inlined at"
location, which should be the DISubprogram of the function holding the
label.

[clang-tidy] Support direct initialization in modernize smart pointer (#154732)

Support for direct initialization detection in modernize smart pointer checks.

[RISC-V] Added the mips extension instructions like ehb,ihb and pause etc for MIPS RV64 P8700. (#155747)

Please refer the
https://mips.com/wp-content/uploads/2025/06/P8700_Programmers_Reference_Manual_Rev1.84_5-31-2025.pdf
for more information .

and files like RISCVInstrInfoXMips.td clang formatted .

No Regression found.

---------

Co-authored-by: Craig Topper <craig.topper at sifive.com>

[VPlan] Add VPBlockBase::hasPredecessors (NFC).

Split off from https://github.com/llvm/llvm-project/pull/154510/, add
helper to check if a block has any predecessors.

[Dexter] Implement DexStepFunction and DexContinue (#152721)

Adding together in a single commit as their implementations are linked.

Only supported for DAP debuggers. These two commands make it a bit easier to
drive dexter: DexStepFunction tells dexter to step-next though a function and
DexContinue tells dexter to continue (run free) from one breakpoint to another
within the current DexStepFunction function.

When the DexStepFunction function breakpoint is triggered, dexter sets an
instruction breakpoint at the return-address. This is so that stepping can
resume in any other DexStepFunctions deeps in the call stack.

[VPlan] Introduce replaceSymbolicStrides (NFC) (#155842)

Introduce VPlanTransforms::replaceSymbolicStrides factoring some code
from LoopVectorize.

[CI] Enable -Werror in pre-merge CI (#155627)

We have many buildbots that run with -Werror, but it's currently not
enabled in pre-merge CI, so adding a warning causes a slew of
post-commit failures.

Note that we only guarantee warning freedom when compiling with a recent
version of clang, but not when using gcc or msvc. The monolithic-linux
build uses recent clang (currently 21.1.0), so it should be safe to
enable the option there.

[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in TranslateToCpp.cpp (NFC)

[MLIR] Apply clang-tidy fixes for performance-move-const-arg in Location.cpp (NFC)

[MLIR] Fix unit-test missing guard `SKIP_WITHOUT_JIT` to exclude it on sparc

[libc] Migrate from baremetal stdio.h to generic stdio.h (#152748)

This is a follow up to the RFC here:

https://discourse.llvm.org/t/rfc-implementation-of-stdio-on-baremetal/86944

This provides the stdout/stderr/stdin symbols (which now don't have to
provided by the user). This allows the user to have access to all
functions, currently I've only tested `fprintf` but in theory everything
that works in the generic folder should work in the baremetal
configuration.

All streams are _non-buffered_, which does NOT require flushing. It is
based on the CookieFile that already existed

[TableGen][Decoder] Decode operands with zero width or all bits known

>From 3e6a84641756455ec5b0cc7e5274703ecd11ae55 Mon Sep 17 00:00:00 2001
From: Sergei Barannikov <barannikov88 at gmail.com>
Date: Mon, 1 Sep 2025 19:35:12 +0300
Subject: [PATCH] [TableGen][Decoder] Decode operands with zero width or all
 bits known

---
 llvm/lib/Target/AArch64/CMakeLists.txt        |  4 +-
 llvm/lib/Target/AMDGPU/CMakeLists.txt         |  4 +-
 llvm/lib/Target/ARM/CMakeLists.txt            |  3 +-
 llvm/lib/Target/AVR/CMakeLists.txt            |  3 +-
 llvm/lib/Target/BPF/CMakeLists.txt            |  3 +-
 llvm/lib/Target/CSKY/CMakeLists.txt           |  3 +-
 llvm/lib/Target/Hexagon/CMakeLists.txt        |  3 +-
 .../Target/Hexagon/HexagonDepInstrFormats.td  |  1 -
 .../M68k/Disassembler/M68kDisassembler.cpp    |  9 ++
 llvm/lib/Target/Mips/CMakeLists.txt           |  3 +-
 llvm/lib/Target/PowerPC/PPCInstr64Bit.td      |  2 +-
 llvm/lib/Target/RISCV/CMakeLists.txt          |  3 +-
 llvm/lib/Target/Sparc/SparcInstrInfo.td       |  8 +-
 llvm/utils/TableGen/DecoderEmitter.cpp        | 99 ++++++++++++++-----
 14 files changed, 110 insertions(+), 38 deletions(-)

diff --git a/llvm/lib/Target/AArch64/CMakeLists.txt b/llvm/lib/Target/AArch64/CMakeLists.txt
index 803943fd57c4d..833ce48ea1d7a 100644
--- a/llvm/lib/Target/AArch64/CMakeLists.txt
+++ b/llvm/lib/Target/AArch64/CMakeLists.txt
@@ -7,7 +7,9 @@ tablegen(LLVM AArch64GenAsmWriter.inc -gen-asm-writer)
 tablegen(LLVM AArch64GenAsmWriter1.inc -gen-asm-writer -asmwriternum=1)
 tablegen(LLVM AArch64GenCallingConv.inc -gen-callingconv)
 tablegen(LLVM AArch64GenDAGISel.inc -gen-dag-isel)
-tablegen(LLVM AArch64GenDisassemblerTables.inc -gen-disassembler)
+tablegen(LLVM AArch64GenDisassemblerTables.inc -gen-disassembler
+              -ignore-non-decodable-operands
+              -ignore-fully-defined-operands)
 tablegen(LLVM AArch64GenFastISel.inc -gen-fast-isel)
 tablegen(LLVM AArch64GenGlobalISel.inc -gen-global-isel)
 tablegen(LLVM AArch64GenO0PreLegalizeGICombiner.inc -gen-global-isel-combiner
diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt b/llvm/lib/Target/AMDGPU/CMakeLists.txt
index 619ff4e5c73c4..108b7f7489567 100644
--- a/llvm/lib/Target/AMDGPU/CMakeLists.txt
+++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt
@@ -6,7 +6,9 @@ tablegen(LLVM AMDGPUGenAsmMatcher.inc -gen-asm-matcher)
 tablegen(LLVM AMDGPUGenAsmWriter.inc -gen-asm-writer)
 tablegen(LLVM AMDGPUGenCallingConv.inc -gen-callingconv)
 tablegen(LLVM AMDGPUGenDAGISel.inc -gen-dag-isel)
-tablegen(LLVM AMDGPUGenDisassemblerTables.inc -gen-disassembler)
+tablegen(LLVM AMDGPUGenDisassemblerTables.inc -gen-disassembler
+              -ignore-non-decodable-operands
+              -ignore-fully-defined-operands)
 tablegen(LLVM AMDGPUGenInstrInfo.inc -gen-instr-info)
 tablegen(LLVM AMDGPUGenMCCodeEmitter.inc -gen-emitter)
 tablegen(LLVM AMDGPUGenMCPseudoLowering.inc -gen-pseudo-lowering)
diff --git a/llvm/lib/Target/ARM/CMakeLists.txt b/llvm/lib/Target/ARM/CMakeLists.txt
index a39629bd8aeb0..fa778cad4af8e 100644
--- a/llvm/lib/Target/ARM/CMakeLists.txt
+++ b/llvm/lib/Target/ARM/CMakeLists.txt
@@ -6,7 +6,8 @@ tablegen(LLVM ARMGenAsmMatcher.inc -gen-asm-matcher)
 tablegen(LLVM ARMGenAsmWriter.inc -gen-asm-writer)
 tablegen(LLVM ARMGenCallingConv.inc -gen-callingconv)
 tablegen(LLVM ARMGenDAGISel.inc -gen-dag-isel)
-tablegen(LLVM ARMGenDisassemblerTables.inc -gen-disassembler)
+tablegen(LLVM ARMGenDisassemblerTables.inc -gen-disassembler
+              -ignore-non-decodable-operands)
 tablegen(LLVM ARMGenFastISel.inc -gen-fast-isel)
 tablegen(LLVM ARMGenGlobalISel.inc -gen-global-isel)
 tablegen(LLVM ARMGenInstrInfo.inc -gen-instr-info)
diff --git a/llvm/lib/Target/AVR/CMakeLists.txt b/llvm/lib/Target/AVR/CMakeLists.txt
index a31c545f48ba3..2d5cb7e048778 100644
--- a/llvm/lib/Target/AVR/CMakeLists.txt
+++ b/llvm/lib/Target/AVR/CMakeLists.txt
@@ -6,7 +6,8 @@ tablegen(LLVM AVRGenAsmMatcher.inc -gen-asm-matcher)
 tablegen(LLVM AVRGenAsmWriter.inc -gen-asm-writer)
 tablegen(LLVM AVRGenCallingConv.inc -gen-callingconv)
 tablegen(LLVM AVRGenDAGISel.inc -gen-dag-isel)
-tablegen(LLVM AVRGenDisassemblerTables.inc -gen-disassembler)
+tablegen(LLVM AVRGenDisassemblerTables.inc -gen-disassembler
+              -ignore-non-decodable-operands)
 tablegen(LLVM AVRGenInstrInfo.inc -gen-instr-info)
 tablegen(LLVM AVRGenMCCodeEmitter.inc -gen-emitter)
 tablegen(LLVM AVRGenRegisterInfo.inc -gen-register-info)
diff --git a/llvm/lib/Target/BPF/CMakeLists.txt b/llvm/lib/Target/BPF/CMakeLists.txt
index eade4cacb7100..678cb42c35f13 100644
--- a/llvm/lib/Target/BPF/CMakeLists.txt
+++ b/llvm/lib/Target/BPF/CMakeLists.txt
@@ -6,7 +6,8 @@ tablegen(LLVM BPFGenAsmMatcher.inc -gen-asm-matcher)
 tablegen(LLVM BPFGenAsmWriter.inc -gen-asm-writer)
 tablegen(LLVM BPFGenCallingConv.inc -gen-callingconv)
 tablegen(LLVM BPFGenDAGISel.inc -gen-dag-isel)
-tablegen(LLVM BPFGenDisassemblerTables.inc -gen-disassembler)
+tablegen(LLVM BPFGenDisassemblerTables.inc -gen-disassembler
+              -ignore-non-decodable-operands)
 tablegen(LLVM BPFGenInstrInfo.inc -gen-instr-info)
 tablegen(LLVM BPFGenMCCodeEmitter.inc -gen-emitter)
 tablegen(LLVM BPFGenRegisterInfo.inc -gen-register-info)
diff --git a/llvm/lib/Target/CSKY/CMakeLists.txt b/llvm/lib/Target/CSKY/CMakeLists.txt
index 4b900bc99c271..101b34ea18442 100644
--- a/llvm/lib/Target/CSKY/CMakeLists.txt
+++ b/llvm/lib/Target/CSKY/CMakeLists.txt
@@ -7,7 +7,8 @@ tablegen(LLVM CSKYGenAsmWriter.inc -gen-asm-writer)
 tablegen(LLVM CSKYGenCallingConv.inc -gen-callingconv)
 tablegen(LLVM CSKYGenCompressInstEmitter.inc -gen-compress-inst-emitter)
 tablegen(LLVM CSKYGenDAGISel.inc -gen-dag-isel)
-tablegen(LLVM CSKYGenDisassemblerTables.inc -gen-disassembler)
+tablegen(LLVM CSKYGenDisassemblerTables.inc -gen-disassembler
+              -ignore-non-decodable-operands)
 tablegen(LLVM CSKYGenInstrInfo.inc -gen-instr-info)
 tablegen(LLVM CSKYGenMCCodeEmitter.inc -gen-emitter)
 tablegen(LLVM CSKYGenMCPseudoLowering.inc -gen-pseudo-lowering)
diff --git a/llvm/lib/Target/Hexagon/CMakeLists.txt b/llvm/lib/Target/Hexagon/CMakeLists.txt
index d758260a8ab5d..b615536af03be 100644
--- a/llvm/lib/Target/Hexagon/CMakeLists.txt
+++ b/llvm/lib/Target/Hexagon/CMakeLists.txt
@@ -7,7 +7,8 @@ tablegen(LLVM HexagonGenAsmWriter.inc -gen-asm-writer)
 tablegen(LLVM HexagonGenCallingConv.inc -gen-callingconv)
 tablegen(LLVM HexagonGenDAGISel.inc -gen-dag-isel)
 tablegen(LLVM HexagonGenDFAPacketizer.inc -gen-dfa-packetizer)
-tablegen(LLVM HexagonGenDisassemblerTables.inc -gen-disassembler)
+tablegen(LLVM HexagonGenDisassemblerTables.inc -gen-disassembler
+              -ignore-non-decodable-operands)
 tablegen(LLVM HexagonGenInstrInfo.inc -gen-instr-info)
 tablegen(LLVM HexagonGenMCCodeEmitter.inc -gen-emitter)
 tablegen(LLVM HexagonGenRegisterInfo.inc -gen-register-info)
diff --git a/llvm/lib/Target/Hexagon/HexagonDepInstrFormats.td b/llvm/lib/Target/Hexagon/HexagonDepInstrFormats.td
index 75e87c95f2c48..661b948346126 100644
--- a/llvm/lib/Target/Hexagon/HexagonDepInstrFormats.td
+++ b/llvm/lib/Target/Hexagon/HexagonDepInstrFormats.td
@@ -3049,7 +3049,6 @@ class Enc_cf1927 : OpcodeHexagon {
 class Enc_d0fe02 : OpcodeHexagon {
   bits <5> Rxx32;
   let Inst{20-16} = Rxx32{4-0};
-  bits <0> sgp10;
 }
 class Enc_d15d19 : OpcodeHexagon {
   bits <1> Mu2;
diff --git a/llvm/lib/Target/M68k/Disassembler/M68kDisassembler.cpp b/llvm/lib/Target/M68k/Disassembler/M68kDisassembler.cpp
index d3ad65390143b..01a92087829bf 100644
--- a/llvm/lib/Target/M68k/Disassembler/M68kDisassembler.cpp
+++ b/llvm/lib/Target/M68k/Disassembler/M68kDisassembler.cpp
@@ -107,6 +107,15 @@ static DecodeStatus DecodeFPCSCRegisterClass(MCInst &Inst, uint64_t RegNo,
 }
 #define DecodeFPICRegisterClass DecodeFPCSCRegisterClass
 
+static void DecodeCCRCRegisterClass(MCInst &Inst,
+                                    const MCDisassembler *Decoder) {
+  Inst.addOperand(MCOperand::createReg(M68k::CCR));
+}
+
+static void DecodeSRCRegisterClass(MCInst &Inst, const MCDisassembler *Decoder) {
+  Inst.addOperand(MCOperand::createReg(M68k::SR));
+}
+
 static DecodeStatus DecodeImm32(MCInst &Inst, uint64_t Imm, uint64_t Address,
                                 const void *Decoder) {
   Inst.addOperand(MCOperand::createImm(M68k::swapWord<uint32_t>(Imm)));
diff --git a/llvm/lib/Target/Mips/CMakeLists.txt b/llvm/lib/Target/Mips/CMakeLists.txt
index 21d1765107ae6..4a2277e9a80dc 100644
--- a/llvm/lib/Target/Mips/CMakeLists.txt
+++ b/llvm/lib/Target/Mips/CMakeLists.txt
@@ -6,7 +6,8 @@ tablegen(LLVM MipsGenAsmMatcher.inc -gen-asm-matcher)
 tablegen(LLVM MipsGenAsmWriter.inc -gen-asm-writer)
 tablegen(LLVM MipsGenCallingConv.inc -gen-callingconv)
 tablegen(LLVM MipsGenDAGISel.inc -gen-dag-isel)
-tablegen(LLVM MipsGenDisassemblerTables.inc -gen-disassembler)
+tablegen(LLVM MipsGenDisassemblerTables.inc -gen-disassembler
+              -ignore-non-decodable-operands)
 tablegen(LLVM MipsGenFastISel.inc -gen-fast-isel)
 tablegen(LLVM MipsGenGlobalISel.inc -gen-global-isel)
 tablegen(LLVM MipsGenPostLegalizeGICombiner.inc -gen-global-isel-combiner
diff --git a/llvm/lib/Target/PowerPC/PPCInstr64Bit.td b/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
index 9359311e99cf6..269d30318bca8 100644
--- a/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
+++ b/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
@@ -1984,7 +1984,7 @@ def : Pat<(int_ppc_darnraw), (DARN 2)>;
 
 class X_RA5_RB5<bits<6> opcode, bits<10> xo, string opc, RegisterOperand ty,
                    InstrItinClass itin, list<dag> pattern>
-  : X_L1_RS5_RS5<opcode, xo, (outs), (ins ty:$RA, ty:$RB, u1imm:$L),
+  : X_L1_RS5_RS5<opcode, xo, (outs), (ins ty:$RA, ty:$RB),
                  !strconcat(opc, " $RA, $RB"), itin, pattern>{
    let L = 1;
 }
diff --git a/llvm/lib/Target/RISCV/CMakeLists.txt b/llvm/lib/Target/RISCV/CMakeLists.txt
index 47329b2c2f4d2..9713d623ea614 100644
--- a/llvm/lib/Target/RISCV/CMakeLists.txt
+++ b/llvm/lib/Target/RISCV/CMakeLists.txt
@@ -7,7 +7,8 @@ tablegen(LLVM RISCVGenAsmWriter.inc -gen-asm-writer)
 tablegen(LLVM RISCVGenCompressInstEmitter.inc -gen-compress-inst-emitter)
 tablegen(LLVM RISCVGenMacroFusion.inc -gen-macro-fusion-pred)
 tablegen(LLVM RISCVGenDAGISel.inc -gen-dag-isel)
-tablegen(LLVM RISCVGenDisassemblerTables.inc -gen-disassembler)
+tablegen(LLVM RISCVGenDisassemblerTables.inc -gen-disassembler
+              -ignore-non-decodable-operands)
 tablegen(LLVM RISCVGenInstrInfo.inc -gen-instr-info)
 tablegen(LLVM RISCVGenMCCodeEmitter.inc -gen-emitter)
 tablegen(LLVM RISCVGenMCPseudoLowering.inc -gen-pseudo-lowering)
diff --git a/llvm/lib/Target/Sparc/SparcInstrInfo.td b/llvm/lib/Target/Sparc/SparcInstrInfo.td
index 1a32eafb0e83d..53972d6c105a4 100644
--- a/llvm/lib/Target/Sparc/SparcInstrInfo.td
+++ b/llvm/lib/Target/Sparc/SparcInstrInfo.td
@@ -1785,22 +1785,22 @@ let Predicates = [HasV9], Uses = [ASR3], Constraints = "$swap = $rd" in
 // as inline assembler-supported instructions.
 let Predicates = [HasUMAC_SMAC], Defs = [Y, ASR18], Uses = [Y, ASR18] in {
   def SMACrr :  F3_1<2, 0b111111,
-                   (outs IntRegs:$rd), (ins IntRegs:$rs1, IntRegs:$rs2, ASRRegs:$asr18),
+                   (outs IntRegs:$rd), (ins IntRegs:$rs1, IntRegs:$rs2),
                    "smac $rs1, $rs2, $rd",
                    [], IIC_smac_umac>;
 
   def SMACri :  F3_2<2, 0b111111,
-                  (outs IntRegs:$rd), (ins IntRegs:$rs1, simm13Op:$simm13, ASRRegs:$asr18),
+                  (outs IntRegs:$rd), (ins IntRegs:$rs1, simm13Op:$simm13),
                    "smac $rs1, $simm13, $rd",
                    [], IIC_smac_umac>;
 
   def UMACrr :  F3_1<2, 0b111110,
-                  (outs IntRegs:$rd), (ins IntRegs:$rs1, IntRegs:$rs2, ASRRegs:$asr18),
+                  (outs IntRegs:$rd), (ins IntRegs:$rs1, IntRegs:$rs2),
                    "umac $rs1, $rs2, $rd",
                    [], IIC_smac_umac>;
 
   def UMACri :  F3_2<2, 0b111110,
-                  (outs IntRegs:$rd), (ins IntRegs:$rs1, simm13Op:$simm13, ASRRegs:$asr18),
+                  (outs IntRegs:$rd), (ins IntRegs:$rs1, simm13Op:$simm13),
                    "umac $rs1, $simm13, $rd",
                    [], IIC_smac_umac>;
 }
diff --git a/llvm/utils/TableGen/DecoderEmitter.cpp b/llvm/utils/TableGen/DecoderEmitter.cpp
index e4992b9e9e725..153b4ecaf3db9 100644
--- a/llvm/utils/TableGen/DecoderEmitter.cpp
+++ b/llvm/utils/TableGen/DecoderEmitter.cpp
@@ -93,6 +93,18 @@ static cl::opt<bool> UseFnTableInDecodeToMCInst(
         "of the generated code."),
     cl::init(false), cl::cat(DisassemblerEmitterCat));
 
+static cl::opt<bool> IgnoreNonDecodableOperands(
+    "ignore-non-decodable-operands",
+    cl::desc(
+        "Do not issue an error if an operand cannot be decoded automatically."),
+    cl::init(false), cl::cat(DisassemblerEmitterCat));
+
+static cl::opt<bool> IgnoreFullyDefinedOperands(
+    "ignore-fully-defined-operands",
+    cl::desc(
+        "Do not automatically decode operands with no '?' in their encoding."),
+    cl::init(false), cl::cat(DisassemblerEmitterCat));
+
 STATISTIC(NumEncodings, "Number of encodings considered");
 STATISTIC(NumEncodingsLackingDisasm,
           "Number of encodings without disassembler info");
@@ -130,7 +142,7 @@ struct OperandInfo {
   std::vector<EncodingField> Fields;
   std::string Decoder;
   bool HasCompleteDecoder;
-  uint64_t InitValue = 0;
+  std::optional<uint64_t> InitValue;
 
   OperandInfo(std::string D, bool HCD) : Decoder(D), HasCompleteDecoder(HCD) {}
 
@@ -1058,11 +1070,23 @@ FilterChooser::getIslands(const KnownBits &EncodingBits) const {
 
 void DecoderTableBuilder::emitBinaryParser(raw_ostream &OS, indent Indent,
                                            const OperandInfo &OpInfo) const {
-  bool UseInsertBits = OpInfo.numFields() != 1 || OpInfo.InitValue != 0;
+  // Special case for 'bits<0>'.
+  if (OpInfo.Fields.empty() && !OpInfo.InitValue) {
+    if (IgnoreNonDecodableOperands)
+      return;
+    assert(!OpInfo.Decoder.empty());
+    OS << Indent << OpInfo.Decoder << "(MI, Decoder);\n";
+    return;
+  }
+
+  if (OpInfo.Fields.empty() && OpInfo.InitValue && IgnoreFullyDefinedOperands)
+    return;
+
+  bool UseInsertBits = OpInfo.numFields() > 1 || OpInfo.InitValue.value_or(0);
 
   if (UseInsertBits) {
     OS << Indent << "tmp = 0x";
-    OS.write_hex(OpInfo.InitValue);
+    OS.write_hex(OpInfo.InitValue.value_or(0));
     OS << ";\n";
   }
 
@@ -1106,8 +1130,7 @@ void DecoderTableBuilder::emitDecoder(raw_ostream &OS, indent Indent,
   }
 
   for (const OperandInfo &Op : Encoding.getOperands())
-    if (Op.numFields())
-      emitBinaryParser(OS, Indent, Op);
+    emitBinaryParser(OS, Indent, Op);
 }
 
 unsigned DecoderTableBuilder::getDecoderIndex(unsigned EncodingID) const {
@@ -1888,15 +1911,48 @@ static void debugDumpRecord(const Record &Rec) {
 /// insert from the decoded instruction.
 static void addOneOperandFields(const Record *EncodingDef, const BitsInit &Bits,
                                 std::map<StringRef, StringRef> &TiedNames,
-                                StringRef OpName, OperandInfo &OpInfo) {
-  // Some bits of the operand may be required to be 1 depending on the
-  // instruction's encoding. Collect those bits.
-  if (const RecordVal *EncodedValue = EncodingDef->getValue(OpName))
-    if (const BitsInit *OpBits = dyn_cast<BitsInit>(EncodedValue->getValue()))
-      for (unsigned I = 0; I < OpBits->getNumBits(); ++I)
-        if (const BitInit *OpBit = dyn_cast<BitInit>(OpBits->getBit(I)))
-          if (OpBit->getValue())
-            OpInfo.InitValue |= 1ULL << I;
+                                const Record *OpRec, StringRef OpName,
+                                OperandInfo &OpInfo) {
+  // Find a field with the operand's name.
+  const RecordVal *OpEncodingField = EncodingDef->getValue(OpName);
+
+  // If there is no such field, try tied operand's name.
+  if (!OpEncodingField) {
+    if (auto I = TiedNames.find(OpName); I != TiedNames.end())
+      OpEncodingField = EncodingDef->getValue(I->second);
+
+    // If still no luck, the old behavior is to not decode this operand
+    // automatically and let the target do it. This is error-prone, so
+    // the new behavior is to report an error.
+    if (!OpEncodingField) {
+      if (!IgnoreNonDecodableOperands)
+        PrintError(EncodingDef->getLoc(),
+                   "could not find field for operand '" + OpName + "'");
+      return;
+    }
+  }
+
+  // Some or all bits of the operand may be required to be 0 or 1 depending
+  // on the instruction's encoding. Collect those bits.
+  if (const auto *OpBit = dyn_cast<BitInit>(OpEncodingField->getValue())) {
+    OpInfo.InitValue = OpBit->getValue();
+    return;
+  }
+  if (const auto *OpBits = dyn_cast<BitsInit>(OpEncodingField->getValue())) {
+    if (OpBits->getNumBits() == 0) {
+      if (OpInfo.Decoder.empty()) {
+        PrintError(EncodingDef->getLoc(), "operand '" + OpName + "' of type '" +
+                                              OpRec->getName() +
+                                              "' must have a decoder method");
+      }
+      return;
+    }
+    for (unsigned I = 0; I < OpBits->getNumBits(); ++I) {
+      if (const auto *OpBit = dyn_cast<BitInit>(OpBits->getBit(I)))
+        OpInfo.InitValue = OpInfo.InitValue.value_or(0) |
+                           static_cast<uint64_t>(OpBit->getValue()) << I;
+    }
+  }
 
   for (unsigned I = 0, J = 0; I != Bits.getNumBits(); I = J) {
     const VarInit *Var;
@@ -1967,8 +2023,10 @@ void InstructionEncoding::parseFixedLenOperands(const BitsInit &Bits) {
       // Decode each of the sub-ops separately.
       for (auto [SubOpName, SubOp] :
            zip_equal(Op.SubOpNames, Op.MIOperandInfo->getArgs())) {
-        OperandInfo SubOpInfo = getOpInfo(cast<DefInit>(SubOp)->getDef());
-        addOneOperandFields(EncodingDef, Bits, TiedNames, SubOpName, SubOpInfo);
+        const Record *SubOpRec = cast<DefInit>(SubOp)->getDef();
+        OperandInfo SubOpInfo = getOpInfo(SubOpRec);
+        addOneOperandFields(EncodingDef, Bits, TiedNames, SubOpRec, SubOpName,
+                            SubOpInfo);
         Operands.push_back(std::move(SubOpInfo));
       }
       continue;
@@ -1989,13 +2047,8 @@ void InstructionEncoding::parseFixedLenOperands(const BitsInit &Bits) {
       }
     }
 
-    addOneOperandFields(EncodingDef, Bits, TiedNames, Op.Name, OpInfo);
-    // FIXME: it should be an error not to find a definition for a given
-    // operand, rather than just failing to add it to the resulting
-    // instruction! (This is a longstanding bug, which will be addressed in an
-    // upcoming change.)
-    if (OpInfo.numFields() > 0)
-      Operands.push_back(std::move(OpInfo));
+    addOneOperandFields(EncodingDef, Bits, TiedNames, Op.Rec, Op.Name, OpInfo);
+    Operands.push_back(std::move(OpInfo));
   }
 }