[all-commits] [llvm/llvm-project] cd269f: [StrTable] Switch Clang builtins to use string tables

Chandler Carruth via All-commits all-commits at lists.llvm.org
Tue Feb 4 10:16:18 PST 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: cd269fee05a0f78fb53b65f701b4e06e9ddab424
      https://github.com/llvm/llvm-project/commit/cd269fee05a0f78fb53b65f701b4e06e9ddab424
  Author: Chandler Carruth <chandlerc at gmail.com>
  Date:   2025-02-04 (Tue, 04 Feb 2025)

  Changed paths:
    M clang/include/clang/Basic/Builtins.h
    R clang/include/clang/Basic/BuiltinsLoongArch.def
    M clang/include/clang/Basic/BuiltinsPPC.def
    M clang/include/clang/Basic/TargetBuiltins.h
    M clang/include/clang/Basic/TargetInfo.h
    M clang/include/module.modulemap
    M clang/lib/Basic/Builtins.cpp
    M clang/lib/Basic/Targets/AArch64.cpp
    M clang/lib/Basic/Targets/AArch64.h
    M clang/lib/Basic/Targets/AMDGPU.cpp
    M clang/lib/Basic/Targets/AMDGPU.h
    M clang/lib/Basic/Targets/ARC.h
    M clang/lib/Basic/Targets/ARM.cpp
    M clang/lib/Basic/Targets/ARM.h
    M clang/lib/Basic/Targets/AVR.h
    M clang/lib/Basic/Targets/BPF.cpp
    M clang/lib/Basic/Targets/BPF.h
    M clang/lib/Basic/Targets/CSKY.cpp
    M clang/lib/Basic/Targets/CSKY.h
    M clang/lib/Basic/Targets/DirectX.h
    M clang/lib/Basic/Targets/Hexagon.cpp
    M clang/lib/Basic/Targets/Hexagon.h
    M clang/lib/Basic/Targets/Lanai.h
    M clang/lib/Basic/Targets/LoongArch.cpp
    M clang/lib/Basic/Targets/LoongArch.h
    M clang/lib/Basic/Targets/M68k.cpp
    M clang/lib/Basic/Targets/M68k.h
    M clang/lib/Basic/Targets/MSP430.h
    M clang/lib/Basic/Targets/Mips.cpp
    M clang/lib/Basic/Targets/Mips.h
    M clang/lib/Basic/Targets/NVPTX.cpp
    M clang/lib/Basic/Targets/NVPTX.h
    M clang/lib/Basic/Targets/PNaCl.h
    M clang/lib/Basic/Targets/PPC.cpp
    M clang/lib/Basic/Targets/PPC.h
    M clang/lib/Basic/Targets/RISCV.cpp
    M clang/lib/Basic/Targets/RISCV.h
    M clang/lib/Basic/Targets/SPIR.cpp
    M clang/lib/Basic/Targets/SPIR.h
    M clang/lib/Basic/Targets/Sparc.h
    M clang/lib/Basic/Targets/SystemZ.cpp
    M clang/lib/Basic/Targets/SystemZ.h
    M clang/lib/Basic/Targets/TCE.h
    M clang/lib/Basic/Targets/VE.cpp
    M clang/lib/Basic/Targets/VE.h
    M clang/lib/Basic/Targets/WebAssembly.cpp
    M clang/lib/Basic/Targets/WebAssembly.h
    M clang/lib/Basic/Targets/X86.cpp
    M clang/lib/Basic/Targets/X86.h
    M clang/lib/Basic/Targets/XCore.cpp
    M clang/lib/Basic/Targets/XCore.h
    M clang/lib/Basic/Targets/Xtensa.h
    M clang/lib/CodeGen/CGBuiltin.cpp
    M clang/lib/CodeGen/CodeGenModule.cpp
    M clang/lib/Sema/SemaChecking.cpp
    M clang/lib/Sema/SemaExpr.cpp
    M clang/lib/StaticAnalyzer/Core/CheckerContext.cpp

  Log Message:
  -----------
  [StrTable] Switch Clang builtins to use string tables

This both reapplies #118734, the initial attempt at this, and updates it
significantly.

First, it uses the newly added `StringTable` abstraction for string
tables, and simplifies the construction to build the string table and
info arrays separately. This should reduce any `constexpr` compile time
memory or CPU cost of the original PR while significantly improving the
APIs throughout.

It also restructures the builtins to support sharding across several
independent tables. This accomplishes two improvements from the
original PR:

1) It improves the APIs used significantly.

2) When builtins are defined from different sources (like SVE vs MVE in
   AArch64), this allows each of them to build their own string table
   independently rather than having to merge the string tables and info
   structures.

3) It allows each shard to factor out a common prefix, often cutting the
   size of the strings needed for the builtins by a factor two.

The second point is important both to allow different mechanisms of
construction (for example a `.def` file and a tablegen'ed `.inc` file,
or different tablegen'ed `.inc files), it also simply reduces the sizes
of these tables which is valuable given how large they are in some
cases. The third builds on that size reduction.

Initially, we use this new sharding rather than merging tables in
AArch64, LoongArch, RISCV, and X86. Mostly this helps ensure the system
works, as without further changes these still push scaling limits.
Subsequent commits will more deeply leverage the new structure,
including using the prefix capabilities which cannot be easily factored
out here and requires deep changes to the targets.


  Commit: 1cb979f001b24c661b7d7adf50d7c9cf8adc593a
      https://github.com/llvm/llvm-project/commit/1cb979f001b24c661b7d7adf50d7c9cf8adc593a
  Author: Chandler Carruth <chandlerc at gmail.com>
  Date:   2025-02-04 (Tue, 04 Feb 2025)

  Changed paths:
    R clang/include/clang/Basic/BuiltinsRISCVVector.def
    M clang/include/clang/Basic/TargetBuiltins.h
    M clang/lib/Basic/Targets/RISCV.cpp
    M clang/utils/TableGen/RISCVVEmitter.cpp

  Log Message:
  -----------
  [StrTable] Switch RISCV to leverage sharded, prefixed builtins w/ TableGen

This lets the TableGen-ed code be much cleaner, directly building an
efficient string table without duplicates and without the repeated
prefix.


  Commit: 64ea3f5a4720105d166b034d5a34d92475579e64
      https://github.com/llvm/llvm-project/commit/64ea3f5a4720105d166b034d5a34d92475579e64
  Author: Chandler Carruth <chandlerc at gmail.com>
  Date:   2025-02-04 (Tue, 04 Feb 2025)

  Changed paths:
    M clang/include/clang/Basic/BuiltinsARM.def
    R clang/include/clang/Basic/BuiltinsNEON.def
    M clang/include/clang/Basic/TargetBuiltins.h
    M clang/lib/Basic/Targets/AArch64.cpp
    M clang/lib/Basic/Targets/ARM.cpp
    M clang/lib/Sema/SemaARM.cpp
    M clang/utils/TableGen/MveEmitter.cpp
    M clang/utils/TableGen/NeonEmitter.cpp
    M clang/utils/TableGen/SveEmitter.cpp

  Log Message:
  -----------
  [StrTable] Switch AArch64 and ARM to use directly TableGen-ed builtin tables

This leverages the sharded structure of the builtins to make it easy to
directly tablegen most of the AArch64 and ARM builtins while still using
X-macros for a few edge cases. It also extracts common prefixes as part
of that.

This makes the string tables for these targets dramatically smaller.
This is especially important as the SVE builtins represent (by far) the
largest string table and largest builtin table across all the targets in
Clang.


  Commit: 212ecb9d5caaa7cc721edd981f36384ddfccfa5d
      https://github.com/llvm/llvm-project/commit/212ecb9d5caaa7cc721edd981f36384ddfccfa5d
  Author: Chandler Carruth <chandlerc at gmail.com>
  Date:   2025-02-04 (Tue, 04 Feb 2025)

  Changed paths:
    M clang/include/clang/AST/Expr.h
    M clang/include/clang/Basic/Builtins.h
    M clang/include/clang/Basic/IdentifierTable.h
    M clang/include/clang/Basic/TargetBuiltins.h
    M clang/lib/AST/StmtPrinter.cpp
    M clang/lib/Basic/Builtins.cpp
    M clang/lib/Basic/Targets/BPF.cpp
    M clang/lib/Basic/Targets/Hexagon.cpp
    M clang/lib/Basic/Targets/NVPTX.cpp
    M clang/lib/Basic/Targets/RISCV.cpp
    M clang/lib/Basic/Targets/SPIR.cpp
    M clang/lib/Basic/Targets/X86.cpp
    M clang/lib/Sema/SemaChecking.cpp
    M clang/test/TableGen/target-builtins-prototype-parser.td
    M clang/utils/TableGen/ClangBuiltinsEmitter.cpp

  Log Message:
  -----------
  [StrTable] Teach main builtin TableGen to use direct enums, strings, and info

This moves the main builtins and several targets to use nice generated
string tables and info structures rather than X-macros. Even without
obvious prefixes factored out, the resulting tables are significantly
smaller and much cheaper to compile with out all the X-macro overhead.

This leaves the X-macros in place for atomic builtins which have a wide
range of uses that don't seem reasonable to fold into TableGen.

As future work, these should move to their own file (whether as X-macros
or just generated patterns) so the AST headers don't have to include all
the data for other builtins.


  Commit: 2ff42bdac3b9a131ce1c652d08edded4eac9d3f7
      https://github.com/llvm/llvm-project/commit/2ff42bdac3b9a131ce1c652d08edded4eac9d3f7
  Author: Chandler Carruth <chandlerc at gmail.com>
  Date:   2025-02-04 (Tue, 04 Feb 2025)

  Changed paths:
    M clang/include/clang/Basic/BuiltinsBase.td
    M clang/include/clang/Basic/BuiltinsX86Base.td
    M clang/lib/Basic/Targets/X86.cpp
    M clang/utils/TableGen/ClangBuiltinsEmitter.cpp

  Log Message:
  -----------
  [StrTable] Add prefixes for x86 builtins.

This requires adding support to the general builtins emission for
producing prefixed builtin infos separately from un-prefixed which is
a bit crufty. But we don't currently have any good way of having a more
refined model than a single hard-coded prefix string per TableGen
emission. Something more powerful and/or elegant is possible, but this
is a fairly minimal first step that at least allows factoring out the
builtin prefix for something like X86.


  Commit: 51d0ad7de0ad4636ae39783469cf555a1392b4ea
      https://github.com/llvm/llvm-project/commit/51d0ad7de0ad4636ae39783469cf555a1392b4ea
  Author: Chandler Carruth <chandlerc at gmail.com>
  Date:   2025-02-04 (Tue, 04 Feb 2025)

  Changed paths:
    M clang/include/clang/Basic/BuiltinsHexagon.td
    M clang/lib/Basic/Targets/Hexagon.cpp

  Log Message:
  -----------
  [StrTable] Add factored prefix for Hexagon

This target's builtins have an especially long prefix and so we get over
2x reduction in string table size required with this change.


Compare: https://github.com/llvm/llvm-project/compare/f308af757d72...51d0ad7de0ad

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list