[clang] 713a202 - [CGData] Clang Options (#90304)
via cfe-commits
cfe-commits at lists.llvm.org
Sun Sep 15 16:04:45 PDT 2024
Author: Kyungwoo Lee
Date: 2024-09-15T16:04:42-07:00
New Revision: 713a2029578eb36a29793105948d8e4fe965da18
URL: https://github.com/llvm/llvm-project/commit/713a2029578eb36a29793105948d8e4fe965da18
DIFF: https://github.com/llvm/llvm-project/commit/713a2029578eb36a29793105948d8e4fe965da18.diff
LOG: [CGData] Clang Options (#90304)
This adds new Clang flags to support codegen (CG) data:
- `-fcodegen-data-generate{=path}`: This flag passes
`-codegen-data-generate` as a boolean to the LLVM backend, causing the
raw CG data to be emitted into a custom section. Currently, for LLD
MachO only, it also passes `--codegen-data-generate-path=<path>` so that
the indexed CG data file can be automatically produced at link time. For
linkers that do not yet support this feature, `llvm-cgdata` can be used
manually to merge this CG data in object files.
- `-fcodegen-data-use{=path}`: This flag passes
`-codegen-data-use-path=<path>` to the LLVM backend, enabling the use of
specified CG data to optimistically outline functions.
- The default `<path>` is set to `default.cgdata` when not specified.
This depends on https://github.com/llvm/llvm-project/pull/108733.
This is a patch for
https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
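The flag handling described above can be modelled outside of Clang as a minimal sketch. Everything below is hypothetical and written only for illustration: `CGDataFlags`, `parseCGDataFlags`, and `backendArgs` are not part of Clang's driver API, and the sketch throws an exception where the real driver emits a diagnostic and continues. It assumes the behavior stated in the log message: the bare flag is an alias for the `=default.cgdata` form, the last occurrence of a flag wins, generate mode passes only a boolean to the backend, and use mode passes the path.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-alone model of the driver logic; not Clang's real API.
struct CGDataFlags {
  std::optional<std::string> GeneratePath; // -fcodegen-data-generate[=<path>]
  std::optional<std::string> UsePath;      // -fcodegen-data-use[=<path>]
};

// Scan driver arguments. The bare flag behaves like "=default.cgdata",
// and a later occurrence overrides an earlier one.
CGDataFlags parseCGDataFlags(const std::vector<std::string> &Args) {
  const std::string Gen = "-fcodegen-data-generate";
  const std::string Use = "-fcodegen-data-use";
  CGDataFlags F;
  for (const std::string &A : Args) {
    if (A == Gen)
      F.GeneratePath = "default.cgdata";
    else if (A.rfind(Gen + "=", 0) == 0) // starts with "<Gen>="
      F.GeneratePath = A.substr(Gen.size() + 1);
    else if (A == Use)
      F.UsePath = "default.cgdata";
    else if (A.rfind(Use + "=", 0) == 0) // starts with "<Use>="
      F.UsePath = A.substr(Use.size() + 1);
  }
  return F;
}

// Translate the parsed flags into arguments for the LLVM backend.
std::vector<std::string> backendArgs(const CGDataFlags &F) {
  // Simplification: the real driver reports a diagnostic instead of throwing.
  if (F.GeneratePath && F.UsePath)
    throw std::invalid_argument(
        "-fcodegen-data-generate not allowed with -fcodegen-data-use");

  std::vector<std::string> Out;
  if (F.GeneratePath) {
    // Generate mode: only a boolean reaches the backend; the path is
    // consumed by the linker instead.
    Out.push_back("-mllvm");
    Out.push_back("-codegen-data-generate");
  }
  if (F.UsePath) {
    // Use mode: the backend needs to know which file to read.
    Out.push_back("-mllvm");
    Out.push_back("-codegen-data-use-path=" + *F.UsePath);
  }
  return Out;
}
```

Under these assumptions, passing `-fcodegen-data-use` alone yields `-mllvm -codegen-data-use-path=default.cgdata`, which matches the `USE` check in the new driver test.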
Added:
clang/test/Driver/codegen-data.c
Modified:
clang/docs/UsersManual.rst
clang/include/clang/Driver/Options.td
clang/lib/Driver/ToolChains/CommonArgs.cpp
clang/lib/Driver/ToolChains/Darwin.cpp
Removed:
################################################################################
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index f27fa4ace917ea..57d78f867bab6e 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2410,6 +2410,39 @@ are listed below.
link-time optimizations like whole program inter-procedural basic block
reordering.
+.. option:: -fcodegen-data-generate[=<path>]
+
+ Emit the raw codegen (CG) data into custom sections in the object file.
+ Currently, this option also combines the raw CG data from the object files
+ into an indexed CG data file specified by the <path>, for LLD MachO only.
+ When the <path> is not specified, `default.cgdata` is created.
+ The CG data file combines all the outlining instances that occurred locally
+ in each object file.
+
+ .. code-block:: console
+
+ $ clang -fuse-ld=lld -Oz -fcodegen-data-generate code.cc
+
+ For linkers that do not yet support this feature, `llvm-cgdata` can be used
+ manually to merge this CG data in object files.
+
+ .. code-block:: console
+
+ $ clang -c -fuse-ld=lld -Oz -fcodegen-data-generate code.cc
+ $ llvm-cgdata --merge -o default.cgdata code.o
+
+.. option:: -fcodegen-data-use[=<path>]
+
+ Read the codegen data from the specified path to more effectively outline
+ functions across compilation units. When the <path> is not specified,
+ `default.cgdata` is used. This option can create many identically outlined
+ functions that can be optimized by the conventional linker’s identical code
+ folding (ICF).
+
+ .. code-block:: console
+
+ $ clang -fuse-ld=lld -Oz -Wl,--icf=safe -fcodegen-data-use code.cc
+
Profile Guided Optimization
---------------------------
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index dc8bfc69e9889b..7f123335ce8cfa 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1894,6 +1894,18 @@ def fprofile_selected_function_group :
Visibility<[ClangOption, CC1Option]>, MetaVarName<"<i>">,
HelpText<"Partition functions into N groups using -fprofile-function-groups and select only functions in group i to be instrumented. The valid range is 0 to N-1 inclusive">,
MarshallingInfoInt<CodeGenOpts<"ProfileSelectedFunctionGroup">>;
+def fcodegen_data_generate_EQ : Joined<["-"], "fcodegen-data-generate=">,
+ Group<f_Group>, Visibility<[ClangOption, CLOption]>, MetaVarName<"<path>">,
+ HelpText<"Emit codegen data into the object file. LLD for MachO (currently) merges them into the specified <path>.">;
+def fcodegen_data_generate : Flag<["-"], "fcodegen-data-generate">,
+ Group<f_Group>, Visibility<[ClangOption, CLOption]>, Alias<fcodegen_data_generate_EQ>, AliasArgs<["default.cgdata"]>,
+ HelpText<"Emit codegen data into the object file. LLD for MachO (currently) merges them into default.cgdata.">;
+def fcodegen_data_use_EQ : Joined<["-"], "fcodegen-data-use=">,
+ Group<f_Group>, Visibility<[ClangOption, CLOption]>, MetaVarName<"<path>">,
+ HelpText<"Use codegen data read from the specified <path>.">;
+def fcodegen_data_use : Flag<["-"], "fcodegen-data-use">,
+ Group<f_Group>, Visibility<[ClangOption, CLOption]>, Alias<fcodegen_data_use_EQ>, AliasArgs<["default.cgdata"]>,
+ HelpText<"Use codegen data read from default.cgdata to optimize the binary">;
def fswift_async_fp_EQ : Joined<["-"], "fswift-async-fp=">,
Group<f_Group>,
Visibility<[ClangOption, CC1Option, CC1AsOption, CLOption]>,
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index f58b816a9709dd..502aba2ce4aa9c 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -2753,6 +2753,25 @@ void tools::addMachineOutlinerArgs(const Driver &D,
addArg(Twine("-enable-machine-outliner=never"));
}
}
+
+ auto *CodeGenDataGenArg =
+ Args.getLastArg(options::OPT_fcodegen_data_generate_EQ);
+ auto *CodeGenDataUseArg = Args.getLastArg(options::OPT_fcodegen_data_use_EQ);
+
+ // We only allow one of them to be specified.
+ if (CodeGenDataGenArg && CodeGenDataUseArg)
+ D.Diag(diag::err_drv_argument_not_allowed_with)
+ << CodeGenDataGenArg->getAsString(Args)
+ << CodeGenDataUseArg->getAsString(Args);
+
+ // For codegen data gen, the output file is passed to the linker
+ // while a boolean flag is passed to the LLVM backend.
+ if (CodeGenDataGenArg)
+ addArg(Twine("-codegen-data-generate"));
+
+ // For codegen data use, the input file is passed to the LLVM backend.
+ if (CodeGenDataUseArg)
+ addArg(Twine("-codegen-data-use-path=") + CodeGenDataUseArg->getValue());
}
void tools::addOpenMPDeviceRTL(const Driver &D,
diff --git a/clang/lib/Driver/ToolChains/Darwin.cpp b/clang/lib/Driver/ToolChains/Darwin.cpp
index 5e7f9290e2009d..ebc9ed1aadb0ab 100644
--- a/clang/lib/Driver/ToolChains/Darwin.cpp
+++ b/clang/lib/Driver/ToolChains/Darwin.cpp
@@ -476,6 +476,13 @@ void darwin::Linker::AddLinkArgs(Compilation &C, const ArgList &Args,
llvm::sys::path::append(Path, "default.profdata");
CmdArgs.push_back(Args.MakeArgString(Twine("--cs-profile-path=") + Path));
}
+
+ auto *CodeGenDataGenArg =
+ Args.getLastArg(options::OPT_fcodegen_data_generate_EQ);
+ if (CodeGenDataGenArg)
+ CmdArgs.push_back(
+ Args.MakeArgString(Twine("--codegen-data-generate-path=") +
+ CodeGenDataGenArg->getValue()));
}
}
@@ -633,6 +640,32 @@ void darwin::Linker::ConstructJob(Compilation &C, const JobAction &JA,
CmdArgs.push_back("-mllvm");
CmdArgs.push_back("-enable-linkonceodr-outlining");
+ // Propagate codegen data flags to the linker for the LLVM backend.
+ auto *CodeGenDataGenArg =
+ Args.getLastArg(options::OPT_fcodegen_data_generate_EQ);
+ auto *CodeGenDataUseArg = Args.getLastArg(options::OPT_fcodegen_data_use_EQ);
+
+ // We only allow one of them to be specified.
+ const Driver &D = getToolChain().getDriver();
+ if (CodeGenDataGenArg && CodeGenDataUseArg)
+ D.Diag(diag::err_drv_argument_not_allowed_with)
+ << CodeGenDataGenArg->getAsString(Args)
+ << CodeGenDataUseArg->getAsString(Args);
+
+ // For codegen data gen, the output file is passed to the linker
+ // while a boolean flag is passed to the LLVM backend.
+ if (CodeGenDataGenArg) {
+ CmdArgs.push_back("-mllvm");
+ CmdArgs.push_back("-codegen-data-generate");
+ }
+
+ // For codegen data use, the input file is passed to the LLVM backend.
+ if (CodeGenDataUseArg) {
+ CmdArgs.push_back("-mllvm");
+ CmdArgs.push_back(Args.MakeArgString(Twine("-codegen-data-use-path=") +
+ CodeGenDataUseArg->getValue()));
+ }
+
// Setup statistics file output.
SmallString<128> StatsFile =
getStatsFileName(Args, Output, Inputs[0], getToolChain().getDriver());
diff --git a/clang/test/Driver/codegen-data.c b/clang/test/Driver/codegen-data.c
new file mode 100644
index 00000000000000..28638f61d641c5
--- /dev/null
+++ b/clang/test/Driver/codegen-data.c
@@ -0,0 +1,38 @@
+// Verify that only one of the codegen-data flags can be passed.
+// RUN: not %clang -### -S --target=aarch64-linux-gnu -fcodegen-data-generate -fcodegen-data-use %s 2>&1 | FileCheck %s --check-prefix=CONFLICT
+// RUN: not %clang -### -S --target=arm64-apple-darwin -fcodegen-data-generate -fcodegen-data-use %s 2>&1 | FileCheck %s --check-prefix=CONFLICT
+// CONFLICT: error: invalid argument '-fcodegen-data-generate' not allowed with '-fcodegen-data-use'
+
+// Verify the codegen-data-generate (boolean) flag is passed to LLVM
+// RUN: %clang -### -S --target=aarch64-linux-gnu -fcodegen-data-generate %s 2>&1| FileCheck %s --check-prefix=GENERATE
+// RUN: %clang -### -S --target=arm64-apple-darwin -fcodegen-data-generate %s 2>&1| FileCheck %s --check-prefix=GENERATE
+// GENERATE: "-mllvm" "-codegen-data-generate"
+
+// Verify the codegen-data-use-path flag (with a default value) is passed to LLVM.
+// RUN: %clang -### -S --target=aarch64-linux-gnu -fcodegen-data-use %s 2>&1| FileCheck %s --check-prefix=USE
+// RUN: %clang -### -S --target=arm64-apple-darwin -fcodegen-data-use %s 2>&1| FileCheck %s --check-prefix=USE
+// RUN: %clang -### -S --target=aarch64-linux-gnu -fcodegen-data-use=file %s 2>&1 | FileCheck %s --check-prefix=USE-FILE
+// RUN: %clang -### -S --target=arm64-apple-darwin -fcodegen-data-use=file %s 2>&1 | FileCheck %s --check-prefix=USE-FILE
+// USE: "-mllvm" "-codegen-data-use-path=default.cgdata"
+// USE-FILE: "-mllvm" "-codegen-data-use-path=file"
+
+// Verify the codegen-data-generate (boolean) flag is passed to the linker with LTO.
+// RUN: %clang -### -flto --target=aarch64-linux-gnu -fcodegen-data-generate %s 2>&1 | FileCheck %s --check-prefix=GENERATE-LTO
+// GENERATE-LTO: {{ld(.exe)?"}}
+// GENERATE-LTO-SAME: "-plugin-opt=-codegen-data-generate"
+// RUN: %clang -### -flto --target=arm64-apple-darwin -fcodegen-data-generate %s 2>&1 | FileCheck %s --check-prefix=GENERATE-LTO-DARWIN
+// GENERATE-LTO-DARWIN: {{ld(.exe)?"}}
+// GENERATE-LTO-DARWIN-SAME: "-mllvm" "-codegen-data-generate"
+
+// Verify the codegen-data-use-path flag is passed to LLVM with LTO.
+// RUN: %clang -### -flto=thin --target=aarch64-linux-gnu -fcodegen-data-use %s 2>&1 | FileCheck %s --check-prefix=USE-LTO
+// USE-LTO: {{ld(.exe)?"}}
+// USE-LTO-SAME: "-plugin-opt=-codegen-data-use-path=default.cgdata"
+// RUN: %clang -### -flto=thin --target=arm64-apple-darwin -fcodegen-data-use %s 2>&1 | FileCheck %s --check-prefix=USE-LTO-DARWIN
+// USE-LTO-DARWIN: {{ld(.exe)?"}}
+// USE-LTO-DARWIN-SAME: "-mllvm" "-codegen-data-use-path=default.cgdata"
+
+// For now, only LLD MachO supports generating the codegen data at link time.
+// RUN: %clang -### -fuse-ld=lld -B%S/Inputs/lld --target=arm64-apple-darwin -fcodegen-data-generate %s 2>&1 | FileCheck %s --check-prefix=GENERATE-LLD-DARWIN
+// GENERATE-LLD-DARWIN: {{ld(.exe)?"}}
+// GENERATE-LLD-DARWIN-SAME: "--codegen-data-generate-path=default.cgdata"
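The LTO test expectations above imply that the same backend option reaches the linker in a different spelling depending on the target. The following is a rough sketch of that mapping, not the driver's implementation: `LinkerKind`, `CGDataMode`, and `linkerCGDataArgs` are hypothetical names, and the three linker kinds are a simplification of how Clang actually selects a linker. The argument strings are taken from the `GENERATE-LTO`, `USE-LTO`, `*-DARWIN`, and `GENERATE-LLD-DARWIN` checks. The relative order of the path argument and the `-mllvm` pair on LLD MachO is an assumption, since the test does not check it.

```cpp
#include <string>
#include <vector>

// Hypothetical classification of the linker being driven.
enum class LinkerKind { ELF, DarwinLd, DarwinLLD };
enum class CGDataMode { Generate, Use };

// Build the codegen-data arguments that would appear on the link line.
std::vector<std::string> linkerCGDataArgs(LinkerKind Kind, CGDataMode Mode,
                                          const std::string &Path) {
  // The option as the LLVM backend spells it.
  const std::string BackendOpt =
      Mode == CGDataMode::Generate ? "-codegen-data-generate"
                                   : "-codegen-data-use-path=" + Path;

  std::vector<std::string> Out;
  if (Kind == LinkerKind::ELF) {
    // ELF linkers receive backend options through -plugin-opt=.
    Out.push_back("-plugin-opt=" + BackendOpt);
    return Out;
  }

  // Assumption: only LLD MachO is asked to write the indexed CG data file.
  if (Kind == LinkerKind::DarwinLLD && Mode == CGDataMode::Generate)
    Out.push_back("--codegen-data-generate-path=" + Path);

  // Darwin linkers receive backend options as an "-mllvm <opt>" pair.
  Out.push_back("-mllvm");
  Out.push_back(BackendOpt);
  return Out;
}
```

For example, `linkerCGDataArgs(LinkerKind::ELF, CGDataMode::Use, "default.cgdata")` produces the single argument `-plugin-opt=-codegen-data-use-path=default.cgdata`, the string the `USE-LTO-SAME` line expects.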