PGO: Statically generate data structures
Duncan P. N. Exon Smith
dexonsmith at apple.com
Fri Mar 14 21:57:35 PDT 2014
Thanks for the review!
On 2014 Mar 14, at 21:01, Justin Bogner <mail at justinbogner.com> wrote:
>> - PGO.emitWriteoutFunction();
>> + PGO.emitInstrGenerateData();
>
> This name is awkward and confusing. Why does this say "generate"? The
> data's the same for generation and use. How about either
> "emitInstrProfData", to match the backend naming, or just
> "emitInstrumentationData", which should be obvious from context?
emitInstrumentationData() sounds better.
>> void CodeGenPGO::setFuncName(llvm::Function *Fn) {
>> - StringRef Func = Fn->getName();
>> + NormalFuncName = Fn->getName();
>
> NormalFuncName? Is there a SpecialFuncName? This is just FunctionName,
> isn't it?
There are three function names kicking around there:
- Fn->getName(), which sometimes has a \01 prefix.
- NormalFuncName, which takes off the \01 prefix. It’s used as a suffix
for symbol names, like the “foo” in __llvm_pgo_counts_foo.
- FuncName, which already existed, and whose name I didn’t want to
change, and for static variables has a filename prefix. E.g.,
“somefile.cpp:foo" if foo() is static and defined in somefile.cpp, or
“bar" if bar() is not static. This is what gets saved in
__llvm_pgo_name_foo (or __llvm_pgo_name_bar) and used to be sent as an
argument into llvm_pgo_writeout().
NormalFuncName is normal in that it has no \01 or filename prefix.
I’m open to suggestions, but FunctionName would clash awkwardly with
FuncName.
>> + auto *RegisterF
>> + = llvm::Function::Create(RegisterFTy,
>> + llvm::GlobalValue::InternalLinkage,
>> + "__llvm_pgo_register_functions", &CGM.getModule());
>
> The indentation is funny here:
>
> auto *RegisterF = llvm::Function::Create(RegisterFTy,
> llvm::GlobalValue::InternalLinkage,
> "__llvm_pgo_register_functions",
> &CGM.getModule());
Sure, I’ll change that.
>> +static StringRef getDataSection(const CodeGenModule &CGM) {
>> + if (CGM.getTarget().getTriple().getObjectFormat() == llvm::Triple::MachO)
>> + return "__DATA,__llvm_pgo_data";
>> + else
>> + return "__llvm_pgo_data";
>> +}
>
> This smells wrong to me. (1) We're at a pretty high level for checking the
> triple, and (2) why is MachO special?
In MachO, you need to specify a segment and a section. Not true for ELF
(I admit I don’t know anything about COFF, or much at all about linkage).
Since we want symbols to go to a special section, we need to know which
triple so we can format the section name correctly.
>>
>> - llvm::Type *Int64PtrTy = llvm::Type::getInt64PtrTy(Ctx);
>> - llvm::Type *Args[] = {
>> - Int8PtrTy, // const char *FuncName
>> - Int32Ty, // uint32_t NumCounters
>> - Int64PtrTy // uint64_t *Counters
>> +llvm::GlobalVariable *CodeGenPGO::buildDataVar() {
>> + // Create name variable.
>> + llvm::LLVMContext &Ctx = CGM.getLLVMContext();
>> + auto *VarName =
>> + llvm::ConstantDataArray::getString(Ctx, getFuncName(), false);
>> + auto *Name =
>> + new llvm::GlobalVariable(CGM.getModule(), VarName->getType(), true,
>> + FuncLinkage, VarName, getFuncVarName("name"));
>
> I think this reads better without the newlines after =:
>
> auto *VarName = llvm::ConstantDataArray::getString(Ctx, getFuncName(), false);
> auto *Name = new llvm::GlobalVariable(CGM.getModule(), VarName->getType(),
> true, FuncLinkage, VarName,
> getFuncVarName("name"));
I’ll try that out.
>> llvm::Function *CodeGenPGO::emitInitialization(CodeGenModule &CGM) {
>> - llvm::Function *WriteoutF =
>> - CGM.getModule().getFunction("__llvm_pgo_writeout");
>> - if (!WriteoutF)
>> - return NULL;
>> + if (!CGM.getCodeGenOpts().ProfileInstrGenerate)
>> + return 0;
>
> Why wasn't this check needed here before?
This check was implicit before. Since this is an in the CodeGenPGO
API, I think it’s better to make it obvious that nothing happens if
-fprofile-instr-generate is off.
More practically...
>> - // Create a small bit of code that registers the "__llvm_pgo_writeout" to
>> - // be executed at exit.
>> - llvm::Function *F = CGM.getModule().getFunction("__llvm_pgo_init");
>> - if (F)
>> - return NULL;
>> + // Only need to create this once per module.
>> + if (CGM.getModule().getFunction("__llvm_pgo_init"))
>> + return 0;
>>
>> - llvm::LLVMContext &Ctx = CGM.getLLVMContext();
>> - llvm::FunctionType *FTy = llvm::FunctionType::get(llvm::Type::getVoidTy(Ctx),
>> - false);
>> - F = llvm::Function::Create(FTy, llvm::GlobalValue::InternalLinkage,
>> - "__llvm_pgo_init", &CGM.getModule());
>> + // Get the functions to call at initialization.
>> + llvm::Constant *RegisterF = getRegisterFunc(CGM);
>> + llvm::Constant *WriteAtExitF = getOrInsertRuntimeWriteAtExit(CGM);
>> + if (!RegisterF && !WriteAtExitF)
>> + return 0;
...getOrInsertRuntimeWriteAtExit() currently inserts a declaration
(eventually, that will depend on a command-line switch and possibly
return 0). Without the early exit above, we will always add the
declaration, and always continue on to add __llvm_pgo_init().
>> +
>> /// counters depending on whether we are generating or using instrumentation.
>> void assignRegionCounters(const Decl *D, llvm::Function *Fn);
>> /// Emit code to write counts for a given function to disk, if necessary.
>> - void emitWriteoutFunction();
>> + void emitInstrGenerateData();
>
> Update the doc comment here?
Nice catch.
>> +/* TODO: Calculate these with linker magic.
>> + */
>
> Why is the */ on it's own line? That's unusual for a one-line c-style
> comment. This applies to several comments later in the file as well.
I’ll fix them up. My C fu is weak.
>> +static DataType *First = NULL;
>> +static DataType *Final = NULL;
>> +/*! \brief Register an instrumented function.
>> + *
>> + * Calls to this are inserted by clang with -fprofile-instr-generate. They
>> + * should only be required on targets where we can't use linker magic to find
>> + * the bounds of the section.
>> + */
>
> This comment isn't actually true yet. I'd leave it out until we actually
> do the linker magic.
I’ll reword it to be more clear. Technically the comment is true. They
*should* only be required on such targets.
>> +void __llvm_pgo_register_write_atexit() {
>> + /* TODO: Requires userspace: move to separate file.
>> + */
>> + static int WriteAtExit = 0;
>
> Past tense was more clear here. If you don't like "ran", something like
> "RegisteredAtExit" would be fine.
I agree. I’ll fix that up.
More information about the cfe-commits
mailing list