PGO: Statically generate data structures

Duncan P. N. Exon Smith dexonsmith at apple.com
Fri Mar 14 21:57:35 PDT 2014


Thanks for the review!

On 2014 Mar 14, at 21:01, Justin Bogner <mail at justinbogner.com> wrote:

>> -    PGO.emitWriteoutFunction();
>> +    PGO.emitInstrGenerateData();
> 
> This name is awkward and confusing.  Why does this say "generate"? The
> data's the same for generation and use. How about either
> "emitInstrProfData", to match the backend naming, or just
> "emitInstrumentationData", which should be obvious from context?

emitInstrumentationData() sounds better.

>> void CodeGenPGO::setFuncName(llvm::Function *Fn) {
>> -  StringRef Func = Fn->getName();
>> +  NormalFuncName = Fn->getName();
> 
> NormalFuncName? Is there a SpecialFuncName? This is just FunctionName,
> isn't it?

There are three function names kicking around there:

 - Fn->getName(), which sometimes has a \01 prefix.

 - NormalFuncName, which takes off the \01 prefix.  It’s used as a suffix
   for symbol names, like the “foo” in __llvm_pgo_counts_foo.

 - FuncName, which already existed, and whose name I didn’t want to
   change, and for static variables has a filename prefix.  E.g.,
   “somefile.cpp:foo" if foo() is static and defined in somefile.cpp, or
   “bar" if bar() is not static.  This is what gets saved in
   __llvm_pgo_name_foo (or __llvm_pgo_name_bar) and used to be sent as an
   argument into llvm_pgo_writeout().

NormalFuncName is normal in that it has no \01 or filename prefix.

I’m open to suggestions, but FunctionName would clash awkwardly with
FuncName.

>> +  auto *RegisterF
>> +    = llvm::Function::Create(RegisterFTy,
>> +                             llvm::GlobalValue::InternalLinkage,
>> +                             "__llvm_pgo_register_functions", &CGM.getModule());
> 
> The indentation is funny here:
> 
>  auto *RegisterF = llvm::Function::Create(RegisterFTy,
>                                           llvm::GlobalValue::InternalLinkage,
>                                           "__llvm_pgo_register_functions",
>                                           &CGM.getModule());

Sure, I’ll change that.

>> +static StringRef getDataSection(const CodeGenModule &CGM) {
>> +  if (CGM.getTarget().getTriple().getObjectFormat() == llvm::Triple::MachO)
>> +    return "__DATA,__llvm_pgo_data";
>> +  else
>> +    return "__llvm_pgo_data";
>> +}
> 
> This smells wrong to me. (1) We're at a pretty high level for checking the
> triple, and (2) why is MachO special?

In MachO, you need to specify a segment and a section.  Not true for ELF
(I admit I don’t know anything about COFF, or much at all about linkage).

Since we want symbols to go to a special section, we need to know which
triple so we can format the section name correctly.

>> 
>> -  llvm::Type *Int64PtrTy = llvm::Type::getInt64PtrTy(Ctx);
>> -  llvm::Type *Args[] = {
>> -    Int8PtrTy,                       // const char *FuncName
>> -    Int32Ty,                         // uint32_t NumCounters
>> -    Int64PtrTy                       // uint64_t *Counters
>> +llvm::GlobalVariable *CodeGenPGO::buildDataVar() {
>> +  // Create name variable.
>> +  llvm::LLVMContext &Ctx = CGM.getLLVMContext();
>> +  auto *VarName =
>> +    llvm::ConstantDataArray::getString(Ctx, getFuncName(), false);
>> +  auto *Name =
>> +    new llvm::GlobalVariable(CGM.getModule(), VarName->getType(), true,
>> +                             FuncLinkage, VarName, getFuncVarName("name"));
> 
> I think this reads better without the newlines after =:
> 
>  auto *VarName = llvm::ConstantDataArray::getString(Ctx, getFuncName(), false);
>  auto *Name = new llvm::GlobalVariable(CGM.getModule(), VarName->getType(),
>                                        true, FuncLinkage, VarName,
>                                        getFuncVarName("name"));

I’ll try that out.

>> llvm::Function *CodeGenPGO::emitInitialization(CodeGenModule &CGM) {
>> -  llvm::Function *WriteoutF =
>> -    CGM.getModule().getFunction("__llvm_pgo_writeout");
>> -  if (!WriteoutF)
>> -    return NULL;
>> +  if (!CGM.getCodeGenOpts().ProfileInstrGenerate)
>> +    return 0;
> 
> Why wasn't this check needed here before?

This check was implicit before.  Since this is an in the CodeGenPGO
API, I think it’s better to make it obvious that nothing happens if
-fprofile-instr-generate is off.

More practically...

>> -  // Create a small bit of code that registers the "__llvm_pgo_writeout" to
>> -  // be executed at exit.
>> -  llvm::Function *F = CGM.getModule().getFunction("__llvm_pgo_init");
>> -  if (F)
>> -    return NULL;
>> +  // Only need to create this once per module.
>> +  if (CGM.getModule().getFunction("__llvm_pgo_init"))
>> +    return 0;
>> 
>> -  llvm::LLVMContext &Ctx = CGM.getLLVMContext();
>> -  llvm::FunctionType *FTy = llvm::FunctionType::get(llvm::Type::getVoidTy(Ctx),
>> -                                                    false);
>> -  F = llvm::Function::Create(FTy, llvm::GlobalValue::InternalLinkage,
>> -                             "__llvm_pgo_init", &CGM.getModule());
>> +  // Get the functions to call at initialization.
>> +  llvm::Constant *RegisterF = getRegisterFunc(CGM);
>> +  llvm::Constant *WriteAtExitF = getOrInsertRuntimeWriteAtExit(CGM);
>> +  if (!RegisterF && !WriteAtExitF)
>> +    return 0;

...getOrInsertRuntimeWriteAtExit() currently inserts a declaration
(eventually, that will depend on a command-line switch and possibly
return 0).  Without the early exit above, we will always add the
declaration, and always continue on to add __llvm_pgo_init().

>> +
>>   /// counters depending on whether we are generating or using instrumentation.
>>   void assignRegionCounters(const Decl *D, llvm::Function *Fn);
>>   /// Emit code to write counts for a given function to disk, if necessary.
>> -  void emitWriteoutFunction();
>> +  void emitInstrGenerateData();
> 
> Update the doc comment here?

Nice catch.

>> +/* TODO: Calculate these with linker magic.
>> + */
> 
> Why is the */ on it's own line? That's unusual for a one-line c-style
> comment.  This applies to several comments later in the file as well.

I’ll fix them up.  My C fu is weak.

>> +static DataType *First = NULL;
>> +static DataType *Final = NULL;
>> +/*! \brief Register an instrumented function.
>> + *
>> + * Calls to this are inserted by clang with -fprofile-instr-generate.  They
>> + * should only be required on targets where we can't use linker magic to find
>> + * the bounds of the section.
>> + */
> 
> This comment isn't actually true yet.  I'd leave it out until we actually
> do the linker magic.

I’ll reword it to be more clear.  Technically the comment is true.  They
*should* only be required on such targets.

>> +void __llvm_pgo_register_write_atexit() {
>> +  /* TODO: Requires userspace: move to separate file.
>> +   */
>> +  static int WriteAtExit = 0;
> 
> Past tense was more clear here. If you don't like "ran", something like
> "RegisteredAtExit" would be fine.

I agree.  I’ll fix that up.



More information about the cfe-commits mailing list