[llvm-dev] [PGO] Thoughts on adding a key-value store to profile data formats

Nathan Slingerland via llvm-dev llvm-dev at lists.llvm.org
Fri Jan 15 11:06:24 PST 2016


Hi all,

I'd like to get your thoughts on possibly adding a generic key-value store
to the profile data formats for 'metadata'. Some potential use cases:

*I. Profile Features*

The most basic use could be as a central repository for internal bits of
housekeeping information about the profile data. For example, to
differentiate between FE and IR instrumentation:

  llvm.instrumentation_source: "IR"

A key-value store would make it simple to add new bits of information and
help keep everything human-readable for the text-based test formats. This
could potentially also help with error checking at the llvm-profdata level
if the Reader classes exposed it.
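
As a rough sketch of the error-checking idea, assuming the Reader hands back
the store as a plain mapping. The helper name and the Python rendering are
purely illustrative; nothing like this exists in llvm-profdata today:

```python
def check_same_source(metadata_by_file, key="llvm.instrumentation_source"):
    """Return a list of error strings; empty if all profiles agree on `key`.

    metadata_by_file maps a profile filename to its key-value store (a dict).
    Hypothetical helper, not an existing llvm-profdata API.
    """
    errors = []
    expected = None
    expected_file = None
    for filename, metadata in metadata_by_file.items():
        value = metadata.get(key)
        if value is None:
            # Assumption: profiles that lack the key are let through.
            continue
        if expected is None:
            expected, expected_file = value, filename
        elif value != expected:
            errors.append(
                f"{filename}: {key} is {value!r}, "
                f"but {expected_file} has {expected!r}"
            )
    return errors
```

A merge step could call this before combining counts and refuse to proceed if
the returned list is non-empty.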

*II. Profile Context*

Basic (lightweight) information about the profile could be automatically
gathered at profile time. The idea would be to automatically label profiles
with contextual information so that the age/origin of a profile could be
inspected using the llvm-profdata tool.

  $ llvm-profdata show -metadata foo.profdata
  llvm.profile_start_time: "2016-01-08T23:41:56.755Z"
  llvm.profile_duration: 5.102s
  llvm.exe_time: "2016-01-08T23:35:56.745Z"
  Total functions: 4
  Maximum function count: 866988873
  Maximum internal block count: 267914296

Other possibilities: executable path, command line arguments, system info
(uname)
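
To make the shape of the automatically gathered entries concrete, here is a
minimal sketch in Python. It assumes the runtime knows when profiling started
and stopped; the function name is hypothetical and the real collection would
presumably happen in compiler-rt rather than in a script:

```python
from datetime import datetime, timedelta, timezone


def profile_context(start, end):
    """Build the context entries from two timezone-aware UTC datetimes.

    Hypothetical helper; the key names follow the example output above.
    """
    # ISO 8601 with millisecond precision and a trailing 'Z' for UTC.
    stamp = start.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    seconds = (end - start).total_seconds()
    return {
        "llvm.profile_start_time": stamp,
        "llvm.profile_duration": f"{seconds:.3f}s",
    }
```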

*III. Custom Content*

The key-value store itself could be exposed to developers via the
llvm-profdata tool. This would let users associate arbitrary
custom data with a profile, as well as inspect it:

  $ llvm-profdata merge -metadata=customkey,value1 foo.profraw -o foo.profdata
  $ llvm-profdata show -metadata foo.profdata
  customkey: "value1"
  Total functions: 4
  Maximum function count: 866988873
  Maximum internal block count: 267914296
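
One way the proposed -metadata=key,value argument might be parsed, sketched in
Python. Splitting at the first comma only is my assumption, so that values may
themselves contain commas; the function name is hypothetical:

```python
def parse_metadata_option(arg):
    """Split 'key,value' at the first comma; the value may contain commas.

    Hypothetical parser for the proposed (not yet existing) -metadata option.
    """
    key, sep, value = arg.partition(",")
    if not sep or not key:
        raise ValueError(f"expected KEY,VALUE but got {arg!r}")
    return key, value
```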

Developers could add as much custom context as they find valuable:

  $ llvm-profdata merge \
      -metadata="mysoft.version,${SOFTWARE_VERSION} (${BUILD_NUMBER})" \
      -metadata="mysoft.exe_md5,`md5 -q foo.exe`" \
      foo.profraw -o foo.profdata
  $ llvm-profdata show -metadata foo.profdata
  mysoft.version: "0.1.0"
  mysoft.exe_md5: "337b5c5bc29cbdca090a1921a58465d6"
  Total functions: 4
  Maximum function count: 866988873
  Maximum internal block count: 267914296

Other information that might be interesting: git/svn revision, workload
description, system info (uname -a)

This would be a way to embed almost any platform-specific or heavy-weight
data without requiring the addition of platform-specific code in
compiler-rt and without impacting other developers.


When profiles are merged it might be simplest to keep all input metadata
(machine-readable things such as feature bits might need to be handled
differently):

  $ llvm-profdata merge -weighted-input=3,foo.profdata bar.profdata \
      -o foobar.profdata
  $ llvm-profdata show -metadata foobar.profdata
  foo.profdata
    llvm.profile_weight: 3
    llvm.profile_start_time: "2016-01-08T23:41:56.755Z"
    llvm.profile_duration: 5.102s
    llvm.exe_time: "2016-01-08T23:35:56.745Z"
    customkey: "value1"
  bar.profdata
    llvm.profile_weight: 1
    llvm.profile_start_time: "2016-01-15T00:08:41.168Z"
    llvm.profile_duration: 1.001s
    llvm.exe_time: "2016-01-15T00:08:13.000Z"
    customkey: "value2"
  Total functions: 4
  Maximum function count: 866988873
  Maximum internal block count: 267914296
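
The "keep everything" merge policy could look roughly like the following
Python sketch. The data layout (one dict per input, keyed by filename) and the
function names are assumptions for illustration, and the formatter simply
quotes every string value:

```python
def merge_metadata(inputs):
    """Keep each input's metadata intact, grouped under its filename.

    inputs is a list of (filename, weight, metadata_dict) tuples.
    Hypothetical sketch of the merge policy, not llvm-profdata code.
    """
    merged = {}
    for filename, weight, metadata in inputs:
        # Record the weight used for this merge first, then copy the rest.
        entry = {"llvm.profile_weight": weight}
        for key, value in metadata.items():
            if key != "llvm.profile_weight":
                entry[key] = value
        merged[filename] = entry
    return merged


def format_metadata(merged):
    """Render the grouped metadata as indented 'key: value' lines."""
    lines = []
    for filename, entry in merged.items():
        lines.append(filename)
        for key, value in entry.items():
            shown = f'"{value}"' if isinstance(value, str) else str(value)
            lines.append(f"  {key}: {shown}")
    return "\n".join(lines)
```

Machine-readable entries such as feature bits would need a different rule
(for example, requiring all inputs to agree), which this sketch does not
attempt.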

In terms of implementation, the metadata could live as a separate
contiguous section in the binary profile formats. It might make sense to
encode it in something like YAML so that it could also be directly embedded
in the various text formats.
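
A minimal sketch of what a contiguous metadata section might look like, in
Python. Everything here is an assumption for illustration: the 8-byte
little-endian length prefix, the one-entry-per-line layout, and the use of
JSON scalars for values (which are also valid YAML scalars). Keys are assumed
to be simple dotted identifiers that do not contain ": ".

```python
import json
import struct


def encode_metadata_section(metadata):
    """Serialize a dict as length-prefixed UTF-8 'key: value' lines."""
    lines = [f"{key}: {json.dumps(value)}" for key, value in metadata.items()]
    payload = ("\n".join(lines) + "\n").encode("utf-8")
    # Hypothetical layout: 64-bit little-endian byte count, then the text.
    return struct.pack("<Q", len(payload)) + payload


def decode_metadata_section(blob):
    """Inverse of encode_metadata_section."""
    (size,) = struct.unpack_from("<Q", blob, 0)
    text = blob[8:8 + size].decode("utf-8")
    metadata = {}
    for line in text.splitlines():
        key, _, value = line.partition(": ")
        metadata[key] = json.loads(value)
    return metadata
```

The text payload after the length prefix is the same thing that would be
pasted into a text-format profile, which is the main attraction of using a
human-readable encoding for the section.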

----

What do you think? How useful would any of the above be to you or other PGO
users? Can you think of any other use cases?

-Nathan