[PATCH] D102039: [profile] WIP Add binary id into profiles

Roland McGrath via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 1 18:26:21 PDT 2021


mcgrathr added inline comments.


================
Comment at: compiler-rt/lib/profile/InstrProfilingPlatformLinux.c:99
+    while (Note < NotesEnd) {
+      if (Note->n_type == NT_GNU_BUILD_ID)
+        BinaryIdsSize++;
----------------
This is insufficient.  The note "name" determines the space of n_type values.  You must verify that the n_namesz=4 and the name bytes are "GNU\0" as well, or else this is some unrelated note that happens to use n_type=3 where 3 does not mean NT_GNU_BUILD_ID.



================
Comment at: compiler-rt/lib/profile/InstrProfilingPlatformLinux.c:102
+
+      Note = Note + sizeof(ElfW(Nhdr)) + RoundUp(Note->n_namesz, 4) +
+             RoundUp(Note->n_descsz, 4);
----------------
This arithmetic is not right since Note has a type with sizeof>1.  An example of correct arithmetic is:
```
  Note = (const ElfW(Nhdr)*)((const char *)(Note + 1) + RoundUp(Note->n_namesz, 4) + RoundUp(Note->n_descsz, 4));
```


================
Comment at: compiler-rt/lib/profile/InstrProfilingPlatformLinux.c:117
+
+  uint32_t I;
+  /* Iterate through entries in the program header. */
----------------
Can you abstract this logic into a subroutine so the code is not repeated in the two functions?
A simple approach would be a subroutine that takes a maybe-null ProfDataWriter pointer and when called with a null pointer just returns the size instead of calling the writer.



================
Comment at: compiler-rt/test/profile/binary-id.c:1
+// RUN: %clang_profgen -O2 -o %t %s
+// RUN: env LLVM_PROFILE_FILE=%t.profraw %run %t
----------------
You may need to pass explicit -Wl,--build-id=none through the compiler here since it's sometimes on by default.


================
Comment at: llvm/include/llvm/ProfileData/InstrProfReader.h:85
 
+  // Get number of binary ids.
+  /// TODO: Can we implement the following three functions
----------------
Is this the number of distinct IDs, or the byte size of the total block?  (AIUI the total block is a sequence of {uint64_t n, uint8_t[n]} blocks.)  If it's the number of distinct IDs, then I don't understand how a single uint8_t* from getBinaryIds() is meant to be used.  For a number like that, I think Count is a clearer name to use than Size.



================
Comment at: llvm/include/llvm/ProfileData/InstrProfReader.h:91
+  // Get binary ids start position.
+  virtual const uint8_t *getBinaryIds() { return nullptr; };
+
----------------
Should these really be two separate methods?  Or should it just be one method that returns both the size and the data in a span-style data structure?


================
Comment at: llvm/include/llvm/ProfileData/InstrProfReader.h:499
+  // Otherwise, returns the input Cur
+  // const unsigned char *readBinaryIdikos(const unsigned char *Cur);
+  const unsigned char *readBinaryIds(const unsigned char *Cur);
----------------
I don't understand the commented-out declaration.


================
Comment at: llvm/lib/ProfileData/InstrProfReader.cpp:558
+
+    BinaryId += BinaryIdLen;
+  }
----------------
This is wildly susceptible to bad data.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102039/new/

https://reviews.llvm.org/D102039



More information about the llvm-commits mailing list