[PATCH] D124804: [Object][DX] Parse DXContainer Parts

Jakub Kuderski via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 19 17:24:17 PDT 2022


kuhar requested changes to this revision.
kuhar added a comment.
This revision now requires changes to proceed.

On high level, would it be possible to use an existing class like `llvm::BinaryStreamReader` instead that most of this data reading machinery?

And a general nit: I'm biased towards using `ArrayRef`s to represent data and `StringRef`s to represent text, but I don't think there's a consensus on this.



================
Comment at: llvm/include/llvm/BinaryFormat/DXContainer.h:86
+
+  void byteSwap() { sys::swapByteOrder(Size); }
   // Structure is followed directly by part data: uint8_t PartData[PartSize].
----------------
nit: `swapBytes`? LLVM prefers starting function names with verbs,  especially the state-mutating ones: https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly


================
Comment at: llvm/include/llvm/Object/DXContainer.h:39-42
+  // The PartIterator is a wrapper around the iterator for the PartOffsets
+  // member of the DXContainer. It contains a refernce to the container, and the
+  // current iterator value, as well as storage for a parsed part header.
+  class PartIterator {
----------------
nit: could we pull this class out of `DXContainer`? I think this would make it a bit more readable, especially since we have another level of nesting with the `PartData` struct.


================
Comment at: llvm/include/llvm/Object/DXContainer.h:58
+
+    void updateIterator();
+
----------------
Could you add a comment with a brief description of what this function does? It's impossible to tell based on the signature alone.


================
Comment at: llvm/lib/Object/DXContainer.cpp:34
+template <typename T>
+static Error readValue(StringRef Buffer, const char *P, T &Val) {
+  // Don't read before the beginning or past the end of the file
----------------
Can you give `P` a more descriptive name and/or add a function comment?
Also, why not return `Expected<T>`


================
Comment at: llvm/lib/Object/DXContainer.cpp:39
+
+  Val = *reinterpret_cast<const T *>(P);
+  // DXContainer is always little endian
----------------
What if `P` is not properly aligned for `T*`?
If it must be, can we add an assertion? If not, I think we would have to use a `memcpy`.

Also, shouldn't we check that `T` is trivially copyable in the first place?


================
Comment at: llvm/lib/Object/DXContainer.cpp:76-77
+void DXContainer::PartIterator::updateIterator() {
+  if (OffsetIt == Container.PartOffsets.end())
+    return;
+  StringRef Buffer = Container.Data.getBuffer();
----------------
Would it be possible for `OffsetIt` to equal the end iterator and end up with an uninitialized `IteratorState`?


================
Comment at: llvm/unittests/Object/DXContainerTest.cpp:56
+
+TEST(DXCFile, ParsePartMissingOffsets) {
+  uint8_t Buffer[] = {
----------------
Could we have a test with an empty buffer, so that we know that this corner case was considered and whether this is supported or not?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124804/new/

https://reviews.llvm.org/D124804



More information about the llvm-commits mailing list