[PATCH] D60270: [llvm-objcopy] Add support for Intel HEX input/output format

Mon Apr 22 13:16:46 PDT 2019

jakehehrlich added a comment.

There's a lot of code to review here. I'll keep reviewing it everyday but this is going to take a while to review. Any help on splitting this up and making into smaller chunks would be helpful. Splitting reading and writing up into two separate patches would be helpful and removing features that we can add later would be helpful.

================
Comment at: include/llvm/Support/Error.h:1180
   friend Error createFileError(const Twine &, Error);
+  friend Error createFileError(const Twine &, size_t, Error);

----------------
size_t here and below is kind of confusing, can we use uint32_t?

================
Comment at: tools/llvm-objcopy/ELF/Object.cpp:157
+
+template <class T> static T checkedGetHex(StringRef S) {
+  T Value;
----------------
Unless an check that generates an error always proceeds this I think its best to return an error in this case, not assert fail. It would be better to roll this into an Expected function in that case I think anyway.

================
Comment at: tools/llvm-objcopy/ELF/Object.cpp:197
+  *Iter++ = ':';
+  Iter = utohexstr(Data.size(), Iter, 2);
+  Iter = utohexstr(Addr, Iter, 4);
----------------
Maybe a raw_ostream would be useful here. We've generally avoided them but this format seems to lend itself to streams where as my opinion was the opposite before. You wouldn't need utohexstr since those formatting options are already supplied by the library I believe.

================
Comment at: tools/llvm-objcopy/ELF/Object.cpp:209
+
+static Error checkChars(StringRef Line) {
+  assert(!Line.empty());
----------------
This is a very generic name with no comment. In general your comments have been awesome. I'd like to have an idea what this function does without reading the contents.

================
Comment at: tools/llvm-objcopy/ELF/Object.cpp:298
+
+static uint64_t sectionPhysicalAddr(const SectionBase *Sec) {
+  Segment *Seg = Sec->ParentSegment;
----------------
Does this ever make sense if there is no segment?

================
Comment at: tools/llvm-objcopy/ELF/Object.cpp:310
+  const uint32_t ChunkSize = 16;
+  uint32_t Addr = sectionPhysicalAddr(Sec) & 0xFFFFFFFFU;
+  while (!Data.empty()) {
----------------
Masking that like this seems redundant, in general the number of places we're converting from 64 to 32 in an unchecked way is really shocking. I'd feel a lot more comfortable if we encapsulated these checks more and made them more clear.

================
Comment at: tools/llvm-objcopy/ELF/Object.cpp:313
+    uint64_t DataSize = std::min<uint64_t>(Data.size(), ChunkSize);
+    if (Addr > SegmentAddr + BaseAddr + 0xFFFFU) {
+      if (Addr > 0xFFFFFU) {
----------------
Maybe we could split support for extended records out into a sperate patch and error out here for now?

================
Comment at: tools/llvm-objcopy/ELF/Object.h:266
+// calculation in IHexWriter::finalize.
+class IHexSectionWriterBase : public BinarySectionWriter {
+  // 20-bit segment address
----------------
What's the point in splitting this into two classes?

Also does inheriting from BinarySectionWriter make sense? The same visitors will need to be implemented I would suppose the offsets and everything would be very different.

================
Comment at: tools/llvm-objcopy/ELF/Object.h:296
+// Real IHEX section writer
+class IHexSectionWriter : public IHexSectionWriterBase {
+public:
----------------
In general I think it might be worth considering weather there is a need to use sections at all. Originally with the binary writer we only used program headers. It turns out that people did a lot of stuff in a really odd way with GNU objcopy when using -O binary that required that we use sections as the primary basis for output. I would imagine that ihex users would not be doing the same sorts of odd tricks and that you could write the output strait from program headers. This would simplify the implementation greatly I think and harden the implementation against all sorts of odd corner cases.

================
Comment at: tools/llvm-objcopy/ELF/Object.h:498-507
+  OwnedDataSection(const Twine &SecName, uint64_t SecAddr, uint64_t SecFlags,
+                   uint64_t SecOff) {
+    Name = SecName.str();
+    Type = ELF::SHT_PROGBITS;
+    Addr = SecAddr;
+    Flags = SecFlags;
+    OriginalOffset = SecOff;
----------------
Why do we need this to output a new format?

================
Comment at: tools/llvm-objcopy/ELF/Object.h:932

+class IHexReader : public Reader {
+  MemoryBuffer *MemBuf;
----------------
We can probably split this into two changes to make things smaller, one for reading, and one for writing yeah?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60270/new/

https://reviews.llvm.org/D60270