[PATCH] D60270: [llvm-objcopy] Add support for Intel HEX input/output format

Fri Apr 5 10:36:41 PDT 2019

evgeny777 marked 4 inline comments as done.
evgeny777 added inline comments.

================
Comment at: tools/llvm-objcopy/ELF/Object.cpp:1450-1460
+Error IHexReader::checkChars(StringRef Line, size_t LineNo) const {
+  if (Line[0] != ':')
+    return parseError("line %zu: missing ':' in the beginning of line.",
+                      LineNo);
+
+  for (size_t Pos = 1; Pos < Line.size(); ++Pos)
+    if (hexDigitValue(Line[Pos]) == -1U)
----------------
rupprecht wrote:
> WDYT about just using llvm::Regex here instead of this method? It may be easier to read code if it just attempts to match ":[0-9A-F]+".  It would produce less precise error messages, though.
I think that precise error message is more important. It might be hard in some cases to identify wrong character, e.g: `I` instead of `1`, `O` instead of `0`, russian `A` instead of english `A` and so on.

================
Comment at: tools/llvm-objcopy/ELF/Object.cpp:1451
+Error IHexReader::checkChars(StringRef Line, size_t LineNo) const {
+  if (Line[0] != ':')
+    return parseError("line %zu: missing ':' in the beginning of line.",
----------------
rupprecht wrote:
> I think this will crash (or be UB) on an empty line?
Line is checked for minimal valid length earlier in the code. Though, it makes sense to assert here on `!Line.empty()`

================
Comment at: tools/llvm-objcopy/ELF/Object.cpp:1469-1470
+    break;
+  case IHexRecord::EndOfFile:
+    break;
+  case IHexRecord::SegmentAddr:
----------------
rupprecht wrote:
> I think there should be validation (somewhere) that there are no more records after this
I think that `EndOfFile` record should unconditionally cancel further processing. . This allows moving EOF record within a file to temporarily prevent part of records from loading. This can be useful for testing. Also it seems GNU objcopy behaves this way.

================
Comment at: tools/llvm-objcopy/ELF/Object.cpp:1525-1532
+    size_t DataLen = checkedGetHex<uint8_t>(Line.substr(1, 2));
+    if (Line.size() != IHexRecord::getLength(DataLen))
+      return parseError("line %zu: invalid line length %zu (should be %zu)",
+                        LineNo, Line.size(), IHexRecord::getLength(DataLen));
+
+    Rec.Addr = checkedGetHex<uint16_t>(Line.substr(3, 4));
+    Rec.Type = checkedGetHex<uint8_t>(Line.substr(7, 2));
----------------
rupprecht wrote:
> Once we've validated it, can we convert the whole hex string to separate ArrayRef<uint8/16_t> fields for each record, so we don't have to worry about it being valid everywhere (i.e. using checkedGetHex)?
It's possible, but I don't see straight way to do this w/o dynamic memory allocation. As we're checking string with `checkChars` we shouldn't really step on conversion error, unless something really weird happens.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60270/new/

https://reviews.llvm.org/D60270