[PATCH] D60270: [llvm-objcopy] Add support for Intel HEX input/output format

Fri May 24 04:20:46 PDT 2019

evgeny777 added inline comments.

================
Comment at: test/tools/llvm-objcopy/ELF/ihex-writer.test:1
+# RUN: yaml2obj %p/Inputs/ihex-elf-sections.yaml -o %t
+# RUN: llvm-objcopy -O ihex %t %t2.hex
----------------
rupprecht wrote:
> All these hex outputs have diffs when compared to what GNU objcopy produces... is that expected? I haven't yet debugged exactly why.
After I stopped emitting '03' record for zero start address output from `ihex-elf-sections.yaml` is identical to one of GNU objcopy.
However if input ELF file contains segments situation is different - GNU objcopy seems to ignore segments completely and always uses section virtual address. This doesn't seem logical to me and also doesn't look consistent with the way we're currently generating binary output in llvm-objcopy.

================
Comment at: test/tools/llvm-objcopy/ELF/ihex-writer.test:20
+# RUN: yaml2obj %p/Inputs/ihex-elf-sections2.yaml -o %t-sec2
+# RUN: llvm-objcopy -O ihex --only-section=.text1 %t-sec2 %t-sec2.hex
+# RUN: cat %t-sec2.hex | FileCheck %s --check-prefix=SIGN_EXTENDED
----------------
rupprecht wrote:
> When I run GNU objcopy on this test case, I get an error: `address 0xffffffff80001000 out of range for Intel Hex file`. Maybe we shouldn't be supporting it? Are we able to handle it correctly somehow even though GNU objcopy can't?
Probably the problem is in the version of objcopy you're using. On my machine 2.30 fails, but 2.32.51.20190227 works fine

================
Comment at: tools/llvm-objcopy/ELF/Object.cpp:155
+  // Sign extended 32 bit addresses (e.g 0xFFFFFFFF80000000) are ok
+  return Addr > UINT32_MAX && Addr + 0x80000000 > UINT32_MAX;
+}
----------------
rupprecht wrote:
> Isn't relying on `Addr + 0x80000000` to loop around UB? Could this just directly check `Addr & 0xffffffff80000000 == 0xffffffff80000000` instead?
As far as I know unsigned overflows (unlike signed) are not UB. Addr is uint64_t.

================
Comment at: tools/llvm-objcopy/ELF/Object.cpp:205
+  Iter = utohexstr(getChecksum(S), Iter, 2);
+  *Iter++ = '\n';
+  assert(Iter == Line.end());
----------------
rupprecht wrote:
> It looks like ihex uses `\r\n` line endings 😦
> 
> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/ihex.c;h=101e0a76155fc48f95312c08307739cf9c1ee5eb;hb=HEAD#l752
> 
> It seems weird for me to request this, but I think we should write `\r\n`, as this seems like a strange detail that people might need when consuming these files. I don't actually have any examples of this, however.
Yes, I've seen this also. Nothing is said in IHEX spec about the line endings. Wikipedia tells that:
```
Programs that create HEX records typically use line termination characters that conform to the conventions of their operating systems
```
Probably the easiest thing to do is to stick to GNU behavior. I'll update the patch

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60270/new/

https://reviews.llvm.org/D60270