[PATCH] D78927: [yaml2obj] - Introduce the "Offset" property for sections.

Mon May 18 01:34:04 PDT 2020

jhenderson added inline comments.

================
Comment at: llvm/lib/ObjectYAML/ELFEmitter.cpp:48
-    return OS;
-  }

----------------
grimar wrote:
> jhenderson wrote:
> > grimar wrote:
> > > grimar wrote:
> > > > FTR, after commiting this patch I faced LLD test failture and fixed it:
> > > > https://github.com/llvm/llvm-project/commit/969c63a2ecfb536062ff2174645abe31e4036067
> > > > (it started to report "section [index 1] has a sh_offset (0x1000000000000000) + sh_size (0x4) that is greater than the file size (0x100000120)" instead of "section sh_addralign is too large" LLD's error)
> > > > 
> > > > The issue was related to "AddressAlign:    0x1000000000000000" line in the YAML:
> > > > 
> > > > ```
> > > >   - Name:            .text
> > > >     Type:            SHT_PROGBITS
> > > >     Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
> > > >     AddressAlign:    0x1000000000000000
> > > >     Content:         "00000000"
> > > > ```
> > > > 
> > > > 
> > > > Previously, we had this piece of code which used `unsigned`, so `AddressAlign` was truncated.
> > > > As a result a little binary with `sh_addrallign==0x1000000000000000` was produced.
> > > > After this patch, this piece does not exist and currently for that test a 500mb binary with with `sh_addrallign==0x1000000000000000` is produced. I.e. it works a bit more correct, but
> > > > the binary is is still truncated.
> > > > 
> > > > It happens because of `ELFState<ELFT>::alignToOffset`, which has the following lines:
> > > > 
> > > > ```
> > > >   CBA.getOS().write_zeros(AlignedOffset - CurrentOffset);
> > > >   return AlignedOffset;
> > > > ```
> > > > 
> > > > `AlignedOffset`  and `CurrectOffset` are `uint64_t`, but `raw_ostream::write_zeros` takes `unsigned`:
> > > > 
> > > > ```
> > > > /// indent - Insert 'NumSpaces' spaces.
> > > > raw_ostream &raw_ostream::indent(unsigned NumSpaces) {
> > > >   return write_padding<' '>(*this, NumSpaces);
> > > > }
> > > > 
> > > > /// write_zeros - Insert 'NumZeros' nulls.
> > > > raw_ostream &raw_ostream::write_zeros(unsigned NumZeros) {
> > > >   return write_padding<'\0'>(*this, NumZeros);
> > > > }
> > > > ```
> > > > 
> > > > Probably we want to change it to take `uint64_t`.
> > > > I do not think there is a way to write a test for it though as the result output would be huge.
> > > Ind I do not believe someone really needs `uint64_t` there to put so many zeroes, so I am not sure if we want to change it for "correctness" or not.
> > You're probably right that nobody is likely to need to write that many zeroes at once, but should we have something instead that prevents a silent truncation? Fixing the issue would of course be one way of doing it.
> What if we limit the size of the object produced by yaml2obj to something? E.g 10mb. And introduce an option to change this limit.
> Since it is used often for writing test cases, the outputs produced ideally should be small. And such limit would also allow to catch this and possible other mistakes that might lead to huge binaries. (I am thinking about the `Offset` field which might cause the same).
> 
> And I guess we can change `write_zeros` to take `uint64_t`, but error out when `NumZeros>UINT32_MAX` there too.
> 
I have nothing against a file size limit in yaml2obj, as long as it doesn't significantly increase complexity. It's probably a good safety net.

I'll leave it up to you what to do about `write_zeros`. I don't have a strong opinion on it.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78927/new/

https://reviews.llvm.org/D78927