[PATCH] D78927: [yaml2obj] - Introduce the "Offset" property for sections.

Fri May 15 05:53:19 PDT 2020

grimar marked 2 inline comments as done.
grimar added inline comments.

================
Comment at: llvm/lib/ObjectYAML/ELFEmitter.cpp:48
-    return OS;
-  }

----------------
jhenderson wrote:
> grimar wrote:
> > grimar wrote:
> > > FTR, after commiting this patch I faced LLD test failture and fixed it:
> > > https://github.com/llvm/llvm-project/commit/969c63a2ecfb536062ff2174645abe31e4036067
> > > (it started to report "section [index 1] has a sh_offset (0x1000000000000000) + sh_size (0x4) that is greater than the file size (0x100000120)" instead of "section sh_addralign is too large" LLD's error)
> > > 
> > > The issue was related to "AddressAlign:    0x1000000000000000" line in the YAML:
> > > 
> > > ```
> > >   - Name:            .text
> > >     Type:            SHT_PROGBITS
> > >     Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
> > >     AddressAlign:    0x1000000000000000
> > >     Content:         "00000000"
> > > ```
> > > 
> > > 
> > > Previously, we had this piece of code which used `unsigned`, so `AddressAlign` was truncated.
> > > As a result a little binary with `sh_addrallign==0x1000000000000000` was produced.
> > > After this patch, this piece does not exist and currently for that test a 500mb binary with with `sh_addrallign==0x1000000000000000` is produced. I.e. it works a bit more correct, but
> > > the binary is is still truncated.
> > > 
> > > It happens because of `ELFState<ELFT>::alignToOffset`, which has the following lines:
> > > 
> > > ```
> > >   CBA.getOS().write_zeros(AlignedOffset - CurrentOffset);
> > >   return AlignedOffset;
> > > ```
> > > 
> > > `AlignedOffset`  and `CurrectOffset` are `uint64_t`, but `raw_ostream::write_zeros` takes `unsigned`:
> > > 
> > > ```
> > > /// indent - Insert 'NumSpaces' spaces.
> > > raw_ostream &raw_ostream::indent(unsigned NumSpaces) {
> > >   return write_padding<' '>(*this, NumSpaces);
> > > }
> > > 
> > > /// write_zeros - Insert 'NumZeros' nulls.
> > > raw_ostream &raw_ostream::write_zeros(unsigned NumZeros) {
> > >   return write_padding<'\0'>(*this, NumZeros);
> > > }
> > > ```
> > > 
> > > Probably we want to change it to take `uint64_t`.
> > > I do not think there is a way to write a test for it though as the result output would be huge.
> > Ind I do not believe someone really needs `uint64_t` there to put so many zeroes, so I am not sure if we want to change it for "correctness" or not.
> You're probably right that nobody is likely to need to write that many zeroes at once, but should we have something instead that prevents a silent truncation? Fixing the issue would of course be one way of doing it.
What if we limit the size of the object produced by yaml2obj to something? E.g 10mb. And introduce an option to change this limit.
Since it is used often for writing test cases, the outputs produced ideally should be small. And such limit would also allow to catch this and possible other mistakes that might lead to huge binaries. (I am thinking about the `Offset` field which might cause the same).

And I guess we can change `write_zeros` to take `uint64_t`, but error out when `NumZeros>UINT32_MAX` there too.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78927/new/

https://reviews.llvm.org/D78927