[PATCH] D154641: [ELF] Add --compress-sections

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 7 09:49:29 PDT 2023


MaskRay planned changes to this revision.
MaskRay added a comment.

In D154641#4479852 <https://reviews.llvm.org/D154641#4479852>, @peter.smith wrote:

>> We compute the section content/size once in finalizeAddressDependentContent before compression. If the content or size changes, the compressed content will be invalid, but we don't detect changed content (e.g., data commands). However, we detect size changes in assignOffsets.
>
> I guess this means that if the writeTo() has any relocations they won't work with compression. The presence of relocations or possibly use of one of the relocate functions could generate an error. It probably wouldn't be intuitive to a user, but would protect them from wasting hours wondering why their data was corrupt (I'm assuming few people read the documentation). Off the top of my head "Cannot compress <output section>, <input section> from <object> contains relocations."

I agree. The current compress-once approach has a severe limitation and is error-prone. Worse, it does not consider thunks:

- The uncompressed section content decides the compressed section size.
- The compressed section size affects addresses of subsequent sections and symbol assignments. The affected sections include text sections that use range extension thunks.
- Subsequent sections and symbol assignments may affect the uncompressed section content.
  + PC-relative references to text sections (e.g., `.quad .text.foo-.`) change values when the text section address changes.
  + data commands in an output section description may change.
  + location counter increments (e.g., `. += expr;`) in an output section description may change.

  SECTIONS {
    ...
    foo : { *(foo*) QUAD(expr1) . += expr2; }
  }
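
To make the circular dependency above concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not lld code: `build_foo` and `layout` are hypothetical names, `foo` is assumed to be the only compressed section, `.text` is assumed to start right after it with no alignment padding, and zlib stands in for whatever compression format is selected.

```python
import zlib


def build_foo(text_addr: int, foo_addr: int) -> bytes:
    """Uncompressed content of the hypothetical output section `foo`."""
    payload = b"A" * 64
    # Stand-in for `.quad .text.foo - .`: the value depends on where .text lands.
    delta = (text_addr - foo_addr) & 0xFFFFFFFFFFFFFFFF
    return payload + delta.to_bytes(8, "little")


def layout(foo_addr: int = 0x1000, max_iters: int = 10):
    """Repeat content -> compress -> assign addresses until nothing moves."""
    text_addr = foo_addr  # initial guess: `foo` occupies zero bytes
    for i in range(1, max_iters + 1):
        content = build_foo(text_addr, foo_addr)
        compressed = zlib.compress(content)
        # The compressed size decides the address of the following section.
        new_text_addr = foo_addr + len(compressed)
        if new_text_addr == text_addr:
            return i, text_addr, compressed
        text_addr = new_text_addr
    raise RuntimeError("layout did not converge")


if __name__ == "__main__":
    iters, text_addr, compressed = layout()
    print(f"converged after {iters} iterations, .text at {text_addr:#x}")
```

Compressing only once corresponds to stopping after the first loop iteration: the stored bytes would then encode a `.text` address that is no longer correct. The loop returns only when the address used to build the content equals the address implied by the compressed size, and it gives up after `max_iters` rounds because convergence is not guaranteed in general.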



> In armlink which does read-write data compression, we have this rather complicated scheme:
>
> - Allocate Final VMA Addresses, with predictions for LMA
> - Filter out relocations (in non compressed sections) to linker defined symbols that depend on a compressed address, this is easier in armlink as linker defined symbols are heavily constrained.
> - Resolve relocation
> - Compress RW Data
> - Allocate post compression addresses, VMA remain the same, LMA Addresses may change.
> - Resolve the filtered relocations
>
> This adds considerable complexity though.
>
> Not had a chance to go through the code and tests yet, been a very busy week. Will try and do that as soon as possible.

I am curious how the final VMA addresses are determined. Don't relocations in an uncompressed section's content affect the compressed section size?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154641/new/

https://reviews.llvm.org/D154641


