[llvm-dev] Code size BoF Minutes
Bartell, Sean M via llvm-dev
llvm-dev at lists.llvm.org
Tue Oct 6 11:54:03 PDT 2020
Thanks to everyone who participated in the code size birds-of-a-feather today! Here are the minutes: https://docs.google.com/document/d/1kDsbFDtkWLceR-Y63ez04CBfvhbDN8RxWV3xXwaNUC0/edit?usp=sharing
The minutes are also copied below for convenience.
Code Size BoF Minutes
See also: Aditya Kumar’s talk on “Code Size Compiler Optimizations and Techniques”
* What use cases for code size are important to the LLVM community?
* Deeply embedded devices: IoT, RISC-V
* Memory is a large part of the cost of the chip
* Gaming consoles
* WebAssembly, for faster page loads
* Mobile apps
* Cache-limited programs on desktops, supercomputers
* Real-time systems (both size & speed are important)
* Find the code that isn’t on the critical path and optimize it for size
* Or find thresholds for heuristics that are good enough for both
* Hyperscale systems with lots of processors but only a little memory for each one
* EMU architecture<https://www.emutechnology.com/>
* Epiphany/Adapteva<https://www.adapteva.com/introduction/> architecture
* Why doesn’t -Oz do much better than -O2?
* There are optimizations that make dramatic speed improvements, but not as many that make dramatic size improvements
* It doesn’t disable things like loop unrolling
* It doesn’t disable inlining
* It doesn’t enable loop rerolling
* -Oz affects heuristics for e.g. inlining, but they can still make the wrong decision
* TargetTransformInfo has options for code size, but they often aren’t implemented
* If -Oz uses a poorly tuned pass sequence, we can autotune it with OpenTuner etc.
* Can get ~5% extra size reduction this way.
* LTO is effective for code size; was recently fixed to support -Oz
* Do we need different levels like -Os1, -Os2, -Os3?
* We have profilers for code speed; can we have something similar for code size?
* There is something like this for WebAssembly: Twiggy🌱<https://github.com/rustwasm/twiggy>
* Remarks<https://llvm.org/docs/Remarks.html> help understand what optimizations are being missed
* What code size benchmarks can we use to measure patches?
* Technique: outlining
* How is outlining affected by the size of the instruction set?
* Machine outliner might be affected, but IR-level outlining shouldn’t be affected much
* How do machine outliner and IR-level outliner compare?
* Technique: function merging
* Research paper: “Function Merging by Sequence Alignment”
* Look into string alignment techniques from biology
* Technique: deleting unused code
* What’s the best resource?
* Technique: compressors like gzip
* UPX and LZEXE<https://bellard.org/lzexe.html> save disk space
* To save RAM, you need a compression algorithm that supports random access, like something based on Huffman codes
* Has been implemented in hardware (IBM CodePack)
* Technique: using different optimizations for hot & cold code
* For best results, people currently have to move their cold code to a separate file so they can use different flags, which is painful
* Could we do code size optimizations in the linker?
* Needed to take advantage of RISC-V’s LUI instruction
* Requires copy propagation and dead code elimination
* Instead of inlining functions into each caller, can we keep one copy of the function and specialize it for all its call sites in the same translation unit?
* Attributor already shows improvements when there is recursion and non-trivial but dead code (the level handling in Olden/bisort/bitonic)
1. Situations where code size matters
2. What do other compilers do better?
3. Techniques for code size optimization
4. How can we improve LLVM?
* Program design, library design, etc.
* General optimizations
* Loop idiom recognition (memset, memcpy)
* Partial inlining
* Function merging
* Functionality changes
* Partial evaluation
* Delete unused code