[LLVMdev] [RFC] Stripping unusable intrinsics
Chris Bieneman
beanz at apple.com
Tue Dec 23 10:28:24 PST 2014
> On Dec 23, 2014, at 10:19 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>
>
>> On Dec 23, 2014, at 9:45 AM, Chris Lattner <clattner at apple.com> wrote:
>>
>> On Dec 22, 2014, at 2:56 PM, Chris Bieneman <beanz at apple.com> wrote:
>>> Circling back to Chandler on file size differences. Here are the highlights of what is different.
>>>
>>> For my analysis I built LLVM and Clang using a clang built with my patches. The same clang was used for the baseline and the stripped build. I used the following CMake command:
>>>
>>> cmake -G "Sublime Text 2 - Ninja" -DCMAKE_BUILD_TYPE=Release -DLLVM_BUILD_LLVM_DYLIB=Yes -DLLVM_DISABLE_LLVM_DYLIB_ATEXIT=On -DCMAKE_INSTALL_PREFIX=/Users/cbieneman/dev/llvm-install/ -DLLVM_TARGETS_TO_BUILD="AArch64;ARM;X86" ../llvm
>>>
>>> and built using Ninja.
>>>
>>> Created a fresh build directory and built once as a baseline from the following revisions:
>>>
>>> LLVM - ba05946
>>> lld - 33bd1dc
>>> clang - 1589d90 (With my patches applied)
>>>
>>> I then applied my tablegen and CMake patches, made a new build directory, and built a second time. I then compared the file sizes between the two directories by diffing the output of:
>>>
>>> find . -type f -exec stat -f '%N %z' '{}' + | sort
>>>
>>> The biggest benefits are an 11% reduction in size for libLLVMCore, which is mostly due to Function.cpp.o reducing in size by 300KB (almost 39%). The biggest thing in there that would contribute to actual code size is the almost 28,000 line switch statement that provides the implementation for Function::lookupIntrinsicID.
>>
>> That makes sense. It sounds like there is a better design here: we should move to a model where intrinsic tables are registered by any targets that are activated. That would allow the intrinsic tables (including these switch/lookup mapping tables) to be in the target that uses them.
>>
>> It should be straight-forward to have something like LLVMInitializeX86Target/RegisterTargetMachine install the intrinsics into a registry.
>
> I tried doing that a few years ago. It’s not nearly as easy as it sounds because we’ve got hardcoded references to various target intrinsics scattered throughout the code.
I was just writing to say exactly this. There are a number of ways we could work toward something like this. I’m completely in favor of a world where intrinsics are properties of the targets and don’t leak out; today, however, they do in a lot of places.
Some of these places could be replaced with subtarget hooks with very little issue, and we could certainly have target initialization register intrinsics.
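To make the registration idea concrete, here is a minimal sketch of a registry that target initialization could populate. Every name in it (IntrinsicTable, IntrinsicRegistry, registerTable, findTarget) is hypothetical and does not exist in LLVM; it only illustrates the shape of the idea, not a proposed patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-target intrinsic table. Not an LLVM type.
struct IntrinsicTable {
  std::string TargetPrefix;        // e.g. "x86"
  std::vector<std::string> Names;  // intrinsic names owned by that target
};

// Hypothetical registry that only knows about targets that were initialized.
class IntrinsicRegistry {
  std::vector<IntrinsicTable> Tables;

public:
  // Intended to be called from a target's initialization routine.
  // Returns the slot assigned to the table.
  std::size_t registerTable(IntrinsicTable T) {
    assert(findTarget(T.TargetPrefix) == nullptr && "target registered twice");
    Tables.push_back(std::move(T));
    return Tables.size() - 1;
  }

  // Linear scan over the registered targets. The returned pointer is only
  // valid until the next call to registerTable.
  const IntrinsicTable *findTarget(const std::string &Prefix) const {
    for (const IntrinsicTable &T : Tables)
      if (T.TargetPrefix == Prefix)
        return &T;
    return nullptr;
  }

  std::size_t numTargets() const { return Tables.size(); }
};
```

With something like this, a build that never initializes a target never carries that target's table, but the hardcoded references mentioned above would still have to be moved behind hooks first.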
One cost of having intrinsics live in the targets is that we would give up some pretty substantial savings we get today in constant data size from having all the intrinsics generated together. Today intrinsic IDs are used as indices into tables that map to other characteristics, and the tables are uniqued to reduce their size.
One possible solution would be to only have the intrinsic enum remain part of the IR library, but have everything else get broken up by target. We could make this work by having Tablegen specify enum values using the high-order bits to signify which target, and the low-order bits to signify the index into that target’s intrinsic tables. This would probably get us a big chunk of the overall code size savings.
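A rough sketch of that encoding, assuming an 8/24 bit split within a 32-bit ID. The split, the TargetTag values, and the helper names are all made up for illustration; Tablegen would have to pick the real layout.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: the top 8 bits select the target, the low 24 bits are the
// index into that target's intrinsic tables.
constexpr unsigned kIndexBits = 24;
constexpr std::uint32_t kIndexMask = (1u << kIndexBits) - 1;

// Hypothetical target tags; 0 is reserved for target-independent intrinsics.
enum class TargetTag : std::uint32_t { Generic = 0, AArch64 = 1, ARM = 2, X86 = 3 };

// Pack a target tag and a per-target index into one ID.
inline std::uint32_t makeIntrinsicID(TargetTag T, std::uint32_t Index) {
  assert(Index <= kIndexMask && "index does not fit in the low-order bits");
  return (static_cast<std::uint32_t>(T) << kIndexBits) | Index;
}

// Recover the target from the high-order bits.
inline TargetTag targetOf(std::uint32_t ID) {
  return static_cast<TargetTag>(ID >> kIndexBits);
}

// Recover the per-target table index from the low-order bits.
inline std::uint32_t indexOf(std::uint32_t ID) { return ID & kIndexMask; }
```

The enum values stay visible to the IR library, while each target only needs tables sized for its own indices. The trade-off is that tables are no longer uniqued across targets.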
I also think that simply sorting the intrinsics by name and doing a binary search, like Sean suggested, should allow us to save a lot of code size, and probably make this faster too.
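For illustration, a name lookup over a sorted table might look like the sketch below, replacing a generated switch with a binary search. The five-entry table and the function name are placeholders, and the sketch ignores overloaded type suffixes (such as ".i32"), which a real lookup would need to handle.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

// Sample table only; it must be kept sorted in strcmp order.
// ID 0 is reserved to mean "not an intrinsic", so entry i maps to ID i + 1.
static const char *const kNames[] = {
    "llvm.bswap", "llvm.ctlz", "llvm.ctpop", "llvm.memcpy", "llvm.sqrt",
};

// Returns the 1-based ID of Name, or 0 if Name is not in the table.
inline unsigned lookupIntrinsicIDByName(const char *Name) {
  const auto Less = [](const char *A, const char *B) {
    return std::strcmp(A, B) < 0;
  };
  assert(std::is_sorted(std::begin(kNames), std::end(kNames), Less) &&
         "name table must be sorted");
  // Binary search for the first entry that is not less than Name.
  const auto It =
      std::lower_bound(std::begin(kNames), std::end(kNames), Name, Less);
  if (It == std::end(kNames) || std::strcmp(*It, Name) != 0)
    return 0;
  return static_cast<unsigned>(It - std::begin(kNames)) + 1;
}
```

The table is plain constant data, so the cost per intrinsic is one pointer plus the string, and lookup takes a logarithmic number of string comparisons.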
-Chris