[llvm-dev] RFC: Adding a code size analysis tool

Jake Ehrlich via llvm-dev llvm-dev at lists.llvm.org
Mon Oct 1 16:57:42 PDT 2018


Well, that sounds complicated and non-trivial, but I'd still be fine with it.
I'd still want a story about propagating reasons for sizes. In addition, I'd
want a story about how we plan on migrating all that non-LLVM code to LLVM
style. Sounds like a total rewrite if you ask me.

On Mon, Oct 1, 2018, 4:50 PM David Blaikie via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Is it that it'd be better to have the functionality in LLVM, or in a new
> tool? (is it about it being a different tool, or about it being in the LLVM
> tree, or something else?)
>
> What about possibly moving Bloaty into the LLVM project & improving it
> there?
>
> On Mon, Oct 1, 2018 at 4:48 PM Vedant Kumar <vsk at apple.com> wrote:
>
>> On Oct 1, 2018, at 3:25 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>
>>
>> On Mon, Oct 1, 2018 at 3:24 PM JF Bastien <jfbastien at apple.com> wrote:
>>
>>> On Oct 1, 2018, at 3:16 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>> (my vote, somewhat biased, is that I'd love to see more investment in
>>> Bloaty (to keep all these sorts of size analysis tools and tricks in one
>>> place), but I sort of accept folks are probably going to keep building more
>>> infrastructure for this sort of thing in LLVM directly)
>>>
>>>
>>> I get where that comes from, but it seems a bit like a Valgrind versus
>>> sanitizer argument: integrating with the toolchain gives you things you
>>> can’t really get otherwise. Valgrind is still great as a self-standing
>>> thing.
>>>
>>
>> Not sure that's quite the same though - with the sanitizers, integrating
>> with the optimizers is the key here.
>>
>> With bloaty - it could, at worst, use LLVM's libDebugInfo as a library to
>> implement the more advanced debug-using features without being less
>> functional than an in-LLVM implementation.
>>
>>
>> I’m a bit biased too, but fwiw: my preference would be to add a new size
>> analysis tool to llvm.
>>
>> Such a tool might grow to depend on code for object file parsing, debug
>> info parsing, demangling, and disassembling (all of which bloaty either
>> reimplements or pulls in). Living in-tree should make it easier to pick up
>> bug fixes in these dependencies and reduce maintenance overhead.
>>
>> While I really like bloaty, my impression is that it’d be better to
>> implement the functionality I’d like to use in a new tool.
>>
>> vedant
>>
>>
>>
>> - Dave
>>
>>
>>>
>>>
>>> On Wed, Sep 26, 2018 at 12:03 PM Vedant Kumar <vsk at apple.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I worked on a code size analysis tool for a 'week of code' project and
>>>> think that it might be useful enough to upstream.
>>>>
>>>> The tool is inspired by bloaty (https://github.com/google/bloaty), but
>>>> tries to do more to attribute code size in actionable ways.
>>>>
>>>> For example, it can calculate how many bytes inlined instances of a
>>>> function added to a binary. In its diff mode, it can show how much more
>>>> aggressively a function was inlined compared to a baseline. This can be
>>>> useful when you're, say, trying to figure out why firmware compiled by a
>>>> new compiler is just a few bytes over the size limit imposed by your
>>>> embedded device :). In this case, extra information about inlining can
>>>> help inform a decision to either tweak the inliner's cost model or to
>>>> judiciously add a few `noinline` attributes. (Note that if you're
>>>> willing to recompile & write a few SQL queries, optimization remarks can
>>>> give you similar information, albeit at the IR level.)
>>>>
>>>> As another example, this code size tool can attribute code size to
>>>> semantically interesting groups of code, like C++/Swift classes, or
>>>> files. In the diff mode, you can see how the code size of a class/file
>>>> grew compared to a baseline. The tool understands inheritance, so you
>>>> can also see interesting high-level trends. E.g., `clang::Sema` grew
>>>> more than `llvm::Pass` between clang-6 and clang-7.
>>>>
>>>> Unlike bloaty, this tool focuses exclusively on the text segment. Also
>>>> unlike bloaty, it uses LLVM's DWARF parser instead of rolling its own.
>>>> The tool is currently implemented as a sub-tool of llvm-dwarfdump.
>>>>
>>>> To get size information about a program, you do:
>>>>
>>>>   llvm-dwarfdump size-info -baseline <object> -stats-dir <dir>
>>>>
>>>> This emits four *.stats files into <dir>, each containing a distinct
>>>> 'view' into the code groups in <object>. There's a file view, a function
>>>> view, a class view, and an inlining view. Each view is sorted by code
>>>> size, so you can see the largest functions/classes/etc immediately.
>>>>
>>>> The *.stats files are just human-readable text files. As it happens,
>>>> they use the flamegraph format (http://brendangregg.com/flamegraphs.html).
>>>> This makes it easy to visualize any view as a flamegraph. (If you haven't
>>>> seen one before, it's a hierarchical visualization where the width of
>>>> each entry corresponds to its frequency (or in this case size).)
>>>>
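
For anyone who wants to post-process these files, here is a minimal Python
sketch of reading one record in the folded-stack text format that flamegraph
tooling consumes: frames separated by ';', then whitespace and a number. It
assumes one record per line and no ';' inside frame names. The helpers
`parse_folded_line` and `strip_kind` are hypothetical names of mine, not part
of the proposed tool.

```python
def parse_folded_line(line):
    """Split one folded-stack record into (frames, size_in_bytes)."""
    # The size is the last whitespace-separated token; everything before it
    # is the ';'-separated chain of frames, outermost first.
    stack, size = line.rstrip().rsplit(None, 1)
    return stack.split(";"), int(size)


def strip_kind(frame):
    """Split a frame like 'clang::Sema [class]' into ('clang::Sema', 'class')."""
    if frame.endswith("]") and " [" in frame:
        name, kind = frame[:-1].rsplit(" [", 1)
        return name, kind
    return frame, None  # no trailing [kind] tag on this frame


# A record in the style of the class view examples quoted in this thread.
record = "clang::Sema [class];clang::Sema::ActOnTag([snip]) [function] 22373"
frames, size = parse_folded_line(record)
print(size, [strip_kind(f) for f in frames])
```

Splitting on the last run of whitespace matters because demangled C++ names
contain spaces themselves.
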
>>>> To look at code growth between two programs, you'd do:
>>>>
>>>>   llvm-dwarfdump size-info -baseline <object> -target <object> -stats-dir <dir>
>>>>
>>>> Similarly, this emits four 'view' files into <dir>, but with a
>>>> *.diffstats suffix. The format is the same.
>>>>
>>>> Pending Work
>>>> ------------
>>>>
>>>> I think the main piece of work the tool needs is better testing.
>>>> Currently there's just a single end-to-end test in clang. It might be
>>>> better to check in a few binaries so we can check that the tool reports
>>>> sizes correctly.
>>>>
>>>> Also, it may turn out that folks are interested in different ways of
>>>> visualizing size data. While the textual format of flamegraphs is really
>>>> convenient for humans to read, the graphs themselves do make more sense
>>>> when the underlying data have a frequentist interpretation. If there's
>>>> enough interest I can explore using an alternative format for
>>>> visualization, e.g.:
>>>>
>>>>   http://neugierig.org/software/chromium/bloat/
>>>>   https://github.com/evmar/webtreemap
>>>>
>>>> (Thanks JF for pointing these out!)
>>>>
>>>> Here's a link to the source code:
>>>>
>>>>   https://github.com/vedantk/llvm-project/tree/sizeinfo
>>>>
>>>> Selected Examples
>>>> -----------------
>>>>
>>>> Here are a few interesting snippets from a comparison of clang-6 vs.
>>>> clang-7.
>>>>
>>>> First, let's take a look at the function view diffstat. Here are the 10
>>>> functions which grew in size the most. On the left hand side, you'll see
>>>> the demangled function name. The *change* in code size in bytes is
>>>> reported on the right hand side (only positive changes are reported).
>>>>
>>>>   clang::Sema::CheckHexagonBuiltinCpu([snip]) [function] 170316
>>>>   ProcessDeclAttribute([snip]) [function] 125893
>>>>   llvm::AArch64InstPrinter::printAliasInstr([snip]) [function] 105133
>>>>   llvm::AArch64AppleInstPrinter::printAliasInstr([snip]) [function] 105133
>>>>   ParseCodeGenArgs([snip]) [function] 64692
>>>>   unswitchNontrivialInvariants([snip]) [function] 40180
>>>>   getAttrKind([snip]) [function] 35811
>>>>   clang::DumpCompilerOptionsAction::ExecuteAction() [function] 32417
>>>>   llvm::UpgradeIntrinsicCall([snip]) [function] 30239
>>>>   bool llvm::InstructionSelector::executeMatchTable<(anonymous namespace)::ARMInstructionSelector const, [snip]) const [function] 29352
>>>>
>>>>
>>>> Next, let's look at the file view diffstat. This can be useful because it
>>>> goes beyond simply identifying the files which grew the most. It actually
>>>> describes which *functions* grew the most in those files, creating more
>>>> opportunities to do something about the code growth.
>>>>
>>>>   lib/Target/X86/X86ISelLowering.cpp [file];combineX86ShuffleChain([snip]) [function] 24864
>>>>   lib/Target/X86/X86ISelLowering.cpp [file];combineMul([snip]) [function] 14907
>>>>   lib/Target/X86/X86ISelLowering.cpp [file];combineStore([snip]) [function] 12220
>>>>   ...
>>>>   tools/clang/lib/Sema/SemaExpr.cpp [file];clang::Sema::CheckCompareOperands([snip]) [function] 16024
>>>>   tools/clang/lib/Sema/SemaExpr.cpp [file];diagnoseTautologicalComparison([snip]) [function] 1740
>>>>   tools/clang/lib/Sema/SemaExpr.cpp [file];clang::Sema::ActOnNumericConstant([snip]) [function] 1436
>>>>   tools/clang/lib/Sema/SemaExpr.cpp [file];checkThreeWayNarrowingConversion([snip]) [function] 1356
>>>>   tools/clang/lib/Sema/SemaExpr.cpp [file];CheckIdentityFieldAssignment([snip]) [function] 1280
>>>>
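
To make the file view idea concrete, here is a rough Python sketch that groups
file-view records by their file frame and lists the functions under each file,
largest first. It is only an illustration of reading the format, not how the
tool builds the view. `group_by_file` is a hypothetical helper, and the input
is a handful of the entries quoted above, so the totals cover the excerpt only.

```python
from collections import defaultdict

# A few records taken from the excerpt above, one record per line.
FILE_VIEW = """\
lib/Target/X86/X86ISelLowering.cpp [file];combineX86ShuffleChain([snip]) [function] 24864
lib/Target/X86/X86ISelLowering.cpp [file];combineMul([snip]) [function] 14907
lib/Target/X86/X86ISelLowering.cpp [file];combineStore([snip]) [function] 12220
tools/clang/lib/Sema/SemaExpr.cpp [file];clang::Sema::CheckCompareOperands([snip]) [function] 16024
tools/clang/lib/Sema/SemaExpr.cpp [file];diagnoseTautologicalComparison([snip]) [function] 1740
"""


def group_by_file(text):
    """Map each file frame to its (function frame, growth) pairs, largest first."""
    per_file = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        stack, size = line.rsplit(None, 1)
        # The first frame is the file; the rest of the stack is the function.
        file_frame, func_frame = stack.split(";", 1)
        per_file[file_frame].append((func_frame, int(size)))
    for funcs in per_file.values():
        funcs.sort(key=lambda item: item[1], reverse=True)
    return dict(per_file)


for file_frame, funcs in group_by_file(FILE_VIEW).items():
    total = sum(size for _, size in funcs)
    print(file_frame, total, "largest:", funcs[0][0])
```
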
>>>>
>>>> The class view diffstat is a bit different because it has more levels of
>>>> nesting than the other views, due to inheritance. This might help give a
>>>> sense for the high-level changes in a program, but may also be less
>>>> actionable.
>>>>
>>>>   clang::Sema [class];clang::Sema::CheckHexagonBuiltinCpu([snip]) [function] 170316
>>>>   clang::Sema [class];clang::Sema::CheckHexagonBuiltinArgument([snip]) [function] 24156
>>>>   clang::Sema [class];clang::Sema::ActOnTag([snip]) [function] 22373
>>>>   ...
>>>>   llvm::AArch64InstPrinter [class];llvm::AArch64AppleInstPrinter [class];llvm::AArch64AppleInstPrinter::printAliasInstr([snip]) [function] 105133
>>>>   llvm::AArch64InstPrinter [class];llvm::AArch64AppleInstPrinter [class];llvm::AArch64AppleInstPrinter::printInstruction([snip]) [function] 5824
>>>>   ...
>>>>   llvm::Pass [class];llvm::FunctionPass [class];llvm::MachineFunctionPass [class];(anon)::X86SpeculativeLoadHardeningPass [class];(anonymous namespace)::X86SpeculativeLoadHardeningPass::checkAllLoads(llvm::MachineFunction&) [function] 19287
>>>>   ...
>>>>   llvm::Pass [class];llvm::FunctionPass [class];llvm::MachineFunctionPass [class];(anon)::MachineLICMBase [class];(anonymous namespace)::MachineLICMBase::runOnMachineFunction(llvm::MachineFunction&) [function] 20343
>>>>
>>>> Here's a link to a flamegraph of the class view diffstat (warning: it's
>>>> big):
>>>>
>>>>   http://net.vedantk.com/static/llvm/swift-clang-4.2-vs-5.0.class-view.diffstats.svg
>>>>
>>>> Finally, here are a few interesting entries from the inlining view
>>>> diffstat. As with all of the other views, the right hand side still
>>>> shows code growth in bytes. For a given inlining target, this size is
>>>> computed by diffing the sum of PC range lengths from all
>>>> DW_TAG_inlined_subroutines referring to that target. This allows the
>>>> size tool to attribute code size to an inlining target even when the
>>>> inlined code is not contiguous in the caller.
>>>>
>>>>   llvm::raw_ostream::operator<<(char const*) [inlining-target] 66720
>>>>   llvm::MCRegisterClass::contains(unsigned int) const [inlining-target] 64161
>>>>   llvm::StringRef::StringRef(char const*) [inlining-target] 39262
>>>>   llvm::MCInst::getOperand(unsigned int) const [inlining-target] 33268
>>>>   clang::CodeCompletionResult::~CodeCompletionResult() [inlining-target] 25763
>>>>   llvm::operator+(llvm::Twine const&, llvm::Twine const&) [inlining-target] 25525
>>>>   clang::ASTImporter::Import(clang::SourceLocation) [inlining-target] 21096
>>>>   clang::Sema::Diag(clang::SourceLocation, unsigned int) [inlining-target] 20898
>>>>
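
The inlining computation described above (sum the PC range lengths per target,
then diff against the baseline) can be sketched in Python roughly like this.
It is an illustration under assumptions, not the tool's code: the
(name, ranges) tuples are a hypothetical stand-in for what a DWARF parser
would yield per DW_TAG_inlined_subroutine, the addresses are invented, and
keeping only positive deltas is my reading of "only positive changes are
reported".

```python
from collections import defaultdict


def inlined_bytes(inlined_subroutines):
    """Sum PC range lengths per inlining target.

    Each item is (target_name, [(low_pc, high_pc), ...]) with half-open
    ranges, standing in for one DW_TAG_inlined_subroutine entry.
    """
    totals = defaultdict(int)
    for target, ranges in inlined_subroutines:
        totals[target] += sum(high - low for low, high in ranges)
    return dict(totals)


def inlining_growth(baseline, target):
    """Per-target growth in bytes, keeping only positive changes."""
    before, after = inlined_bytes(baseline), inlined_bytes(target)
    growth = {name: size - before.get(name, 0) for name, size in after.items()}
    return {name: delta for name, delta in growth.items() if delta > 0}


# Made-up addresses: 'f' is inlined once in the baseline and twice in the
# target, the second time split across two non-contiguous ranges.
baseline = [("f", [(0x1000, 0x1010)])]
target = [("f", [(0x1000, 0x1010)]), ("f", [(0x2000, 0x2008), (0x2020, 0x2024)])]
print(inlining_growth(baseline, target))
```

Summing range lengths rather than taking high minus low over the whole inlined
instance is what keeps non-contiguous inlined code from being overcounted.
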
>>>> Feedback & questions welcome!
>>>>
>>>> thanks,
>>>> vedant
>>>>
>>> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>