[LLVMdev] Power/Energy Awareness in LLVM

David Tweed david.tweed at gmail.com
Thu Apr 18 05:20:03 PDT 2013


Note that I am most definitely speaking for myself here; these views may
not match my employer's.

I've also seen some interesting work on reducing energy usage,
particularly in embedded systems that can power down individual memory
banks: with careful data organisation it may be possible to power down
all but one bank and save quite a significant amount of energy. (This
isn't the paper I'm thinking of, but it's similar: "Optimizing the
memory bandwidth with loop fusion" by Paul Marchal, Jose Ignacio Gomez
and Francky Catthoor.) However, all of these "big win" changes seem to
be practically feasible only if you're working at a higher level, such
as a DSL, where you have a stronger model of what the program is doing
to reason with. It seems likely to me that any optimizations whose
correctness and profitability you can establish at the LLVM-IR level
will have at least an order of magnitude less effect than higher-level
transformations. There's the counterargument that IR optimizations apply
automatically to code from any LLVM front-end, but I'm still unsure the
practical effort is worth the likely reward.
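As a concrete illustration of the kind of transformation involved, here
is a minimal loop-fusion sketch (my own toy example, not taken from the
Marchal et al. paper): fusing the two loops keeps each intermediate
value in a register instead of round-tripping it through a temporary
array, which shrinks the working set and hence the number of memory
banks that need to stay powered.

    // Unfused: tmp[] forces n stores plus n loads of intermediate data,
    // keeping an extra buffer (and potentially an extra bank) live.
    void pipeline_unfused(const float* in, float* out, float* tmp, int n) {
        for (int i = 0; i < n; ++i)
            tmp[i] = in[i] * 2.0f;      // intermediate spilled to memory
        for (int i = 0; i < n; ++i)
            out[i] = tmp[i] + 1.0f;     // intermediate read back again
    }

    // Fused: the intermediate lives in a register and tmp[] disappears,
    // so the traffic to memory is roughly halved.
    void pipeline_fused(const float* in, float* out, int n) {
        for (int i = 0; i < n; ++i) {
            float t = in[i] * 2.0f;
            out[i] = t + 1.0f;
        }
    }

Whether this is legal and profitable is exactly the sort of thing that
is easy to see in a DSL with restricted semantics and hard to re-derive
from arbitrary IR.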

Indeed, one of my concerns when using LLVM is that after I've carefully
analysed a situation and decided to do something that yields slightly
odd IR but is actually profitable in terms of one or more of latency,
memory bandwidth or energy usage, the LLVM optimizers might go and undo
it...
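(The only moderately portable defence I know of is the empty inline-asm
barrier, a GCC/Clang extension. This is a sketch of the idiom, not a
recommendation: the asm claims to modify the value, so the optimizers
must treat it as opaque afterwards and cannot fold a deliberate
recomputation away.)

    // Compiler barrier (GNU extension): the empty asm pretends to
    // modify x, so CSE and reassociation cannot see through it.
    static inline void opaque(float& x) {
        asm volatile("" : "+r"(x));
    }

    float deliberately_recompute(float a, float b) {
        float t1 = a * b;
        opaque(a);          // a and b now look modified, so the second
        opaque(b);          // a * b below is genuinely recomputed rather
        float t2 = a * b;   // than CSE'd against t1
        return t1 + t2;
    }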

Cheers,
Dave


On Tue, Apr 16, 2013 at 9:53 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:

> On 15 Apr 2013, at 16:03, Sean Silva <silvas at purdue.edu> wrote:
>
> > See http://llvm.org/bugs/show_bug.cgi?id=6210.
>
> Chris is correct at the coarse granularity, but there are some
> trade-offs to be made at the fine granularity.  There is some
> interesting work from MIT, in the context of image-processing kernels,
> on the relative costs of saving intermediates out to cache or DRAM
> versus recomputing them: often recomputing takes one to two orders of
> magnitude less power.  The tile hashing mechanism in recent Mali GPUs
> is designed to address the same problem: accesses to memory -
> especially off-chip memory - use a surprisingly large amount of power.
>
> Optimising for this requires quite intimate knowledge of the target CPU
> (sizes of the caches, relative costs of ALU operations to cache / DRAM
> accesses, and so on), but it would be very interesting if a compiler had
> this knowledge and could take advantage of it.
>
> I don't know of any work using LLVM to do this (the MIT work was based on
> source-to-source transformations via a C++ DSL), but it would certainly be
> interesting to any company that shipped a large number of mobile devices
> based on a small number of distinct SoCs.  If only there were such a
> company that regularly contributed to LLVM...
>
> David
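To make David's recompute-versus-store trade-off concrete, here's a
minimal sketch in that spirit (my own toy example, not taken from the
MIT work; blur_x is just a placeholder producer stage):

    // Stage 1 of a toy two-stage pipeline: a placeholder 3-tap "blur".
    static inline float blur_x(const float* in, int x) {
        return (in[x - 1] + in[x] + in[x + 1]) / 3.0f;
    }

    // Variant A: materialise the intermediate stage into a full buffer.
    // Each value is computed once but pays a store to, and a reload
    // from, cache or DRAM.
    void consumer_store(const float* in, float* tmp, float* out, int n) {
        for (int x = 1; x + 1 < n; ++x)
            tmp[x] = blur_x(in, x);
        for (int x = 2; x + 2 < n; ++x)
            out[x] = tmp[x - 1] + tmp[x + 1];
    }

    // Variant B: recompute the intermediate at each use. Twice the ALU
    // work, but no intermediate buffer traffic at all - and per the
    // figures above, the arithmetic can be one to two orders of
    // magnitude cheaper in energy than the memory accesses it replaces.
    void consumer_recompute(const float* in, float* out, int n) {
        for (int x = 2; x + 2 < n; ++x)
            out[x] = blur_x(in, x - 1) + blur_x(in, x + 1);
    }

Choosing between the two variants needs exactly the target knowledge
David mentions: cache sizes and the relative energy cost of an ALU
operation versus a cache or DRAM access.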



-- 
cheers, dave tweed
__________________________
high-performance computing and machine vision expert: david.tweed at gmail.com
"while having code so boring anyone can maintain it, use Python." --
attempted insult seen on slashdot