[PATCH] Add HLE target feature

Wed Feb 20 18:20:48 PST 2013

On Wed, Feb 20, 2013 at 5:42 PM, Nadav Rotem <nrotem at apple.com> wrote:

> Michael,
>
> Thanks for the explanation. I think that the question that we need to ask
> ourselves is, does this belong in the compiler? Before we design a
> beautiful abstraction that will add to the complexity of the compiler, we
> need to know if the complexity is worth it.
>

While I agree that we shouldn't add needless complexity to the compiler, it
is hard to decide whether a feature is worth the complexity until we have
some rough idea both of how valuable the feature is *and* the complexity it
brings. We can't easily figure out the latter without having a reasonable
design (or if we do, we'll be wrong).

>  Michael, are you committed to doing __optimization__ work to support
> transactional memory? If so, how will this work benefit others?
>

I don't think optimization work is needed for the baseline feature to be
worthwhile (although it would certainly be interesting). Consider that we
have no such optimizations for atomics and yet we model them in the IR, and
I think we are right to do so. For example, modeling atomics in the IR
helps ensure we retain the ability to do normal (and existing) scalar
optimizations.
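
To make the atomics point concrete, here is a minimal sketch of a spin lock
built on C++11 atomics. The names SpinLock and sum_under_lock are
hypothetical, made up for this illustration, and not part of any patch. A
frontend such as Clang would typically lower the exchange and the store to
atomic operations in the IR, while the loop between them stays ordinary
scalar code that the existing passes can still work on.

```cpp
#include <atomic>

// Hypothetical spin lock, for illustration only.
class SpinLock {
    std::atomic<bool> locked{false};

public:
    void lock() {
        // Atomic exchange with acquire ordering: spin until the previous
        // value was false, meaning this thread now owns the lock.
        while (locked.exchange(true, std::memory_order_acquire)) {
        }
    }

    void unlock() {
        // Atomic store with release ordering: publish the critical
        // section's writes and free the lock.
        locked.store(false, std::memory_order_release);
    }
};

// Hypothetical critical section: a plain scalar loop. Only the lock and
// unlock calls are atomic; the loop itself has no atomic operations.
int sum_under_lock(SpinLock &lock, const int *data, int n) {
    lock.lock();
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += data[i];
    lock.unlock();
    return total;
}
```

The synchronization here is expressed through operations the compiler
understands rather than through opaque inline assembly, which is what
leaves the body of the critical section visible to the optimizer.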

> To my understanding, transactional memory is not something that is going
> to benefit many people.
>

You're conflating all of transactional memory into a single feature, which
makes this somewhat confusing. This discussion is specifically about
hardware lock elision (HLE). While this has some things to do with TM, it
is an extremely narrow and targeted feature. It is specifically targeted at
making *existing* locking code more efficient on hardware that supports it.
(Also see Jeffrey's comments.)

Now, some argue that HLE isn't actually beneficial in practice. I've been
around that discussion a few times, and consistently it boils down to "Does
the code rely on fine-grained locking? If so, then HLE helps. If not, it
doesn't." There are more subtle details, but that's the core of the issue
that I've seen. Do you see other issues with the relevance of HLE?

If not, then I can say that I've been on both sides of this particular
fence, writing both fine-grained and coarse-grained synchronization. I
don't know what the ratio of importance is between the two, but I'm at
least convinced that there exists fine-grained locking in the world, and it
would seem generally useful for LLVM to support the functionality hardware
vendors are building to make that code execute more efficiently.

But even if HLE won't actually help applications that you or I care about,
there is another aspect to this. If generic libraries are written to
leverage HLE when it *does* help performance, but doing so makes them more
opaque to the optimizer, then using such libraries will actively harm
performance of code where HLE is a wash. This all comes back to the fact
that a significant motivation in modeling the most fundamental
synchronization patterns directly in the IR is ensuring that these
synchronizations don't overly penalize standard scalar optimizations.