[PATCH] Add HLE target feature

Wed Feb 27 19:36:23 PST 2013

----- Original Message -----
> From: "Nadav Rotem" <nrotem at apple.com>
> To: "Michael Liao" <michael.liao at intel.com>
> Cc: llvm-commits at cs.uiuc.edu, reviews+D437+public+fc8ee3e2c842c344 at llvm-reviews.chandlerc.com
> Sent: Wednesday, February 27, 2013 8:16:32 PM
> Subject: Re: [PATCH] Add HLE target feature
> 
> 
> Michael,
> 
> 
> Thanks for working on this. I agree that LLVM already supports atomic
> operations, but I don't understand why you want to add transactional
> memory into the compiler. This clearly belongs in a library, not in
> the compiler. What you are saying about the implementation may be
> correct, but it is irrelevant to the discussion about the HLE. I
> don't think that HLE belongs in the compiler. Why do you think that
> it belongs to the compiler and not to a library ?

Nadav, I think that your language here is a little too strong: I don't think that it is *clear* that transactional-memory support belongs in a library. I am not sure that Michael has made a good argument that *this* set of patches is the right approach (aside from saying that it mirrors the interface presented by gcc), but HLE seems like a fairly limited domain and there may not be too many alternatives. I agree that modeling the behavior should allow for additional optimization, and this will be important.

>From personal experience with transactional memory (the BG/Q has hardware transactional memory), I'm doubtful that a purely library-based approach will work well because reducing the overhead of the transactional-region is critically important. Using transactional memory profitably requires speculating over regions large enough to amortize the overheads but small enough to have a low collision probability. Getting a low collision probability often requires the regions to be small, and this implies that the overheads must be small in order for the speculative execution to be worthwhile at all. Part of making the transactional regions small requires moving as much out of them as possible, and I suspect that the compiler can do this only if it fully understands that is going on. Also, optimizations like coalescing trivially-nested regions, or removing transactions guaranteed to manually abort might also become important for transactional regions contained in general (such as C++ template) libraries.

Thanks again,
Hal

> 
> 
> Thanks,
> Nadav
> 
> 
> 
> 
> 
> On Feb 27, 2013, at 3:09 PM, Michael Liao < michael.liao at intel.com >
> wrote:
> 
> 
> 
> Hi Nadav
> 
> "LLVM Atomic Instructions and Concurrency Guide" puts more details on
> why atomic instructions are added in LLVM IR and optimizations around
> it. For HLE support, you could treat an enhancement on atomic
> instruction by taking advantage of hardware support or TM support,
> the
> current approach is to add hint in atomic instructions already
> existed
> in LLVM IR to help backend generate proper code. By taking this
> approach, as compiler developers, we leverage the existing
> infrastructure/optimization as much as possible; as application
> developers, they have the minimal change to adapt HLE technology into
> their code and keep code across different compilers.
> 
> Yours
> - Michael
> 
> On Wed, 2013-02-27 at 14:08 -0800, Nadav Rotem wrote:
> 
> 
> Michael,
> 
> 
> I appreciate you rebasing this patch, but this is not helping at all.
> I asked you specific questions and you ignored them. All of this code
> that you are adding has a cost. Its making the compiler more complex,
> and it will make it difficult for us to refactor major parts of the
> compiler. Moreover, it increases the compile time. Every time that
> you
> add special handling for some feature you care about you increase the
> size of the compiler. Every feature that you add adds dozens of IFs
> and hooks that need to be executed by innocent people who don't care
> about this. I asked you a simple question. Why does HLE support need
> to be a part of LLVM ? HLE instructions are probably awesome, but
> they
> do not belong in the compiler. I could be wrong, but it is your
> responsibility to convince us why we should pay the compile time and
> complexity costs.
> 
> 
> Thanks,
> Nadav
> 
> 
> 
> On Feb 27, 2013, at 1:37 PM, darkbuck < michael.hliao at gmail.com >
> wrote:
> 
> 
> 
> the patch is rebased to trunk and latest changes on replacing
> TargetFlags to HLEHint
> 
> Hi nadav, jfifield, hfinkel, chandlerc, eli.friedman,
> 
> http://llvm-reviews.chandlerc.com/D437
> 
> CHANGE SINCE LAST DIFF
> http://llvm-reviews.chandlerc.com/D437?vs=1053&id=1142#toc
> 
> Files:
> lib/Target/X86/X86.td
> lib/Target/X86/X86InstrInfo.td
> lib/Target/X86/X86Subtarget.cpp
> lib/Target/X86/X86Subtarget.h
> 
> Index: lib/Target/X86/X86.td
> ===================================================================
> --- lib/Target/X86/X86.td
> +++ lib/Target/X86/X86.td
> @@ -120,6 +120,8 @@
> "Support BMI2 instructions">;
> def FeatureRTM : SubtargetFeature<"rtm", "HasRTM", "true",
> "Support RTM instructions">;
> +def FeatureHLE : SubtargetFeature<"hle", "HasHLE", "true",
> + "Support HLE">;
> def FeatureADX : SubtargetFeature<"adx", "HasADX", "true",
> "Support ADX instructions">;
> def FeatureLeaForSP : SubtargetFeature<"lea-sp", "UseLeaForSP",
> "true",
> @@ -201,7 +203,7 @@
> FeatureRDRAND, FeatureF16C,
> FeatureFSGSBase,
> FeatureMOVBE, FeatureLZCNT,
> FeatureBMI,
> FeatureBMI2, FeatureFMA,
> - FeatureRTM]>;
> + FeatureRTM, FeatureHLE]>;
> 
> def : Proc<"k6", [FeatureMMX]>;
> def : Proc<"k6-2", [Feature3DNow]>;
> Index: lib/Target/X86/X86InstrInfo.td
> ===================================================================
> --- lib/Target/X86/X86InstrInfo.td
> +++ lib/Target/X86/X86InstrInfo.td
> @@ -603,6 +603,7 @@
> def HasBMI : Predicate<"Subtarget->hasBMI()">;
> def HasBMI2 : Predicate<"Subtarget->hasBMI2()">;
> def HasRTM : Predicate<"Subtarget->hasRTM()">;
> +def HasHLE : Predicate<"Subtarget->hasHLE()">;
> def HasADX : Predicate<"Subtarget->hasADX()">;
> def FPStackf32 : Predicate<"!Subtarget->hasSSE1()">;
> def FPStackf64 : Predicate<"!Subtarget->hasSSE2()">;
> Index: lib/Target/X86/X86Subtarget.cpp
> ===================================================================
> --- lib/Target/X86/X86Subtarget.cpp
> +++ lib/Target/X86/X86Subtarget.cpp
> @@ -310,6 +310,10 @@
> HasBMI = true;
> ToggleFeature(X86::FeatureBMI);
> }
> + if ((EBX >> 4) & 0x1) {
> + HasHLE = true;
> + ToggleFeature(X86::FeatureHLE);
> + }
> if (IsIntel && ((EBX >> 5) & 0x1)) {
> X86SSELevel = AVX2;
> ToggleFeature(X86::FeatureAVX2);
> @@ -439,6 +443,7 @@
> HasBMI = false;
> HasBMI2 = false;
> HasRTM = false;
> + HasHLE = false;
> HasADX = false;
> IsBTMemSlow = false;
> IsUAMemFast = false;
> Index: lib/Target/X86/X86Subtarget.h
> ===================================================================
> --- lib/Target/X86/X86Subtarget.h
> +++ lib/Target/X86/X86Subtarget.h
> @@ -121,6 +121,9 @@
> /// HasRTM - Processor has RTM instructions.
> bool HasRTM;
> 
> + /// HasHLE - Processor has HLE.
> + bool HasHLE;
> +
> /// HasADX - Processor has ADX instructions.
> bool HasADX;
> 
> @@ -253,6 +256,7 @@
> bool hasBMI() const { return HasBMI; }
> bool hasBMI2() const { return HasBMI2; }
> bool hasRTM() const { return HasRTM; }
> + bool hasHLE() const { return HasHLE; }
> bool hasADX() const { return HasADX; }
> bool isBTMemSlow() const { return IsBTMemSlow; }
> bool isUnalignedMemAccessFast() const { return IsUAMemFast; }
> <D437.2.patch>_______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>