[llvm-dev] MCRegisterClass mandatory vs preferred alignment?

Tue Sep 1 08:16:57 PDT 2015

The performance of unaligned accesses came up here:
http://reviews.llvm.org/D12154

Summary: while larger unaligned accesses reduce code size and can improve
performance, they also increase the risk of crossing a cacheline and
suffering a performance hit that will vary depending on uarch. Crossing
cachelines isn't something we account for in general, so we do generate
unaligned SSE/AVX accesses for all recent x86, and we generate smaller (4/8
byte) unaligned accesses for all x86. (The criteria for generating
unaligned accesses aren't entirely clear/consistent, so you'll find some
'FIXME' comments that I added recently.)
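
To make the cacheline point concrete, here is a minimal sketch of the check
involved. It is not LLVM code: the helper name is made up, and the 64-byte
line size is an assumption that holds for typical recent x86 parts but is
not guaranteed by the architecture.

```cpp
#include <cstdint>

// Assumption: 64-byte cache lines (typical for recent x86, not guaranteed).
constexpr std::uint64_t kCacheLineBytes = 64;

// Hypothetical helper: returns true if an access of `size` bytes starting at
// byte address `addr` touches more than one cache line.
constexpr bool crossesCacheLine(std::uint64_t addr, std::uint64_t size) {
  // Compare the line index of the first and the last byte accessed.
  return size != 0 &&
         (addr / kCacheLineBytes) != ((addr + size - 1) / kCacheLineBytes);
}
```

Under that assumption a 32-byte access at offset 48 spans two lines, while the
same access at offset 32 stays within one. The wider the access, the more
start offsets put it across a boundary.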

So yes, I agree that it's worth experimenting with unaligned stack accesses.
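
For reference, the decision proposed in the quoted message below could be
modelled roughly as follows. This is a standalone sketch under stated
assumptions, not how the X86 backend is actually structured: SubtargetModel,
SpillPlan and planYmmSpill are hypothetical names, and only IsUAMem32Slow
comes from the thread.

```cpp
// Standalone model, not LLVM code: every name here is hypothetical except
// IsUAMem32Slow, which mirrors the feature flag mentioned in the thread.
struct SubtargetModel {
  bool IsUAMem32Slow;  // true when 32-byte unaligned accesses are slow
};

enum class SpillPlan {
  AlignedStore,    // slot is already 32-byte aligned (e.g. vmovaps)
  UnalignedStore,  // use an unaligned store (e.g. vmovups), no realignment
  RealignStack     // dynamically realign the frame, then store aligned
};

// Picks a plan for spilling a 256-bit register into a stack slot whose
// guaranteed alignment is SlotAlignBytes.
constexpr SpillPlan planYmmSpill(SubtargetModel ST, unsigned SlotAlignBytes) {
  return SlotAlignBytes >= 32 ? SpillPlan::AlignedStore
         : !ST.IsUAMem32Slow  ? SpillPlan::UnalignedStore
                              : SpillPlan::RealignStack;
}
```

With a 16-byte-aligned frame, a subtarget where the flag is unset would take
the unaligned path and skip realignment, and one where it is set would still
realign. Whether that is a net win is what the benchmarking would have to
show.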

On Mon, Aug 31, 2015 at 5:25 PM, Matthias Braun via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Would certainly be interesting to perform some benchmarking
> (llvm-testsuite/spec) to confirm this. I could imagine that a smaller stack
> footprint improves performance (or at least does not degrade it).
>
> - Matthias
>
> > On Aug 31, 2015, at 4:15 PM, Philip Reames <listmail at philipreames.com>
> wrote:
> >
> >
> >
> > On 08/31/2015 03:59 PM, Matthias Braun wrote:
> >> Looks to me like the alignment is specified in tablegen. From Target.td:
> >>
> >> class RegisterClass<string namespace, list<ValueType> regTypes, int alignment,
> >>                     dag regList, RegAltNameIndex idx = NoRegAltName>
> >>
> >> X86RegisterInfo.td:
> >>
> >> def VR256 : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
> >>                           256, (sequence "YMM%u", 0, 15)>;
> >> def VR256X : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
> >>                           256, (sequence "YMM%u", 0, 31)>;
> >>
> >> Seems to be 256 bits / 32 bytes.
> > Yeah, don't know how I missed that.  :)
> >>
> >> I don't know why the alignment was specified the way it is. My guess
> >> would be that memory accesses are faster that way (for example, because
> >> they do not cross cache lines).
> > This is certainly true on older cores, but is it actually true on newer
> > ones?  Looking through Agner's instruction tables, it looks like the
> > aligned and unaligned versions are essentially the same on newer Intel and
> > AMD cores.
> >
> > I was originally imagining that I'd need a custom hook or flag, but
> > would it make sense to just use the unaligned versions if the appropriate
> > feature flag (IsUAMem32Slow) is unset?  This would result in slightly
> > smaller code on newer architectures, seemingly without a performance hit
> > (I have no direct experience here).
> >>
> >> - Matthias
> >>
> >>> On Aug 31, 2015, at 3:21 PM, Philip Reames via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >>>
> >>> Looking around today, it appears that TargetRegisterClass and
> >>> MCRegisterClass only include a single alignment.  This is documented as
> >>> being the minimum legal alignment, but it appears to often be greater than
> >>> this in practice.  For instance, on x86 the alignment of %ymm0 is listed as
> >>> 32, not 1.  Does anyone know why this is?
> >>>
> >>> Additionally, where are these alignments actually defined?  I don't
> >>> see them appearing in the X86RegisterInfo.td files as I would naively
> >>> expect.
> >>>
> >>> The background for my question is that I'm looking into adding a
> >>> function attribute which uses unaligned loads and stores for register
> >>> spilling on x86 to avoid the need for dynamic frame realignment.  (See the
> >>> previous thread "Aligned vector spills and variably sized stack frames".)
> >>> The key difference w.r.t. the existing "no-realign-stack" attribute is
> >>> that situations which *require* a stack realignment will generate a
> >>> fatal_error rather than silently miscompiling.  The current mechanism works
> >>> by essentially ignoring the alignment criteria and just hoping everything
> >>> works out in practice.
> >>>
> >>> Philip
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> llvm-dev at lists.llvm.org
> >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
>