[llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx machine with option control.

Craig Topper via llvm-dev llvm-dev at lists.llvm.org
Mon Apr 19 19:50:13 PDT 2021


I don't think it's mentioned in Yuanke's mail. dvec.h is a header file
shipped with icc. It provides C++ wrapper classes around the SSE/AVX
vector types, with operator overloading.
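
For readers unfamiliar with that style of header, here is a minimal sketch
of the idea. It assumes an x86 target with SSE. The class name Vec4f and its
loadu/storeu members are hypothetical and only illustrate the pattern; they
are not the real dvec.h classes (such as F32vec8) or their interface.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE intrinsics; assumes an x86 target

// Hypothetical stand-in for a dvec.h-style wrapper; not the real F32vec4.
class Vec4f {
  __m128 v_;

public:
  Vec4f() : v_(_mm_setzero_ps()) {}
  explicit Vec4f(__m128 v) : v_(v) {}
  explicit Vec4f(float s) : v_(_mm_set1_ps(s)) {}  // broadcast a scalar

  // Operator overloads forward to the matching intrinsic.
  Vec4f& operator+=(const Vec4f& rhs) {
    v_ = _mm_add_ps(v_, rhs.v_);
    return *this;
  }
  Vec4f& operator+=(float s) { return *this += Vec4f(s); }

  // Explicitly unaligned load and store of four floats.
  static Vec4f loadu(const float* p) {
    assert(p != nullptr);
    return Vec4f(_mm_loadu_ps(p));
  }
  void storeu(float* p) const {
    assert(p != nullptr);
    _mm_storeu_ps(p, v_);
  }
};
```

With a class like this, user code can write `v += 1.0f;` instead of calling
the intrinsics directly, which is what makes the templated code later in the
thread possible.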

~Craig


On Mon, Apr 19, 2021 at 7:30 PM Luo, Yuanke via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

>
>
> I collected the feedback/requirements from an Intel customer, as below.
>
>
>
> Our software runs in an embedded environment and is processing buffers
> which are unaligned. Sometimes this misalignment is simply because the
> buffer allocation is beyond the immediate control of our software but it
> can also be because we are processing blocks of data which are not
> multiples of the vector size (e.g., 6, 12 or 24). We can’t just fix our
> buffers to make them aligned. Our code is complicated and we support
> multiple instruction sets operating using the same algorithms by using
> templated code. For example:
>
>
>
> template<typename DVEC_TYPE>
> void doSomething(DVEC_TYPE* data)
> {
>   // Trivial example – reality would be something much more substantial,
>   // possibly with loops or other function calls.
>   *data += 1.0f;
> }
>
>
>
> Note that we use dvec to help us abstract the ISA, but other similar
> header-only vector overloading libraries also exist.
>
>
>
> We would then instantiate our function above multiple times for each ISA
> or data type we care about:
>
>
>
> // Scalar type useful for debugging algorithm and doing basic testing
> template void doSomething<float>(float* data);
>
> // Different AVX widths
> template void doSomething<F32vec8>(F32vec8* data);
> template void doSomething<F32vec16>(F32vec16* data);
>
> // Different element type
> template void doSomething<I32vec16>(I32vec16* data);
>
>
>
> The functions are sufficiently large that we don’t want to have to write a
> different version for each ISA. We know that the incoming data may be
> mis-aligned and that accessing it directly is UB, so we could modify our
> code to explicitly handle misalignment. Something like:
>
>
>
> template<typename DVEC_TYPE>
> void doSomething(DVEC_TYPE* data)
> {
>   DVEC_TYPE t;
>   loadu(t, data);
>   t += 1.0f;
>   storeu(data, t);
> }
>
>
>
> The code has become more verbose, less readable (maintainable, debuggable,
> etc.), and it no longer works with plain scalar types, which don’t have
> loadu/storeu defined unless we start defining overloaded helper functions.
> Also, if `data` pointed at an array, we’d have to throw some pointer
> arithmetic into the mix, rather than just using plain `data[IDX]` syntax.
> We can certainly write code which copes with the misalignment explicitly,
> but it just ends up becoming messy. Or we could leverage the hardware to
> manage this misalignment for us by letting the compiler emit the movups
> instruction instead of movaps.
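
As an illustration of the "overloaded helper functions" route described
above, here is a minimal sketch. It assumes the helpers are generic templates
built on std::memcpy, which has no alignment requirement on its pointer
arguments. The names loadu, storeu and doSomething mirror the quoted code;
Vec4f is a hypothetical stand-in for a dvec class, and doSomething takes
void* here (an assumption) so that no under-aligned typed pointer is ever
formed.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Generic unaligned load: copy sizeof(T) bytes from src into dst.
template <typename T>
void loadu(T& dst, const void* src) {
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy-based load needs a trivially copyable type");
  assert(src != nullptr);
  std::memcpy(&dst, src, sizeof(T));
}

// Generic unaligned store: copy sizeof(T) bytes from src to dst.
template <typename T>
void storeu(void* dst, const T& src) {
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy-based store needs a trivially copyable type");
  assert(dst != nullptr);
  std::memcpy(dst, &src, sizeof(T));
}

// Hypothetical 4-lane vector type standing in for a dvec class.
struct alignas(16) Vec4f {
  float lane[4];
  Vec4f& operator+=(float s) {
    for (float& x : lane) x += s;
    return *this;
  }
};

// Same body as the quoted example; works for scalars and vector types alike.
template <typename T>
void doSomething(void* data) {
  T t;
  loadu(t, data);
  t += 1.0f;
  storeu(data, t);
}
```

Because the helpers are templates over any trivially copyable type,
`doSomething<float>` and `doSomething<Vec4f>` both compile without per-type
overloads, although the call sites still carry the extra verbosity the
customer objects to.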
>
>
>
> Until now we have only been using the Intel Compiler, so we have written
> our code to use ICC’s unaligned operations and hardware support to make our
> code cleaner. We are looking at porting our code to LLVM, but LLVM is
> making this harder than it needs to be.
>
>
>
> Thanks
>
> Yuanke
>
>
>
> *From:* paul.robinson at sony.com <paul.robinson at sony.com>
> *Sent:* Tuesday, April 20, 2021 4:42 AM
> *To:* jyknight at google.com
> *Cc:* Luo, Yuanke <yuanke.luo at intel.com>; lebedev.ri at gmail.com; Liu,
> Chen3 <chen3.liu at intel.com>; llvm-dev at lists.llvm.org; Maslov, Sergey V <
> sergey.v.maslov at intel.com>; Towner, Daniel <daniel.towner at intel.com>
> *Subject:* RE: [llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx
> machine with option control.
>
>
>
> We might still not be fully understanding one another, because this:
>
> > so that you can compile code with under-aligned objects, and have it work
> > as the author expected it to
>
> sounds like you’re expecting us to recompile the client code that creates
> the under-aligned objects.  That is literally not possible.  If you do
> understand that part, great, it’s just not obvious to me from how you’re
> phrasing things.
>
>
>
> I (still) don’t know what Intel is facing.  For Sony’s problem, we would
> be much more likely to try to do something specific to the APIs that are
> being abused, rather than something draconian like eliminating alignment
> requirements for everyone.  But of course we have a solution that works for
> us, so there’s that much more inertia to overcome.
>
> --paulr
>
>
>
> *From:* James Y Knight <jyknight at google.com>
> *Sent:* Monday, April 19, 2021 2:30 PM
> *To:* Robinson, Paul <paul.robinson at sony.com>
> *Cc:* Luo, Yuanke <yuanke.luo at intel.com>; Roman Lebedev <
> lebedev.ri at gmail.com>; Liu, Chen3 <chen3.liu at intel.com>; llvm-dev <
> llvm-dev at lists.llvm.org>; Maslov, Sergey V <sergey.v.maslov at intel.com>;
> daniel.towner at intel.com
> *Subject:* Re: [llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx
> machine with option control.
>
>
>
>
> > I understand your goal is to find and fix bugs in software that is
> > still under development and CAN be fixed.  I fully endorse that
> > goal.  However, that is not the situation that Sony has, and likely
> > not what Intel has.  Your proposal will NOT solve our problem.
>
>
>
> No, that's not it at all! I'm afraid you've totally misunderstood my
> concern.
>
>
>
> My goal is that if we add a compiler feature to address this problem -- so
> that you can compile code with under-aligned objects, and have it work as
> the author expected it to -- that the feature *reliably* addresses the
> problem, and makes such code no longer exhibit Undefined Behavior. The
> proposed backend change does not accomplish that, but we can implement a
> feature which will.
>
>
>
> As Reid said, -fmax-type-align=N appears to be *almost* that feature,
> and something like this little patch (along with documentation update) may
> be all that's needed (but this is totally untested).
>
>
>
> diff --git clang/lib/CodeGen/CodeGenModule.cpp clang/lib/CodeGen/CodeGenModule.cpp
> index b23d995683bf..3aef166a690e 100644
> --- clang/lib/CodeGen/CodeGenModule.cpp
> +++ clang/lib/CodeGen/CodeGenModule.cpp
> @@ -6280,8 +6280,7 @@ CharUnits CodeGenModule::getNaturalTypeAlignment(QualType T,
>    // Cap to the global maximum type alignment unless the alignment
>    // was somehow explicit on the type.
>    if (unsigned MaxAlign = getLangOpts().MaxTypeAlign) {
> -    if (Alignment.getQuantity() > MaxAlign &&
> -        !getContext().isAlignmentRequired(T))
> +    if (Alignment.getQuantity() > MaxAlign)
>        Alignment = CharUnits::fromQuantity(MaxAlign);
>    }
>    return Alignment;
>
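
The effect of the patch above can be modeled in isolation. This is a
simplified sketch, not the actual clang code: capBefore and capAfter are
hypothetical names, alignments are plain byte counts rather than CharUnits,
and explicitAlign stands in for the isAlignmentRequired(T) check.

```cpp
#include <cassert>

// Model of the existing logic: the -fmax-type-align cap is skipped when the
// alignment was written explicitly on the type.
unsigned capBefore(unsigned naturalAlign, unsigned maxTypeAlign,
                   bool explicitAlign) {
  assert(naturalAlign != 0);
  if (maxTypeAlign != 0 && naturalAlign > maxTypeAlign && !explicitAlign)
    return maxTypeAlign;
  return naturalAlign;
}

// Model of the patched logic: the cap applies unconditionally.
unsigned capAfter(unsigned naturalAlign, unsigned maxTypeAlign) {
  assert(naturalAlign != 0);
  if (maxTypeAlign != 0 && naturalAlign > maxTypeAlign)
    return maxTypeAlign;
  return naturalAlign;
}
```

In this model, a 32-byte type with explicit alignment keeps its 32-byte
alignment under the existing rule with a cap of 1, while the patched rule
reduces it to 1. Types already at or below the cap are unaffected either way.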
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>