[llvm-dev] Does middle-end pass need to consider some special type when doing optimization? Or letting back-end to revert the optimization accordingly?

Sat Mar 20 06:13:28 PDT 2021

I also think the pointee type shouldn't matter; my impression was that ty*
and ty'* should be treated equivalently and bitcasting between these should
not have any side effects.
But a load also takes a type that determines how the loaded bits are
interpreted, so I don't think it is safe in general to convert load ty to
load ty', even when the two types have the same bit width.
A relevant bug in gcc: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58416 ,
the transformation is also happening in LLVM:
https://bugs.llvm.org/show_bug.cgi?id=45152
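
A minimal sketch of the concern, assuming a target whose floating-point
load/store path quiets signaling NaNs (which is how I understand the x87
case in the gcc report). The function names are hypothetical, and the
quieting is modeled by hand rather than observed from hardware:

```python
import struct

EXP_MASK = 0x7FF << 52        # exponent field of an IEEE-754 binary64
MANT_MASK = (1 << 52) - 1     # mantissa field
QUIET_BIT = 1 << 51           # top mantissa bit: set for quiet NaNs


def is_nan_bits(bits: int) -> bool:
    """True if the 64-bit pattern encodes a NaN (quiet or signaling)."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def copy_as_i64(bits: int) -> int:
    """Model of 'load i64' + 'store i64': the 64 bits move unchanged."""
    return bits & 0xFFFFFFFFFFFFFFFF


def copy_as_double_quieting(bits: int) -> int:
    """Hypothetical model of 'load double' + 'store double' on a target
    that quiets signaling NaNs while the value sits in an FP register."""
    if is_nan_bits(bits):
        return bits | QUIET_BIT
    return bits


# A signaling-NaN bit pattern, e.g. integer data stored in a union.
snan = 0x7FF0000000000001

# The integer copy preserves every bit ...
assert copy_as_i64(snan) == snan
# ... but the modeled floating-point copy does not.
assert copy_as_double_quieting(snan) != snan

# Ordinary values are unaffected, which is why such a rewrite can look safe.
one = struct.unpack("<Q", struct.pack("<d", 1.0))[0]
assert copy_as_double_quieting(one) == one
```

Both copies move 64 bits, yet only the integer one is guaranteed to be
bit-preserving in this model, so same bit width alone does not justify
rewriting the loaded type.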

On Thu, Mar 18, 2021 at 5:56 PM Wang, Pengfei via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi,
>
>
>
> We are developing prototypes for the Intel Advanced Matrix Extensions (AMX)
> [1] programming model in Clang and LLVM [2].
>
> We have hit several cases where the type we added is optimized
> unexpectedly in the middle-end, e.g. when optimizing phi + bitcast + load:
>
>
>
> From
>
> %a = load <256 x i32>, <256 x i32>* %mem, align 64
>
> … …
>
> %b = phi <256 x i32> [ %a, %label1 ], [%someother, %label2]
>
> %c = bitcast <256 x i32> %b to x86_amx
>
> To
>
> %a = bitcast <256 x i32>* %mem to x86_amx*
>
> %b = load x86_amx, x86_amx* %a, align 64
>
> … …
>
> %c = phi x86_amx [ %b, %label1 ], [%someother, %label2]
>
>
>
> To prevent such unexpected transforms, we added an explicit type check
> at each of these optimization points.
>
> Roman pointed out that the changes are not the right direction [3], and
> considers it a bug in the backend. While we agree the backend might be able
> to handle it functionally, we think it is better to handle it in the
> middle-end since these are negative optimizations for AMX.
>
>
>
> First, let me put some background here:
>
>    1. x86_amx* is different from trivial pointers.
>
> The AMX load instruction is quite different from other load instructions.
> It needs not only the memory address but also the shape / stride of the
> tile register. We did some extra work in the backend to deduce the shape
> information from the context. We don’t want passes to add new x86_amx
> related uses because that makes the deduction harder. That said,
> bitcasting other pointer types to x86_amx* is not as trivial as assumed
> here.
>
>    2. The physical tile registers have more limitations.
>       1. There is no copy instruction between tile registers.
>       2. Spilling / reloading a tile register is expensive because its
>       size is 1024 bytes.
>       3. The shapes of tile registers need to be configured before
>       use, and all data in tile registers becomes invalid once they are
>       re-configured. That said, we need one configure instruction to
>       dominate as many tile registers as possible so that it can configure
>       all their shapes; otherwise we need to spill and reload the live
>       registers each time we re-configure.
>       4. The number of tile registers is rather small (only 8) and
>       registers with different shapes cannot be reused.
>
> Given these limitations, we need to reduce the uses / live ranges of tile
> registers, but optimizations may introduce more uses. So even if we can
> handle some combined operations for the AMX type, we still prefer to
> prevent them from the beginning, unless we can totally roll back the
> optimization, which is also not a good solution in my opinion.
>
>    3. For more information, please refer to the discussion in [3].
>
> For other optimization points, please refer to [4][5].
>
>
>
> I think the main point of disagreement with Roman is whether middle-end
> passes should consider special types when optimizing. I tend to let the
> middle-end do the type check on account of the peculiarity of the AMX type,
> but I’m not sure if we have a precedent for handling a similar issue in
> other targets. I’m open and glad to do it either way so long as we have an
> elegant solution.
>
> Any suggestions are welcome.
>
>
>
> [1]
> https://software.intel.com/content/www/us/en/develop/articles/intel-sdm.html#architecture
>
> [2] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146770.html
>
> [3] https://reviews.llvm.org/D98247
>
> [4] https://reviews.llvm.org/D98595
>
> [5] https://reviews.llvm.org/D98757
>
>
>
> Thanks
>
> Pengfei
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

-- 

Juneyoung Lee
Software Foundation Lab, Seoul National University