[llvm-dev] Does middle-end pass need to consider some special type when doing optimization? Or letting back-end to revert the optimization accordingly?

Mon Mar 22 07:15:51 PDT 2021

Hi Florian,

Sorry, I didn’t understand your question.
If we can’t prevent `load x86_amx*` from being generated, we need to transform `load x86_amx*` into llvm.x86.tileloadd64.internal() with shape propagation. The load/store instructions come from the front end, because in C we define our tile type as “typedef int tile1024i __attribute__((__vector_size__(1024), __aligned__(64)));”. It is a vector type. Ideally, all operations on AMX tiles would go through AMX intrinsics, and data would be exchanged with other operations through memory. But the front end generates load/store <256 x i32>* instructions instead of llvm.x86.tileloadd64.internal() or llvm.x86.tilestored64.internal().
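
To make the front-end behaviour concrete, here is a minimal sketch in C using the typedef quoted above. The helper name `copy_tile` is hypothetical and only for illustration. Because the tile is declared as an ordinary vector type, a plain dereference like this is what the front end would be expected to lower to the load/store <256 x i32>* instructions mentioned above, rather than to the AMX intrinsics.

```c
/* Tile type as quoted in this mail: 256 x i32 = 1024 bytes, 64-byte aligned. */
typedef int tile1024i __attribute__((__vector_size__(1024), __aligned__(64)));

/* Hypothetical helper, not part of any AMX header.
 * The compiler only sees a generic vector type here, so the copy is an
 * ordinary whole-vector load followed by an ordinary whole-vector store. */
void copy_tile(tile1024i *dst, const tile1024i *src) {
    *dst = *src;
}
```

Nothing in this function carries a tile shape (rows, columns), which is why the transformation to llvm.x86.tileloadd64.internal() would need shape propagation from elsewhere.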

What assumption does LLVM make use of?

Thanks
Yuanke

From: Florian Hahn <florian_hahn at apple.com>
Sent: Monday, March 22, 2021 9:30 PM
To: Luo, Yuanke <yuanke.luo at intel.com>; llvm-dev <llvm-dev at lists.llvm.org>
Cc: James Y Knight <jyknight at google.com>; Wang, Pengfei <pengfei.wang at intel.com>
Subject: Re: [llvm-dev] Does middle-end pass need to consider some special type when doing optimization? Or letting back-end to revert the optimization accordingly?

On Mar 19, 2021, at 01:58, Luo, Yuanke <yuanke.luo at intel.com> wrote:

Hi James,

Thank you for taking the time to dive deep into the issue. It is very constructive. I agree we can transform “load x86_amx*” into the AMX load intrinsic, but that seems to take more effort than preventing “load x86_amx*” from being generated in the first place. I can support transforming “load x86_amx*” into the AMX load intrinsic if people prefer this approach.

I also think Florian raises a good question: what are the semantics of “load x86_amx*”? Are they different from the semantics of regular LLVM pointer types? What is your opinion on this?

From the points earlier, it sounds like you’d need to change `load` semantics for `x86_amx` to load blocks of data with gaps in between them? I am not sure that’s a good idea, as I think there are plenty of places in LLVM that make use of that assumption (e.g. the code reasoning about memory locations). I’d expect lots of places would need updating, and until everything is updated there will be plenty of places that get this subtly wrong. This doesn’t sound scalable.
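
The difference between an ordinary load and a load of "blocks of data with gaps in between them" can be sketched in plain C. This is a simplified model for illustration only, not the definition of any intrinsic: the function names are hypothetical, and the layout (up to 16 rows of up to 64 bytes, with consecutive rows `stride` bytes apart in memory) is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TILE_ROWS = 16, TILE_ROW_BYTES = 64, TILE_BYTES = 1024 };

/* Ordinary load of a 1024-byte value: one contiguous range [base, base + 1024). */
static void contiguous_load(unsigned char *tile, const unsigned char *base) {
    memcpy(tile, base, TILE_BYTES);
}

/* Hypothetical model of a strided tile load: `rows` rows of `colsb` bytes,
 * where row r starts at base + r * stride. Bytes between rows are skipped. */
static void strided_tile_load(unsigned char *tile, const unsigned char *base,
                              size_t rows, size_t colsb, size_t stride) {
    assert(rows <= TILE_ROWS && colsb <= TILE_ROW_BYTES);
    for (size_t r = 0; r < rows; r++)
        memcpy(tile + r * TILE_ROW_BYTES, base + r * stride, colsb);
}

/* Distance from base to one past the last byte the strided load reads. */
static size_t strided_extent(size_t rows, size_t colsb, size_t stride) {
    return rows == 0 ? 0 : (rows - 1) * stride + colsb;
}
```

With 16 rows of 64 bytes and a stride of 128, the strided version reads 1024 bytes spread over a 1984-byte span, so code that reasons about the accessed memory location from the pointer and the type size alone would describe the wrong range.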

Cheers,
Florian