<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Mar 23, 2021, at 08:21, Luo, Yuanke <<a href="mailto:yuanke.luo@intel.com" class="">yuanke.luo@intel.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><meta charset="UTF-8" class=""><div class="WordSection1" style="page: WordSection1; caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><div style="margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class="">I prototyped the approach 1 at<span class="Apple-converted-space"> </span><a href="https://reviews.llvm.org/D99152" style="color: blue; text-decoration: underline;" class="">https://reviews.llvm.org/D99152</a><span class="Apple-converted-space"> </span>and I realized that sometimes bitcast optimization in middle-end is helpful. For the test case of inner_product(), we need extra effort eliminate llvm.x86.vector.amx.cast.x86amx.v256i32 by ourselves.<o:p class=""></o:p></div><div style="margin: 0in; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div></div></blockquote></div><br class=""><div class="">I think that’s expected, you might need to add some optimizations for the conversion intrinsic. But that can easily be limited to the AMX specific passes and all existing LLVM transformations should remain correct without changes.</div><div class=""><br class=""></div><div class="">Cheers,</div><div class="">Florian</div></body></html>