[PATCH] D130339: [CodeGen] Generate efficient assembly for freeze(poison) version of `mm*_cast*` intel intrinsics

Mon Aug 8 08:45:52 PDT 2022

RKSimon added inline comments.

================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:11433
+                        : (NumFreezeUndef ? DAG.getFreeze(DAG.getUNDEF(ResVT))
+                                          : DAG.getUNDEF(ResVT));

----------------
aqjune wrote:
> RKSimon wrote:
> > Is there any reason we couldn't just always use DAG.getFreeze(DAG.getUNDEF(ResVT))?
> I tried always using `DAG.getFreeze(DAG.getUNDEF(ResVT))`, but existing lowering functions would need updates for the following tests to pass:
> ```
>   LLVM :: CodeGen/X86/haddsub-undef.ll
>   LLVM :: CodeGen/X86/oddsubvector.ll
>   LLVM :: CodeGen/X86/subvector-broadcast.ll
>   LLVM :: CodeGen/X86/vector-interleaved-load-i16-stride-3.ll
> ...
> ```
> It causes `vinsert*` instructions to be emitted instead of more efficient ops.
> I think it is better to keep `DAG.getUNDEF(ResVT)` here to avoid regressions.
Thanks. I'll take a look at those cases after this patch has gone in; we have some custom vector widening patterns that we'll need to adjust to handle freeze(undef).
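
For readers skimming the thread, the selection being discussed can be sketched as a toy model outside of LLVM. This is only an illustration of the ternary in the quoted hunk, not the code in X86ISelLowering.cpp: `Filler` and `pickFiller` are hypothetical names, and the real code builds SelectionDAG nodes rather than returning an enum.

```cpp
// Toy stand-in for the two filler nodes discussed above (hypothetical type,
// not part of LLVM).
enum class Filler { Undef, FreezeUndef };

// Hypothetical helper mirroring the ternary in the quoted hunk: emit
// freeze(undef) only when at least one operand was freeze(undef); otherwise
// keep plain undef so existing lowering patterns still match.
constexpr Filler pickFiller(unsigned NumFreezeUndef) {
  return NumFreezeUndef ? Filler::FreezeUndef : Filler::Undef;
}

// Compile-time checks of the two cases.
static_assert(pickFiller(0) == Filler::Undef, "no frozen operands: plain undef");
static_assert(pickFiller(3) == Filler::FreezeUndef, "frozen operand present");
```

Under this sketch, always returning `Filler::FreezeUndef` would correspond to the alternative raised in the question above, which is what caused the listed tests to regress.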

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130339/new/

https://reviews.llvm.org/D130339
