[PATCH] D125988: [x86][SelectionDAG] Unroll vectorized FREM instructions which will be lowered to libcalls

Thu May 19 09:41:24 PDT 2022

RKSimon added a comment.

It would probably be best to pull the frem.ll and frem-libcall.ll tests out into their own Phabricator review first. frem-libcall.ll in particular doesn't show the current problem in trunk (it generates 4 fmodf calls at the moment).

================
Comment at: llvm/test/CodeGen/X86/frem-libcall.ll:5
+
+; RUN: llc -mtriple=x86_64-linux-gnu < %s  | FileCheck %s
+
----------------
Reorder this so the description is easier to notice:
```
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -mtriple=x86_64-linux-gnu < %s  | FileCheck %s

; Ensure vectorized FREMs are not widened/unrolled such that they get lowered
; into libcalls on undef elements.
```

================
Comment at: llvm/test/CodeGen/X86/frem.ll:4
+
+; RUN: llc -mtriple=x86_64-linux-gnu < %s  | FileCheck %s
+
----------------
Reorder this so the description is easier to notice:
```
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -mtriple=x86_64-linux-gnu < %s  | FileCheck %s

; Basic test coverage for FREM
```

================
Comment at: llvm/test/CodeGen/X86/frem.ll:95
+
+define void @frem_v16f32(<16 x float> %a0, <16 x float> %a1, <16 x float> *%p3) nounwind {
+; CHECK-LABEL: frem_v16f32:
----------------
Very pedantic, but maybe sort the vector tests by size: the 128-bit v8f16/v4f32/v2f64 tests first, then the 256-bit and 512-bit variants.

================
Comment at: llvm/test/CodeGen/X86/frem.ll:309
+
+define void @fremfv4f32(<4 x float> %a0, <4 x float> %a1, <4 x float> *%p3) nounwind {
+; CHECK-LABEL: fremfv4f32:
----------------
Rename to frem_v4f32 (fremfv4f32 looks like a typo).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125988/new/

https://reviews.llvm.org/D125988