[PATCH] D155593: AMDGPU: Overhaul and improve rcp and rsq f32 formation

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 18 12:47:42 PDT 2023


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:893
 
-  if (AllowInaccurateRcp) {
-    Function *Decl = Intrinsic::getDeclaration(
-      Mod, Intrinsic::amdgcn_rcp, Ty);
-
-    // Turn into multiply by the reciprocal.
+  if (FMF.allowReciprocal()) {
     // x / y -> x * (1.0 / y)
----------------
I'm not really sure how to interpret arcp versus afn here. The backend continues to interpret afn aggressively, so I don't know which interpretation is correct.
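
For context, a minimal standalone sketch of the numeric effect of the rewrite named in the diff comment (x / y -> x * (1.0 / y)). It is not code from the patch and does not use the LLVM API. The function names are hypothetical, and a plain 1.0f / y stands in for a hardware reciprocal, which may be less accurate than this. The point is only that the rewritten form rounds twice where a plain division rounds once, which is why it needs a fast-math flag to be legal.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Plain IEEE division: the quotient is rounded once.
float div_exact(float x, float y) { return x / y; }

// The rewrite from the diff comment: x / y -> x * (1.0 / y).
// The reciprocal is rounded, then the product is rounded again.
float div_via_rcp(float x, float y) {
  float rcp = 1.0f / y; // assumption: stand-in for a hardware reciprocal
  return x * rcp;
}

// Hypothetical helper: true if b is within a relative 2 * FLT_EPSILON of a.
bool close_enough(float a, float b) {
  return std::fabs(a - b) <= 2.0f * FLT_EPSILON * std::fabs(a);
}

// Small self-check of the two forms.
void check_examples() {
  // A power-of-two divisor has an exactly representable reciprocal,
  // so both forms give the same result.
  assert(div_via_rcp(3.0f, 4.0f) == div_exact(3.0f, 4.0f));
  // Otherwise the two forms may differ in the last bits, but stay close.
  assert(close_enough(div_exact(5.0f, 3.0f), div_via_rcp(5.0f, 3.0f)));
}
```

In LLVM IR terms, arcp is the flag documented as allowing a reciprocal to be used in place of a division, while afn is documented as allowing approximate substitutes for functions. Which of the two should permit the transform is the open question above.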


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155593/new/

https://reviews.llvm.org/D155593


