[PATCH] D101074: [X86] Canonicalize SGT/UGT compares with constants to use SGE/UGE to reduce the number of EFLAGs reads. (PR48760)

Tue Jun 29 16:55:26 PDT 2021

spatel added inline comments.

================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:23473
+    // Attempt to canonicalize SGT/UGT -> SGE/UGE compares with constant which
+    // reduces the number of EFLAGs reads. The equivalent for SLE/ULE -> SLT/ULT
+    // isn't likely to happen as we already canonicalize to that CondCode.
----------------
I didn't make the connection from SGT -> SGE, to SETG -> SETGE (ZF/OF/SF) or SETA -> SETAE (ZF/CF), and then to an actual perf difference until I re-read the comments in PR48760 and opened the x86 manual... and I'm still not entirely clear about it. :) This deserves more explanation in the code comment.

Maybe something like:
The "GE" conditions map to fewer EFLAGS bits than their "GT" counterparts. Specifically, the GE conditions (SETGE/SETAE) don't read ZF, whereas SETG/SETA do. This may translate to fewer uops, depending on the uarch implementation.
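
For anyone else who has to reconstruct this, here is a minimal standalone sketch of the idea, not the code from the patch. It assumes the canonicalization rewrites `x > C` as `x >= C + 1` whenever `C + 1` does not overflow; the helper names `sgtToSgeConstant` and `ugtToUgeConstant` are hypothetical and exist only for illustration. The flag sets in the comment are the ones listed in the x86 manual's condition-code table.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// EFLAGS bits read by each x86 condition:
//   SETG  : ZF == 0 && SF == OF   (reads ZF, SF, OF)
//   SETGE : SF == OF              (reads SF, OF)
//   SETA  : CF == 0 && ZF == 0    (reads CF, ZF)
//   SETAE : CF == 0               (reads CF)
// Both "GE"-style conditions drop the ZF read.

// Hypothetical helper (not from the patch): for a signed compare `x > c`,
// return the constant k such that `x >= k` is equivalent, or nullopt when
// c + 1 would overflow (x > INT32_MAX is always false anyway).
constexpr std::optional<int32_t> sgtToSgeConstant(int32_t c) {
  if (c == std::numeric_limits<int32_t>::max())
    return std::nullopt;
  return c + 1;
}

// Hypothetical helper (not from the patch): unsigned counterpart,
// turning `x > c` into `x >= c + 1` unless c is already the maximum.
constexpr std::optional<uint32_t> ugtToUgeConstant(uint32_t c) {
  if (c == std::numeric_limits<uint32_t>::max())
    return std::nullopt;
  return c + 1u;
}
```

With that shape, `x > 5` becomes `x >= 6`, so the lowered code can use SETGE (or SETAE for the unsigned case) instead of SETG/SETA. Whether the real transform needs further guards, such as the new immediate still being cheap to encode, is something this sketch does not try to model.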

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101074/new/

https://reviews.llvm.org/D101074