[llvm] [llvm][GlobalOpt] Optimize statically resolvable IFuncs (PR #80606)

Mon Feb 5 08:55:09 PST 2024

================
@@ -2404,6 +2405,42 @@ static bool OptimizeEmptyGlobalCXXDtors(Function *CXAAtExitFn) {
   return Changed;
 }
 
+static Function *hasSideeffectFreeStaticResolution(GlobalIFunc &IF) {
+  Function *Resolver = IF.getResolverFunction();
+  if (!Resolver)
+    return nullptr;
+
+  Function *Callee = nullptr;
+  for (BasicBlock &BB : *Resolver) {
----------------
jroelofs wrote:

Great point. In that case, I have a follow-up in mind that I'd like your advice on:

I want to extend this (in subsequent patch(es)) to de-virtualize call sites when the result is not quite as obvious. Looking at the attributes on the caller will give us some known bits of `__aarch64_cpu_features.features`, and if those make the resolver constant-foldable in the context of that caller, we can make a direct call.

Would you clone the resolver for each call site, RAUW the loads from `__aarch64_cpu_features.features` with `__aarch64_cpu_features.features | known_bits`, and then run InstCombine+SimplifyCFG, reverting the call site back to the original resolver if that didn't work out and DCE-ing the remaining cruft? Or would you attempt a more analysis/reasoning-based approach?
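
To make the analysis-based alternative concrete, here is a toy model in plain C++. It is only a sketch and does not use any LLVM API. It assumes the resolver is a priority-ordered chain of "are all of these feature bits set?" tests with a fallback, and that the caller's attributes only tell us bits that are definitely set. `Variant`, `resolveStatically`, and the integer target IDs are hypothetical names made up for illustration.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of one resolver branch:
// "if every bit in Required is set in the feature word, pick Target".
struct Variant {
  uint64_t Required; // feature bits this implementation needs
  int Target;        // stand-in for the Function* the resolver would return
};

// Walks the tests in priority order, given the bits the caller guarantees
// are set (KnownOne). Returns the chosen target if the outcome is fully
// determined, or std::nullopt if some test depends on bits we know nothing
// about. This is deliberately conservative: it gives up at the first
// undecidable test instead of checking whether all remaining paths agree.
std::optional<int> resolveStatically(const std::vector<Variant> &Variants,
                                     int DefaultTarget, uint64_t KnownOne) {
  for (const Variant &V : Variants) {
    // All required bits are known to be set, so this test always passes.
    if ((V.Required & ~KnownOne) == 0)
      return V.Target;
    // With only known-one information a test can never be proven false,
    // so the result now depends on the runtime value of the feature word.
    return std::nullopt;
  }
  // No tests at all: the resolver unconditionally returns the fallback.
  return DefaultTarget;
}
```

With known-one bits alone, the call site is only foldable when the highest-priority test is provably true. Proving a test false would need known-zero bits as well, which is one reason the clone-and-simplify route may end up catching more cases than a hand-rolled analysis like this.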

https://github.com/llvm/llvm-project/pull/80606