[llvm] [DirectX] Propagate shader flags mask of callees to callers (PR #118306)

Wed Dec 4 09:42:11 PST 2024

================
@@ -61,6 +66,21 @@ void ModuleShaderFlags::initialize(const Module &M) {
     CombinedSFMask.merge(CSF);
   }
   llvm::sort(FunctionFlags);
+  // Propagate shader flag mask of functions to their callers.
+  while (!WorkList.empty()) {
+    const Function *Func = WorkList.pop_back_val();
+    if (!Func->user_empty()) {
+      const ComputedShaderFlags &FuncSF = getFunctionFlags(Func);
+      // Update mask of callers with that of Func
+      for (const auto User : Func->users()) {
+        if (const CallInst *CI = dyn_cast<CallInst>(User)) {
+          const Function *Caller = CI->getParent()->getParent();
+          if (mergeFunctionShaderFlags(Caller, FuncSF))
+            WorkList.push_back(Caller);
----------------
bogner wrote:

This approach is very inefficient. Consider a chain of functions, `f1`, `f2`, `f3`..., where `f1` has no flags, `f2` has one flag, `f3` has another flag, etc. In this case we might process `f1`, then add it to the worklist again while processing `f2`, process it again, and then add `f2` to the worklist while processing `f3`, and then add `f1` to be processed *again* while processing `f2`, and so on.

The correct way to do this type of thing is to walk the call graph in post-order. One way to do this is by switching this pass to a [CallGraphSCCPass (old PM)](https://llvm.org/docs/WritingAnLLVMPass.html#the-callgraphsccpass-class) / [CGSCCPass (new PM)](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Analysis/CGSCCPassManager.h#L136).
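
To illustrate the post-order idea outside of LLVM, here is a minimal standalone sketch. It is not the pass's code: `ToyCallGraph`, `propagateFrom`, and `propagateFlags` are hypothetical names, flag masks are modeled as plain `uint64_t` bitmasks, and the call graph is assumed to be acyclic (recursion would need the SCC handling that the CGSCC infrastructure provides).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy model: functions are indices, Callees[F] lists the
// functions that F calls, and Flags[F] is F's own flag mask.
struct ToyCallGraph {
  std::vector<std::vector<int>> Callees;
  std::vector<uint64_t> Flags;
};

// Depth-first walk that finishes every callee before its caller
// (post-order), then folds the callee's final mask into the caller.
// Assumes the graph has no cycles.
static void propagateFrom(ToyCallGraph &G, int F, std::vector<bool> &Visited) {
  if (Visited[F])
    return; // F's mask is already final.
  Visited[F] = true;
  for (int Callee : G.Callees[F]) {
    propagateFrom(G, Callee, Visited);
    G.Flags[F] |= G.Flags[Callee];
  }
}

// Each function is finalized once and each call edge is merged once,
// so nothing is ever re-queued.
void propagateFlags(ToyCallGraph &G) {
  std::vector<bool> Visited(G.Callees.size(), false);
  for (int F = 0; F < static_cast<int>(G.Callees.size()); ++F)
    propagateFrom(G, F, Visited);
}
```

For the chain above, taking `f1` to call `f2` and `f2` to call `f3`, the walk finalizes `f3` first, then `f2`, then `f1`, so each function's mask is computed exactly once instead of being revisited through a worklist.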

https://github.com/llvm/llvm-project/pull/118306