[llvm] [DirectX] Propagate shader flags mask of callees to callers (PR #118306)

Mon Dec 2 18:43:07 PST 2024

================
@@ -81,16 +101,31 @@ void ComputedShaderFlags::print(raw_ostream &OS) const {
   OS << ";\n";
 }
 
-/// Return the shader flags mask of the specified function Func.
-const ComputedShaderFlags &
-ModuleShaderFlags::getFunctionFlags(const Function *Func) const {
+auto ModuleShaderFlags::getFunctionShaderFlagInfo(const Function *Func) const {
   const auto Iter = llvm::lower_bound(
       FunctionFlags, Func,
       [](const std::pair<const Function *, ComputedShaderFlags> FSM,
          const Function *FindFunc) { return (FSM.first < FindFunc); });
   assert((Iter != FunctionFlags.end() && Iter->first == Func) &&
          "No Shader Flags Mask exists for function");
-  return Iter->second;
+  return Iter;
+}
+
+/// Merge mask NewSF to that of Func, if different.
+/// Return true if mask of Func is changed, else false.
+bool ModuleShaderFlags::mergeFunctionShaderFlags(
+    const Function *Func, const ComputedShaderFlags NewSF) {
+  const auto FuncSFInfo = getFunctionShaderFlagInfo(Func);
+  if ((FuncSFInfo->second & NewSF) != NewSF) {
+    const_cast<ComputedShaderFlags &>(FuncSFInfo->second).merge(NewSF);
----------------
tex3d wrote:

This presumes that the merge is a simple bitwise OR operation and you can use this `&` to detect a change.  It also uses `&` operator on the `ComputedShaderFlags` type, which I assume works by triggering an implicit cast to `uint64_t`.  I believe it would be cleaner to allow for a copy of the `ComputedShaderFlags` and to compare before and after merge to detect a change.

As a side note, we probably don't need to have all the shader flags defined as bitfield entries with one-by-one translation from bitfields to uint64_t. I certainly get that bitfields don't have a guaranteed layout in C++, but we could instead use an explicit `uint64_t` field with an enum defining the (DxilModule) bit shift for each flag (plus generated set/get accessors).  If we did that, operations such as this would be more efficient, since it doesn't have to go through multiple translations between individual bits and combined `uint64_t` values each time.  Copying and merging would be made trivial as well.

https://github.com/llvm/llvm-project/pull/118306