[clang] [llvm] [RISCV][Zicfilp] Enable Zicfilp CFI compiler behaviors by looking at module flags (PR #152121)

Mon Aug 25 05:08:51 PDT 2025

mylai-mtk wrote:

Hi, I have been modeling the `cf-protection` module flags as target features, but I ran into some problems, so I'm not sure whether I should continue down this path.

By modeling the Zicfilp CFI module flags as subtarget features, I hoped to achieve the following:

+ Provide easy access to whether Zicfilp CFI is enabled.

-> References to `MCSubtargetInfo` are more widely available throughout the code base; `Module`s, on the other hand, are somewhat harder to obtain.

---

+ Avoid repeated parsing of module flags (that is, no duplicated, scattered parsing code).
+ Provide a correct and immutable indication of whether Zicfilp CFI is enabled from the very beginning of the backend pipeline.

-> By enforcing consistency between the feature bits and the module flags when creating `RISCVSubtarget`s, these two goals could be achieved.
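To make the idea concrete, here is a minimal sketch of syncing a module flag into a subtarget at creation time. It uses toy stand-in types, not the real LLVM classes: `ToyModule`, `ToyFunction`, `ToySubtarget`, `ToyTargetMachine` and the feature name `zicfilp-cfi` are all hypothetical, and `cf-protection-branch` is assumed to be the relevant flag. The cache keyed on the feature string is only loosely modeled on how a target machine hands out per-function subtargets.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Toy stand-ins for illustration only; these are NOT the real LLVM classes.
struct ToyModule {
  std::map<std::string, int> Flags; // e.g. {"cf-protection-branch", 1}
};

struct ToyFunction {
  const ToyModule *Parent;
  std::string TargetFeatures; // models the "target-features" attribute
};

struct ToySubtarget {
  std::string FeatureString;
  bool HasZicfilpCFI; // fixed at construction, never mutated afterwards
};

class ToyTargetMachine {
  // One subtarget per distinct feature string.
  std::map<std::string, std::unique_ptr<ToySubtarget>> SubtargetMap;

public:
  const ToySubtarget &getSubtarget(const ToyFunction &F) {
    // Read the module flag exactly once, at subtarget creation/lookup.
    auto It = F.Parent->Flags.find("cf-protection-branch");
    bool CFI = It != F.Parent->Flags.end() && It->second != 0;

    // Fold the flag into the key ("zicfilp-cfi" is a hypothetical name),
    // so functions from modules with different flags never share a subtarget.
    std::string Key =
        F.TargetFeatures + (CFI ? ",+zicfilp-cfi" : ",-zicfilp-cfi");

    std::unique_ptr<ToySubtarget> &Slot = SubtargetMap[Key];
    if (!Slot)
      Slot = std::make_unique<ToySubtarget>(ToySubtarget{Key, CFI});
    return *Slot;
  }
};
```

With this shape, later passes would query only `HasZicfilpCFI` on the subtarget and never touch the module flags again.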

---

The problem arose when I tried to handle the module-wide application of the Zicfilp CFI module flags. `TargetMachine` also holds a feature bitmap, in its `STI` field (of type `MCSubtargetInfo`). This `TM.STI` does not belong to any `Function` and is created without reference to any specific `Module` instance, so I cannot sync (enforce) the module flag information into it at creation. As a result, the goals above cannot be fully met.

In my experience with LLVM, I had always assumed that `TM.STI->hasFeature(XXX)` tells whether a feature is enabled **for the given `Module` we're compiling**. However, the inability to sync module flags into `TM.STI` challenges this assumption: could a `TargetMachine` instance be reused to compile multiple `Module`s, so that `TM.STI` actually represents **what the target machine is capable of, not how it behaves**? If this is the correct understanding of `TargetMachine`, then it is natural that no module flags can be synced to `TM.STI`, since module flags control the `Module`'s behavior, not necessarily the `TargetMachine`'s, and different `Module`s compiled with the same `TargetMachine` instance could exhibit different behaviors. It would further imply that my goal of "Avoid repeated parsing of module flags" is not suitable to implement with `TM.STI`, since it does not convey `Module` behavior, and that parsing module flags is the preferred way to implement module-wide behavior.
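The reuse scenario can be illustrated with a toy model. Again, none of these types are real LLVM classes: `ToyModule`, `ToySTI`, `ToyTargetMachine` and `moduleEnablesCFI` are hypothetical names, and `cf-protection-branch` is assumed to be the flag in question. The sketch only shows that a machine-level `STI`, created before any module exists, stays the same while the per-module answer changes.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-ins for illustration only; these are NOT the real LLVM classes.
struct ToyModule {
  std::map<std::string, int> Flags;
};

// Machine-level feature info: describes capability, not module behavior.
struct ToySTI {
  bool HasZicfilp;
};

struct ToyTargetMachine {
  ToySTI STI; // created before any module is known
  explicit ToyTargetMachine(bool HasZicfilp) : STI{HasZicfilp} {}
};

// Module-wide behavior has to be read from the module itself, because the
// shared ToyTargetMachine::STI cannot differ between modules.
bool moduleEnablesCFI(const ToyModule &M) {
  auto It = M.Flags.find("cf-protection-branch");
  return It != M.Flags.end() && It->second != 0;
}
```

If one `ToyTargetMachine` compiles a module with the flag set and then one without it, `TM.STI.HasZicfilp` is identical in both runs, while `moduleEnablesCFI` returns different results.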

So in a nutshell, I want to ask:

+ Is modeling per-`Function` behavior with `RISCVSubtarget` suitable?
+ Is modeling per-`Module` behavior with `TM.STI` **not** suitable, since `TM.STI` was never intended to stay in sync with, and serve only, one specific `Module`?

(By the way, I considered recommending this approach to Zicfiss in #152251, but given the uncertainty I'm currently facing, I think it's better that I get approval for this approach beforehand.)

https://github.com/llvm/llvm-project/pull/152121