[clang] [clang] Emit @llvm.assume before streaming_compatible functions when the streaming mode is known (PR #121917)

Fri Jan 10 07:14:08 PST 2025

sdesmalen-arm wrote:

The assumption cache mechanism is used by a number of passes, such as [partial] inlining, function specialization and IPSCCP (interprocedural sparse conditional constant propagation).

The idea behind doing this is to let optimizations iteratively apply knowledge about the streaming mode of the caller when analyzing or optimizing a callee. If we moved this functionality to e.g. constant folding, we could only apply this knowledge to functions where the streaming mode is already known. If we instead folded the intrinsic in the instcombine pass, the fold would happen only once, rather than iteratively in a pass that needs the information, e.g. while inlining.

The idea of using llvm.assume is to give the compiler more knowledge when trying to inline or specialize a function. For example, if the callee has an `if (__arm_in_streaming_mode()) { ... }` branch, then the cost model could infer that this branch is always or never executed, depending on the mode of the call site, which changes the cost and therefore the decision on whether or not to inline. IIRC, when inlining, the IR cloner also simplifies the code based on the assumption cache.
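To make the inlining argument concrete, here is a minimal, portable sketch of the pattern. It is not the code this PR emits: the `streaming` parameter is a hypothetical stand-in for what `__arm_in_streaming_mode()` would return, and `ASSUME`, `callee` and `streaming_caller` are names made up for illustration. Under clang, `__builtin_assume` lowers to an `llvm.assume` call.

```c
#include <stdbool.h>

/* ASSUME is a hypothetical helper macro, not part of the PR.
   clang provides __builtin_assume; on GCC we approximate it with
   __builtin_unreachable; elsewhere it expands to nothing. */
#if defined(__clang__)
#define ASSUME(cond) __builtin_assume(cond)
#elif defined(__GNUC__)
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#define ASSUME(cond) ((void)0)
#endif

/* Stand-in for a streaming-compatible callee. `streaming` models the
   result of __arm_in_streaming_mode() (an assumption made so that the
   sketch builds on any target). */
int callee(bool streaming, int x) {
    if (streaming)
        return x * 2; /* path taken only in streaming mode */
    return x + 1;     /* path taken only in non-streaming mode */
}

/* Stand-in for a caller whose streaming mode is known. The assumption is
   stated before the call, so an optimizer that consults it can treat the
   branch in `callee` as always taken when costing or inlining the call. */
int streaming_caller(bool streaming, int x) {
    ASSUME(streaming);
    return callee(streaming, x);
}
```

With the assumption in place, an inliner that consults it can cost `callee` as if only the `x * 2` path existed at this call site. Whether a given compiler version actually does so is not something this sketch demonstrates.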

I understand that emitting this on a per-call basis is a bit of a hammer. I had expected/hoped that LLVM passes would hoist/CSE the multiple llvm.assume calls to reduce the number of these intrinsics (because the value of the condition is constant within the function). Am I correct in thinking that emitting an llvm.assume once in the entry block of a function is not a problem?

https://github.com/llvm/llvm-project/pull/121917