tmatheson-arm wrote: If this is to support LSFE, we likely want to keep the CAS loops unless that feature is available. What do `fmax`/`fmin` lower to with and without LSFE? https://github.com/llvm/llvm-project/pull/162679
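
For context, by "CAS loop" I mean a fallback of roughly the following shape. This is only a C++ sketch of the general technique, not the actual LLVM expansion or its codegen; the helper name `atomic_fetch_fmax` and the chosen memory orderings are assumptions made for illustration.

```cpp
#include <atomic>
#include <cmath>

// Hypothetical helper (not an LLVM or libc++ API): atomically replaces
// `target` with fmax(target, value) and returns the previous value.
float atomic_fetch_fmax(std::atomic<float>& target, float value) {
    float observed = target.load(std::memory_order_relaxed);
    // compare_exchange_weak writes the current contents into `observed`
    // when it fails, so each retry recomputes fmax from a fresh value.
    while (!target.compare_exchange_weak(observed,
                                         std::fmax(observed, value),
                                         std::memory_order_seq_cst,
                                         std::memory_order_relaxed)) {
        // Retry until no other thread modified `target` in between.
    }
    return observed;
}
```

With a native floating-point atomic instruction the loop would presumably collapse to a single operation, which is why the lowering should depend on whether the feature is available.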