https://github.com/lhutton1 commented: I wonder if it's worth limiting the size of the tensors for which binary operations are folded - would the runtime of folding a larger input, say `tensor<1024x1024x1024x1024xf32>`, be reasonable? https://github.com/llvm/llvm-project/pull/128059
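For scale: `tensor<1024x1024x1024x1024xf32>` has 1024^4 = 2^40 elements, which at 4 bytes per `f32` is 4 TiB of constant data, so folding it element by element would likely be impractical. A rough sketch of the kind of guard I have in mind is below. The helper names (`numElements`, `isCheapToFold`) and the idea of a `maxElements` threshold are assumptions for illustration only, not existing MLIR APIs and not something this PR currently does.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical helper (not an MLIR API): number of elements in a static
// shape, or std::nullopt if a dimension is negative (e.g. dynamic) or the
// product would overflow int64_t.
std::optional<int64_t> numElements(const std::vector<int64_t> &shape) {
  int64_t count = 1;
  for (int64_t dim : shape) {
    if (dim < 0)
      return std::nullopt;
    if (dim != 0 && count > std::numeric_limits<int64_t>::max() / dim)
      return std::nullopt;
    count *= dim;
  }
  return count;
}

// Hypothetical guard: only fold when the result has at most maxElements
// elements. The threshold value is a policy choice, not taken from the PR.
bool isCheapToFold(const std::vector<int64_t> &shape, int64_t maxElements) {
  std::optional<int64_t> count = numElements(shape);
  return count.has_value() && *count <= maxElements;
}
```

With a limit of, say, `1 << 20` elements, `isCheapToFold({1024, 1024, 1024, 1024}, 1 << 20)` would reject the example above while small shapes such as `{16, 16}` would still fold. Whether the limit should be a fixed constant or a pass option is an open question.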