<div dir="ltr"><div><div><div>Hi -<br><br></div>The answer to this question may help resolve larger questions about intrinsics and vectorization that were discussed at the dev meeting last week, but let's start with the basics:<br><br></div>Which, if any, of these is the canonical IR?<br></div><div><br>; ret = x < y ? 0 : x-y<br></div><div>define i32 @max1(i32 %x, i32 %y) {<br> %sub = sub nsw i32 %x, %y<br> %cmp = icmp slt i32 %x, %y ; cmp is independent of sub<br> %sel = select i1 %cmp, i32 0, i32 %sub<br> ret i32 %sel<br>}<br><br></div><div>; ret = (x-y) < 0 ? 0 : x-y<br></div><div>define i32 @max2(i32 %x, i32 %y) {<br> %sub = sub nsw i32 %x, %y<br> %cmp = icmp slt i32 %sub, 0 ; cmp depends on sub, but this looks more like a max?<br> %sel = select i1 %cmp, i32 0, i32 %sub<br> ret i32 %sel<br>}<br><br></div><div>; ret = (x-y) > 0 ? x-y : 0<br></div><div>define i32 @max3(i32 %x, i32 %y) {<br> %sub = sub nsw i32 %x, %y<br> %cmp = icmp sgt i32 %sub, 0 ; canonicalize cmp+sel - looks even more like a max?<br> %sel = select i1 %cmp, i32 %sub, i32 0<br> ret i32 %sel<br>}<br><br>define i32 @max4(i32 %x, i32 %y) {<br> %sub = sub nsw i32 %x, %y<br></div><div> %max = call i32 @llvm.smax.i32(i32 %sub, i32 0) ; this intrinsic doesn't exist today<br></div><div> ret i32 %max<br>}<br><br></div><div><br></div><div><div><div><div><div><div><div>FWIW, InstCombine currently doesn't canonicalize any of the first 3 options to a single form, and codegen suffers because of that (how much depends on the target machine and data types). Regardless of the IR choice, some backend fixes are needed.<br></div><div><br>Another possible consideration is the structure/accuracy of the cost models used by the
vectorizers and other passes. I don't think they ever special-case
the cmp+sel pair as a single, unified operation that may be cheaper than the sum of
its parts.<br><br>Note that we added FP variants for min/max ops with:<br><a href="https://reviews.llvm.org/rL220341">https://reviews.llvm.org/rL220341</a><br><br></div><div><br></div></div></div></div></div></div></div></div>
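<div>For reference, here is a minimal source-level sketch of the first three patterns, assuming that x - y does not overflow (which is what the nsw flag on the sub asserts). The C++ function names mirror the IR functions; the helper forms_agree is hypothetical and only brute-forces a small range to illustrate that the three forms compute the same value. It is not a proof and says nothing about which form should be canonical.<br></div>

```cpp
#include <cassert>
#include <cstdint>

// Source-level analogues of @max1, @max2 and @max3.
// Assumption: x - y does not overflow (mirrors 'nsw' on the sub).

// ret = x < y ? 0 : x-y   (compare is independent of the subtraction)
int32_t max1(int32_t x, int32_t y) {
  return x < y ? 0 : x - y;
}

// ret = (x-y) < 0 ? 0 : x-y   (compare depends on the subtraction)
int32_t max2(int32_t x, int32_t y) {
  int32_t sub = x - y;
  return sub < 0 ? 0 : sub;
}

// ret = (x-y) > 0 ? x-y : 0   (predicate and select operands swapped)
int32_t max3(int32_t x, int32_t y) {
  int32_t sub = x - y;
  return sub > 0 ? sub : 0;
}

// Hypothetical helper: returns true if all three forms agree for every
// pair (x, y) with lo <= x, y <= hi. Keep the range small enough that
// x - y cannot overflow.
bool forms_agree(int32_t lo, int32_t hi) {
  for (int32_t x = lo; x <= hi; ++x) {
    for (int32_t y = lo; y <= hi; ++y) {
      int32_t a = max1(x, y);
      int32_t b = max2(x, y);
      int32_t c = max3(x, y);
      if (a != b || b != c) {
        return false;
      }
    }
  }
  return true;
}
```

<div>Calling forms_agree(-64, 64) exercises both the x < y and x >= y cases, including x == y, where max2 and max3 take different select arms but still return 0.<br></div>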