<div dir="ltr"><div><div><div><div><div>Example:<br>define i8 @narrow_add(i8 %x, i8 %y) {<br>  %x32 = zext i8 %x to i32<br>  %y32 = zext i8 %y to i32<br>  %add = add nsw i32 %x32, %y32<br>  %tr = trunc i32 %add to i8<br>  ret i8 %tr<br>}<br><br></div>With no data-layout, or with an x86 target whose data-layout lists 8-bit integers as a legal width, we reduce this to:<br><br>$ ./opt -instcombine narrowadd.ll -S<br>define i8 @narrow_add(i8 %x, i8 %y) {<br>  %add = add i8 %x, %y<br>  ret i8 %add<br>}<br><br></div>But on a target that has 32-bit registers without explicit subregister ops, we don't do that transform, because we avoid changing operations from a legal width (as specified in the data-layout) to an illegal width; see InstCombiner::shouldChangeType().<br><br></div>Should we make an exception to allow narrowing for the common cases of i8 and i16?<br><br></div>In the motivating example from PR35875 ( <a href="https://bugs.llvm.org/show_bug.cgi?id=35875">https://bugs.llvm.org/show_bug.cgi?id=35875</a> ), an ARM target is stuck at 19 IR instructions:<br><br>declare void @use4(i8, i8, i8, i8)<br>define void @min_of_3_vals(i8 %x, i8 %y, i8 %z) {<br>  %nx = xor i8 %x, -1<br>  %ny = xor i8 %y, -1<br>  %nz = xor i8 %z, -1<br>  %zx = zext i8 %nx to i32<br>  %zy = zext i8 %ny to i32<br>  %zz = zext i8 %nz to i32<br><br>  %cmpxz = icmp ult i32 %zx, %zz<br>  %minxz = select i1 %cmpxz, i32 %zx, i32 %zz<br>  %cmpyz = icmp ult i32 %zy, %zz<br>  %minyz = select i1 %cmpyz, i32 %zy, i32 %zz<br>  %cmpyx = icmp ult i8 %y, %x<br>  %minxyz = select i1 %cmpyx, i32 %minxz, i32 %minyz<br>  %tr_minxyz = trunc i32 %minxyz to i8<br><br>  %new_zx = sub nsw i32 %zx, %minxyz<br>  %new_zy = sub nsw i32 %zy, %minxyz<br>  %new_zz = sub nsw i32 %zz, %minxyz<br>  %new_x = trunc i32 %new_zx to i8<br>  %new_y = trunc i32 %new_zy to i8<br>  %new_z = trunc i32 %new_zz to i8<br><br>  call void @use4(i8 %tr_minxyz, i8 %new_x, i8 %new_y, i8 %new_z)<br>  ret void<br>}<br><br></div>...but x86 gets to shrink the subs, which 
leads to a bunch of other transforms, and we grind this down to 10 instructions between instcombine and early-cse:<br><br>define void @min_of_3_vals(i8 %x, i8 %y, i8 %z) {<br>  %nx = xor i8 %x, -1<br>  %ny = xor i8 %y, -1<br>  %nz = xor i8 %z, -1<br>  %cmpxz = icmp ult i8 %nx, %nz<br>  %minxz = select i1 %cmpxz, i8 %nx, i8 %nz<br>  %1 = icmp ult i8 %minxz, %ny<br>  %minxyz = select i1 %1, i8 %minxz, i8 %ny<br>  %new_x = sub i8 %nx, %minxyz<br>  %new_y = sub i8 %ny, %minxyz<br>  %new_z = sub i8 %nz, %minxyz<br><br>  call void @use4(i8 %minxyz, i8 %new_x, i8 %new_y, i8 %new_z)<br>  ret void<br>}<br><br></div>
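For anyone who wants to convince themselves that narrowing the add and the subs is always sound, so that the data-layout check is purely a profitability question, here is a rough brute-force sketch in Python. It is not LLVM code: the helper names (zext_i8_to_i32, trunc_i32_to_i8, wide_then_trunc, narrow, narrowing_is_sound) are made up for illustration, and it only models wrapping integer arithmetic, assuming the usual two's-complement semantics of add/sub/mul.

```python
import operator

MASK8 = 0xFF          # bit mask for an i8 value
MASK32 = 0xFFFFFFFF   # bit mask for an i32 value


def zext_i8_to_i32(v):
    """Model 'zext i8 %v to i32': keep the low 8 bits, upper bits are zero."""
    return v & MASK8


def trunc_i32_to_i8(v):
    """Model 'trunc i32 %v to i8': keep only the low 8 bits."""
    return v & MASK8


def wide_then_trunc(op, x, y):
    """zext both operands, do the op with i32 wraparound, then trunc to i8."""
    wide = op(zext_i8_to_i32(x), zext_i8_to_i32(y)) & MASK32
    return trunc_i32_to_i8(wide)


def narrow(op, x, y):
    """Do the op directly with i8 wraparound."""
    return op(x & MASK8, y & MASK8) & MASK8


def narrowing_is_sound(op):
    """Exhaustively compare the wide and narrow forms over all i8 pairs."""
    return all(
        wide_then_trunc(op, x, y) == narrow(op, x, y)
        for x in range(256)
        for y in range(256)
    )


if __name__ == "__main__":
    print("add:", narrowing_is_sound(operator.add))
    print("sub:", narrowing_is_sound(operator.sub))
```

Both checks agree for every pair of i8 inputs, which matches the i8 add and i8 subs that x86 ends up with above. The sketch says nothing about whether the narrow ops are cheaper on a given target; that is the part shouldChangeType() is deciding.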