<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/74760>74760</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
wasm vector casts from float to a narrow integer type scalarize
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
abadams
</td>
</tr>
</table>
<pre>
The following .ll lowers to 16 scalar ops with control flow. Adding an intermediate cast to a 32-bit integer behaves as expected (does not scalarize), but that doesn't seem like it should be necessary. Adding +nontrapping-fptoint fixes it too, but IIUC fptoui yields a poison value on overflow, so trying to lower it to something with overflow checking can't work: transformations already applied to the LLVM IR may only be correct for non-poison, in-bounds values, and they may have mapped out-of-range poison values back in-range in a way that dodges the overflow checks.
```
; llc wasm_float_cast.ll -mtriple=wasm32-unknown--wasm -mattr=+simd128 -o -
define void @test(ptr noalias nocapture noundef readonly %in, ptr noalias nocapture noundef writeonly %out) {
entry:
%fv.0.copyload = load <16 x float>, ptr %in, align 16
%conv = fptoui <16 x float> %fv.0.copyload to <16 x i8>
store <16 x i8> %conv, ptr %out, align 16
ret void
}
```
```
.text
.file "wasm_float_cast.ll"
.functype test (i32, i32) -> ()
.section .text.test,"",@
.globl test # -- Begin function test
.type test,@function
test: # @test
.functype test (i32, i32) -> ()
.local v128, f32, i32, i32, v128
# %bb.0: # %entry
block
block
local.get 0
v128.load 0
local.tee 2
f32x4.extract_lane 1
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label1
# %bb.1: # %entry
local.get 3
i32.trunc_f32_u
local.set 4
br 1 # 1: down to label0
.LBB0_2: # %entry
end_block # label1:
i32.const 0
local.set 4
.LBB0_3: # %entry
end_block # label0:
block
block
local.get 2
f32x4.extract_lane 0
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label3
# %bb.4: # %entry
local.get 3
i32.trunc_f32_u
local.set 5
br 1 # 1: down to label2
.LBB0_5: # %entry
end_block # label3:
i32.const 0
local.set 5
.LBB0_6: # %entry
end_block # label2:
local.get 5
i8x16.splat
local.get 4
i8x16.replace_lane 1
local.set 6
block
block
local.get 2
f32x4.extract_lane 2
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label5
# %bb.7: # %entry
local.get 3
i32.trunc_f32_u
local.set 4
br 1 # 1: down to label4
.LBB0_8: # %entry
end_block # label5:
i32.const 0
local.set 4
.LBB0_9: # %entry
end_block # label4:
local.get 6
local.get 4
i8x16.replace_lane 2
local.set 6
block
block
local.get 2
f32x4.extract_lane 3
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label7
# %bb.10: # %entry
local.get 3
i32.trunc_f32_u
local.set 4
br 1 # 1: down to label6
.LBB0_11: # %entry
end_block # label7:
i32.const 0
local.set 4
.LBB0_12: # %entry
end_block # label6:
local.get 6
local.get 4
i8x16.replace_lane 3
local.set 6
block
block
local.get 0
v128.load 16
local.tee 2
f32x4.extract_lane 0
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label9
# %bb.13: # %entry
local.get 3
i32.trunc_f32_u
local.set 4
br 1 # 1: down to label8
.LBB0_14: # %entry
end_block # label9:
i32.const 0
local.set 4
.LBB0_15: # %entry
end_block # label8:
local.get 6
local.get 4
i8x16.replace_lane 4
local.set 6
block
block
local.get 2
f32x4.extract_lane 1
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label11
# %bb.16: # %entry
local.get 3
i32.trunc_f32_u
local.set 4
br 1 # 1: down to label10
.LBB0_17: # %entry
end_block # label11:
i32.const 0
local.set 4
.LBB0_18: # %entry
end_block # label10:
local.get 6
local.get 4
i8x16.replace_lane 5
local.set 6
block
block
local.get 2
f32x4.extract_lane 2
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label13
# %bb.19: # %entry
local.get 3
i32.trunc_f32_u
local.set 4
br 1 # 1: down to label12
.LBB0_20: # %entry
end_block # label13:
i32.const 0
local.set 4
.LBB0_21: # %entry
end_block # label12:
local.get 6
local.get 4
i8x16.replace_lane 6
local.set 6
block
block
local.get 2
f32x4.extract_lane 3
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label15
# %bb.22: # %entry
local.get 3
i32.trunc_f32_u
local.set 4
br 1 # 1: down to label14
.LBB0_23: # %entry
end_block # label15:
i32.const 0
local.set 4
.LBB0_24: # %entry
end_block # label14:
local.get 6
local.get 4
i8x16.replace_lane 7
local.set 6
block
block
local.get 0
v128.load 32
local.tee 2
f32x4.extract_lane 0
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label17
# %bb.25: # %entry
local.get 3
i32.trunc_f32_u
local.set 4
br 1 # 1: down to label16
.LBB0_26: # %entry
end_block # label17:
i32.const 0
local.set 4
.LBB0_27: # %entry
end_block # label16:
local.get 6
local.get 4
i8x16.replace_lane 8
local.set 6
block
block
local.get 2
f32x4.extract_lane 1
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label19
# %bb.28: # %entry
local.get 3
i32.trunc_f32_u
local.set 4
br 1 # 1: down to label18
.LBB0_29: # %entry
end_block # label19:
i32.const 0
local.set 4
.LBB0_30: # %entry
end_block # label18:
local.get 6
local.get 4
i8x16.replace_lane 9
local.set 6
block
block
local.get 2
f32x4.extract_lane 2
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label21
# %bb.31: # %entry
local.get 3
i32.trunc_f32_u
local.set 4
br 1 # 1: down to label20
.LBB0_32: # %entry
end_block # label21:
i32.const 0
local.set 4
.LBB0_33: # %entry
end_block # label20:
local.get 6
local.get 4
i8x16.replace_lane 10
local.set 6
block
block
local.get 2
f32x4.extract_lane 3
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label23
# %bb.34: # %entry
local.get 3
i32.trunc_f32_u
local.set 4
br 1 # 1: down to label22
.LBB0_35: # %entry
end_block # label23:
i32.const 0
local.set 4
.LBB0_36: # %entry
end_block # label22:
local.get 6
local.get 4
i8x16.replace_lane 11
local.set 6
block
block
local.get 0
i32.const 48
i32.add
v128.load 0
local.tee 2
f32x4.extract_lane 0
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label25
# %bb.37: # %entry
local.get 3
i32.trunc_f32_u
local.set 0
br 1 # 1: down to label24
.LBB0_38: # %entry
end_block # label25:
i32.const 0
local.set 0
.LBB0_39: # %entry
end_block # label24:
local.get 6
local.get 0
i8x16.replace_lane 12
local.set 6
block
block
local.get 2
f32x4.extract_lane 1
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label27
# %bb.40: # %entry
local.get 3
i32.trunc_f32_u
local.set 0
br 1 # 1: down to label26
.LBB0_41: # %entry
end_block # label27:
i32.const 0
local.set 0
.LBB0_42: # %entry
end_block # label26:
local.get 6
local.get 0
i8x16.replace_lane 13
local.set 6
block
block
local.get 2
f32x4.extract_lane 2
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label29
# %bb.43: # %entry
local.get 3
i32.trunc_f32_u
local.set 0
br 1 # 1: down to label28
.LBB0_44: # %entry
end_block # label29:
i32.const 0
local.set 0
.LBB0_45: # %entry
end_block # label28:
local.get 6
local.get 0
i8x16.replace_lane 14
local.set 6
block
block
local.get 2
f32x4.extract_lane 3
local.tee 3
f32.const 0x1p32
f32.lt
local.get 3
f32.const 0x0p0
f32.ge
i32.and
i32.eqz
br_if 0 # 0: down to label31
# %bb.46: # %entry
local.get 3
i32.trunc_f32_u
local.set 0
br 1 # 1: down to label30
.LBB0_47: # %entry
end_block # label31:
i32.const 0
local.set 0
.LBB0_48: # %entry
end_block # label30:
local.get 1
local.get 6
local.get 0
i8x16.replace_lane 15
v128.store 0
# fallthrough-return
end_function
# -- End function
.section .custom_section.target_features,"",@
.int8 3
.int8 43
.int8 15
.ascii "mutable-globals"
.int8 43
.int8 8
.ascii "sign-ext"
.int8 43
.int8 7
.ascii "simd128"
.section .text.test,"",@
```
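For comparison, a minimal sketch of the intermediate-cast workaround described above. The exact IR used for that experiment is not included in this report, so the function name @test_via_i32, the value names, and the file name wasm_float_cast_workaround.ll are assumptions. Note that this is not strictly the same operation: the trunc wraps for inputs in [256, 2^32), where the direct fptoui to i8 would be poison, so the sketch is at least as defined as the original.

```llvm
; llc wasm_float_cast_workaround.ll -mtriple=wasm32-unknown--wasm -mattr=+simd128 -o -
;
; The other workaround from the report keeps the original IR and only changes the flags:
; llc wasm_float_cast.ll -mtriple=wasm32-unknown--wasm -mattr=+simd128,+nontrapping-fptoint -o -
define void @test_via_i32(ptr noalias nocapture noundef readonly %in, ptr noalias nocapture noundef writeonly %out) {
entry:
  %fv = load <16 x float>, ptr %in, align 16
  ; Convert to 32-bit lanes first (assumed form of the "intermediate cast") ...
  %wide = fptoui <16 x float> %fv to <16 x i32>
  ; ... then narrow each lane to 8 bits.
  %conv = trunc <16 x i32> %wide to <16 x i8>
  store <16 x i8> %conv, ptr %out, align 16
  ret void
}
```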
</pre>