<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/153109>153109</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[NVPTX] Performance regression in IR that uses `<1 x float>`
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
Artem-B
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Artem-B
</td>
</tr>
</table>
<pre>
The introduction of v2f32 support appears to cause a performance regression in IR that uses `<1 x float>` vectors.
LLVM tends to combine such vectors into `<2 x float>` with `shufflevector`, and our lowering for that pattern ends up doing it the hard way, which regresses performance in some of our benchmarks.
Minimized reproducer: https://godbolt.org/z/8efcrna8b
One kernel constructs `<2 x float>` using `insertelement`, and all of that is removed during lowering. The other kernel, which uses `<1 x float>` and `shufflevector`, ends up doing a lot more unnecessary work.
```
%i4 = shufflevector <1 x float> %i1, <1 x float> %i2, <2 x i32> <i32 0, i32 1>
->
cvt.u64.u32 %rd3, %r1;
cvt.u64.u32 %rd4, %r2;
shl.b64 %rd5, %rd4, 32;
or.b64 %rd6, %rd3, %rd5;
```
Vs:
```
%i4 = insertelement <2 x float> undef, float %i1, i64 0
%a = insertelement <2 x float> %i4, float %i2, i64 1
->
...[nothing. LLVM removes vector creation and uses the original inputs %i1/%i2 ]
```
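For reference, the four PTX instructions emitted for the `shufflevector` case pack the two 32-bit inputs into a single 64-bit register. Below is a minimal Python sketch of that arithmetic, assuming each `<1 x float>` input arrives as the raw bit pattern of one f32 in a 32-bit register. The helper names `f32_bits` and `pack_v2f32` are hypothetical and are not part of LLVM or the reproducer.
```python
import struct

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def f32_bits(x: float) -> int:
    """Return the IEEE-754 binary32 bit pattern of x as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def pack_v2f32(r1: int, r2: int) -> int:
    """Model the PTX sequence: element 0 in the low half, element 1 in the high half."""
    rd3 = r1 & MASK32             # cvt.u64.u32 %rd3, %r1;  (zero-extend)
    rd4 = r2 & MASK32             # cvt.u64.u32 %rd4, %r2;  (zero-extend)
    rd5 = (rd4 << 32) & MASK64    # shl.b64 %rd5, %rd4, 32;
    rd6 = rd3 | rd5               # or.b64 %rd6, %rd3, %rd5;
    return rd6


packed = pack_v2f32(f32_bits(1.0), f32_bits(2.0))
print(hex(packed))
```
In the `insertelement` case none of this packing survives lowering, which is why the two kernels differ in cost.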
</pre>