<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/127244">127244</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
SLP vectorizer performance regression caused by 88e7b8b81c061113399637f936937ffaf5a9bc08, #125725
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
pclove1
</td>
</tr>
</table>
<pre>
The SLP vectorizer change in 88e7b8b81c061113399637f936937ffaf5a9bc08 / #125725 introduced a performance regression: the four scalar phis in the example below are no longer combined into a single vector phi.
# Minimal reproducible example (LLVM IR):
```
target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
define ptx_kernel void @test() {
%vec = bitcast i32 0 to <4 x i8>
%elem0 = extractelement <4 x i8> %vec, i64 0
%elem1 = extractelement <4 x i8> %vec, i64 1
%elem2 = extractelement <4 x i8> %vec, i64 2
%elem3 = extractelement <4 x i8> %vec, i64 3
br label %1
1: ; preds = %1, %0
%.p0 = phi i8 [ %elem0, %0 ], [ 0, %1 ]
%.p1 = phi i8 [ %elem1, %0 ], [ 0, %1 ]
%.p2 = phi i8 [ %elem2, %0 ], [ 0, %1 ]
%.p3 = phi i8 [ %elem3, %0 ], [ 0, %1 ]
%val0 = insertelement <4 x i8> poison, i8 %.p0, i64 0
%val1 = insertelement <4 x i8> %val0, i8 %.p1, i64 1
%val2 = insertelement <4 x i8> %val1, i8 %.p2, i64 2
%val3 = insertelement <4 x i8> %val2, i8 %.p3, i64 3
%val = bitcast <4 x i8> %val3 to i32
br label %1
}
```
# SLP vectorizer output before the culprit commit:
```
target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
define ptx_kernel void @test() {
%vec = bitcast i32 0 to <4 x i8>
br label %1
1: ; preds = %1, %0
%2 = phi <4 x i8> [ %vec, %0 ], [ zeroinitializer, %1 ]
%val = bitcast <4 x i8> %2 to i32
br label %1
}
```
https://godbolt.org/z/o1orj4GWh
# SLP vectorizer output after the culprit commit (input is left unchanged):
```
target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
define ptx_kernel void @test() {
%vec = bitcast i32 0 to <4 x i8>
%elem0 = extractelement <4 x i8> %vec, i64 0
%elem1 = extractelement <4 x i8> %vec, i64 1
%elem2 = extractelement <4 x i8> %vec, i64 2
%elem3 = extractelement <4 x i8> %vec, i64 3
br label %1
1: ; preds = %1, %0
%.p0 = phi i8 [ %elem0, %0 ], [ 0, %1 ]
%.p1 = phi i8 [ %elem1, %0 ], [ 0, %1 ]
%.p2 = phi i8 [ %elem2, %0 ], [ 0, %1 ]
%.p3 = phi i8 [ %elem3, %0 ], [ 0, %1 ]
%val0 = insertelement <4 x i8> poison, i8 %.p0, i64 0
%val1 = insertelement <4 x i8> %val0, i8 %.p1, i64 1
%val2 = insertelement <4 x i8> %val1, i8 %.p2, i64 2
%val3 = insertelement <4 x i8> %val2, i8 %.p3, i64 3
%val = bitcast <4 x i8> %val3 to i32
br label %1
}
```
https://godbolt.org/z/ha797ePhW
Out of curiosity, I also tried a more recent commit, 7ec60bf0166519317b5ae2505dd6ed4660e3ea39, and the performance regression still reproduces there.
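For anyone re-checking this at other commits, here is a minimal sketch of a helper that tells the two outputs above apart. It is not part of the original report: the function names `classify_phis` and `is_vectorized` are hypothetical, and it assumes the IR text comes from running the pass on the reproducer (for example with `opt -passes=slp-vectorizer -S`). It only counts phi nodes by type with regular expressions, so it is a rough textual check rather than a real IR parser.

```python
import re

# Matches a phi with a fixed-width vector type, e.g. "= phi <4 x i8>".
VECTOR_PHI = re.compile(r"=\s*phi\s+<\d+\s+x\s+\w+>")
# Matches a phi with a scalar integer type, e.g. "= phi i8".
SCALAR_PHI = re.compile(r"=\s*phi\s+i\d+\b")


def classify_phis(ir_text):
    """Return (vector_phi_count, scalar_phi_count) for the given IR text."""
    vector = len(VECTOR_PHI.findall(ir_text))
    scalar = len(SCALAR_PHI.findall(ir_text))
    return vector, scalar


def is_vectorized(ir_text):
    """Heuristic: at least one vector phi and no scalar integer phis remain."""
    vector, scalar = classify_phis(ir_text)
    return vector > 0 and scalar == 0
```

With the "before" output above, `is_vectorized` is expected to return true (one `<4 x i8>` phi); with the "after" output it returns false because the four `i8` phis are still present.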
**Credit** to @metaflow for finding the culprit.
</pre>