<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/118410>118410</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
slp-vectorizer pass removes zext/trunc instructions, which breaks correctness
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
jabraham17
</td>
</tr>
</table>
<pre>
The slp-vectorizer pass in LLVM 19 appears to erroneously remove the `zext`/`trunc` instructions around `llvm.abs`, which breaks the correctness of the code. This is a regression from LLVM 18 to LLVM 19.
This [compiler explorer link](https://godbolt.org/z/zof5rrxze) shows the issue; the code is also pasted below.
Input LLVM IR
```llvm
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128-Fn32"
target triple = "arm64-apple-macosx13.0.0"
define i64 @chpl_gen_main(i16 %0) {
chpl__init_repro.exit:
%1 = zext i16 %0 to i64
%2 = or i64 0, %1
%3 = tail call i64 @llvm.abs.i64(i64 %2, i1 true)
%4 = trunc i64 %3 to i16
%5 = zext i16 0 to i64
%6 = or i64 0, %5
%7 = tail call i64 @llvm.abs.i64(i64 %6, i1 true)
%8 = trunc i64 %7 to i16
%9 = or i16 %8, %4
%10 = zext i16 0 to i64
%11 = or i64 0, %10
%12 = tail call i64 @llvm.abs.i64(i64 %11, i1 true)
%13 = trunc i64 %12 to i16
%14 = or i16 %9, %13
%15 = zext i16 0 to i64
%16 = or i64 0, %15
%17 = tail call i64 @llvm.abs.i64(i64 %16, i1 true)
%18 = trunc i64 %17 to i16
%19 = or i16 %14, %18
%20 = zext i16 0 to i64
%21 = or i64 0, %20
%22 = tail call i64 @llvm.abs.i64(i64 %21, i1 true)
%23 = trunc i64 %22 to i16
%24 = or i16 %19, %23
%25 = zext i16 0 to i64
%26 = or i64 0, %25
%27 = tail call i64 @llvm.abs.i64(i64 %26, i1 true)
%28 = trunc i64 %27 to i16
%29 = or i16 %24, %28
%30 = zext i16 0 to i64
%31 = or i64 %30, 0
%32 = tail call i64 @llvm.abs.i64(i64 %31, i1 true)
%33 = trunc i64 %32 to i16
%34 = or i16 %29, %33
%35 = zext i16 0 to i64
%36 = or i64 0, %35
%37 = tail call i64 @llvm.abs.i64(i64 %36, i1 true)
%38 = trunc i64 %37 to i16
%39 = or i16 %34, %38
%40 = zext i16 0 to i64
%41 = or i64 0, %40
%42 = tail call i64 @llvm.abs.i64(i64 %41, i1 true)
%43 = trunc i64 %42 to i16
%44 = or i16 %39, %43
%45 = zext i16 0 to i64
%46 = or i64 %45, 0
%47 = tail call i64 @llvm.abs.i64(i64 %46, i1 true)
%48 = trunc i64 %47 to i16
%49 = or i16 %44, %48
%50 = zext i16 0 to i64
%51 = or i64 0, %50
%52 = tail call i64 @llvm.abs.i64(i64 %51, i1 true)
%53 = trunc i64 %52 to i16
%54 = or i16 %49, %53
%55 = zext i16 0 to i64
%56 = or i64 0, %55
%57 = tail call i64 @llvm.abs.i64(i64 %56, i1 true)
%58 = trunc i64 %57 to i16
%59 = or i16 %54, %58
%60 = zext i16 0 to i64
%61 = or i64 %60, 0
%62 = tail call i64 @llvm.abs.i64(i64 %61, i1 true)
%63 = trunc i64 %62 to i16
%64 = or i16 %59, %63
%65 = zext i16 0 to i64
%66 = or i64 0, %65
%67 = tail call i64 @llvm.abs.i64(i64 %66, i1 true)
%68 = trunc i64 %67 to i16
%69 = or i16 %64, %68
%70 = zext i16 0 to i64
%71 = or i64 0, %70
%72 = tail call i64 @llvm.abs.i64(i64 %71, i1 true)
%73 = trunc i64 %72 to i16
%74 = or i16 %69, %73
%75 = zext i16 0 to i64
%76 = or i64 %75, 0
%77 = tail call i64 @llvm.abs.i64(i64 %76, i1 true)
%78 = trunc i64 %77 to i16
%79 = or i16 %74, %78
store i16 %79, ptr null, align 2
ret i64 0
}
; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare i64 @llvm.abs.i64(i64, i1 immarg) #0
; uselistorder directives
uselistorder ptr @llvm.abs.i64, { 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 }
attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```
Running `opt --passes='slp-vectorizer'` with LLVM 18
```llvm
source_filename = "/app/example.ll"
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128-Fn32"
target triple = "arm64-apple-macosx13.0.0"
define i64 @chpl_gen_main(i16 %0) {
%1 = insertelement <16 x i16> <i16 poison, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0>, i16 %0, i32 0
%2 = zext <16 x i16> %1 to <16 x i64>
%3 = or <16 x i64> %2, zeroinitializer
%4 = call <16 x i64> @llvm.abs.v16i64(<16 x i64> %3, i1 true)
%5 = trunc <16 x i64> %4 to <16 x i16>
%6 = call i16 @llvm.vector.reduce.or.v16i16(<16 x i16> %5)
store i16 %6, ptr null, align 2
ret i64 0
}
declare i64 @llvm.abs.i64(i64, i1 immarg) #0
declare <16 x i64> @llvm.abs.v16i64(<16 x i64>, i1 immarg) #0
declare i16 @llvm.vector.reduce.or.v16i16(<16 x i16>) #0
attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```
Running `opt --passes='slp-vectorizer'` with LLVM 19
```llvm
source_filename = "/app/example.ll"
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128-Fn32"
target triple = "arm64-apple-macosx13.0.0"
define i64 @chpl_gen_main(i16 %0) {
%1 = insertelement <16 x i16> <i16 0, i16 poison, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0, i16 0>, i16 %0, i32 1
%2 = or <16 x i16> %1, zeroinitializer
%3 = call <16 x i16> @llvm.abs.v16i16(<16 x i16> %2, i1 false)
%4 = call i16 @llvm.vector.reduce.or.v16i16(<16 x i16> %3)
store i16 %4, ptr null, align 2
ret i64 0
}
declare i64 @llvm.abs.i64(i64, i1 immarg) #0
declare <16 x i16> @llvm.abs.v16i16(<16 x i16>, i1 immarg) #0
declare i16 @llvm.vector.reduce.or.v16i16(<16 x i16>) #0
attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```
The original high-level code is written in Chapel and amounts to `(uint16)abs((int64)a - (int64)b)`, where `a` and `b` are both `uint16`. For more details on the Chapel code, see [this issue](https://github.com/chapel-lang/chapel/issues/26301).
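
To illustrate why dropping the `zext`/`trunc` pair matters, here is a minimal Python sketch that models one lane of the computation both ways. It is an illustration under my reading of the IR above, not something produced by LLVM; the helper names (`to_signed`, `scalar_lane`, `narrowed_lane`, `mismatches`) are made up for this example. After `zext`, the `i64` value is always non-negative, so `abs` is a no-op. When `abs` is done directly on `i16`, any input with the top bit set is treated as negative.

```python
MASK16 = 0xFFFF

def to_signed(value, bits):
    """Reinterpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def scalar_lane(x):
    """Models the input IR: zext i16 -> i64, abs on i64, trunc -> i16."""
    wide = x & MASK16                          # zext keeps the value in 0..65535
    return abs(to_signed(wide, 64)) & MASK16   # abs is a no-op here, then trunc

def narrowed_lane(x):
    """Models the LLVM 19 output: abs applied directly to the i16 value."""
    return abs(to_signed(x, 16)) & MASK16

def mismatches():
    """All i16 bit patterns for which the two models disagree."""
    return [x for x in range(1 << 16) if scalar_lane(x) != narrowed_lane(x)]

if __name__ == "__main__":
    print(hex(scalar_lane(0xFFFF)), hex(narrowed_lane(0xFFFF)))
    print(len(mismatches()), "inputs differ")
```

Since every other lane is the constant 0 and the lanes are combined with `or`, the value stored by the function should equal this one lane, so under this model an argument of `0xFFFF` would store `0xFFFF` with the LLVM 18 output and `1` with the LLVM 19 output.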
</pre>
alypQ4U1PIPu7kyUxy9OhQAGK4PvK9lAB8rssaxta0J_m4TYslUmRDVXcuejsmypTIhIVhLP8afmLnpDf9cEEYkgMsa1uCwSZJeeHeCdDVV8cVxZkAXeWofr6EN7iCWKx9akK55OtiR1XNcDYJTfhe6Cyrcweb2kQtXuFqWtEduWaYG5luYwtBDbpskesS0TnNDOjWf7Nd-v-ErOYE0LzumSUsJm1VoUnMKSFjnATghZFlQs84ysyl1Z5PkTm6k1IyyjjDDGKc3YgpFyycgTFZKUYi9WKCNQS6UXyZOsO8ySAGtKlxklMy13oH26HmTMwEt_8cVQ_jBz6zhpvmsPPrqi8sF_LBNU0LAe3RrGPzTHq0H_cTeYLgbZtstTTi4DP10Ezlqn1z8wZ_pT1P2aN87-AWX4bM2jPs9r9u8AAAD__xMlwo0">