<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/54757>54757</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
InstCombine causes induction variable to get undef dbg value
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
dcdelia
</td>
</tr>
</table>
<pre>
In this code, the induction variable `j` is not available when debugging at line 5, in the body of the loop. The cause is InstCombinePass, which runs before the loop optimizations. The pass replaces the i32 IR value for `j` with a boolean i1 and rewrites the phi node so that the branch uses it directly instead of an icmp. However, the corresponding llvm.dbg.value ends up with `undef` as its argument.
We tested the code at -Og/O1/O2/O3 and the issue is present at all of them: the generated ASM and DWARF information are identical. Is there any way the debug metadata could be amended here?
At the source level, it should be possible to step onto the assignment to variable `a` exactly once, with `i=0` and `j=0` displayed as locals. Although its pipeline and transformations are completely different, recent gcc versions at -O2 emit ASM identical to clang's but with complete DWARF info, using `DW_AT_location` for `j` to distinguish two instruction ranges for its value.
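To make the idea of two instruction ranges concrete, here is a minimal C sketch of how a debugger could resolve `j` from a range-based location description. It is only an illustration: the struct `loc_entry`, the table `j_loclist` and the function `lookup_j` are hypothetical names, and the ranges and values in the table are assumptions derived from the instruction addresses in the ASM listing of this report, not taken from actual gcc output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a simplified location list: the variable holds
   'value' while low_pc <= pc < high_pc. */
struct loc_entry {
    uint64_t low_pc;
    uint64_t high_pc;
    int value;
};

/* HYPOTHETICAL ranges for j (assumed, not copied from gcc output):
   j is 0 up to and including the store to a, and 1 afterwards. */
static const struct loc_entry j_loclist[] = {
    { 0x400480, 0x40048b, 0 }, /* movq $0x0, a  */
    { 0x40048b, 0x40048e, 1 }, /* xor; retq     */
};

/* Returns 1 and writes the value to *out if some entry covers pc,
   otherwise returns 0 and leaves *out untouched. */
static int lookup_j(uint64_t pc, int *out) {
    assert(out != NULL);
    for (size_t k = 0; k < sizeof j_loclist / sizeof j_loclist[0]; k++) {
        if (pc >= j_loclist[k].low_pc && pc < j_loclist[k].high_pc) {
            *out = j_loclist[k].value;
            return 1;
        }
    }
    return 0;
}
```

Under these assumed ranges, a breakpoint at 0x400480 (where lldb stops in the trace of this report) would show `j = 0` instead of reporting no location.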
We tested clang and lldb 14.0.0 commit 116dc70 on x64.
```
$ cat a.c
long a;
int main() {
  int i, j, print_hash_value = 0;
  for (i = j = 0; j < 1; j++)
    a = (i)*j;
}
```
LLDB trace:
```
$ clang -O1 -g a.c -o opt
$ lldb opt
(lldb) target create "opt"
Current executable set to '/home/stepping/288/reduce/opt' (x86_64).
(lldb) b main
Breakpoint 1: where = opt`main at a.c:5:7, address = 0x0000000000400480
(lldb) r
Process 28 launched: '/home/stepping/reduce/opt' (x86_64)
Process 28 stopped
* thread #1, name = 'opt', stop reason = breakpoint 1.1
frame #0: 0x0000000000400480 opt`main at a.c:5:7
2 int main() {
3 int i, j, print_hash_value = 0;
4 for (i = j = 0; j < 1; j++)
-> 5 a = (i)*j;
6 }
(lldb) frame var
(int) print_hash_value = 0
(int) i = 0
(int) j = <no location, value may have been optimized out>
```
ASM at -O1 (same for every opt-level):
```
0000000000400480 <main>:
400480: 48 c7 05 a5 0b 20 00 movq $0x0,0x200ba5(%rip) # 601030 <a>
400487: 00 00 00 00
40048b: 31 c0 xor %eax,%eax
40048d: c3 retq
```
With `-opt-bisect-limit` we found that the pass introducing the issue is InstCombine. The later loop optimizations then remove the loop entirely, leaving `main` with a single store of zero to variable `a`. By that point `j` has already been lost because of InstCombine.
IR before InstCombine:
```
define dso_local i32 @main() local_unnamed_addr #0 !dbg !11 {
call void @llvm.dbg.value(metadata i32 0, metadata !18, metadata !DIExpression()), !dbg !19
call void @llvm.dbg.value(metadata i32 0, metadata !17, metadata !DIExpression()), !dbg !19
call void @llvm.dbg.value(metadata i32 0, metadata !16, metadata !DIExpression()), !dbg !19
br label %1, !dbg !20
1: ; preds = %4, %0
%2 = phi i32 [ 0, %0 ], [ 1, %4 ], !dbg !22
call void @llvm.dbg.value(metadata i32 %2, metadata !17, metadata !DIExpression()), !dbg !19
%3 = icmp slt i32 %2, 1, !dbg !23
br i1 %3, label %4, label %5, !dbg !25
4: ; preds = %1
store i64 0, i64* @a, align 8, !dbg !26, !tbaa !27
call void @llvm.dbg.value(metadata i32 1, metadata !17, metadata !DIExpression()), !dbg !19
br label %1, !dbg !31, !llvm.loop !32
5: ; preds = %1
ret i32 0, !dbg !36
}
```
IR after InstCombine:
```
define dso_local i32 @main() local_unnamed_addr #0 !dbg !11 {
call void @llvm.dbg.value(metadata i32 0, metadata !18, metadata !DIExpression()), !dbg !19
call void @llvm.dbg.value(metadata i32 0, metadata !17, metadata !DIExpression()), !dbg !19
call void @llvm.dbg.value(metadata i32 0, metadata !16, metadata !DIExpression()), !dbg !19
br label %1, !dbg !20
1: ; preds = %3, %0
%2 = phi i1 [ false, %3 ], [ true, %0 ], !dbg !22
call void @llvm.dbg.value(metadata i32 undef, metadata !17, metadata !DIExpression()), !dbg !19
br i1 %2, label %3, label %4, !dbg !23
3: ; preds = %1
store i64 0, i64* @a, align 8, !dbg !24, !tbaa !26
call void @llvm.dbg.value(metadata i32 1, metadata !17, metadata !DIExpression()), !dbg !19
br label %1, !dbg !30, !llvm.loop !31
4: ; preds = %1
ret i32 0, !dbg !35
}
```
As a result, the `dbg.value` associated with `j` at the top of the loop is set to `undef`, which makes the variable's value unavailable during debugging.
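The two phi nodes can be modeled in plain C to see that the value of `j` is still derivable from the new i1 phi in this loop. This is a sketch under stated assumptions, not LLVM code: the names `j_before`, `cond_after`, `j_recovered` and `model_agrees` are hypothetical, and the mapping "j is the negation of the i1 value" is an observation about this specific IR, not a description of what InstCombine or any debug-info salvaging logic does.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the i32 phi before InstCombine:
     %2 = phi i32 [ 0, %0 ], [ 1, %4 ]
   visit is 0 when block %1 is reached from the entry block and
   1 when it is reached from the loop body (the back edge). */
static int j_before(int visit) {
    assert(visit == 0 || visit == 1);
    return visit == 0 ? 0 : 1;
}

/* Model of the i1 phi after InstCombine:
     %2 = phi i1 [ false, %3 ], [ true, %0 ] */
static bool cond_after(int visit) {
    return visit == 0;
}

/* Assumption for illustration: j recomputed from the i1 phi as its
   logical negation, widened to int. */
static int j_recovered(int visit) {
    return cond_after(visit) ? 0 : 1;
}

/* Returns 1 if, on both visits of block %1, the i1 phi equals the
   old comparison (j < 1) and the recomputed j equals the old j. */
static int model_agrees(void) {
    for (int visit = 0; visit < 2; visit++) {
        if (cond_after(visit) != (j_before(visit) < 1)) return 0;
        if (j_recovered(visit) != j_before(visit)) return 0;
    }
    return 1;
}
```

In this model the information needed to describe `j` survives the transformation, which is why an `undef` operand looks avoidable here.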
Final IR from `opt -O1`:
```
define dso_local i32 @main() local_unnamed_addr #0 !dbg !11 {
call void @llvm.dbg.value(metadata i32 0, metadata !18, metadata !DIExpression()), !dbg !19
call void @llvm.dbg.value(metadata i32 0, metadata !17, metadata !DIExpression()), !dbg !19
call void @llvm.dbg.value(metadata i32 0, metadata !16, metadata !DIExpression()), !dbg !19
call void @llvm.dbg.value(metadata i32 undef, metadata !17, metadata !DIExpression()), !dbg !19
store i64 0, i64* @a, align 8, !dbg !20, !tbaa !23
call void @llvm.dbg.value(metadata i32 1, metadata !17, metadata !DIExpression()), !dbg !19
call void @llvm.dbg.value(metadata i32 undef, metadata !17, metadata !DIExpression()), !dbg !19
ret i32 0, !dbg !27
}
```
</pre>