<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/86966>86966</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[DSE] Missed optimization: eliminate `memset` of `alloca` if either `alloca` doesn't escape or `memset` is dead
</td>
</tr>
<tr>
<th>Labels</th>
<td>
llvm:optimizations,
missed-optimization
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
XChy
</td>
</tr>
</table>
<pre>
Alive2 proof: https://alive2.llvm.org/ce/z/Dq-Cih
### Motivating example
```llvm
define void @src(i1 %c, i1 %c1, i1 %c2, i64 %n) {
entry:
  %p = alloca [2 x i64], align 8
  call void @llvm.memset.p0.i64(ptr %p, i8 0, i64 16, i1 false)
  br i1 %c, label %then, label %else

then:                                             ; preds = %entry
  br i1 %c2, label %common.ret, label %thread

thread:                                           ; preds = %then
  store i64 %n, ptr %p, align 8
  %8 = getelementptr i8, ptr %p, i64 8
  store i64 %n, ptr %8, align 8
  br label %escape

else:                                             ; preds = %entry
  store i64 %n, ptr %p, align 8
  %11 = getelementptr i8, ptr %p, i64 8
  store i64 %n, ptr %11, align 8
  br i1 %c1, label %common.ret, label %escape

common.ret:                                       ; preds = %then, %else, %escape
  ret void

escape:                                           ; preds = %thread, %else
  call void @use(ptr %p)
  br label %common.ret
}
```
There are two final destinations to consider for `%p`.
The first is `common.ret`: on every path that reaches `common.ret` without passing through `escape`, `%p` does not escape from this function, so the `memset` is dead.
The second is `escape`: `%p` does escape there through the call to `use`, but the `memset` is still dead because every path to `escape` performs two consecutive i64 stores that together overwrite all 16 bytes written by the `memset`.
Thus the `memset` in the entry block is dead on every path.
I'm sorry that I cannot generalize it further, but one key point is that DSE does eliminate the `memset` when the two consecutive stores are replaced with a single store: https://godbolt.org/z/T45aexqM5
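A minimal sketch of the output I would expect, assuming the reasoning above holds: the same function with only the `memset` removed. The function name `@tgt` and the value names `%p.hi` and `%p.hi2` are chosen here for illustration, and `@use` is assumed to be an external function that may capture its argument.

```llvm
; Assumption: @use is an opaque external function that may capture %p.
declare void @use(ptr)

define void @tgt(i1 %c, i1 %c1, i1 %c2, i64 %n) {
entry:
  %p = alloca [2 x i64], align 8
  ; The 16-byte memset is gone: no path reads %p before both halves are stored.
  br i1 %c, label %then, label %else

then:
  ; Going straight to common.ret leaves %p unwritten, but it never escapes there.
  br i1 %c2, label %common.ret, label %thread

thread:
  store i64 %n, ptr %p, align 8            ; overwrites bytes 0..7
  %p.hi = getelementptr i8, ptr %p, i64 8
  store i64 %n, ptr %p.hi, align 8         ; overwrites bytes 8..15
  br label %escape

else:
  store i64 %n, ptr %p, align 8            ; overwrites bytes 0..7
  %p.hi2 = getelementptr i8, ptr %p, i64 8
  store i64 %n, ptr %p.hi2, align 8        ; overwrites bytes 8..15
  br i1 %c1, label %common.ret, label %escape

common.ret:
  ret void

escape:
  ; %p escapes here, but all 16 bytes were overwritten on every incoming path.
  call void @use(ptr %p)
  br label %common.ret
}
```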
### Real-world motivation
This snippet of IR is derived from [linux/drivers/gpu/drm/i915/i915_utils.c@__i915_printk](https://github.com/torvalds/linux/blob/8d025e2092e29bfd13e56c78e22af25fac83c8ec/drivers/gpu/drm/i915/i915_utils.c#L17) (after the O3 pipeline). It looks like the code that handles C's va_args.
The example above is a reduced version. If you're interested in the original suboptimal IR and the optimal IR, see also: https://godbolt.org/z/ebx1Kv87z
**Let me know if you can confirm that it's an optimization opportunity, thanks.**
</pre>