<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/72113>72113</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Lack of dead store elimination when only middle part of a memset is overwritten
</td>
</tr>
<tr>
<th>Labels</th>
<td>
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
bjope
</td>
</tr>
</table>
<pre>
Here is a simplified example showing the lack of dead store elimination when the middle part of a memset is overwritten. The original scenario seen in user code involved a struct that was initialized to zero, after which some struct members were initialized to other values using subsequent stores:
```
#include <string.h>

void overwrite_start(char X[1000]) {
  memset(X, 0, 1000);
  memset(&X[0], 1, 200);
}

void overwrite_end(char X[1000]) {
  memset(X, 0, 1000);
  memset(&X[800], 1, 200);
}

void overwrite_middle(char X[1000]) {
  memset(X, 0, 1000);
  memset(&X[10], 1, 980);
}
```
Here is the above example using godbolt: https://godbolt.org/z/bc4xa8ahb
LLVM/clang (specifically the DeadStoreElimination pass) optimizes the overwrite_start and overwrite_end functions by adjusting the first memset, given the knowledge that it is partially dead: the start/end of the memset region is overwritten by later stores/memsets.
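For illustration, the shortening described above is roughly equivalent to the hand-written source below. This is only a sketch: the `_opt` function names are made up here, and the exact offsets the pass picks may differ (for example to keep the shortened memset aligned).
```c
#include <string.h>

/* Hypothetical hand-written equivalent of the shortened first memset.
   The start of the zeroed region is dead, so only [200, 1000) is zeroed. */
void overwrite_start_opt(char X[1000]) {
  memset(&X[200], 0, 800);
  memset(&X[0], 1, 200);
}

/* The end of the zeroed region is dead, so only [0, 800) is zeroed. */
void overwrite_end_opt(char X[1000]) {
  memset(X, 0, 800);
  memset(&X[800], 1, 200);
}
```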
But for the third function, where only a middle part of the memset is dead, there is no optimization and the overlapping chars are written twice. I wonder whether such an optimization would be worthwhile. And if so, is DeadStoreElimination the correct place?
One thing that complicates things a bit is that one would need to split the first memset into two memsets. So it is not as easy as just adjusting the start/size arguments of the already existing memset.
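As a sketch of what the split could look like at the source level for overwrite_middle (the function name `overwrite_middle_split` is made up for illustration, and this ignores any alignment or profitability concerns):
```c
#include <string.h>

/* Hypothetical result of splitting the first memset around the dead
   middle part: only the 10 leading and 10 trailing bytes are zeroed. */
void overwrite_middle_split(char X[1000]) {
  memset(X, 0, 10);        /* head: bytes [0, 10) */
  memset(&X[990], 0, 10);  /* tail: bytes [990, 1000) */
  memset(&X[10], 1, 980);  /* unchanged second memset: bytes [10, 990) */
}
```
The final buffer contents are the same as in overwrite_middle, but the 980 bytes in the middle are written once instead of twice.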
Some heuristics would be needed to decide when it is profitable to do the transformation, and there could be lots of parameters to consider in such heuristics. For example: the size of the overlap (i.e. the number of bytes that we do not need to write twice), the size of the new memsets, whether it is better to write a few extra bytes so that the new memsets are aligned and write a multiple of N words, etc. Some of those things could probably be a bit target specific.
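To make the parameters a bit more concrete, here is a minimal sketch of the bookkeeping such a heuristic might start from. Everything in it is an assumption for illustration: `split_plan`, `plan_middle_split` and the `min_saved` threshold are not LLVM APIs, and alignment and target-specific costs are left out entirely.
```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of how a store covering [0, len) would be split
   around a later store covering [kill_off, kill_off + kill_len). */
struct split_plan {
  size_t head_len; /* bytes kept before the overwritten range */
  size_t tail_off; /* offset where the kept tail begins */
  size_t tail_len; /* bytes kept after the overwritten range */
  size_t saved;    /* bytes that no longer get written twice */
};

/* Returns true if the later store kills a strict middle part of the first
   store and at least min_saved bytes (an arbitrary threshold) are saved. */
static bool plan_middle_split(size_t len, size_t kill_off, size_t kill_len,
                              size_t min_saved, struct split_plan *out) {
  /* Reject overwrites that touch the start or the end of the first store;
     those cases only need the existing memset to be shortened. */
  if (kill_off == 0 || kill_len == 0 || kill_off >= len ||
      kill_len >= len - kill_off)
    return false;

  out->head_len = kill_off;
  out->tail_off = kill_off + kill_len;
  out->tail_len = len - out->tail_off;
  out->saved = kill_len;
  return out->saved >= min_saved;
}
```
For the overwrite_middle example, `plan_middle_split(1000, 10, 980, 64, &plan)` would describe a 10-byte head, a 10-byte tail starting at offset 990, and 980 saved bytes.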
It would perhaps be nice if one could just annotate the memset, saying that the part starting at offset X with size Y is in fact dead. Then the backend (or other optimization passes) could utilize that information when lowering the memset.
</pre>