[PATCH] D109280: [WIP][DSE] Memory intrinsics like memset, memcpy, memmove are removed if they are overwritten by a store in a loop

Wed Feb 2 20:46:09 PST 2022

vdsered added a comment.

In D109280#3291535 <https://reviews.llvm.org/D109280#3291535>, @fhahn wrote:

> @vdsered  are you still planning on pushing this forward?
>
> In D109280#3043563 <https://reviews.llvm.org/D109280#3043563>, @nikic wrote:
>
>> In D109280#3043179 <https://reviews.llvm.org/D109280#3043179>, @xbolva00 wrote:
>>
>>> I think this would have nontrivial impact on compile time (@nikic ?) and the results from testsuite do not look so promising - I would expect more “hits” to justify (small if possible) compile time regression.
>>
>> Yes, this has a large negative effect: https://llvm-compile-time-tracker.com/compare.php?from=64eaffb613d0cb7fa7542fa48281a2e617ad8ee9&to=6e7452757e16c4260fa9a5862761a68ed778dbf9&stat=instructions
>
> Agreed that the compile-time impact looks very large, but I *think* it may not be as bad as it looks at the moment. There seems to be a bug in the code that means we effectively run the main DSE loop twice if `EnableOptimizationAcrossLoops = false` (as in this patch).
>
> Thee patch unconditionally runs the whole main DSE loop again, as below. `prepareStateForAcrossLoopOptimization` has en early exit if `!EnableOptimizationAcrossLoops`. Otherwise clears `MemDeps` and roughly only adds MemDeps with loop ranges. So as a consequence, if `!EnableOptimizationAcrossLoops` we process *all* MemoryDeps again.
>
>   MadeChange |= State.prepareStateForAcrossLoopOptimization();
>   MadeChange |= runDSEOptimizationLoop(State);
>
> I tried to rebase the patches and collect compile-time numbers with that issue fixed: https://llvm-compile-time-tracker.com/compare.php?from=f622c7b7d33b211517d8fe4f725d1028d786fc08&to=00a8810b123e606c19d9926d11183318323a8752&stat=instructions
>
> NewPM-O3: +0.24%
> NewPM-ReleaseThinLTO: +0.39%
> NewPM-ReleaseLTO-g: +0.26%
>
> We should still see if we can get those down further, but I think they look more encouraging to start with.

@fhahn Yes, I do

Thank you for this analysis.

I think there are potential enhancements. For example, loop transformations shouldn't probably be done in this pass. It'd be probably better move this whole optimization into a specialized pass and run it closer to other loop passes in the default pipeline because loops'd be in the right form for free or loop deletion would remove loops that become empty after DSE and so on. However, I'm still not sure if this is a better idea than implementing it right here in DSE.

Plus, it'd be good to see how this behaves on larger projects (LLVM itself and so on).

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109280/new/

https://reviews.llvm.org/D109280