[PATCH] D40480: MemorySSA backed Dead Store Elimination.
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 7 06:43:55 PST 2017
dmgreen added a comment.
OK. I have some performance numbers. I'm compiling clang ("ninja clang") and using
-ftime-report/-stat to get info (with some extra precision for decimal places) and
summing the results for all the compiled files. The total runtime is a little noisy on
this machine, but these sub-numbers seem pretty stable between runs.
Firstly the good news. With this version we now remove more dead stores.
Old: 41310 New: 51660
With my "MemSSA can enable us to remove more stores" hat on, this is good stuff.
Some more good news is that DSE is now quicker, for the sum of time for each file:
Old: ~26s New: ~19s
The bad news is that we also need to add in the MemorySSA passes. I think we now
calc this twice in the pipeline, not once as before, so times roughly double.
Old: ~35s New: ~69s
I'm hoping that in the long run we can share the cost of this between other passes.
NewGVN is a couple of hops earlier in the LTO pass pipeline, and LICM is also quite close
in the normal one. Hopefully this cost can be shared out.
The other bad news is we use a post-dom tree (again, maybe sharable?):
Old: ~15s New: ~27s
But Memdeps is somehow now quicker:
Old: ~13s New: ~8.5s
The total runtime here was on the order of 10000s, so it's hard to pick out the overall
cost exactly. These results suggest that the total is now ~30s more, and excluding
MemSSA we are at roughly the same time.
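As a quick sanity check on those totals, here is a small sketch that just sums the per-pass figures quoted above. The dictionary keys are my own labels for the passes, and the inputs are the rounded "~" numbers, so the output is only as precise as they are.

```python
# Per-pass times in seconds as (old, new), copied from the rounded figures above.
# The key names are illustrative labels, not LLVM pass names.
timings = {
    "DSE": (26.0, 19.0),
    "MemorySSA": (35.0, 69.0),
    "PostDomTree": (15.0, 27.0),
    "MemDep": (13.0, 8.5),
}

# Change in time per pass (positive means the new version is slower).
deltas = {name: new - old for name, (old, new) in timings.items()}

total_delta = sum(deltas.values())
delta_without_memssa = total_delta - deltas["MemorySSA"]

# Dead stores removed, old vs new.
stores_old, stores_new = 41310, 51660
extra_stores_pct = 100.0 * (stores_new - stores_old) / stores_old

print(f"total change: {total_delta:+.1f}s")
print(f"change excluding MemorySSA: {delta_without_memssa:+.1f}s")
print(f"extra dead stores removed: {extra_stores_pct:.1f}%")
```

With these inputs the sum comes out a little above 30s overall and close to zero once MemorySSA is excluded, which is in line with the estimate above, and the store count is up by about a quarter.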
I'm going to try and take a look at the most costly files and see if we can knock the most
expensive ones down without making the total slower. As Daniel mentioned, there are some
good candidates for caching the results here, like those in isOverlap.
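To illustrate the kind of caching meant here, the following is a minimal sketch of memoizing a pairwise overlap query. It is not the isOverlap from the patch: `ranges_overlap` and its (start, size) arguments are hypothetical stand-ins, and the real query works on LLVM memory locations rather than plain integers.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ranges_overlap(a_start, a_size, b_start, b_size):
    """Hypothetical stand-in for an overlap query on two byte ranges.

    Each range is the half-open interval [start, start + size). Results are
    memoized, so repeating a query with the same arguments costs only a
    dictionary lookup.
    """
    return a_start < b_start + b_size and b_start < a_start + a_size
```

This only pays off if the same pairs are queried repeatedly and the inputs cannot change between queries; otherwise the cache needs invalidating, which is where most of the real complexity would be.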
Maths isn't on my side for making the whole thing quicker. But it removes more dead stores :)
https://reviews.llvm.org/D40480