[PATCH] MachineCSE: Add a target query for the LookAheadLimit heuristic
Tom Stellard
thomas.stellard at amd.com
Mon May 4 11:17:12 PDT 2015
In http://reviews.llvm.org/D9472#165425, @MatzeB wrote:
> Hi Tom,
>
> I don't know the history here, but as this does scan forward for each instruction of the basic block it looks like a way to avoid quadratic runtime behavior for (corner) cases with thousands of instructions in a basic block. I think it is no problem to go to a much higher limit than 5. But why go completely boundless, do you need a guarantee here that the CSE is happening?
Yes, I would like a guarantee that CSE is happening. On AMD GPUs there is a control register (m0) that is used to clamp memory addresses to avoid out-of-bounds reads and writes. Before each load/store instruction we emit s_mov_b32 m0, -1 (-1 disables address clamping), and then rely on MachineCSE to eliminate all the unnecessary moves.
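
To illustrate why a small look-ahead limit is a problem for this pattern, here is a toy model in Python. It is only a sketch under stated assumptions, not MachineCSE's actual algorithm: the function name, the look_ahead_limit parameter, and the string representation of instructions are all hypothetical, and the instruction sequences are made up for illustration. The model drops a repeated s_mov_b32 m0, -1 only if the previous one is still valid and lies within the limit.

```python
MOV_M0 = "s_mov_b32 m0, -1"


def writes_m0(inst):
    """Return True if the first operand (assumed to be the destination) is m0."""
    parts = inst.split(None, 1)
    if len(parts) < 2:
        return False  # instruction without operands
    return parts[1].split(",")[0].strip() == "m0"


def eliminate_redundant_m0_moves(block, look_ahead_limit=None):
    """Toy model: remove repeated MOV_M0 instructions from one basic block.

    look_ahead_limit is a hypothetical stand-in for the heuristic discussed
    above; None means "no limit". A repeated move is dropped only when the
    previous MOV_M0 has not been clobbered and at most look_ahead_limit
    other instructions separate the two.
    """
    kept = []
    since_def = None  # instructions since the last valid MOV_M0, or None
    for inst in block:
        if inst == MOV_M0:
            reachable = since_def is not None and (
                look_ahead_limit is None or since_def <= look_ahead_limit
            )
            if reachable:
                continue  # redundant: m0 already holds -1
            kept.append(inst)
            since_def = 0
            continue
        if writes_m0(inst):
            since_def = None  # m0 was overwritten; the earlier move is dead
        elif since_def is not None:
            since_def += 1
        kept.append(inst)
    return kept


# Two loads separated by ten unrelated ALU instructions (illustrative only).
block = (
    [MOV_M0, "ds_read_b32 v0, v1"]
    + ["v_add_f32 v0, v0, v0"] * 10
    + [MOV_M0, "ds_read_b32 v2, v3"]
)

unbounded = eliminate_redundant_m0_moves(block)
bounded = eliminate_redundant_m0_moves(block, look_ahead_limit=5)
print(unbounded.count(MOV_M0), bounded.count(MOV_M0))
```

In this model the unbounded run keeps a single move, while the run with a limit of 5 gives up before reaching the second move and leaves it in place. That is the behavior a target would want to avoid when it depends on the elimination always happening.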
REPOSITORY
rL LLVM
http://reviews.llvm.org/D9472