[PATCH] MachineCSE: Add a target query for the LookAheadLimit heuristic

Mon May 4 11:17:12 PDT 2015

In http://reviews.llvm.org/D9472#165425, @MatzeB wrote:

> Hi Tom,
>
> I don't know the history here, but since this scans forward for each instruction of the basic block, it looks like a way to avoid quadratic runtime behavior in (corner) cases with thousands of instructions in a basic block. I think it is no problem to go to a much higher limit than 5. But why go completely unbounded? Do you need a guarantee here that the CSE is happening?

Yes, I would like a guarantee that CSE is happening. AMD GPUs have a control register (m0) that is used to clamp memory addresses to avoid out-of-bounds reads and writes. Before each load/store instruction, we emit: s_mov_b32 m0, -1 (-1 disables address clamping) and then rely on MachineCSE to eliminate all the unnecessary moves.
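
To illustrate why a small limit is a problem here, below is a toy model of a bounded forward scan. It is only a sketch and not the MachineCSE code: Inst, defReaches, and eliminateRedundantDefs are made-up names, and the instructions are plain strings. It assumes a repeated physical-register def can be dropped only if the scan proves, within `limit` intervening instructions, that nothing else wrote the register.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction (hypothetical): its text plus the physical register it
// defines, or an empty string if it defines none.
struct Inst {
    std::string text;
    std::string physDef;
};

// Returns true if the def at index `from` provably reaches index `to`.
// Gives up (returns false) once more than `limit` instructions were scanned.
bool defReaches(const std::vector<Inst>& block, std::size_t from,
                std::size_t to, std::size_t limit) {
    std::size_t scanned = 0;
    for (std::size_t i = from + 1; i < to; ++i) {
        if (++scanned > limit)
            return false;  // limit exceeded: conservatively assume no
        if (block[i].physDef == block[from].physDef)
            return false;  // another write to the register intervenes
    }
    return true;
}

// Drops an instruction when an identical earlier one is still in effect.
std::vector<Inst> eliminateRedundantDefs(const std::vector<Inst>& block,
                                         std::size_t limit) {
    std::vector<Inst> out;
    for (const Inst& inst : block) {
        bool redundant = false;
        if (!inst.physDef.empty()) {
            // Find the most recent identical instruction that was kept.
            for (std::size_t j = out.size(); j-- > 0;) {
                if (out[j].text == inst.text) {
                    redundant = defReaches(out, j, out.size(), limit);
                    break;
                }
            }
        }
        if (!redundant)
            out.push_back(inst);
    }
    return out;
}
```

In this model, with a limit of 5, a second s_mov_b32 m0, -1 that follows the first by six or more unrelated instructions is kept even though it is redundant. With a large enough limit it is removed.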

REPOSITORY
  rL LLVM

http://reviews.llvm.org/D9472