[LLVMdev] anti-dependency breaking and mask/shift dependencies

Wed Jun 13 12:31:03 PDT 2012

On Mon, 11 Jun 2012 11:07:10 -0500
Hal Finkel <hfinkel at anl.gov> wrote:

> On Mon, 11 Jun 2012 08:56:21 -0700
> Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
> 
> > 
> > On Jun 11, 2012, at 8:07 AM, Hal Finkel wrote:
> > 
> > > Also, I think the following might work well: we add a special
> > > kind of register dependency called a 'remembered' register. This
> > > is not a real dependency, meaning that the instruction does not
> > > actually read or write the register, but it means that if
> > > the register allocator (or anything else) swaps the referenced
> > > register for another one (or a new virtual register), then the
> > > 'remembered' register needs to be swapped as well. Using this I
> > > can create a late-expanded pseudo which represents the necessary
> > > mask/shift operation. This operation has real read/write
> > > dependencies on the GPRs being used, but also needs to 'remember'
> > > from which cr the input originally came. On the other hand, this
> > > might create a bunch of dead-register-dependency special cases in
> > > CodeGen which would not be worth the effort. What do you think?
> > 
> > I am not sure I follow completely, but it sounds like it would be
> > quite fragile?
> 
> I think you're right, I retract my suggestion.

I think that I changed my mind again ;) -- Let me explain more
concretely:

The problem is a code sequence such as:
crX = compare gprA, gprB
gprC = move_all_crs_to_gpr
gprD = shift_and_mask_crX gprC

In the current implementation, we can form the shift_and_mask_crX (with
all of its immediate constants) at lowering time because we fix crX (to
cr7 specifically). It would be better to be able to let the register
allocator assign crX to any available condition register. The problem
is just that in order to correctly form shift_and_mask_crX, we need to
know which physical cr was chosen.
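
To illustrate why the immediate constants depend on the physical cr, here
is a minimal sketch. It assumes the PowerPC layout, in which the 32-bit
condition register holds eight 4-bit fields with cr0 in the most
significant position, and it assumes that move_all_crs_to_gpr copies that
whole register into a GPR. The function names are hypothetical and are not
part of LLVM.

```python
# Sketch only: assumes eight 4-bit CR fields, cr0 most significant.
CR_FIELD_BITS = 4
NUM_CR_FIELDS = 8


def shift_and_mask_constants(cr_index):
    """Return the (shift, mask) pair that isolates field cr_index."""
    if not 0 <= cr_index < NUM_CR_FIELDS:
        raise ValueError("cr_index must be in the range 0..7")
    # cr7 sits in the low 4 bits, so it needs no shift; cr0 needs 28.
    shift = CR_FIELD_BITS * (NUM_CR_FIELDS - 1 - cr_index)
    mask = (1 << CR_FIELD_BITS) - 1
    return shift, mask


def extract_cr_field(gpr_value, cr_index):
    """Model shift_and_mask_crX applied to a GPR copy of the CR."""
    shift, mask = shift_and_mask_constants(cr_index)
    return (gpr_value >> shift) & mask
```

With crX fixed to cr7 the shift is always 0, which is why the constants
can be chosen at lowering time today. Any other assignment changes the
shift, so the constants cannot be chosen until allocation is done.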

One possible "solution" is to make shift_and_mask_crX a pseudo
instruction:
gprD = shift_and_mask_pseudo crX, gprC

But this is less than optimal because in this sequence:
crX = compare gprA, gprB
gprC = move_all_crs_to_gpr
(*)
gprD = shift_and_mask_pseudo crX, gprC

at (*), which could be arbitrarily long, the register allocator would
keep the physical register assigned to crX live (because the
shift_and_mask_pseudo depends on it). But during (*) we don't need the
value of the cr (it has already been copied into gprC); we just need to
know the identity of the physical register assigned to crX.

What I had proposed above was to allow tagging the pseudo's dependence
on crX as only some kind of "remembered" dependence so that the live
interval of crX would actually end at the move_all_crs_to_gpr
instruction even though the shift_and_mask_pseudo would get to see the
correct physical register assignment.
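
The effect on the live interval can be sketched with a toy model. This is
not LLVM's LiveIntervals analysis: it assumes straight-line code with a
single definition per register, it treats move_all_crs_to_gpr as an
ordinary use of crX, and the "remembered" operand list is a hypothetical
representation of the proposal.

```python
def build_sequence(remembered):
    """Build the example sequence; 'unrelated' stands in for (*)."""
    pseudo = {"op": "shift_and_mask_pseudo", "defs": ["gprD"], "uses": ["gprC"]}
    # Attach crX either as a real use or as an identity-only reference.
    key = "remembered" if remembered else "uses"
    pseudo.setdefault(key, []).append("crX")
    return [
        {"op": "compare", "defs": ["crX"], "uses": ["gprA", "gprB"]},
        {"op": "move_all_crs_to_gpr", "defs": ["gprC"], "uses": ["crX"]},
        {"op": "unrelated", "defs": ["gprE"], "uses": ["gprA"]},
        pseudo,
    ]


def live_interval(instrs, reg):
    """Return (def index, last use index) for reg.

    Only 'defs' and 'uses' are consulted, so 'remembered' operands never
    extend the interval.
    """
    start = end = None
    for i, ins in enumerate(instrs):
        if start is None and reg in ins.get("defs", ()):
            start = end = i
        elif start is not None and reg in ins.get("uses", ()):
            end = i
    return start, end
```

In this model, live_interval(build_sequence(False), "crX") extends to the
pseudo at index 3, while live_interval(build_sequence(True), "crX") ends at
the move at index 1, and the pseudo still names crX in its "remembered"
list.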

Thanks again,
Hal

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory