[LLVMdev] wide memory accesses

Jonas Paulsson jnspaulsson at hotmail.com
Tue May 10 08:17:02 PDT 2011


Great answer, now it works. However, I immediately ran into the next problem:

I insert the copies from the 32-bit load, like this:

    %reg16507<def> = ld %r4
    %reg16457<def> = COPY %reg16507:lo16
    %reg16443<def> = COPY %reg16507:hi16
    %reg16510<def> = ld %r5
    %reg16458<def> = COPY %reg16510:lo16
    %reg16444<def> = COPY %reg16510:hi16
    %reg16468<def> = add %reg16457, %reg16458 ; 16457 is regclass high-part 

I make two loads and copy the high and low parts to 16-bit registers for use, as you recommended.
The final instruction in the list is an addition, which however is constrained to use the high subregister (%reg16457). The low subregister is copied to a virtual register with the hi16 regclass, but this gets coalesced to the wrong regclass.

The problem is that when I follow the comment for getMatchingSuperRegClass() as best I can, I simply return A, as it is a proper register class
containing all registers in B. B is the regclass containing only the high-part subregisters of the registers in A. So, if A has Reg32_1, B has Reg16_1_hi, and so on.

So, when I return A, the COPY gets coalesced, but the register class for the new interval becomes the 32-bit one, which gives an error for the addition instruction.

I would like to ask what this method should actually do, in case I am misunderstanding it. If I am not, what register classes should I use to make this work?

My understanding is: A={32 bit regs}, B={high parts of the registers in A}, so if called with (A,B,:hi), return A.
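
One way to pin down that reading is a toy model outside LLVM. The sketch below is not the LLVM API: Reg, RegClass, SubRegTable, toySubRegs and matchingSuperRegs are hypothetical names, and the two-register target is made up. It assumes the contract is "return the registers of A whose subregister at the given index is a member of B", computed by brute force over explicit sets.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Toy stand-ins for illustration only; these are NOT LLVM types.
using Reg = std::string;
using RegClass = std::set<Reg>;
enum SubIdx { lo16, hi16 };

// (super register, subregister index) -> subregister name.
using SubRegTable = std::map<std::pair<Reg, SubIdx>, Reg>;

// Hypothetical target with two 32-bit registers, each split into two halves.
SubRegTable toySubRegs() {
  return {
      {{"Reg32_0", lo16}, "Reg16_0_lo"},
      {{"Reg32_0", hi16}, "Reg16_0_hi"},
      {{"Reg32_1", lo16}, "Reg16_1_lo"},
      {{"Reg32_1", hi16}, "Reg16_1_hi"},
  };
}

// Assumed contract: collect every register R in A such that the
// subregister R:Idx exists and belongs to B.
RegClass matchingSuperRegs(const RegClass &A, const RegClass &B, SubIdx Idx,
                           const SubRegTable &Sub) {
  RegClass Result;
  for (const Reg &R : A) {
    auto It = Sub.find(std::make_pair(R, Idx));
    if (It != Sub.end() && B.count(It->second) != 0)
      Result.insert(R);
  }
  return Result;
}
```

With A = {Reg32_0, Reg32_1} and B = {Reg16_0_hi, Reg16_1_hi}, the model returns A for (A, B, hi16), matching the reading above, and an empty set for (A, B, lo16), since no low half is a member of B. Whether that second case is what matters for the lo16 COPY in the listing is a guess, not something the model establishes.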

Is there something else missing here?

Thanks,

Jonas

> Subject: Re: [LLVMdev] wide memory accesses
> From: stoklund at 2pi.dk
> Date: Mon, 9 May 2011 09:51:37 -0700
> CC: llvmdev at cs.uiuc.edu
> To: jnspaulsson at hotmail.com
> 
> 
> On May 9, 2011, at 9:00 AM, Jonas Paulsson wrote:
> 
> > Hi,
> > 
> > I am trying to take 16 bit memory reads and combine them to a single 32 bit read. I am having trouble making the code simply read 32 bits and then use the subregisters accordingly, without unnecessary copying.
> > 
> > I have tried two techniques, in the MachineFunction:
> > 
> > 1. Replace the MachineOperands in the users of the data with the new register/subregister index. This yields an assert failure during VirtRegRewriter, in substPhysReg: "Invalid SubReg for physical register", after the two-address pass rewrote this:
> > 
> > %reg16445<def> = add %reg16507:hi16, %reg16510:hi16 ; 32bit:16507,16510, 16bit: 16445  
> >   prepend:    %reg16445<def> = COPY %reg16507; 
> >   rewrite to:    %reg16445<def> = addh_1_8 %reg16445:hi16, %reg16510:hi16
> > 
> > In my eyes, there should not have been a subreg 'hi16' for the 16445 reg - this reg is 16 bits. I would have wished that the 16507:hi16 be interpreted as the corresponding subregister, and thus generated in the COPY with a :hi16. It is all right that the 16445 is of 16 bits, this is correct, but then it is used incorrectly with a :hi16 subregister value ?? Any ideas?
> 
> If you are doing this before the register allocator passes, you must make sure that the code preserves SSA form. That can be difficult to do when dealing with sub-registers. Other targets just emit EXTRACT_SUBREG / INSERT_SUBREG and let the coalescer deal with it.
> 
> It looks like TwoAddressInstructionPass is not ready to deal with subreg indexes either; it is creating wrong code in your example.
> 
> I recommend:
> 
> > 2. Insert COPY's, but these would not get coalesced away, so instead of saving instructions I ended up with one load and two moves...:-( How could I get the wide load to simply be used intelligently by COPYing to the old virtual registers?
> 
> You need to implement getMatchingSuperRegClass in your register info class. That will give the coalescer the needed information to join subreg copies.
> 
> /jakob
> 
 		 	   		  