[Libclc-dev] [PATCH v3 1/1] R600: Add new intrinsic to read work dimensions

Jeroen Ketema j.ketema at imperial.ac.uk
Sun Aug 10 15:04:40 PDT 2014


On 10 Aug 2014, at 22:42, Jan Vesely <jan.vesely at rutgers.edu> wrote:

> On Thu, 2014-08-07 at 17:23 +0100, Jeroen Ketema wrote:
>> I think Jan posted this to the wrong list: libclc instead of llvm-dev?
>> Maybe post the updated patch to llvm-dev?
> 
> you're right, sorry about that. I'll be more careful with v4.

No problem, it’s mostly that you get maximum exposure there,
although I guess that most people that care also read this list.

Jeroen


> 
> jan
> 
>> 
>> Jeroen
>> 
>> On 07 Aug 2014, at 17:11, Tom Stellard <tom at stellard.net> wrote:
>> 
>>> On Thu, Aug 07, 2014 at 12:01:00PM -0400, Jan Vesely wrote:
>>>> On Wed, 2014-08-06 at 15:21 -0700, Matt Arsenault wrote:
>>>>> On 08/06/2014 03:08 PM, Jan Vesely wrote:
>>>>>> +  case Intrinsic::r600_read_workdim: {
>>>>>> +    const size_t arg_size = DAG.getMachineFunction().getFunction()->arg_size();
>>>>> arg_size() returns the number of arguments, not their actual size. You 
>>>>> can't assume every argument is 4 bytes. There could be larger types, 
>>>>> vectors, or structs
>>>> 
>>>> ah, right. I guess I'll have to repeat most of the argument magic from
>>>> clover/llvm/invocation.cpp
>>>> 
>>> 
>>> The way I would do this is to add a field to AMDGPUMachineFunctionInfo
>>> called something like ABIArgOffset, and then set it in
>>> SITargetLowering::LowerFormalArguments() where we compute the offset for
>>> each argument something like:
>>> 
>>>  for (unsigned i = 0, e = Ins.size(), ArgIdx = 0; i != e; ++i) {
>>> 
>>>    const ISD::InputArg &Arg = Ins[i];
>>>    if (Skipped & (1 << i)) {
>>>      InVals.push_back(DAG.getUNDEF(Arg.VT));
>>>      continue;
>>>    }
>>> 
>>>    CCValAssign &VA = ArgLocs[ArgIdx++];
>>>    EVT VT = VA.getLocVT();
>>> 
>>>    if (VA.isMemLoc()) {
>>>      VT = Ins[i].VT;
>>>      EVT MemVT = Splits[i].VT;
>>> +      unsigned Offset = 36 + VA.getLocMemOffset();
>>>      // The first 36 bytes of the input buffer contain information
>>>      // about thread group and global sizes.
>>>      SDValue Arg = LowerParameter(DAG, VT, MemVT,  DL, DAG.getRoot(),
>>> +                                   Offset,
>>> -                                   36 + VA.getLocMemOffset(),
>>>                                   Ins[i].Flags.isSExt());
>>>      InVals.push_back(Arg);
>>> +      MFI->ABIArgOffset = Offset + MemVT.getSizeInBits() / 8;
>>>      continue;
>>>    }
>>> 
>>> 
>>> Then you can use this computed offset when you lower get_dim.
>>> 
>>> -Tom
>>> 
>>>>>> +    return LowerParameter(DAG, VT, VT, DL, DAG.getEntryNode(), 36 + (arg_size * 4), false);
>>>>>> +  }
>>>>> 
>>>> 
>>>> -- 
>>>> Jan Vesely <jan.vesely at rutgers.edu>
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Libclc-dev mailing list
>>> Libclc-dev at pcc.me.uk
>>> http://www.pcc.me.uk/cgi-bin/mailman/listinfo/libclc-dev
>> 
> 
> -- 
> Jan Vesely <jan.vesely at rutgers.edu>
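
The point raised in the quoted review, that arg_size() counts arguments rather than bytes, can be illustrated with a small standalone sketch. It does not use LLVM at all: ArgLayout, HeaderBytes, naiveEndOffset and layoutEndOffset are hypothetical names invented for this example, and the rule of aligning each argument relative to the start of the argument area is an assumption, not necessarily what the R600/SI calling convention or clover actually does. Only the 36-byte header and the "offset plus size of the last argument" idea come from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one kernel argument, in bytes.
struct ArgLayout {
  std::size_t Size;
  std::size_t Align; // assumed to be a power of two
};

// The thread states that the first 36 bytes of the input buffer hold
// thread group and global size information.
constexpr std::size_t HeaderBytes = 36;

// Estimate used in the reviewed patch: every argument is assumed to
// occupy exactly 4 bytes.
std::size_t naiveEndOffset(const std::vector<ArgLayout> &Args) {
  return HeaderBytes + Args.size() * 4;
}

// Sketch of the suggested approach: track where each argument really
// lands and remember the end of the last one (the role ABIArgOffset
// plays in the quoted diff).
std::size_t layoutEndOffset(const std::vector<ArgLayout> &Args) {
  std::size_t Rel = 0;           // offset relative to the first argument
  std::size_t End = HeaderBytes; // with no arguments, data follows the header
  for (const ArgLayout &A : Args) {
    assert(A.Align != 0 && (A.Align & (A.Align - 1)) == 0 &&
           "alignment must be a power of two");
    Rel = (Rel + A.Align - 1) & ~(A.Align - 1); // round up (assumed rule)
    End = HeaderBytes + Rel + A.Size;           // Offset + size of this argument
    Rel += A.Size;
  }
  return End;
}
```

For three 4-byte arguments both functions agree. Once a 16-byte vector appears among the arguments, the naive estimate falls short of the offset the layout walk computes, so a load from the naive offset would read argument data instead of the value placed after the arguments.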




