[PATCH RFC 4/4] R600: optimize the UDIVREM 64 algorithm

Matt Arsenault arsenm2 at gmail.com
Fri Apr 25 13:27:39 PDT 2014


On Apr 25, 2014, at 1:08 PM, Jan Vesely <jan.vesely at rutgers.edu> wrote:

> On Fri, 2014-04-25 at 12:27 -0700, Matt Arsenault wrote:
>> On 04/25/2014 12:08 PM, Jan Vesely wrote:
>>> +        HBit = DAG.getNode(AMDGPUISD::BFE_U32, DL, HalfVT, LHS_Lo, POS, one);
>> This might want a check for a subtarget with BFE instructions. 
>> Alternatively, I've been thinking it might be easier for places that 
>> want to use BFE to just use it, and to handle expanding BFE nodes for 
>> the old GPUs that don't support it somewhere else.
> 
> Or it can be dropped entirely. I used the attached file for testing,
> counting the total number of asm lines and the lines containing '*'.
> 
> With BFE the line count is 700 total and 202 with '*'; without BFE it
> is 743 and 201. I'm not sure which one is expected to perform better.

I have a patch I haven’t posted yet that adds a bunch of DAG combines for BFE nodes, which might help.
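
For reference, here is a minimal sketch of what an unsigned bitfield extract computes and how it could be expanded to a shift and a mask on hardware without a BFE instruction. The function names are hypothetical and this is plain C++, not the SelectionDAG lowering; the handling of a zero width or an out-of-range offset is an assumption made for the sketch, not a statement about what the hardware does.

```cpp
#include <cstdint>

// Hypothetical helper: extract `width` bits of `src` starting at bit `offset`.
// Edge cases (width == 0, offset >= 32) return 0 by assumption.
uint32_t extract_bits_u32(uint32_t src, unsigned offset, unsigned width) {
    if (width == 0 || offset >= 32)
        return 0;
    uint32_t shifted = src >> offset;
    if (width >= 32)
        return shifted;  // avoid an undefined shift by 32 when building the mask
    return shifted & ((UINT32_C(1) << width) - 1);
}

// The width-1 case, as in the quoted HBit line: a single shift and an AND.
uint32_t extract_one_bit(uint32_t src, unsigned pos) {
    return (src >> pos) & 1u;
}
```

With a width of one the expansion is only a shift and an AND, so the difference in the line counts above is not surprising either way.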


