[llvm] r286171 - [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies

Mekhanoshin, Stanislav via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 17 15:01:13 PST 2016


Thank you! I have found the problem.

The v_cmp_* instruction does not preserve the result bits for inactive lanes; it sets them to 0. The result is in fact equivalent to EXEC[n] & compare[n]. I suppose a correct propagation should start not with the v_cndmask_b32 that saves the condition, but with the v_cmp instruction that restores it. If the pattern is matched, we can emit an s_and_b32 of the original scalar result with EXEC instead of the v_cmp. The first v_cndmask_b32 then has a chance to be dead-code eliminated, while the new s_and_b32 has a chance to be combined with the following EXEC use. With this approach fs-discard-exit-2 passes and is 1 instruction shorter than the original version.
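
To illustrate the lane-mask reasoning, here is a rough Python model, not the actual patch. The 64-lane wavefront, the helper names, and the requirement that EXEC at the restore point is a subset of EXEC at the save point are my assumptions for the sketch. It only checks that saving a condition with v_cndmask_b32 and restoring it with a v_cmp against zero gives the same mask as ANDing the original scalar condition with EXEC.

```python
WAVE_SIZE = 64  # assumption: 64-lane wavefront
MASK = (1 << WAVE_SIZE) - 1


def v_cndmask(exec_mask, cond, old_vgpr):
    """Save the condition: active lanes get 1 or 0 from the matching cond bit,
    inactive lanes keep whatever the register held before."""
    return [
        ((cond >> n) & 1) if (exec_mask >> n) & 1 else old_vgpr[n]
        for n in range(WAVE_SIZE)
    ]


def v_cmp_ne_zero(exec_mask, vgpr):
    """Restore the condition: bit n is EXEC[n] & (vgpr[n] != 0).
    Inactive lanes are written as 0, not preserved."""
    result = 0
    for n in range(WAVE_SIZE):
        if (exec_mask >> n) & 1 and vgpr[n] != 0:
            result |= 1 << n
    return result


def s_and(a, b):
    """Scalar AND of two lane masks."""
    return a & b & MASK


def round_trip(exec_save, exec_restore, cond, old_vgpr):
    """v_cndmask under exec_save followed by v_cmp under exec_restore."""
    saved = v_cndmask(exec_save, cond, old_vgpr)
    return v_cmp_ne_zero(exec_restore, saved)
```

Under the subset assumption, round_trip(exec_save, exec_restore, cond, old_vgpr) equals s_and(cond, exec_restore) regardless of the stale register contents, which is why the v_cmp can be replaced and the v_cndmask_b32 may become dead.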

Stas

-----Original Message-----
From: Michel Dänzer [mailto:michel at daenzer.net] 
Sent: Monday, November 14, 2016 1:55 AM
To: Mekhanoshin, Stanislav <Stanislav.Mekhanoshin at amd.com>; Tom Stellard <tom at stellard.net>
Cc: llvm-commits at lists.llvm.org; Nicolai Hähnle <nhaehnle at gmail.com>; Marek Olšák <maraeo at gmail.com>; Matt Arsenault <arsenm2 at gmail.com>
Subject: Re: [llvm] r286171 - [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies

On 11/11/16 12:36 PM, Mekhanoshin, Stanislav wrote:
> I have reverted the change in rL286530. I still do not see any issue with the code and cannot explain the failure.
> Michael, what GPU do you use?

It's Kaveri.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

