[llvm] r286171 - [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies
Mekhanoshin, Stanislav via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 17 15:01:13 PST 2016
Thank you! I have found the problem.
The v_cmp_* instruction does not preserve the result bits for inactive lanes, but rather sets them to 0. The result bit for lane n is therefore equivalent to EXEC[n] & compare[n]. I suppose a correct propagation should start not with the v_cndmask_b32 which saves the condition, but with the v_cmp instruction which restores it. If the pattern is matched, we can emit an s_and_b32 of the original scalar result with EXEC instead of the v_cmp. The first v_cndmask_b32 then has a chance to be eliminated as dead code, while the new s_and_b32 has a chance to be combined with the following EXEC use. With this approach fs-discard-exit-2 passes and is one instruction shorter than the original version.
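To illustrate the equivalence, here is a minimal Python sketch that models the lane masks as integers. It is not LLVM code: the function names are hypothetical stand-ins for the instruction semantics described above, a 64-lane wavefront is assumed, and the restore-time EXEC is assumed to be a subset of the save-time EXEC.

```python
WAVE_SIZE = 64  # assumption: 64-lane wavefront


def model_v_cndmask(cond_mask, exec_mask, old_vgpr):
    """Model of saving a condition into a VGPR.

    Active lanes write 1 or 0 depending on their condition bit;
    inactive lanes keep whatever value they held before.
    """
    return [
        ((cond_mask >> n) & 1) if (exec_mask >> n) & 1 else old_vgpr[n]
        for n in range(WAVE_SIZE)
    ]


def model_v_cmp_ne_zero(vgpr, exec_mask):
    """Model of restoring the condition with a compare against 0.

    Bit n is set only if lane n is active and vgpr[n] != 0;
    bits of inactive lanes are forced to 0.
    """
    result = 0
    for n in range(WAVE_SIZE):
        if (exec_mask >> n) & 1 and vgpr[n] != 0:
            result |= 1 << n
    return result


def model_s_and(cond_mask, exec_mask):
    """Model of the proposed replacement: scalar AND with EXEC."""
    return cond_mask & exec_mask


# Save under one EXEC, restore under a narrower one.
cond = 0b1100
exec_save = 0b1110
exec_restore = 0b0110
stale = [7] * WAVE_SIZE  # arbitrary stale contents in inactive lanes

saved = model_v_cndmask(cond, exec_save, stale)
restored = model_v_cmp_ne_zero(saved, exec_restore)
assert restored == model_s_and(cond, exec_restore)
```

In this model the save/restore pair and the single AND produce the same mask, which is why the v_cndmask_b32 becomes dead once the v_cmp is replaced.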
Stas
-----Original Message-----
From: Michel Dänzer [mailto:michel at daenzer.net]
Sent: Monday, November 14, 2016 1:55 AM
To: Mekhanoshin, Stanislav <Stanislav.Mekhanoshin at amd.com>; Tom Stellard <tom at stellard.net>
Cc: llvm-commits at lists.llvm.org; Nicolai Hähnle <nhaehnle at gmail.com>; Marek Olšák <maraeo at gmail.com>; Matt Arsenault <arsenm2 at gmail.com>
Subject: Re: [llvm] r286171 - [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies
On 11/11/16 12:36 PM, Mekhanoshin, Stanislav wrote:
> I have reverted the change in rL286530. I still do not see any issue with the code and cannot explain the failure.
> Michael, what GPU do you use?
It's Kaveri.
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer