[PATCH] D31398: [X86][X86 intrinsics]Folding cmp(sub(a, b), 0) into cmp(a, b) optimization
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Apr 1 09:02:18 PDT 2017
spatel added a comment.
In https://reviews.llvm.org/D31398#716002, @m_zuckerman wrote:
> You are absolutely right, your transform is valid and we will do it after this patch.
> Since the intrinsics are lowered with generic IR, mine patch is still valid and we will need them both for a complete solution.
1. We need to be conservative for the general case... speaking from experience :). As @scanon mentioned, we need some way to tell whether denorms are flushed to zero or not. I think this patch is safe currently because we assume the default FP environment, which on x86 does not enable DAZ/FTZ.
2. We don't need nsz or nnan for this fold (see @scanon's comment).
3. I'd prefer to put all of the tests in one file since they are just variants of the same fold.
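To make the denormal concern in point 1 concrete, here is a minimal sketch in Python rather than IR. It is only an illustration, not what the patch does: `flush_to_zero`, `cmp_sub_zero`, and `cmp_direct` are hypothetical helpers, and FTZ is simulated in software by replacing a subnormal result with zero. It only looks at the equality predicate on finite inputs, plus one NaN and one signed-zero case for point 2.

```python
import math
import sys

def flush_to_zero(x):
    # Simulated FTZ (an assumption of this sketch): a subnormal result
    # is replaced by a zero of the same sign.
    if x != 0.0 and abs(x) < sys.float_info.min:
        return math.copysign(0.0, x)
    return x

def cmp_sub_zero(a, b, ftz=False):
    # Models cmp(sub(a, b), 0) with an "equal" predicate.
    diff = a - b
    if ftz:
        diff = flush_to_zero(diff)
    return diff == 0.0

def cmp_direct(a, b):
    # Models cmp(a, b) with the same predicate.
    return a == b

a = sys.float_info.min   # smallest normal double, 2**-1022
b = 1.5 * a              # still normal; b - a == 2**-1023 is subnormal

# Gradual underflow: the difference is nonzero, so both forms agree.
agree_default = cmp_sub_zero(b, a) == cmp_direct(b, a)

# Simulated FTZ: the difference is flushed to zero, so the forms disagree.
agree_ftz = cmp_sub_zero(b, a, ftz=True) == cmp_direct(b, a)

# NaN operand and signed zeros do not change the answer for this predicate.
agree_nan = cmp_sub_zero(math.nan, 1.0) == cmp_direct(math.nan, 1.0)
agree_zero = cmp_sub_zero(-0.0, 0.0) == cmp_direct(-0.0, 0.0)
```

With these inputs the two forms agree under gradual underflow and disagree once the subnormal difference is flushed, which is why the fold needs to know the denormal mode.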
https://reviews.llvm.org/D31398