[PATCH] D73978: [WIP][FPEnv] Don't transform FSUB(-0.0,X)->FNEG(X) when flushing denormals
Cameron McInally via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 24 15:15:29 PST 2020
cameron.mcinally added a comment.
Thanks for that patch, Sanjay.
I have another issue which I hope you can help me sort out. There's a transform in `narrowExtractedVectorBinOp(...)` in DAGCombiner.cpp:
`// extract (binop B0, B1), N --> binop (extract B0, N), (extract B1, N)`
This transform only happens for binops, so we don't see it when SelectionDAGBuilder converts the FSUB->FNEG.
The IR is...
%rhs_neg = fsub <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>, %rhs
%splat = shufflevector <4 x float> %rhs_neg, <4 x float> undef, <2 x i32> <i32 3, i32 3>
and after DAGCombine we end up with DAGs like this...
FNEG:
< t9: v4f32 = bitcast t8
< t24: v4f32 = fneg t9
< t15: v2f32 = extract_subvector t24, Constant:i64<2>
< t17: v2f32 = vector_shuffle<1,1> t15, undef:v2f32
FSUB:
> t29: v1i64 = extract_subvector t8, Constant:i64<1>
> t30: v2f32 = bitcast t29
> t32: v2f32 = fneg t30
> t17: v2f32 = vector_shuffle<1,1> t32, undef:v2f32
Moving the extract to the operands (FSUB) is a problem on AArch64 since the extract **could** be rolled into the shuffle (FNEG). E.g.:
FNEG:
< t9: v4f32 = bitcast t8
< t24: v4f32 = fneg t9
< t26: v2f32 = AArch64ISD::DUPLANE32 t24, Constant:i64<3>
FSUB:
> t29: v1i64 = extract_subvector t8, Constant:i64<1>
> t30: v2f32 = bitcast t29
> t32: v2f32 = fneg t30
> t36: v4f32 = insert_subvector undef:v4f32, t32, Constant:i32<0>
> t37: v2f32 = AArch64ISD::DUPLANE32 t36, Constant:i64<1>
Any insight on the best way to correct this difference? I suppose I could fix up the extract+insert at the MachineInstruction level, but that doesn't seem like the correct fix since other targets could have the same problem.
I'm also a little skeptical about moving the extracts to the operands, and if it's a win in the general case. Seems like it would be stronger after any extract+insert peeps have occurred, but I suppose that's why it's done in DAGCombine. :/
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D73978/new/
https://reviews.llvm.org/D73978
More information about the llvm-commits
mailing list