[llvm-dev] Improving codegen for not-quite-constant-folding

David Goldblatt via llvm-dev llvm-dev at lists.llvm.org
Tue Jun 1 19:56:12 PDT 2021


Hi all,

Some Facebook colleagues and I are looking at poor codegen for the
C++20 spaceship operator (i.e. operator <=>, which performs 3-way
comparisons), when used to implement simple comparisons (e.g.
operator<). A simple example is at
https://gcc.godbolt.org/z/qhn6cf7nv, which desugars into something
like https://gcc.godbolt.org/z/4z4e1jb6T. By comparison, a
hand-written operator< would probably look like one of
https://gcc.godbolt.org/z/zn1en8fGW, or
https://gcc.godbolt.org/z/6oY5exTa8, or
https://gcc.godbolt.org/z/j6734zWYz.

A reduced example with the same issue is here:
https://gcc.godbolt.org/z/6E1KxE83x. Broadly, a tree of select/phi
nodes with constants at the leaves isn't subject to constant folding,
even though the reduced example could be rewritten as something like:
https://gcc.godbolt.org/z/356qr6rKK (this saves a multiply, and in
"real" examples enables further simplifications).

GCC's GVN/PRE pass does a better job here with -O2 -ftree-partial-pre
("partial pre" is not a typo; it's "partial partial redundancy
elimination", in addition to regular partial redundancy elimination),
or -O3 (which turns on -ftree-partial-pre). -ftree-partial-pre tracks
partial anticipation of expressions (those that are evaluated on some
path to exit) on a per-BB level, and inserts the necessary phis to
ensure that a partially anticipated expression is available in a
temporary if temporaries with the same value number are available in
all a BB's predecessors. It does constant folding after phi
translation before checking for availability, and regards all
constants as always available. After phi-translating the result into
each branch of the ternary operators, we get constants, so the
optimization kicks in and we can replace each leaf constant with the
result of multiplying it by 123.

A simpler approach that could catch cases like this would be to find
instances of trees in which the leaves are constants, and the non-leaf
nodes are phis or selects. If the root of such a tree is the input
into an operation we could otherwise constant fold (if that root were
a constant), we constant-fold each leaf of the tree instead. This
would miss some optimization opportunities that the GCC approach would
catch, however.

The former feels tricky to implement on top of GVN or NewGVN (and,
it's unclear that it's always a net improvement). The latter seems
reasonable, although it would require a one-off pass targeted at this
relatively narrow case, which feels heavyweight. My inclination is to
attempt the simpler approach, but wanted a sanity check that this
seems reasonable and is not better done elsewhere (or with a different
approach entirely).

Best,
David


More information about the llvm-dev mailing list