[llvm-dev] [RFC] Attribute overhaul 2

Mon Mar 20 09:48:32 PDT 2017

LLVM's Attribute APIs need an overhaul.

Current problems
================

First, testing for an attribute on an Argument is slow.
llvm::AttributeSet::getAttributes(int) consumed 2% of cycles while
optimizing
llc during LTO. Our mid-level optimizations are constantly asking if a given
argument has some attribute (nonnull, dereferencable, etc), and this is
currently linear in the size of the function prototype.  This should be
constant
time.

Adding and removing individual attributes is also inefficient. Every single
attribute addition or removal convenience method, for example, inserts a new
AttributeSet into a FoldingSet. AttributeFuncs::mergeAttributesForInlining
is
written in terms of these APIs, and is therefore quite inefficient. This
also
shows up during LTO. I don't think it's practical to remove these inherently
expensive APIs, but if we make AttributeBuilder easier to use, it will be a
lot
easier to fix these problems as they come up.

Lastly, the Attribute APIs are hard to use because they do too much
information
hiding. In particular, the choice to make AttributeSetNode an internal
implementation detail of lib/IR is problematic. This type describes all of
the
attributes on an individual argument, return value, or function, which IPO
transforms often want. Today the getFnAttributes, getRetAttributes, and
getParamAttributes APIs find the relevant AttributeSetNode* and wrap it in a
new, uniqued AttributeListImpl. This requires callers to keep around the
index
of the extracted attributes so they can look through the wrapper list. If we
make AttributeSetNode public, we can simplify a lot of IPO transform code.

Naming
======

The naming of today's APIs is confusing. I'll try to explain what the
current
important classes are.

- AttributeSet: This is a uniqued, ordered list of sets of attributes, and
is
  associated with a function or call prototype. It is stored on Function,
  CallInst, and InvokeInst. It is a smart pointer value type that wraps
  AttributeSetImpl, which contains the actual storage.

- AttributeSetImpl: The private implementation of AttributeSet. Owned by the
  LLVMContext. Today this is a vector of pairs of attribute indices and
  AttributeSetNode pointers.

- AttributeSetNode: This is an ordered, uniqued set of Attributes that might
  apply to a single function, callee, return value, parameter, or
argument.  It
  uses TrailingObjects to store the attributes, and until Jan 2016, tested
for
  attribute presence by a linear scan. Matthias Braum added a bitset to
speed up
  tests in r259251.

- AttributeBuilder: A mutable representation of an AttributeSetNode. Used
for
  efficiently building a collection of attributes before freezing it into an
  AttributeSetNode.

- Attribute: Pointer wrapping an AttributeImpl.

- AttributeImpl: Polymorphic base class of StringAttributeImpl,
  EnumAttributeImpl, and IntAttributeImpl. Enums have the attribute kind,
  integers have a uint64_t value, and strings have two StringRefs for the
kind
  and value.

AttributeSet doesn't seem like a good name to me. In the past it was called
AttrListPtr and PAListPtr. Today's AttributeSetImpl was called
"ParameterAttributeList", which is why we have "PAL" local variables.  I'd
like
to rename AttributeSet to just "AttributeList".  It's a list of sets of
attributes that is parallel to some function prototype. I already have a
patch
out for this here: https://reviews.llvm.org/D31102

The natural second thing to do is to rename AttributeSetNode to
AttributeSet, or
create a pointer wrapper type called AttributeSet and rename today's
AttributeSetNode to AttributeSetImpl. It is inherently an ordered set of
attributes. I also propose to make this type public, as described earlier.

Optimizations
=============

Testing for presence of an attribute on a parameter should be fast. Today
it is
linear in the size of the prototype. I propose that we change AttributeList
to
make attributes randomly accessible by slot index, so that it stores a
trailing
object array of AttributeSetNode*, or some equivalent type that makes it
efficient to test for attribute presence. I'll do some benchmarking to show
that
this change doesn't measurably impact LTO memory usage.

I also want to reduce the number of loads needed to test a single attribute.
One potential optimization is to move the bitset of enum attributes up out
of
AttributeSetNode* into a wrapper type, perhaps called AttributeSet. The
wrapper
type could be a bitset and pointer pair, or we could try to union those
together
in the usual way to save memory. I suspect the memory savings of the union
are
unimportant, but I will measure. If we can always store the enumerated
attributes that have no associated size locally, that seems ideal, since it
allows modifying them without copying the immutable data stored in the LLVM
context.

Another idea is to eliminate the use of AttributeList completely from
Function,
and instead store the attributes of an Argument directly *in* the Argument.
The
function and return attributes would live on the Function. At the end of the
day, `Arg->hasNonNullAttr()` should compile to a load and a bittest. This
is a
more invasive change that would require more API migration, but again it
would
avoid more FoldingSet usage.

The last two optimizations taken together are particularly nice, because it
means that changing enum attributes on functions and arguments won't trigger
copies of immutable objects. It will only involve flipping bits in a locally
stored bitset.

Separately, I want to collapse dereferenceable(N), nonnull, and
dereferenceable_or_null(N) for efficiency and orthogonality. I will probably
raise this in a separate RFC, but I think it was a mistake to have to
attribute
imply nonnull. I'm mostly looking at this from an efficiency and redundancy
perspective, though, not a semantic one.

----------

Does this all seem reasonable? Speak up now if you think any of this looks
like
a step in the wrong direction.

Notes:

RFC Overhauling Attributes, Bill Wending Sept 2012
https://groups.google.com/forum/#!topic/llvm-dev/RnZVfYmmkMc
User struggle
http://lists.llvm.org/pipermail/llvm-dev/2013-January/058822.html
dereferenceable vs dereferenceable_or_null split
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/82186
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170320/72b48fd0/attachment.html>