[llvm-commits] [PATCH] Make Inits FoldingSetNodes

Thu Jul 14 15:09:29 PDT 2011

Jakob Stoklund Olesen <stoklund at 2pi.dk> writes:

> This surprises me. From my reading of the TableGen code, a StringInit
> is only ever allocated after parsing a string. That means the only way
> you can get duplicate StringInit instances is by typing multiple
> identical strings in your .td files.

Yep.  For AVX there will be a lot of programmatic things like tagging on
"ps" suffixes, "v" prefixes and the like to asm strings and other such
things.  So you'll see a few of those "ps", etc. kinds of things and
they're duplicates.

When I first did AVX, the size of the Init pool grew pretty huge and
FoldingSet helped a lot.  Now, that was a while ago and I've massaged
AVX since then so maybe it's not so bad anymore but I'm still more
comfortable making sure we don't get an Init object explosion.

I don't know which particular types of Inits caused the explosion.  My
guess is StringInit wasn't one of them.  Probably more likely the
various OpInits.

> Multiclasses and default arguments and all that stuff is already
> sharing inits AFAIK.

This is another area where AVX/SIMD could introduce Init explosion.  In
my patches I will be adding a lot of dyanmism to how Record names get
generated.  It's the intermediate values that can be duplicates.

> But a StringMap is nice and fast for interning this stuff, so I don't
> mind doing that.

Ok.

>>> I don't think there is any reason to release memory. This function
>>> leaks a ton anyway since it doesn't call destructors, and it is only
>>> called immediately before exiting the process. It is very hard to beat
>>> _exit(2) for speed of deallocating complex data structures.
>> 
>> Ok, but will Valgrind report leaks?
>
> Yes. I don't think that is a problem.

Hmm...Some of our groups here have testing policies that say if valgrind
reports a problem, it's a bug.  I wish more of our groups did that,
actually.  That may sound pedantic but a lot of valgrind output can mask
a real problem that's reported.  It's the same rationale as cleaning up
compiler warnings.  It may not be a "real" problem but it can make the
"real" problems harder to find.

> Leaks are important to plug for long running processes and library code.

More important, yes, but I think for debugging purposes it's best to get
things as clean as possible.

> Short-lived command line tools like TableGen don't really benefit from
> cleaning up. And it can take a long time to deallocate complicated
> data structures.

I wouldn't think deallocation time dominates TableGen.  Does it?

                            -Dave