[PATCH] D73269: [llvm-readobj] - Add a few warnings for --gnu-hash-table.

George Rimar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Jan 25 11:44:04 PST 2020


grimar marked an inline comment as done.
grimar added inline comments.


================
Comment at: llvm/tools/llvm-readobj/ELFDumper.cpp:2525
+    // vector. We should not report a warning in this case.
+    bool IsEmptyHashTable =
+        GnuHashTable->symndx == NumSyms && BloomFilter.size() == 1 &&
----------------
MaskRay wrote:
> grimar wrote:
> > MaskRay wrote:
> > > grimar wrote:
> > > > MaskRay wrote:
> > > > > `GnuHashTable->symndx == 1` is probably sufficient.
> > > > > 
> > > > > The other conditions are redundant.
> > > > > GnuHashTable->symndx == 1 is probably sufficient.
> > > > 
> > > > It is not, because it can be > 1. For example when we have an object with an
> > > > undefined dynamic symbol and use lld:
> > > > 
> > > > 
> > > > **1.s:**
> > > > .global bar
> > > > 
> > > > ```
> > > > as 1.s -o 1.o
> > > > lld 1.o -o 1.so -shared --hash-style=gnu
> > > > llvm-readelf --gnu-hash-table 1.so
> > > > ```
> > > > 
> > > > ```
> > > > GnuHashTable {
> > > >   Num Buckets: 1
> > > >   First Hashed Symbol Index: 2
> > > >   Num Mask Words: 1
> > > >   Shift Count: 26
> > > >   Bloom Filter: [0x0]
> > > >   Buckets: [0]
> > > >   Values: []
> > > > }
> > > > ```
> > > > 
> > > > For GNU ld the output is different, btw (but it does not seem to be critical to fix LLD here, I guess):
> > > > 
> > > > ```
> > > > ld.bfd 1.o -o 1.so -shared --hash-style=gnu
> > > > llvm-readelf --gnu-hash-table 1.so
> > > > ```
> > > > 
> > > > ```
> > > > GnuHashTable {
> > > >   Num Buckets: 1
> > > >   First Hashed Symbol Index: 1
> > > >   Num Mask Words: 1
> > > >   Shift Count: 0
> > > >   Bloom Filter: [0x0]
> > > >   Buckets: [0]
> > > >   Values: [0x0]
> > > > }
> > > > ```
> > > > 
> > > OK, `!Buckets.empty() && Buckets[0] == 0` is sufficient.
> > > 
> > > The number of buckets does not matter. The size of the bloom filter does not matter (it can preclude some symbols. Since there is no defined dynamic symbol, it shouldn't matter if the bloom filter precludes anything).
> > As far I understand, the first thing that any dynamic loader tries to do during a symbol lookup is that it checks agains the Bloom filter,
> > When we have a zeroed Bloom filter of any size, it means the object can't contain any symbol.
> > 
> > Then it means that what is really important for an empty gnu hash table is that the Bloom filter is zeroed.
> > And that I believe is what any linker produces and what we might want to check.
> > Anything else probably is not important.
> > 
> > The code to check against the filter can be:
> > (https://blogs.oracle.com/solaris/gnu-hash-elf-sections-v2)
> > ```
> >       /* Test against the Bloom filter */
> >         c = sizeof (BloomWord) * 8;
> >         n = (h1 / c) & os->os_maskwords_bm;
> >         bitmask = (1 << (h1 % c)) | (1 << (h2 % c));
> >         if ((os->os_bloom[n] & bitmask) != bitmask)
> >                 return (NULL);
> > ```
> > 
> > So it uses `maskwords` and bloom filter words.
> > So we want to check that maskwords==1 and that the Bloom filter is empty.
> > 
> > Hence, I'd simplify this to:
> > 
> > ```
> > bool IsEmptyHashTable = `GnuHashTable->maskwords == 1 && BloomFilter.size() == 1 && BloomFilter.front() == 0`;
> > ```
> > 
> > Are you OK with this?
> > 
> > 
> The bloom filter saying "true" is a necessary condition. It the first thing to check, but it is not required to return "false" when there is a question "can symbol foo exist". It can say "true", and let the bucket/chain filter to say "false".
> So, ``GnuHashTable->maskwords == 1 && BloomFilter.size() == 1 && BloomFilter.front() == 0`;` is indeed one way to make .gnu.hash reject all input, but it is not the only way.
> 
> All buckets being zeros (so `buckets[hash % num_buckets] = 0`) is a sufficient condition to say "false". This is another way to make .gnu.hash reject all input.
> 
> (Now I think the warning is probably not useful.)
> All buckets being zeros (so buckets[hash % num_buckets] = 0) is a sufficient condition to say "false". This is another way to make .gnu.hash reject all input.

LLD creates a single dummy bucket just because of an issue in the android loader I think:

```
  // Note that we don't want to create a zero-sized hash table because
  // Android loader as of 2018 doesn't like a .gnu.hash containing such
  // table. If that's the case, we create a hash table with one unused
  // dummy slot.
  nBuckets = std::max<size_t>((v.end() - mid) / 4, 1);
```

I see no other reason to create a bucket.
So the condition should actually probably be `(!Buckets.empty() && Buckets[0] = 0) || Buckets.empty()` then.

I think checking the Bloom filter is a more stable way?

> (Now I think the warning is probably not useful.)

I've introduced it because of
`W.printHexList("Values", GnuHashTable->values(NumSyms));`

which has a dangerous implementation:
```
  ArrayRef<Elf_Word> values(unsigned DynamicSymCount) const {
    return ArrayRef<Elf_Word>(buckets().end(), DynamicSymCount - symndx);
  }
```

Unfortunately we can't know the GNU hash table size, we might have no section headers and there is no dynamic tag that holds such a value. So it might overflow even with this patch. For example when we have N dynamic symbols and that N is larger than the number of values. I do not think there is a way to detect this.

So, this warning itself not needed, but the whole branch that exits early might help a bit.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73269/new/

https://reviews.llvm.org/D73269





More information about the llvm-commits mailing list