[PATCH] D53162: [DataLayout] Add bit width of pointers to global values

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 12 04:08:45 PDT 2018


nhaehnle added a comment.

Thank you all for taking a look.

In https://reviews.llvm.org/D53162#1263094, @delena wrote:

> Can you use the "address space" to distinguish between global and local in Data Layout?


We may be talking past each other. This is about GlobalValues (i.e. global variables, e.g. file-level variables in C/C++), not about global vs. local. Those could exist in any address space.



================
Comment at: include/llvm/IR/DataLayout.h:340
+  /// to be 0.
+  unsigned getPointerGlobalBitWidth(unsigned AS = 0) const;
+
----------------
theraven wrote:
> bjope wrote:
> > Maybe remove the default here. There are comments about removing AS defaults for other methods here, and usually when someone does not provide the address space in the call it could be a hard to find fault.
> Agreed.  The default value is useful for incrementally migrating consumers of APIs that didn't know about address spaces, but for a new API it doesn't make sense.
That makes a lot of sense, thanks for the history lesson :)


================
Comment at: lib/Analysis/ValueTracking.cpp:1344
+          if (LeadZ > 0)
+            LeadZ--;
+        }
----------------
theraven wrote:
> Does this correctly handle the case of invalid pointers being created as temporary values?  In CHERI, we saw a few cases where, after reassociation, you ended up with IR that took a pointer out of bounds in one calculation and then brought it back in as a result of the next one before dereferencing.  This should perhaps be restricted to inbounds GEPs (or is it already and I'm missing the check)?
The intention of this code is to just not care about the in- or out-of-boundedness of GEPs, so yes, it should handle temporary invalid pointers correctly (though maybe over-conservatively).

To think this through:
  - GEP indices are always supposed to be treated as signed integers. If the index may be negative, setting `LeadZ = 0` is conservatively correct: we make no claim about leading zeros at all anymore in this case.
  - Zero indices don't actually change the pointer value, so need no change of LeadZ.
  - Non-zero but positive indices change the pointer value by adding some offset. The intuition behind the code here is that the base pointer (or rather, the base pointer plus offsets from earlier GEP indices) fits into some number of bits, and the offset that we're adding in this GEP index fits into some number of bits, so their some fits into `max(those numbers) + 1`. Though the code itself is expressed in number of leading zeros as opposed to number of bits needed to represent the (unsigned) pointer value.

It seems like some tighter guarantees could be made for inbounds GEPs, but this code does not attempt that.

Does that make sense to you?


Repository:
  rL LLVM

https://reviews.llvm.org/D53162





More information about the llvm-commits mailing list