[PATCH] D110585: [Analysis][SimplifyLibCalls] Refactor code related to size_t in lib func signatures. NFCI

Alexander Richardson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 29 01:43:24 PDT 2021


arichardson added a comment.

In D110585#3029472 <https://reviews.llvm.org/D110585#3029472>, @bjope wrote:

> In D110585#3028146 <https://reviews.llvm.org/D110585#3028146>, @nikic wrote:
>
>> As I mentioned on the other review, I think that `DL.getIndexSizeInBits()` should be used for the size_t size. If we do that, do we really need a TLI hook?
>
> At least for my target getIndexSize won't match with size_t either. I made a quick experiment with changing the DataLayout to set the index size to something different than pointer size, but that was not so easy because I got lots of failures. At least from LoadStoreVectorizer (Vectorizer::areConsecutivePointers -> Value::stripAndAccumulateConstantOffsets -> assert "The offset bit width does not match the DL specification:"). So it might take some time to fix all such problems (and we've been have IndexSize equals PointerSize but not size of size_t in bits since the beginning of time afaict).
>
> I thought the TLI hook was useful to give this lookup a name and having the logic for how to find size of size_t in one place, regardless. As currently I think we agree that the faulty assumption of getIntPtrType=>"type for size_t" is incorrect, and that assumption has been used in more than one place (and by now it is kind of tricky to find if that has been done in several more places).
> If everyone is expected to know about "IndexSize is supposed to be matching size_t", then I agree that the helper just will appear as an unneeded extra wrapper. But I've never seen, or heard, anything about that earlier except in these reviews. I figure that the `getSizeTSize` hook will be the place where we kind of define/document that relation.

DL.getIndexSize() should match `ptrdiff_t` and therefore `size_t` on pretty much all architectures. In theory these two types could have a different width but I don't know if any such architecture exist(ed). I am not opposed to the hook, but I think it adds unncessary complexity and we should probably just add a comment to DataLayout/LangRef stating that we assume index size matches ptrdiff_t/size_t.

Due to DL.getIntPtrType() -> size_t assumption, I added a really ugly hack for CHERI targets to return IndexType when DL.getIntPtrType() is called with CHERI capabilities: https://github.com/CTSRD-CHERI/llvm-project/blob/bca8a554d2be06b6f8937c5531202a0e3e8d0f9a/llvm/lib/IR/DataLayout.cpp#L857
Ideally we would fix all those calls but we just don't have the time to do so. Considering this hack works, it seems to me that 90+% of DL.getIntPtrType() really want the address range (i.e. `ptrdiff_t`/`size_t`) rather than an integer with the same number of bits as `void*` i.e. `(u)intptr_t`

@bjope What do your pointers look like? For RISCV-64 CHERI and Arm's Morello we have 128 bits + 1 bit of hidden state, but only 64 of them are used for the address part, so `uintptr_t` is 128 bits and `size_t`/`ptrdiff_t` are 64 bits. Our datalayout string is `e-m:e-pf200:128:128:128:64-p:64:64-i64:64-i128:128-n64-S128-A200-P200-G200` (the `f` in pf200 marks address space 200 as using "fat pointers", since our backend predates non-integral pointers and non-integral pointers also limit optimizations a bit too much since `ptrtoint` is fine for us). We also have a version CHERI for 32-bit architectures where we 64-bit pointers+32-bit size_t, and therefore use `"e-m:e-pf200:64:64:64:32-p:32:32-i64:64-n32-S128-A200-P200-G200"`



================
Comment at: llvm/lib/Analysis/TargetLibraryInfo.cpp:755
     --NumParams;
-    if (!IsSizeTTy(FTy.getParamType(NumParams)))
+    if (!FTy.getParamType(NumParams)->isIntegerTy(SizeTBits))
       return false;
----------------
nikic wrote:
> I'd suggest to land these changes as an NFC refactoring -- just determine `SizeTBits` as `DL.getPointerSizeInBits()` for now.
That would be incorrect for CHERI (well if we hardcode AS0 it will happen to work since we keep AS0 as 64-bit integer pointers and use AS200 for capabilities).
It should IMO be `DL.getIndexSizeInBits(DL.getAllocaAddrSpace())` (or `getDefaultGlobalsAddressSpace()`) since that gives you `ptrdiff_t` which *should* be compatible with size_t.

One problem we have here is that there is no "libfunc address space" so there is no easy way to determine the expected address space for libfuncs without a pointer argument, etc. Should it be alloca/globals/something new?



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110585/new/

https://reviews.llvm.org/D110585



More information about the llvm-commits mailing list