[llvm-dev] DSE: Remove useless stores between malloc & memset

Dávid Bolvanský via llvm-dev llvm-dev at lists.llvm.org
Tue May 22 16:02:24 PDT 2018


IR:
define i32 @calloc_strlen_write_between() {
  %call = tail call noalias i8* @calloc(i32 10, i32 1)
  store i8 97, i8* %call, align 1
  %call1 = tail call i32 @strlen(i8* %call)
  ret i32 %call1
}


static bool eliminateStrlen(CallInst *CI, BasicBlock::iterator &BBI,
AliasAnalysis *AA, MemoryDependenceResults *MD,
const DataLayout &DL, const TargetLibraryInfo *TLI,
InstOverlapIntervalsTy &IOL,
DenseMap<Instruction *, size_t> *InstrOrdering) {

// Must be a strlen.
LibFunc Func;
Function *Callee = CI->getCalledFunction();
if (!TLI->getLibFunc(*Callee, Func) || !TLI->has(Func) ||
Func != LibFunc_strlen)
return false;

Value *Dst = CI->getOperand(0);
Instruction *UnderlyingPointer =
dyn_cast<Instruction>(GetUnderlyingObject(Dst, DL));
if (!UnderlyingPointer)
return false;
if (!isStringFromCalloc(Dst, TLI))
return false;

if (memoryIsNotModifiedBetween(UnderlyingPointer, CI, AA)) {
Value *Len = ConstantInt::get(CI->getType(), 0);
CI->replaceAllUsesWith(Len);
CI->eraseFromParent();
return true;
}
return false;
}


------------------------------------------------------
That IR is still wrongly transformed with this code to ret i32 0 (but there
is write between calloc and strlen). Any suggestions?

2018-05-23 0:49 GMT+02:00 Dávid Bolvanský <david.bolvansky at gmail.com>:

> It works with
>
> MemoryLocation MemoryLocation::get(const CallInst *CI) {
> AAMDNodes AATags;
> CI->getAAMetadata(AATags);
> const auto &DL = CI->getModule()->getDataLayout();
>
> return MemoryLocation(CI, DL.getTypeStoreSize(CI->getType()), AATags);
> }
>
> Is it fine? :)
>
> 2018-05-22 23:56 GMT+02:00 Dávid Bolvanský <david.bolvansky at gmail.com>:
>
>> Looks like there are many overloads for "get". http://llvm.org/doxygen
>> /MemoryLocation_8cpp_source.html
>>
>> But nothing for CallInst. Any suggestions how to do a proper one? I will
>> look at it too.
>>
>> 2018-05-22 23:34 GMT+02:00 Dávid Bolvanský <david.bolvansky at gmail.com>:
>>
>>> Full stack trace:
>>>
>>> opt: /home/xbolva00/LLVM/llvm/include/llvm/ADT/Optional.h:176: T*
>>> llvm::Optional<T>::getPointer() [with T = llvm::MemoryLocation]:
>>> Assertion `Storage.hasVal' failed.
>>> Stack dump:
>>> 0. Program arguments: opt aaa.ll -dse -S
>>> 1. Running pass 'Function Pass Manager' on module 'aaa.ll'.
>>> 2. Running pass 'Dead Store Elimination' on function '@calloc_strlen'
>>> LLVMSymbolizer: error reading file: No such file or directory
>>> #0 0x000056135ebe698a (opt+0x212198a)
>>> #1 0x000056135ebe4cf4 (opt+0x211fcf4)
>>> #2 0x000056135ebe4e32 (opt+0x211fe32)
>>> #3 0x00007f6e35b14150 __restore_rt (/lib/x86_64-linux-gnu/libpthr
>>> ead.so.0+0x13150)
>>> #4 0x00007f6e3481b0bb gsignal /build/glibc-itYbWN/glibc-2.26
>>> /signal/../sysdeps/unix/sysv/linux/raise.c:51:0
>>> #5 0x00007f6e3481cf5d abort /build/glibc-itYbWN/glibc-2.26
>>> /stdlib/abort.c:92:0
>>> #6 0x00007f6e34812f17 __assert_fail_base /build/glibc-itYbWN/glibc-2.26
>>> /assert/assert.c:92:0
>>> #7 0x00007f6e34812fc2 (/lib/x86_64-linux-gnu/libc.so.6+0x2efc2)
>>> #8 0x000056135e962b80 (opt+0x1e9db80)
>>> #9 0x000056135e969260 (opt+0x1ea4260)
>>> #10 0x000056135e96a6e0 (opt+0x1ea56e0)
>>> #11 0x000056135e61d561 (opt+0x1b58561)
>>> #12 0x000056135e61d5d9 (opt+0x1b585d9)
>>> #13 0x000056135e61cbb7 (opt+0x1b57bb7)
>>> #14 0x000056135d175216 (opt+0x6b0216)
>>> #15 0x00007f6e348051c1 __libc_start_main /build/glibc-itYbWN/glibc-2.26
>>> /csu/../csu/libc-start.c:342:0
>>> #16 0x000056135d1f404a (opt+0x72f04a)
>>>
>>>
>>> 2018-05-22 23:32 GMT+02:00 Friedman, Eli <efriedma at codeaurora.org>:
>>>
>>>> It looks like the memoryIsNotModifiedBetween assumes the second
>>>> argument is a store, or some other instruction supported by
>>>> MemoryLocation::get.  If you're passing in something else, you'll have to
>>>> compute the MemoryLocation some other way.
>>>>
>>>> (Generally, if you're asking a question about an assertion, please
>>>> include the whole stack trace; it's hard to guess what's happening
>>>> otherwise.)
>>>>
>>>> -Eli
>>>>
>>>>
>>>> On 5/22/2018 2:16 PM, Dávid Bolvanský wrote:
>>>>
>>>> * if (isStringFromCalloc(Dst, TLI)) should be if
>>>> (!isStringFromCalloc(Dst, TLI))
>>>> but still asserting...
>>>>
>>>> 2018-05-22 23:06 GMT+02:00 Dávid Bolvanský <david.bolvansky at gmail.com>:
>>>>
>>>>> Can you help a bit?
>>>>>
>>>>> I try to work with DSE but I got the following assert:
>>>>> opt: /home/xbolva00/LLVM/llvm/include/llvm/ADT/Optional.h:176: T*
>>>>> llvm::Optional<T>::getPointer() [with T = llvm::MemoryLocation]:
>>>>> Assertion `Storage.hasVal' failed.
>>>>>
>>>>> static bool eliminateStrlen(CallInst *CI, BasicBlock::iterator &BBI,
>>>>>                             AliasAnalysis *AA, MemoryDependenceResults *MD,
>>>>>                             const DataLayout &DL, const TargetLibraryInfo *TLI,
>>>>>                             InstOverlapIntervalsTy &IOL,
>>>>>                             DenseMap<Instruction *, size_t> *InstrOrdering) {
>>>>>
>>>>>   // Must be a strlen.
>>>>>   LibFunc Func;
>>>>>   Function *Callee = CI->getCalledFunction();
>>>>>   if (!TLI->getLibFunc(*Callee, Func) || !TLI->has(Func) ||
>>>>>       Func != LibFunc_strlen)
>>>>>     return false;
>>>>>
>>>>>   Value *Dst = CI->getOperand(0);
>>>>>   Instruction *UnderlyingPointer = dyn_cast<Instruction>(GetUnderlyingObject(Dst, DL));
>>>>>   if (!UnderlyingPointer)
>>>>>     return false;
>>>>>   if (isStringFromCalloc(Dst, TLI))
>>>>>     return false;
>>>>>   errs() << "before\n";
>>>>>   if (memoryIsNotModifiedBetween(UnderlyingPointer, CI, AA)) { <--- CRASH
>>>>>     errs() << "after\n";
>>>>>   }
>>>>>   return false;
>>>>> }
>>>>>
>>>>> Do you know what is wrong here? I followed the "example" (in eliminateNoopStore) how to use "memoryIsNotModifiedBetween".
>>>>>
>>>>> Thank you for advice
>>>>>
>>>>>
>>>>> 2018-05-21 21:06 GMT+02:00 Friedman, Eli <efriedma at codeaurora.org>:
>>>>>
>>>>>> memoryIsNotModifiedBetween is precisely the sort of expensive walk we
>>>>>> shouldn't be doing... I'm surprised it hasn't caused any serious issues
>>>>>> yet.  Ideally, what we should be doing is using MemorySSA to find a
>>>>>> dependency from the memset: if the closest dependency is the malloc, there
>>>>>> aren't any stores between the memset and the malloc.  (But we aren't using
>>>>>> MemorySSA in DSE yet; see https://reviews.llvm.org/D40480.)
>>>>>>
>>>>>> But yes, memoryIsNotModifiedBetween has the right meaning.
>>>>>>
>>>>>> -Eli
>>>>>>
>>>>>>
>>>>>> On 5/21/2018 7:48 AM, Dávid Bolvanský wrote:
>>>>>>
>>>>>> "memory accesses between the malloc and the memset without an
>>>>>> expensive linear scan of the block/function"
>>>>>>
>>>>>> (1) do you mean just use "memoryIsNotModifiedBetween" function in DSE
>>>>>> to check it?
>>>>>>
>>>>>> x = maloc(..);
>>>>>> memset(x, ...)
>>>>>>
>>>>>> (2) GetUnderlyingObject would give me Value * (from malloc) ?
>>>>>>
>>>>>> Also another case:
>>>>>>
>>>>>> memset(s, 0, len); // len > 1
>>>>>> return strlen(s); // optimize to 0
>>>>>>
>>>>>> (3) How to check memset and strlen pairs? I have a strlen call, I
>>>>>> have a "Value *" for "s". What is the best way to construct memset +
>>>>>> strlen pairs?
>>>>>>
>>>>>> (4) Can this be somehow generalized (and not only for strlen)? So
>>>>>> GetStringLength in ValueTracking would be taught about this info (string is
>>>>>> empty after memset)
>>>>>>
>>>>>> (5) new malloc memset folding /  memset + strlen case should be
>>>>>> implemented in DSE, right?
>>>>>>
>>>>>> 2018-05-17 21:36 GMT+02:00 Friedman, Eli <efriedma at codeaurora.org>:
>>>>>>
>>>>>>> The fundamental problem with trying to do that sort of transform in
>>>>>>> instcombine is that you don't have access to MemoryDependenceAnalysis or
>>>>>>> MemorySSA; you need a data structure like that to figure out whether there
>>>>>>> are any memory accesses between the malloc and the memset without an
>>>>>>> expensive linear scan of the block/function.  (You can sort of get around
>>>>>>> the problem in simple cases by adding arbitrary limits to the number of you
>>>>>>> scan, but it doesn't generalize well.)
>>>>>>>
>>>>>>> -Eli
>>>>>>>
>>>>>>>
>>>>>>> On 5/17/2018 12:17 PM, Dávid Bolvanský wrote:
>>>>>>>
>>>>>>> As we talked in https://reviews.llvm.org/D45344, the problem was
>>>>>>> dead stores. And I know why :D There was just -instcombine pass. I forgot
>>>>>>> to do -dse before -instcombine so this is why I did custom "store removal"
>>>>>>> code there.
>>>>>>>
>>>>>>> I would like to finish malloc + llvm.memset folding. Yes, you told
>>>>>>> you would like to see the whole foldMallocMemset in DSE but extend it for
>>>>>>> llvm.memset in InstCombine... is it really so bad to do?
>>>>>>> We have standard malloc + memset folding there, so a  few  new lines
>>>>>>> should not do bad things.
>>>>>>>
>>>>>>> If I reopen D45344, reupload patch with removed my custom "store
>>>>>>> removal" code, It could be ok, no? The patch as is worked/works for me for
>>>>>>> malloc + llvm.memset folding, I would just add -dse to tests to handle dead
>>>>>>> stores.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2018-05-17 21:00 GMT+02:00 Friedman, Eli <efriedma at codeaurora.org>:
>>>>>>>
>>>>>>>> On 5/17/2018 8:58 AM, Dávid Bolvanský via llvm-dev wrote:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I would like to find a way to do this removal properly. I found
>>>>>>>>> DSE and "eliminateNoopStore" can be useful for this thing.
>>>>>>>>>
>>>>>>>>> What I mean?
>>>>>>>>> int *test = malloc(15 * sizeof(int));
>>>>>>>>> test[10] = 12; < ----- remove this store
>>>>>>>>> memset(test,0,sizeof(int) * 15);
>>>>>>>>>
>>>>>>>>
>>>>>>>> This is classic dead store elimination, and it's already handled by
>>>>>>>> DSE. At least, we optimize the following testcase:
>>>>>>>>
>>>>>>>> #include <stdlib.h>
>>>>>>>> #include <string.h>
>>>>>>>> void bar(int*);
>>>>>>>> void f() {
>>>>>>>>   int *test = malloc(15 * sizeof(int));
>>>>>>>>   test[10] = 12;
>>>>>>>>   memset(test,0,sizeof(int) * 15);
>>>>>>>>   bar(test);
>>>>>>>> }
>>>>>>>>
>>>>>>>> You should be able to look at the existing code to understand how
>>>>>>>> it's handled (using MemoryDependenceAnalysis).
>>>>>>>>
>>>>>>>> -Eli
>>>>>>>>
>>>>>>>> --
>>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>>>>>>> a Linux Foundation Collaborative Project
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Employee of Qualcomm Innovation Center, Inc.
>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180523/3c4af193/attachment.html>


More information about the llvm-dev mailing list