[llvm-dev] DSE: Remove useless stores between malloc & memset

Dávid Bolvanský via llvm-dev llvm-dev at lists.llvm.org
Tue May 22 16:22:12 PDT 2018


Yeah, sorry for that. Better "It compiles ok (but maybe incorrect code)",
not "It works" as I wrote.

2018-05-23 1:08 GMT+02:00 Friedman, Eli <efriedma at codeaurora.org>:

> You might want to look more carefully at how you're constructing the
> MemoryLocation.   The first argument is a pointer, and the second argument
> is the number of bytes pointed to by that pointer (or
> MemoryLocation::UnknownSize if the number of bytes accessed isn't known).
>
> More generally, copy-pasting code you don't understand isn't a good idea.
>
> -Eli
>
>
> On 5/22/2018 4:02 PM, Dávid Bolvanský wrote:
>
> IR:
> define i32 @calloc_strlen_write_between() {
>   %call = tail call noalias i8* @calloc(i32 10, i32 1)
>   store i8 97, i8* %call, align 1
>   %call1 = tail call i32 @strlen(i8* %call)
>   ret i32 %call1
> }
>
>
> static bool eliminateStrlen(CallInst *CI, BasicBlock::iterator &BBI,
> AliasAnalysis *AA, MemoryDependenceResults *MD,
> const DataLayout &DL, const TargetLibraryInfo *TLI,
> InstOverlapIntervalsTy &IOL,
> DenseMap<Instruction *, size_t> *InstrOrdering) {
>
> // Must be a strlen.
> LibFunc Func;
> Function *Callee = CI->getCalledFunction();
> if (!TLI->getLibFunc(*Callee, Func) || !TLI->has(Func) ||
> Func != LibFunc_strlen)
> return false;
>
> Value *Dst = CI->getOperand(0);
> Instruction *UnderlyingPointer =
> dyn_cast<Instruction>(GetUnderlyingObject(Dst, DL));
> if (!UnderlyingPointer)
> return false;
> if (!isStringFromCalloc(Dst, TLI))
> return false;
>
> if (memoryIsNotModifiedBetween(UnderlyingPointer, CI, AA)) {
> Value *Len = ConstantInt::get(CI->getType(), 0);
> CI->replaceAllUsesWith(Len);
> CI->eraseFromParent();
> return true;
> }
> return false;
> }
>
>
> ------------------------------------------------------
> That IR is still wrongly transformed with this code to ret i32 0 (but
> there is write between calloc and strlen). Any suggestions?
>
> 2018-05-23 0:49 GMT+02:00 Dávid Bolvanský <david.bolvansky at gmail.com>:
>
>> It works with
>>
>> MemoryLocation MemoryLocation::get(const CallInst *CI) {
>> AAMDNodes AATags;
>> CI->getAAMetadata(AATags);
>> const auto &DL = CI->getModule()->getDataLayout();
>>
>> return MemoryLocation(CI, DL.getTypeStoreSize(CI->getType()), AATags);
>> }
>>
>> Is it fine? :)
>>
>> 2018-05-22 23:56 GMT+02:00 Dávid Bolvanský <david.bolvansky at gmail.com>:
>>
>>> Looks like there are many overloads for "get". http://llvm.org/doxygen
>>> /MemoryLocation_8cpp_source.html
>>>
>>> But nothing for CallInst. Any suggestions how to do a proper one? I will
>>> look at it too.
>>>
>>> 2018-05-22 23:34 GMT+02:00 Dávid Bolvanský <david.bolvansky at gmail.com>:
>>>
>>>> Full stack trace:
>>>>
>>>> opt: /home/xbolva00/LLVM/llvm/include/llvm/ADT/Optional.h:176: T*
>>>> llvm::Optional<T>::getPointer() [with T = llvm::MemoryLocation]:
>>>> Assertion `Storage.hasVal' failed.
>>>> Stack dump:
>>>> 0. Program arguments: opt aaa.ll -dse -S
>>>> 1. Running pass 'Function Pass Manager' on module 'aaa.ll'.
>>>> 2. Running pass 'Dead Store Elimination' on function '@calloc_strlen'
>>>> LLVMSymbolizer: error reading file: No such file or directory
>>>> #0 0x000056135ebe698a (opt+0x212198a)
>>>> #1 0x000056135ebe4cf4 (opt+0x211fcf4)
>>>> #2 0x000056135ebe4e32 (opt+0x211fe32)
>>>> #3 0x00007f6e35b14150 __restore_rt (/lib/x86_64-linux-gnu/libpthr
>>>> ead.so.0+0x13150)
>>>> #4 0x00007f6e3481b0bb gsignal /build/glibc-itYbWN/glibc-2.26
>>>> /signal/../sysdeps/unix/sysv/linux/raise.c:51:0
>>>> #5 0x00007f6e3481cf5d abort /build/glibc-itYbWN/glibc-2.26
>>>> /stdlib/abort.c:92:0
>>>> #6 0x00007f6e34812f17 __assert_fail_base /build/glibc-itYbWN/glibc-2.26
>>>> /assert/assert.c:92:0
>>>> #7 0x00007f6e34812fc2 (/lib/x86_64-linux-gnu/libc.so.6+0x2efc2)
>>>> #8 0x000056135e962b80 (opt+0x1e9db80)
>>>> #9 0x000056135e969260 (opt+0x1ea4260)
>>>> #10 0x000056135e96a6e0 (opt+0x1ea56e0)
>>>> #11 0x000056135e61d561 (opt+0x1b58561)
>>>> #12 0x000056135e61d5d9 (opt+0x1b585d9)
>>>> #13 0x000056135e61cbb7 (opt+0x1b57bb7)
>>>> #14 0x000056135d175216 (opt+0x6b0216)
>>>> #15 0x00007f6e348051c1 __libc_start_main /build/glibc-itYbWN/glibc-2.26
>>>> /csu/../csu/libc-start.c:342:0
>>>> #16 0x000056135d1f404a (opt+0x72f04a)
>>>>
>>>>
>>>> 2018-05-22 23:32 GMT+02:00 Friedman, Eli <efriedma at codeaurora.org>:
>>>>
>>>>> It looks like the memoryIsNotModifiedBetween assumes the second
>>>>> argument is a store, or some other instruction supported by
>>>>> MemoryLocation::get.  If you're passing in something else, you'll have to
>>>>> compute the MemoryLocation some other way.
>>>>>
>>>>> (Generally, if you're asking a question about an assertion, please
>>>>> include the whole stack trace; it's hard to guess what's happening
>>>>> otherwise.)
>>>>>
>>>>> -Eli
>>>>>
>>>>>
>>>>> On 5/22/2018 2:16 PM, Dávid Bolvanský wrote:
>>>>>
>>>>> * if (isStringFromCalloc(Dst, TLI)) should be if
>>>>> (!isStringFromCalloc(Dst, TLI))
>>>>> but still asserting...
>>>>>
>>>>> 2018-05-22 23:06 GMT+02:00 Dávid Bolvanský <david.bolvansky at gmail.com>
>>>>> :
>>>>>
>>>>>> Can you help a bit?
>>>>>>
>>>>>> I try to work with DSE but I got the following assert:
>>>>>> opt: /home/xbolva00/LLVM/llvm/include/llvm/ADT/Optional.h:176: T*
>>>>>> llvm::Optional<T>::getPointer() [with T = llvm::MemoryLocation]:
>>>>>> Assertion `Storage.hasVal' failed.
>>>>>>
>>>>>> static bool eliminateStrlen(CallInst *CI, BasicBlock::iterator &BBI,
>>>>>>                             AliasAnalysis *AA, MemoryDependenceResults *MD,
>>>>>>                             const DataLayout &DL, const TargetLibraryInfo *TLI,
>>>>>>                             InstOverlapIntervalsTy &IOL,
>>>>>>                             DenseMap<Instruction *, size_t> *InstrOrdering) {
>>>>>>
>>>>>>   // Must be a strlen.
>>>>>>   LibFunc Func;
>>>>>>   Function *Callee = CI->getCalledFunction();
>>>>>>   if (!TLI->getLibFunc(*Callee, Func) || !TLI->has(Func) ||
>>>>>>       Func != LibFunc_strlen)
>>>>>>     return false;
>>>>>>
>>>>>>   Value *Dst = CI->getOperand(0);
>>>>>>   Instruction *UnderlyingPointer = dyn_cast<Instruction>(GetUnderlyingObject(Dst, DL));
>>>>>>   if (!UnderlyingPointer)
>>>>>>     return false;
>>>>>>   if (isStringFromCalloc(Dst, TLI))
>>>>>>     return false;
>>>>>>   errs() << "before\n";
>>>>>>   if (memoryIsNotModifiedBetween(UnderlyingPointer, CI, AA)) { <--- CRASH
>>>>>>     errs() << "after\n";
>>>>>>   }
>>>>>>   return false;
>>>>>> }
>>>>>>
>>>>>> Do you know what is wrong here? I followed the "example" (in eliminateNoopStore) how to use "memoryIsNotModifiedBetween".
>>>>>>
>>>>>> Thank you for advice
>>>>>>
>>>>>>
>>>>>> 2018-05-21 21:06 GMT+02:00 Friedman, Eli <efriedma at codeaurora.org>:
>>>>>>
>>>>>>> memoryIsNotModifiedBetween is precisely the sort of expensive walk
>>>>>>> we shouldn't be doing... I'm surprised it hasn't caused any serious issues
>>>>>>> yet.  Ideally, what we should be doing is using MemorySSA to find a
>>>>>>> dependency from the memset: if the closest dependency is the malloc, there
>>>>>>> aren't any stores between the memset and the malloc.  (But we aren't using
>>>>>>> MemorySSA in DSE yet; see https://reviews.llvm.org/D40480.)
>>>>>>>
>>>>>>> But yes, memoryIsNotModifiedBetween has the right meaning.
>>>>>>>
>>>>>>> -Eli
>>>>>>>
>>>>>>>
>>>>>>> On 5/21/2018 7:48 AM, Dávid Bolvanský wrote:
>>>>>>>
>>>>>>> "memory accesses between the malloc and the memset without an
>>>>>>> expensive linear scan of the block/function"
>>>>>>>
>>>>>>> (1) do you mean just use "memoryIsNotModifiedBetween" function in
>>>>>>> DSE to check it?
>>>>>>>
>>>>>>> x = maloc(..);
>>>>>>> memset(x, ...)
>>>>>>>
>>>>>>> (2) GetUnderlyingObject would give me Value * (from malloc) ?
>>>>>>>
>>>>>>> Also another case:
>>>>>>>
>>>>>>> memset(s, 0, len); // len > 1
>>>>>>> return strlen(s); // optimize to 0
>>>>>>>
>>>>>>> (3) How to check memset and strlen pairs? I have a strlen call, I
>>>>>>> have a "Value *" for "s". What is the best way to construct memset +
>>>>>>> strlen pairs?
>>>>>>>
>>>>>>> (4) Can this be somehow generalized (and not only for strlen)? So
>>>>>>> GetStringLength in ValueTracking would be taught about this info (string is
>>>>>>> empty after memset)
>>>>>>>
>>>>>>> (5) new malloc memset folding /  memset + strlen case should be
>>>>>>> implemented in DSE, right?
>>>>>>>
>>>>>>> 2018-05-17 21:36 GMT+02:00 Friedman, Eli <efriedma at codeaurora.org>:
>>>>>>>
>>>>>>>> The fundamental problem with trying to do that sort of transform in
>>>>>>>> instcombine is that you don't have access to MemoryDependenceAnalysis or
>>>>>>>> MemorySSA; you need a data structure like that to figure out whether there
>>>>>>>> are any memory accesses between the malloc and the memset without an
>>>>>>>> expensive linear scan of the block/function.  (You can sort of get around
>>>>>>>> the problem in simple cases by adding arbitrary limits to the number of you
>>>>>>>> scan, but it doesn't generalize well.)
>>>>>>>>
>>>>>>>> -Eli
>>>>>>>>
>>>>>>>>
>>>>>>>> On 5/17/2018 12:17 PM, Dávid Bolvanský wrote:
>>>>>>>>
>>>>>>>> As we talked in https://reviews.llvm.org/D45344, the problem was
>>>>>>>> dead stores. And I know why :D There was just -instcombine pass. I forgot
>>>>>>>> to do -dse before -instcombine so this is why I did custom "store removal"
>>>>>>>> code there.
>>>>>>>>
>>>>>>>> I would like to finish malloc + llvm.memset folding. Yes, you told
>>>>>>>> you would like to see the whole foldMallocMemset in DSE but extend it for
>>>>>>>> llvm.memset in InstCombine... is it really so bad to do?
>>>>>>>> We have standard malloc + memset folding there, so a  few  new
>>>>>>>> lines should not do bad things.
>>>>>>>>
>>>>>>>> If I reopen D45344, reupload patch with removed my custom "store
>>>>>>>> removal" code, It could be ok, no? The patch as is worked/works for me for
>>>>>>>> malloc + llvm.memset folding, I would just add -dse to tests to handle dead
>>>>>>>> stores.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2018-05-17 21:00 GMT+02:00 Friedman, Eli <efriedma at codeaurora.org>:
>>>>>>>>
>>>>>>>>> On 5/17/2018 8:58 AM, Dávid Bolvanský via llvm-dev wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I would like to find a way to do this removal properly. I found
>>>>>>>>>> DSE and "eliminateNoopStore" can be useful for this thing.
>>>>>>>>>>
>>>>>>>>>> What I mean?
>>>>>>>>>> int *test = malloc(15 * sizeof(int));
>>>>>>>>>> test[10] = 12; < ----- remove this store
>>>>>>>>>> memset(test,0,sizeof(int) * 15);
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This is classic dead store elimination, and it's already handled
>>>>>>>>> by DSE. At least, we optimize the following testcase:
>>>>>>>>>
>>>>>>>>> #include <stdlib.h>
>>>>>>>>> #include <string.h>
>>>>>>>>> void bar(int*);
>>>>>>>>> void f() {
>>>>>>>>>   int *test = malloc(15 * sizeof(int));
>>>>>>>>>   test[10] = 12;
>>>>>>>>>   memset(test,0,sizeof(int) * 15);
>>>>>>>>>   bar(test);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> You should be able to look at the existing code to understand how
>>>>>>>>> it's handled (using MemoryDependenceAnalysis).
>>>>>>>>>
>>>>>>>>> -Eli
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>>>>>>>> a Linux Foundation Collaborative Project
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>
>>>>>
>>>>
>>>
>>
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180523/b8db7b11/attachment.html>


More information about the llvm-dev mailing list