[llvm-dev] DSE: Remove useless stores between malloc & memset
Friedman, Eli via llvm-dev
llvm-dev at lists.llvm.org
Tue May 22 16:08:24 PDT 2018
You might want to look more carefully at how you're constructing the
MemoryLocation. The first argument is a pointer, and the second
argument is the number of bytes pointed to by that pointer (or
MemoryLocation::UnknownSize if the number of bytes accessed isn't known).
More generally, copy-pasting code you don't understand isn't a good idea.
-Eli
On 5/22/2018 4:02 PM, Dávid Bolvanský wrote:
> IR:
> define i32 @calloc_strlen_write_between() {
> %call = tail call noalias i8* @calloc(i32 10, i32 1)
> store i8 97, i8* %call, align 1
> %call1 = tail call i32 @strlen(i8* %call)
> ret i32 %call1
> }
>
>
> static bool eliminateStrlen(CallInst *CI, BasicBlock::iterator &BBI,
> AliasAnalysis *AA, MemoryDependenceResults *MD,
> const DataLayout &DL, const TargetLibraryInfo *TLI,
> InstOverlapIntervalsTy &IOL,
> DenseMap<Instruction *, size_t> *InstrOrdering) {
>
> // Must be a strlen.
> LibFunc Func;
> Function *Callee = CI->getCalledFunction();
> if (!TLI->getLibFunc(*Callee, Func) || !TLI->has(Func) ||
> Func != LibFunc_strlen)
> return false;
>
> Value *Dst = CI->getOperand(0);
> Instruction *UnderlyingPointer =
> dyn_cast<Instruction>(GetUnderlyingObject(Dst, DL));
> if (!UnderlyingPointer)
> return false;
> if (!isStringFromCalloc(Dst, TLI))
> return false;
>
> if (memoryIsNotModifiedBetween(UnderlyingPointer, CI, AA)) {
> Value *Len = ConstantInt::get(CI->getType(), 0);
> CI->replaceAllUsesWith(Len);
> CI->eraseFromParent();
> return true;
> }
> return false;
> }
>
>
> ------------------------------------------------------
> That IR is still wrongly transformed with this code to ret i32 0 (but
> there is write between calloc and strlen). Any suggestions?
>
> 2018-05-23 0:49 GMT+02:00 Dávid Bolvanský <david.bolvansky at gmail.com
> <mailto:david.bolvansky at gmail.com>>:
>
> It works with
>
> MemoryLocation MemoryLocation::get(const CallInst *CI) {
> AAMDNodes AATags;
> CI->getAAMetadata(AATags);
> const auto &DL = CI->getModule()->getDataLayout();
>
> return MemoryLocation(CI, DL.getTypeStoreSize(CI->getType()), AATags);
> }
>
> Is it fine? :)
>
> 2018-05-22 23:56 GMT+02:00 Dávid Bolvanský
> <david.bolvansky at gmail.com <mailto:david.bolvansky at gmail.com>>:
>
> Looks like there are many overloads for "get".
> http://llvm.org/doxygen/MemoryLocation_8cpp_source.html
> <http://llvm.org/doxygen/MemoryLocation_8cpp_source.html>
>
> But nothing for CallInst. Any suggestions how to do a proper
> one? I will look at it too.
>
> 2018-05-22 23:34 GMT+02:00 Dávid Bolvanský
> <david.bolvansky at gmail.com <mailto:david.bolvansky at gmail.com>>:
>
> Full stack trace:
>
> opt:
> /home/xbolva00/LLVM/llvm/include/llvm/ADT/Optional.h:176:
> T* llvm::Optional<T>::getPointer() [with T =
> llvm::MemoryLocation]: Assertion `Storage.hasVal' failed.
> Stack dump:
> 0.Program arguments: opt aaa.ll -dse -S
> 1.Running pass 'Function Pass Manager' on module 'aaa.ll'.
> 2.Running pass 'Dead Store Elimination' on function
> '@calloc_strlen'
> LLVMSymbolizer: error reading file: No such file or directory
> #0 0x000056135ebe698a (opt+0x212198a)
> #1 0x000056135ebe4cf4 (opt+0x211fcf4)
> #2 0x000056135ebe4e32 (opt+0x211fe32)
> #3 0x00007f6e35b14150 __restore_rt
> (/lib/x86_64-linux-gnu/libpthread.so.0+0x13150)
> #4 0x00007f6e3481b0bb gsignal
> /build/glibc-itYbWN/glibc-2.26/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
> #5 0x00007f6e3481cf5d abort
> /build/glibc-itYbWN/glibc-2.26/stdlib/abort.c:92:0
> #6 0x00007f6e34812f17 __assert_fail_base
> /build/glibc-itYbWN/glibc-2.26/assert/assert.c:92:0
> #7 0x00007f6e34812fc2
> (/lib/x86_64-linux-gnu/libc.so.6+0x2efc2)
> #8 0x000056135e962b80 (opt+0x1e9db80)
> #9 0x000056135e969260 (opt+0x1ea4260)
> #10 0x000056135e96a6e0 (opt+0x1ea56e0)
> #11 0x000056135e61d561 (opt+0x1b58561)
> #12 0x000056135e61d5d9 (opt+0x1b585d9)
> #13 0x000056135e61cbb7 (opt+0x1b57bb7)
> #14 0x000056135d175216 (opt+0x6b0216)
> #15 0x00007f6e348051c1 __libc_start_main
> /build/glibc-itYbWN/glibc-2.26/csu/../csu/libc-start.c:342:0
> #16 0x000056135d1f404a (opt+0x72f04a)
>
>
> 2018-05-22 23:32 GMT+02:00 Friedman, Eli
> <efriedma at codeaurora.org <mailto:efriedma at codeaurora.org>>:
>
> It looks like the memoryIsNotModifiedBetween assumes
> the second argument is a store, or some other
> instruction supported by MemoryLocation::get. If
> you're passing in something else, you'll have to
> compute the MemoryLocation some other way.
>
> (Generally, if you're asking a question about an
> assertion, please include the whole stack trace; it's
> hard to guess what's happening otherwise.)
>
> -Eli
>
>
> On 5/22/2018 2:16 PM, Dávid Bolvanský wrote:
>> * if (isStringFromCalloc(Dst, TLI)) should be if
>> (!isStringFromCalloc(Dst, TLI))
>> but still asserting...
>>
>> 2018-05-22 23:06 GMT+02:00 Dávid Bolvanský
>> <david.bolvansky at gmail.com
>> <mailto:david.bolvansky at gmail.com>>:
>>
>> Can you help a bit?
>>
>> I try to work with DSE but I got the following
>> assert:
>> opt:
>> /home/xbolva00/LLVM/llvm/include/llvm/ADT/Optional.h:176:
>> T* llvm::Optional<T>::getPointer() [with T =
>> llvm::MemoryLocation]: Assertion `Storage.hasVal'
>> failed.
>>
>> static bool eliminateStrlen(CallInst *CI, BasicBlock::iterator &BBI,
>> AliasAnalysis *AA, MemoryDependenceResults *MD,
>> const DataLayout &DL, const TargetLibraryInfo *TLI,
>> InstOverlapIntervalsTy &IOL,
>> DenseMap<Instruction *, size_t> *InstrOrdering) {
>>
>> // Must be a strlen.
>> LibFunc Func;
>> Function *Callee = CI->getCalledFunction();
>> if (!TLI->getLibFunc(*Callee, Func) || !TLI->has(Func) ||
>> Func != LibFunc_strlen)
>> return false;
>>
>> Value *Dst = CI->getOperand(0);
>> Instruction *UnderlyingPointer = dyn_cast<Instruction>(GetUnderlyingObject(Dst, DL));
>> if (!UnderlyingPointer)
>> return false;
>> if (isStringFromCalloc(Dst, TLI))
>> return false;
>> errs() << "before\n";
>> if (memoryIsNotModifiedBetween(UnderlyingPointer, CI, AA)) { <--- CRASH
>> errs() << "after\n";
>> }
>> return false;
>> }
>>
>> Do you know what is wrong here? I followed the "example" (in eliminateNoopStore) how to use "memoryIsNotModifiedBetween".
>>
>> Thank you for advice
>>
>>
>> 2018-05-21 21:06 GMT+02:00 Friedman, Eli
>> <efriedma at codeaurora.org
>> <mailto:efriedma at codeaurora.org>>:
>>
>> memoryIsNotModifiedBetween is precisely the
>> sort of expensive walk we shouldn't be
>> doing... I'm surprised it hasn't caused any
>> serious issues yet. Ideally, what we should
>> be doing is using MemorySSA to find a
>> dependency from the memset: if the closest
>> dependency is the malloc, there aren't any
>> stores between the memset and the malloc.
>> (But we aren't using MemorySSA in DSE yet;
>> see https://reviews.llvm.org/D40480
>> <https://reviews.llvm.org/D40480>.)
>>
>> But yes, memoryIsNotModifiedBetween has the
>> right meaning.
>>
>> -Eli
>>
>>
>> On 5/21/2018 7:48 AM, Dávid Bolvanský wrote:
>>> "memory accesses between the malloc and the
>>> memset without an expensive linear scan of
>>> the block/function"
>>>
>>> (1) do you mean just use
>>> "memoryIsNotModifiedBetween" function in DSE
>>> to check it?
>>>
>>> x = maloc(..);
>>> memset(x, ...)
>>>
>>> (2) GetUnderlyingObject would give me Value
>>> * (from malloc) ?
>>>
>>> Also another case:
>>>
>>> memset(s, 0, len); // len > 1
>>> return strlen(s); // optimize to 0
>>>
>>> (3) How to check memset and strlen pairs? I
>>> have a strlen call, I have a "Value *" for
>>> "s". What is the best way to construct
>>> memset + strlen pairs?
>>>
>>> (4) Can this be somehow generalized (and not
>>> only for strlen)? So GetStringLength in
>>> ValueTracking would be taught about this
>>> info (string is empty after memset)
>>>
>>> (5) new malloc memset folding /
>>> memset + strlen case should be implemented
>>> in DSE, right?
>>>
>>> 2018-05-17 21:36 GMT+02:00 Friedman, Eli
>>> <efriedma at codeaurora.org
>>> <mailto:efriedma at codeaurora.org>>:
>>>
>>> The fundamental problem with trying to
>>> do that sort of transform in instcombine
>>> is that you don't have access to
>>> MemoryDependenceAnalysis or MemorySSA;
>>> you need a data structure like that to
>>> figure out whether there are any memory
>>> accesses between the malloc and the
>>> memset without an expensive linear scan
>>> of the block/function. (You can sort of
>>> get around the problem in simple cases
>>> by adding arbitrary limits to the number
>>> of you scan, but it doesn't generalize
>>> well.)
>>>
>>> -Eli
>>>
>>>
>>> On 5/17/2018 12:17 PM, Dávid Bolvanský
>>> wrote:
>>>> As we talked in
>>>> https://reviews.llvm.org/D45344
>>>> <https://reviews.llvm.org/D45344>, the
>>>> problem was dead stores. And I know why
>>>> :D There was just -instcombine pass. I
>>>> forgot to do -dse before
>>>> -instcombine so this is why I did
>>>> custom "store removal" code there.
>>>>
>>>> I would like to finish malloc +
>>>> llvm.memset folding. Yes, you told you
>>>> would like to see the whole
>>>> foldMallocMemset in DSE but extend it
>>>> for llvm.memset in InstCombine... is it
>>>> really so bad to do?
>>>> We have standard malloc +
>>>> memset folding there, so a few new
>>>> lines should not do bad things.
>>>>
>>>> If I reopen D45344, reupload patch with
>>>> removed my custom "store removal" code,
>>>> It could be ok, no? The patch as is
>>>> worked/works for me for malloc
>>>> + llvm.memset folding, I would just add
>>>> -dse to tests to handle dead stores.
>>>>
>>>>
>>>>
>>>> 2018-05-17 21:00 GMT+02:00 Friedman,
>>>> Eli <efriedma at codeaurora.org
>>>> <mailto:efriedma at codeaurora.org>>:
>>>>
>>>> On 5/17/2018 8:58 AM, Dávid
>>>> Bolvanský via llvm-dev wrote:
>>>>
>>>> Hello,
>>>>
>>>> I would like to find a way to
>>>> do this removal properly. I
>>>> found DSE and
>>>> "eliminateNoopStore" can be
>>>> useful for this thing.
>>>>
>>>> What I mean?
>>>> int *test = malloc(15 *
>>>> sizeof(int));
>>>> test[10] = 12; < ----- remove
>>>> this store
>>>> memset(test,0,sizeof(int) * 15);
>>>>
>>>>
>>>> This is classic dead store
>>>> elimination, and it's already
>>>> handled by DSE. At least, we
>>>> optimize the following testcase:
>>>>
>>>> #include <stdlib.h>
>>>> #include <string.h>
>>>> void bar(int*);
>>>> void f() {
>>>> int *test = malloc(15 * sizeof(int));
>>>> test[10] = 12;
>>>> memset(test,0,sizeof(int) * 15);
>>>> bar(test);
>>>> }
>>>>
>>>> You should be able to look at the
>>>> existing code to understand how
>>>> it's handled (using
>>>> MemoryDependenceAnalysis).
>>>>
>>>> -Eli
>>>>
>>>> --
>>>> Employee of Qualcomm Innovation
>>>> Center, Inc.
>>>> Qualcomm Innovation Center, Inc. is
>>>> a member of Code Aurora Forum, a
>>>> Linux Foundation Collaborative Project
>>>>
>>>>
>>>
>>> --
>>> Employee of Qualcomm Innovation Center, Inc.
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>
>>>
>>
>> --
>> Employee of Qualcomm Innovation Center, Inc.
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>
>>
>>
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>
>
>
>
>
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180522/b4fab287/attachment-0001.html>
More information about the llvm-dev
mailing list