[llvm-dev] Proposal: writing calls that can nevertheless be removed

Max Kazantsev via llvm-dev llvm-dev at lists.llvm.org
Sun Nov 7 21:03:19 PST 2021


> I understand your attribute now as a way to express: "assume nobody will depend on the fact that there would have been a state change when you determine if a call is needed".
Does that somewhat match with your initial idea?

Yes, it does.

> So `readonly` on the call sites would give you the right effect but if we propagate it, or the inliner would make use of it, you would have a problem, right?

The only problem here is, according to langref:
> If a readonly function writes memory visible to the program, or has other side-effects, the behavior is undefined. If a function writes to a readonly pointer argument, the behavior is undefined.

Formally, the memory that forms the state is visible to the program; at the very least, it is visible to this function itself. So we cannot use readonly here, because that would be UB. But otherwise yes, all optimization effects of the proposed attribute would be the same as if it were a readonly function.
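
To make the motivating case concrete, here is a minimal C++ sketch of a lazily cached identity hash. The names `Obj`, `cachedHash` and `identityHash` are hypothetical, and the address-mixing formula is an assumption made purely for illustration, not what any JVM actually does. The point is only that the function writes memory visible to the program (so `readonly` would be UB), while a call whose result is unused could be dropped without changing any later result.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a managed object with a lazily computed hash.
struct Obj {
    std::uint32_t cachedHash = 0;  // 0 means "not computed yet"
};

// Writes to the object on the first call, so it is not readonly,
// yet the returned value depends only on the object itself.
std::uint32_t identityHash(Obj &o) {
    if (o.cachedHash == 0) {
        // Assumed formula: mix the address bits (illustration only).
        auto bits = reinterpret_cast<std::uintptr_t>(&o);
        auto h = static_cast<std::uint32_t>(bits ^ (bits >> 16));
        o.cachedHash = h ? h : 1;  // keep 0 reserved as the sentinel
    }
    assert(o.cachedHash != 0);     // the hash is always published here
    return o.cachedHash;
}
```

The first call stores into `o`; every later call returns the same value, whether or not an earlier call with an unused result was actually executed.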

-----Original Message-----
From: Johannes Doerfert <johannesdoerfert at gmail.com> 
Sent: Wednesday, November 3, 2021 8:50 AM
To: Max Kazantsev <mkazantsev at azul.com>; llvm-dev at lists.llvm.org
Cc: Dmitry Makogon <dmakogon at azul.com>; Artur Pilipenko <apilipenko at azul.com>
Subject: Re: [llvm-dev] Proposal: writing calls that can nevertheless be removed


On 11/2/21 02:43, Max Kazantsev wrote:
> I don't think we should be discussing globals here at all. Singleton was just an example why we might need it. In our motivating case, the mutated state is internal state of a parameter object (specifically, its stored hash). My proposal wasn't about globals at all, there might not be any global at all involved in the process.

Let's step back then:

The function you showed, assuming it is an accurate model of the kind of function you want to tag, has a write effect based on *some* state.
The function cannot be removed, even if the result is unused, because it may change *some* state.
The only way to remove all calls to the function (right now) is to either (A) show it never changes the state, or (B) show nobody will observe the state change.
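
Case (B) can be illustrated with a small C++ model of the singleton from the original mail. This is only a sketch: the `value` field, the constant 42, and the helpers `useAfterDeadLoop` and `useWithoutLoop` are made up for illustration, and the static `storage` object stands in for the unspecified idempotent initialization.

```cpp
#include <cassert>

struct MyObj {
    int value;  // hypothetical payload
};

// Only getUnique reads or writes this variable.
static MyObj *Singleton = nullptr;

MyObj *getUnique() {
    if (Singleton)
        return Singleton;
    static MyObj storage{42};  // stands in for <idempotent initialization>
    Singleton = &storage;      // the store that blocks DCE today
    assert(Singleton != nullptr);
    return Singleton;
}

// Calls getUnique 1000 times with unused results, then uses it once.
int useAfterDeadLoop() {
    for (int I = 0; I < 1000; I++)
        getUnique();
    return getUnique()->value;
}

// Same as above with the loop deleted.
int useWithoutLoop() {
    return getUnique()->value;
}
```

Both helpers return the same value, since the only reader of `Singleton` is `getUnique` itself and it initializes the state on demand.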

I understand your attribute now as a way to express: "assume nobody will depend on the fact that there would have been a state change when you determine if a call is needed".
Does that somewhat match with your initial idea?

Doesn't that basically mean: assume call sites are readonly but the callee is not (and do not propagate this from call sites to callee)?
So `readonly` on the call sites would give you the right effect but if we propagate it, or the inliner would make use of it, you would have a problem, right?

I guess if we reorganize the effect attributes
(read/write/argmem/inacc/...) we could introduce a "call_only" version to express this.
So it would look like `call getUnique() effects(globalsonly, readonly | call_only)`, or something like that.

WDYT?

~ Johannes



>
> -----Original Message-----
> From: Max Kazantsev
> Sent: Tuesday, November 2, 2021 2:41 PM
> To: Johannes Doerfert <johannesdoerfert at gmail.com>; 
> llvm-dev at lists.llvm.org
> Cc: Dmitry Makogon <dmakogon at azul.com>; Artur Pilipenko 
> <apilipenko at azul.com>
> Subject: RE: [llvm-dev] Proposal: writing calls that can nevertheless 
> be removed
>
> I think it's an orthogonal proposition. What I propose is: "it does change some external state, but we don't care what this state is or how it is changed, as long as the idempotency and "can remove" properties hold". The state may in theory consist of dozens of variables, but that doesn't matter. You may think of this state as a static field of some class, for example. Or a set of fields. Or a static hash map, which may have its own (complex) state.
>
> -----Original Message-----
> From: Johannes Doerfert <johannesdoerfert at gmail.com>
> Sent: Monday, November 1, 2021 11:03 PM
> To: Max Kazantsev <mkazantsev at azul.com>; llvm-dev at lists.llvm.org
> Cc: Dmitry Makogon <dmakogon at azul.com>; Artur Pilipenko 
> <apilipenko at azul.com>
> Subject: Re: [llvm-dev] Proposal: writing calls that can nevertheless 
> be removed
>
>
> On 11/1/21 10:34, Max Kazantsev wrote:
>> Right, but how is it a problem? Whoever sets this attribute bears all responsibility for its correctness, so it's used when appropriate.
> I'd avoid attributes that have "remote effects". Thus, any attribute should go on the global, not the function. If you can annotate the global with something that says "I will only be directly accessed by the `getUnique` function" (e.g., `only_accessed_directly_by(@getUnique)`),
> we should be able to optimize this properly. If we annotate the function instead, we would somehow have to iterate over users and then argue that the user's attribute (which doesn't exist yet) has a remote effect on this global.
> Does that make some sense?
>
> ~ Johannes
>
>
>> --Max
>>
>>
>> -----Original Message-----
>> From: Johannes Doerfert <johannesdoerfert at gmail.com>
>> Sent: Monday, November 1, 2021 10:06 PM
>> To: Max Kazantsev <mkazantsev at azul.com>; llvm-dev at lists.llvm.org
>> Cc: Dmitry Makogon <dmakogon at azul.com>; Artur Pilipenko 
>> <apilipenko at azul.com>
>> Subject: Re: [llvm-dev] Proposal: writing calls that can nevertheless 
>> be removed
>>
>> I assume you cannot make `Singleton` a "non-exposed" symbol, correct?
>> At the end of the day, it is the global's visibility/linkage that is problematic, not necessarily the function.
>>
>> ~ Johannes
>>
>>
>> On 11/1/21 07:27, Max Kazantsev via llvm-dev wrote:
>>> Hello everyone,
>>>
>>> I have a proposal to introduce a function attribute to solve a dead code elimination problem for singleton-get-like functions. Imagine the following code:
>>>
>>> MyObj *Singleton = nullptr;
>>>
>>> MyObj *getUnique() {
>>>      if (Singleton)
>>>        return Singleton;
>>>      Singleton = <idempotent initialization>;
>>>      return Singleton;
>>> }
>>>
>>> Here getUnique is the only accessor of the Singleton variable, which cannot otherwise be read or written.
>>>
>>> This function formally writes memory, but it has 2 important properties:
>>> - This function is idempotent, meaning that its result depends only
>>> on its parameters (of which there are none in this case);
>>> - If the result of the call is not used, the call can safely be deleted without causing any problems.
>>>
>>> In other words, if we have code like
>>>
>>> for (int I = 0; I < 1000; I++) {
>>>      getUnique();
>>> }
>>>
>>> We can consider this loop dead and just delete it. Note that currently, because getUnique may write external memory, we cannot do so (even if we inline it and cleanse the remaining code, we'll still have a store to Singleton).
>>>
>>> In the Java world, the motivation for this is the Object identity hash code. It is required to return the same value for the same object every time, and it may write memory (usually it computes the hash on the first invocation <https://srvaroa.github.io/jvm/java/openjdk/biased-locking/2017/01/30/hashCode.html>, based on the current address, and stores it in the object). But if the hash code result is not used, it's completely fine to skip this invocation.
>>>
>>> Among the current function attributes, I could not find anything that could express "this is a mem-writing idempotent function which is OK to delete if trivially dead". Optimization prospects for this are great: we can simply DCE such calls.
>>>
>>> Do you think it's worth having such an attribute?
>>>
>>> Best regards,
>>> Max
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

