[PATCH] D26437: Use -fno-unit-at-a-time and -funit-at-a-time
Justin Lebar via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 11 14:01:45 PST 2016
jlebar added a comment.
> Convergent attribute is missing. You may argue that atomic_work_item_fence does not have convergent attribute, but it should not in fact. Then I can remove a body of the sub_group_barrier completely so it does not call anything at all, result is still the same.
Ah, I think I understand your problem.
The problem is that you are explicitly marking this function as convergent, and then are surprised that LLVM is removing the convergent attribute.
Is that right?
This is WAI for how we have interpreted the convergent attribute in clang and llvm. Maybe our intended behavior is wrong, we can discuss it. But first I want to be really clear that this has nothing to do with running IPOs, or compiling a bitcode library. You would have exactly the same problem, and exactly the same bug, if your testcase were the whole program. Does that make sense?
As I said above:
> clang adds the "convergent" attribute to every function it emits. llvm then removes this attribute from functions that do not contain convergent instructions or call convergent functions.
This is the reason that you are seeing what you are seeing: llvm is removing the "convergent" attribute from this function because it does not contain any convergent instructions. Again, it has nothing to do with bitcode libraries. Indeed, if we made the change proposed here, it would likely not fix your bug. As soon as you link this bitcode library into another library and run the IPOs over the combined result, it's going to again remove the convergent attribute on this function.
> You may argue that atomic_work_item_fence does not have convergent attribute, but it should not in fact.
Can you elaborate on why atomic_work_item_fence should not have the convergent attribute but sub_group_barrier should be convergent?
Specifically, the convergent attribute prevents additional control-flow dependencies from being added to a callsite. Are you saying that it is incorrect to convert
sub_group_barrier();
if (a) { ... }
else { ... }
into
if (a) { sub_group_barrier(); ... }
else { sub_grouop_barrier(); ... }
but it *is* ok to perform the same transformation on atomic_work_item_fence? Why?
> Then I can remove a body of the sub_group_barrier completely so it does not call anything at all, result is still the same.
Yes, there is currently no way for you to say "this empty function must be convergent."
Can you explain why this is an important feature?
> Also note that besides functions attribute pass PassManagerBuilder adds a lot of other passes under unit-a-time-mode which are not expected to run in a non-whole program mode, we have there IPSCCP, GlobalOptimizer, DeadArgElimination etc. All of that is not supposed to run on a library module.
If it's OK with you, I'd like to focus on the convergent issue first. Once we get that resolved, Richard Smith should be able to speak to your larger question of funit-at-a-time.
Repository:
rL LLVM
https://reviews.llvm.org/D26437
More information about the llvm-commits
mailing list