[clang] [RFC] [clang] [CodeGen] Avoid creating global variable repeatedly when type are not specified (PR #114948)
via cfe-commits
cfe-commits at lists.llvm.org
Tue Nov 5 00:13:29 PST 2024
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-clang-codegen
Author: Chuanqi Xu (ChuanqiXu9)
<details>
<summary>Changes</summary>
This comes from an internal crash. I know generally it is better to reproduce it first but I do feel the pattern is pretty risky. So I am wondering if we can discuss it first. So maybe this is more of a discussion instead of a pure PR.
Then story is, when we try to get or create a LLVM global for a C/C++'s global, we will try to look up the name first for the existing globals. And if we find one, we will perform some checks. If the checks pass, we will return the found one. If not, we will create a new one and replace the previous one. (Why do we want to do this? My instinct reaction is that we should abort here):
https://github.com/llvm/llvm-project/blob/bf43a138f0a6220cd043a376200bd739cacb25e3/clang/lib/CodeGen/CodeGenModule.cpp#L4966-L4982
https://github.com/llvm/llvm-project/blob/bf43a138f0a6220cd043a376200bd739cacb25e3/clang/lib/CodeGen/CodeGenModule.cpp#L5017-L5032
The problem is, if we store the address of a global variable and the global variable got replaced later, the address we stored became a wild pointer! e.g.
https://github.com/llvm/llvm-project/blob/283273fa1e3be4a03f06a5efd08a8c818be981fd/clang/lib/CodeGen/CodeGenModule.cpp#L2092-L2097
I feel this is pretty dangerous. And to my knowledge, I think we'd better to not remove things emitted during CodeGen.
Then, one of the trigger for the problem is `CodeGenModule::GetAddrOfGlobalVar`:
https://github.com/llvm/llvm-project/blob/283273fa1e3be4a03f06a5efd08a8c818be981fd/clang/lib/CodeGen/CodeGenModule.cpp#L5232C17-L5243
The arguments except `D` can be omitted. And if we don't specify `Ty`, the function will try to deduce the type from `D`. And use the type to get or create a LLVM global in the above process. And the `Ty` arguments may not always be omitted, e.g., in https://github.com/llvm/llvm-project/blob/283273fa1e3be4a03f06a5efd08a8c818be981fd/clang/lib/CodeGen/CodeGenModule.cpp#L5484-L5564, we will try to deduce the LLVM type directly.
Then problem happens, sometimes we try to get or create the global variable by the AST type, but sometimes we try to get or create the **same** global variable by deduced type, and if the two types differs, we may be in the trouble of wild pointer.
(the two types are compatible: e.g., one is `struct { %another.struct}` with `%another.struct = { ptr }` and another type is `{ { ptr } }`).
The solution or one workaround I got is, in `CodeGenModule::GetAddrOfGlobalVar`, if we didn't specify the `Ty` and we have the same variable, return the variable directly. I think it makes sense since if the `Ty` is not specified, it implies the caller doesn't care about it too much.
WDYT?
---
Full diff: https://github.com/llvm/llvm-project/pull/114948.diff
2 Files Affected:
- (modified) clang/lib/CodeGen/CodeGenModule.cpp (+8-1)
- (modified) clang/test/CodeGen/attr-weakref2.c (+1-1)
``````````diff
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index ba376f9ecfacde..9566cfb8d6e794 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -5233,11 +5233,18 @@ llvm::Constant *CodeGenModule::GetAddrOfGlobalVar(const VarDecl *D,
llvm::Type *Ty,
ForDefinition_t IsForDefinition) {
assert(D->hasGlobalStorage() && "Not a global variable");
+
+ StringRef MangledName = getMangledName(D);
+ llvm::GlobalValue *Entry = GetGlobalValue(MangledName);
QualType ASTTy = D->getType();
+ LangAS AddrSpace = ASTTy.getAddressSpace();
+
+ if (Entry && !Ty && Entry->getAddressSpace() == getContext().getTargetAddressSpace(AddrSpace))
+ return Entry;
+
if (!Ty)
Ty = getTypes().ConvertTypeForMem(ASTTy);
- StringRef MangledName = getMangledName(D);
return GetOrCreateLLVMGlobal(MangledName, Ty, ASTTy.getAddressSpace(), D,
IsForDefinition);
}
diff --git a/clang/test/CodeGen/attr-weakref2.c b/clang/test/CodeGen/attr-weakref2.c
index 114f048a851832..a67f906810faf3 100644
--- a/clang/test/CodeGen/attr-weakref2.c
+++ b/clang/test/CodeGen/attr-weakref2.c
@@ -33,7 +33,7 @@ int test4_h(void) {
}
int test4_f;
-// CHECK: @test5_f = external global i32
+// CHECK: @test5_f = extern_weak global i32
extern int test5_f;
static int test5_g __attribute__((weakref("test5_f")));
int test5_h(void) {
``````````
</details>
https://github.com/llvm/llvm-project/pull/114948
More information about the cfe-commits
mailing list