[llvm-dev] Incorrect code generation when using -fprofile-generate on code which contains exception handling (Windows target)
Chrulski, Christopher M via llvm-dev
llvm-dev at lists.llvm.org
Tue Jan 14 19:38:52 PST 2020
Thanks. I’ll take a look at the examples and try creating a patch.
Chris
From: Shoaib Meenai <smeenai at fb.com>
Sent: Tuesday, January 14, 2020 2:49 PM
To: Reid Kleckner <rnk at google.com>; Chrulski, Christopher M <christopher.m.chrulski at intel.com>
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Incorrect code generation when using -fprofile-generate on code which contains exception handling (Windows target)
https://reviews.llvm.org/D44641 is an example of using colorEHFunclets this way (and https://reviews.llvm.org/D45857 is a follow-up which demonstrates another usage). If I remember correctly, the analysis needs to be updated if the pass is updating the CFG, which those ARC passes weren’t doing. I’m not sure what that updating looks like, though I’m sure we have an example of it somewhere in-tree … Reid might know off the top of his head.
From: Reid Kleckner <rnk at google.com>
Date: Tuesday, January 14, 2020 at 12:45 PM
To: "Chrulski, Christopher M" <christopher.m.chrulski at intel.com>
Cc: Shoaib Meenai <smeenai at fb.com>, "llvm-dev at lists.llvm.org" <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] Incorrect code generation when using -fprofile-generate on code which contains exception handling (Windows target)
I think the simplest, most complete short-term fix would be to call llvm::colorEHFunclets and have the relevant instrumentation passes apply the appropriate funclet bundle when inserting function calls. It's not elegant, because it means every simple instrumentation pass that inserts regular function calls (ASan, TSan, MSan, instrprof, etc.) needs to be funclet-aware. But it will work, and the code isn't so bad.
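To make the idea concrete, here is a minimal standalone sketch of the coloring approach. It is a toy model and not LLVM code: Block, colorBlocks and chooseBundle are hypothetical names, blocks are identified by strings, and the parent of a funclet is assumed to be the color of the block that branches into its pad. The real analysis is llvm::colorEHFunclets, and a real pass would attach the bundle to the call it creates.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy CFG node (hypothetical, not an LLVM type).
struct Block {
  bool isFuncletPad = false;            // block begins with catchpad/cleanuppad
  std::vector<std::string> succs;       // ordinary successors
  std::vector<std::string> funcletRets; // catchret/cleanupret targets
};
using CFG = std::map<std::string, Block>;
using Colors = std::map<std::string, std::set<std::string>>;

// Color each reachable block with the funclet pads (or the entry block)
// that it belongs to, using a simple worklist.
Colors colorBlocks(const CFG &cfg, const std::string &entry) {
  Colors colors;
  std::map<std::string, std::string> parent; // pad -> enclosing color
  std::vector<std::pair<std::string, std::string>> work;
  work.push_back(std::make_pair(entry, entry));
  while (!work.empty()) {
    std::string name = work.back().first;
    std::string color = work.back().second;
    work.pop_back();
    const Block &b = cfg.at(name);
    if (b.isFuncletPad) {
      parent.insert(std::make_pair(name, color)); // assumption: parent is the incoming color
      color = name;                               // a pad starts its own funclet
    }
    if (!colors[name].insert(color).second)
      continue; // already visited with this color
    for (const std::string &s : b.succs)
      work.push_back(std::make_pair(s, color));
    for (const std::string &s : b.funcletRets) {
      // Leaving a funclet returns to the enclosing color.
      std::map<std::string, std::string>::const_iterator p = parent.find(color);
      work.push_back(std::make_pair(s, p == parent.end() ? entry : p->second));
    }
  }
  return colors;
}

enum class BundleKind { None, Funclet, Ambiguous };
struct BundleChoice {
  BundleKind kind;
  std::string pad; // pad whose token would go in the "funclet" bundle
};

// Decide what bundle a call inserted into `block` would need.
BundleChoice chooseBundle(const Colors &colors, const std::string &block,
                          const std::string &entry) {
  Colors::const_iterator it = colors.find(block);
  if (it == colors.end() || it->second.size() != 1)
    return {BundleKind::Ambiguous, ""}; // unreachable or shared between funclets
  const std::string &color = *it->second.begin();
  if (color == entry)
    return {BundleKind::None, ""}; // ordinary code, no bundle needed
  return {BundleKind::Funclet, color};
}
```

In this model, a block colored only by a catch pad yields that pad as the bundle operand, a block colored only by the entry needs no bundle, and a block with more than one color is reported as ambiguous rather than guessed at.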
On Tue, Jan 14, 2020 at 10:59 AM Chrulski, Christopher M <christopher.m.chrulski at intel.com> wrote:
Thanks for the link. I agree, it looks like another instance of the same class of problems. Since the compiler still requires funclet operand bundles, it sounds like one of the 3 options below will be required until the longer-term project of eliminating them is complete. Does anybody have a strong preference?
Chris
-----Original Message-----
From: Shoaib Meenai <smeenai at fb.com>
Sent: Monday, January 13, 2020 1:10 AM
To: Chrulski, Christopher M <christopher.m.chrulski at intel.com>; llvm-dev at lists.llvm.org
Cc: Reid Kleckner <rnk at google.com>
Subject: Re: [llvm-dev] Incorrect code generation when using -fprofile-generate on code which contains exception handling (Windows target)
I think this is the same underlying issue as https://bugs.llvm.org/show_bug.cgi?id=40320. CCing Reid, who's had a bunch of thoughts on this in the past.
On 1/11/20, 10:25 AM, "llvm-dev on behalf of Chrulski, Christopher M via llvm-dev" <llvm-dev-bounces at lists.llvm.org on behalf of llvm-dev at lists.llvm.org> wrote:
Hi,
I've run into a bug with the LLVM backend that causes incorrect code generation to happen when using -fprofile-generate on programs that contain C++ exception handling when building for Windows.
The problem occurs when the value profiling inserts function calls into exception handling blocks. The instrumentation inserts value profiling intrinsic calls, and these are subsequently lowered into target library calls. However, these library calls do not get a funclet operand bundle associated with them. This causes the Windows Exception Handling Preparation Pass to drop all the instructions within the exception handler starting from the PGO instrumentation call, and replace them with 'unreachable'. This is being done by the function removeImplausibleInstructions (WinEHPrepare.cpp).
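The effect described above can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions, not the code in WinEHPrepare.cpp: Instr and pruneFuncletBlock are hypothetical names, instructions are plain strings, and the real pass's exemption for nounwind intrinsics and any later dead-code cleanup are not modeled.

```cpp
#include <string>
#include <vector>

// Toy instruction (hypothetical, not an LLVM type).
struct Instr {
  std::string text;
  bool isCall;
  std::string funcletToken; // empty if the call carries no "funclet" bundle
};

// Keep instructions up to the first call whose "funclet" bundle does not
// name the enclosing pad, then terminate the block with 'unreachable'.
std::vector<Instr> pruneFuncletBlock(const std::vector<Instr> &block,
                                     const std::string &padToken) {
  std::vector<Instr> out;
  for (const Instr &i : block) {
    if (i.isCall && i.funcletToken != padToken) {
      out.push_back({"unreachable", false, ""});
      return out; // everything after the offending call is dropped
    }
    out.push_back(i);
  }
  return out;
}
```

With a block holding a catchpad, a counter update, a profiling call without a bundle, and the original virtual call with a bundle, the model keeps only the first two instructions and appends 'unreachable', so the virtual call is lost.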
A simple reproducer of the problem is shown here; it leads to incorrect code for the method test::run(). In this example, the virtual function call within the exception handler triggers the bug when using -fprofile-generate.
#include <stdexcept>
#include <iostream>

extern void may_throw(int);

class base {
public:
  base() : x(0) {}
  int get_x() const { return x; }
  virtual void update() { x++; }
  int x;
};

class derived : public base {
public:
  derived() {}
  virtual void update() { x--; }
};

class test {
public:
  void run(base* b, int count) {
    try {
      for (int i = 0; i < count; ++i)
        may_throw(i);
    }
    catch (std::exception& e) {
      // Virtual function call in exception handler for value profiling.
      b->update();
    }
  }
};

void run_test() {
  test tester;
  base *obj = new derived;
  tester.run(obj, 100);
  std::cout << "Value in obj (should be -1): " << obj->get_x() << "\n";
  if (obj->get_x() == -1)
    std::cout << "test passed\n";
  else
    std::cout << "test failed\n";
}

int main() {
  // Without PGO, test runs and prints result.
  // With -fprofile-generate, program seg-faults without printing.
  run_test();
  return 0;
}

__attribute__((noinline))
void may_throw(int x) {
  if (x > 10)
    throw std::range_error("value out of range");
}
On Windows, build with: clang -O2 -fprofile-generate test.cpp
When profiling is enabled the program will seg fault without printing anything. Without the -fprofile-generate flag, the program will run successfully.
The compiler problem is as follows: Prior to the Windows Exception Handling Preparation Pass, the IR for the function "test::run" contains the following:
19:                                               ; preds = %17
  %20 = catchpad within %18 [%rtti.TypeDescriptor19* @"??_R0?AVexception@std@@@8", i32 8, %"class.std::exception"** %6]
  %21 = load i64, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @"__profc_?run@test@@QEAAXPEAVbase@@H@Z", i64 0, i64 2), align 8
  %22 = add i64 %21, 1
  store i64 %22, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @"__profc_?run@test@@QEAAXPEAVbase@@H@Z", i64 0, i64 2), align 8
  %23 = bitcast %class.base* %1 to void (%class.base*)***
  %24 = load void (%class.base*)**, void (%class.base*)*** %23, align 8, !tbaa !9
  %25 = load void (%class.base*)*, void (%class.base*)** %24, align 8
  %26 = ptrtoint void (%class.base*)* %25 to i64
  call void @__llvm_profile_instrument_target(i64 %26, i8* bitcast ({ i64, i64, i64*, i8*, i8*, i32, [2 x i16] }* @"__profd_?run@test@@QEAAXPEAVbase@@H@Z" to i8*), i32 0)
  call void %25(%class.base* %1) [ "funclet"(token %20) ]
  call void @_CxxThrowException(i8* null, %eh.ThrowInfo* null) #15 [ "funclet"(token %20) ]
  unreachable
Following this pass, the IR has been replaced with the following, which breaks the original program. This occurs because the instrumentation call to "__llvm_profile_instrument_target" is not marked with the funclet operand bundle [ "funclet"(token %20) ].
19:                                               ; preds = %17
  %20 = catchpad within %18 [%rtti.TypeDescriptor19* @"??_R0?AVexception@std@@@8", i32 8, %"class.std::exception"** %6]
  %21 = load i64, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @"__profc_?run@test@@QEAAXPEAVbase@@H@Z", i64 0, i64 2), align 8
  %22 = add i64 %21, 1
  store i64 %22, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @"__profc_?run@test@@QEAAXPEAVbase@@H@Z", i64 0, i64 2), align 8
  unreachable
Possible solutions:
1) Avoid value profiling of calls within exception handling blocks
Pros: Solves the problem
Cons: Could lose some cases of value profiling, but since the exception code is not supposed to be the primary execution path, this should not be a significant performance issue.
2) Propagate the funclet information onto the value profiling intrinsics created. And then also propagate this info to the library routines these intrinsics get lowered into.
For indirect function calls, the funclet information can be copied from the original function call.
However, MemIntrinsic calls subject to operand value profiling do not have funclet operand bundles attached to them by the front-end. (I'm not sure it is possible to attach them, because the interfaces used to create these calls do not take operand bundles.) Therefore, PGO would need to use colorEHFunclets to identify the funclet operand bundle to attach to the instrumentation calls. Unfortunately, because a basic block can be associated with multiple funclets, or with both a funclet and code outside any funclet, this may also require cloning some basic blocks prior to computing the instrumentation, similar to the WinEHPrepare.cpp routine cloneCommonBlocks().
Pros: does not disable value profiling opportunities.
Cons: complex to implement due to the need to determine the appropriate funclet to place on the memory operand value profiling calls. This would necessitate the same cloning behavior to be done for the PGO use compilation.
3) Teach the Windows Exception Preparation Pass about the value profiling library functions. Currently this pass will ignore llvm intrinsic functions that are marked with the 'does not throw' attribute, but the value profiling intrinsic calls have been lowered from being intrinsic calls into runtime library target specific functions before reaching this point.
Pros: does not disable value profiling opportunities
Cons: requires exposing function names from InstrProf.h to the WinEHPrepare.cpp file, or requires a new attribute on the function calls to identify them as instrumentation library calls. Also, the IR would not accurately reflect the funclet operand bundle information for the function calls inserted by PGO.
For options 2 or 3 to work, the PGO indirect function call promotion pass used for -fprofile-use must also maintain the 'funclet' operand bundle on the specialized direct call that it inserts. Fortunately, the code within that pass clones the original indirect call, so the 'funclet' operand bundle is preserved on it.
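For comparison, option 1 could look roughly like the following standalone sketch. It is a toy model rather than LLVM code: Site and sitesOutsideFunclets are hypothetical names, and the block colors are assumed to have been computed already (in the compiler that would be the result of colorEHFunclets), with the string "entry" standing for code outside any funclet.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical record of a value-profiling candidate (not an LLVM type).
struct Site {
  std::string block;   // name of the containing basic block
  std::string operand; // description of the profiled value
};
using Colors = std::map<std::string, std::set<std::string>>;

// Keep only the sites whose block is colored solely by the function entry,
// that is, blocks that are not part of any funclet.
std::vector<Site> sitesOutsideFunclets(const std::vector<Site> &sites,
                                       const Colors &colors,
                                       const std::string &entry) {
  std::vector<Site> kept;
  for (const Site &s : sites) {
    Colors::const_iterator it = colors.find(s.block);
    if (it != colors.end() && it->second.size() == 1 &&
        it->second.count(entry) == 1)
      kept.push_back(s);
  }
  return kept;
}
```

A site in a block colored by a catch pad, such as the b->update() call in the reproducer, would be filtered out and never instrumented, which is the loss of profiling coverage listed under the cons of option 1.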
Any thoughts on which of these options should be taken, or other suggestions for resolving this problem?
Chris