[PATCH] D148665: Change -fsanitize=function to place two words before the function entry

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 19 07:43:36 PDT 2023

MaskRay marked an inline comment as done.
MaskRay added inline comments.

Comment at: clang/lib/CodeGen/TargetInfo.h:205
   getUBSanFunctionSignature(CodeGen::CodeGenModule &CGM) const {
-    return nullptr;
+    return llvm::ConstantInt::get(CGM.Int32Ty, 0xc105cafe);
peter.smith wrote:
> Is it worth making this target specific? Defaulting to 0xc105cafe for all targets, if that gets allocated in the future on any one target we can change it without having to test against all other targets.
> For example on AArch64 this is is in the decoding space of SME instructions op0 = 0b1 op1 = 0b0000. This may get allocated in the future. Although admittedly it is likely to be very rare to find a use of a SME instruction at the end of a function to cause it to get misidentified.
> I suspect most targets have an explicit undefined instruction, such as UDF in AArch64 which is 0x0000???? where ? can be any value. On Arm there two separate 4-byte encodings for Arm, Thumb of UDF.
Thanks for the review! This is a virtual function, so a target can override as appropriate.

Thanks for informing that this is an encoding that AArch64 may allocate in the future. Since the signature is placed before the function label and is used to protect uninstrumented object files, the signature doesn't need to be out of all encoding space. As long as it is unlikely to be the last 2 instructions of a previous function, this is going to have a good defensive effect. So I expect that A32/A64 may not want to change this as well.

I assume that T32 unlikely uses 0xca 0xca (ldm r2, {r1-r7}) as one of the last two instructions of the previous function, so this seems fine as well, but no objection if T32 wants to change to a different signature :)

  rG LLVM Github Monorepo



More information about the llvm-commits mailing list