[llvm] [BOLT][AArch64] Handle OpNegateRAState to enable optimizing binaries with pac-ret hardening (PR #120064)

Paschalis Mpeis via llvm-commits llvm-commits at lists.llvm.org
Wed May 14 03:18:26 PDT 2025


================
@@ -0,0 +1,137 @@
+#!/usr/bin/env python3
+
+# This tool helps matching dwarf dumps
+# (= the output from running llvm-objdump --dwarf=frames),
+# by address to function names (which are parsed from a normal objdump).
+# The script is used for checking if .cfi_negate_ra_state CFIs
+# are generated by BOLT the same way they are generated by LLVM.
+# The script is called twice in unittests: once with the objdumps of
+# the BOLT input binary, and once with the output binary from BOLT.
+# We output the offsets of .cfi_negate_ra_state instructions from the
+# function's start address to see that BOLT can generate them to the same
+# locations.
+# Because we check the location, this is only useful for testing without
+# optimization flags, so `llvm-bolt input.exe -o output.exe`
+
+
+import argparse
+import subprocess
+import sys
+import re
+
+
+class NameDwarfPair(object):
+    def __init__(self, name, body):
+        self.name = name
+        self.body = body
+        self.finalized = False
+
+    def append(self, body_line):
+        # only store elements into the body until the first whitespace line is encountered.
+        if body_line.isspace():
+            self.finalized = True
+        if not self.finalized:
+            self.body += body_line
+
+    def print(self):
+        print(self.name)
+        print(self.body)
+
+    def parse_negate_offsets(self):
+        """
+        Create a list of locations/offsets of the negate_ra_state CFIs in the
+        dwarf entry. To find offsets for each, we match the DW_CFA_advance_loc
+        entries, and sum up their values.
+        """
+        negate_offsets = []
+        loc = 0
+        # TODO: make sure this is not printed in hex
+        re_advloc = r"DW_CFA_advance_loc: (\d+)"
+
+        for line in self.body.splitlines():
+            # if line matches advance_loc int
+            match = re.search(re_advloc, line)
+            if match:
+                loc += int(match.group(1))
+            if "DW_CFA_AARCH64_negate_ra_state" in line:
+                negate_offsets.append(loc)
+
+        self.negate_offsets = negate_offsets
+
+    def __eq__(self, other):
+        return self.name == other.name and self.negate_offsets == other.negate_offsets
+
+
+def extract_function_addresses(objdump):
+    """
+    Parse and return address-to-name dictionary from objdump file.
+    Function names in the objdump look like this:
+        000123abc <foo>:
+    We create a dict from the addr (000123abc), to the name (foo).
+    """
+    addr_name_dict = dict()
+    re_function = re.compile(r"^([0-9a-fA-F]+)\s<(.*)>:$")
+    with open(objdump, "r") as f:
+        for line in f.readlines():
+            match = re_function.match(line)
+            if not match:
+                continue
+            m_addr = match.groups()[0]
+            m_name = match.groups()[1]
+            addr_name_dict[int(m_addr, 16)] = m_name
+
+    return addr_name_dict
+
+
+def match_dwarf_to_name(dwarfdump, addr_name_dict):
+    """
+    Parse dwarf dump, and match names to blocks using the dict from the objdump.
+    Return a list of NameDwarfPairs.
+    The matched lines look like this:
+    000123 000456 000789 FDE cie=000000  pc=0123abc...0456def
+    We do not have the function name for this, only the PC range it applies to.
+    We match the pc=0123abc (the start address), and find the matching name from
+    the addr_name_dict.
+    The resultint NameDwarfPair will hold the lines this header applied to, and
----------------
paschalis-mpeis wrote:

nit: typo ("resultint" should be "resulting")

```suggestion
    The resulting NameDwarfPair will hold the lines this header applied to, and
```
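
For reference, the offset accumulation described in the `parse_negate_offsets` docstring above can be sketched standalone as follows. This is a minimal sketch, not the patch's code: the `SAMPLE_FDE_BODY` text is a hypothetical excerpt (real `llvm-objdump --dwarf=frames` output may be formatted differently), and the free function `negate_offsets` is an illustrative name that mirrors the method's logic.

```python
import re

# Hypothetical FDE body, made up for illustration only.
SAMPLE_FDE_BODY = """\
  DW_CFA_advance_loc: 4
  DW_CFA_AARCH64_negate_ra_state:
  DW_CFA_advance_loc: 8
  DW_CFA_def_cfa_offset: +16
  DW_CFA_advance_loc: 12
  DW_CFA_AARCH64_negate_ra_state:
"""

# Same pattern as in the patch: it only captures decimal digits.
RE_ADVLOC = re.compile(r"DW_CFA_advance_loc: (\d+)")


def negate_offsets(body):
    """Return the running offset at each negate_ra_state CFI in `body`."""
    offsets = []
    loc = 0
    for line in body.splitlines():
        match = RE_ADVLOC.search(line)
        if match:
            # Each advance_loc moves the current location forward.
            loc += int(match.group(1))
        if "DW_CFA_AARCH64_negate_ra_state" in line:
            # Record where, relative to the function start, the CFI sits.
            offsets.append(loc)
    return offsets


print(negate_offsets(SAMPLE_FDE_BODY))  # prints [4, 24]
```

Because the pattern captures only `\d+`, a hex-formatted value such as `0x10` would be read as `0` without any error, which is the case the TODO in the patch is concerned about.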

https://github.com/llvm/llvm-project/pull/120064

