[llvm] BPF: Use DebugLoc to find Filename for BTF line info (PR #90302)

via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 26 21:28:03 PDT 2024


https://github.com/yonghong-song updated https://github.com/llvm/llvm-project/pull/90302

>From eb23b4f35b8eb756cc9d5ccf34b07b938993c667 Mon Sep 17 00:00:00 2001
From: Yonghong Song <yonghong.song at linux.dev>
Date: Fri, 26 Apr 2024 15:21:24 -0700
Subject: [PATCH] BPF: Use DebugLoc to find Filename for BTF line info

Andrii found an issue where the BTF line info may have empty
source which seems wrong. The program is a Meta internal
bpf program. I can reproduce with latest upstream compiler
as well. Let the bpf program built without this patch and then with
the following veristat check where veristat is a bpf verifier tool
to do kernel verification for bpf programs:

  $ veristat -vl2 yhs.bpf.o --log-size=150000000 >& log
  $ rg '^;' log | sort | uniq -c | sort -nr | head -n10
   4206 ; } else if (action->dry_run) { @ src_mitigations.h:57
   3907 ; if (now < start_allow_time) { @ ban.h:17
   3674 ;  @ src_mitigations.h:0
   3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85
   1737 ; pkt_info->is_dry_run_drop = action->dry_run; @ src_mitigations.h:26
   1737 ; if (mitigation == ALLOW) { @ src_mitigations.h:28
   1737 ; enum match_action mitigation = action->action; @ src_mitigations.h:25
   1727 ; void* res = bpf_map_lookup_elem(bpf_map, key); @ filter_helpers.h:498
   1691 ; bpf_map_lookup_elem(&rate_limit_config_map, rule_id); @ rate_limit.h:76
   1688 ; if (throttle_cfg) { @ rate_limit.h:85

You can see

   3674 ;  @ src_mitigations.h:0

where we do not have proper line information and line number.

In LLVM Machine IR, some instructions may carry DebugLoc information
to specify where the corresponding source is for this instruction.
The information includes file_name, line_num and col_num.
Each instruction may also attribute to a function in debuginfo.
So there are two ways to find file_name for a particular insn:
  (1) find the corresponding function in debuginfo
      (MI->getMF()->getFunction().getSubprogram()) and then
      find the file_name from DISubprogram.
  (2) find the corresponding file_name from DebugLoc.

The option (1) is used in current implementation. This mostly works.
But if one instruction is somehow generated from multiple functions,
the compiler has to pick just one. This may cause a mismatch between
file_name and line_num/col_num.

Besides potential incorrect mismatch of file_name vs. line_num/col_num,
There is another issue where some DebugLoc has line number 0. For example,
I dumped the dwarf line table for the above bpf program:

  Address            Line   Column File   ISA Discriminator OpIndex Flags
  ------------------ ------ ------ ------ --- ------------- ------- -------------
  0x0000000000000000     96      0     17   0             0       0  is_stmt
  0x0000000000000010    100     12     17   0             0       0  is_stmt prologue_end
  0x0000000000000020      0     12     17   0             0       0
  0x0000000000000058     37      7     17   0             0       0  is_stmt
  0x0000000000000060      0      0     17   0             0       0
  0x0000000000000088     37      7     17   0             0       0
  0x0000000000000090     42     75     17   0             0       0  is_stmt
  0x00000000000000a8     42     52     17   0             0       0
  0x00000000000000c0    120      9     17   0             0       0  is_stmt
  0x00000000000000c8      0      9     17   0             0       0
  0x00000000000000d0    106     21     17   0             0       0  is_stmt
  0x00000000000000d8    106      3     17   0             0       0
  0x00000000000000e0    110     25     17   0             0       0  is_stmt
  0x00000000000000f8    110     36     17   0             0       0
  0x0000000000000100      0     36     17   0             0       0
  ...

These DebugLoc with line number 0 needs to be skipped since we cannot
map them to the correct source code. Note that selftest offset-reloc-basic.ll
has this issue as well which is adjusted by this patch.

With the above two fixes, empty lines for source annotation are removed.

  $ veristat -vl2 yhs.bpf.o --log-size=150000000 >& log
  $ rg '^;' log.latest | sort | uniq -c | sort -nr | head -n10
   4206 ; } else if (action->dry_run) { @ src_mitigations.h:57
   3907 ; if (now < start_allow_time) { @ ban.h:17
   3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85
   1737 ; pkt_info->is_dry_run_drop = action->dry_run; @ src_mitigations.h:26
   1737 ; if (mitigation == ALLOW) { @ src_mitigations.h:28
   1737 ; enum match_action mitigation = action->action; @ src_mitigations.h:25
   1727 ; void* res = bpf_map_lookup_elem(bpf_map, key); @ filter_helpers.h:498
   1691 ; bpf_map_lookup_elem(&rate_limit_config_map, rule_id); @ rate_limit.h:76
   1688 ; if (throttle_cfg) { @ rate_limit.h:85
   1670 ; if (rl_cfg) { @ rate_limit.h:77

You can see that we do not have empty line any more.

   3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85

Signed-off-by: Yonghong Song <yonghong.song at linux.dev>
---
 llvm/lib/Target/BPF/BTFDebug.cpp               | 18 ++++++++----------
 llvm/lib/Target/BPF/BTFDebug.h                 |  4 ++--
 .../CodeGen/BPF/CORE/offset-reloc-basic.ll     |  4 ++--
 3 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/llvm/lib/Target/BPF/BTFDebug.cpp b/llvm/lib/Target/BPF/BTFDebug.cpp
index ebd8447eba850e..8c9f5c4dc55487 100644
--- a/llvm/lib/Target/BPF/BTFDebug.cpp
+++ b/llvm/lib/Target/BPF/BTFDebug.cpp
@@ -973,8 +973,7 @@ void BTFDebug::visitMapDefType(const DIType *Ty, uint32_t &TypeId) {
 }
 
 /// Read file contents from the actual file or from the source
-std::string BTFDebug::populateFileContent(const DISubprogram *SP) {
-  auto File = SP->getFile();
+std::string BTFDebug::populateFileContent(const DIFile *File) {
   std::string FileName;
 
   if (!File->getFilename().starts_with("/") && File->getDirectory().size())
@@ -1005,9 +1004,9 @@ std::string BTFDebug::populateFileContent(const DISubprogram *SP) {
   return FileName;
 }
 
-void BTFDebug::constructLineInfo(const DISubprogram *SP, MCSymbol *Label,
+void BTFDebug::constructLineInfo(MCSymbol *Label, const DIFile *File,
                                  uint32_t Line, uint32_t Column) {
-  std::string FileName = populateFileContent(SP);
+  std::string FileName = populateFileContent(File);
   BTFLineInfo LineInfo;
 
   LineInfo.Label = Label;
@@ -1366,10 +1365,10 @@ void BTFDebug::beginInstruction(const MachineInstr *MI) {
   if (!CurMI) // no debug info
     return;
 
-  // Skip this instruction if no DebugLoc or the DebugLoc
-  // is the same as the previous instruction.
+  // Skip this instruction if no DebugLoc, the DebugLoc
+  // is the same as the previous instruction or Line is 0.
   const DebugLoc &DL = MI->getDebugLoc();
-  if (!DL || PrevInstLoc == DL) {
+  if (!DL || PrevInstLoc == DL || DL.getLine() == 0) {
     // This instruction will be skipped, no LineInfo has
     // been generated, construct one based on function signature.
     if (LineInfoGenerated == false) {
@@ -1377,7 +1376,7 @@ void BTFDebug::beginInstruction(const MachineInstr *MI) {
       if (!S)
         return;
       MCSymbol *FuncLabel = Asm->getFunctionBegin();
-      constructLineInfo(S, FuncLabel, S->getLine(), 0);
+      constructLineInfo(FuncLabel, S->getFile(), S->getLine(), 0);
       LineInfoGenerated = true;
     }
 
@@ -1389,8 +1388,7 @@ void BTFDebug::beginInstruction(const MachineInstr *MI) {
   OS.emitLabel(LineSym);
 
   // Construct the lineinfo.
-  auto SP = DL->getScope()->getSubprogram();
-  constructLineInfo(SP, LineSym, DL.getLine(), DL.getCol());
+  constructLineInfo(LineSym, DL->getFile(), DL.getLine(), DL.getCol());
 
   LineInfoGenerated = true;
   PrevInstLoc = DL;
diff --git a/llvm/lib/Target/BPF/BTFDebug.h b/llvm/lib/Target/BPF/BTFDebug.h
index 7536006ed21ccd..11a0c59ba6c90b 100644
--- a/llvm/lib/Target/BPF/BTFDebug.h
+++ b/llvm/lib/Target/BPF/BTFDebug.h
@@ -343,10 +343,10 @@ class BTFDebug : public DebugHandlerBase {
 
   /// Get the file content for the subprogram. Certain lines of the file
   /// later may be put into string table and referenced by line info.
-  std::string populateFileContent(const DISubprogram *SP);
+  std::string populateFileContent(const DIFile *File);
 
   /// Construct a line info.
-  void constructLineInfo(const DISubprogram *SP, MCSymbol *Label, uint32_t Line,
+  void constructLineInfo(MCSymbol *Label, const DIFile *File, uint32_t Line,
                          uint32_t Column);
 
   /// Generate types and variables for globals.
diff --git a/llvm/test/CodeGen/BPF/CORE/offset-reloc-basic.ll b/llvm/test/CodeGen/BPF/CORE/offset-reloc-basic.ll
index 8ca8a66027379a..024ed04f6e5e17 100644
--- a/llvm/test/CodeGen/BPF/CORE/offset-reloc-basic.ll
+++ b/llvm/test/CodeGen/BPF/CORE/offset-reloc-basic.ll
@@ -108,8 +108,8 @@ define dso_local i32 @bpf_prog(ptr) local_unnamed_addr #0 !dbg !15 {
 ; CHECK-NEXT:        .long   0
 ; CHECK-NEXT:        .long   20
 ; CHECK-NEXT:        .long   20
-; CHECK-NEXT:        .long   124
-; CHECK-NEXT:        .long   144
+; CHECK-NEXT:        .long   108
+; CHECK-NEXT:        .long   128
 ; CHECK-NEXT:        .long   28
 ; CHECK-NEXT:        .long   8                       # FuncInfo
 



More information about the llvm-commits mailing list