[llvm] BPF: Use DebugLoc to find Filename for BTF line info (PR #90302)

via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 26 19:13:55 PDT 2024


https://github.com/yonghong-song updated https://github.com/llvm/llvm-project/pull/90302

From e5234553cbd012c47a937936f9b52d0eac9df13e Mon Sep 17 00:00:00 2001
From: Yonghong Song <yonghong.song at linux.dev>
Date: Fri, 26 Apr 2024 15:21:24 -0700
Subject: [PATCH] BPF: Use DebugLoc to find Filename for BTF line info

Andrii found an issue where the BTF line info may have an empty
source line, which seems wrong. The program is a Meta-internal
bpf program, and I can reproduce the issue with the latest
upstream compiler as well. Build the bpf program without this patch
and then run the following veristat check, where veristat is a tool
that runs bpf programs through the kernel verifier:
  $ veristat -vl2 yhs.bpf.o --log-size=150000000 >& log
  $ rg '^;' log | sort | uniq -c | sort -nr | head -n10
   4206 ; } else if (action->dry_run) { @ src_mitigations.h:57
   3907 ; if (now < start_allow_time) { @ ban.h:17
   3674 ;  @ src_mitigations.h:0
   3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85
   1737 ; pkt_info->is_dry_run_drop = action->dry_run; @ src_mitigations.h:26
   1737 ; if (mitigation == ALLOW) { @ src_mitigations.h:28
   1737 ; enum match_action mitigation = action->action; @ src_mitigations.h:25
   1727 ; void* res = bpf_map_lookup_elem(bpf_map, key); @ filter_helpers.h:498
   1691 ; bpf_map_lookup_elem(&rate_limit_config_map, rule_id); @ rate_limit.h:76
   1688 ; if (throttle_cfg) { @ rate_limit.h:85

You can see
   3674 ;  @ src_mitigations.h:0
where the source line is empty and the line number is 0.

In LLVM Machine IR, some instructions carry DebugLoc information
that specifies the source location for the instruction.
The information includes file_name, line_num and col_num.
Each instruction can also be attributed to a function in debuginfo.
So there are two ways to find the file_name for a particular insn:
  (1) find the corresponding function in debuginfo
      (MI->getMF()->getFunction().getSubprogram()) and then
      find the file_name from DISubprogram.
  (2) find the corresponding file_name from DebugLoc.

Option (1) is used in the current implementation. This mostly works.
But if one instruction is somehow generated from multiple functions,
the compiler has to pick just one. This may cause a mismatch between
file_name and line_num/col_num, which is exactly what happened
in the previous example. I found that bpf selftests also have some
cases where the file names from DISubprogram and DebugLoc differ.
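
The two lookups described above can be sketched with simplified
stand-in types. This is only an illustration, not the BTFDebug code:
File, Subprogram, Loc, Instr and LineInfo are hypothetical structs
rather than the LLVM classes, and the file names and numbers are
made up. It shows how option (1) can pair a file name from the
function with a line/column that belongs to another file:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Simplified stand-ins for debug-info nodes. These are NOT the LLVM classes.
struct File { std::string Name; };
struct Subprogram { const File *F; uint32_t Line; };
struct Loc { const File *F; uint32_t Line; uint32_t Col; };
struct Instr { const Subprogram *Func; Loc DL; };

struct LineInfo { std::string FileName; uint32_t Line; uint32_t Col; };

// Option (1): file name from the enclosing function,
// line and column from the instruction's location.
LineInfo viaSubprogram(const Instr &I) {
  return {I.Func->F->Name, I.DL.Line, I.DL.Col};
}

// Option (2): file name, line and column all from the same location.
LineInfo viaDebugLoc(const Instr &I) {
  return {I.DL.F->Name, I.DL.Line, I.DL.Col};
}

// Hypothetical instruction: attributed to a function defined in prog.c,
// while its location points into helpers.h.
Instr makeMismatchedInstr() {
  static const File ProgC{"prog.c"};
  static const File HelpersH{"helpers.h"};
  static const Subprogram Func{&ProgC, 10};
  return Instr{&Func, Loc{&HelpersH, 42, 7}};
}

// The two options agree on line/column and disagree only on the file name.
void checkMismatch() {
  Instr I = makeMismatchedInstr();
  assert(viaSubprogram(I).FileName == "prog.c");
  assert(viaDebugLoc(I).FileName == "helpers.h");
  assert(viaSubprogram(I).Line == viaDebugLoc(I).Line);
}
```

In this sketch, option (1) would report line 42 of prog.c even though
line 42 refers to helpers.h, so a consumer that looks up the source
text by file name and line number may find the wrong line or nothing.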

Finding file_name from DebugLoc looks more robust, since
file_name, line_num and col_num then all come from
the same entity. This patch uses that approach. With this
patch, we have:
  $ veristat -vl2 yhs.bpf.o --log-size=150000000 >& log.latest
  $ rg '^;' log.latest | sort | uniq -c | sort -nr | head -n10
   4206 ; } else if (action->dry_run) { @ src_mitigations.h:57
   3907 ; if (now < start_allow_time) { @ ban.h:17
   3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85
   1737 ; pkt_info->is_dry_run_drop = action->dry_run; @ src_mitigations.h:26
   1737 ; if (mitigation == ALLOW) { @ src_mitigations.h:28
   1737 ; enum match_action mitigation = action->action; @ src_mitigations.h:25
   1727 ; void* res = bpf_map_lookup_elem(bpf_map, key); @ filter_helpers.h:498
   1691 ; bpf_map_lookup_elem(&rate_limit_config_map, rule_id); @ rate_limit.h:76
   1688 ; if (throttle_cfg) { @ rate_limit.h:85
   1670 ; if (rl_cfg) { @ rate_limit.h:77

You can see that the empty line is gone. The third entry is now:
   3223 ; if (action->vip_id != ALL_VIPS_ID && action->vip_id != vip_id) { @ src_mitigations.h:85

Signed-off-by: Yonghong Song <yonghong.song at linux.dev>
---
 llvm/lib/Target/BPF/BTFDebug.cpp | 12 +++++-------
 llvm/lib/Target/BPF/BTFDebug.h   |  4 ++--
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/llvm/lib/Target/BPF/BTFDebug.cpp b/llvm/lib/Target/BPF/BTFDebug.cpp
index ebd8447eba850e..0cf73272b5fd08 100644
--- a/llvm/lib/Target/BPF/BTFDebug.cpp
+++ b/llvm/lib/Target/BPF/BTFDebug.cpp
@@ -973,8 +973,7 @@ void BTFDebug::visitMapDefType(const DIType *Ty, uint32_t &TypeId) {
 }
 
 /// Read file contents from the actual file or from the source
-std::string BTFDebug::populateFileContent(const DISubprogram *SP) {
-  auto File = SP->getFile();
+std::string BTFDebug::populateFileContent(const DIFile *File) {
   std::string FileName;
 
   if (!File->getFilename().starts_with("/") && File->getDirectory().size())
@@ -1005,9 +1004,9 @@ std::string BTFDebug::populateFileContent(const DISubprogram *SP) {
   return FileName;
 }
 
-void BTFDebug::constructLineInfo(const DISubprogram *SP, MCSymbol *Label,
+void BTFDebug::constructLineInfo(MCSymbol *Label, const DIFile *File,
                                  uint32_t Line, uint32_t Column) {
-  std::string FileName = populateFileContent(SP);
+  std::string FileName = populateFileContent(File);
   BTFLineInfo LineInfo;
 
   LineInfo.Label = Label;
@@ -1377,7 +1376,7 @@ void BTFDebug::beginInstruction(const MachineInstr *MI) {
       if (!S)
         return;
       MCSymbol *FuncLabel = Asm->getFunctionBegin();
-      constructLineInfo(S, FuncLabel, S->getLine(), 0);
+      constructLineInfo(FuncLabel, S->getFile(), S->getLine(), 0);
       LineInfoGenerated = true;
     }
 
@@ -1389,8 +1388,7 @@ void BTFDebug::beginInstruction(const MachineInstr *MI) {
   OS.emitLabel(LineSym);
 
   // Construct the lineinfo.
-  auto SP = DL->getScope()->getSubprogram();
-  constructLineInfo(SP, LineSym, DL.getLine(), DL.getCol());
+  constructLineInfo(LineSym, DL.get()->getFile(), DL.getLine(), DL.getCol());
 
   LineInfoGenerated = true;
   PrevInstLoc = DL;
diff --git a/llvm/lib/Target/BPF/BTFDebug.h b/llvm/lib/Target/BPF/BTFDebug.h
index 7536006ed21ccd..11a0c59ba6c90b 100644
--- a/llvm/lib/Target/BPF/BTFDebug.h
+++ b/llvm/lib/Target/BPF/BTFDebug.h
@@ -343,10 +343,10 @@ class BTFDebug : public DebugHandlerBase {
 
   /// Get the file content for the subprogram. Certain lines of the file
   /// later may be put into string table and referenced by line info.
-  std::string populateFileContent(const DISubprogram *SP);
+  std::string populateFileContent(const DIFile *File);
 
   /// Construct a line info.
-  void constructLineInfo(const DISubprogram *SP, MCSymbol *Label, uint32_t Line,
+  void constructLineInfo(MCSymbol *Label, const DIFile *File, uint32_t Line,
                          uint32_t Column);
 
   /// Generate types and variables for globals.


