[lldb] [llvm] [lldb] Add HTTP support in SymbolLocatorSymStore (PR #186986)

Stefan Gränitz via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 20 02:40:35 PDT 2026


================
@@ -95,20 +98,156 @@ SymbolLocator *SymbolLocatorSymStore::CreateInstance() {
   return new SymbolLocatorSymStore();
 }
 
+namespace {
+
 // RSDS entries store identity as a 20-byte UUID composed of 16-byte GUID and
 // 4-byte age:
 //   12345678-1234-5678-9ABC-DEF012345678-00000001
 //
 // SymStore key is a string with no separators and age as decimal:
 //   12345678123456789ABCDEF0123456781
 //
-static std::string formatSymStoreKey(const UUID &uuid) {
+std::string formatSymStoreKey(const UUID &uuid) {
   llvm::ArrayRef<uint8_t> bytes = uuid.GetBytes();
   uint32_t age = llvm::support::endian::read32be(bytes.data() + 16);
   constexpr bool LowerCase = false;
   return llvm::toHex(bytes.slice(0, 16), LowerCase) + std::to_string(age);
 }
 
+// This is a simple version of Debuginfod's StreamedHTTPResponseHandler. We
+// should consider reusing that once we introduce caching.
+class FileDownloadHandler : public llvm::HTTPResponseHandler {
+private:
+  std::error_code m_ec;
+  llvm::raw_fd_ostream m_stream;
+
+public:
+  FileDownloadHandler(llvm::StringRef file) : m_stream(file.str(), m_ec) {}
+  virtual ~FileDownloadHandler() = default;
+
+  llvm::Error handleBodyChunk(llvm::StringRef data) override {
+    // Propagate error from ctor.
+    if (m_ec)
+      return llvm::createStringError(m_ec, "Failed to open file for writing");
+    m_stream.write(data.data(), data.size());
+    if (std::error_code ec = m_stream.error())
+      return llvm::createStringError(ec, "Error writing to file");
+
+    return llvm::Error::success();
+  }
+};
+
+llvm::Error downloadFileHTTP(llvm::StringRef url, FileDownloadHandler dest) {
+  if (!llvm::HTTPClient::isAvailable())
+    return llvm::createStringError(
+        std::make_error_code(std::errc::not_supported),
+        "HTTP client is not available");
+  llvm::HTTPRequest Request(url);
+  Request.FollowRedirects = true;
+
+  llvm::HTTPClient Client;
+
+  // TODO: Since PDBs can be huge, we should distinguish between resolve,
+  // connect, send and receive.
+  Client.setTimeout(std::chrono::seconds(60));
+
+  if (llvm::Error Err = Client.perform(Request, dest))
+    return Err;
+
+  unsigned ResponseCode = Client.responseCode();
+  if (ResponseCode != 200) {
+    return llvm::createStringError(std::make_error_code(std::errc::io_error),
+                                   "HTTP request failed with status code " +
+                                       std::to_string(ResponseCode));
+  }
+
+  return llvm::Error::success();
+}
+
+bool has_unsafe_characters(llvm::StringRef s) {
+  for (unsigned char c : s) {
+    // RFC 3986 unreserved characters are safe for file names and URLs.
+    if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
+        (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' ||
+        c == '~') {
+      continue;
+    }
+
+    return true;
+  }
+
+  // Avoid path semantics issues.
+  return s == "." || s == "..";
+}
+
+std::optional<FileSpec>
+requestFileFromSymStoreServerHTTP(llvm::StringRef base_url, llvm::StringRef key,
+                                  llvm::StringRef pdb_name) {
+  using namespace llvm::sys;
+  Log *log = GetLog(LLDBLog::Symbols);
+
+  // Make sure URL will be valid, portable, and compatible with symbol servers.
+  if (has_unsafe_characters(pdb_name)) {
+    Debugger::ReportWarning(llvm::formatv(
+        "Rejecting HTTP lookup for PDB file due to unsafe characters in "
+        "name: {0}",
+        pdb_name));
+    return {};
+  }
+
+  // Construct the path for local storage. Configurable cache coming soon.
+  llvm::SmallString<128> cache_file;
+  if (!path::cache_directory(cache_file)) {
+    Debugger::ReportWarning("Failed to determine cache directory for SymStore");
+    return {};
----------------
weliveindetail wrote:

Yes, this code is dumb and far from the final behavior in many ways. Eventually we need a file-backed, buffered download into a temporary file to avoid corrupting the cache. Then we read it (at least partially) to check that the UUID matches, because PDBs have no checksum-based validation. If it's good, we copy it into the cache and use it. One open question is whether we want to do the same for existing PDBs in the cache, since those files might be corrupt too.

IMHO these are all caching and fine-tuning questions. If possible, I'd like to take care of it in a separate patch. The current state works and is testable in isolation. Let me add a TODO and make a note in the summary.
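
For reference, the download-validate-commit flow described above could look roughly like the following. This is only a sketch in plain C++17 and not code from the patch: `commitToCache`, the `.part` suffix, and the `validate` callback (standing in for the UUID check) are hypothetical names, the payload is passed as one string instead of being streamed in chunks, and it renames the temporary file into place rather than copying it.

```cpp
#include <filesystem>
#include <fstream>
#include <functional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: write the payload to a temporary file next to the
// final cache location, validate it, and only then move it into place.
// A failed or rejected download never becomes visible under cache_file.
bool commitToCache(const fs::path &cache_file, const std::string &payload,
                   const std::function<bool(const fs::path &)> &validate) {
  fs::path tmp = cache_file;
  tmp += ".part"; // assumed suffix for in-progress downloads

  std::error_code ec;
  {
    std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
    if (!out)
      return false;
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    out.flush();
    if (!out) {
      out.close();
      fs::remove(tmp, ec);
      return false;
    }
  } // stream is closed here, before validation reads the file

  // Stand-in for "read it (at least partially) and check the UUID".
  if (!validate(tmp)) {
    fs::remove(tmp, ec);
    return false;
  }

  // Rename within the same directory so readers never see a partial file.
  fs::rename(tmp, cache_file, ec);
  if (ec) {
    fs::remove(tmp, ec);
    return false;
  }
  return true;
}
```

A caller would pass the final cache path, the downloaded bytes, and a callback that opens the temporary file and compares its identity against the expected UUID. The same callback could in principle be run against PDBs that are already in the cache.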

https://github.com/llvm/llvm-project/pull/186986