[libc-commits] [libc] [libc][hdrgen] Fix type extraction from compound function pointer args (PR #194354)

Jeff Bailey via libc-commits libc-commits at lists.llvm.org
Mon Apr 27 05:12:23 PDT 2026


https://github.com/kaladron created https://github.com/llvm/llvm-project/pull/194354

hdrgen extracts type names from function signatures to generate #include lines. It does this by splitting each argument type string into identifier words, filtering out C keywords, and joining the remainder with underscores to form a header filename.

For a simple type like "struct dirent **", this correctly produces "struct_dirent", which becomes:

  #include "llvm-libc-types/struct_dirent.h"
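
The stages described above can be traced with a minimal sketch. NONIDENTIFIER and KEYWORDS here are hypothetical stand-ins for hdrgen's module-level definitions, which may differ, and the f-string that builds the include line is only an illustration of the mapping, not hdrgen's actual emitter:

```python
import re

# Hypothetical stand-ins for hdrgen's module-level names (assumptions).
NONIDENTIFIER = re.compile(r"[^a-zA-Z0-9_]+")
KEYWORDS = {"const", "int", "char", "void", "unsigned", "long"}

type_string = "struct dirent **"

# 1. Split at runs of non-identifier characters (spaces, `*`, `[`, ...).
pieces = NONIDENTIFIER.split(type_string)

# 2. Drop empty strings, plain numbers, and C keywords.
words = [w for w in pieces if w and not w.isdecimal() and w not in KEYWORDS]

# 3. Join the remaining words with "_" to form the header basename.
name = "_".join(words)

# Illustrative only: how the basename maps onto the include line.
include_line = f'#include "llvm-libc-types/{name}.h"'

print(words)
print(include_line)
```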

For a function pointer argument that mentions the same struct twice, like scandir's comparator:

  int (*)(const struct dirent **, const struct dirent **)

the words [struct, dirent, struct, dirent] were joined into "struct_dirent_struct_dirent", generating a bogus include:

  #include "llvm-libc-types/struct_dirent_struct_dirent.h"

Fixed by deduplicating the word list before joining so that repeated type references within a single argument collapse to one type name.
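
The before/after behaviour can be reproduced outside hdrgen with a small sketch. It assumes stand-in definitions for NONIDENTIFIER and KEYWORDS (the real ones live in hdrgen and may differ), and collapse_old / collapse_new are hypothetical names used only to put the two versions side by side:

```python
import re

# Hypothetical stand-ins for hdrgen's module-level names (assumptions).
NONIDENTIFIER = re.compile(r"[^a-zA-Z0-9_]+")
KEYWORDS = {"const", "int", "char", "void", "unsigned", "long"}


def identifier_words(type_string):
    """Split at non-identifier characters; drop numbers and keywords."""
    return [
        word
        for word in NONIDENTIFIER.split(type_string)
        if word and not word.isdecimal() and word not in KEYWORDS
    ]


def collapse_old(type_string):
    # Pre-patch behaviour: join every surviving word, repeats included.
    return "_".join(identifier_words(type_string))


def collapse_new(type_string):
    # Post-patch behaviour: dict.fromkeys keeps the first occurrence of
    # each word and preserves insertion order.
    return "_".join(dict.fromkeys(identifier_words(type_string)))


comparator = "int (*)(const struct dirent **, const struct dirent **)"
print(collapse_old(comparator))
print(collapse_new(comparator))
print(collapse_new("struct dirent **"))
```

Because the deduplication works on individual words rather than whole type names, an argument that mentions two different structs would still collapse to a single combined name (for example "struct_foo_bar" under these stand-ins), so the sketch only covers the repeated-type case this change targets.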

From 148a45d03a13f5d70e1614507a77725e53cabc9f Mon Sep 17 00:00:00 2001
From: Jeff Bailey <jbailey at raspberryginger.com>
Date: Mon, 27 Apr 2026 11:37:56 +0100
Subject: [PATCH] [libc][hdrgen] Fix type extraction from compound function
 pointer args

hdrgen extracts type names from function signatures to generate
#include lines. It does this by splitting each argument type string
into identifier words, filtering out C keywords, and joining the
remainder with underscores to form a header filename.

For a simple type like "struct dirent **", this correctly produces
"struct_dirent", which becomes:

  #include "llvm-libc-types/struct_dirent.h"

For a function pointer argument that mentions the same struct twice,
like scandir's comparator:

  int (*)(const struct dirent **, const struct dirent **)

the words [struct, dirent, struct, dirent] were joined into
"struct_dirent_struct_dirent", generating a bogus include:

  #include "llvm-libc-types/struct_dirent_struct_dirent.h"

Fixed by deduplicating the word list before joining so that repeated
type references within a single argument collapse to one type name.
---
 libc/utils/hdrgen/hdrgen/function.py | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/libc/utils/hdrgen/hdrgen/function.py b/libc/utils/hdrgen/hdrgen/function.py
index 4de3406cc408e..5211429a86c0d 100644
--- a/libc/utils/hdrgen/hdrgen/function.py
+++ b/libc/utils/hdrgen/hdrgen/function.py
@@ -57,11 +57,16 @@ def collapse(type_string):
             assert type_string
             # Split into words at nonidentifier characters (`*`, `[`, etc.),
             # filter out keywords and numbers, and then rejoin with "_".
-            return "_".join(
+            # Use dict.fromkeys to deduplicate while preserving order, so
+            # that function pointer types like
+            # "int (*)(const struct dirent **, const struct dirent **)"
+            # produce "struct_dirent" rather than "struct_dirent_struct_dirent".
+            words = [
                 word
                 for word in NONIDENTIFIER.split(type_string)
                 if word and not word.isdecimal() and word not in KEYWORDS
-            )
+            ]
+            return "_".join(dict.fromkeys(words))
 
         all_types = [self.return_type] + self.arguments
         return {


