[llvm-bugs] Issue 15022 in oss-fuzz: llvm/llvm-microsoft-demangle-fuzzer: Out-of-memory in llvm_llvm-microsoft-demangle-fuzzer

tha… via monorail via llvm-bugs llvm-bugs at lists.llvm.org
Wed Jun 5 05:58:14 PDT 2019


Comment #2 on issue 15022 by thakis at chromium.org:  
llvm/llvm-microsoft-demangle-fuzzer: Out-of-memory in  
llvm_llvm-microsoft-demangle-fuzzer
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15022#c2

The root cause here ultimately is that demangling output can be exponential  
in the size of the input, even for valid manglings. Consider:

$ cat test.cc
template <class T1, class T2, class T3, class T4, class T5,
           class T6, class T7, class T8, class T9, class T10>
struct Fooob {};

using A0 = Fooob<int, int, int, int, int, int, int, int, int, int>;
using A1 = Fooob<A0, A0, A0, A0, A0, A0, A0, A0, A0, A0>;
using A2 = Fooob<A1, A1, A1, A1, A1, A1, A1, A1, A1, A1>;
using A3 = Fooob<A2, A2, A2, A2, A2, A2, A2, A2, A2, A2>;
using A4 = Fooob<A3, A3, A3, A3, A3, A3, A3, A3, A3, A3>;
using A5 = Fooob<A4, A4, A4, A4, A4, A4, A4, A4, A4, A4>;
using A6 = Fooob<A5, A5, A5, A5, A5, A5, A5, A5, A5, A5>;
using A7 = Fooob<A6, A6, A6, A6, A6, A6, A6, A6, A6, A6>;

void f(A7 a) {}

$ out/gn/bin/clang-cl /c test.cc
$ out/gn/bin/llvm-nm test.obj
00000000  
T ?f@@YAXU?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at HHHHHHHHHH@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@@Z
00000000 a @feat.00
$ /usr/bin/time -l  
out/gn/bin/llvm-undname '?f@@YAXU?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at HHHHHHHHHH@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@@Z'  
> tmp.txt
         3.93 real         0.76 user         1.15 sys
1459576832  maximum resident set size
          0  average shared memory size
          0  average unshared data size
          0  average unshared stack size
     516405  page reclaims
          0  page faults
          0  swaps
          0  block input operations
          0  block output operations
          0  messages sent
          0  messages received
          0  signals received
        943  voluntary context switches
         62  involuntary context switches
$ ls -hl tmp.txt
-rw-r--r--  1 thakis  staff   625M Jun  5 08:36 tmp.txt


This grows 10x with each additional A. I can make the backref entries lazy  
and that should save one level of memory use and for this example it'd  
probably cut memory use almost 10x -- but the cc file only needs to grow a  
single additional line to waste that again, and the mangled string will  
only grow by a few bytes for this.

This also isn't llvm-undname specific, llvm-cxxfilt has the same issue:


$ out/gn/bin/clang -c test.cc
$ out/gn/bin/llvm-nm test.o
0000000000000000 T  
__Z1f5FooobIS_IS_IS_IS_IS_IS_IS_IiiiiiiiiiiES0_S0_S0_S0_S0_S0_S0_S0_S0_ES1_S1_S1_S1_S1_S1_S1_S1_S1_ES2_S2_S2_S2_S2_S2_S2_S2_S2_ES3_S3_S3_S3_S3_S3_S3_S3_S3_ES4_S4_S4_S4_S4_S4_S4_S4_S4_ES5_S5_S5_S5_S5_S5_S5_S5_S5_ES6_S6_S6_S6_S6_S6_S6_S6_S6_E
$ /usr/bin/time -l out/gn/bin/llvm-cxxfilt  
_Z1f5FooobIS_IS_IS_IS_IS_IS_IS_IiiiiiiiiiiES0_S0_S0_S0_S0_S0_S0_S0_S0_ES1_S1_S1_S1_S1_S1_S1_S1_S1_ES2_S2_S2_S2_S2_S2_S2_S2_S2_ES3_S3_S3_S3_S3_S3_S3_S3_S3_ES4_S4_S4_S4_S4_S4_S4_S4_S4_ES5_S5_S5_S5_S5_S5_S5_S5_S5_ES6_S6_S6_S6_S6_S6_S6_S6_S6_E  
> tmp.txt
         5.02 real         2.85 user         0.78 sys
1159876608  maximum resident set size
          0  average shared memory size
          0  average unshared data size
          0  average unshared stack size
     316004  page reclaims
         16  page faults
          0  swaps
          0  block input operations
          0  block output operations
          0  messages sent
          0  messages received
          0  signals received
        671  voluntary context switches
        189  involuntary context switches
$ ls -hl tmp.txt
-rw-r--r--  1 thakis  staff   552M Jun  5 08:39 tmp.txt



As said, the root cause here is that the output can be exponential in the  
input size. Nobody really wants to have exponentially-sized output from  
their demangler -- the following is just hard to read:

demumble '__Z1f5FooobIS_IS_IiiiiiiiiiiES0_S0_S0_S0_S0_S0_S0_S0_S0_ES1_S1_S1_S1_S1_S1_S1_S1_S1_E'
f(Fooob<Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,  
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,  
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,  
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,  
int, int> > >)


So maybe we should change the demangler output for backreferences to  
instead produces something like this instead:

$  
demumble '__Z1f5FooobIS_IS_IiiiiiiiiiiES0_S0_S0_S0_S0_S0_S0_S0_S0_ES1_S1_S1_S1_S1_S1_S1_S1_S1_E'
#define S0 Foob
#define S1 S0<int, int, int, int, int,  int, int, int, int, int>
#define S2 S0<S1, S1, S1, S1, S1,  S1, S1, S1, S1, S1>
f(S2)

which is a more direct translation of what's in the mangling and arguably  
easier to read. It has the advantage of needing storage that's linear in  
the input.

-- 
You received this message because:
   1. You were specifically CC'd on the issue

You may adjust your notification preferences at:
https://bugs.chromium.org/hosting/settings

Reply to this email to add a comment.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20190605/ca9f959d/attachment-0001.html>


More information about the llvm-bugs mailing list