[llvm-bugs] Issue 15022 in oss-fuzz: llvm/llvm-microsoft-demangle-fuzzer: Out-of-memory in llvm_llvm-microsoft-demangle-fuzzer
tha… via monorail via llvm-bugs
llvm-bugs at lists.llvm.org
Wed Jun 5 05:58:14 PDT 2019
Comment #2 on issue 15022 by thakis at chromium.org:
llvm/llvm-microsoft-demangle-fuzzer: Out-of-memory in
llvm_llvm-microsoft-demangle-fuzzer
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15022#c2
The root cause here ultimately is that demangling output can be exponential
in the size of the input, even for valid manglings. Consider:
$ cat test.cc
template <class T1, class T2, class T3, class T4, class T5,
class T6, class T7, class T8, class T9, class T10>
struct Fooob {};
using A0 = Fooob<int, int, int, int, int, int, int, int, int, int>;
using A1 = Fooob<A0, A0, A0, A0, A0, A0, A0, A0, A0, A0>;
using A2 = Fooob<A1, A1, A1, A1, A1, A1, A1, A1, A1, A1>;
using A3 = Fooob<A2, A2, A2, A2, A2, A2, A2, A2, A2, A2>;
using A4 = Fooob<A3, A3, A3, A3, A3, A3, A3, A3, A3, A3>;
using A5 = Fooob<A4, A4, A4, A4, A4, A4, A4, A4, A4, A4>;
using A6 = Fooob<A5, A5, A5, A5, A5, A5, A5, A5, A5, A5>;
using A7 = Fooob<A6, A6, A6, A6, A6, A6, A6, A6, A6, A6>;
void f(A7 a) {}
$ out/gn/bin/clang-cl /c test.cc
$ out/gn/bin/llvm-nm test.obj
00000000
T ?f@@YAXU?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at HHHHHHHHHH@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@@Z
00000000 a @feat.00
$ /usr/bin/time -l
out/gn/bin/llvm-undname '?f@@YAXU?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at U?$Fooob at HHHHHHHHHH@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@U1 at U1@U1 at U1@U1 at U1@U1 at U1@U1@@@@Z'
> tmp.txt
3.93 real 0.76 user 1.15 sys
1459576832 maximum resident set size
0 average shared memory size
0 average unshared data size
0 average unshared stack size
516405 page reclaims
0 page faults
0 swaps
0 block input operations
0 block output operations
0 messages sent
0 messages received
0 signals received
943 voluntary context switches
62 involuntary context switches
$ ls -hl tmp.txt
-rw-r--r-- 1 thakis staff 625M Jun 5 08:36 tmp.txt
This grows 10x with each additional A. I can make the backref entries lazy
and that should save one level of memory use and for this example it'd
probably cut memory use almost 10x -- but the cc file only needs to grow a
single additional line to waste that again, and the mangled string will
only grow by a few bytes for this.
This also isn't llvm-undname specific, llvm-cxxfilt has the same issue:
$ out/gn/bin/clang -c test.cc
$ out/gn/bin/llvm-nm test.o
0000000000000000 T
__Z1f5FooobIS_IS_IS_IS_IS_IS_IS_IiiiiiiiiiiES0_S0_S0_S0_S0_S0_S0_S0_S0_ES1_S1_S1_S1_S1_S1_S1_S1_S1_ES2_S2_S2_S2_S2_S2_S2_S2_S2_ES3_S3_S3_S3_S3_S3_S3_S3_S3_ES4_S4_S4_S4_S4_S4_S4_S4_S4_ES5_S5_S5_S5_S5_S5_S5_S5_S5_ES6_S6_S6_S6_S6_S6_S6_S6_S6_E
$ /usr/bin/time -l out/gn/bin/llvm-cxxfilt
_Z1f5FooobIS_IS_IS_IS_IS_IS_IS_IiiiiiiiiiiES0_S0_S0_S0_S0_S0_S0_S0_S0_ES1_S1_S1_S1_S1_S1_S1_S1_S1_ES2_S2_S2_S2_S2_S2_S2_S2_S2_ES3_S3_S3_S3_S3_S3_S3_S3_S3_ES4_S4_S4_S4_S4_S4_S4_S4_S4_ES5_S5_S5_S5_S5_S5_S5_S5_S5_ES6_S6_S6_S6_S6_S6_S6_S6_S6_E
> tmp.txt
5.02 real 2.85 user 0.78 sys
1159876608 maximum resident set size
0 average shared memory size
0 average unshared data size
0 average unshared stack size
316004 page reclaims
16 page faults
0 swaps
0 block input operations
0 block output operations
0 messages sent
0 messages received
0 signals received
671 voluntary context switches
189 involuntary context switches
$ ls -hl tmp.txt
-rw-r--r-- 1 thakis staff 552M Jun 5 08:39 tmp.txt
As said, the root cause here is that the output can be exponential in the
input size. Nobody really wants to have exponentially-sized output from
their demangler -- the following is just hard to read:
demumble '__Z1f5FooobIS_IS_IiiiiiiiiiiES0_S0_S0_S0_S0_S0_S0_S0_S0_ES1_S1_S1_S1_S1_S1_S1_S1_S1_E'
f(Fooob<Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int> >, Fooob<Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int>, Fooob<int, int, int, int, int, int, int, int, int, int>,
Fooob<int, int, int, int, int, int, int, int, int, int>, Fooob<int, int,
int, int, int, int, int, int, int, int>, Fooob<int, int, int, int, int,
int, int, int, int, int>, Fooob<int, int, int, int, int, int, int, int,
int, int> > >)
So maybe we should change the demangler output for backreferences to
instead produces something like this instead:
$
demumble '__Z1f5FooobIS_IS_IiiiiiiiiiiES0_S0_S0_S0_S0_S0_S0_S0_S0_ES1_S1_S1_S1_S1_S1_S1_S1_S1_E'
#define S0 Foob
#define S1 S0<int, int, int, int, int, int, int, int, int, int>
#define S2 S0<S1, S1, S1, S1, S1, S1, S1, S1, S1, S1>
f(S2)
which is a more direct translation of what's in the mangling and arguably
easier to read. It has the advantage of needing storage that's linear in
the input.
--
You received this message because:
1. You were specifically CC'd on the issue
You may adjust your notification preferences at:
https://bugs.chromium.org/hosting/settings
Reply to this email to add a comment.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20190605/ca9f959d/attachment-0001.html>
More information about the llvm-bugs
mailing list