<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/62800>62800</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[mlir] Property storage makes parsing a file with many attributes go OOM
</td>
</tr>
<tr>
<th>Labels</th>
<td>
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
matthiaskramm
</td>
</tr>
</table>
<pre>
We have a .mlir file that contains a function with a huge number of arguments (~300k args, so the function has properties like `arg272506`).
While this is arguably a silly thing to have (it is part of a stress test, but as far as I can tell it stems from a real model), it does break MLIR after the introduction of properties. Memory consumption goes beyond 100 GB just for parsing the file.
As far as I can tell, this is what happens:
(a) `OperationParser` calls `OpBuilder::create` which calls `Operation::create`.
(b) `Operation::create` calls `Operation::setAttrs` with the (huge) attribute dictionary.
(c) `Operation::setAttrs` discovers that we have properties storage and decides that it has to set the attributes one by one, since any of them might be an inherent attribute.
(d) `Operation::setAttr` is called individually for each of the 500k attributes, likely uniquing each (partial) `NamedAttrList` in the process.
(e) We run out of memory.
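
To illustrate why step (d) could blow up, here is a toy Python model. It is not MLIR code: `ToyContext`, `entries_stored_one_by_one` and `entries_stored_bulk` are made-up names, and the model assumes that every intermediate attribute list is uniqued and kept alive for the lifetime of the context. Under that assumption, setting n attributes one by one stores 1 + 2 + ... + n = n(n+1)/2 entries, versus n for a single bulk assignment.

```python
class ToyContext:
    """Hypothetical stand-in for a uniquing context; not the MLIR API."""

    def __init__(self):
        self.uniqued = set()      # every distinct list ever requested, never freed
        self.stored_entries = 0   # total entries held across all stored lists

    def get(self, attrs):
        # Return the uniqued copy of the list, storing it on first request.
        key = tuple(attrs)
        if key not in self.uniqued:
            self.uniqued.add(key)
            self.stored_entries += len(key)
        return key


def entries_stored_one_by_one(n):
    # Models calling setAttr once per attribute: each call uniques a list
    # that is one entry longer than the previous one.
    ctx = ToyContext()
    current = []
    for i in range(n):
        current.append(f"arg{i}")
        ctx.get(current)
    return ctx.stored_entries


def entries_stored_bulk(n):
    # Models assigning the whole dictionary at once: one list is uniqued.
    ctx = ToyContext()
    ctx.get([f"arg{i}" for i in range(n)])
    return ctx.stored_entries
```

If the model is roughly right, n = 300,000 gives about 4.5 * 10^10 stored entries, which would be consistent with the memory use described above.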
I'm not sure what the best fix is here. Maybe `Operation::setAttrs` needs to check the name of each inherent attribute against `newAttrs` to see if there's overlap, and if there isn't, it can still fall back to `attrs = newAttrs`?
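
A rough sketch of that idea, again as a toy Python model rather than a patch: `ToyOp`, `inherent_names` and `slow_path_calls` are hypothetical names, and how the real code would enumerate an op's inherent attribute names is left open.

```python
class ToyOp:
    """Hypothetical operation with properties storage; not the MLIR API."""

    def __init__(self, inherent_names):
        self.inherent_names = set(inherent_names)  # names backed by properties
        self.properties = {}                       # inherent attributes
        self.attrs = {}                            # discardable attribute dictionary
        self.slow_path_calls = 0                   # how often set_attr ran

    def set_attr(self, name, value):
        # Slow path: route a single attribute to properties or to the dictionary.
        self.slow_path_calls += 1
        if name in self.inherent_names:
            self.properties[name] = value
        else:
            self.attrs[name] = value

    def set_attrs(self, new_attrs):
        # Fast path: no name in new_attrs is inherent, so take the dictionary whole.
        if self.inherent_names.isdisjoint(new_attrs):
            self.attrs = dict(new_attrs)
            return
        # Otherwise fall back to setting the attributes one by one.
        self.attrs = {}
        for name, value in new_attrs.items():
            self.set_attr(name, value)
```

The overlap check costs one pass over `new_attrs`, so the huge-argument case would stay linear as long as none of the names are inherent.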
</pre>