<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/124874>124874</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[WebAssembly] Non-canonical LEB128 parameters wasting significant space
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Photosounder
</td>
</tr>
</table>
<pre>
WebAssembly binaries produced by clang (currently 19.1.7) do something strange: they write LEB128 immediate parameters in the longest padded form (long enough for 32 or 64 bits, regardless of what the actual value needs), which is both wasteful and non-canonical, like writing `82 80 80 80 00` instead of simply `02`. At least a few percent of the binary size seems to be made up of those unnecessary LEB128 padding bytes.
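To make the difference concrete, here is a minimal Python sketch of unsigned LEB128 encoding in both forms. The function names `uleb128` and `uleb128_padded` are made up for illustration; this is not LLVM's code, just the encoding rule as I understand it.

```python
def uleb128(value):
    """Canonical (shortest) unsigned LEB128 encoding of a non-negative int."""
    out = bytearray()
    while True:
        value, low7 = divmod(value, 128)  # peel off the low 7 bits
        if value:
            out.append(low7 | 0x80)       # more bytes follow: set the continuation bit
        else:
            out.append(low7)              # last byte: continuation bit clear
            return bytes(out)


def uleb128_padded(value, width):
    """Same value stretched to exactly `width` bytes with redundant continuation bytes."""
    out = bytearray()
    for _ in range(width - 1):
        value, low7 = divmod(value, 128)
        out.append(low7 | 0x80)           # continuation bit set even when nothing is left
    if value >= 128:
        raise ValueError("value does not fit in the requested width")
    out.append(value)                     # final byte terminates the sequence
    return bytes(out)


print(uleb128(2).hex(" "))            # 02
print(uleb128_padded(2, 5).hex(" "))  # 82 80 80 80 00
```

Both outputs decode to the same value 2; the padded one just spends four extra bytes on it.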
Examples from a binary compiled with a wasm64 target:
global.get 0: `23 80 80 80 80 00` (`23` is the `global.get` opcode and what follows is the immediate argument which is the global's ID)
global.set 0: `24 80 80 80 80 00`
local.set and local.get don't have this problem; for instance, local.get 0: `20 00`
i64.const 21093: `42 e5 a4 81 80 80 80 80 80 80 00`
i32.store 2 91016: `36 02 88 c7 85 80 80 80 80 80 80 00`
i32.load 2 91008: `28 02 80 c7 85 80 80 80 80 80 80 00`
call 20: `10 94 80 80 80 00`
call_indirect 14 0: `11 8e 80 80 80 00 80 80 80 80 00`
return_call 46: `12 ae 80 80 80 00`
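As a rough check of the operand bytes listed above, the following Python sketch decodes an unsigned LEB128 operand and reports how many bytes it occupies versus how many the shortest encoding would need. `decode_uleb128` is a hypothetical helper written for this report, and it only covers the unsigned operands (indices and memory offsets), not the signed `i64.const` immediate.

```python
def decode_uleb128(data):
    """Decode one unsigned LEB128 value.

    Returns (value, bytes_used, shortest_possible_size).
    """
    value, scale, used = 0, 1, 0
    for byte in data:
        used += 1
        value += (byte % 128) * scale  # low 7 bits carry the payload
        scale *= 128
        if byte // 128 == 0:           # continuation bit clear: last byte
            break
    else:
        raise ValueError("truncated LEB128 sequence")
    shortest = max(1, (value.bit_length() + 6) // 7)
    return value, used, shortest


# Operand bytes copied from the examples above (opcode and alignment bytes omitted).
operands = {
    "global.get 0": "80 80 80 80 00",
    "call 20": "94 80 80 80 00",
    "i32.load offset 91008": "80 c7 85 80 80 80 80 80 80 00",
}

for name, hex_bytes in operands.items():
    value, used, shortest = decode_uleb128(bytes.fromhex(hex_bytes))
    print(f"{name}: value={value}, {used} bytes used, {shortest} needed")
# global.get 0: value=0, 5 bytes used, 1 needed
# call 20: value=20, 5 bytes used, 1 needed
# i32.load offset 91008: value=91008, 10 bytes used, 3 needed
```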
Oddly, something like i32.store is inconsistent: sometimes it's canonical, like `36 02 01`, and other times it's not, seemingly when the argument (the address offset) is a long address like `0x15060` instead of a small offset like `4`. Presumably this is because the long address offset could be 32 or 64 bits, while the small offset (as when accessing a `struct` element) is known to be a small number right away.
</pre>