Outside IT blocks, the 16-bit MOV immediate instruction sets flags (it is MOVS).
LLVM often avoids it by emitting the 4-byte mov.w, even when it could do
something else instead to keep the 2-byte encoding. This ends up ballooning
code size quite a bit, since 0 and 1 are very common 8-bit constants.

In fact, this occurs for much more than just MOV. Here are some examples:
180:    f100 0001     add.w    r0, r0, #1    ; 0x1
186:    f04f 0009     mov.w    r0, #9    ; 0x9
1ea:    f145 0200     adc.w    r2, r5, #0    ; 0x0
1f4:    f04f 0600     mov.w    r6, #0    ; 0x0

Here is an example of why:
 1ee:    4283          cmp    r3, r0
 1f0:    f8cc 201c     str.w    r2, [ip, #28]
 1f4:    f04f 0600     mov.w    r6, #0    ; 0x0 <---
 1f8:    bf98          it    ls
 1fa:    2601          movls    r6, #1
 1fc:    428a          cmp    r2, r1
 1fe:    f04f 0500     mov.w    r5, #0    ; 0x0 <---
 202:    bfd8          it    le
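
For comparison, here is a hand-written sketch (not compiler output) of the same sequence with each cmp placed immediately before its it. The 2-byte movs clobbers the flags, so it has to go before the cmp rather than after it. The instruction following "it le" is not shown in the dump above and is left out here as well.

```asm
    str.w   r2, [ip, #28]   @ unchanged, 4 bytes (ip is a high register)
    movs    r6, #0          @ 2 bytes; flags it sets are dead, cmp comes next
    cmp     r3, r0
    it      ls
    movls   r6, #1
    movs    r5, #0          @ 2 bytes; again placed before the cmp
    cmp     r2, r1
    it      le              @ conditional instruction follows as in the original
```

For the instructions shown, this is 18 bytes instead of 22.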

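In both of these cases the cmp could be moved directly next to its it
instruction. The mov would then no longer sit between them while the flags are
live, so it could be a 2-byte movs instead of a 4-byte mov.w. Here's another
example: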

 2c8:    2a00          cmp    r2, #0
 2ca:    bf01          itttt    eq
 2cc:    f8cc 8008     streq.w    r8, [ip, #8]
 2d0:    f8cc 100c     streq.w    r1, [ip, #12]
 2d4:    f04f 0e00     moveq.w    lr, #0    ; 0x0
 2d8:    460d          moveq    r5, r1
 2da:    f1be 0f00     cmp.w    lr, #0    ; 0x0
 2de:    f04f 0200     mov.w    r2, #0    ; 0x0 <---
 2e2:    bfb8          it    lt

Not only could the cmp/itttt pair be turned into a cbnz to save code space and
processor time, but again we see a mov.w inserted unnecessarily between the
cmp.w and its it instruction, which forces the mov to be 4 bytes instead of 2.
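
Roughly what I'd expect instead, as a hand-written sketch (not compiler output). It assumes the flags from "cmp r2, #0" are not needed after the conditional block, which holds for the code shown because the cmp.w overwrites them. The local label "1:" is my own addition, and the instruction after "it lt" is omitted because the dump does not show it.

```asm
    cbnz    r2, 1f          @ 2 bytes; replaces cmp r2, #0 + itttt eq (4 bytes)
    str.w   r8, [ip, #8]    @ formerly streq.w
    str.w   r1, [ip, #12]   @ formerly streq.w
    mov.w   lr, #0          @ stays 4 bytes: lr is a high register
    mov     r5, r1          @ formerly moveq
1:
    movs    r2, #0          @ 2 bytes; moved before the cmp.w so flags are dead
    cmp.w   lr, #0
    it      lt              @ conditional instruction follows as in the original
```

For the instructions shown, this is 24 bytes instead of 28. The mov of #0 into lr cannot shrink either way, since the 16-bit MOVS immediate encoding only accepts low registers.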

It would be nice if this could be fixed. The cmp instructions are unnecessarily
far from their corresponding it instructions.

I expect this affects performance as well as size, given that only so much code
gets fetched at a time, and this increases code size *a lot*.

