<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Assembler runs forever trying to assemble file with uleb128 and balign"
   href="https://bugs.llvm.org/show_bug.cgi?id=35809">35809</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Assembler runs forever trying to assemble file with uleb128 and balign
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>MC
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>ryan.prichard@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>The LLVM MC assembler loops forever trying to assemble this:

            .data
            .uleb128 thing_end - thing
    thing:
            .byte 0xd1, 0xd2, 0xd3, 0xd4
            .balign 2
            .fill 0x7b, 1, 0xcc
    thing_end:

It's not obvious how to assemble it, because it's impossible to satisfy all the
apparent constraints:

 - If the .uleb128 is one byte [encodes 0x00..0x7f], then .balign 2 also uses
   one byte, and (thing_end - thing) is 0x80. 0x80's uleb128 encoding is two
   bytes.

 - If the .uleb128 is two bytes, then .balign 2 (should?) use zero bytes, and
   (thing_end - thing) is 0x7f. 0x7f's uleb128 encoding is one byte.
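
The two cases above can be checked mechanically. This is only a minimal Python
sketch of the arithmetic, not what either assembler does internally. The names
uleb128_size, balign_padding, thing_size and find_consistent_sizes are
hypothetical, and it assumes .balign always emits the minimum padding.

```python
def uleb128_size(value):
    """Number of bytes in the ULEB128 encoding of a non-negative integer."""
    # Each byte carries 7 payload bits; zero still needs one byte.
    return max(1, -(-value.bit_length() // 7))


def balign_padding(offset, alignment):
    """Minimum padding that rounds offset up to a multiple of alignment."""
    return -offset % alignment


def thing_size(uleb_size):
    """thing_end - thing for the example, given an assumed .uleb128 size."""
    offset = uleb_size + 4            # .uleb128, then the four .byte values
    pad = balign_padding(offset, 2)   # .balign 2 (assumed minimal)
    return 4 + pad + 0x7b             # .byte values, padding, .fill 0x7b


def find_consistent_sizes(max_size=4):
    """Sizes s where a .uleb128 of s bytes encodes a value that needs s bytes."""
    return [s for s in range(1, max_size + 1)
            if uleb128_size(thing_size(s)) == s]


print(find_consistent_sizes())  # prints []
```

The empty result means no .uleb128 size is self-consistent under minimal
padding, which is consistent with a layout loop that never converges.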

I don't know if this issue is a practical concern yet. I noticed it while
looking at Clang's DWARF EH output, which uses udata4 encoding for offsets
rather than uleb128.

The GNU assembler successfully assembles it by expanding .balign 2 into two
zero bytes, which surprised me, because with a two-byte .uleb128 the location
is already aligned to 2 bytes.
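
As a cross-check of the dump below, the following Python sketch encodes the
size that results from two padding bytes. uleb128_encode is a hypothetical
helper written for this illustration, not part of any tool.

```python
def uleb128_encode(value):
    """Encode a non-negative integer as ULEB128 bytes."""
    out = bytearray()
    while True:
        low7 = value % 128               # low seven payload bits
        value //= 128
        if value:
            out.append(low7 + 0x80)      # more bytes follow: set the high bit
        else:
            out.append(low7)             # final byte: high bit clear
            return bytes(out)


# Layout seen in the GNU output: four .byte values, two zero padding bytes,
# then 0x7b fill bytes.
gnu_size = 4 + 2 + 0x7b
print(hex(gnu_size), uleb128_encode(gnu_size).hex())  # prints 0x81 8101
```

With two padding bytes the size is 0x81, whose encoding is the two bytes 81 01,
so the two-byte .uleb128 is self-consistent.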

    $ gcc uleb128-assembly-problem.s -c && readelf -x .data uleb128-assembly-problem.o
    Hex dump of section '.data':
      0x00000000 8101d1d2 d3d40000 cccccccc cccccccc ................
      0x00000010 cccccccc cccccccc cccccccc cccccccc ................
      0x00000020 cccccccc cccccccc cccccccc cccccccc ................
      0x00000030 cccccccc cccccccc cccccccc cccccccc ................
      0x00000040 cccccccc cccccccc cccccccc cccccccc ................
      0x00000050 cccccccc cccccccc cccccccc cccccccc ................
      0x00000060 cccccccc cccccccc cccccccc cccccccc ................
      0x00000070 cccccccc cccccccc cccccccc cccccccc ................
      0x00000080 cccccc                              ...</pre>
        </div>
      </p>


    </body>
</html>