[llvm] r322520 - [docs] Only LLVM IR bitstreams begin with 'BC'
Brian Gesiak via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 15 13:23:32 PST 2018
Author: modocache
Date: Mon Jan 15 13:23:32 2018
New Revision: 322520
URL: http://llvm.org/viewvc/llvm-project?rev=322520&view=rev
Log:
[docs] Only LLVM IR bitstreams begin with 'BC'
Summary:
The LLVM Bitcode File Format documentation states that all bitstreams
begin with the magic number 'BC', and that generic bitstream analyzer
tools may check for this number in order to determine whether the
stream is a bitstream.
However, in practice:
* Only LLVM IR bitcode begins with 'BC'. Other bitstreams -- Clang
  AST files and precompiled headers, Clang serialized diagnostics,
  Swift modules -- do not start with 'BC'. A tool that actually checked
  for 'BC' would only be able to recognize LLVM IR.
* `llvm-bcanalyzer`, arguably the most widely used generic bitstream
  analyzer tool, does not check for the magic number 'BC' (except to
  determine whether the file is LLVM IR).
Update the bitcode format documentation to make it clear that not all
bitstreams begin with 'BC', and that tools should not rely on that
particular magic number value.
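
As a rough illustration of the intended behavior, a generic tool could use
the first four bytes only to label a stream, never to reject it. The sketch
below is not taken from `llvm-bcanalyzer`; the names `KNOWN_MAGICS` and
`classify_magic` are hypothetical, and the only magic number it lists is the
LLVM IR one ('B', 'C', 0xC0, 0xDE) described in the documentation:

```python
# Hypothetical sketch: label a bitstream by its magic number without
# treating an unfamiliar magic number as an error.

# LLVM IR magic number: 'B', 'C', then the nibbles 0x0, 0xC, 0xE, 0xD,
# which read as the bytes 0xC0 0xDE.
LLVM_IR_MAGIC = bytes([0x42, 0x43, 0xC0, 0xDE])

# Assumption: a tool would extend this table with other formats it knows.
KNOWN_MAGICS = {LLVM_IR_MAGIC: "LLVM IR"}


def classify_magic(stream: bytes) -> str:
    """Return a label for the stream type, or "unknown" if unrecognized."""
    if len(stream) < 4:
        # Fewer than four bytes cannot hold a magic number at all.
        raise ValueError("stream is too short to contain a magic number")
    # An unrecognized magic number is reported, not rejected.
    return KNOWN_MAGICS.get(bytes(stream[:4]), "unknown")
```

Called on a buffer that starts with the bytes 0x42 0x43 0xC0 0xDE, this
returns "LLVM IR"; any other four-byte prefix yields "unknown", and the
caller can still go on to parse the stream.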
Test Plan:
Build the `docs-llvm-html` target and confirm the changes render in
a Safari web browser.
Reviewers: harlanhaskins, eugenis, mehdi_amini, pcc, angerman
Reviewed By: angerman
Subscribers: angerman, llvm-commits
Differential Revision: https://reviews.llvm.org/D42002
Modified:
llvm/trunk/docs/BitCodeFormat.rst
Modified: llvm/trunk/docs/BitCodeFormat.rst
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/BitCodeFormat.rst?rev=322520&r1=322519&r2=322520&view=diff
==============================================================================
--- llvm/trunk/docs/BitCodeFormat.rst (original)
+++ llvm/trunk/docs/BitCodeFormat.rst Mon Jan 15 13:23:32 2018
@@ -62,10 +62,12 @@ understanding the encoding.
Magic Numbers
-------------
-The first two bytes of a bitcode file are 'BC' (``0x42``, ``0x43``). The second
-two bytes are an application-specific magic number. Generic bitcode tools can
-look at only the first two bytes to verify the file is bitcode, while
-application-specific programs will want to look at all four.
+The first four bytes of a bitstream are used as an application-specific magic
+number. Generic bitcode tools may look at the first four bytes to determine
+whether the stream is a known stream type. However, these tools should *not*
+determine whether a bitstream is valid based on its magic number alone. New
+application-specific bitstream formats are being developed all the time; tools
+should not reject them just because they have a hitherto unseen magic number.
.. _primitives:
@@ -496,12 +498,9 @@ LLVM IR Magic Number
The magic number for LLVM IR files is:
:raw-html:`<tt><blockquote>`
-[0x0\ :sub:`4`, 0xC\ :sub:`4`, 0xE\ :sub:`4`, 0xD\ :sub:`4`]
+['B'\ :sub:`8`, 'C'\ :sub:`8`, 0x0\ :sub:`4`, 0xC\ :sub:`4`, 0xE\ :sub:`4`, 0xD\ :sub:`4`]
:raw-html:`</blockquote></tt>`
-When combined with the bitcode magic number and viewed as bytes, this is
-``"BC 0xC0DE"``.
-
.. _Signed VBRs:
Signed VBRs