[llvm-dev] [RFC] New Clang target selection options for ARM/AArch64
Manoj Gupta via llvm-dev
llvm-dev at lists.llvm.org
Wed Apr 10 08:34:16 PDT 2019
On Fri, Sep 21, 2018 at 3:06 AM David Spickett via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> Below is a document detailing changes we'd like to make to Clang/LLVM to
> improve the usability of the target options for ARM and AArch64.
>
> To keep things simple the proposed changes are listed at the start and you
> can find the supporting examples at the end of the document.
>
> I look forward to your feedback.
>
> Thanks,
> David Spickett.
>
>
>
> RFC New Clang target feature selection options for ARM/AArch64
> --------------------------------------------------------------
>
> In this RFC we propose changes to ARM and AArch64 target selection, with
> the following top-level goals:
> - validate that given options make sense within architectural restrictions
> - make option discovery and documentation easier
> - unify the list of extensions that command lines and asm directives use
> - bring the options closer to GCC's where appropriate
>
> Current Options Comparison
> --------------------------
>
> |                      |      GCC      |     Clang     |
> |                      |-----|---------|-----|---------|
> |                      | ARM | AArch64 | ARM | AArch64 |
> |----------------------|-----|---------|-----|---------|
> | -march with '+<ext>' | Y   | Y       | Y   | Y       |
> | checks extensions    | Y   | N       | N   | N       |
> | .arch with '+<ext>'  | N   | N       | N   | Y       |
> | .arch_extension      | Y   | Y       | Y   | N       |
> | .fpu                 | Y   | N       | Y   | N       |
> | -mfpu                | Y   | N       | Y   | N       |
> | checks FPUs          | N   | n/a     | N   | n/a     |
> |----------------------|-----|---------|-----|---------|
>
> Examples of each of these can be found at the end of this document.
>
> Problems With the Current Options
> ---------------------------------
>
> - You cannot select all extensions through an assembly directive, since
> the AsmParser's list is a separate subset of the complete one in
> TargetParser.
> - Combinations of options are not checked for compatibility.
> - Many extensions are tied to their base architecture, though it is valid
> to add them individually to a previous v8.x-a architecture.
> - Users need to work out what FPU they need for ARM; this should be
> implied by the selected arch and extensions.
> - Discovery of valid extensions is difficult, both for the user and for
> the purposes of generating documentation.
>
> Proposed solution
> ------------------
>
> ARM and AArch64:
> - Make the TargetParser the single source for extension names, removing
> the AsmParser tables.
> - Reject unknown extension names with a diagnostic that includes a list of
> valid extensions for that architecture/CPU.
> - Reject invalid combinations of architecture/CPU and extensions with an
> error diagnostic.
> - Add independent subtarget features for each extension so that v8.x+1-a
> extensions can be used individually with earlier v8.x-a architectures where
> allowed.
> - Emit a warning, and ignore the option, when a mandatory feature of the
> base architecture is enabled with '+extension' or disabled with
> '+noextension'.
> - Errors caused by the solution above should be able to be downgraded to
> warnings with the usual -W* options. This applies only to cases where there
> is a reasonable interpretation of the options chosen.
>
> ARM:
> - Allow all possible ARM extensions in the '.arch_extension' directive,
> without the '+' syntax
> (allow them to be recognised, they could still be rejected for
> compatibility).
> - Add an 'auto' value for -mfpu and make it the default, meaning that the
> FPU is implied by mcpu/march. If mfpu is not 'auto', it should override
> other options and a warning should be emitted.
> - Reject invalid mfpu and march/mcpu combinations with an error diagnostic.
> - Reject invalid arch/cpu and extension combinations with an error
> diagnostic.
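
A minimal Python sketch of the '-mfpu=auto' behaviour proposed in the ARM list above, for illustration only. The IMPLIED_FPU and VALID_FPUS tables, the select_fpu name and the message wording are assumptions made up for this example; they are not the real TargetParser data or diagnostics.

```python
# Hypothetical tables for illustration; not the real TargetParser contents.
IMPLIED_FPU = {
    "armv7-m": "none",
    "armv7-a": "neon",
    "armv8-a": "crypto-neon-fp-armv8",
}
VALID_FPUS = {
    "armv7-m": {"none"},
    "armv7-a": {"none", "vfpv3-d16", "vfpv3", "neon", "neon-fp16"},
    "armv8-a": {"none", "fp-armv8", "neon-fp-armv8", "crypto-neon-fp-armv8"},
}


def select_fpu(march, mfpu="auto"):
    """Return (fpu, warnings) for an -march/-mfpu pair.

    'auto' takes the FPU implied by the architecture. An explicit value
    overrides the implied one with a warning, and is rejected with an
    error if it is not valid for the architecture.
    """
    implied = IMPLIED_FPU[march]
    if mfpu == "auto":
        return implied, []
    if mfpu not in VALID_FPUS[march]:
        valid = ", ".join(sorted(VALID_FPUS[march]))
        raise ValueError(
            f"'-mfpu={mfpu}' is incompatible with '{march}'; valid FPUs: {valid}"
        )
    warnings = []
    if mfpu != implied:
        warnings.append(
            f"'-mfpu={mfpu}' overrides FPU '{implied}' implied by '{march}'"
        )
    return mfpu, warnings


print(select_fpu("armv8-a"))               # implied FPU, no warnings
print(select_fpu("armv7-a", "vfpv3-d16"))  # explicit FPU, one warning
```

With these made-up tables, the armv7-m plus neon-fp16 combination from the examples at the end of the RFC would raise the error instead of being silently accepted.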
>
> Optional features
> -----------------
>
> AArch64:
> - add the '.arch_extension' directive, with the same behaviour as ARM (no
> '+', one extension per directive). This brings Clang in line with GCC which
> has this directive for both architectures. Clang does however allow you to
> achieve the same thing by using '+' with '.arch'.
>
> ARM:
> - Allow '+' in '.arch' and '.cpu'. GCC does not allow this, but it would
> make ARM/AArch64 more consistent within Clang.
>
> Options Comparison With the Proposed Solution
> ----------------------------------------------
>
> Anything in brackets has changed from the previous table.
>
> |                      |      GCC      |     Clang     |
> |                      |-----|---------|-----|---------|
> |                      | ARM | AArch64 | ARM | AArch64 |
> |----------------------|-----|---------|-----|---------|
> | -march with '+<ext>' | Y   | Y       | Y   | Y       |
> | checks extensions    | Y   | N       | (Y) | (Y)     |
> | .arch with '+<ext>'  | N   | N       | (Y) | Y       | (optional)
> | .arch_extension      | Y   | Y       | Y   | (Y)     | (optional)
> | -mfpu                | Y   | N       | Y   | N       |
> | .fpu                 | Y   | N       | Y   | N       |
> | checks FPUs          | N   | n/a     | (Y) | n/a     |
> |----------------------|-----|---------|-----|---------|
>
> Implementation
> --------------
>
> Use of Table-gen
> ================
>
> The current implementation of TargetParser has a number of FIXME comments
> saying that it should be changed to use tablegen instead of preprocessor
> macros. There are several advantages of porting TargetParser to tablegen:
> - more readable than the current macros
> - allows default/optional values more easily
> - we can generate code and documentation from the same source
> - easier to add new properties
>
> Drawbacks:
> - it requires a new tablegen backend to generate the include files
> - additional indirection which could make debugging and future changes
> more difficult
>
> We think the benefits outweigh the disadvantages in this case.
>
> To do this, we would need to move TargetParser to break the cyclic
> dependency of LLVMSupport -> llvm-tblgen -> LLVMSupport. There are 2
> options for this:
> 1. create a new LLVMTargetParser library that contains all parsers for
> architectures that use it.
> 2. put the TargetParser for each backend in the library group for that
> backend. This requires one of:
> * Relaxing the requirement that target parsers must be built even if
> the backend is not.
> * Modifying the CMake scripts to build the target parsers even if the
> backend is not being built.
>
> Option 1 is simpler but option 2 would allow us to make use of the
> existing tablegen files in the backends so it is preferred.
>
> Using existing SubTarget features
> =================================
>
> If we go with option 2 above, we can reuse the existing subtarget features
> to work out any dependencies.
>
> We have a prototype that took option 1 above. The command line is
> converted into a sequence of options and resolved by the LLVM backend. This
> means that Clang does not know exactly what will be enabled. It needs to
> know this to output the correct preprocessor feature test macros.
>
> Consider this AArch64 march:
> -march=armv8.4-a+crypto+nosha2
>
> The base arch is armv8.4-a, and the crypto extension turns on
> AES/SHA2/SHA3/SM4. The nosha2 disables SHA2/SHA3 (since SHA3 is dependent
> on SHA2). Each of these features has an ACLE feature test macro, so Clang
> needs to know that nosha2 also disables SHA3.
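
The resolution described for this -march string can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the GROUPS and REQUIRES tables and the function names are hypothetical and contain just the facts given in the paragraph above, not the real subtarget feature definitions.

```python
# Illustrative tables only: hypothetical stand-ins, not real TargetParser data.
GROUPS = {
    # (base architecture, modifier) -> features the modifier turns on
    ("armv8.4-a", "crypto"): ["aes", "sha2", "sha3", "sm4"],
}
REQUIRES = {
    # feature -> features it depends on
    "sha3": {"sha2"},
}


def enable(enabled, feature):
    """Enable a feature together with everything it requires."""
    enabled.add(feature)
    for dep in REQUIRES.get(feature, ()):
        if dep not in enabled:
            enable(enabled, dep)


def disable(enabled, feature):
    """Disable a feature together with everything that requires it."""
    enabled.discard(feature)
    for other, deps in REQUIRES.items():
        if feature in deps and other in enabled:
            disable(enabled, other)


def resolve(march):
    """Return (base arch, set of enabled features) for an -march string."""
    base, *modifiers = march.split("+")
    enabled = set()
    for mod in modifiers:
        if mod.startswith("no"):
            disable(enabled, mod[2:])
        else:
            for feature in GROUPS.get((base, mod), [mod]):
                enable(enabled, feature)
    return base, enabled


base, features = resolve("armv8.4-a+crypto+nosha2")
print(base, sorted(features))  # armv8.4-a ['aes', 'sm4']
```

The point of the sketch is that the front end can only pick the right ACLE macros if it performs (or can query) this resolution itself, rather than handing an unresolved option sequence to the backend.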
>
> New Errors and Warnings
> =======================
>
> Whether these are errors or warnings by default is up for debate. This is
> a suggestion to begin with.
> (These apply to command lines and directives unless stated otherwise.)
>
> Errors:
> - unknown extension in an assembly directive (currently fails silently)
> - extension incompatible with base arch, message shows the base arch it
> requires.
> - extension requires another which is disabled later, message shows which
> one is required.
> - extension requires another which is not enabled, message shows
> requirements.
> - ARM mfpu option is not 'auto' and is incompatible with the base arch,
> message shows list of valid FPUs.
>
> Warnings:
> - ARM mfpu option is not auto and another option implies a different FPU
> than the mfpu value. The mfpu value will be used, and the message will show
> what was overridden.
> - mandatory feature of the base arch is enabled with '+' (option is
> redundant so is ignored)
> - mandatory feature of a base arch is disabled with '+no<feature>' (option
> makes no sense so the extension remains enabled)
>
> Proposed diagnostic names: (in the same order as above)
> - "target-feature" (top level group)
> - "incompatible-feature"
> - "extension-requirement-disabled"
> - "extension-requires"
> - "incompatible-fpu"
> - "implied-fpu-unused"
> - "mandatory-feature-ignored"
> - "mandatory-feature-disabled"
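
As a rough illustration of how some of the checks listed above could fit together, here is a small Python sketch. The diagnostic names are the ones proposed in the list; everything else (the MIN_ARCH, REQUIRES, MANDATORY_SINCE and DEFAULT_ON tables, the function names, and the choice of example features) is an assumption made for this example and not a description of the actual implementation.

```python
# Hypothetical AArch64-style tables, for illustration only.
MIN_ARCH = {"dotprod": (8, 2)}         # feature -> earliest base arch allowed
REQUIRES = {"crypto": {"simd"}}        # feature -> features it needs
MANDATORY_SINCE = {"crc": (8, 1)}      # feature -> arch where it is mandatory
DEFAULT_ON = {"fp", "simd"}            # assumed to be enabled by default


def parse_base(base):
    """'armv8.2-a' -> (8, 2); 'armv8-a' -> (8, 0)."""
    version = base[len("armv"):].split("-")[0]
    major, _, minor = version.partition(".")
    return int(major), int(minor or 0)


def check(march):
    """Return a list of (severity, diagnostic name, feature) tuples."""
    base, *modifiers = march.split("+")
    version = parse_base(base)
    enabled = set(DEFAULT_ON)
    diags = []
    for mod in modifiers:
        negated = mod.startswith("no")
        name = mod[2:] if negated else mod
        if name in MANDATORY_SINCE and version >= MANDATORY_SINCE[name]:
            diag = "mandatory-feature-disabled" if negated else "mandatory-feature-ignored"
            diags.append(("warning", diag, name))
            continue  # the option is ignored either way
        if negated:
            enabled.discard(name)
            for other in sorted(enabled):
                if name in REQUIRES.get(other, ()):
                    diags.append(("error", "extension-requirement-disabled", other))
            continue
        if version < MIN_ARCH.get(name, (0, 0)):
            diags.append(("error", "incompatible-feature", name))
            continue
        if REQUIRES.get(name, set()) - enabled:
            diags.append(("error", "extension-requires", name))
        enabled.add(name)
    return diags


print(check("armv8-a+dotprod"))
print(check("armv8-a+crypto+nosimd"))
```

Under these assumed tables the two calls report "incompatible-feature" for dotprod and "extension-requirement-disabled" for crypto respectively, which matches the two AArch64 command-line examples later in the RFC that are currently accepted silently.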
>
> "Negative" Backend Features
> ===========================
>
> There are a couple of features in ARM which remove capabilities rather
> than adding them. These are 'd16' (removes the top 16 D registers) and
> 'fp-only-sp' (removes double precision).
> It would simplify the implementation if those were replaced with positive
> options: one that adds the top 16 D registers and one that enables
> double-precision operations.
>
> This is a relatively simple change to LLVM but it will affect a large
> number of tests and would be a breaking change for users of LLVM as a
> library.
>
> .arch_extension Directive
> =========================
>
> Regardless of '.arch_extension' being added to AArch64, it has some issues
> that need to be addressed for the rest of these changes.
>
> Extensions can now have different meanings based on the base architecture
> they apply to. For example on AArch64, 'crypto' means different things for
> v8.{1,2,3}-a than v8.4-a. The former adds 'sha2' and 'aes', the latter adds
> those and 'sm4' and 'sha3' on top.
>
> We can handle this in a few ways:
> - Remove .arch_extension in favour of .arch. This conflicts with the
> option above to add it to AArch64 to bring us in line with GCC, and will
> break a lot of code written for older versions of Clang.
> - Only accept options which do not vary with base architecture. For ARM,
> only the FPU options vary, and there is the .fpu directive for those. If we
> do decide to add .arch_extension to AArch64 this will mean that things like
> crypto will only be valid in .arch.
> - Track the current base target, as implied by the command line or the
> last .arch/.cpu directive. This makes the directives as similar to the
> command lines as they can be without breaking backwards compatibility.
>
> The last option makes the most sense to us, certainly if we want to add
> .arch_extension to AArch64 in a straightforward way.
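
To make the "track the current base target" option concrete, here is a small Python sketch. The CRYPTO_BY_ARCH table encodes only what the text above says about 'crypto'; the DirectiveState class, its method names, and the choice to reset extensions on a new .arch are assumptions of this example rather than anything the RFC specifies.

```python
# What 'crypto' means per base architecture, as described in the text above.
CRYPTO_BY_ARCH = {
    "armv8.1-a": {"aes", "sha2"},
    "armv8.2-a": {"aes", "sha2"},
    "armv8.3-a": {"aes", "sha2"},
    "armv8.4-a": {"aes", "sha2", "sha3", "sm4"},
}


class DirectiveState:
    """Tracks the base target seen so far while assembling (hypothetical)."""

    def __init__(self, command_line_arch):
        # The initial base target is the one implied by the command line.
        self.base = command_line_arch
        self.features = set()

    def arch(self, name):
        # '.arch <name>': switch base target. Resetting the extension set
        # here is an assumption of this sketch.
        self.base = name
        self.features = set()

    def arch_extension(self, name):
        # '.arch_extension <name>': interpret the name against the current base.
        if name == "crypto":
            if self.base not in CRYPTO_BY_ARCH:
                raise ValueError(f"'crypto' meaning unknown for '{self.base}'")
            self.features |= CRYPTO_BY_ARCH[self.base]
        else:
            self.features.add(name)


state = DirectiveState("armv8.2-a")   # e.g. -march=armv8.2-a
state.arch_extension("crypto")
print(sorted(state.features))         # ['aes', 'sha2']

state.arch("armv8.4-a")               # .arch armv8.4-a
state.arch_extension("crypto")
print(sorted(state.features))         # ['aes', 'sha2', 'sha3', 'sm4']
```

The same directive text yields a different feature set depending on the last base target seen, which is the behaviour the command line already has.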
>
> ARM Assembly Directives
> =======================
>
> As discussed for AArch64, the ARM assembly directives ('.arch', '.cpu',
> '.fpu', '.arch_extension') should be updated to use the new target parser,
> giving them access to a complete list of features.
>
> '.arch' and '.cpu' supporting the '+' syntax is mentioned as an optional
> goal above. This makes ARM/AArch64 consistent within Clang but breaks from
> GCC's features.
>
> Current Command Line Option Examples
> ------------------------------------
>
> Clang ARM
> =========
>
> Extensions can be used with '+<{no}extension>' syntax on march or mcpu,
> there is no checking that the combinations are valid. The FPU is selected
> with -mfpu and this is not validated either.
>
> $ ./clang --target=arm-arm-none-eabi -march=armv8.2-a -mfpu=none -c /tmp/test.c -o /tmp/test.o
> $ ./clang --target=arm-arm-none-eabi -mcpu=cortex-a53+dotprod -c /tmp/test.c -o /tmp/test.o
> (can't use dotprod with v8-a)
>
> $ ./clang --target=arm-arm-none-eabi -march=armv7-m -mfpu=neon-fp16 -c /tmp/test.c -o /tmp/test.o
> (should be invalid but is allowed)
>
> GCC ARM
> =======
>
> For GCC it is the same except that mfpu defaults to 'auto', meaning that
> the value is implied by other options. Extensions are checked for
> compatibility with the base architecture but FPUs are not.
>
> $ ./arm-eabi-gcc -mcpu=cortex-a53 -mfpu=neon -c /tmp/test.c -o /tmp/test.o
> $ ./arm-eabi-gcc -march=armv8-a -mfpu=auto -c /tmp/test.c -o /tmp/test.o
>
> $ ./arm-eabi-gcc -march=armv8-a+dotprod -c /tmp/test.c -o /tmp/test.o
> arm-eabi-gcc: error: 'armv8-a' does not support feature 'dotprod'
> arm-eabi-gcc: note: valid feature names are: crc simd crypto nocrypto nofp
>
> $ ./arm-eabi-gcc -march=armv7-m -mfpu=neon-fp16 -c /tmp/test.c -o /tmp/test.o
> (same example given for Clang above, should be invalid)
>
> Clang AArch64
> =============
>
> The '+' syntax still applies but mfpu is replaced with '+' extensions.
>
> $ ./clang --target=aarch64-arm-none-eabi -march=armv8.2-a -mfpu=none -c /tmp/test.c -o /tmp/test.o
> clang-7: warning: argument unused during compilation: '-mfpu=none' [-Wunused-command-line-argument]
> $ ./clang --target=aarch64-arm-none-eabi -march=armv8.2-a+nofp -c /tmp/test.c -o /tmp/test.o
> $ ./clang --target=aarch64-arm-none-eabi -march=armv8-a+crypto -c /tmp/test.c -o /tmp/test.o
>
> Dependencies between extensions are not checked. For example, crypto
> requires simd, but simd can be disabled in the same march option.
>
> $ ./clang --target=aarch64-arm-none-eabi -march=armv8-a+crypto+nosimd -c /tmp/test.c -o /tmp/test.o
>
>
It is a bit late to reply, but can the extensions be specified independently
of "-march", e.g. "-march=armv8-a -mcrypto -mnosimd", similar to "-msse" and
"-mavx" on x86?

This is for situations where certain packages (e.g. media packages) want to
enable specific features based on runtime CPU detection. To enable, say,
"crypto", they are currently forced to choose an -march as well, but that
could override the architecture specified by the build system (or be
overridden by the -march the build system specifies). For example, it makes
little sense for "-march=armv8-a+extension" to override the build system's
"-march=armv8.3-a", or vice versa, when the only desire is to enable the
specific extension additively.

The existing additive alternative is "-Xclang -target-feature -Xclang
+feature", which is pretty ugly.
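
To illustrate the difference, here is a toy Python model of the two approaches. It assumes the usual "last -march wins" behaviour, and the "-mcrypto"/"-mnosimd" spellings are the hypothetical flags suggested above, not options that exist today. The KNOWN_EXTENSIONS set and the effective_target name are made up for the example.

```python
KNOWN_EXTENSIONS = {"crypto", "simd"}  # illustrative subset only


def effective_target(args):
    """Model which base arch and modifiers a command line ends up with.

    A later -march replaces an earlier one entirely, while the hypothetical
    -m<ext> / -mno<ext> flags are applied on top of whatever base is chosen.
    """
    base, march_mods, additive = None, [], []
    for arg in args:
        if arg.startswith("-march="):
            base, *march_mods = arg[len("-march="):].split("+")
        elif arg.startswith("-mno") and arg[4:] in KNOWN_EXTENSIONS:
            additive.append("no" + arg[4:])
        elif arg.startswith("-m") and arg[2:] in KNOWN_EXTENSIONS:
            additive.append(arg[2:])
    return base, march_mods + additive


# Build system picks armv8.3-a; the package appends its own -march to get crypto.
print(effective_target(["-march=armv8.3-a", "-march=armv8-a+crypto"]))
# Same build system flag, but the package uses the hypothetical additive flags.
print(effective_target(["-march=armv8.3-a", "-mcrypto", "-mnosimd"]))
```

In the first call the package's -march silently downgrades the base architecture to armv8-a; in the second the build system's armv8.3-a is preserved and only the extensions change.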
Thanks
> Dependencies between an extension and the base arch are not checked either.
> Dot product cannot be used with v8.0-a but it is allowed.
>
> $ ./clang --target=aarch64-arm-none-eabi -march=armv8-a+dotprod -c /tmp/test.c -o /tmp/test.o
>
> GCC AArch64
> ===========
>
> For GCC AArch64 mfpu is also dropped in favour of '+' extensions.
>
> $ ./aarch64-elf-gcc -march=armv8.2-a -mfpu=none -c /tmp/test.c -o /tmp/test.o
> aarch64-elf-gcc: error: unrecognized command line option '-mfpu=none'; did you mean '-gz=none'?
>
> Extensions are rejected if not recognised but not checked for
> compatibility. Hence the Clang crypto/simd example above is allowed with
> GCC too.
>
> $ ./aarch64-elf-gcc -march=armv8.2-a+food -c /tmp/test.c -o /tmp/test.o
> cc1: error: invalid feature modifier in '-march=armv8.2-a+food'
> $ ./aarch64-elf-gcc -march=armv8.2-a+dotprod -c /tmp/test.c -o /tmp/test.o
> $ ./aarch64-elf-gcc -march=armv8-a+dotprod -c /tmp/test.c -o /tmp/test.o
> (should not be allowed)
> $ ./aarch64-elf-gcc -march=armv8-a+crypto+nosimd -c /tmp/test.c -o /tmp/test.o
> (should not be allowed)
>
> Current Assembly Directive Examples
> -----------------------------------
>
> Clang .arch/.arch_extension
> ===========================
>
> AArch64 uses .arch with '+' syntax. ARM uses .arch_extension and .arch, and
> does not support '+' syntax in either.
>
> In both arches, the list of possible extensions is not complete since it
> is separate from the one in TargetParser. So there is no way to enable
> dotprod (amongst other things) with a directive.
>
> (example is using AArch64)
> .arch armv8.2-a # error: instruction requires: dotprod
> udot v0.2s, v1.8b, v2.8b
>
> .arch armv8.2-a+dotprod # error: instruction requires: dotprod
> udot v0.2s, v1.8b, v2.8b
>
> ARM uses the .arch_extension directive which is one extension per use,
> with no '+'.
>
> .arch armv7-a #error: instruction requires: crc armv8
> CRC32B r0, r1, r2
>
> .arch armv8-a+crc #error: Unknown arch name
> CRC32B r0, r1, r2
>
> .arch armv8-a # no error
> .arch_extension crc
> CRC32B r0, r1, r2
>
> You can see here that though ARM march/mcpu would understand +crc, the
> assembly directive does not.
>
> ARM does check validity of extensions provided with '.arch_extension'.
>
> .arch armv7-a
> .arch_extension crc
> CRC32B r0, r1, r2
>
> main.s:20:17: error: architectural extension 'crc' is not allowed for the current base architecture
> .arch_extension crc
>
> AArch64 only rejects known extensions that aren't supported at all.
>
> .arch armv8-a+pan # unsupported architectural extension: pan
> nop
>
> Neither ARM nor AArch64 knows about the interdependencies between
> extensions. So the example from the command lines applies here too.
>
> (example is using AArch64)
> .arch armv8-a+crypto+nosimd # no error/warning, crypto requires simd
> nop
>
> GCC .arch/.arch_extension
> =========================
>
> GCC is more consistent across the two arches: both use .arch and
> .arch_extension. Neither understands the '+' syntax.
>
> .arch armv8-a+crc # invalid
>
> .arch armv8-a # valid
> .arch_extension crc
>
> .arch_extension crc # valid
> .arch_extension crc+crypto #invalid
>
> For extensions that vary based on base architecture, GCC tracks the last
> known arch.
>
> Clang .fpu
> ==========
>
> .fpu is only available for ARM. Values are not checked for compatibility,
> only rejected if completely unknown.
>
> $ ./clang --target=aarch64-arm-none-eabi -march=armv8-a -c /tmp/test.s -o /tmp/test.o
> /tmp/test.s:1:1: error: unknown directive
> .fpu neon
> ^
>
> $ ./clang --target=arm-arm-none-eabi -march=armv7-m -c /tmp/test.s -o /tmp/test.o
> /tmp/test.s:1:6: error: Unknown FPU name
> .fpu clearly-not-valid
> ^
>
> (same example as 'Clang ARM' command lines, should be invalid)
> $ cat /tmp/test.s
> .fpu neon-fp16
> $ ./clang --target=arm-arm-none-eabi -march=armv7-m -c /tmp/test.s -o /tmp/test.o
>
> GCC .fpu
> ========
>
> .fpu is provided for ARM only and the FPU names are not checked against
> the base arch or CPU.
>
> This is correctly rejected from a command line:
> $ ./arm-eabi-gcc -march=armv6zk+neon -c /tmp/test.s -o /tmp/test.o
> arm-eabi-gcc: error: 'armv6zk' does not support feature 'neon'
> arm-eabi-gcc: note: valid feature names are: fp nofp vfpv2
>
> Whereas the directive is accepted:
> $ cat /tmp/test.s
> .fpu neon
> nop
> $ ./arm-eabi-gcc -march=armv6zk -c /tmp/test.s -o /tmp/test.o
>
> For AArch64 .fpu is removed in favour of .arch_extension. Instead of
> directly selecting an FPU it is implied by the extensions used.
>
> $ cat /tmp/test.s
> .fpu neon
> $ ./aarch64-elf-gcc -march=armv8-a+simd -c /tmp/test.s -o /tmp/test.o
> /tmp/test.s: Assembler messages:
> /tmp/test.s:1: Error: unknown pseudo-op: `.fpu'
>
> $ cat /tmp/test.s
> .arch_extension simd
> $ ./aarch64-elf-gcc -march=armv8-a -c /tmp/test.s -o /tmp/test.o
>
> References
> ----------
>
> Crypto extension requires SIMD:
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0500e/CJHDEBAF.html
> GCC ARM options: https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
> GCC ARM directives:
> https://sourceware.org/binutils/docs/as/ARM-Directives.html
> GCC AArch64 options:
> https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html
> GCC AArch64 directives:
> https://sourceware.org/binutils/docs/as/AArch64-Directives.html
>