[clang] [clang-repl] Enable native CPU detection by default (PR #77491)

Tue Jan 9 07:56:19 PST 2024

https://github.com/weliveindetail created https://github.com/llvm/llvm-project/pull/77491

We can pass `-mcpu=native` to the clang driver to let it consider the host CPU when choosing the compile target for `clang-repl`. We can already achieve this behavior with `clang-repl -Xcc -mcpu=native`, but it seems like a reasonable default actually.

The trade-off between optimizing for a specific CPU and maximum compatibility often leans towards the latter for static binaries, because distributing many versions is cumbersome. However, when compiling at runtime, we know the exact target CPU and we can use that to optimize the generated code.

This patch makes a difference especially for "scattered" architectures like ARM. When cross-compiling for a Raspberry Pi for example, we may use a stock toolchain like arm-linux-gnueabihf-gcc. The resulting binary will be compatible with all hardware versions. This is handy, but they will all have `arm-linux-gnueabihf` as their host triple. Previously, this caused the clang driver to select triple `armv6kz-linux-gnueabihf` and CPU `arm1176jzf-s` as the REPL target. After this patch the default triple and CPU on Raspberry Pi 4b will be `armv8a-linux-gnueabihf` and `cortex-a72` respectively.

>From 2a6847e9d1b2b91e7236208b3a62458a4fc7c78b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Stefan=20Gr=C3=A4nitz?= <stefan.graenitz at gmail.com>
Date: Tue, 9 Jan 2024 14:52:29 +0100
Subject: [PATCH] [clang-repl] Enable native CPU detection by default

We can pass `-mcpu=native` to the clang driver to let it consider the host CPU when choosing the compile target for `clang-repl`. We can already achieve this behavior with `clang-repl -Xcc -mcpu=native`, but it seems like a reasonable default actually.

The trade-off between optimizing for a specific CPU and maximum compatibility often leans towards the latter for static binaries, because distributing many versions is cumbersome. However, when compiling at runtime, we know the exact target CPU and we can use that to optimize the generated code.

This patch makes a difference especially for "scattered" architectures like ARM. When cross-compiling for a Raspberry Pi for example, we may use a stock toolchain like arm-linux-gnueabihf-gcc. The resulting binary will be compatible with all hardware versions. This is handy, but they will all have `arm-linux-gnueabihf` as their host triple. Previously, this caused the clang driver to select triple `armv6kz-linux-gnueabihf` and CPU `arm1176jzf-s` as the REPL target. After this patch the default triple and CPU on Raspberry Pi 4b will be `armv8a-linux-gnueabihf` and `cortex-a72` respectively.
---
 clang/lib/Interpreter/Interpreter.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/clang/lib/Interpreter/Interpreter.cpp b/clang/lib/Interpreter/Interpreter.cpp
index c9fcef5b5b5af1..734fe90d0d89b4 100644
--- a/clang/lib/Interpreter/Interpreter.cpp
+++ b/clang/lib/Interpreter/Interpreter.cpp
@@ -148,6 +148,7 @@ IncrementalCompilerBuilder::create(std::vector<const char *> &ClangArgv) {
   // We do C++ by default; append right after argv[0] if no "-x" given
   ClangArgv.insert(ClangArgv.end(), "-Xclang");
   ClangArgv.insert(ClangArgv.end(), "-fincremental-extensions");
+  ClangArgv.insert(ClangArgv.end(), "-mcpu=native");
   ClangArgv.insert(ClangArgv.end(), "-c");
 
   // Put a dummy C++ file on to ensure there's at least one compile job for the