<div dir="ltr"><p>If you follow the news sites you're probably aware that <a href="http://packet.net">packet.net</a>
launched an AArch64 cloud server this week: 96 cores (2x 48-core 2.0
GHz Cavium "ThunderX"), 128 GB RAM and a 320 GB M.2 SSD, for $0.50/hour.<br>
<br>
<a href="https://www.packet.net/blog/arming-the-world-with-an-arm64-bare-metal-server/">https://www.packet.net/blog/arming-the-world-with-an-arm64-bare-metal-server/</a><br>
<br>
I made an account, built LLVM and Clang on it, and compared the build time with various Intel machines (fastest first):<br>
<br>
05m31s AWS c4.8xlarge 36 vCPU 60 GB RAM $1.68/hour<br>
08m47s <a href="http://packet.net">packet.net</a> A64 96 core 128 GB RAM $0.50/hour<br>
15m08s local i7 6700K 4 core 32 GB RAM<br>
21m30s AWS c4.2xlarge 4 vCPU 15 GB RAM $0.42/hour<br>
22m37s local i7 3770 4 core 32 GB RAM<br>
<br>
So this ARM server is not as fast as the fastest Intel machine (AWS
c4.8xlarge), but it has much better price/performance: about 60% of the
performance at 30% of the price [1]. By the timings above it is also
roughly 2.4 times faster than the comparably priced Intel server
(c4.2xlarge), which has far less RAM.<br>
<br>
On this particular highly parallel task, of course.<br>
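<br>
The ratios quoted above can be re-derived from the timing table. A rough sketch in shell, assuming awk is installed; to_seconds and ratio are hypothetical helper names made up for this illustration:<br>
<br>
```shell
# Convert a timing such as "08m47s" from the table into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ print $1 * 60 + $2 }'
}

# Print a / b rounded to two decimal places.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

c4_8xl=$(to_seconds 05m31s)   # AWS c4.8xlarge
a64=$(to_seconds 08m47s)      # packet.net A64
c4_2xl=$(to_seconds 21m30s)   # AWS c4.2xlarge

# Speed is the inverse of build time, so divide the other machine's time by the A64's.
ratio "$c4_8xl" "$a64"   # A64 speed relative to c4.8xlarge
ratio 0.50 1.68          # A64 hourly price relative to c4.8xlarge
ratio "$c4_2xl" "$a64"   # A64 speed relative to c4.2xlarge
```
<br>
With the timings in the table these come out at 0.63, 0.30 and 2.45.<br>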
<br>
If you actually need to run AArch64 code, then this is definitely
massively better than QEMU. (So is a Raspberry Pi or Odroid, except that
they have too little RAM to reasonably build LLVM.)<br>
<br>
I ran the following commands on fresh Ubuntu 16.04 installs:<br>
<br>
sudo apt-get update<br>
sudo apt-get -y install g++ cmake make bzip2 gzip zip subversion<br>
svn co <a href="http://llvm.org/svn/llvm-project/llvm/trunk">http://llvm.org/svn/llvm-project/llvm/trunk</a> llvm<br>
pushd llvm/tools<br>
svn co <a href="http://llvm.org/svn/llvm-project/cfe/trunk">http://llvm.org/svn/llvm-project/cfe/trunk</a> clang<br>
popd<br>
mkdir build install<br>
cd build<br>
cmake -DCMAKE_INSTALL_PREFIX=$(readlink -f ../install) -DCMAKE_BUILD_TYPE=Release ../llvm<br>
time make -j$(grep -c ^processor /proc/cpuinfo)<br>
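<br>
The -j value in the last command comes from counting the processor lines in /proc/cpuinfo. Where GNU coreutils is installed, nproc is an alternative that does not depend on the layout of that file. A small sketch with a fallback; job_count is a hypothetical helper name, not something the build requires:<br>
<br>
```shell
# Print the number of parallel jobs to hand to make -j.
job_count() {
    if command -v nproc >/dev/null 2>&1; then
        nproc                        # CPUs available to this process
    else
        getconf _NPROCESSORS_ONLN    # CPUs currently online
    fi
}

echo "would run: make -j$(job_count)"
```
<br>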
<br>
<br>
Interestingly, my attempts to run 32-bit ARM code on the
<a href="http://packet.net">packet.net</a> server failed. Even something as simple as the following
program (assembled and tested on a Pi), which I think should need no
runtime support other than the kernel, did not work: the server tried to
run it, but it segfaulted. Could it be that this is an AArch64-*only*
CPU? I haven't been able to find anything about this on the net.<br>
<br>
.syntax unified<br>
.arch armv4<br>
<br>
.equ SYSCALL_EXIT, 1<br>
.equ SYSCALL_WRITE, 4<br>
.equ STDOUT, 1<br>
<br>
.globl _start<br>
_start:<br>
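movs r0,#STDOUT          @ r0 = file descriptor (stdout)<br>
adr r1,hello             @ r1 = address of the string<br>
movs r2,#11              @ r2 = length of the string in bytes<br>
movs r7,#SYSCALL_WRITE   @ r7 = syscall number (EABI convention)<br>
swi 0x0                  @ write(r0, r1, r2)<br>
movs r7,#SYSCALL_EXIT<br>
swi 0x0                  @ exit(r0); r0 still holds the return value of write<br>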
<br>
.align 2<br>
hello: .asciz "Hello asm!\n"<br>
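<br>
Before blaming the CPU, it may be worth confirming that the binary copied over really is a 32-bit ELF and checking what architecture the kernel reports. A minimal sketch, assuming od and tr are available; elf_class is a hypothetical helper that reads the EI_CLASS byte of the ELF header and does not validate the rest of the file:<br>
<br>
```shell
# Print 32 or 64 depending on the EI_CLASS byte (offset 4) of an ELF file.
# Assumes the argument is an ELF file; the magic number is not checked.
elf_class() {
    case "$(od -An -tu1 -j4 -N1 "$1" | tr -d ' ')" in
        1) echo 32 ;;        # ELFCLASS32
        2) echo 64 ;;        # ELFCLASS64
        *) echo unknown ;;
    esac
}

uname -m   # machine architecture reported by the running kernel
```
<br>
For example, elf_class ./hello (where ./hello is a placeholder for the linked test program) should print 32 for the program above. This only describes the file and the kernel; it does not by itself show whether the CPU implements the 32-bit instruction set.<br>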
<br>
</p><p>[1] However, the c4.8xlarge can often be obtained for $0.30 to $0.50/hour with spot pricing.</p><p><br></p></div>