[PATCH] MIR Serialization: Serialize machine instruction names.

Duncan P. N. Exon Smith dexonsmith at apple.com
Wed Jun 17 17:51:11 PDT 2015


> On 2015 Jun 17, at 16:45, Ahmed Bougacha <ahmed.bougacha at gmail.com> wrote:
> 
> REPOSITORY
>  rL LLVM
> 
> ================
> Comment at: lib/CodeGen/MIRPrinter.cpp:115
> @@ -86,1 +114,3 @@
> +
> +  OS << StringRef(TII->getName(MI.getOpcode())).lower();
> }
> ----------------
> Why lower-case?

I prefer lowercase because the characters are designed to be easier for
humans to read than uppercase (and I'm human).

> I'd much prefer the exact same as the C++/MI enums/names.

LLVM IR has different capitalization in textual format vs. C++ classes.  I
think it works well.

> And if no one else does this lowercasing already (I don't think so), this can introduce ambiguity (it will).

I was somewhat skeptical this was a valid concern -- that any ambiguity is
a bug -- and I set out to prove it.  I was thinking we could just add some
rules to enforce it, and be done with it.

Sadly I found ST_Fp32m vs. ST_FP32m and ST_Fp64m vs. ST_FP64m in X86.
--
$ clang -x c++ -E -DGET_INSTRINFO_ENUM *GenInstrInfo.inc 2>/dev/null |
awk '$0 ~ /namespace Sched {/ {exit}' |
sed -e 's, *,,g' -e 's,=.*,,' |
grep '.' | grep -v '[{}]' |
tr '[A-Z]' '[a-z]' | sort | uniq -c | sort -n |
awk '$1 > 1 {print}'
   2 st_fp32m
   2 st_fp64m
--
(There's also ST_FpP{32,64}m, as it happens.)

Same issue in ARM:
--
   2 tstri
   2 tstrr
--

(As it happens, AArch64 is clean.  I didn't check other targets.)

I'd weakly argue it would be worth removing the ambiguity (and enforcing its
absence), but that's definitely outside the scope of serializing MIR :(.

Case-sensitive it is!



More information about the llvm-commits mailing list