[PATCH] D88390: [M68k] (Patch 4/8) MC layer and object file support

Sun Dec 20 10:46:11 PST 2020

jrtc27 added inline comments.

================
Comment at: llvm/include/llvm/BinaryFormat/ELF.h:752
+enum {
+#include "ELFRelocs/m68k.def"
+};
----------------
File should be named the same as the directory in Target, ie with a capital M.

================
Comment at: llvm/include/llvm/BinaryFormat/ELFRelocs/m68k.def:5
+
+ELF_RELOC (R_68K_NONE,          0)  /* No reloc */
+ELF_RELOC (R_68K_32,            1)  /* Direct 32 bit  */
----------------
No space before (

================
Comment at: llvm/include/llvm/Object/ELFObjectFile.h:1094
+    case ELF::EM_68K:
+      return "ELF32-M680x0";
     case ELF::EM_386:
----------------
glaubitz wrote:
> jrtc27 wrote:
> > glaubitz wrote:
> > > jrtc27 wrote:
> > > > glaubitz wrote:
> > > > > jrtc27 wrote:
> > > > > > As I said on D88389:
> > > > > > 
> > > > > > This gets reported in the file format line of llvm-objdump so should match what binutils has, which is elf32-m68k, though even if that weren't the case it should at least be in keeping with the style of all the others here.
> > > > > Yeah, I agree this should definitely match with what GNU is using there.
> > > > > 
> > > > > I would still prefer the backend being called "M680x0" and therefore the patches should be prefixed with "[M680x0]", similar to "SystemZ" and "s390x".
> > > > > 
> > > > > Naming the "M680x0" instead of "M68K" improves the readability in my personal opinion as it's easier to tell when you are talking about the backend and when you're talking about the architecture and GNU triplet.
> > > > The difference there is "IBM System z9" etc are the actual product names, and it's not a really clumsy name to type like M680x0. NXP's own site categorises its M68K-derived processors as "68K Processors (Legacy)", and the manuals say things like:
> > > > 
> > > > > The MCF5407 extends the legacy of Motorola’s 68K family
> > > > 
> > > > GCC's own manage uses the M680x0 term in the following ways:
> > > > 
> > > > > These are the -m options defined for M680x0 and ColdFire processors.
> > > > 
> > > > > Generate code for a specific M680x0 or ColdFire instruction set architecture. Permissible values of arch for M680x0 architectures are: 68000, 68010, 68020, 68030, 68040, 68060 and cpu32.  ColdFire architectures are selected according to Freescale's ISA classification and the permissible values are: isaa, isaaplus, isab and isac.
> > > > 
> > > > So the name M680x0 would actually be *more* narrow than what M68K means in practice, with the latter being the general term for any M68000-derived processor and the former being only for the 68000 through 68060 processors and *not* including the ColdFire extensions, yet there's no reason why our backend can't support that just like GCC does.
> > > Meh, I think this is really a bike-shedding contest. As I said, I like the name "M680x0" because it clearly tells me we're talking about the backend and not the architecture. It makes reading the code easier in my opinion.
> > > 
> > > > GCC's own manage uses the M680x0 term in the following ways:
> > > 
> > > Which is titled with "3.19.25 M680x0 Options" ;-)
> > > 
> > > See: https://gcc.gnu.org/onlinedocs/gcc/M680x0-Options.html
> > > 
> > > 
> > > Which is titled with "3.19.25 M680x0 Options" ;-)
> > 
> > Which I think is a bug (perhaps historical, if that was the name chosen before ColdFire was added) given the language used throughout. I've been considering sending a patch to change that title though.
> Well, ok. If LLVM upstream insists on the name "M68k" for the backend, I'm not going fight it. In the end, I want the backend to succeed and I don't want something like a naming dispute to block it.
> 
> But please call it "M68k" (with a lower "k"), not "M68K" which would be incorrect as "kilo" is spelled all lower case.
This specific string still needs fixing to match binutils's name for the file format as all lowercase.

================
Comment at: llvm/lib/Target/M68k/MCTargetDesc/M68kBaseInfo.h:195
+  default:
+    llvm_unreachable("Not an Address nor Data register");
+  case M68k::WA0:
----------------
The existence of this is surprising; one would expect to be able to pass any valid register number to a function called isAddressRegister and just get back false for special registers. Do you need this assert? If so, please change the function name to reflect that it's only valid for the "general-purpose" (if they're called that on m68k given it has split A and D) registers.

================
Comment at: llvm/lib/Target/M68k/MCTargetDesc/M68kInstPrinter.cpp:14
+
+// TODO Conform with Motorola ASM syntax
+
----------------
Would be helpful to state (either here at the top or in the places where the code gets it wrong) the ways in which it doesn't conform. Also is that the Motorola ASM syntax what GAS uses (which is the most important thing to implement) or do they differ?

================
Comment at: llvm/lib/Target/M68k/MCTargetDesc/M68kInstPrinter.cpp:181
+
+void M68kInstPrinter::printPCDMem(const MCInst *MI, uint64_t Address,
+                                    int opNum, raw_ostream &O) {
----------------
myhsu wrote:
> MaskRay wrote:
> > The PCREL_OPERAND style `print*` functions are usually used this way: D77853
> Well...actually Motorola is using its own assembly language (and where this `(N,%pc)` syntax comes from). We're trying to conform with that in order to have better coordination with other existing toolchains
The objdump output is not the same as the asm input/codegen output in this specific case. What does binutils's objdump do for m68k?

================
Comment at: llvm/lib/Target/M68k/MCTargetDesc/M68kMCAsmInfo.cpp:22
+
+// TODO get back to it when it comes to printing
+M68kELFMCAsmInfo::M68kELFMCAsmInfo(const Triple &T) {
----------------
What does this mean?

================
Comment at: llvm/lib/Target/M68k/MCTargetDesc/M68kMCAsmInfo.cpp:29
+
+  // 0b0100_1110_0111_0001
+  TextAlignFillValue = 0x4e71;
----------------
Which is what? A specific instruction? A special bit pattern that's never a valid instruction?

================
Comment at: llvm/lib/Target/M68k/MCTargetDesc/M68kMCAsmInfo.cpp:39
+  // Always enable the integrated assembler by default.
+  // Clang also enabled it when the OS is Solaris but that is redundant here.
+  UseIntegratedAssembler = true;
----------------
Clang doesn't do and has never done anything for m68k. This line can go.

================
Comment at: llvm/lib/Target/M68k/MCTargetDesc/M68kMCTargetDesc.cpp:43
+
+// TODO
+static std::string ParseM68kTriple(const Triple &TT, StringRef CPU) {
----------------
RKSimon wrote:
> todo what?
I assume "implement the ParseM68kTriple function" so it's not just a stub... but yes.

================
Comment at: llvm/lib/Target/M68k/MCTargetDesc/M68kMCTargetDesc.cpp:88-92
+  //??? Do i need this?
+  // // Add return address to move list
+  // MCCFIInstruction Inst2 = MCCFIInstruction::createOffset(
+  //   nullptr, MRI.getDwarfRegNum(M68k::PC, true), stackGrowth);
+  // MAI->addInitialFrameState(Inst2);
----------------
If your hardware-provided call instruction saves the return address in memory like x86, then yes. If your hardware-provided call instruction saves the return address in a register like most other architectures, then no, and the prologue code will ensure the return address gets proper DWARF info if spilled, just like any other register. Given m68k is like x86 and jsr/rts/etc save/restore PC to/from the stack, yes, you do need this I believe.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88390/new/

https://reviews.llvm.org/D88390