[LLVMdev] Fw: Accepting iCode as input to SDCC

Paul Sokolovsky pmiscml at gmail.com
Sat May 11 06:25:06 PDT 2013


FYI for people who may be interested in using LLVM with 8-bit CPUs.


Begin forwarded message:

Date: Sat, 11 May 2013 15:29:12 +0300
From: Paul Sokolovsky
To: sdcc-devel at lists.sourceforge.net
Subject: Accepting iCode as input to SDCC


Hello,

I'm interested in making SDCC accept iCode (its own intermediate
representation) as an input format. The motivation is to take the
intermediate format from other compilers' frontends, which support more
languages and have better high-level optimizations, convert it to iCode,
and feed it to SDCC for code generation for 8-bit microcontrollers.

The target I specifically have in mind is LLVM IR and C++. I did an
initial comparison of LLVM IR and iCode, and they appear to share the
same level of concepts; the biggest difference I've seen so far is that
LLVM IR is in SSA form, while iCode is not. I'm sure there are lots of
devils in the details, but I was satisfied with a quickly put together
Python script that converts a trivial LLVM IR program to iCode, so I
proceeded to look at how to feed iCode into SDCC in the first place.
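
To give a feel for the kind of conversion meant here, below is a minimal
sketch (not the actual script) that rewrites a single LLVM IR binary
instruction into an iCode-style line. The function name
llvm_binop_to_icode, the opcode table, and the assumption that every
operand is an i16 that maps to "int fixed as=data" are all illustrative
assumptions, not something SDCC or LLVM defines.

```python
import re

# Assumed mapping from LLVM IR opcodes to iCode operators (illustrative only).
OPS = {"and": "&", "or": "|", "xor": "^", "add": "+", "sub": "-"}

# Assumed annotation for every operand; a real converter would derive it
# from the LLVM type and the symbol's storage class.
TYPE = "{int fixed as=data}"

# Matches lines such as "%t0 = and i16 %a, %b" (no flags, i16 only).
BINOP = re.compile(r"^\s*%(\w+)\s*=\s*(\w+)\s+i16\s+%(\w+),\s*%(\w+)\s*$")

def llvm_binop_to_icode(line):
    """Translate one LLVM IR binary instruction into an iCode-style line."""
    m = BINOP.match(line)
    if m is None:
        raise ValueError("unsupported instruction: " + line)
    dst, opcode, lhs, rhs = m.groups()
    if opcode not in OPS:
        raise ValueError("unsupported opcode: " + opcode)
    return "%s%s = %s%s %s %s%s" % (dst, TYPE, lhs, TYPE, OPS[opcode], rhs, TYPE)

print(llvm_binop_to_icode("%t0 = and i16 %a, %b"))
```

Straight-line code like this is the easy case: each SSA value is assigned
exactly once, so it can simply become a fresh temporary. Phi nodes at
control-flow joins have no direct non-SSA equivalent and would need to be
lowered to copies, which this sketch does not attempt.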

Well, before talking about taking the iCode format as input, the iCode
format should first be defined. So far, iCode is just an internal SDCC
data structure, with some ad hoc and verbose dumps available for that
internal structure. Actually, it's not even possible to produce iCode
for the *input* C program, because even --dumpraw starts dumping only
after some code transformations have already started.

So, my initial steps were to add an --emit-i-code switch to dump iCode
directly after its construction from the AST, and then to try to define
an "external representation" of iCode and add dumping support for it,
while leaving the existing "debugging representation" mostly intact.

This work is available in this branch:
https://github.com/pfalcon/sdcc/commits/icode-output

Based on that branch, there's another branch,
https://github.com/pfalcon/sdcc/commits/icode-input, which provides
iCode parsing support (only initial so far). After considering how I
could add support for an iCode parsing grammar, I went with the solution
described here:
http://stackoverflow.com/questions/16452737/is-it-possible-to-call-one-yacc-parser-from-another-to-parse-specific-token-subs

The icode-input branch is largely a work in progress. What I have
achieved so far is that compiling trivial C code to iCode, and then
recompiling that iCode to iCode again, produces matching output iCode
and asm. As an example,

========
int a, b, c;

int foo()
{
    return a & b;
}
========

is compiled into external iCode representation as:

========
	defvar a{int fixed as=data}
	defvar b{int fixed as=data}
	defvar c{int fixed as=data}

_entry:
	proc foo{int ( ) fixed as=code}
	iTemp0{int fixed as=data} = a{int fixed as=data} & b{int fixed as=data}
	ret iTemp0{int fixed as=data}
_return:
	eproc foo{int ( ) fixed as=code}
========
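
Each operand in the dump above has the shape name{type words key=value}.
As a rough illustration of that shape only (this is not the yacc grammar
used in the icode-input branch), a minimal Python sketch can split an
operand into its parts. The function name parse_operand and the returned
dictionary layout are assumptions made for this example.

```python
import re

# An operand is a name followed by a brace-enclosed annotation,
# e.g. "a{int fixed as=data}".
OPERAND = re.compile(r"(\w+)\{([^}]*)\}")

def parse_operand(text):
    """Split one operand into name, type words and key=value attributes."""
    m = OPERAND.fullmatch(text.strip())
    if m is None:
        raise ValueError("not an operand: " + text)
    name, body = m.groups()
    words = []   # plain type words, e.g. "int", "fixed"
    attrs = {}   # key=value pairs, e.g. as=data
    for tok in body.split():
        if "=" in tok:
            key, value = tok.split("=", 1)
            attrs[key] = value
        else:
            words.append(tok)
    return {"name": name, "type": " ".join(words), "attrs": attrs}

print(parse_operand("a{int fixed as=data}"))
print(parse_operand("foo{int ( ) fixed as=code}"))
```

Here "as" is read as a generic key=value attribute; the sketch makes no
claim about which other attributes the real format may carry.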


I intend to submit this work upstream, so I would appreciate comments on
the idea, the approach, and the early implementation available so far.
Please note that the branches above will be rebased, as I'm striving to
provide a clean patchset. (Btw, did you guys consider switching to git?)


-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com


