[LLVMdev] Google SOC - Fortran Front-End Application

Wed Mar 21 18:35:40 PDT 2007

Hi All,

Thank you for all the excellent pieces of advice I got in response to
the draft application I sent out.

I have incorporated all (I think) of the suggestions into the
application and it's much improved. Here is the updated version.
Please don't force yourself to read through it again if you don't want
to. I'll submit this version on the 23rd, incorporating any further
suggestions I receive.

Thank you all,
Scott

ABSTRACT
----------------

The purpose of this project is to develop a Fortran front-end to the
Low Level Virtual Machine (LLVM) compiler infrastructure
[http://llvm.org]. LLVM is a mature collection of tools that provides
a powerful resource for language developers and end-user programmers.
LLVM consists of roughly three components. The first is a front-end to
a language (such as C, C++, or virtually any other) that parses the
language and converts it into LLVM's Intermediate Representation (IR).
The LLVM IR is a language- and target-independent representation. The
second component of LLVM is a collection of powerful optimization
routines that operate on the IR. The final module of LLVM is a backend
that can compile optimized code to a number of platforms including x86
and PowerPC, can emit an optimized C representation of the original
program, or can use a Just-In-Time compiler to run the program on a
variety of platforms.
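
As a rough illustration of the three-stage structure described above,
here is a toy sketch in Python. It is not LLVM's API or IR: the
function names (front_end, optimize, back_end) and the tuple-based
stack IR are invented for this example, and Python's own expression
syntax stands in for a real source language such as Fortran.

```python
# Toy three-stage pipeline: front-end -> optimizer -> back-end.
# Illustrative only; none of these names or formats come from LLVM.
import ast


def front_end(source):
    """Parse an arithmetic expression into a tiny stack-based IR."""
    def lower(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return [("push", node.value)]
        if isinstance(node, ast.Name):
            return [("load", node.id)]
        if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            op = "add" if isinstance(node.op, ast.Add) else "mul"
            # Operands first, then the operator (postfix order).
            return lower(node.left) + lower(node.right) + [(op, None)]
        raise ValueError("unsupported construct")

    return lower(ast.parse(source, mode="eval").body)


def optimize(ir):
    """Constant folding: 'push a, push b, op' becomes 'push result'."""
    out = []
    for instr in ir:
        foldable = (
            instr[0] in ("add", "mul")
            and len(out) >= 2
            and out[-1][0] == "push"
            and out[-2][0] == "push"
        )
        if foldable:
            b = out.pop()[1]
            a = out.pop()[1]
            out.append(("push", a + b if instr[0] == "add" else a * b))
        else:
            out.append(instr)
    return out


def back_end(ir, env):
    """Run the IR on a simple stack machine; env maps names to values."""
    stack = []
    for op, arg in ir:
        if op == "push":
            stack.append(arg)
        elif op == "load":
            stack.append(env[arg])
        else:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b if op == "add" else a * b)
    return stack[0]


if __name__ == "__main__":
    ir = optimize(front_end("x * (2 + 3)"))
    print(ir)                      # the folded IR
    print(back_end(ir, {"x": 4}))  # prints 20
```

The point of the split is that the optimizer and back-end only ever
see the IR, so a new front-end (here, a different lowering function)
can reuse both unchanged.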

To claim Fortran is mature is an understatement. In use for over 50
years, Fortran is utilized in a wide variety of legacy code bases.
While younger and flashier languages get all the press, Fortran still
enjoys wide usage in many scientific fields and other
businesses—especially when legacy code is involved. Additionally, the
SPEC CPU2000 and CPU2006 benchmark suites, both of which contain
Fortran components, are widely used as performance evaluation tools in
both the research community and industry.

The implementation of a Fortran front-end would benefit both the
Fortran user base and the LLVM project. The scientific uses of
Fortran, along with its other applications, often involve heavy
simulations and calculations that are very resource demanding and
could greatly benefit from LLVM's powerful optimization mechanisms.
The LLVM project will, of course, benefit from having another
front-end language and the resulting larger "market" available that
can utilize LLVM. Additionally, the LLVM optimization team will have
another case study to explore the effectiveness of its optimization
routines. In particular, mixed-language optimization routines
involving Fortran and one or more other languages may be explored and
implemented.

Deliverables for this project are:
* Fortran front-end to LLVM
* Documentation and tools for using the front-end
* A suite of tests to ensure the functionality of the front-end

DETAILED DESCRIPTION
----------------------------

The reader should refer to the abstract for a description of the goals
of this project and their justifications. This section is devoted to
the applicant's experience, interests, qualifications, and plan for
completing the project.

* Personal Background

I am an Environmental Engineering and Economics double major at
Swarthmore College class of 2008. My programming background is
self-taught, and I have only had the available time to take a couple
of formal Computer Science courses in my career. Most of my
programming education has been driven by my passion for figuring
things out and my willingness to hit my head against a problem until
it is solved. My learning has been aided by the books of excellent Bay
Area libraries, the kind support of community members of various
projects, and the vast trove of resources available on the internet.

My programming ideology is one of problem solving: I encounter a
problem in my life and solve it using whatever tools or resources are
needed to do so. To this end, I have dabbled in a wide range of fields
from databases, to statistical analysis, to GUI applications, to web
applications, and more. Examples of this work are an anti-censorship,
in-browser web-browser [Palary Browser -- http://palary.org] and a
unique calculator program for OS X [Longhand --
http://longhand.palary.com]. I have also worked for a number of
clients developing GUI applications, data analysis applications, and
web applications.

I became interested in the area of language development as a result of
a desire for better tools to deal with the environmental modeling and
economic modeling issues that I came into contact with in my
studies. It seemed to me as if these areas could benefit greatly from
domain-specific languages that were tailored to their specific needs
(such as built-in units in the environmental modeling case). I am
currently working toward developing such a language for my Senior
Thesis here at Swarthmore.

* Motivation

To gain the background to carry out the complicated task of developing
a domain-specific modeling language, I have since enrolled in an
upper-level compiler course here at Swarthmore. Additionally, I have
actively immersed myself in the field. In this immersion I came across
the LLVM project and I believe that its IR would be an excellent
target for the language I eventually create.  LLVM is a complex tool
and I wish to gain much more familiarity with it so as to best utilize
its strengths and capabilities.

My primary motivation for working on a Fortran front-end for LLVM is
gaining experience and background in language development. I want to
learn from this experience, and would like to contribute to both the
LLVM and Fortran communities whilst doing so.

* The Plan

I will be able to spend roughly three months this summer working on
the front-end. I am not as familiar with the technologies involved as
I would like to be, so my planning is necessarily imprecise. My rough
plan proceeds as follows:

=====
- 2 weeks – Gain increased familiarity with the technologies: LLVM,
Fortran, and GCC's Fortran implementation. Do not engage in any direct
work on the project but acquire additional experience with the tools.
Ascertain what previous work has been accomplished towards developing
a Fortran front-end.

- 4 weeks – Build the Fortran front-end, documenting it as
development proceeds.

- 2 weeks – Smooth out rough edges, write unit tests, etc.

- 1 week – Polish documentation, comments, and supporting documents.
Ensure that the front-end can be maintained and worked on by other
people.

- 3 weeks – The three remaining weeks will be used to account for any
unforeseen occurrences or inaccuracies in this plan.
=====

I plan on first attempting to implement the Fortran front-end by
co-opting the GCC Fortran parser. I will most likely utilize the GCC
4.2 Fortran front-end as earlier versions of the GCC Fortran front-end
are less than ideal. Success in this approach has been demonstrated in
other projects such as with the Ada LLVM front-end implemented by
Duncan Sands. If this approach is successful, the version of Fortran
that will be targeted is Fortran 95.

If that fails, I will build a front-end using ANTLR [http://antlr.org],
a parser generator with which I am familiar and for which a Fortran
grammar is already available (targeting an obsolete version of ANTLR,
but it should not be too difficult to update). Completing the
front-end using ANTLR is a much larger task than using GCC. A
front-end supporting a subset of a Fortran version will definitely be
achievable, but it is unlikely that the entirety of the Fortran
version will be completed in the available time. Fortunately, it is
unlikely that I will have to resort to this approach.

* In Short

I'm psyched :)

I am thrilled to start working on this project, and feel that I will
be able to complete it successfully and on time. While working on this
project, I hope to learn a great deal.

* Acknowledgments

I would like to thank Jeff Cohen, Kenneth Hoste, Chris Lattner, Scott
Murray, Duncan Sands, and Bill Wendling for their advice and
suggestions as to a fruitful course for this project.