[llvm-dev] [GSoC16] Seeking Guidance for a project regarding SAFECode

Fri Mar 18 23:55:24 PDT 2016

Sorry, forgot to add the mailing list....

Hi,
Thanks for the detailed response.

>
> The most useful project for SAFECode right now is to update its code to
> work with either LLVM 3.7 or LLVM 3.8.  I had a student work on this last
> summer (code is at https://github.com/jtcriswell/safecode-llvm37), but it
> needs to be completed and tested.  On my end, I'm interested in getting
> SAFECode dusted off because I'd like to use it for research projects that
> need to attach metadata to memory objects.
>
> Until SAFECode is updated to a newer version of LLVM, its utility is
> pretty limited, and any projects to enhance it will basically require that
> it be updated to a newer version of LLVM.
>

I would definitely love to work on this. I have been using LLVM for the
past year (at least up to the last version). I couldn't get the code at the
link above to compile and run completely, but I would like to work on
completing it.

>
> Static Array Bounds Checking requires that you understand static
> analysis.  Multiple static analysis methods are applicable: range analysis,
> integer linear programming, SMT solvers, etc.  For a successful proposal
> for static array bounds checking, you should know which algorithm you will
> implement and be able to explain why you think it will work well.  For
> SAFECode, the algorithm must be sound with respect to two's complement
> arithmetic (i.e., the algorithm must take into account that integers in C
> can experience underflow or overflow when used in arithmetic).
>
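
To make the soundness requirement concrete, here is a minimal sketch of a
range (interval) analysis step for `index + constant` on 32-bit integers. It
is only an illustration, not SAFECode's algorithm: the helper names
(`wrap32`, `add_range_naive`, `add_range_sound`, `provably_in_bounds`) are
hypothetical, and the "sound" version simply gives up and returns the full
range whenever the addition could wrap.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def wrap32(x):
    """Reduce an unbounded integer to a signed 32-bit two's complement value."""
    return (x + 2**31) % 2**32 - 2**31

def add_range_naive(lo, hi, c):
    """Interval for (i + c) computed over mathematical integers (ignores wraparound)."""
    return (lo + c, hi + c)

def add_range_sound(lo, hi, c):
    """Interval for (i + c) that stays correct if the 32-bit addition may wrap.

    Assumption for this sketch: on any possible wrap we return the full
    range, which is imprecise but still an over-approximation.
    """
    new_lo, new_hi = lo + c, hi + c
    if new_lo < INT_MIN or new_hi > INT_MAX:
        return (INT_MIN, INT_MAX)
    return (new_lo, new_hi)

def provably_in_bounds(index_range, array_len):
    """A bounds check can be removed only if every possible index is valid."""
    lo, hi = index_range
    return 0 <= lo and hi < array_len

# i is known to lie in [INT_MAX - 1, INT_MAX]; the program computes i + 2.
actual = wrap32(INT_MAX + 2)  # the value the hardware would produce
naive = add_range_naive(INT_MAX - 1, INT_MAX, 2)
sound = add_range_sound(INT_MAX - 1, INT_MAX, 2)
```

Here the naive interval lies entirely above `INT_MAX`, so it would suggest
the index can never be negative, while the wrapped value is in fact negative.
The conservative interval contains the wrapped value, at the cost of proving
nothing about this particular access.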

I have experience with integer linear programming using Octave and MATLAB,
and I have also written some very basic solvers of my own. Would that be
enough for this project, or should I focus on updating SAFECode to a newer
version of LLVM?

> The CompleteChecks pass currently uses the DSA points-to analysis (which is
> large and complicated).  There are simpler analyses that one could do to
> determine whether a memory object is read or written by external code.  For
> example, a simple intra-procedural analysis could determine if a memory
> object is allocated and only used by the current function, and a simple
> inter-procedural analysis could create a very simple heap abstraction and
> perform data-flow analysis on the pointers contained within heap objects to
> determine if they are influenced by external library code.  Basically,
> there are some simple quick analyses that would be imprecise but could
> probably find memory objects that are not manipulated by external code.
>
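
As a rough sketch of the simple intra-procedural analysis described above,
the following finds allocations whose pointers never leave the current
function. It assumes a made-up toy instruction format, `(opcode, dest,
operands)`, listed in definition-before-use order; it does not use the LLVM
API, and the opcode names and the function `local_only_objects` are
hypothetical. Anything other than a load, a store to a tracked address, or a
pointer cast/offset is conservatively treated as letting the pointer escape.

```python
def local_only_objects(instructions):
    """Return the names of allocations whose pointers never leave the function.

    Each instruction is a tuple (opcode, dest, operands). A "store" has
    operands [value, address]; a "load" has operands [address].
    """
    derived_from = {}  # value name -> allocation it was derived from
    escaped = set()    # allocations that may be visible to other code

    for op, dest, operands in instructions:
        if op == "alloc":
            derived_from[dest] = dest
        elif op in ("gep", "cast"):
            # Pointer arithmetic and casts still refer to the same object.
            src = operands[0]
            if src in derived_from:
                derived_from[dest] = derived_from[src]
        elif op == "store":
            # Writing a tracked pointer into memory lets other code find it.
            value = operands[0]
            if value in derived_from:
                escaped.add(derived_from[value])
        elif op == "load":
            pass  # reading through a tracked pointer is a purely local use
        else:
            # Calls, returns, phis, and unknown opcodes: assume the worst.
            for v in operands:
                if v in derived_from:
                    escaped.add(derived_from[v])

    allocations = {dest for op, dest, _ in instructions if op == "alloc"}
    return allocations - escaped

example = [
    ("alloc", "a", []),
    ("alloc", "b", []),
    ("gep", "a1", ["a"]),
    ("store", None, ["x", "a1"]),  # stores a non-pointer value into a
    ("load", "y", ["a1"]),
    ("cast", "b1", ["b"]),
    ("call", "r", ["b1"]),         # b is handed to another function
    ("ret", None, ["y"]),
]
local = local_only_objects(example)
```

In this example only `a` is reported as local, because `b` is passed to a
call. The result is imprecise by design: it may miss objects that are in
fact private, but it should never report an escaping object as local under
the stated assumptions.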

This seems like a fairly daunting task. I love challenges, but since the
proposal deadline is very close, I don't think I can read and understand
enough about this aspect of code analysis to write a decent proposal.

> I think the only proficiency test is whether you can show in your proposal
> that you have the necessary programming skills and background information
> to be able to do what you propose.  In both of these projects, if you're
> not familiar with static analysis (e.g., Kam/Ullman data-flow analysis),
> then you're likely not ready for these projects.  For the two projects you
> mentioned, I would also expect existing familiarity with LLVM.
>
> Again, though, the best project is probably to update SAFECode to a modern
> version of LLVM.
>

Thanks. I am definitely inclined to work on updating SAFECode for now. I
will submit the proposal to Google within a couple of days. I hope you can
give me some pointers on my proposal before the deadline so that I can
improve it.
I am already familiar with LLVM and Clang, as I used them in a project to
marshal code from C/C++ to C# and C++/CLI.

Regards,
Abhinav

>
> Regards,
>
> John Criswell
>

>> John Criswell
>> Assistant Professor
>> Department of Computer Science, University of Rochester
>> http://www.cs.rochester.edu/u/criswell
>>
>>
>