[LLVMdev] BNF for IL/IR interpreter
Kenneth Adam Miller
kennethadammiller at gmail.com
Mon Apr 13 19:36:42 PDT 2015
How come no one pointed me toward the LLVM Kaleidoscope tutorial?
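
For anyone following the thread: Kaleidoscope hand-writes a recursive descent parser instead of generating one from a BNF file. Each grammar rule becomes a function, and the "action" for a rule is whatever that function does when it matches. Below is a minimal sketch of that idea in Python. The two-rule grammar and every name in it (tokenize, parse_term, parse_expr, evaluate) are made up for illustration and come from neither LLVM nor BAP.

```python
import re

# Hypothetical toy grammar in BNF-like notation:
#   expr ::= term (("+" | "-") term)*
#   term ::= NUMBER | IDENT

TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|([+-]))")


def tokenize(text):
    """Split the input into (kind, value) pairs."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        number, ident, op = match.groups()
        if number is not None:
            tokens.append(("num", int(number)))
        elif ident is not None:
            tokens.append(("id", ident))
        else:
            tokens.append(("op", op))
        pos = match.end()
    return tokens


def parse_term(tokens, i, env):
    """term ::= NUMBER | IDENT. Action: produce the term's value."""
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    kind, value = tokens[i]
    if kind == "num":
        return value, i + 1
    if kind == "id":
        return env[value], i + 1  # look the name up in the environment
    raise SyntaxError(f"expected a term, got {value!r}")


def parse_expr(tokens, i, env):
    """expr ::= term (("+" | "-") term)*. Action: fold left to right."""
    acc, i = parse_term(tokens, i, env)
    while i < len(tokens) and tokens[i][0] == "op":
        op = tokens[i][1]
        rhs, i = parse_term(tokens, i + 1, env)
        acc = acc + rhs if op == "+" else acc - rhs
    return acc, i


def evaluate(text, env):
    """Parse and evaluate a whole expression string."""
    tokens = tokenize(text)
    value, i = parse_expr(tokens, 0, env)
    if i != len(tokens):
        raise SyntaxError("trailing input")
    return value
```

Calling evaluate("RSP - 8", {"RSP": 4096}) walks both rules and runs the subtraction action. Yacc and Bison generate the same kind of rule-plus-action pairing from a grammar file instead of from hand-written functions.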
On Thu, Apr 9, 2015 at 3:10 PM, Kenneth Adam Miller <
kennethadammiller at gmail.com> wrote:
> I apologize, I found Yacc and Bison. I have my reading cut out for me.
>
> On Thu, Apr 9, 2015 at 2:37 PM, Kenneth Adam Miller <
> kennethadammiller at gmail.com> wrote:
>
>> This might be a very beginner question, but I'm looking for an example
>> of something that I have never done.
>>
>> Suppose that I wanted to attach actions to the semantics of CPU
>> instructions that have been lifted to an intermediate representation,
>> such as BAP IL or LLVM IR. How might I go about providing a Backus-Naur
>> Form specification and then dynamically interpreting those lifted
>> instructions by also specifying actions to be performed for any given
>> IL/IR primitive? I'm looking for any library that allows me to express
>> BNF terms and actions on them.
>>
>> Like, say I convert push ebp to BAP IL (here's a JSON representation from
>> their live development branch):
>>
>> {
>>   "move": {
>>     "lvar": { "name": "t", "id": 107, "typ": { "imm": 64 } },
>>     "rexp": { "var": { "name": "RBP", "id": 30, "typ": { "imm": 64 } } }
>>   }
>> },
>> {
>>   "move": {
>>     "lvar": { "name": "RSP", "id": 32, "typ": { "imm": 64 } },
>>     "rexp": {
>>       "binop": {
>>         "op": "minus",
>>         "lexp": {
>>           "var": { "name": "RSP", "id": 32, "typ": { "imm": 64 } }
>>         },
>>         "rexp": { "inte": { "int": "MHg4OjY0" } }
>>       }
>>     }
>>   }
>> },
>> {
>>   "move": {
>>     "lvar": {
>>       "name": "mem64",
>>       "id": 58,
>>       "typ": {
>>         "mem": {
>>           "index_type": { "r64": true },
>>           "element_type": { "r8": true }
>>         }
>>       }
>>     },
>>     "rexp": {
>>       "store": {
>>         "memory": {
>>           "var": {
>>             "name": "mem64",
>>             "id": 58,
>>             "typ": {
>>               "mem": {
>>                 "index_type": { "r64": true },
>>                 "element_type": { "r8": true }
>>               }
>>             }
>>           }
>>         },
>>         "address": {
>>           "var": { "name": "RSP", "id": 32, "typ": { "imm": 64 } }
>>         },
>>         "value": {
>>           "var": { "name": "t", "id": 107, "typ": { "imm": 64 } }
>>         },
>>         "endian": "little_endian",
>>         "size": { "r64": true }
>>       }
>>     }
>>   }
>> }
>>
>> Then for, say, move, I could specify in my interpreter some reasonable
>> action that captures those semantics, such as allocating a 64-bit space in
>> which to store the value, and also an SSA name for the value of RSP at
>> that point. In this way, I could specify other things as well, such as
>> symbolic interpretation of specific memory regions, in order to solve for
>> constraints and limitations on code blocks. Then, after some segments of
>> code are lifted and interpreted, I would provide some meaningful context
>> in terms of state, registers and memory, and the resulting representation
>> could be executed in order to reach interesting path and state
>> combinations.
>>
>> But I've never written a language before... I'm afraid I'm new. But I'm
>> very interested and I want to learn, so I'm looking to use infrastructure
>> that's already there and learn how to construct this properly.
>>
>
>
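
Since the lifted IL already arrives as a JSON tree, one way to get "an action per IL primitive" without any grammar tool is to walk the tree and dispatch on each node's single key. The sketch below does that for only the node kinds that appear in the quoted push example (move, var, inte, binop, store), using concrete integers, not symbolic values. It rests on several assumptions: that "MHg4OjY0" is a base64 string of the form value:width (it decodes to "0x8:64"), that a size such as "r64" means 64 bits, and that all arithmetic is 64 bits wide. The function names and the trimmed-down statements (the id and typ fields are dropped) are hypothetical and are not BAP's API.

```python
import base64

MASK64 = (1 << 64) - 1  # assumption: every register here is 64 bits wide


def decode_int(node):
    """Decode an "inte" literal; the value:width layout is an assumption."""
    text = base64.b64decode(node["int"]).decode("ascii")  # e.g. "0x8:64"
    value, width = text.split(":")
    return int(value, 0) & ((1 << int(width)) - 1)


def eval_exp(exp, env):
    """Evaluate one expression node; each node is a dict with a single key."""
    (kind, body), = exp.items()
    if kind == "var":
        return env[body["name"]]
    if kind == "inte":
        return decode_int(body)
    if kind == "binop":
        lhs = eval_exp(body["lexp"], env)
        rhs = eval_exp(body["rexp"], env)
        ops = {"minus": lambda a, b: a - b, "plus": lambda a, b: a + b}
        return ops[body["op"]](lhs, rhs) & MASK64
    if kind == "store":
        # A store yields a new memory value and leaves the old one untouched.
        memory = dict(eval_exp(body["memory"], env))
        address = eval_exp(body["address"], env)
        value = eval_exp(body["value"], env)
        (size_name, _), = body["size"].items()  # e.g. "r64"
        nbytes = int(size_name[1:]) // 8
        order = "little" if body["endian"] == "little_endian" else "big"
        for offset, byte in enumerate(value.to_bytes(nbytes, order)):
            memory[(address + offset) & MASK64] = byte
        return memory
    raise NotImplementedError(kind)


def exec_stmt(stmt, env):
    """Action for a statement node; only "move" is handled in this sketch."""
    (kind, body), = stmt.items()
    if kind != "move":
        raise NotImplementedError(kind)
    env[body["lvar"]["name"]] = eval_exp(body["rexp"], env)


def run(program, env):
    for stmt in program:
        exec_stmt(stmt, env)
    return env


def var(name):
    """Shorthand for a variable reference with id and typ omitted."""
    return {"var": {"name": name}}


# The three quoted statements, trimmed to the fields the sketch reads.
PUSH = [
    {"move": {"lvar": {"name": "t"}, "rexp": var("RBP")}},
    {"move": {"lvar": {"name": "RSP"}, "rexp": {"binop": {
        "op": "minus", "lexp": var("RSP"),
        "rexp": {"inte": {"int": "MHg4OjY0"}}}}}},
    {"move": {"lvar": {"name": "mem64"}, "rexp": {"store": {
        "memory": var("mem64"), "address": var("RSP"), "value": var("t"),
        "endian": "little_endian", "size": {"r64": True}}}}},
]

state = run(PUSH, {"RBP": 0x1122334455667788, "RSP": 0x1000, "mem64": {}})
```

After the run, state["RSP"] is 0xFF8 and the eight bytes of RBP sit in state["mem64"] starting at that address, lowest byte first. A symbolic version would keep the same dispatch structure and return solver terms from eval_exp instead of Python integers; that part is not shown here.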