[LLVMdev] BNF for IL/IR interpreter

Kenneth Adam Miller kennethadammiller at gmail.com
Mon Apr 13 19:36:42 PDT 2015


How come no one pointed me toward the LLVM Kaleidoscope project?

On Thu, Apr 9, 2015 at 3:10 PM, Kenneth Adam Miller <
kennethadammiller at gmail.com> wrote:

> I apologize, I found Yacc and Bison. I have my reading cut out for me.
>
> On Thu, Apr 9, 2015 at 2:37 PM, Kenneth Adam Miller <
> kennethadammiller at gmail.com> wrote:
>
>> This might be a very beginner question, but I'm looking for an example
>> for something that I have never done.
>>
>> Suppose that I wanted to express actions with respect to lifted semantics
>> of CPU instructions to an intermediate representation, BAP IL or LLVM IR.
>> How might I go about providing a Backus Naur Form specification and then
>> dynamically interpreting those lifted instructions by also specifying
>> actions to be done with any given IL/IR primitive? I'm looking for any
>> library that allows me to express BNF terms and actions on them.
>>
>> Like, say I convert push ebp to BAP IL (here's a JSON representation from
>> their live development branch):
>>
>> {
>>   "move": {
>>     "lvar": { "name": "t", "id": 107, "typ": { "imm": 64 } },
>>     "rexp": { "var": { "name": "RBP", "id": 30, "typ": { "imm": 64 } } }
>>   }
>> },
>> {
>>   "move": {
>>     "lvar": { "name": "RSP", "id": 32, "typ": { "imm": 64 } },
>>     "rexp": {
>>       "binop": {
>>         "op": "minus",
>>         "lexp": {
>>           "var": { "name": "RSP", "id": 32, "typ": { "imm": 64 } }
>>         },
>>         "rexp": { "inte": { "int": "MHg4OjY0" } }
>>       }
>>     }
>>   }
>> },
>> {
>>   "move": {
>>     "lvar": {
>>       "name": "mem64",
>>       "id": 58,
>>       "typ": {
>>         "mem": {
>>           "index_type": { "r64": true },
>>           "element_type": { "r8": true }
>>         }
>>       }
>>     },
>>     "rexp": {
>>       "store": {
>>         "memory": {
>>           "var": {
>>             "name": "mem64",
>>             "id": 58,
>>             "typ": {
>>               "mem": {
>>                 "index_type": { "r64": true },
>>                 "element_type": { "r8": true }
>>               }
>>             }
>>           }
>>         },
>>         "address": {
>>           "var": { "name": "RSP", "id": 32, "typ": { "imm": 64 } }
>>         },
>>         "value": {
>>           "var": { "name": "t", "id": 107, "typ": { "imm": 64 } }
>>         },
>>         "endian": "little_endian",
>>         "size": { "r64": true }
>>       }
>>     }
>>   }
>> }
>>
>> Then for, say, move, I could specify in my interpreter some reasonable
>> action that captures those semantics, like allocating a 64-bit space in
>> which to store the value, and also creating an SSA version of the RSP
>> variable's value at that point. In this way, I could specify other things
>> too, such as symbolic interpretation of specific memory regions, for
>> example to solve for certain constraints and limitations on code blocks.
>> Then, after some segments of code are lifted and interpreted, I would
>> provide some meaningful context in terms of state, registers, and memory,
>> and the resulting representation could be executed in order to reach
>> interesting path and state combinations.
>>
>> But I've never written a language before... I'm afraid I'm new. But I'm
>> very interested, and I want to learn so I'm looking to use infrastructure
>> that's already there, and learn how to construct this properly.
>>
>
>
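For concreteness, here is a minimal sketch of the "action per IL primitive" idea applied to the three quoted statements. Because the IL arrives as JSON, json.loads already does the parsing, so a yacc/Bison grammar would probably only be needed for a textual IL syntax. Everything below is an assumption for illustration, not BAP's API: the names decode_int, eval_exp, exec_stmt, and run are hypothetical, "rN" is read as an N-bit size, arithmetic is masked to 64 bits, and memory is modelled as a plain dict from address to byte. The "int" field looks like base64, and "MHg4OjY0" decodes to the text 0x8:64, which the sketch reads as value:width.

```python
import base64
import json

MASK64 = (1 << 64) - 1  # assumption: every register in this example is 64 bits wide


def decode_int(b64):
    """Decode a base64 'value:width' literal, e.g. 'MHg4OjY0' -> (8, 64)."""
    text = base64.b64decode(b64).decode("ascii")  # "0x8:64"
    value, width = text.split(":")
    return int(value, 0), int(width)


def eval_exp(exp, state):
    """Evaluate one expression node; each node is a dict with a single key."""
    (kind, body), = exp.items()
    if kind == "var":
        return state[body["name"]]
    if kind == "inte":
        value, width = decode_int(body["int"])
        return value & ((1 << width) - 1)
    if kind == "binop":
        lhs = eval_exp(body["lexp"], state)
        rhs = eval_exp(body["rexp"], state)
        if body["op"] == "minus":
            return (lhs - rhs) & MASK64
        raise NotImplementedError(body["op"])
    if kind == "store":
        mem = dict(eval_exp(body["memory"], state))  # copy: a store yields a new memory
        addr = eval_exp(body["address"], state)
        value = eval_exp(body["value"], state)
        size_key = next(iter(body["size"]))          # "r64" (assumed to mean 64 bits)
        nbytes = int(size_key[1:]) // 8
        order = "little" if body["endian"] == "little_endian" else "big"
        for i, byte in enumerate(value.to_bytes(nbytes, order)):
            mem[(addr + i) & MASK64] = byte
        return mem
    raise NotImplementedError(kind)


def exec_stmt(stmt, state):
    """Action for a statement; only 'move' appears in the quoted example."""
    (kind, body), = stmt.items()
    if kind != "move":
        raise NotImplementedError(kind)
    state[body["lvar"]["name"]] = eval_exp(body["rexp"], state)


def run(program, state):
    for stmt in program:
        exec_stmt(stmt, state)
    return state


# The three quoted statements; some "typ" fields are omitted for brevity
# because this sketch ignores them.
PUSH_RBP = json.loads("""
[
 {"move": {"lvar": {"name": "t", "id": 107, "typ": {"imm": 64}},
           "rexp": {"var": {"name": "RBP", "id": 30, "typ": {"imm": 64}}}}},
 {"move": {"lvar": {"name": "RSP", "id": 32, "typ": {"imm": 64}},
           "rexp": {"binop": {"op": "minus",
                    "lexp": {"var": {"name": "RSP", "id": 32}},
                    "rexp": {"inte": {"int": "MHg4OjY0"}}}}}},
 {"move": {"lvar": {"name": "mem64", "id": 58},
           "rexp": {"store": {
               "memory": {"var": {"name": "mem64", "id": 58}},
               "address": {"var": {"name": "RSP", "id": 32}},
               "value": {"var": {"name": "t", "id": 107}},
               "endian": "little_endian",
               "size": {"r64": true}}}}}
]
""")

# Made-up initial register values, chosen only to exercise the code.
state = run(PUSH_RBP, {"RBP": 0x7FFF0010, "RSP": 0x7FFF0000, "mem64": {}})
print(hex(state["RSP"]), sorted(state["mem64"].items()))
```

Starting from RSP = 0x7FFF0000, the second statement leaves RSP at 0x7FFEFFF8, and the store writes the eight bytes of t there in little-endian order. Swapping the concrete integers for symbolic terms in eval_exp would be one way to move toward the symbolic interpretation described above, though that is beyond this sketch.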