<div dir="ltr"><div><div><div><div>The toy language I've been playing around with represents every string as a struct in LLVM:<br></div><br>struct string {<br></div> char *ptr;<br></div> int str_len;<br></div> int buffer_len;<br>
<div><div><div><div>};<br><br></div><div>And my AST has an interface like this:<br><br></div><div>String_AST {<br></div> int measure();<br><div> void copy(char *dest);<br><div> struct string get_value();<br>}<br><br></div><div>
A constant string can be measured at compile time; for a string variable, measure() just extracts str_len. Strings passed in from external sources are measured immediately, but LLVM optimisations will eliminate the call if the return value isn't used.<br>
<br>The implementation of get_value() for a concatenation AST node can generate code to evaluate each substring, measure them all, allocate a single buffer of the total length, and only then copy each substring directly into that final buffer.<br>
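<br>To make the measure / allocate / copy order concrete, here is a minimal sketch in plain C of what the generated code for a concatenation might do at run time. It is not the actual code generator: string_from_cstr and string_concat are hypothetical helper names, and calling malloc and aborting on failure are assumptions made for illustration.<br>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Field names follow the struct described in the post. */
struct string {
    char *ptr;
    int str_len;
    int buffer_len;
};

/* Hypothetical helper: wrap an external C string, measuring it immediately. */
struct string string_from_cstr(const char *s) {
    struct string r;
    assert(s != NULL);
    r.str_len = (int)strlen(s);
    r.buffer_len = r.str_len;
    /* Allocate at least one byte so ptr is never NULL (an assumption). */
    r.ptr = malloc(r.buffer_len > 0 ? (size_t)r.buffer_len : 1);
    if (r.ptr == NULL) abort();
    memcpy(r.ptr, s, (size_t)r.str_len);
    return r;
}

/* Hypothetical helper: measure every part, allocate once, then copy each
 * part straight into its final position. No NUL terminator is stored. */
struct string string_concat(const struct string *parts, int count) {
    struct string r;
    int total = 0;
    int offset = 0;
    assert(parts != NULL && count >= 0);

    /* Pass 1: measure. */
    for (int i = 0; i < count; i++) {
        total += parts[i].str_len;
    }

    /* Single allocation of the final length. */
    r.ptr = malloc(total > 0 ? (size_t)total : 1);
    if (r.ptr == NULL) abort();
    r.str_len = total;
    r.buffer_len = total;

    /* Pass 2: copy each part directly into the final buffer. */
    for (int i = 0; i < count; i++) {
        memcpy(r.ptr + offset, parts[i].ptr, (size_t)parts[i].str_len);
        offset += parts[i].str_len;
    }
    return r;
}
```

Because every length is known before the allocation, no intermediate buffers are needed, however many substrings are joined.<br>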
<br>I also support a string append operation that reallocates the buffer only if the existing one is too small.<br></div><br></div><div>Ultimately you will need to work out whether you want Pascal/Java-style strings with an explicit length, like mine, or C-style NUL-terminated strings, and how the memory for these strings will be managed.<br>
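<br>The append operation mentioned above could look roughly like the following plain C sketch. The name string_append, the doubling growth policy, and the initial capacity of 16 are all assumptions made for illustration, not details of the actual implementation.<br>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Field names follow the struct described in the post. */
struct string {
    char *ptr;
    int str_len;
    int buffer_len;
};

/* Hypothetical helper: append src_len bytes to dst, reallocating only when
 * the existing buffer is too small. Assumes src does not point into dst's
 * own buffer, since realloc may move that buffer. */
void string_append(struct string *dst, const char *src, int src_len) {
    assert(dst != NULL && src != NULL && src_len >= 0);
    int needed = dst->str_len + src_len;

    if (needed > dst->buffer_len) {
        /* Growth policy is an assumption: double, starting from 16 bytes. */
        int new_cap = dst->buffer_len > 0 ? dst->buffer_len * 2 : 16;
        if (new_cap < needed) new_cap = needed;

        char *p = realloc(dst->ptr, (size_t)new_cap);
        if (p == NULL) abort();
        dst->ptr = p;
        dst->buffer_len = new_cap;
    }

    /* The buffer is now large enough, so copy in place. */
    memcpy(dst->ptr + dst->str_len, src, (size_t)src_len);
    dst->str_len = needed;
}
```

Keeping buffer_len separate from str_len is what makes this cheap: repeated appends usually find spare capacity and skip the reallocation.<br>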
</div></div></div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Sun, Oct 13, 2013 at 4:43 AM, William Moses <span dir="ltr">&lt;<a href="mailto:moses.williamsteven@gmail.com" target="_blank">moses.williamsteven@gmail.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">All,<br><br>I am building my own language with llvm as the base.<br><br>I was working on string concatenation (where a string is just an array of characters cast to a pointer to a character (i8*) ). Given two strings, it is possible to determine the length of new string by summing the number of characters until the null terminator and adding one.<br>
<br>Unfortunately, I have no idea how to use the c-api to store this. As the length of the new string is not a compile-time constant (e.g. stored in a Value*), I cannot determine at compile-time what length the llvm array-type will be? Therefore, I cannot create the GlobalVariable since I do not know the type.<br>
<br>One possible solution I thought of was linking to the malloc function and calling that, but I'm sure there's a better way. If any of you have implemented a similar sort of string concatenation, I would much appreciate any advice that you could give.<br>
<br>Thanks,<br>Billy</div>
<br></blockquote></div><br></div>