<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">I also realized there were some identifiers that weren't in the implementer's namespace, which I'll go fix.</div><div class="gmail_quote"><br></div>
<div class="gmail_quote">On Tue, Apr 8, 2014 at 3:59 AM, PaX Team <span dir="ltr">&lt;<a href="mailto:pageexec@freemail.hu" target="_blank">pageexec@freemail.hu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="">On 8 Apr 2014 at 0:28, Reid Kleckner wrote:<br>
<br>
> +static __inline__ unsigned __int64 __attribute__((__always_inline__, __nodebug__))<br>
> +__readmsr(unsigned long __register) {<br>
> + // Loads the contents of a 64-bit model specific register (MSR) specified in<br>
> + // the ECX register into registers EDX:EAX. The EDX register is loaded with<br>
> + // the high-order 32 bits of the MSR and the EAX register is loaded with the<br>
> + // low-order 32 bits. If less than 64 bits are implemented in the MSR being<br>
> + // read, the values returned to EDX:EAX in unimplemented bit locations are<br>
> + // undefined.<br>
> + unsigned long __edx;<br>
> + unsigned long __eax;<br>
> + __asm__ ("rdmsr"<br>
> + : "=d"(__edx), "=a"(__eax)<br>
> + : "c"(__register)<br>
> + : "%ecx", "%edx", "%eax");<br>
> + return (((unsigned __int64)__edx) << 32) | (unsigned __int64)__eax;<br>
> +}<br>
<br>
</div>i don't think this is correct, input/output registers should not appear on<br>
the clobbered list. gcc itself doesn't accept this code and complains with:<br>
<br>
error: 'asm' operand has impossible constraints</blockquote><div><br></div><div>Yeah, that's wrong. I'll fix it. I suspected it was wrong, but it compiled fine.</div><div><br></div><div>I have *not* done execution tests of this code, because that would involve far more setup than I'm prepared to do. I've only compiled it and examined the output.</div>
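<div>For reference, a minimal sketch of the corrected constraints, not the final patch: the "=d", "=a", and "c" operands already tell the compiler which registers the instruction uses, so the clobber list is simply dropped. The names readmsr_sketch and combine_edx_eax are made up for illustration, fixed-width types stand in for the patch's unsigned long and unsigned __int64, and the __volatile__ qualifier is an added assumption (MSR contents can change between reads). rdmsr is privileged, so the function is only compiled here, never called.</div>

```c
#include <stdint.h>

/* Combine the two halves the way RDMSR returns them:
   EDX holds the high 32 bits, EAX holds the low 32 bits. */
static inline uint64_t combine_edx_eax(uint32_t edx, uint32_t eax) {
  return ((uint64_t)edx << 32) | (uint64_t)eax;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Hypothetical name, not the header's __readmsr.
   RDMSR faults outside ring 0, so this is not called from user mode. */
static inline uint64_t readmsr_sketch(uint32_t msr) {
  uint32_t edx, eax;
  __asm__ __volatile__("rdmsr"
                       : "=d"(edx), "=a"(eax) /* outputs: EDX, EAX */
                       : "c"(msr));           /* input: ECX, no clobbers */
  return combine_edx_eax(edx, eax);
}
#endif
```

<div>Listing a register both as an operand and as a clobber is what gcc rejects, so the operand constraints alone carry the register usage.</div>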
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">
> +static __inline__ unsigned long __attribute__((always_inline, __nodebug__))<br>
> +__readcr3(void) {<br>
> + unsigned long value;<br>
> + __asm__ __volatile__("mov %%cr3, %0" : "=q"(value));<br>
> + return value;<br>
> +}<br>
<br>
</div>note that asm volatile won't prevent reordering per se and the solution<br>
linux uses is that these inline asm stmts have a fake dependency (read<br>
or write) on a variable used just for this purpose. the alternative would<br>
be to use a memory clobber as you did for __writecr3 but that's quite<br>
heavyweight as it prevents reordering all other loads/stores as well, which<br>
isn't often necessary (e.g., on a context switch the kernel doesn't care<br>
how loads/stores get reordered around a cr3 reload since only the userland<br>
part of the address space changes).<br></blockquote><div><br></div><div>I'm going to add the memory clobber to prevent reordering for now. This is mostly a compatibility header being used by people writing Windows drivers. If they aren't happy with the performance, they can feel free to pull the inline asm up into their project and fine-tune it appropriately.</div>
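<div>To make the two options concrete, here is a rough sketch of the fake-dependency approach next to the memory clobber. It is an illustration under assumptions, not the header's code: all names (force_order_sketch, readcr3_fake_dep, writecr3_fake_dep, writecr3_memory_clobber, pass_through_fake_dep) are hypothetical, uintptr_t stands in for the patch's unsigned long, and the read uses an "=r" constraint. The cr3 accessors are privileged and therefore only compiled, never called; pass_through_fake_dep is a user-mode stand-in with an empty asm template that carries the same constraint shape.</div>

```c
#include <stdint.h>

/* Dummy variable used only to give the asm statements something to
   depend on. Its value is never meaningful. */
static unsigned long force_order_sketch;

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Privileged: these fault outside ring 0, so they are shown, not called. */
static inline uintptr_t readcr3_fake_dep(void) {
  uintptr_t value;
  /* "=m" claims a write to the dummy: the fake dependency. */
  __asm__ __volatile__("mov %%cr3, %0"
                       : "=r"(value), "=m"(force_order_sketch));
  return value;
}

static inline void writecr3_fake_dep(uintptr_t value) {
  /* "m" claims a read of the dummy, tying this statement to the
     fake writes without constraining unrelated loads and stores. */
  __asm__ __volatile__("mov %0, %%cr3"
                       :
                       : "r"(value), "m"(force_order_sketch));
}

static inline void writecr3_memory_clobber(uintptr_t value) {
  /* Heavier alternative: "memory" orders this statement against
     all other loads and stores. */
  __asm__ __volatile__("mov %0, %%cr3" : : "r"(value) : "memory");
}
#endif

/* User-mode stand-in: empty template, same constraint shape.
   The value passes through a register unchanged. */
static inline unsigned long pass_through_fake_dep(unsigned long v) {
  __asm__ __volatile__("" : "+r"(v), "=m"(force_order_sketch));
  return v;
}
```

<div>The trade-off is the one described above: the dummy operand only ties these statements to each other, while the "memory" clobber also pins every surrounding load and store.</div>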
</div></div></div>