[llvm-commits] CVS: llvm-www/pubs/2008-03-ASPLOS-HardErrorPropagation.html 2008-03-ASPLOS-HardErrorPropagation.pdf pubs.js
Chris Lattner
clattner at apple.com
Sun Jun 28 13:48:18 PDT 2009
On Jun 28, 2009, at 12:54 PM, Duncan Sands wrote:
> Hi Chris,
>
>> + This paper aims to provide such a characterization, resulting in
>> identifying low-cost detection methods and providing guidelines for
>> implementation of the recovery and diagnosis components of such a
>> reliability solution. We focus on hard faults because they are
>> increasingly important and have different system implications than
>> the much studied transients. We achieve our goals through fault
>> injection experiments with a microarchitecture-level full system
>> timing simulator. Our main results are: (1) we are able to detect
>> 95% of the unmasked faults in 7 out of 8 studied microarchitectural
>> structures with simple detectors that incur zero to little hardware
>> overhead; (2) over 86% of these detections are within latencies
>> that existing hardware checkpointing schemes can handle, while
>> others require software checkpointing; and (3) a surprisingly large
>> fraction of the detected faults corrupt OS state, but almost all of
>> these are detected with latencies short enough to use hardware
> c!
>> heckpointing, thereby enabling OS recovery in virtually all such
>> cases.
>
> another mysterious line break of the same kind.
Thanks, this doesn't manifest as a rendering or validation problem in
the HTML, so I'll just leave it.
-Chris