0
votes

How does ECC for burst error correction work?

By "burst error detection", I mean a technique that can detect (for example) any combination of bit errors within any one [or two] sequences of 64 consecutive bits.

I need a conceptual explanation, not math.

I studied several descriptions of the technique that are formulated in terms of endless math symbols, but I do not understand what they are saying (because I am not fluent those advanced math formulations).

I ask this question because I dreamed up a technique to detect one burst of 64-bits in a 4096-byte (32768-bit) data-stream (disk-sector/transmission/etc), and want someone to explain the following:

#1: whether my approach is different or equivalent to "cyclic error codes".

#2: how much less efficient is my technique (640-bits corrects any 64-bit burst in 32768-bit stream).

#3: whether anyone can see a way to make my approach detect two bursts instead of only one.

#4: whether my approach is substantially simpler for software implementation.

I posted a conceptual explanation of my technique recently, but several folks were annoyed by my detailed explanation and closed the question. However, it demonstrates that [at least] my technique can be described in conceptual terms, as hopelly someone can for conventional techniques, (cyclic codes). You will also need to read my explanation (to compare it with conventional techniques) at this page:

how does ECC for burst error correction work?

1
#honestann, your approach looks good but there is one thing that doesn't seem quite right. You are not taking into account that the check bytes (code0-9) can have bit errors as well. So, I guess, with some effort, one can find an error pattern in your check bits that will make it "correct" bits that it shouldn't... ;-)guga
@guga: Right you are. I guess my answer to that is the observation that any given bit in one of the 64-bit code0 to code9 values corresponds to bits in about half the other code0 to code9 values (4~5 on average). So errors in any of the code0 to code9 values is exposed by the other code values. But I need to sit down and figure out exactly how this presents and how to assure these cases do not "correct" non-erroneous bits! Another laughable solution is to append another ECC that catches any 64-bit burst error in the first ECC. Ha! That would add another 320-bitshonestann

1 Answers

2
votes

I don't have a full answer to your question, but years ago I implemented cyclic ECC in a hard disk controller, and that design was clearly not as simple, regular and structured as your technique.

Though the hardware to implement my cyclic ECC was fairly simple to implement in hardware, I would be hard pressed to figure out how to re-formulate it in software! In hardware it was a long shift register (roughly 32 to 64-bits if memory serves) with XOR gates inserted at roughly 15 locations to conditionally flip bits based upon bits elsewhere in the stream.

While my implementation could not detect bursts as long as 64-bits (only about 11 or 13-bits as I recall), my impression is that your technique requires two or three times as many bits as optimal cyclic ECC techniques would for similar burst-length and data-stream length.

However, the overhead of your technique is probably small enough to be insignificant. Furthermore, looking at your scheme makes me think (but not be certain) you can correct far more errors than just the "one burst" you designed it for. So your technique may be more robust than conventional cyclic ECC, but more complex software processing would be necessary to "locate and correct" errors not within the one 64-bit burst.

Also on the positive side, your technique can clearly be implemented in hardware easily and efficiently.