RAID-6 is where RAID gets complex. The other levels are quite simple: duplication of blocks (RAID-1), striping of blocks across disks (RAID-0), or a single parity block that relies on the XOR (⊻) property A ⊻ B ⊻ B = A.
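To make the XOR property concrete, here is a minimal sketch of single-parity recovery. The `parity` helper and the two-byte blocks are made up for illustration; real arrays work on much larger blocks.

```python
from functools import reduce
from operator import xor

def parity(blocks):
    """XOR all blocks together, byte by byte (hypothetical helper)."""
    return bytes(reduce(xor, column) for column in zip(*blocks))

# Three tiny data blocks, invented for the example.
a, b, c = b"\x11\x22", b"\x33\x44", b"\x55\x66"
p = parity([a, b, c])

# Lose b: XOR-ing the survivors with the parity cancels a and c out,
# because X ⊻ X = 0, and only b is left.
recovered_b = parity([a, c, p])
```

Any single missing block can be rebuilt this way, which is why one parity block covers exactly one failure.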
But what about RAID-6 parity and the way it is stored? You can’t just store the parity in two places:
1. the layout is A B C D Pa Pb;
2. you lose C and D;
3. since Pa = Pb, both copies only tell you C ⊻ D, so there's no way to reconstruct C or D individually.
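A quick sketch of why the duplicated parity is useless here: two different (C, D) pairs produce exactly the same parity, so nothing on the surviving disks can tell them apart. All values and the `row_parity` name are made up for illustration.

```python
def row_parity(a, b, c, d):
    """Plain XOR parity over one row (hypothetical helper)."""
    return a ^ b ^ c ^ d

a, b = 0x0F, 0xF0

# Two different candidates for the lost blocks (C, D).
pair_one = (0x12, 0x34)
pair_two = (0x26, 0x00)  # different blocks, same C ⊻ D

p_one = row_parity(a, b, *pair_one)
p_two = row_parity(a, b, *pair_two)

# Pa and Pb would both hold this same value, so both pairs look valid.
same_parity = p_one == p_two
```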
Yesterday I found a paper from NetApp describing their implementation. You can find the PDF on their site here.
The gist is that the second parity block is computed diagonally, and each diagonal skips a different disk. Like this:
A1  B1  C1 | Pa1 = A1 ⊻ B1 ⊻ C1 | Pb1 = A1 ⊻ B2 ⊻ C3
A2  B2  C2 | Pa2 = A2 ⊻ B2 ⊻ C2 | Pb2 = B1 ⊻ C2 ⊻ Pa3
A3  B3  C3 | Pa3 = A3 ⊻ B3 ⊻ C3 | Pb3 = C1 ⊻ Pa2 ⊻ A3
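Here is a small sketch of how the two parities in this layout could be computed. It only mirrors the 3-row example above, not NetApp's actual code; `build_stripe`, the column order (A, B, C, Pa) and the integer "blocks" are my own assumptions.

```python
def build_stripe(data):
    """data: 3 rows of [A, B, C] ints.

    Returns (grid, pb), where each grid row is [A, B, C, Pa]
    and pb holds the three diagonal parities Pb1..Pb3.
    """
    # Row parity: plain XOR across each row.
    grid = [row + [row[0] ^ row[1] ^ row[2]] for row in data]

    # Diagonal parity: diagonal d starts at column d in the first row and
    # moves one column to the right per row, wrapping around the 4 columns.
    # Each diagonal therefore leaves exactly one column out.
    pb = []
    for d in range(3):
        acc = 0
        for r in range(3):
            acc ^= grid[r][(d + r) % 4]
        pb.append(acc)
    return grid, pb

grid, pb = build_stripe([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

With 4 columns there are 4 possible diagonals, but only 3 are stored; the one that skips C (Pa1, A2, B3) is left out in this layout.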
What does this mean? Because each diagonal leaves one drive out, there's always a diagonal that touches only one of the two failed drives, so you can restore that missing block from the diagonal parity. With that block recovered, you can then use the standard row parity to get the block for the other drive in the same row. That leaves another diagonal missing only one block, so you can proceed to the next row, following the same route.
Imagine drives A and B fail and you replace them with two new drives, X and Y respectively:
X1  Y1  C1 | Pa1 = A1 ⊻ B1 ⊻ C1 | Pb1 = A1 ⊻ B2 ⊻ C3
X2  Y2  C2 | Pa2 = A2 ⊻ B2 ⊻ C2 | Pb2 = B1 ⊻ C2 ⊻ Pa3
X3  Y3  C3 | Pa3 = A3 ⊻ B3 ⊻ C3 | Pb3 = C1 ⊻ Pa2 ⊻ A3
The restoration steps:
- Y1 = C2 ⊻ Pa3 ⊻ Pb2
- X1 = Y1 ⊻ C1 ⊻ Pa1
- Y2 = X1 ⊻ C3 ⊻ Pb1
- X2 = Y2 ⊻ C2 ⊻ Pa2
- X3 = C1 ⊻ Pa2 ⊻ Pb3
- Y3 = X3 ⊻ C3 ⊻ Pa3
Et voilà. Drives X and Y are restored with the contents of A and B.
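The restoration steps above can be checked with a few lines of Python. This is just a sketch of this specific 3-row example with A and B lost; `recover_a_b` and the sample values are hypothetical, and a real implementation would have to handle any pair of failed drives.

```python
def recover_a_b(c, pa, pb):
    """Rebuild drives A and B from the survivors C, Pa and Pb.

    Each argument is a list of 3 ints (rows 1..3). The order of the
    steps follows the restoration list above.
    """
    C1, C2, C3 = c
    Pa1, Pa2, Pa3 = pa
    Pb1, Pb2, Pb3 = pb
    B1 = C2 ^ Pa3 ^ Pb2   # the Pb2 diagonal skips drive A
    A1 = B1 ^ C1 ^ Pa1    # row parity, row 1
    B2 = A1 ^ C3 ^ Pb1    # the Pb1 diagonal, now missing only B2
    A2 = B2 ^ C2 ^ Pa2    # row parity, row 2
    A3 = C1 ^ Pa2 ^ Pb3   # the Pb3 diagonal skips drive B
    B3 = A3 ^ C3 ^ Pa3    # row parity, row 3
    return [A1, A2, A3], [B1, B2, B3]

# Made-up contents for the original drives.
a, b, c = [1, 4, 7], [2, 5, 8], [3, 6, 9]
pa = [x ^ y ^ z for x, y, z in zip(a, b, c)]
pb = [a[0] ^ b[1] ^ c[2], b[0] ^ c[1] ^ pa[2], c[0] ^ pa[1] ^ a[2]]

# Pretend A and B are gone and rebuild them as X and Y.
x, y = recover_a_b(c, pa, pb)
```

Note that `recover_a_b` never reads `a` or `b`; it only uses the three surviving drives.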
How about Linux’s RAID6 implementation? I still have to analyze it.


