Dr. Francisco J. Collazo

Description and Application of the Hamming Code:

The Laws of Cryptography:

Error correcting codes.

Codes that correct errors are essential to modern civilization and are used in devices from modems to planetary satellites. The theory is mature, difficult, and mathematically oriented, with tens of thousands of scholarly papers and books, but this section will describe only a simple and elegant code, discovered in 1949.

Description of the Hamming Code.

Richard Hamming found a beautiful binary code that will correct any single error and will detect any double error (two separate errors). The Hamming code has been used for computer RAM, and is a good choice for randomly occurring errors. (If errors come in bursts, there are other good codes.) Unlike most other error-correcting codes, this one is simple to understand.

The code uses extra redundant bits to check for errors, and performs the checks with special check equations. A parity check equation of a sequence of bits just adds the bits of the sequence and insists that the sum be even (for even parity) or odd (for odd parity). This section uses even parity. Alternatively, one says that the sum is taken modulo 2 (divide by 2 and take the remainder), or one says that the sum is taken over the integers mod 2, Z2.

A simple parity check will detect if there has been an error in one bit position, since even parity will change to odd parity. (Any odd number of errors will show up as if there were just 1 error, and any even number of errors will look the same as no error.)

One has to force even parity by adding an extra parity bit and setting it either to 1 or to 0 to make the overall parity come out even. It is important to realize that the extra parity check bit participates in the check and is itself checked for errors, along with the other bits.

The Hamming code uses parity checks over a portion of the positions in a block. Suppose there are bits in consecutive positions from 1 to n-1. The positions whose position number is a power of 2 are used as check bits, whose value must be determined from the data bits. Thus the check bits are in positions 1, 2, 4, 8, 16, ..., up to the largest power of 2 that is less than or equal to the largest bit position. The remaining positions are reserved for data bits.

Each check bit has a corresponding check equation that covers a portion of all the bits, but always includes the check bit itself. Consider the binary representation of the position numbers: 1 = 12, 2 = 102, 3 = 112, 4 = 1002, 5 = 1012, 6 = 1102, and so forth. If the position number has a 1 as its rightmost bit, then the check equation for check bit 1 covers those positions. If the position number has a 1 as its next-to-rightmost bit, then the check equation for check bit 2 covers those positions. If the position number has a 1 as its third-from-rightmost bit, then the check equation for check bit 4 covers those positions. Continue in this way through all check bits. The table below summarizes this.

Here is a table showing the parity checks for the first 17 positions of the Hamming code. (Check bits are in positions 1, 2, 4, 8, and 16, in red italic in the table below.)

Position 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17

Bin Rep 1 10 11 100 101 110 111 1000 1001 1010 1011 1100 1101 1110 1111 10000 10001

Check:1 x x x x x x x x x

Check:2 x x x x x x x x

Check:4 x x x x x x x x

Check:8 x x x x x x x x

Check:16 x x

The table below assumes one starts with data bits 1101101 (in black below). The check equations above are used to determine values for check bits in positions 1, 2, 4, and 8, to yield the word 11101010101 below, with check bits in red italic here and below.

Position 1 2 3 4 5 6 7 8 9 10 11

Binary 1 10 11 100 101 110 111 1000 1001 1010 1011

Word 1 1 1 0 1 0 1 0 1 0 1

Check:1 1 1 1 1 1 1

Check:2 1 1 0 1 0 1

Check:4 0 1 0 1

Check:8 0 1 0 1

Intuitively, the check equations allow one to ``zero-in'' on the position of a single error. For example, suppose a single bit is transmitted in error. If the first check equation fails, then the error must be in an odd position, and otherwise it must be in an even position. In other words, if the first check fails, the position number of the bit in error must have its rightmost bit (in binary) equal to 1; otherwise it is zero. Similarly the second check gives the next-to-rightmost bit of the position in error, and so forth.

The table below gives the result of a single error in position 11 (changed from a 1 to a 0). Three of the four parity checks fail, as shown below. Adding the position number of each failing check gives the position number of the error bit, 11 in this case.

Position 1 2 3 4 5 6 7 8 9 10 11 Result of Check

Binary 1 10 11 100 101 110 111 1000 1001 1010 1011

Word 1 1 1 0 1 0 1 0 1 0 0 (error)

Check:1 1 1 1 1 1 0 1 fails

Check:2 1 1 0 1 0 0 2 fails

Check:4 0 1 0 1 - passes

Check:8 0 1 0 0 8 fails

The above discussion shows how to get single-error correction with the Hamming code. One can also get double-error detection by using a single extra check bit, which is in position 0. (All other positions are handled as above.) The check equation in this case covers all bits, including the new bit in position 0. In case of a single error, this new check will fail. If only the new equation fails, but none of the others, then the position in error is the new 0th check bit, so a single error of this new bit can also be corrected. In case of two errors, the overall check (using position 0) will pass, but at least one of the other check equations must fail. This is how one detects a double error. In this case there is not enough information present to say anything about the positions of the two bits in error. Three or more errors at the same time can show up as no error, as two errors detected, or as a single error that is ``corrected'' with a bogus correction.

Notice that the Hamming code without the extra 0th check bit would correct a double error in some bogus position as if it were a single error. Thus the extra check bit and the double error detection are very important for this code. Notice also that the check bits themselves will also be corrected if one of them is transmitted in error (without any other errors).

Law HAMMING1: The binary Hamming code is particularly useful because it provides a good balance between error correction (1 error) and error detection (2 errors).

Block sizes for the Hamming Code.

The Hamming code can accommodate any number of data bits, but it is interesting to list the maximum size for each number of check bits. The table below includes the overall check bit, so that this is the full binary Hamming code, including double error detection.

Check bits Max Data bits Max Total size

3 1 4

4 4 8

5 11 16

6 26 32

7 57 64

8 120 128

For example, with 64 bits or 8 bytes, one gets 7 bytes of data (plus 1 bit) and uses 1 byte for the check bits (actually, only 7 bits). Thus an error-prone storage or transmission system would only need to devote 1 out of each 8 bytes 12.5% to error correction/detection

Revision date: 2001-12-12. (Please use ISO 8601, the International Standard.)

Reference: The Hamming Code for Error Correction

by Neal R. Wagner

NOTE: This site is obsolete. See book draft (in PDF):

The Laws of Cryptography with Java Code.

Calculating the Hamming Code:

The key to the Hamming Code is the use of extra parity bits to allow the identification of a single error. Create the code word as follows:

Mark all bit positions that are powers of two as parity bits. (positions 1, 2, 4, 8, 16, 32, 64, etc.)

All other bit positions are for the data to be encoded. (positions 3, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 17, etc.)

Each parity bit calculates the parity for some of the bits in the code word. The position of the parity bit determines the sequence of bits that it alternately checks and skips.

Position 1: check 1 bit, skip 1 bit, check 1 bit, skip 1 bit, etc. (1,3,5,7,9,11,13,15,...)

Position 2: check 2 bits, skip 2 bits, check 2 bits, skip 2 bits, etc. (2,3,6,7,10,11,14,15,...)

Position 4: check 4 bits, skip 4 bits, check 4 bits, skip 4 bits, etc. (4,5,6,7,12,13,14,15,20,21,22,23,...)

Position 8: check 8 bits, skip 8 bits, check 8 bits, skip 8 bits, etc. (8-15,24-31,40-47,...)

Position 16: check 16 bits, skip 16 bits, check 16 bits, skip 16 bits, etc. (16-31,48-63,80-95,...)

Position 32: check 32 bits, skip 32 bits, check 32 bits, skip 32 bits, etc. (32-63,96-127,160-191,...)

etc.

Set a parity bit to 1 if the total number of ones in the positions it checks is odd. Set a parity bit to 0 if the total number of ones in the positions it checks is even.

Here is an example:

A byte of data: 10011010

Create the data word, leaving spaces for the parity bits: _ _ 1 _ 0 0 1 _ 1 0 1 0

Calculate the parity for each parity bit (a ? represents the bit position being set):

Position 1 checks bits 1,3,5,7,9,11:

? _ 1 _ 0 0 1 _ 1 0 1 0. Even parity so set position 1 to a 0: 0 _ 1 _ 0 0 1 _ 1 0 1 0

Position 2 checks bits 2,3,6,7,10,11:

0 ? 1 _ 0 0 1 _ 1 0 1 0. Odd parity so set position 2 to a 1: 0 1 1 _ 0 0 1 _ 1 0 1 0

Position 4 checks bits 4,5,6,7,12:

0 1 1 ? 0 0 1 _ 1 0 1 0. Odd parity so set position 4 to a 1: 0 1 1 1 0 0 1 _ 1 0 1 0

Position 8 checks bits 8,9,10,11,12:

0 1 1 1 0 0 1 ? 1 0 1 0. Even parity so set position 8 to a 0: 0 1 1 1 0 0 1 0 1 0 1 0

Code word: 011100101010.

Finding and fixing a bad bit

The above example created a code word of 011100101010. Suppose the word that was received was 011100101110 instead. Then the receiver could calculate which bit was wrong and correct it. The method is to verify each check bit. Write down all the incorrect parity bits. Doing so, you will discover that parity bits 2 and 8 are incorrect. It is not an accident that 2 + 8 = 10, and that bit position 10 is the location of the bad bit. In general, check each parity bit, and add the positions that are wrong, this will give you the location of the bad bit.

Try one yourself

Test if these code words are correct, assuming they were created using an even parity Hamming Code . If one is incorrect, indicate what the correct code word should have been. Also, indicate what the original data was.

010101100011

111110001100

000010001010

Referfence:

http://www.cs.fiu.edu/~downeyt/cop3402/hamming.html

Lab 2: Hamming Code

Repeating every bit three times does allow us to detect and correct errors, but it means that we need three times as much memory! That is why mathematicians work to develop smarter error correcting codes, or ways of encoding bit strings into slightly longer bit strings which have enough redundancy so that errors can be corrected. A simple example is the following Hamming code.

This error correcting code replaces every string of 4 bits by a string of 7 bits (this still means that we need more memory than for the raw, uncoded data, but we need slightly less than twice the original space, versus three times in our naive example above).

This is how it works:

given a 4 bit string b1 b2 b3 b4, we construct a 7 bit string: b1 b2 b3 b4 b5 b6 b7; the first 4 bits are just the old 4 bits again, and the new bits b5, b6 and b7 are computed by means of our old friend parity addition:

b5 =

parity addition of (b1+b2+b3)

b6 =

parity addition of (b1+b3+b4)

b7 =

parity addition of (b2+b3+b4)

Every possible string of 7 bits that can be constructed in this way is called a codeword. Let's do a few examples. Find the codewords corresponding to these four bit combinations:

All in all, there are 16 possible codewords:

0000000

0001011

0010111

0011100

0100101

0101110

0110010

0111001

1000110

1001101

1010001

1011010

1100011

1101000

1110100

1111111

How does decoding work? Again, by some kind of majority vote. Given a string of 7 bits, it can either be one of the 16 codewords, and then we know what the original string was, or it can be a corrupted codeword. In that last case, we simply find the codeword in the list of all possible codewords that differs least from our string.

For instance, if the string is 1010110, then the most similar codeword is 1000110 (since it differs in only one entry), and so the decoded string is given by the first 4 bits of that codeword, namely 1000.

Reference: MATH ALIVE 199

WEB SITE:http://www.princeton.edu/~matalive/VirtualClassroom/v0.1/htm