Final Project
Instructions

Your final project is to implement the Secure Hash Algorithm (SHA1) in C or C++.

This algorithm takes a file, cuts up and mixes the data, and produces a hash value, which is a number in a specific range.

The hash value for SHA1 is 160 bits long, so there are 2^{160} (about 1.46 x 10^{48}) possible hash values, which is within a few orders of magnitude of the number of atoms on planet Earth.

Since at present it is computationally infeasible to determine the original text from the SHA1 hash value, it is often used in data security for such things as storing passwords securely, error detection, comparing files, and digital signatures.

For more details, see the NIST document FIPS1801SecureHashStandard.pdf

The document is a bit technical, but has very detailed output that you can use to debug your program.

Here are the 3 input files from the document:
abc.txt
alpha.txt
a.txt

Wikipedia also has example hashes and SHA1 pseudocode at http://en.wikipedia.org/wiki/SHA1

You can also follow my instructions below to help you write the program.

Instructions for function:
unsigned int readFile(unsigned char buffer[])

The entire contents of the file are stored in the buffer[] array, a single 1 (one) bit is appended to the end of the buffer[] array, and the function returns the number of characters in the file (size of file in bytes).

This is similar to the character counting program counting.c

Input and count the characters from standard input until the end-of-file (EOF).

For data input, create a text file (for example, abc.txt), and
use any input method you wish.

Store the characters in a large array of UNSIGNED (not signed) characters.

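The largest input file to test your program is 1 million characters,
but set your MAX_SIZE to 1048576 (1 megabyte), or larger.
(Warning: Your program might not get the correct result for the file of 1 million a's if you use 1000001 as the max. After the 0x80 byte, the zero padding, and the 64-bit bit count are added, the padded message for that file is 1000064 bytes long.)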

Make sure that you have error checking to stop the program and tell the user when the input file is too big for your program.

At this point you can check if your program puts the correct characters into your array, and counts the correct number of bytes in the file.

This also might be a good place to append the 1 (one) bit at the end of the message.

Note that this is equivalent to adding the byte 0x80 after the last character in the buffer.

For the abc.txt example, buffer[0] = 'a' = 0x61, buffer[1] = 'b' = 0x62, buffer[2] = 'c' = 0x63, and buffer[3] = 0x80.
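
The steps above can be sketched roughly as follows. This is only one possible approach, not the required solution. The helper name readStream() is hypothetical (it is not part of the assignment) and exists only so the same code can read from any stream. The limit of MAX_SIZE - 72 is an assumption that leaves room for the 0x80 byte, up to 63 bytes of zero padding, and the 8-byte bit count. The sketch also assumes buffer[] starts out filled with zeros (for example, a global or static array).

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_SIZE 1048576  /* 1 megabyte, as suggested above */

/* Hypothetical helper: read every byte from `in` into buffer[], append the
   byte 0x80 (a 1 bit followed by seven 0 bits), and return the byte count. */
unsigned int readStream(FILE *in, unsigned char buffer[])
{
    unsigned int count = 0;
    int c;  /* int, not char, so that EOF can be told apart from data */

    while ((c = fgetc(in)) != EOF) {
        /* Stop if the padded message would no longer fit in MAX_SIZE bytes. */
        if (count >= MAX_SIZE - 72) {
            fprintf(stderr, "Error: input file is too big for this program.\n");
            exit(EXIT_FAILURE);
        }
        buffer[count] = (unsigned char)c;
        count = count + 1;
    }

    buffer[count] = 0x80;  /* the appended 1 bit */
    return count;
}

/* The function named in the assignment, reading from standard input. */
unsigned int readFile(unsigned char buffer[])
{
    return readStream(stdin, buffer);
}
```

With input redirected from abc.txt, readFile() would be expected to return 3 and leave 0x61, 0x62, 0x63, 0x80 in the first four elements of buffer[].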

Instructions for function:
unsigned int calculateBlocks(unsigned int sizeOfFileInBytes)

The next step is to calculate the block count.

Since each block is 512 bits, divide the total bits (not bytes) in the file by 512.

Before dividing by 512, you also have to add 1 (one) more to the total bit count, as 1 (one) bit is appended to the end of the data from the file.

Since the last 64 bits at the end of the last block are reserved for the bit count (not byte count) of the file, one more block must be added for any message that has a final block that is greater than 448 (512 - 64) bits.

This equation will calculate the block count: (((8 * sizeOfFileInBytes) + 1) / 512) + 1

This if statement will determine if an extra block needs to be added or not:
if((((8 * sizeOfFileInBytes) + 1) % 512) > (512 - 64)) blockCount = blockCount + 1

For the abc.txt example, blocks = ((((8 * sizeOfFileInBytes) + 1) / 512) + 1) = ((((8 * 3) + 1) / 512) + 1) = (0 + 1) = 1.
There is no extra block, because the condition
if((((8 * sizeOfFileInBytes) + 1) % 512) > (512 - 64)) =
if((((8 * 3) + 1) % 512) > (512 - 64)) =
if(25 > 448) is false.
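
Putting the equation and the if statement together, the function might look like the sketch below. The local variable totalBits is a name introduced here for readability and is not required by the assignment.

```c
/* Number of 512-bit blocks needed for the file data, the appended 1 bit,
   and the 64-bit bit count stored at the end of the last block. */
unsigned int calculateBlocks(unsigned int sizeOfFileInBytes)
{
    unsigned int totalBits = (8 * sizeOfFileInBytes) + 1;  /* data + 1 bit */
    unsigned int blockCount = (totalBits / 512) + 1;

    /* If the final block has more than 448 bits in use, the 64-bit
       bit count does not fit, so one more block is needed. */
    if ((totalBits % 512) > (512 - 64)) {
        blockCount = blockCount + 1;
    }
    return blockCount;
}
```

For a 55-byte file the result is 1 block, while a 56-byte file needs 2 blocks, because 449 bits is greater than 448.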

Instructions for function:
void convertCharArrayToIntArray(unsigned char buffer[], unsigned int message[], unsigned int sizeOfFileInBytes)

It is easier to read from a file using an array of unsigned characters (buffer[]),
but it is easier to process each block (512 bits or 16 integers) using an array of unsigned integers (message[]),
so you should convert your array of unsigned characters into an equivalent array of unsigned integers.

We did this in the assignment on C Bitwise Operators, where we packed 4 characters into 1 integer variable.
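
One way to do the packing is sketched below. It assumes that unsigned int is 32 bits, that the first character of each group of 4 goes into the most significant byte of the integer, and that the rest of message[] already contains zeros (so the padding words need no extra work). These are assumptions of this sketch, not requirements stated in the assignment.

```c
/* Pack the file bytes plus the appended 0x80 byte into 32-bit words,
   4 bytes per word, first byte in the most significant position. */
void convertCharArrayToIntArray(unsigned char buffer[], unsigned int message[],
                                unsigned int sizeOfFileInBytes)
{
    unsigned int totalBytes = sizeOfFileInBytes + 1;  /* data plus 0x80 */
    unsigned int i;

    for (i = 0; i < totalBytes; i++) {
        unsigned int shift = 8 * (3 - (i % 4));  /* 24, 16, 8, 0 */

        if (i % 4 == 0) {
            message[i / 4] = 0;  /* start each word with all bits clear */
        }
        message[i / 4] |= ((unsigned int)buffer[i]) << shift;
    }
}
```

For the abc.txt example, this gives message[0] = 0x61626380.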

Instructions for function:
void addBitCountToLastBlock(unsigned int message[], unsigned int sizeOfFileInBytes, unsigned int blockCount)

You need to insert the size of the file in bits into the last index of the last block.

Calculate the index of last word (integer) in the message[] array.

In the document, "word" refers to data type "int", both of which are 32 bits.

Insert the number of bits in the file into the last integer (word) of the message[] array.
There are 8 bits in a byte, so sizeOfTheFileInBits = sizeOfFileInBytes * 8

There are 16 integer array elements in each block,
so indexOfEndOfLastBlock = (blockCount * 16) - 1.
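
A minimal sketch of this function follows. Because 64 bits are reserved for the bit count, the sketch also fills in the second-to-last word with the upper 32 bits of the count. That word is 0 for any file smaller than 512 megabytes, so for the test files it has the same effect as writing only the last word. The sketch assumes unsigned int is 32 bits.

```c
/* Store the size of the file in bits in the last two words of the last block. */
void addBitCountToLastBlock(unsigned int message[], unsigned int sizeOfFileInBytes,
                            unsigned int blockCount)
{
    unsigned int indexOfEndOfLastBlock = (blockCount * 16) - 1;

    /* Upper 32 bits of (sizeOfFileInBytes * 8): the 3 bits shifted out. */
    message[indexOfEndOfLastBlock - 1] = sizeOfFileInBytes >> 29;

    /* Lower 32 bits of (sizeOfFileInBytes * 8). */
    message[indexOfEndOfLastBlock] = (sizeOfFileInBytes << 3) & 0xFFFFFFFFu;
}
```

For the abc.txt example, blockCount is 1, so message[15] = 24 (0x18) and message[14] = 0.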

Instructions for function:
void computeMessageDigest(unsigned int message[], unsigned int blockCount)

The final step is to compute the message digest, which is described in the document in parts 5, 6, and 7 (pages 9-12).

Initialize variables as described in the document on page 11. (H0 = 0x67452301, H1 = 0xEFCDAB89, etc...)

Loop through each block, and complete steps a, b, c, d, and e as described in the document.

When you implement the functions
unsigned int K(unsigned int t)
and unsigned int f(unsigned int t, unsigned int B, unsigned int C, unsigned int D)
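note that the C/C++ code "if(0 <= t && t <= 19)" is equivalent to the pseudocode statement "if(0 <= t <= 19)". (Writing "0 <= t <= 19" directly in C/C++ compiles, but it compares the result of "0 <= t", which is 0 or 1, against 19, so it is always true.) Part 5 of the document describes function f(). Part 6 describes function K().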
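
The loop over the blocks and the two functions above can be sketched as shown below. This is an illustration under some assumptions, not the only way to organize the code: rotateLeft() and computeDigestWords() are hypothetical names introduced here, unsigned int is assumed to be at least 32 bits, and the results are left in an array H[] instead of being printed so they are easy to check. Since t is unsigned, the test "0 <= t" is always true and is left out.

```c
/* Hypothetical helper: circular left shift of a 32-bit word (0 < n < 32). */
static unsigned int rotateLeft(unsigned int x, unsigned int n)
{
    x &= 0xFFFFFFFFu;
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFFu;
}

/* Part 5 of the document: logical function for round t (0 <= t <= 79). */
unsigned int f(unsigned int t, unsigned int B, unsigned int C, unsigned int D)
{
    if (t <= 19) return (B & C) | (~B & D);
    if (20 <= t && t <= 39) return B ^ C ^ D;
    if (40 <= t && t <= 59) return (B & C) | (B & D) | (C & D);
    return B ^ C ^ D;  /* 60 <= t <= 79 */
}

/* Part 6 of the document: constant for round t (0 <= t <= 79). */
unsigned int K(unsigned int t)
{
    if (t <= 19) return 0x5A827999u;
    if (20 <= t && t <= 39) return 0x6ED9EBA1u;
    if (40 <= t && t <= 59) return 0x8F1BBCDCu;
    return 0xCA62C1D6u;  /* 60 <= t <= 79 */
}

/* Hypothetical helper: run steps a to e on every block and leave the
   five words of the message digest in H[0] to H[4]. */
void computeDigestWords(const unsigned int message[], unsigned int blockCount,
                        unsigned int H[5])
{
    unsigned int W[80], A, B, C, D, E, TEMP, block, t;

    H[0] = 0x67452301u; H[1] = 0xEFCDAB89u; H[2] = 0x98BADCFEu;
    H[3] = 0x10325476u; H[4] = 0xC3D2E1F0u;

    for (block = 0; block < blockCount; block++) {
        /* a. Copy the 16 words of this block into W[0] to W[15]. */
        for (t = 0; t < 16; t++) {
            W[t] = message[(block * 16) + t];
        }
        /* b. Expand the 16 words into 80 words. */
        for (t = 16; t < 80; t++) {
            W[t] = rotateLeft(W[t - 3] ^ W[t - 8] ^ W[t - 14] ^ W[t - 16], 1);
        }
        /* c. Start from the current hash values. */
        A = H[0]; B = H[1]; C = H[2]; D = H[3]; E = H[4];
        /* d. 80 rounds of mixing. */
        for (t = 0; t < 80; t++) {
            TEMP = (rotateLeft(A, 5) + f(t, B, C, D) + E + W[t] + K(t)) & 0xFFFFFFFFu;
            E = D; D = C; C = rotateLeft(B, 30); B = A; A = TEMP;
        }
        /* e. Add this block's result to the hash values (modulo 2^32). */
        H[0] = (H[0] + A) & 0xFFFFFFFFu;
        H[1] = (H[1] + B) & 0xFFFFFFFFu;
        H[2] = (H[2] + C) & 0xFFFFFFFFu;
        H[3] = (H[3] + D) & 0xFFFFFFFFu;
        H[4] = (H[4] + E) & 0xFFFFFFFFu;
    }
}
```

A computeMessageDigest() function with the signature given above could declare an array of 5 unsigned integers, call this helper, and then output the result. For the abc.txt example (message[0] = 0x61626380, message[15] = 24, all other words 0), the five words should come out as A9993E36 4706816A BA3E2571 7850C26C 9CD0D89D.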

Equivalent C/C++ notation:
C/C++ operator: & (X AND Y = X & Y = bitwise "and" of X and Y)
C/C++ operator: | (X OR Y = X | Y = bitwise "inclusive-or" of X and Y)
C/C++ operator: ^ (X XOR Y = X ^ Y = bitwise "exclusive-or" of X and Y)
C/C++ operator: ~ (NOT X = ~X = bitwise "complement" of X)
C/C++ operator: + (X + Y = addition modulo 2^{32})

Output the message digest!
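
The assignment does not specify an output format, so the sketch below simply assumes the style used for the example digests in the document: five 8-digit hexadecimal words separated by spaces. The helper names formatDigest() and printDigest() are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: write the digest into out[] as five 8-digit
   hexadecimal words separated by spaces (44 characters plus the '\0'). */
void formatDigest(const unsigned int H[5], char out[45])
{
    /* %08X pads with leading zeros so every word is exactly 8 digits. */
    sprintf(out, "%08X %08X %08X %08X %08X",
            H[0] & 0xFFFFFFFFu, H[1] & 0xFFFFFFFFu, H[2] & 0xFFFFFFFFu,
            H[3] & 0xFFFFFFFFu, H[4] & 0xFFFFFFFFu);
}

/* Hypothetical helper: print the digest followed by a newline. */
void printDigest(const unsigned int H[5])
{
    char text[45];

    formatDigest(H, text);
    printf("%s\n", text);
}
```

Leaving out the leading zeros (for example, by using %X instead of %08X) is an easy mistake to make, and it only shows up for digests that happen to contain a word starting with 0.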

Email your C or C++ program to the instructor,
or show it to the instructor in class,
or make a class presentation.
 Your program will be evaluated by the following criteria:
projectrubric.xls