Tuesday, October 30, 2012

Hashing in .NET

Hashing is a way to get a fixed-size result from an arbitrary input. You can pass input of any size to a hash method and you will always get back a result of the same fixed size. In this post, I will talk about hashing as a whole and the approaches available in .NET to hash your data. Let's look at some of the properties that a hash algorithm should provide.

Stability of a Hash
By stability of a hash we mean that, given the same input, the hash algorithm always generates the same result. If you pass the same input a million times to the same hash algorithm, you will get the same result every time. We rely on this property to check whether an input is correct: hash it again and compare the result with the hash we already have. This is the fundamental property of every hash algorithm; a hash that is not stable is not useful.
Uniformity of Hash
Uniformity means that the results of a hash algorithm are spread across the whole result address space. At the very least, every valid result value should be produced by at least one input; in other words, there should be no value inside the result address space that no input can ever hash to. Perfect uniformity is generally not possible for any hash algorithm. Collisions are also unavoidable: in practice the address space of the hash is smaller than the address space of the inputs, because hashing makes the data smaller, so some different inputs must share a result.
Efficiency of Hash
We consider a hash algorithm efficient when computing the hash does not take long. The cost of hashing should always be balanced against the needs of the application. If an algorithm is so complex that a single hash takes, say, 1 millisecond, it is too slow for work where hashing happens constantly, such as hash-table lookups; something on the order of 1 microsecond per input is a more reasonable target there.
Security of Hash
By security of a hash algorithm we mean that it cannot be reversed: given a hash value, finding an input that produces it should not be feasible. Security is one of the important criteria for a good hash algorithm. So if you are hashing something that requires security (a password, for example), make sure you use a secure hash algorithm.
Let's look at the simplest implementation of a hash algorithm.
We call it the naive implementation: summing the character codes of a string.
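public int AdditiveHashAlgorithm(string str)
{
    int result = 0;
    // Each char converts to its numeric character code
    foreach (int code in str)
    {
        // unchecked lets the sum wrap around instead of throwing on overflow
        unchecked
        {
            result += code;
        }
    }
    return result;
}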
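The above code is about the simplest hash algorithm there is: it adds up the character codes of a string and returns the sum as an integer. I have used an unchecked block so that an integer overflow wraps around instead of throwing. This hash is stable and efficient, and in principle every result can be reached, but it distributes poorly (any two strings made of the same characters in a different order collide) and it is not secure.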
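To make the collision behaviour of the additive hash concrete, here is a minimal sketch. The class name AdditiveHashDemo and the sample words are only illustrative; the method repeats the additive scheme so the snippet compiles on its own.

```csharp
using System;

public static class AdditiveHashDemo
{
    // Same additive scheme as the naive implementation, repeated here
    // so that this sketch is self-contained.
    public static int AdditiveHash(string str)
    {
        int result = 0;
        foreach (char c in str)
        {
            unchecked { result += c; }
        }
        return result;
    }

    public static void Main()
    {
        // "listen" and "silent" contain the same characters in a
        // different order, so their sums are identical.
        int first = AdditiveHash("listen");
        int second = AdditiveHash("silent");
        Console.WriteLine(first == second); // True
    }
}
```

Because addition ignores order, every anagram of a string lands on the same value, which is why this hash is fine as a teaching example but a poor choice for real lookups.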
Available Hash Algorithms in .NET
Among the hash algorithms available in the .NET BCL, the most commonly used are MD5, SHA-1, SHA-256 and SHA-512 (the last two belong to the SHA-2 family). Let's look at how you can work with them.
MD5 Hash
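using System.Security.Cryptography;
using System.Text;

public string CalculateMD5(string input)
{
    // Create the MD5 hasher and dispose it when done
    using (MD5 md5 = MD5.Create())
    {
        // UTF-8 keeps non-ASCII characters intact (Encoding.ASCII would turn them into '?')
        byte[] inputBytes = Encoding.UTF8.GetBytes(input);
        byte[] hash = md5.ComputeHash(inputBytes);

        // Convert the byte array to an uppercase hexadecimal string
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hash.Length; i++)
            sb.Append(hash[i].ToString("X2"));
        return sb.ToString();
    }
}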
This code computes the MD5 hash of the input and returns it as a hexadecimal string. Typically we give the hash to external parties so that, if they have the same input, they can compute the hash the same way and compare the two values. MD5 is stable, uniform and efficient, but not secure: practical collision attacks against it are known.
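The compare-the-two-hashes workflow could look roughly like the sketch below. The class HashCompareDemo, the helper HashesMatch and the sample strings are hypothetical names used only for illustration, and UTF-8 is an assumed encoding; both sides must agree on the encoding for the hashes to match.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashCompareDemo
{
    // Computes the MD5 hash of a string as uppercase hex without dashes.
    public static string Md5Hex(string input)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
            // BitConverter.ToString yields "5E-B6-..."; strip the dashes.
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }

    // Re-hashes the input and compares it with the hash we were given.
    public static bool HashesMatch(string input, string expectedHex)
    {
        return string.Equals(Md5Hex(input), expectedHex,
                             StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string published = Md5Hex("hello world");
        Console.WriteLine(HashesMatch("hello world", published)); // True
        Console.WriteLine(HashesMatch("hello worle", published)); // False
    }
}
```

The comparison ignores case because hex strings are often exchanged in either upper or lower case.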
SHA 1 (Secure Hash Algorithm)
public static string CalculateSHA1(string text, Encoding enc)
{
    byte[] buffer = enc.GetBytes(text);
    // Dispose the provider once the hash has been computed
    using (SHA1CryptoServiceProvider crypto = new SHA1CryptoServiceProvider())
    {
        byte[] hash = crypto.ComputeHash(buffer);

        // Convert the byte array to an uppercase hexadecimal string
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hash.Length; i++)
            sb.Append(hash[i].ToString("X2"));

        // Return the hex string, matching the declared return type
        return sb.ToString();
    }
}
SHA-1 belongs to the Secure Hash Algorithm family. You can do the same with SHA-2 (256 bit) by replacing SHA1CryptoServiceProvider with SHA256CryptoServiceProvider; use SHA384CryptoServiceProvider for a 384-bit hash and SHA512CryptoServiceProvider for a 512-bit hash.
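As a sketch of the SHA-2 variant, the snippet below computes a SHA-256 hash. It uses the SHA256.Create() factory rather than SHA256CryptoServiceProvider, which is one of several equivalent ways to obtain the algorithm; the class name Sha256Demo and the method name CalculateSHA256 are illustrative, not part of the BCL.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class Sha256Demo
{
    // Hashes the text with SHA-256 and returns uppercase hex.
    public static string CalculateSHA256(string text, Encoding enc)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(enc.GetBytes(text));

            // Two hex characters per byte
            StringBuilder sb = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
                sb.Append(b.ToString("X2"));
            return sb.ToString();
        }
    }

    public static void Main()
    {
        string hex = CalculateSHA256("abc", Encoding.UTF8);
        // 32 bytes (256 bits) become 64 hex characters
        Console.WriteLine(hex.Length); // 64
    }
}
```

Swapping SHA256 for SHA384 or SHA512 in the same shape gives the longer hashes.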
Remember
SHA-1 is no longer considered secure, because collision attacks against it have been published. For security purposes it is recommended to use SHA-2 or above.
I hope this post helps.
Thanks for reading.
