So recently I’ve been working heavily with some of the cryptography stuff in .NET and adding a large amount of it to my open source library. One of the many things I needed to do was simply encrypt and decrypt a piece of data with a password.

It seems everyone out there is using Rfc2898DeriveBytes by now rather than the older PasswordDeriveBytes. Its use is simple enough: construct it with a password, ask it for a number of bytes, and off you go. Where things get interesting is correctly providing salt to the key-derivation algorithm. I didn’t want to deal with this all over my code, so I built the PasswordKey class. It auto-magically prepends the salt to the stream when encrypting data; when decrypting, it reads the salt back, regenerates the key, and then decrypts the data.
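
The PasswordKey class itself lives in the library, but the idea is simple enough to sketch. The following is a minimal illustration of the salt-prepending approach, not the actual PasswordKey source; the 16-byte salt, the 10,000 iterations, the choice of AES, and the class and method names are all my own placeholder choices:

    using System.IO;
    using System.Security.Cryptography;

    static class PasswordCryptoSketch
    {
        // Encrypt: write a random salt first, then the ciphertext keyed from it.
        public static void Encrypt(Stream input, Stream output, string password)
        {
            byte[] salt = new byte[16];
            RandomNumberGenerator.Create().GetBytes(salt);
            output.Write(salt, 0, salt.Length);

            Rfc2898DeriveBytes kdf = new Rfc2898DeriveBytes(password, salt, 10000);
            using (SymmetricAlgorithm aes = Aes.Create())
            {
                aes.Key = kdf.GetBytes(32);
                aes.IV = kdf.GetBytes(16);
                using (CryptoStream cs = new CryptoStream(output, aes.CreateEncryptor(), CryptoStreamMode.Write))
                    input.CopyTo(cs);
            }
        }

        // Decrypt: read the salt back, rebuild the same key and IV, then decrypt the rest.
        public static void Decrypt(Stream input, Stream output, string password)
        {
            byte[] salt = new byte[16];
            input.Read(salt, 0, salt.Length);   // a robust version would loop until all 16 bytes arrive

            Rfc2898DeriveBytes kdf = new Rfc2898DeriveBytes(password, salt, 10000);
            using (SymmetricAlgorithm aes = Aes.Create())
            {
                aes.Key = kdf.GetBytes(32);
                aes.IV = kdf.GetBytes(16);
                using (CryptoStream cs = new CryptoStream(input, aes.CreateDecryptor(), CryptoStreamMode.Read))
                    cs.CopyTo(output);
            }
        }
    }

Because the salt travels with the ciphertext, the caller only ever has to supply the password; everything else needed to rebuild the key is read from the stream.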

At first everything went well: it seemed to work and, as per my norm, I quickly got my unit tests pounding on the implementation. Somewhere along the way, though, I must have used the Rfc2898DeriveBytes class in a way that was not expected, because when I began running in release (optimized) mode I started seeing strange results. The tests were deriving 32-byte keys from the same password, iteration count, and salt, yet getting different keys. The funniest part was that they only differed after the first 20 bytes. How could this be? If the input were wrong there is almost no likelihood the first 20 bytes would match, so how did I fail? I looked and looked, googled and googled some more, and read nearly every line of code in the class using Reflector.Net. Nothing. To this day I don’t know why it breaks; I do know I can get two identical 20-byte keys, or two identical 40-byte keys, but I still get two different 32-byte keys… What should I do? Well, I came up with two possible solutions:

1. The first ‘fix’ for this issue was the PBKDF2 class, which derives from Rfc2898DeriveBytes. By overriding the GetBytes() method I can always pass a value of 20 to the base GetBytes() method:

    public override byte[] GetBytes(int cb)
    {
        byte[] buffer = new byte[cb];
        // Request the underlying bytes in full 20-byte (SHA-1 sized) blocks,
        // copying only as much of the final block as the caller asked for.
        for (int i = 0; i < cb; i += 20)
        {
            int step = Math.Min(20, cb - i);
            Array.Copy(base.GetBytes(20), 0, buffer, i, step);
        }
        return buffer;
    }
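
For completeness, the class around that override is little more than a pass-through constructor. The skeleton below is my own sketch (the library’s actual PBKDF2 class presumably exposes more of the base constructors than this single one):

    using System.Security.Cryptography;

    public class PBKDF2 : Rfc2898DeriveBytes
    {
        public PBKDF2(string password, byte[] salt, int iterations)
            : base(password, salt, iterations) { }

        // ... the GetBytes(int cb) override shown above goes here ...
    }

With that in place, a call like GetBytes(32) is satisfied internally by two block-aligned 20-byte requests, with the last 8 bytes of the second block simply discarded.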

2. Then I started debating the value of using a 20-byte hash to generate 32 bytes of data. The algorithm seemed just as applicable with any size of hash, right? Sure enough… So the second, and ultimately preferred, solution was to re-implement the algorithm with a pluggable hash. This gave birth to the HashDerivedBytes<THash> implementation, which accepts as THash any HMAC (keyed hash) derived implementation. With this I’m able to force the use of the managed SHA256 hash routine and raise the iteration count by nearly an order of magnitude without substantial performance loss. In the end I’m pleased with the implementation, and I can verify its compliance with the original algorithm by comparing its output when using SHA1 with that of the original Rfc2898DeriveBytes, which produces compatible results.
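
This isn’t the actual HashDerivedBytes<THash> source from the library; it’s just a sketch of the PBKDF2 (RFC 2898) algorithm with the PRF made pluggable, which is essentially the idea that class implements. The static helper, its name, and its parameters are my own:

    using System;
    using System.Security.Cryptography;

    // PBKDF2 over an arbitrary, already-keyed HMAC (the HMAC key is the password).
    public static class Pbkdf2Sketch
    {
        public static byte[] DeriveBytes(HMAC prf, byte[] salt, int iterations, int bytesWanted)
        {
            int hashLen = prf.HashSize / 8;                 // 20 for SHA-1, 32 for SHA-256
            int blocks = (bytesWanted + hashLen - 1) / hashLen;
            byte[] output = new byte[bytesWanted];
            byte[] saltAndIndex = new byte[salt.Length + 4];
            Array.Copy(salt, saltAndIndex, salt.Length);

            for (int i = 1; i <= blocks; i++)
            {
                // INT(i): 4-byte big-endian block index appended to the salt
                saltAndIndex[salt.Length + 0] = (byte)(i >> 24);
                saltAndIndex[salt.Length + 1] = (byte)(i >> 16);
                saltAndIndex[salt.Length + 2] = (byte)(i >> 8);
                saltAndIndex[salt.Length + 3] = (byte)i;

                byte[] u = prf.ComputeHash(saltAndIndex);   // U1
                byte[] t = (byte[])u.Clone();               // T_i accumulates U1 xor U2 xor ... Uc

                for (int c = 1; c < iterations; c++)
                {
                    u = prf.ComputeHash(u);                 // U_{c+1} = PRF(password, U_c)
                    for (int k = 0; k < hashLen; k++)
                        t[k] ^= u[k];
                }

                int offset = (i - 1) * hashLen;
                Array.Copy(t, 0, output, offset, Math.Min(hashLen, bytesWanted - offset));
            }
            return output;
        }
    }

Key an HMACSHA1 with the UTF-8 bytes of the password, feed it the same salt and iteration count, and the output should line up byte-for-byte with a single Rfc2898DeriveBytes.GetBytes() call; swap in an HMACSHA256 and you get the 32-byte-block variant.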

Anyway, if you have problems with the Rfc2898DeriveBytes class I’d like to hear from you, just to make sure I’m not crazy. And now you know of two possible solutions if you do have problems ;)
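
If you want to check your own environment, the failing check in my tests boiled down to something like the comparison below (a minimal sketch, not my actual unit test; on a healthy setup it should print “keys match”):

    using System;
    using System.Security.Cryptography;

    class Rfc2898Check
    {
        static void Main()
        {
            byte[] salt = new byte[16];
            for (int i = 0; i < salt.Length; i++)
                salt[i] = (byte)i;                       // any fixed salt will do

            // Same password, salt, and iteration count should always yield the same key.
            byte[] a = new Rfc2898DeriveBytes("p@ssw0rd", salt, 1000).GetBytes(32);
            byte[] b = new Rfc2898DeriveBytes("p@ssw0rd", salt, 1000).GetBytes(32);

            Console.WriteLine(Convert.ToBase64String(a) == Convert.ToBase64String(b)
                ? "keys match" : "keys differ!");
        }
    }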
