Design

 

What is Design?
Design is the results of decisions made both before and during the coding process that effect more than one component in your product/codebase.

What Design is Not?
Needless paperwork produced solely to satisfy some middle-manager’s project milestone.

 

Sometimes it is necessary to generate a password. This might be done to create a secure account on a machine, or to reset a user’s password via email (although a one-time use security token is a much better answer). Another possible use is to generate passwords for your own use on a website. There are lots of ways to achieve this, but the method below would be my approach…

/// <summary>
/// Creates a pseudo-random password containing the number of character classes
/// defined by complexity, where 2 = alpha, 3 = alpha+num, 4 = alpha+num+special.
/// </summary>
public static string GeneratePassword(int length, int complexity)
{
    System.Security.Cryptography.RNGCryptoServiceProvider csp =
        new System.Security.Cryptography.RNGCryptoServiceProvider();
    // Define the possible character classes where complexity defines the number
    // of classes to include in the final output.
    char[][] classes =
    {
        @"abcdefghijklmnopqrstuvwxyz".ToCharArray(),
        @"ABCDEFGHIJKLMNOPQRSTUVWXYZ".ToCharArray(),
        @"0123456789".ToCharArray(),
        @" !""#$%&'()*+,./:;<>?@[\]^_{|}~".ToCharArray(),
    };

    complexity = Math.Max(1, Math.Min(classes.Length, complexity));
    if(length < complexity)
        throw new ArgumentOutOfRangeException("length");

    // Since we are taking a random number 0-255 and modulo that by the number of
    // characters, characters that appear earilier in this array will recieve a
    // heavier weight. To counter this we will then reorder the array randomly.
    // This should prevent any specific character class from recieving a priority
    // based on it's order.
    char[] allchars = classes.Take(complexity).SelectMany(c => c).ToArray();
    byte[] bytes = new byte[allchars.Length];
    csp.GetBytes(bytes);
    for (int i = 0; i < allchars.Length; i++)
    {
        char tmp = allchars[i];
        allchars[i] = allchars[bytes[i]%allchars.Length];
        allchars[bytes[i]%allchars.Length] = tmp;
    }

    // Create the random values to select the characters
    Array.Resize(ref bytes, length);
    char[] result = new char[length];

    while(true)
    {
        csp.GetBytes(bytes);
        // Obtain the character of the class for each random byte
        for (int i = 0; i < length; i++)
            result[i] = allchars[bytes[i]%allchars.Length];

        // Verify that it does not start or end with whitespace
        if (Char.IsWhiteSpace(result[0]) || Char.IsWhiteSpace(result[(length - 1) % length]))
            continue;

        string testResult = new string(result);
        // Verify that all character classes are represented
        if (0 != classes.Take(complexity).Count(c => testResult.IndexOfAny(c) < 0))
            continue;

        return testResult;
    }
}

Essentially this method starts by creating a randomly ordered set of characters to choose from. The reason this is randomly ordered is that we will offset into this array by taking a random number from 0-255 modulo the length of the array. Because of the modulo operation this can weight the first set of characters higher so we randomize it.

Once we have a randomly ordered set of characters to choose from we use the RNGCryptoServiceProvider.GetBytes to populate a byte array with random values. Each random byte will be used to select a character in the output array.

Lastly we make a few assertions about the result. Since a space ‘ ‘ can be a viable password character we make sure that the password neither starts with or ends with whitespace. Having whitespace at the beginning or end would not only be confusing, but some input forms will automatically remove the trailing whitespace. The second thing we verify is that each requested character class is represented by at least one character. This ensures that an alpha-numeric password will contain at least one number and one letter.

This brings us to the ‘how long of a password to use’ kind of question. The following is a chart to display how many password characters are required to produce an equivelent key in bit-length:

case insensitive
(26 chars)
case sensitive
(52 chars)
alpha + numeric
(62 chars)
alpha + numeric + special
(92 chars)
*56-bit 12 10 9 7
*64-bit 14 12 11 10
128-bit 28 23 22 20
256-bit 55 45 43 40

*56-bit keys are capable of being brute forced in just a few hours on current specialized hardware. Distributed systems are also capable of cracking 56-bit encryption. 128-bit keys are considered a minimum key space for today’s cryptographic algorithms.

The table is actually somewhat surprising to me in how little difference the added numeric and special characters make. To make a password as secure as the AES/128-bit algorithm you need 23 upper and lower case characters. Saving one keystroke by adding numbers to the mix just doesn’t seem all that worth while to me. When you look at most people using just 8-12 characters for what they consider a ‘good’ password it almost makes you laugh.

It seems to me that most password complexity rules requiring special characters and numbers make little sense. I guess it prevents dictionary attacks to a degree, but clearly the length of the password is much more important.

 

Ok I swear this is the last B+Tree chart, graph, or metric I’ll post… at least for a while. I wanted to hit new limits so I fired up 6 threads on my six-core and let them rip… After three hours I finally stopped the run to look at the results.

Each of the six threads had inserted over 34 million records, a total of over 200 million records inserted in 3 hours averaging over 18k inserts per second. The data file had reached 5.79 GB (6,221,824,032 bytes). I guess that isn’t too bad considering I was using my slow 2tb WD caviar green (I didn’t want to thrash my SSD). Opening the file doesn’t do much other than to read the first block, from a cold start it took 62ms. The first thousand random seeks took an average of 10ms and thereafter settled in at around 9ms. Sequential scans of 1000 keys at a random offset takes about 500ms averaging .5ms each.

The following is a graph of one thread (all 6 near identical) showing the time in seconds (vertical) and total records inserted in millions (horizontal). Obviously the point being made here is that the line is straight :)

Time / Records
B+Tree Insert Progression
 
If you are wondering why these numbers don’t line up with the B+Tree benchmarks previously posted there are two reasons. First I totally screwed up the multi-threaded test for those numbers. The second difference was the amount of data/record size. The above graph shows inserts of an Int32 key and a Guid value or 20 byte records. The benchmark used a Guid key on a 100 byte value or 116 byte records. The major reason for reducing the record size was to improve throughput so I could target a larger number of records.

 

So as I mentioned in the previous post, all this work to build lockless queues is really a waste of time. Why, well to answer that question we need something to compare it against. Thus the class below derives from Queue and provides the locking necessary to make the queue thread-safe (well, not thread safe, only those two methods are safe, but you get the idea). After a few benchmarks I quickly realized that after 10 million queue/dequeue operations (on two threads) the performance delta was around 2 seconds. So our lock + event overhead costs us around 0.0002 ms per queue/dequeue operation. When you compare that 200 nanoseconds with your 100 millisecond network latency to get the request in the first place it would be absurd to use a custom ‘Lockless’ queue. Maybe, JUST maybe, you could find use for something like that if your developing games or something; however, what you wind up with in most cases will be a maintenance nightmare and a debugging hell on earth.

Go ahead now and build yourself a little test harness using this queue and the one from the previous post and see for yourself. The below implementation is superior in many ways, it doesn’t have a polling loop so response to an enqueue are potentially faster, it supports any number of producer/consumers so you can perform operations in parallel, and you basically get this behavior free (or near to).

Now this post is not to put down the effort of the new threading objects in .Net 4.0. This is really more about what you should, and should not attempt to do on your own. I’ll leave it for you to decide if the .Net 4.0 team wasted their time ;)

    class LockingQueue<T> : Queue<T>
    {
        ManualResetEvent _mre = new ManualResetEvent(false);
        public LockingQueue(int size) : base(size)
        { }

        public bool IsEmpty { get { return base.Count == 0; } }
            public new void Enqueue(T obj)
        {
            lock (this)
            {
                base.Enqueue(obj);
                _mre.Set();
            }
        }

        public bool TryDequeue(int timeout, out T value)
        {
            lock (this)
            {
                if (base.Count > 0)
                {
                    value = base.Dequeue();
                    return true;
                }
                _mre.Reset();
            }
            _mre.WaitOne(timeout, false);
            lock (this)
            {
                if (base.Count > 0)
                {
                    value = base.Dequeue();
                    return true;
                }
            }
            value = default(T);
            return false;
        }
    }
 

Most people I’ve seen online compute a simple hash of password + salt for persistence and authentication. This is the accepted standard in a straight-forward solution:

    byte[] Hash(string password)
    {
        byte[] pass = System.Text.Encoding.UTF8.GetBytes(password);
        //Create the salt to use
        byte[] salt = new byte[32];
        new RNGCryptoServiceProvider().GetBytes(salt);
        //Create the hash of password and salt
        HashAlgorithm hashAlgo = new SHA256Managed();
        hashAlgo.TransformBlock(salt, 0, salt.Length, salt, 0);
        hashAlgo.TransformFinalBlock(pass, 0, pass.Length);
        byte[] hash = _hashAlgo.Hash;
        hashAlgo.Initialize();
        //Copy the combined salt + hash to a single array
        byte[] result = new byte[salt.Length + hash.Length];
        Array.Copy(salt, result, salt.Length);
        Array.Copy(hash, 0, result, salt.Length, hash.Length);
        return result;
    }

The only deviation i’ve seen from this is to either use existing data (say a primary key in the database) for the salt, or to ‘hide’ the salt at some offset in the result. Both of these ideas are valid yet amount to not much more than a little obfuscation. I don’t recommend it.

Now if you REALLY want to secure your passwords and/or protect against brute force attacks there is another approach. The approach is quite common in generating crypto keys from passwords but can just as easily be used for hashing the password. The idea/concept is expressed in RFC2898 by the introduction of the iteration count. By introducing this into the password hash we increase the computational complexity required to test a password by several orders of magnitude. The straight-forward way of achieving this in the BCL is as follows:

    byte[] Hash(string password)
    {
        //Create the salt to use
        byte[] salt = new byte[20];
        new RNGCryptoServiceProvider().GetBytes(salt);
        //Combine the salt with the iteration-count
        byte[] iterationBytes = BitConverter.GetBytes(1010);
        byte[] iterationSalt = (byte[])salt.Clone();
        for (int i = 0; i < iterationSalt.Length; i++)
            iterationSalt[i] ^= iterationBytes[i % iterationBytes.Length];
        //Create the hash of password and salt
        DeriveBytes deriveBytes = new Rfc2898DeriveBytes(password, iterationSalt, 1010);
        byte[] hash = deriveBytes.GetBytes(20);
        //Copy the combined salt + hash to a single array
        byte[] result = new byte[salt.Length + hash.Length];
        Array.Copy(salt, result, salt.Length);
        Array.Copy(hash, 0, result, salt.Length, hash.Length);
        return result;
    }

This will work very well but will be more costly to calculate than the first algorithm. I would not want to use this in a system that verifies passwords on every request (i.e. Basic Auth over SSL); however, for token-auth systems this will prove much more durable to brute force attacks. By combining the iteration count to the salt (or key) you make the attacker repeat the entire iteration sequence for each iteration count attempted. This makes it significantly more difficult for them to steal the password hashs and crack the passwords without also stealing the software or otherwise knowing the iteration count. Even knowing the iteration count it's now much more expensive to compute each attempt at matching an entry in their password dictionary.

Recently I've added the PasswordHash class to wrap up some of this behavior. Note that this class does not combine the iteration count with the salt, yet still provides significant security benefits over a simple hash.

 

Well just got the release version of VS2010 installed and all I can say is “OMG”.  Not an OMG as in “OMG that is so cool!”, but more like an “OMG, are you serious?”.
 
Several things are standing out as wickedly wrong before I even open a solution…
 
My hardware is no longer sufficient.  I’m running on an HP Pavilion notebook about 2yrs old. Running a dual-core 2.2ghz (3gb RAM) with Vista SP2 32-bit.  Just moving my mouse through menus and around the UI, dragging windows, etc is painful and eating 30-50% of my CPU.  Now I know it’s not stellar machine but my mouse jumping several inches at a time is crazy unusable.  Since I don’t time or money to buy new hardware right now I’ve disable the Vista Areo and reverted to the Windows 2000 look and feel.  Things are moving much better now… Almost usable if ugly.  The scrolling speed of source code is still nasty slow. So word to the wise if you scoring below 3.0 on the “Windows Experience Index” for “3D business and gaming graphics performance” your probably in for a hardware update.
 
Tabbed documents, I hate them.  Unfortunately for me Microsoft is cramming another user experience change down my throat.  Like that @#^*ing ‘Ribbon bars’ interface in Office there is no way to turn it off in VS2010.  (I actually quit using MS Office and switched to OOo to avoid ribbons).  It’s cool that you make the source window float out of the app, but I’m still going to desperately miss the good-old-fashioned MDI view.
 
I’m sure that over the comming days, weeks, and years I’ll find much more to hate and love about about VS2010, but so far I’m not a fan. I guess I’ll go take some time to get to know VS2010 a little better and see if my initial reaction changes.  I hate to think what the next GUI design change from MS will entail for me … I guess I’m just old and set in my ways.
 
LOL, at least I’m not the only one: What happened to MDI capability in the editor?

 
The exponential cost of doing things "Right"

Sometimes a picture is worth a thousand words… I simply ‘cringe’ every time I hear a developer utter the words “I’m going to do this RIGHT!”. Why do we use the words “Right way” and “Wrong way” when describing things? The fact is that there is a lot more than just one “Wrong way” to [...]

 

There are three virtual methods that IMHO should have never been added to System.Object… • ToString() • GetHashCode() • Equals() All of these could have been implemented as an interface. Had they done so I think we’d be much better off. So why are these a problem? First let’s focus on ToString(): 1. If ToString() [...]

 

The answer is simple… only when the profiler tells you to. Optimizations often make code less reliable and often constrain implementations making them less flexible. Good software performance is not created by making lots of micro optimizations throughout your code. A good design from the architectural point of view is the critical key to success [...]

 

http://97-things.near-time.net/wiki/97-things-every-software-architect-should-know-the-book This wiki was developed and published in a book available from O’Reilly Press. So far it’s a great read put together by the collective experiences of many seasoned developers. My own favorite so far is: Simplicity before generality, use before reuse. I think the idea of capturing this all by wiki is entirely awesome. [...]

 

I’ve been reviewing a lot of blog sites lately reguarding testing and unit test practices. One I ran across this morning entitled “Extract and Override refactoring techinque” caught my eye on Taswar Bhatti’s Blog. Don’t get me wrong here, I tried to read this objectively and understand where he was coming from; however, I am [...]