So I ran into this article entitled “How to Create a Personal Encryption Scheme to Easily Hide Your Data in Plain Sight” on Lifehacker.
I just want to say:
This is a REALLY bad idea.
The recommendation is to create your very own personal ‘encryption’ system to encode your personal information in plain sight. The problem is they don’t suggest using encryption at all; rather, they suggest obfuscating the data. So what is the difference? Well, let’s first talk about what obfuscation is:
Obfuscation (or beclouding) is the hiding of intended meaning in communication, making communication confusing, wilfully ambiguous, and harder to interpret.
So the key thing to take away is that obfuscation makes information difficult to interpret. Now let’s look at the definition of encryption:
In cryptography, encryption is the process of transforming information (referred to as plaintext) using an algorithm (called a cipher) to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key.
So encryption makes data “unreadable” whereas obfuscation makes it “difficult”. That is a good start for a definition, but the lines are somewhat gray. There are ‘encryption’ systems that have proven weak, such as DES. So if I can decrypt cipher-text encoded by DES, does that mean DES is just obfuscation? Hrmm. Well, if you look at this another way, even strong algorithms (AES, RSA, etc.) can be brute forced given enough time and resources. So does this extend to these algorithms as well? Are all encryption algorithms then just a means of obfuscation at varying levels of difficulty? Well, strictly speaking, yes; however, the level of difficulty to brute force AES is so far past our current abilities that it is, for now, unbreakable in practice.
So what is Obfuscation?
If we intend to differentiate obfuscation from encryption, we need a better definition for both. The above definitions are Wikipedia’s; here is what I would add:
Definition of Obfuscation: A process applied to information to intentionally make it difficult to reverse without knowing the algorithm that was applied.
In other words, knowing the process or algorithm that was used makes obfuscation significantly easier to decipher. All of the examples in the Lifehacker article are subject to this. Once I know the basic idea behind your obfuscation technique, it’s easy to defeat. Take for example a letter substitution table. Let’s say we assign each letter to another random letter: a=j, b=f, c=u, etc. Now encode this post with that substitution table. It would be a trivial thing to decode even without knowing your ‘secret’ letter substitution table. Why? Well, just google it: letter substitution solver.
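To see why a substitution table is so easy to attack, here is a minimal Python sketch. The table, the seed, and the helper names (`make_table`, `substitute`, `letter_counts`) are all made up for illustration; this is not the scheme from the Lifehacker article. It only shows that swapping letters one-for-one leaves the letter frequencies intact, which is the kind of pattern a solver can lean on.

```python
import random
import string
from collections import Counter

def make_table(seed):
    """Build a random one-to-one letter substitution table (a toy 'personal scheme')."""
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(letters, shuffled))

def substitute(text, table):
    """Replace each lowercase letter via the table; leave everything else alone."""
    return "".join(table.get(ch, ch) for ch in text.lower())

def letter_counts(text):
    """Letter frequencies, sorted from most to least common."""
    return sorted(Counter(ch for ch in text if ch.isalpha()).values(), reverse=True)

plaintext = "the recommendation is to create your very own personal encryption system"
table = make_table(seed=2012)  # hypothetical 'secret' table
ciphertext = substitute(plaintext, table)

print(ciphertext)
# The frequency profile survives the substitution unchanged.
print(letter_counts(plaintext) == letter_counts(ciphertext))  # prints: True
```

The most common cipher letter still stands for the most common plain letter, so with enough text an attacker can start guessing the table without ever seeing it.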
So what is encryption?
Definition of Encryption: A process applied to information that, even knowing the algorithm applied, requires a secret (key) to reverse it in a reasonable amount of time.
There, now that seems to fit; even cryptographically weak algorithms like DES fit this description. They make it hard to decipher, even knowing the algorithm applied, unless you have the key. So this helps delineate the differences between obfuscation and encryption. But even if you use something that fits into the encryption bucket, that does not make your data secure.
Measuring data security:
Data or information security is measured in time, and each of the encryption algorithms we use has a measurable amount of time it would take to break it. This ‘time’ assumes no weaknesses in the key generation or the implementation. (Note: I’m not interested in chosen plain-text, related key, side channel, and other attacks that rely on behavior of the implementation; I’m talking about cipher-text at rest.) This time can and does change: computers become faster, weaknesses in the algorithm itself are exposed, etc. This is very similar to what happened to DES in 1997; things change, and what had seemed impossible suddenly became very feasible.
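As a rough back-of-the-envelope illustration of measuring security in time, the Python sketch below estimates how long an exhaustive key search would take on average. The guess rate is an assumption I picked purely for illustration, not a measurement of any real hardware, and `years_to_search` is a made-up helper name.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def years_to_search(key_bits, guesses_per_second):
    """Average years to find a key by brute force (half the keyspace on average)."""
    keyspace = 2 ** key_bits
    return (keyspace / 2) / guesses_per_second / SECONDS_PER_YEAR

RATE = 1e12  # assumed: one trillion guesses per second, purely illustrative

for name, bits in [("DES", 56), ("AES-128", 128)]:
    print(f"{name}: about {years_to_search(bits, RATE):.3g} years")
```

At that assumed rate the 56-bit DES keyspace falls in a matter of hours, while the 128-bit figure comes out far longer than the age of the universe. Change the rate and the numbers move, which is exactly the point: the ‘time’ is not fixed.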
Ultimately no data is ‘perfectly secure’, even using a very good algorithm like AES. You should be aware that most of the time your data is more at risk from the weak passwords used to create an encryption key than from weaknesses in the algorithm itself. This is why there are hundreds if not thousands of ways to crack WinZip passwords. When passwords are not involved, the generated pseudo-random keys require storage (you can’t remember them), and therefore the security of that storage location becomes the problem (see Keep it secret, Keep it safe).
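Here is a small Python sketch of the weak-password problem, using PBKDF2 from the standard library's `hashlib`. It does not model WinZip or any real file format. The password, the tiny wordlist, the low iteration count, and the shortcut of comparing derived keys directly are all assumptions made to keep the demo short.

```python
import hashlib
import os

ITERATIONS = 1_000  # kept low so the demo runs quickly; real systems use far more

def derive_key(password, salt):
    """Stretch a password into a 256-bit key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=32
    )

def guess_password(target_key, salt, wordlist):
    """Try each candidate password; return the one that reproduces the key."""
    for candidate in wordlist:
        if derive_key(candidate, salt) == target_key:
            return candidate
    return None

salt = os.urandom(16)
key = derive_key("letmein", salt)  # a 256-bit key, but from a guessable password

# Hypothetical attacker wordlist of common passwords.
common_passwords = ["123456", "password", "qwerty", "letmein", "dragon"]
print(guess_password(key, salt, common_passwords))  # prints: letmein
```

The key is a full 256 bits long, yet the attacker only has to search the list of likely passwords, not the keyspace of the cipher.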
Using a Personal Encryption Scheme?
Now back to the story we started with, How to Create a Personal Encryption Scheme to Easily Hide Your Data in Plain Sight. This, as I’ve already said, is a REALLY bad idea. So what should you do? There are hundreds of “Secure Note” applications for almost every platform still in use today. Pick one that allows full text passwords (not 4 digits) and use it. Anything that uses 4 digits is insecure by design, having a keyspace of only 10,000 unique possibilities. By contrast, modern ciphers such as AES have a keyspace of at least 2^128 or 340,282,366,920,938,463,463,374,607,431,768,211,456 unique possibilities.
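To put the 4-digit problem in concrete terms, this Python sketch simply tries every PIN. The `pin_fingerprint` function is a stand-in I made up for however a hypothetical app checks a PIN; it is not how any particular product works, and the PIN value is arbitrary.

```python
import hashlib
from itertools import product

def pin_fingerprint(pin):
    """Stand-in for whatever check a hypothetical app uses to accept a PIN."""
    return hashlib.sha256(pin.encode("ascii")).hexdigest()

def crack_pin(target):
    """Try every 4-digit PIN from 0000 to 9999; return the match and attempt count."""
    for attempt, digits in enumerate(product("0123456789", repeat=4), start=1):
        pin = "".join(digits)
        if pin_fingerprint(pin) == target:
            return pin, attempt
    return None, 10 ** 4

pin, attempts = crack_pin(pin_fingerprint("4821"))
print(pin, attempts)      # prints: 4821 4822
print(10 ** 4, 2 ** 128)  # the two keyspaces side by side
```

The whole search finishes almost instantly on any modern machine, which is why a 4-digit secret adds so little no matter how good the cipher behind it is.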