h@ck3d: Breaking The Code: Brute Forcing The Encryption Key

Introduction

There's no way around it, cryptography is an aspect of our digital lives that's becoming more and more prevalent. It's because we interact in a vast social network that is the internet where we enter our personal information into countless profile pages and make the majority of our purchases online that we have an increasing need to focus on cyber security and cryptography. But at the same time that cryptography has great potential in securing our information, it's just as vulnerable to attack.

In order to illustrate the points set forth by the author, we will be focusing on a single encryption cipher – the simple-substitution cipher. We will demonstrate software implemented for the purposes of encrypting, decrypting, but also breaking such encryption. As a result, the knowledge imparted through this article can and should be used as a stepping stone towards re-thinking cryptography and how we use it to secure information. 
Cipher Basics

The simple substitution cipher – generally considered weak encryption – was known for offering a relatively secure means of hiding information back when it was first introduced. This style of encryption worked by – as it's name states – shifting the clear text a certain number of positions right or left of the original character. We refer to this shift value as the encryption key. For example, the letter A shifted four spaces to the right gives us the letter E. Using this method, we could shift all the letters of the English alphabet four spaces to create what's called a cipher alphabet. This cipher alphabet could then be used to encrypt longer pieces of information.

We will be using a variant of the simple-substitution cipher in that we don't limit the encryption key to 25 (because we start counting from 0 in C++) but rather allow the shift to occur past the 26th letter of the alphabet. This will become clearer as the encryption and decryption algorithms are explained in the following section. Within a computer, each character is referenced by a number, this number is referred to as the character's ASCII value. To encrypt each character we need to use its ASCII value so we know from where in the alphabet the shift needs to occur. So to look at our previous example, the ASCII value of A is 65 and when shifted 4 spaces becomes 69 or E; figures 1 and 2 demonstrate this more mathematically.

Figure 1: Calculating Cipher Text
X = (O + K)
where:
X is the encrypted letter
O is the original letter
K is the encryption key

Figure 2: Calculating Cipher Text in Practice
69 or E = (A + 4) or (65 + 4)
where:
E is the encrypted letter
A is the original letter
4 is the encryption key

Understanding the Algorithms

It's time to put our theory into practice, listing 1 shows the encryption algorithm we wrote for the demonstration software. In order to encrypt the clear text we need to loop through the entire string and process each character one by one. The for loop starts by grabbing the first character of the clear text, converting to its ASCII value and shifting to the value of the encryption key. Once the character has been shifted, the encrypted value is then converted back to a character and concatenated to a string variable that will store the encrypted text.

The algorithm continues to execute until the end of the clear text at which point the loop exits and we are left with the encrypted text printed to the screen. As previously mentioned, the decryption algorithm works just like its encryption counter part except everything is reversed. Instead we loop over the encrypted text and subtract the encryption key from the encrypted character's ASCII value to once again recover the clear text, much like we saw in the previous section. The decryption algorithm is documented in listing 2.

Listing 1: Encryption Algorithm
for (int i = 0; i <= clearText.length; i++)
{
    currentChar = clearText[i];
    currentInt = int(currentChar);
    encryptedInt = currentInt + encryptionKey;
    encryptedChar = char(encryptedInt);
    encryptedText += encryptedChar;
}

Listing 2: Decryption Algorithm
for (int i = 0; i <= encryptedText.length; i++)
{
    currentChar = encryptedText[i];
    currentInt = int(currentChar);
    clearInt = currentInt - encryptionKey;
    clearChar = char(clearInt);
    cleartext += clearChar;
}

Following is a sample run of the encryption and decryption algorithms. Listing 3 shows how the message “the quick brown dog” gets encrypted and listing 4, the decryption. Taking a closer look at the encrypted text we see that the word length is not reflected or is more difficult to visualize when encrypted as opposed to when looking at normal English. Consider this example from a more cryptanalytic perspective for a moment. If this were the only text we had to work with, a red flag already has to be raised. Because there are multiple instances of the dollar sign in the encrypted text, we can assume that this character represents a letter in the original text or another widely used character. But just by looking at the ratio of the dollar sign to the other encrypted characters, we know that it doesn't represent a letter, mainly because the dollar sign's ASCII value is greater than any of the “A” to “Z” or “a” to “z” characters. Knowing this, it's a safe bet to assume that the dollar sign represents a space. And subsequently, we can now gain a better idea of where the word lengths occur in the example. While we have now entered into the realm of frequency analysis, it's good to point these things out as the simple-substitution cipher wears its flaws for public display.

                 Listing 3: Encrypting A Message

Listing 4: Encrypting A Message

Algorithmic Attacks
Our previous discussions have centered around the fact that we knew the encryption key. Most likely, that will not be the case and we will only have the encrypted message to work with.
When this is the case, there are a couple of attacks we can use. One – and the most widely used – is frequency analysis as we previously alluded to, the other is brute force. Now while it's not our goal to discuss how to use frequency analysis to break encryption, the process basically requires that we find the frequency of the characters used in the encrypted text along with the characters used in the clear text's language – this is usually English.

The second attack and the one we are going to learn how to use is brute force. In regard to the simple-substitution cipher, brute forcing simply involves using a range of numbers to test which one if any is the encryption key. We do this process until the encryption key is found and/or we are able to decipher the encrypted message. The brute forcing algorithm is described in listing 5.
Much like the algorithms we previously looked at, this algorithm also loops over the encrypted text only instead of using a known encryption key, the algorithm takes a number from the user to use as a range starting from 0.

Listing 5: Brute forcing algorithm
for (int i = minRange; i <= encryptedText.length; i++)
{
    for each (char currentChar in encryptedText)
    {
        currentASCII = int(currentChar);
        clearASCII = currentASCII - i;
        clearChar = char(clearASCII);
        clearText += clearChar;
    }

    cout << clearText << endl;
}

You'll also notice that we are also using a set of loops in this algorithm as opposed to before where we only needed one. Reason being, the outer loop is giving us the potential encryption key while the inner loop is using the potential encryption key on the encrypted text. Once execution reaches the inner loop, it works much like our decryption algorithm in that we are still processing character by character but the difference is that we are storing each potentially decrypted message in a string array. Lastly the outer loop prints out each index of the string array to the console window for viewing.

Listing 6 shows a sample run using the same example as before. The list format makes it easier to determine which index contains the decrypted message.

                 Listing 6: Brute forcing in action

If on running the program, the range of numbers does not turn up the decrypted text, simply increase the number previously entered and re-run the program. This process can be repeated as often as needed until the deciphered message is displayed in the list. It's by writing such software that we don't have to concern ourselves with trying each number on it's own and can more easily break the encryption.

So What?

Why care about anything we just talked about? Similarly, it's also apparent that in order to demonstrate the process of breaking encryption, we used an obviously outdated algorithm. The point to all this though is the fact that software can be written as a tool for use in cryptanalysis and ultimately the breaking of encryption. Once a software tool is written that is capable of breaking encryption, the amount of time it would theoretically require to break the cipher leaves little in the way of an acceptable deterrent. This is the case because the rise of bot nets and super-computers substantially raises the potential processing power one has to work with and as a result, the greater the processing power, the smaller the amount of time to break the encryption.

So where are we to look to find a solution to the problem? Because we face a cyclical cycle of constantly developing newer technologies to secure information, there is no clear cut solution. As we push forward into the future of information security, it's best that we do away with the oldest of encryption algorithms – merely to keep them around for theoretical purposes – and focus on those that yield the best possible strength for the current security conditions. Finally, we must not forget that the truest sense of security comes down to our treatment of the encryption key. For the advancement of cryptography is inevitable but the person in control of the keys to such systems is not.

h@ck3d

Sunday, May 1, 2011

Breaking The Code: Brute Forcing The Encryption Key

No comments:

Post a Comment