Wednesday, May 4, 2011

Analysis of A Scam

Introduction

It's all too often that we hear about people being scammed on the internet, especially when using Craigslist, the popular website for buying and selling almost anything.  It seems as though the majority of the website has become devoted to messages warning us of the potential for getting scammed.  Recently, I received several scam emails, and one of them was quite believable.  Because there is such a high likelihood that we will end up dealing with email scams in the course of our internet use, this article is devoted to the analysis and steps necessary to determine whether or not you should even spend your time replying.  We will start by analyzing the content of such emails and end by pointing out just where to look in the email's header to determine the originating IP address and the physical location associated with that address.

Off to the Sp3ll!ng B33

For the most part, the easiest way to tell whether or not an email is worth reading is to look at a few obvious details: the email's formatting, spelling, grammar, and punctuation.  Figure 1 shows a portion of one of the emails I received.

                 Figure 1: The story

The text in figure 1 is peppered with errors of all kinds.  Notice that some sentences start with a capitalized word while others do not.  Also notice that some of the words are not even the right word or tense.  If you attempt to read it all the way through, the text really doesn't make much sense, and you have to ask yourself: why would someone be attempting to rent out an apartment from another country without going through a broker or real estate agent?

Figure 2 lists the actual information that the “landlord” needs in order to determine our ability to rent the apartment.  Yet while the information seems standard, most real estate agents don't ask the question “Are you Married?” 

                 Figure 2: The information
Furthermore, looking at the information we are being asked to supply, it's obvious that the scammer is attempting to indirectly gather identity information.  Worse yet, questions six and seven could be used as recovery questions for passwords.  This is where it becomes dangerous to reply to these emails, especially if you are new to dealing with scams.  For example, if you reply to the email with all your personal information without first asking “why?”, you have potentially put the security of your identity, usernames, and passwords at risk.

Having looked at an email that resembles an obvious scam, let's now turn our attention to an email that is a little more believable. 

                            Figure 3: A more believable approach

In figure 3, the first indicator is that the email is written in better English, with fewer glaring errors.  The scammer is also attempting to let the reader know why they didn't respond right away.  Moving on, figure 4 shows what could be considered the core of the email.

                 Figure 4: A more believable story   

Taking Steps To Protect Yourself  

Looking at the email alone does not always give a clear indication of whether or not you're dealing with a scam, simply because a well-designed website is all it takes to disguise a phishing attack.  That being the case, there is one piece of information that can help determine what you're dealing with: the originating IP address of the email.  To find it, we have to dig deeper into the email and track the address down through the email's header information.
 
                     Figure 5: Email header information
Received: from zap-server (***********@173.224.219.130 with login)

Figure 5 shows the originating IP address of the email.  To find it, first open the full header: click “Actions” and then “View Full Header” if you are using Yahoo! Mail, or click the down arrow in the upper right-hand corner of the email and then “Show original” if you are using Gmail.  Then look for the very last IP address, which tends to be the one closest to the bottom of the header information.  Now that we have the originating IP address, we can use a website to trace it back to its physical location.  There are many websites that can do this, but for demonstration purposes we have used ip-adress.com.
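The manual lookup described above can also be sketched in code.  What follows is only a minimal C++ sketch, not a full header parser: the function name originatingIp is hypothetical, it assumes each “Received:” header sits on a single line (real headers are often folded across several lines), and it only recognizes IPv4 addresses.

```cpp
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper: returns the IPv4 address found in the last
// "Received:" line of a raw header block, or "" if none is found.
// Assumes each "Received:" header occupies a single line.
std::string originatingIp(const std::string& rawHeaders)
{
    static const std::regex ipv4(R"((\d{1,3}\.){3}\d{1,3})");
    std::istringstream lines(rawHeaders);
    std::string line;
    std::string lastIp;

    while (std::getline(lines, line))
    {
        // Skip every line that does not start with "Received:".
        if (line.rfind("Received:", 0) != 0)
            continue;

        std::smatch match;
        if (std::regex_search(line, match, ipv4))
            lastIp = match.str();  // later lines overwrite earlier ones
    }
    return lastIp;
}
```

Passing the full header text to originatingIp returns the address from the bottom-most “Received:” line, which is the one to paste into the tracing website.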

Figure 6 shows the end result of our trace, including the geographic origin of the IP address.

                           Figure 6:  Tracing results 



Looking at the output in figure 6, we see that the email originated in Western Africa, most likely Nigeria.  Knowing this, we can make better decisions as to whether or not we should supply our personal information or even bother replying to the email.

Conclusions
 
As is usually the case, the best way to prevent such attacks is to avoid giving out your personal details in the first place.  It should also be noted that legitimate parties almost never ask for personal data through email.  If you are asked for personal information, make sure to verify the request with the person who is asking before filling out any web forms or submitting attached “rental applications.”  And if the email looks plausible, take the time to track down its originating IP address and check where the email actually came from.  Following the analysis we went through, as well as the steps for tracing the email's original location, will help you avoid having your identity stolen and will add a layer of protection to your personal information.

Sunday, May 1, 2011

Breaking The Code: Brute Forcing The Encryption Key

Introduction

There's no way around it: cryptography is an aspect of our digital lives that's becoming more and more prevalent. Because we interact in the vast social network that is the internet, entering our personal information into countless profile pages and making the majority of our purchases online, we have an increasing need to focus on cyber security and cryptography. But while cryptography has great potential for securing our information, it is also vulnerable to attack.

To illustrate these points, we will focus on a single encryption cipher: the simple-substitution cipher. We will demonstrate software written for encrypting and decrypting, but also for breaking such encryption. The knowledge imparted through this article can and should be used as a stepping stone towards re-thinking cryptography and how we use it to secure information.

Cipher Basics

The simple-substitution cipher, generally considered weak encryption today, was known for offering a relatively secure means of hiding information back when it was first introduced. The form we use here, commonly called a shift or Caesar cipher, works by shifting each character of the clear text a certain number of positions right or left of the original character. We refer to this shift value as the encryption key. For example, the letter A shifted four spaces to the right gives us the letter E. Using this method, we could shift all the letters of the English alphabet four spaces to create what's called a cipher alphabet. This cipher alphabet could then be used to encrypt longer pieces of information.

We will be using a variant of the simple-substitution cipher in that we don't limit the encryption key to 25 (because we start counting from 0 in C++) but rather allow the shift to occur past the 26th letter of the alphabet. This will become clearer as the encryption and decryption algorithms are explained in the following section. Within a computer, each character is represented by a number, referred to as the character's ASCII value. To encrypt each character, we need its ASCII value so we know where the shift starts. Returning to our previous example, the ASCII value of A is 65, which becomes 69, or E, when shifted 4 spaces; figures 1 and 2 demonstrate this more mathematically.

Figure 1: Calculating Cipher Text
X = (O + K)
where:
X is the encrypted letter
O is the original letter
K is the encryption key

Figure 2: Calculating Cipher Text in Practice
69 or E = (A + 4) or (65 + 4)
where:
E is the encrypted letter
A is the original letter
4 is the encryption key
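
The formula in figures 1 and 2 can be tried out directly in code.  The following is a minimal sketch; the function name shiftChar is an assumption made for illustration and is not part of the demonstration software.

```cpp
// Hypothetical helper: shifts one character by `key` positions using its
// ASCII value, i.e. X = O + K from figure 1.
char shiftChar(char original, int key)
{
    int originalValue = static_cast<int>(original);  // e.g. 'A' is 65
    int shiftedValue = originalValue + key;          // e.g. 65 + 4 = 69
    return static_cast<char>(shiftedValue);          // e.g. 69 is 'E'
}
```

Calling shiftChar('A', 4) returns 'E', and calling shiftChar('E', -4) reverses the shift and returns 'A'.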

Understanding the Algorithms

It's time to put our theory into practice. Listing 1 shows the encryption algorithm we wrote for the demonstration software. In order to encrypt the clear text, we need to loop through the entire string and process each character one by one. The for loop starts by grabbing the first character of the clear text, converting it to its ASCII value, and shifting that value by the encryption key. Once the character has been shifted, the encrypted value is converted back to a character and concatenated to a string variable that stores the encrypted text.

The algorithm continues to execute until it reaches the end of the clear text, at which point the loop exits and we are left with the encrypted text, which is printed to the screen. The decryption algorithm works just like its encryption counterpart, except everything is reversed: we loop over the encrypted text and subtract the encryption key from each encrypted character's ASCII value to recover the clear text. The decryption algorithm is documented in listing 2.

Listing 1: Encryption Algorithm
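// Encrypt the clear text one character at a time
// (assumes clearText and encryptedText are std::string).
for (std::size_t i = 0; i < clearText.length(); i++)
{
    currentChar = clearText[i];                 // character to encrypt
    currentInt = int(currentChar);              // its ASCII value
    encryptedInt = currentInt + encryptionKey;  // shift by the key
    encryptedChar = char(encryptedInt);         // back to a character
    encryptedText += encryptedChar;             // append to the result
}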

Listing 2: Decryption Algorithm
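// Decrypt the encrypted text one character at a time
// (assumes encryptedText and clearText are std::string).
for (std::size_t i = 0; i < encryptedText.length(); i++)
{
    currentChar = encryptedText[i];         // character to decrypt
    currentInt = int(currentChar);          // its ASCII value
    clearInt = currentInt - encryptionKey;  // undo the shift
    clearChar = char(clearInt);             // back to a character
    clearText += clearChar;                 // append to the result
}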
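
For readers who want to run the two algorithms, they can be wrapped into a pair of functions.  This is a minimal sketch rather than the demonstration software itself: the function names encrypt and decrypt are assumptions, and the text is assumed to be a std::string of plain ASCII characters.

```cpp
#include <string>

// Hypothetical wrapper around the encryption loop: shifts every character
// of clearText forward by encryptionKey positions.
std::string encrypt(const std::string& clearText, int encryptionKey)
{
    std::string encryptedText;
    for (char currentChar : clearText)
        encryptedText += static_cast<char>(currentChar + encryptionKey);
    return encryptedText;
}

// Hypothetical wrapper around the decryption loop: decrypting is the same
// as encrypting with the negated key.
std::string decrypt(const std::string& encryptedText, int encryptionKey)
{
    return encrypt(encryptedText, -encryptionKey);
}
```

With an encryption key of 4 (an assumed value, matching the A to E example), encrypt("the quick brown dog", 4) returns "xli$uymgo$fvs{r$hsk", and passing that result to decrypt with the same key returns the original message.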

Following is a sample run of the encryption and decryption algorithms. Listing 3 shows how the message “the quick brown dog” is encrypted, and listing 4 shows its decryption. Taking a closer look at the encrypted text, we see that word lengths are harder to make out than in normal English. Consider this example from a more cryptanalytic perspective for a moment. If this were the only text we had to work with, a red flag would already be raised. Because the dollar sign appears several times in the encrypted text, we can assume that it stands for a widely used character in the original text. It is unlikely to be a letter, though, mainly because the dollar sign's ASCII value (36) is lower than that of any of the “A” to “Z” or “a” to “z” characters (65 and above), so a small shift of a letter could not have produced it. Knowing this, it's a safe bet that the dollar sign represents a space, and we now have a better idea of where the word boundaries fall in the example. While we have now entered the realm of frequency analysis, it's good to point these things out, as the simple-substitution cipher wears its flaws on public display.

                 Listing 3: Encrypting A Message


                 Listing 4: Decrypting A Message


Algorithmic Attacks
Our previous discussions have centered on the fact that we knew the encryption key. Most likely, that will not be the case, and we will only have the encrypted message to work with.
When this is the case, there are a couple of attacks we can use. One, and the most widely used, is frequency analysis, which we alluded to previously; the other is brute force. While it's not our goal to discuss how to use frequency analysis to break encryption, the process basically requires that we count how often each character appears in the encrypted text and compare those counts with the known character frequencies of the clear text's language, which is usually English.
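
To make the idea of counting characters concrete, here is a minimal sketch of the very first step of frequency analysis.  The function names are hypothetical, and the key guess rests on a stated assumption: that the most frequent character in the encrypted text is an encrypted space, which holds for much ordinary English text but is not guaranteed.

```cpp
#include <map>
#include <string>

// Returns the character that occurs most often in text (the lowest such
// character in the map's ordering on a tie), or '\0' for an empty string.
char mostFrequentChar(const std::string& text)
{
    std::map<char, int> counts;
    for (char c : text)
        ++counts[c];

    char best = '\0';
    int bestCount = 0;
    for (const auto& entry : counts)
    {
        if (entry.second > bestCount)
        {
            best = entry.first;
            bestCount = entry.second;
        }
    }
    return best;
}

// Guesses the key under the assumption that the most frequent encrypted
// character stands for a space in the clear text.
int guessKeyFromSpace(const std::string& encryptedText)
{
    return mostFrequentChar(encryptedText) - ' ';
}
```

For the encrypted text "xli$uymgo$fvs{r$hsk", the most frequent character is the dollar sign, and guessKeyFromSpace returns 4, because the dollar sign (36) sits four positions above the space (32).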

The second attack, and the one we are going to learn how to use, is brute force. For the simple-substitution cipher, brute forcing simply involves trying every number in a range to test which one, if any, is the encryption key. We repeat this process until the encryption key is found and we are able to decipher the encrypted message. The brute forcing algorithm is described in listing 5.
Much like the algorithms we previously looked at, this algorithm also loops over the encrypted text, only instead of using a known encryption key, it takes a number from the user and tries every key in the range from 0 up to that number.

Listing 5: Brute forcing algorithm
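// Try every candidate key in the range chosen by the user.
// maxRange (assumed name) is the upper bound entered by the user.
for (int i = minRange; i <= maxRange; i++)
{
    clearText = "";  // start each candidate from an empty string

    // Decrypt the whole message with candidate key i.
    for (char currentChar : encryptedText)
    {
        currentASCII = int(currentChar);
        clearASCII = currentASCII - i;
        clearChar = char(clearASCII);
        clearText += clearChar;
    }

    cout << clearText << endl;
}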

You'll also notice that this algorithm uses a pair of nested loops, whereas before we only needed one. The outer loop supplies each potential encryption key, while the inner loop applies that key to the encrypted text. Once execution reaches the inner loop, it works much like our decryption algorithm in that we are still processing character by character; the difference is that we are storing each potentially decrypted message in a string array. Lastly, the outer loop prints each index of the string array to the console window for viewing.
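
The description above, with one stored candidate per key, can be sketched as a single function.  This is an illustrative sketch rather than the demonstration software: the name bruteForce and the parameter maxKey are assumptions, and a std::vector stands in for the string array.

```cpp
#include <string>
#include <vector>

// Hypothetical brute-force helper: tries every key from 0 to maxKey and
// returns one candidate per key, so candidates[k] is the encrypted text
// decrypted with key k.
std::vector<std::string> bruteForce(const std::string& encryptedText, int maxKey)
{
    std::vector<std::string> candidates;
    for (int key = 0; key <= maxKey; ++key)
    {
        std::string clearText;  // fresh candidate for this key
        for (char currentChar : encryptedText)
            clearText += static_cast<char>(currentChar - key);
        candidates.push_back(clearText);
    }
    return candidates;
}
```

Calling bruteForce("xli$uymgo$fvs{r$hsk", 10) returns eleven candidates, and the entry at index 4 reads "the quick brown dog".  Printing each index next to its candidate gives the list format that makes the correct key easy to spot.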

Listing 6 shows a sample run using the same example as before. The list format makes it easier to determine which index contains the decrypted message.

                 Listing 6: Brute forcing in action


If the range of numbers does not turn up the decrypted text when the program is run, simply increase the number previously entered and re-run the program. This process can be repeated as often as needed until the deciphered message is displayed in the list. By writing such software, we don't have to concern ourselves with trying each number on its own and can more easily break the encryption.

So What?

Why care about anything we just talked about? After all, it's apparent that in order to demonstrate the process of breaking encryption, we used an obviously outdated algorithm. The point, though, is that software can be written as a tool for cryptanalysis and ultimately for breaking encryption. Once such a tool exists, the amount of time it would theoretically require to break the cipher leaves little in the way of an acceptable deterrent. This is because the rise of botnets and supercomputers substantially raises the processing power an attacker has to work with, and the greater the processing power, the less time it takes to break the encryption.

So where are we to look for a solution to the problem? Because we face a cycle of constantly developing newer technologies to secure information, there is no clear-cut solution. As we push forward into the future of information security, it's best that we retire the oldest encryption algorithms, keeping them around merely for theoretical purposes, and focus on those that yield the best possible strength for current security conditions. Finally, we must not forget that the truest sense of security comes down to our treatment of the encryption key. The advancement of cryptography is inevitable, but the person in control of the keys to such systems is not.