CompTIA CySA+ CS0-002 – Mitigate Web Application Vulnerabilities and Attacks part 3

  1. Secure Coding (OBJ 2.2)

Secure coding. In this lesson, we are going to talk about some secure coding best practices: input validation, output encoding, and parameterized queries. First, let’s talk about input validation. Now, I know I’ve mentioned how important it is when I talked about XML, SQL, and directory traversals, and we kept saying input validation was important, but we never really defined it. Well, input validation is any technique that you use to ensure the data entered into a field or a variable in an application is handled appropriately by that application. Essentially, when I take something from a user, I want to make sure it’s actually valid. If you go to my website and you try to enter your username or your password, I want to make sure you’re giving me valid data before I send it to my database to try and see if it works.

So when we talk about input validation, this can be conducted in two ways. It can be conducted locally on your client or remotely on the server that you’re trying to access. But I want to give you a big warning here. While client-side validation was used a lot in the past, you really should be careful using it. Client-side input validation is much more dangerous because it’s vulnerable to malware interference. If I’m testing something on your computer before you send it to my server, I’m now trusting that your computer is doing what I told it to do. And that isn’t always the case, because attackers can modify that check and malware can modify it too.

So you may be wondering, why don’t we just use server side for everything? Well, server-side input validation can be time and resource intensive, because every time I have to check something, that’s processor cycles on the server. And while your computer is only serving you, my server is serving hundreds of thousands of people. And so that can take up a lot of resources. Now, let me give you a good example of this. Let’s say you went to my website and you’re trying to buy your exam voucher for your CySA+ exam. You get to the checkout form and it asks for your email, and you type in jason@notmydomain, but you didn’t put .com. Notice here, it says this is not a valid email. This isn’t being checked by my server, this is being checked by your client.

It knows that the format has to be something@something.something, right? It knows that’s what an email looks like. And since you didn’t do that, it’s not going to validate it and it won’t let you submit this form. Now, if you just added .com here, even though that’s not a real email address, it would still accept it, because the client side said that’s good enough, it just needs to be something@something.com. Now, input that you validate on the client still needs to undergo server-side validation after passing the client-side validation. For example, a lot of websites will validate on your client that the input looks like it’s in a valid email format. Then you send it to the server and they’re going to check that it’s actually a live email address before they process your order.

That’s a server-side validation as well. So it doesn’t have to be either/or, you can use both. Now, let’s talk about input here for a second. In addition to doing input validation, we also want to do what’s called normalization or sanitization. Now, what is this? Well, normalization is when a string is stripped of all the illegal characters or substrings and converted to an accepted character set. For instance, when you go to my website and you enter your name, I expect it to be something that has the alphabet in it: A, B, C, D, E. I don’t expect a bunch of numbers or special characters. And so if you put in numbers or special characters, the system should do input validation and either reject the input or normalize it and remove those characters. That’s the idea of normalization.
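To make the format check and the normalization step concrete, here is a minimal server-side sketch in Python. It is not the code behind the checkout form in the lesson. The function names, the something@something.something pattern, and the rule that a name keeps only letters and spaces are all assumptions made for illustration, and a real site would still confirm that the address is live.

```python
import re
import unicodedata

# Assumed format rule: something@something.something, as described in the lesson.
EMAIL_FORMAT = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_valid_email_format(value: str) -> bool:
    """Return True only if the whole string looks like local@domain.tld."""
    return EMAIL_FORMAT.fullmatch(value) is not None


def normalize_name(value: str) -> str:
    """Convert to one Unicode form, keep only letters and spaces, collapse spacing."""
    canonical = unicodedata.normalize("NFKC", value)
    kept = "".join(ch for ch in canonical if ch.isalpha() or ch == " ")
    return " ".join(kept.split())


if __name__ == "__main__":
    print(is_valid_email_format("jason@notmydomain"))      # missing the .com part
    print(is_valid_email_format("jason@notmydomain.com"))  # passes the format check
    print(normalize_name(" Jason   Dion42! "))
```

Passing the format check only proves the shape is right, which is why the lesson still has the server confirm the address before processing the order.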

Now, again, I’ve mentioned in the past that the reason we have to do input validation is because we want to prevent people from attacking us. And some of those attacks are things like the directory traversals we mentioned. Well, one of the attacks that people can use against us is known as a canonicalization attack. Now, canonicalization is an attack method where input characters are encoded in such a way as to evade vulnerable input validation measures. So I’ll go back to the example of a directory traversal. I’m going to send something that looks like this: diontraining.com?user=../etc/config. Now, if I don’t want to fall victim to an attack like this, what can I do? I can normalize that data, I can sanitize it and say, oh, anytime I see dot-dot-slash, just ignore it.

We’re not going to take it. We can either reject it or we can just remove that part and pass only the etc/config part. Now, if somebody tries to send this as their username, we’re not going to process it, or if we do, we’re not going to fall victim to the attack because we’ve removed those dot-dot-slashes. Now, what happens if they send me this instead? Well, they have diontraining.com?user=%2E%2E%2F, you get the idea, followed by etc/config. What is this? Well, they encoded their input to try to get past my validation system. So I need to make sure that I’m aware of this type of attack, because I would need to sanitize this input not just for ../, but for every variation of dot-dot-slash that may be produced by different encoding mechanisms. So this is the idea of how you can prevent these canonicalization attacks.
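One way to handle every encoded variation at once is to decode the input to its canonical form first and only then look for dot-dot-slash. The sketch below does that in Python with the standard urllib.parse.unquote call. The function names and the limit of five decoding rounds are assumptions for illustration, not something the lesson specifies.

```python
from urllib.parse import unquote


def canonicalize(value: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly until the value stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            return decoded
        value = decoded
    # Still changing after max_rounds: treat deeply nested encoding as hostile.
    raise ValueError("input is still encoded after too many decoding rounds")


def is_safe_user_value(raw: str) -> bool:
    """Reject any value whose canonical form contains a '..' path segment."""
    try:
        canonical = canonicalize(raw)
    except ValueError:
        return False
    # Treat backslashes like forward slashes so both separators are covered.
    segments = canonical.replace("\\", "/").split("/")
    return ".." not in segments


if __name__ == "__main__":
    for candidate in ("jason", "../etc/config", "%2E%2E%2Fetc/config"):
        print(candidate, is_safe_user_value(candidate))
```

Decoding before checking is the point here: a filter that only looks for the literal characters ../ never sees the %2E%2E%2F form.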

Next, we want to talk about output encoding. Output encoding is any coding method that’s used to sanitize the output by converting the untrusted input into a safe form, where the input is displayed as data to the user without executing as code in the browser. So let me give you a good example of this. Let’s say you took that input from me with all those dot-dot-slashes, and then you outputted it back to the user and said, “Error: username incorrect. Username cannot be,” followed by whatever I typed. Well, what are you doing there? You’re giving somebody the ability to put code into your web page. Because if I put something like a script tag in as my username, and then you echoed that out and gave it back to me, your page would say <script>alert('XSS')</script> is not a valid username.

What did you just do? You put the script into your website and displayed it back to me. Output encoding prevents that. So for example, instead of outputting an ampersand, you would convert that to &amp;. If you have a less-than sign, you would make that into &lt;. Why? Because the less-than symbol is one of the things we use to put the word script inside our brackets: less-than symbol, script, greater-than symbol. So if we change that to &lt;script&gt;, it won’t execute when you display it back to the user. That’s the idea of output encoding. Output encoding is used to mitigate code injection and cross-site scripting attacks that attempt to use input to run a script.
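As a small illustration of that conversion, Python’s standard html.escape function performs exactly this kind of encoding. The render_username_error function and its message are hypothetical, modeled on the error message in the lesson.

```python
import html


def render_username_error(username: str) -> str:
    """Build the error message with the untrusted username HTML-encoded."""
    # html.escape turns &, <, > and quote characters into entities such as &amp; and &lt;
    safe = html.escape(username, quote=True)
    return f"<p>{safe} is not a valid username.</p>"


if __name__ == "__main__":
    print(render_username_error("<script>alert('XSS')</script>"))
```

The browser now shows the script tag as text instead of running it, because the angle brackets arrive as &lt; and &gt;.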

That’s what we’re trying to prevent here. Now, the third area we want to talk about is parameterized queries. This is a great technique that is used to defend against SQL injections and insecure object references by incorporating placeholders into a SQL query. Now, what does this look like? Well, when we use these parameterized queries, we’re really doing a form of output encoding. So let me show you a small program that I wrote. It’s just five lines long, and this is a Java program. Now, I don’t expect you to be able to write this code yourself, but you should be able to read it, because Java is considered a high-level language and it reads pretty much like English. So let’s take this one line at a time.

Now, line one is: String custName = request.getParameter("customerName"); Line two is: String query = "SELECT account_balance FROM user_data WHERE user_name = ?"; Line three is: PreparedStatement pstatement = connection.prepareStatement(query); Line four is: pstatement.setString(1, custName); And line five is: ResultSet results = pstatement.executeQuery(); What is this saying? Well, the first line declares the string custName and calls getParameter on the request to get the parameter customerName. So we’re getting input from our user. The second line is the query we’re defining. We’re defining that the query will always be in this format: select account balance from user data where user name equals question mark. The question mark is essentially what we’re going to fill in.

Then we’re going to use this prepareStatement call, which uses the connection to the database to prepare the query. Then we’re going to set that string inside the query at the first position, that first question mark. We bind the only question mark to the customer name, and then we get the results from executing that query. That’s all this is saying. So when we’re using a parameterized query, we’re only going to use this format when sending it to the database. This way we have different parameterized queries for different functions, like writing to the database, reading from the database, or updating certain fields in the database. And we would use those queries that have already been defined by the programmers instead of trying to create them on the fly using customer data.

That’s what we’re doing here. And this is the idea of a prepared statement. Now, this is not a fully functional program. What am I missing here that can really cause me problems? We talked about it at the beginning of this lesson. That’s right, input validation. Nothing here is validating that input. When I got the input from the customer, I didn’t run it through input validation. So when I get that input, I should run it through validation before I put it into my query and into my prepared statement. But the idea here was to show you what a parameterized query looks like with these prepared statements. And that’s what I wanted to focus on in this particular code snippet. Now, for the exam, you do need to be able to identify input validation, output encoding, and parameterized queries in a given scenario.
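The lesson’s example is Java with JDBC. The sketch below shows the same idea in Python with the standard sqlite3 module, chosen only so that it runs without a database server, and it adds the validation step the lesson says is missing. The table and column names follow the spoken query, while the allowlist rule, the function names, and the demo row are assumptions.

```python
import re
import sqlite3

# Assumed allowlist: 1 to 32 letters, digits, or underscores.
USERNAME_RULE = re.compile(r"[A-Za-z0-9_]{1,32}")


def get_account_balance(connection, customer_name):
    """Validate the input, then run a fixed query with one bound parameter."""
    if USERNAME_RULE.fullmatch(customer_name) is None:
        raise ValueError("invalid customer name")
    # The query text never changes; the ? placeholder is filled in by the driver.
    query = "SELECT account_balance FROM user_data WHERE user_name = ?"
    row = connection.execute(query, (customer_name,)).fetchone()
    return None if row is None else row[0]


def make_demo_db():
    """Create an in-memory database with one hypothetical customer."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE user_data (user_name TEXT PRIMARY KEY, account_balance REAL)"
    )
    connection.execute("INSERT INTO user_data VALUES (?, ?)", ("jason", 125.50))
    connection.commit()
    return connection


if __name__ == "__main__":
    db = make_demo_db()
    print(get_account_balance(db, "jason"))
```

Because the customer name is passed as a bound value and never concatenated into the SQL text, input such as jason' OR '1'='1 cannot change the structure of the query, and here it is rejected by validation before it even reaches the database.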

Remember, anytime you take input from a user, you want to do input validation. Anytime you’re outputting data that came from a user back to the screen, you want to use output encoding. Anytime you want to connect to a SQL database, you should really be using parameterized queries, because this is a best practice. It’ll make sure you’re taking the minimum amount of information from a user and putting it into specific fields only inside that database by using those parameterized queries.

  1. Authentication Attacks (OBJ 1.7)

Authentication attacks. In this lesson, we are going to talk about authentication attacks and what you can do to prevent them. We’re going to talk about things like spoofing, man-in-the-middle, password spraying, credential stuffing, and broken authentication, as well as several others. Now, when we talk about spoofing, this is a software-based attack where the goal is to assume the identity of a user, a process, an address, or another unique identifier. Spoofing is used a lot to try to bypass authentication and present yourself as if you’re somebody else. Now, one of the things attackers love to try is the man-in-the-middle attack. A man-in-the-middle attack, or MITM, is an attack where the attacker sits between two communicating hosts and transparently captures, monitors, and relays the communications between those hosts.

Now, we’ve talked about a man in the middle before, but essentially, if you’re on a wireless network, somebody could be sniffing the air, capturing those packets, and, being a man in the middle, they can capture what’s being said. Now, if they put themselves directly in the middle of the communication, you might be connecting to them while they connect to the server, and they’re listening to everything you say. They can capture it, monitor it, and relay it right on, or they could even modify it if they wanted to. Now, a variation on this is what’s known as a man in the browser, or MITB. This is an attack that intercepts the API calls between the browser process and its DLLs. And so if you’re attacking the network, between two clients or between a client and a server, you’re a man in the middle.

If you’re using the browser to do it, you’re a man in the browser. Now, one of the things that people love to try to do is break passwords, because if they can get your password, they can own your system, right? Well, let’s talk about the way passwords work for a moment. This will be a quick review from Security+. Now, when you take a password and you go to store it, do you store it in the database as the word password? No, you actually hash it first. So it’s going to be an MD5 hash or a SHA-1 hash or a SHA-256 hash, and it’s going to be stored in that database as that hash. So nobody knows what that actual password is, not even the system administrator, in theory. Now, this means that the password cannot be recovered, because you can’t go from the hash back to the original.
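Here is a minimal sketch of storing a hash instead of the password, using Python’s standard hashlib. The lesson only names MD5, SHA-1, and SHA-256. The per-user random salt, the PBKDF2 key stretching built on SHA-256, and the iteration count are additions assumed for this sketch, since a single fast, unsalted hash is easy to attack with precomputed tables.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password, salt=None):
    """Return (salt, digest); only these two values go into the database."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each stored password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, stored_digest):
    """Re-hash the login attempt with the stored salt and compare the digests."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)


if __name__ == "__main__":
    salt, digest = hash_password("puppy123")
    print(verify_password("puppy123", salt, digest))
    print(verify_password("cupcake", salt, digest))
```

Verification never reverses the hash. It hashes the login attempt the same way and compares the two results, which is why the stored value cannot be turned back into the original password.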

So when the user chooses that password, we’re going to make sure we hash it using that cryptographic function anytime we store it in our database. This will help protect our users inside of our web applications. Now, even though we do that, a lot of people are still going to try to guess your password, and there are a lot of different ways that they try to do this. One of the most common is what’s known as an online password attack. This involves somebody simply trying to guess what your password is and entering it directly into the service. Think about it this way. I want to log into Facebook as if I was you. I know your username because it’s tied to your email. So I type in your email address and I start typing in passwords and hitting login. Each time I do that, I’m doing an online password attack. I’m guessing passwords over and over and over again until I get in.

Now, this is the idea of how you can do an online password attack. Now, how can you as an analyst identify that that’s what’s going on? Well, you can look at the logs. If you look at your audit logs, you should see something that looks like this. Here, somebody was trying to log in as Jason. At 19:12, they tried using the word password. Then at 19:13, they tried pass1234. Then at 19:14, puppy123. At 19:15, cupcake. At 19:16, admin. At 19:17, admin123. And so this is an example of what you see when somebody is trying to do an online password attack. They’re basically logging in as if they were you, just like you would, except they’re using the wrong password because they don’t know yours yet.

This can be useful if you’re an attacker who has some knowledge about that person and can try to guess something that they might have chosen. But otherwise it’s a pretty inefficient way of doing it. Now, to prevent this type of attack from happening, there are really a couple of things you can do. You can restrict the number or the rate of login attempts to prevent these online password attacks. So you can lock the account after three incorrect attempts, where they have to reset their password or contact security. That would be one way to do it. Or you could say you can only try to log in three times, and if you get it wrong, you have to wait 20 minutes and then it resets again. That’s limiting the rate of it. And so these are different ways that you can do this.
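The “three tries, then wait 20 minutes” rule can be sketched as a small in-memory throttle in Python. The class name, method names, and injectable clock are assumptions for illustration. A real service would keep this state in shared storage, not in one process’s memory.

```python
import time


class LoginThrottle:
    """Lock a username after too many failed logins, then release it after a wait."""

    def __init__(self, max_failures=3, lockout_seconds=20 * 60, clock=time.monotonic):
        self.max_failures = max_failures
        self.lockout_seconds = lockout_seconds
        self._clock = clock
        self._failures = {}      # username -> consecutive failed attempts
        self._locked_until = {}  # username -> time at which the lock expires

    def is_locked(self, username):
        until = self._locked_until.get(username)
        if until is None:
            return False
        if self._clock() >= until:
            # The waiting period is over: clear the lock and start counting again.
            del self._locked_until[username]
            self._failures[username] = 0
            return False
        return True

    def record_failure(self, username):
        count = self._failures.get(username, 0) + 1
        self._failures[username] = count
        if count >= self.max_failures:
            self._locked_until[username] = self._clock() + self.lockout_seconds

    def record_success(self, username):
        self._failures.pop(username, None)
        self._locked_until.pop(username, None)
```

The login handler would call is_locked before checking the password, then record_failure or record_success afterward. For the stricter policy from the lesson, where the user must reset the password or contact security, the lock would simply never expire on its own.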

Now, another way that people will try to break your password is by doing password spraying. This is a brute force type of attack in which multiple user accounts are tested with a dictionary of common passwords. So here’s an example of this, going from 19:12 to 19:17. Again, notice the first two attempts were against Jason. They tried password and password123. The second two attempts were against a second user, again using password and password123. The third pair of attempts was against Tamra, with password and password123. Notice the difference. Here we have groupings of the same passwords, common words from a dictionary, being tried over and over again against different accounts. This makes it password spraying instead of an online password attack.
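The difference between the two log patterns can be sketched as a small classifier in Python. It assumes the audit log gives you pairs of username and attempted password, as in the lesson’s example, although many real systems do not record the attempted password. The function name, the threshold of three, and the username user2 for the second account are assumptions.

```python
from collections import defaultdict


def classify_failed_logins(attempts, threshold=3):
    """attempts: iterable of (username, password_tried) pairs from failed logins."""
    passwords_by_user = defaultdict(set)
    users_by_password = defaultdict(set)
    for username, password in attempts:
        passwords_by_user[username].add(password)
        users_by_password[password].add(username)

    # The same password tried against many accounts points to spraying.
    sprayed = sorted(p for p, users in users_by_password.items() if len(users) >= threshold)
    if sprayed:
        return "password spraying", sprayed

    # Many different passwords against one account points to online guessing.
    guessed = sorted(u for u, pws in passwords_by_user.items() if len(pws) >= threshold)
    if guessed:
        return "online password guessing", guessed

    return "no clear pattern", []


if __name__ == "__main__":
    guessing_log = [("jason", p) for p in
                    ("password", "pass1234", "puppy123", "cupcake", "admin", "admin123")]
    spraying_log = [(u, p) for u in ("jason", "user2", "tamra")
                    for p in ("password", "password123")]
    print(classify_failed_logins(guessing_log))
    print(classify_failed_logins(spraying_log))
```

This also shows why a per-account lockout alone can miss spraying: each account only sees two failures, so the pattern is visible only when you look across accounts.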

Now, the last one we’re going to talk about here is credential stuffing. Credential stuffing is another type of brute force attack. In this one, attackers take stolen user account names and passwords and test them against multiple websites. So let’s say there’s a news story that a new data breach happened and Facebook got hacked. And now all of Facebook’s usernames and passwords are known. So everybody knows what the usernames are, which are emails, and the passwords. Now, Facebook is going to make everybody go in and change their passwords, right? So the attackers aren’t going to be able to get back into Facebook, but they could take that username and password and try it on Gmail or Yahoo or MSN or some other website.

And by going across to different websites, you can try doing this credential stuffing, because if you know it was a valid username and password on one system, it may be valid on others, since people tend to reuse their usernames and passwords. So how do you prevent credential stuffing? Well, credential stuffing can be prevented by not reusing passwords across these different websites. Now, the next thing we want to talk about is broken authentication. Broken authentication is a software vulnerability where the authentication mechanisms allow the attacker to gain entry. Essentially, the coders did a really bad job. When this happens, you can have bad things happen, like displaying cleartext credentials, using weak session tokens, or permitting brute force login requests.

Now, what causes these types of things? Well, weak password credentials, for one. Let’s say you built a system and you said all passwords will be four digits long. That’s a pretty weak system. There are only 10,000 possible combinations, so people could brute force their way in. Another cause is weak password reset methods. So you’re going to use something like knowledge factors that are tied to things that people could easily look up. What is your birthday? Where were you born? What state are you registered to vote in? These are all weak because most of this information is stuff you can find online about people. So we wouldn’t want to use those types of things. Next, you have credential exposure.

Now, credential exposure is when the app actually exposes the credentials or the authentication tokens to somebody who’s in the middle. So if you have a man in the middle, this is really bad. It happens because a lot of applications hard-code credentials into the application, or they’re not using encryption, so they’re sending things across the network in plain text, or they’re using weak encryption. And because you’re using weak encryption, it can be cracked. These are all things that can lead to this credential exposure. And then finally, we have session hijacking. This is when the application is vulnerable to session hijacking because maybe you’re using session keys that just aren’t really strong and they’re really easy to guess. And so that’s an easy way for people to guess that session, jump into it, hijack it, and then get.

 
