One of the most widespread beliefs about web applications is that most of them are insecure. This opinion is supported by statistics published by SANS, which show that almost half of the vulnerabilities published during 2007 were related to web applications, regardless of whether the software was open source or commercial.
There are many references, for example the one published by WASC, which show that most web applications deployed today are vulnerable to at least one type of web vulnerability. According to this source, 84% of web applications are vulnerable to an XSS attack.
Although it is clear that these statistics are not completely accurate and do not apply only to Java environments, our experience in security and web application development tells us that they are not far from reality.
Analyzing all the possible causes of this fact would take a long time. So, assuming that 20% of the risks cause 80% of the security problems, the aim of this article is to show the main web vulnerabilities and how to solve them.
To be more precise, it focuses on web environments developed in Java using the following web frameworks: Struts 1.x, Struts 2.x, Spring MVC 2.x, WebWork 2.2.x, Stripes 1.4.x, Sun JSF 1.x, Apache MyFaces 1.x and Wicket 1.3.x.
WHY ARE JAVA WEB APPLICATIONS VULNERABLE?
The main reason is that most developers do not know how the HTTP protocol works and, consequently, what the main web vulnerabilities are. It is easy to find developers who are still unaware that any data received from the client can be manipulated very easily, without being a security expert. Many still believe that only the parameters of GET requests can be modified, which is completely false.
Modifying data in the client is easily achieved with tools that are available to anyone. For example, there are plugins (TamperData and TamperIE) for the most widely used web browsers (Firefox and Internet Explorer) that make it possible to modify any data sent to the server.
To explain the risks related to web development in more depth, we will follow the Top 10 vulnerability list created by OWASP, where the vulnerability types are presented in priority order.
A1 - Cross-site Scripting (XSS)
The core of the problem is that reserved characters are not escaped when rendering a web page.
For example, suppose that there is a web forum where the messages sent by users are not validated and are rendered by the display JSP in this way:
Typing the following text in the message textbox:
With this message all the users accessing the forum would see a window with that text.
This type of attack is known as stored XSS because the attack is stored in a database, so every user executes it simply by accessing the web application.
Although this particular attack is harmless, once a site is vulnerable to stored XSS it is possible to perform all kinds of actions:
- Web redirection: it is possible to redirect the user to another website that has the same look as the original one, making it possible to commit fraud.
The previous example (cookie theft) is a stored XSS attack, but there is another, more common type called reflected XSS, which can be just as dangerous. In this case the XSS attack is not stored in the server. Instead, it is the user who executes the script involuntarily. Let's see an example to understand this attack:
Suppose that an online bank has a web page vulnerable to XSS, where a script can be executed when the page is displayed:
In this case the script is provided as an input parameter of the request. This differs from the stored type where the attack is stored in the server.
The best protection for XSS is a combination of "whitelist" validation of all incoming data and appropriate encoding of all output data.
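The output-encoding half of this advice can be sketched in a few lines of Java. This is only an illustrative helper (the class name `HtmlEscaper` is our own, not part of any framework mentioned here), and it assumes the value is written into HTML body text or a quoted attribute; JavaScript, CSS and URL contexts need different encodings.

```java
/** Minimal sketch of HTML output encoding; not a replacement for a vetted library. */
final class HtmlEscaper {
    private HtmlEscaper() {}

    /** Replaces the characters that are significant in HTML with character references. */
    static String escape(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this helper, a forum message containing a script element is rendered as visible text instead of being executed by the browser. In a JSP, the JSTL `<c:out>` tag performs an equivalent escaping by default.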
A2 - Injection Flaws
Injection flaws, particularly SQL injection, are common in web applications. There are many types of injection: SQL, LDAP, XPath, XSLT, HTML, XML, OS command injection and many more. In SQL injection, the problem stems from poor programming of the data access layer.
Example: We have a web page that requires user identification. The user must fill in a form with a username and password. This information is sent to the server to check whether it is correct:
As we can see in the example, the executed SQL is built by directly concatenating the values typed by the user.
In a normal request, where the expected values are sent, the SQL works correctly. But we have a security problem if the values sent are the following:
In this case, the generated SQL returns all the users in the table, without any valid combination of username and password having been typed. As a result, if the program doesn't check the number of returned results, the attacker might gain access to the private zone of the application without permission.
The consequences of exploiting this vulnerability can be mitigated by limiting the database permissions of the user account used by the application. For example, if the application's database user is allowed to delete rows in the table, the consequences can be very severe.
The solution to this vulnerability is very simple: use PreparedStatement whenever there are dynamic input parameters within the SQL. In addition, use whitelist validation of all incoming data.
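As a minimal sketch of the difference, the class below contrasts a concatenated query with a parameterized one. The table and column names (`users`, `username`, `password`, `password_hash`) and the class name `LoginDao` are assumptions made for illustration; they are not taken from the original example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Sketch of a login check; the schema used here is hypothetical. */
final class LoginDao {
    // Placeholders (?) keep user input out of the SQL text itself.
    static final String LOGIN_SQL =
        "SELECT COUNT(*) FROM users WHERE username = ? AND password_hash = ?";

    private LoginDao() {}

    /** Safe variant: the driver binds the values, so quotes in the input stay data. */
    static boolean credentialsMatch(Connection con, String username, String passwordHash)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(LOGIN_SQL)) {
            ps.setString(1, username);
            ps.setString(2, passwordHash);
            try (ResultSet rs = ps.executeQuery()) {
                // Exactly one matching row is required.
                return rs.next() && rs.getInt(1) == 1;
            }
        }
    }

    /** Vulnerable variant, shown only for contrast: input becomes part of the SQL. */
    static String concatenatedSql(String username, String password) {
        return "SELECT * FROM users WHERE username = '" + username
             + "' AND password = '" + password + "'";
    }
}
```

Calling `concatenatedSql("x", "' OR '1'='1")` yields a statement whose WHERE clause is always true, whereas the same input passed to `credentialsMatch` is compared literally against the stored value.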
A3 - Malicious File Execution
This attack is based on being able to execute a file in the server side, either one uploaded by the attacker or a previously existing one.
This vulnerability often originates in incorrect validation of the data sent by the client, including references to files or internal resources (see A4 - Insecure Direct Object Reference).
The following solutions can be applied:
- Eliminate the "A4 - Insecure Direct Object Reference" vulnerability, especially for data that references internal files.
- Validate input data using a whitelist, especially data related to files.
- Activate and configure the Java SecurityManager in order to reduce the risk of access to internal resources.
A4 - Insecure Direct Object Reference
Insecure Direct Object Reference is a type of attack based on modifying, on the client side, the data sent by the server.
Modifying the data is very simple for the user. When a user sends an HTTP request (GET or POST), the HTML page received in response may contain hidden values, which are not displayed by the browser but are sent to the server when the page is submitted. Likewise, the values of form controls with predefined options (drop-down lists, radio buttons, etc.) can be manipulated by the user, who can therefore send an HTTP request containing any parameter values he wants.
Example: We have a bank's web application, where clients can check their account information by selecting one account from a list:
However, it would be very easy for this user to access another user's account, using a simple auditing tool.
For this reason the application (server side) must verify that the user has access to the account he requests.
The same applies to the rest of the non editable HTML elements that exist in web applications, such as links, hidden fields, checkboxes, radio buttons, destination pages, etc.
This vulnerability arises from the lack of any server-side verification of the data that the server itself generated, and programmers must keep it in mind when developing a new web application.
Solutions for avoiding this vulnerability must guarantee the integrity of all the non editable data (all components except textboxes and textareas) sent to the client.
Although some frameworks have started working on solving this problem in a transparent way, most of them force developers to create their own manual solution for validating each piece of information.
One of the most widespread solutions is to use the session to guarantee data integrity: the non editable data is stored in the session before the page is sent, and the data received in the client's request is then validated against it.
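The session-based approach can be sketched as follows. To keep the example self-contained, the session is represented by a plain `Map<String, Object>`; in a servlet application the same values would be stored with `HttpSession.setAttribute`. The class name `AllowedValues` and the key prefix are our own illustrative choices.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch: remember the values offered to the user and accept only those back. */
final class AllowedValues {
    private static final String KEY_PREFIX = "allowed.";

    private AllowedValues() {}

    /** Call while rendering the page: records the values offered for a parameter. */
    static void remember(Map<String, Object> session, String param, List<String> offered) {
        session.put(KEY_PREFIX + param,
                    Collections.unmodifiableSet(new HashSet<>(offered)));
    }

    /** Call when the request arrives: true only if the value was actually offered. */
    static boolean isAllowed(Map<String, Object> session, String param, String received) {
        Object stored = session.get(KEY_PREFIX + param);
        return stored instanceof Set
            && received != null
            && ((Set<?>) stored).contains(received);
    }
}
```

In the bank example, the account identifiers shown in the list would be passed to `remember` when the page is generated, and a request for any other account number would be rejected by `isAllowed` before reaching the business logic.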
A5 - Cross-site Request Forgery
Usually, once the user has successfully authenticated, cookies are used as a mechanism to maintain session identification. Each time the user makes a request against the domain that generated the cookie the web browser sends the cookie to the server of this domain.
Based on this feature and while the user session is active, it is possible to force the authenticated user to make an unwanted request against the web server which authenticated the user.
The attack works by including a link or script in a page that accesses a site to which the user is known (or is supposed) to have authenticated.
Example: One user, Bob, might be browsing a chat forum where another user, Mallory, has posted a message. Suppose that Mallory has crafted an HTML image element that references a script on Bob's bank's website (rather than an image file), e.g.,
If Bob's bank keeps his authentication information in a cookie, and if the cookie hasn't expired, then Bob's browser's attempt to load the image will submit the withdrawal form with his cookie, thus authorizing a transaction without Bob's approval.
It is important to emphasize that the request is made by the same authenticated user and so it sends the cookie with the correct credentials.
The solution to this problem is to add a token or random parameter to each link and form, avoiding static requests against the attacked server. It must be said that this solution is valid only as long as the server is not vulnerable to XSS attacks. None of the frameworks studied in this article offers functionality to avoid this vulnerability, which forces developers to implement their own solution at a significant development cost.
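A hand-written token mechanism of the kind described above might look like the following sketch. It assumes Java 8 or later (for `java.util.Base64`), and the class name `CsrfTokens` is hypothetical. The token would be generated once per session or per page, stored in the session, added to every link and form, and compared on each incoming request.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

/** Sketch of random token generation and comparison for anti-CSRF protection. */
final class CsrfTokens {
    private static final SecureRandom RANDOM = new SecureRandom();

    private CsrfTokens() {}

    /** Generates an unpredictable, URL-safe token (32 random bytes, 43 characters). */
    static String newToken() {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    /** Compares the token stored in the session with the one received in the request. */
    static boolean matches(String expected, String received) {
        if (expected == null || received == null) {
            return false;
        }
        // MessageDigest.isEqual avoids leaking the position of the first mismatch.
        return MessageDigest.isEqual(
            expected.getBytes(StandardCharsets.UTF_8),
            received.getBytes(StandardCharsets.UTF_8));
    }
}
```

Because Mallory cannot read the token embedded in Bob's pages, the forged image request arrives without a valid token and can be rejected even though the session cookie is present.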
A6 - Information Leakage and Improper Error Handling
This vulnerability consists of providing too much information to potential attackers, such as error explanations, stack traces or SQL errors.
The solution is not to send error information to clients, and instead to record it in log files or monitoring systems that are not accessible to remote users.
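One way to apply this, sketched here with `java.util.logging` from the standard library, is to log the full exception on the server under a random incident identifier and return only that identifier to the client. The class name `SafeErrors` and the wording of the message are illustrative assumptions.

```java
import java.util.UUID;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch: keep error details in the server log and show the client a neutral message. */
final class SafeErrors {
    private static final Logger LOG = Logger.getLogger(SafeErrors.class.getName());

    private SafeErrors() {}

    /** Logs the full error server-side and returns text that is safe to display. */
    static String toClientMessage(Throwable error) {
        String incidentId = UUID.randomUUID().toString();
        // The stack trace and the original message go only to the log.
        LOG.log(Level.SEVERE, "Incident " + incidentId, error);
        return "An internal error occurred. Reference: " + incidentId;
    }
}
```

The incident identifier lets support staff find the matching log entry without exposing SQL errors or stack traces to the user.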
A7 - Broken Authentication and Session Management
This vulnerability is based on breaking the authentication system or the session maintenance system. From our point of view, the authentication and session maintenance systems offered by today's application servers are adequate and provide a correct solution for this vulnerability.
A8 - Insecure Cryptographic Storage
This vulnerability is based on using inappropriate cryptographic systems, or on not protecting the stored data with any type of security system at all.
The solution is to use secure cryptographic systems and to avoid creating proprietary ones.
A9 - Insecure Communications
This vulnerability is based on not encrypting the data exchanged in communications. The solution is to use SSL in all communications where critical data is exchanged.
A10 - Failure to restrict URL access
It is based on accessing unauthorized URLs, for example, trying to access the administration zone of a web application: http://host.com/admin/.
The solution to this problem is not to rely on the belief that the attacker doesn't know the URL paths, and to apply an access control system that prevents this type of attack. There are suitable solutions for this vulnerability, such as Spring Security or J2EE Security.
LACK OF WEB SECURITY FUNCTIONALITIES
Once we know the most important web vulnerabilities and their solutions, the next step is to try to solve them using the tools and functionalities that exist in the Java world. We have defined three groups of vulnerabilities based on the existing solutions.
Some vulnerabilities have very clear solutions, and suitable tools to apply them already exist in the Java environment:
- A9 - Insecure Communications: use SSL.
- A8 - Insecure Cryptographic Storage: use secure cryptographic solutions.
- A7 - Broken Authentication and Session Management: use the authentication system and session maintenance provided by the application server or web container.
- A6 - Information Leakage and Improper Error Handling: use internal log systems and never send any error information to the client.
- A10 - Failure to restrict URL access: use J2EE access control systems or solutions like Acegi.
Partially resolved vulnerabilities
There is another group of vulnerabilities, A2 - SQL Injection and A1 - XSS, which have suitable solutions (using PreparedStatement and escaping output) but need to be strengthened by correct input data validation.
Although every web framework provides input data validation functionality, it must be applied explicitly to each parameter, and it is not possible to apply generic data validation policies. For example, it is not possible to set a rule that only allows alphanumeric characters (0-9A-Za-z) in all input data parameters.
Finally, there are solutions that are very hard to apply with the functionalities currently provided by most web frameworks.
For example, the solution suggested for A5 - Cross-site Request Forgery involves adding a random token (parameter) to each link or form in our web application. If we try to apply this policy, we will find that in most web frameworks we are forced to modify all the JSP pages to add this extra parameter. Moreover, we will have to create our own token generation system.
It is even more difficult to solve the A4 - Insecure Direct Object Reference vulnerability. Although a few web frameworks do this validation automatically, most of them don't, which means that developers have to validate the integrity of non editable data (links, checkboxes, ...) manually.
For this reason it is very common for web applications to have this type of vulnerability. It is therefore strongly recommended to audit the application before deploying it to make sure there isn't any risk.
So far we have seen that some security problems have suitable solutions while others don't. This means that existing solutions do not correctly cover all the security requirements, which in some cases leads to proprietary, manual solutions.
To be more specific, we have identified which security functionalities web frameworks should offer in the presentation layer to avoid most web vulnerabilities.
NON EDITABLE DATA INTEGRITY VALIDATION
There are two types of data exchanged between client and server: data generated on the server side that can be selected by the user (links, lists, hidden fields, checkboxes) and data generated by the user using editable components such as a textbox or a textarea.
This means that a large part of the page sent to the client must remain unmodified in normal use of the application. But although the user shouldn't change non editable data, he can actually change it and so exploit some of the vulnerabilities explained before.
The solution is to prevent the user from changing data generated on the server side, guaranteeing data integrity in a transparent way.
This is a feasible solution because the server knows the values of the non editable data. So it is possible to use traditional integrity-check techniques such as encryption or hash generation.
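As one possible realization of the hash-based check, the sketch below signs a non editable value with a keyed hash (HMAC-SHA256) from the standard `javax.crypto` package. HMAC is our choice for the illustration, not something prescribed by the frameworks discussed here, and the class name `IntegritySeal` is hypothetical. The signature travels to the client with the value and is verified when the value comes back.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Sketch: detect tampering of server-generated values with a keyed hash. */
final class IntegritySeal {
    private final byte[] key;

    /** The secret must be known only to the server. */
    IntegritySeal(byte[] serverSecret) {
        this.key = serverSecret.clone();
    }

    /** Returns a URL-safe signature to send to the client together with the value. */
    String sign(String value) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            byte[] tag = mac.doFinal(value.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(tag);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is not available", e);
        }
    }

    /** True only if the value received still matches the signature issued for it. */
    boolean verify(String value, String tag) {
        if (value == null || tag == null) {
            return false;
        }
        return MessageDigest.isEqual(
            sign(value).getBytes(StandardCharsets.UTF_8),
            tag.getBytes(StandardCharsets.UTF_8));
    }
}
```

Note that this protects integrity only: the client can still read the signed value, which is why confidentiality is treated as a separate requirement.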
NON EDITABLE DATA CONFIDENTIALITY
As well as guaranteeing data integrity, it is also important to guarantee data confidentiality. The client must neither edit nor know the real values of each parameter.
If we guarantee data integrity but not confidentiality, we may provide useful information to an attacker. This information can be used to carry out another type of attack, such as SQL injection.
So it is important to guarantee the confidentiality of non editable data as well, in order to reduce the risk of attack.
GENERIC VALIDATION OF EDITABLE DATA
Most frameworks offer functionalities to validate input data. Usually these validations are applied manually to each parameter received from the client. What is missing is a generic validation functionality that applies to all input data.
As an example, let's suppose that our application only accepts alphanumeric characters as input data. To check that all the parameters meet this restriction we would have to apply the validation rule to every input parameter, which means a great deal of effort for the developers and a risk of creating a vulnerability through a programming mistake.
To avoid this, web frameworks need to offer a more generic validation functionality that automates this validation process.
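A generic rule of this kind can be sketched as a single check applied to the whole parameter map rather than to each field. The map has the same shape as the one returned by `ServletRequest.getParameterMap()`; the class name `GenericValidator` and the alphanumeric-only policy are assumptions taken from the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Sketch: one whitelist rule applied to every request parameter. */
final class GenericValidator {
    // Whitelist from the example above: only 0-9, A-Z and a-z are accepted.
    private static final Pattern ALPHANUMERIC = Pattern.compile("[0-9A-Za-z]*");

    private GenericValidator() {}

    /** Returns the names of the parameters that contain at least one invalid value. */
    static List<String> invalidParameters(Map<String, String[]> parameters) {
        List<String> invalid = new ArrayList<>();
        for (Map.Entry<String, String[]> entry : parameters.entrySet()) {
            for (String value : entry.getValue()) {
                if (value == null || !ALPHANUMERIC.matcher(value).matches()) {
                    invalid.add(entry.getKey());
                    break; // one bad value is enough to flag the parameter
                }
            }
        }
        return invalid;
    }
}
```

In a servlet application such a check would typically run in a filter, so that no individual action or controller has to remember to call it.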
BINDING TO "NON-STRING" DATA
Web frameworks need to allow request data to be assigned to attributes of any type, automatically generating an error if the type of the input data doesn't match the type of the attribute.
This avoids having to validate by hand that the received data matches the expected type. For example, if an attribute is of type Integer and the binding generates an error whenever the received data isn't an Integer, we save the work of writing this validation manually. This functionality is offered by almost all of the recently published web frameworks.
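What such a binding does for an Integer attribute can be approximated by the following sketch. The class name `IntegerBinder` and the use of `IllegalArgumentException` to signal the binding error are our own assumptions; each framework reports binding errors in its own way.

```java
/** Sketch of type-safe binding of a request parameter to an Integer attribute. */
final class IntegerBinder {
    private IntegerBinder() {}

    /** Converts the raw parameter, or fails with an error naming the parameter. */
    static int bind(String paramName, String rawValue) {
        if (rawValue == null) {
            throw new IllegalArgumentException("Missing parameter: " + paramName);
        }
        try {
            return Integer.parseInt(rawValue.trim());
        } catch (NumberFormatException e) {
            // The raw value is deliberately not echoed back in the message.
            throw new IllegalArgumentException("Parameter is not an integer: " + paramName);
        }
    }
}
```

A value such as `1 OR 1=1` is rejected at binding time, so it never reaches the code that uses the attribute.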
ESCAPE HTML RESERVED CHARACTERS
This means escaping all the characters that are meaningful in HTML so that web browsers do not interpret them, thus avoiding XSS attacks. This functionality is offered by all the frameworks studied in this article.
To avoid A5 - Cross-site Request Forgery attacks, a random security token must be included automatically in each link and form generated by the web application.
Last but not least, applications need a monitoring functionality from the security point of view. Knowing the origin and type of the attacks against our application is as important as stopping them. This information can be especially valuable for understanding the level of risk our application is really facing, as well as where this risk comes from. The logs should collect information about the attacks detected, including:
- application user
- type of attack
- tampered parameters
FUNCTIONALITIES OFFERED BY JAVA WEB FRAMEWORKS
In the next table we can see the security functionalities offered by different Java web frameworks: Struts 1.x, Struts 2.x, Spring MVC 2.x, WebWork 2.2.x, Stripes 1.4.x, Sun JSF 1.x, Apache MyFaces 1.x and Wicket 1.3.x. We have also included the Microsoft ASP.NET 3.5 framework in order to have an external reference outside the Java world.
- The Stripes framework supports character escaping using XSS Filter.
- The JSF frameworks (Sun 1.x, MyFaces 1.x) have three components vulnerable to integrity attacks, that is, A4 - Insecure Direct Object Reference (HtmlInputHidden, HtmlCommandLink and HtmlOutputLink).
- .NET has two components vulnerable to integrity attacks: HiddenField and HyperLink.
- Wicket has only one component (HiddenField) vulnerable to integrity attacks.
Therefore, many of the most widely used Java web frameworks don't cover all the functionalities specified in the previous section (integrity, confidentiality, generic validations, random tokens and monitoring).
To cover these gaps there is a Java web application security framework known as HDIV, which fills this security gap transparently for some of the Java web frameworks presented (Struts 1.x, Struts 2.x, Spring MVC, Sun JSF 1.x, MyFaces 1.x).
HDIV has been included by OWASP as a solution for three of the vulnerabilities of the OWASP Top Ten for Java environments (A4 - Insecure Direct Object Reference, A5 - Cross-site Request Forgery, A10 - Failure to restrict URL access).
It's important to note that two of these three vulnerabilities belong to the group of unresolved vulnerabilities.
HDIV was designed to cover this lack of functionalities by extending some frameworks or web libraries, which currently are: Struts 1, Struts 2, Spring MVC and JSTL, Sun JSF 1.x and MyFaces 1.x.
The functionalities added by HDIV are the following:
- Automatic check of non editable data integrity
- Guarantee of non editable data confidentiality
- Generation of random tokens for links and forms
- Generic validations for editable data
- Log generation for each detected attack
For HDIV, a State represents all the data that makes up a possible request to a web application, that is, the parameters of the request, their values and types, and the destination or requested page.
For example, having this type of link,
a state that represents this link is as follows:
A page may have more than one state (possible request), one for each of the links and forms it contains. When a page (JSP) is processed in the server, HDIV generates an object of type State for each link or form in the page.
A generated state can be stored in two locations:
- Server: States are stored inside the user's session (HttpSession).
- Client: State objects are sent to the client as parameters. For each possible request (link or form) an object that represents the state of the request is added.
These states make it possible to verify the requests sent by the clients later on, by comparing the data sent by the client with the state.
HDIV has two main modules:
- HDIV Tags Library: the tag library is responsible for modifying the HTML content sent to the client, which will later be checked by the security filter.
- HDIV Validation Filter: it validates the editable and non editable information of the requests, using the generic validations defined by the user for editable data and the state received in the requests for the non editable information.
HDIV has three operation strategies, all with the same objectives:
- Cipher: for each possible request of each page (link or form) an extra parameter (_HDIV_STATE_) is added which represents the state of the request.
To guarantee the integrity of the state itself, which is the basis of the validation, it is encrypted using a symmetric algorithm. Besides adding the extra parameter, all the non editable values are replaced by relative values (0,1,2,...) to guarantee data confidentiality.
- Hash: This strategy is very similar to the Cipher strategy, but in this case the state sent to the client is encoded in Base64.
To be able to check the integrity of this parameter, a hash of the state is generated before it is sent to the client and is stored in the user session. This strategy does not guarantee confidentiality, because the state can be decoded by anyone with sufficient technical knowledge.
- Memory: All the states of the page are stored in the user session. To be able to associate user requests with the state stored in the session, an extra parameter (_HDIV_STATE_) is added to each request. This parameter contains the identifier that makes it possible to retrieve the state from the session. In this strategy non editable values are hidden as well, guaranteeing confidentiality.
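The idea behind the Memory strategy can be illustrated with a simplified sketch. This is not HDIV's actual implementation: the class `MemoryStateStore` and its methods are hypothetical, one instance is assumed to live in each user's session, and a state is reduced here to the list of real values of a single parameter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified illustration of keeping states in the session; not HDIV code. */
final class MemoryStateStore {
    private final Map<String, List<String>> states = new HashMap<>();
    private int nextId = 0;

    /** Rendering: stores the real values and returns the identifier to send as the extra parameter. */
    String register(List<String> realValues) {
        String id = Integer.toString(nextId++);
        states.put(id, new ArrayList<>(realValues));
        return id;
    }

    /** Request: maps a relative value ("0", "1", ...) back to the real one, or null if invalid. */
    String resolve(String stateId, String relativeValue) {
        List<String> values = states.get(stateId);
        if (values == null || relativeValue == null) {
            return null; // unknown state or missing value
        }
        try {
            int index = Integer.parseInt(relativeValue);
            return (index >= 0 && index < values.size()) ? values.get(index) : null;
        } catch (NumberFormatException e) {
            return null; // the client sent something other than a relative value
        }
    }
}
```

The client only ever sees the state identifier and the positions 0, 1, 2, ..., so it can neither learn the real values nor submit one that was not offered.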
In order to better understand how HDIV works, let's see how it modifies a web page. For example, this is the HTML code generated by an application developed using Struts:
Once HDIV is activated, in this case using the memory strategy, the generated HTML looks like this:
The changes made to the original HTML code are these:
- The real values of the parameters have been replaced by relative values (0,1,2,...)
- A new extra parameter (_HDIV_STATE_) has been added to the link and to the form.
FUNCTIONALITIES OFFERED BY WEB FRAMEWORKS POWERED BY HDIV
In the next table you can see the same web frameworks table but using HDIV in the frameworks currently supported by HDIV: Struts 1.x, Struts 2.x, Spring MVC 2.x, Sun JSF 1.x and Apache MyFaces 1.x.
HDIV adds Integrity, Confidentiality, Generic Validations, Random Tokens and Monitorization functionalities to Struts 1.x, Struts 2.x and Spring MVC web frameworks.
HDIV resolves the integrity gaps of JSF. Currently there are two implementations for JSF (Sun 1.x and MyFaces 1.x). In addition, HDIV adds the Monitorization functionality to the JSF web frameworks. The Generic Validations and Random Tokens functionalities for JSF will be developed in upcoming releases of HDIV.
Nowadays the war between the different Java web frameworks is focused on offering richer and more interactive applications, but there are some basic security requirements that many of them have not yet covered, even the most recently published ones.
If we add to this the general lack of awareness of web vulnerabilities, we find that many Java web applications today are vulnerable to some type of web vulnerability.
From our point of view, this situation can be solved by publicizing the risks related to web applications and also by automating the process of solving each type of vulnerability.
This means that we have to move beyond solutions based on generic advice like "do input validation" and offer real solutions that help to automate this task. In this way we will really help developers, relieving them of the great responsibility of having to implement proprietary solutions.
We are trying to change the current situation by publishing documentation about web application risks on our website hdiv.org, by publishing articles like this one, and by collaborating directly with the development teams of some of the main Java web frameworks so that they integrate HDIV into their framework as part of the default distribution.
As a result of this collaboration there is an open process to integrate HDIV with Struts. We hope that in the future more frameworks will join this initiative, so that together we can increase the security level of web applications developed in Java.
ABOUT THE AUTHORS
Roberto Velasco is a computer science engineer. He has 8 years of experience as a Java developer (SCJP, SCWCD, SCBCD), 5 as a Java architect (SCEA) and 5 as a web application security analyst. He is the founder of the HDIV project.
Gorka Vicente is a computer science engineer. He has 6 years of experience as a Java developer (SCJP, SCWCD) and 4 years as a security analyst. He is a co-founder of the HDIV project.