Learn some tips and best practices to secure your mashup applications
Level: Intermediate
Sachiko Yoshihama (sachikoy@jp.ibm.com), Researcher, IBM
Dr. Frederik De Keukelaere (eb41704@jp.ibm.com), Postdoctoral Researcher, IBM
Dr. Michael Steiner (msteiner@watson.ibm.com), Researcher, IBM
Dr. Naohiko Uramoto (uramoto@jp.ibm.com), Researcher, IBM
July 16, 2007
Ajax (Asynchronous JavaScript and XML) is a key technology in Web 2.0. It separates the user's interaction with a Web page from the communication between the Web browser and the server. In particular, Ajax drives mashups, which integrate multiple pieces of content or services into a single user experience. However, because of their dynamic and multi-domain nature, Ajax and mashup technologies introduce new types of threats. Understand the threats posed by Ajax technology, and avoid them by exploring some best practices.
Ajax is built on Dynamic HTML (DHTML), which comprises the following common technologies:
JavaScript: A scripting language that is often used in client-side Web applications.
Document Object Model (DOM): A standard object model for representing HTML or XML documents. Most browsers today support the DOM and allow JavaScript code to read and modify HTML content dynamically.
Cascading Style Sheets (CSS): A style sheet language used to describe the presentation of HTML documents. JavaScript can modify style sheets at run time, so the presentation of a Web page can be updated dynamically.
In Ajax, client-side JavaScript updates the Web page by dynamically modifying the DOM tree and the style sheets. In addition, asynchronous communication, implemented with the technologies described below, allows data to be updated dynamically without reloading the entire Web page:
XMLHttpRequest: An API that allows client-side JavaScript to establish HTTP connections with remote servers and exchange data, such as plain text, XML, and JSON (JavaScript Object Notation).
JSON: A lightweight, text-based, language-independent data interchange format specified in RFC 4627. It is based on a subset of the ECMAScript language (and thus of JavaScript) and defines a small set of formatting rules for creating portable representations of structured data.
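As a small illustration of JSON as a data interchange format, the following sketch parses a JSON text into an object and serializes it back. The itinerary fields and the helper name parseJsonSafely are assumptions made up for this example. It uses JSON.parse and JSON.stringify, which were standardized in ECMAScript 5 after this article was written.

```javascript
// Hypothetical JSON text, as a server might return it.
const itineraryText =
  '{"username": "sachiko", "reservationNum": 1234, "flights": ["NRT-JFK"]}';

// JSON.parse accepts only JSON data; unlike eval, it never executes code.
const itinerary = JSON.parse(itineraryText);

// JSON.stringify turns the object back into a portable text representation.
const roundTrip = JSON.stringify(itinerary);

// Hypothetical helper: return null instead of throwing on non-JSON input.
function parseJsonSafely(text) {
  try {
    return JSON.parse(text);
  } catch (e) {
    return null;
  }
}

console.log(itinerary.username);          // property access on the parsed object
console.log(parseJsonSafely("alert(1)")); // script code is not valid JSON
```

Because the parser rejects anything that is not pure data, a response that contains script code is refused rather than executed.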
Note that other formats, such as XML and unformatted plain text, are commonly used in Ajax applications instead of JSON. We discuss JSON here because it has some hidden security issues, which this article examines.
Readers who are not familiar with Ajax are encouraged to read the references first.
Understanding the same-origin policy
When content from multiple origins is integrated into a single application, some of that content may carry different levels of trust, or the pieces may not need to trust each other at all. This naturally creates a requirement to separate content from different origins and to minimize interference between them.
The same-origin policy is part of the protection mechanism in current browsers. It isolates Web applications that come from different domains, on the assumption that the domain represents the originator. That is, if applications in multiple windows or frames were downloaded from different servers, they cannot access each other's data and scripts. Note that the same-origin policy applies only to HTML documents. A JavaScript file imported into an HTML document with <script src="..."> is considered part of the same origin as that HTML document. This policy is enforced in all major browser implementations.
In the context of XMLHttpRequest, the same-origin policy is intended to control the interaction between an application and remote servers. However, the same-origin policy has only a limited effect on Web 2.0 applications, for the following reasons:
There are many ways to bypass the same-origin policy: Some of these methods are demonstrated later in this article.
User-contributed content is an important feature of Web 2.0 applications: That is, content is often provided not by a trusted service but by anonymous users through blogs, wikis, and other media. Consequently, the content on a single server can actually come from multiple sources.
Browsers enforce the same-origin policy by comparing the server's domain name as a string literal: For example, http://www.abc.com/ and http://12.34.56.78/ are treated as different domains, even if the IP address of www.abc.com is actually 12.34.56.78. In addition, any path expression in the URL is ignored. For example, http://www.abc.com/~alice is regarded as having the same origin as http://www.abc.com/~malroy, which ignores the fact that the two directories may belong to different users.
Most Web browsers allow a Web application to relax the domain definition to a super-domain of the application itself: For example, if the application is downloaded from www.abc.com, it can overwrite the document.domain property with abc.com or even com (in Firefox). Most recent browsers allow access between window objects only in windows or frames whose document.domain properties have been overwritten with the same value. However, some older browser versions allow an XMLHttpRequest connection to the domain specified in the document.domain property.
Even if a Web server is in a trusted domain, the server may not be the originator of the content, especially in the context of Web 2.0: For example, an enterprise portal server, a Web-based mail server, or a wiki may be trusted, but the content it hosts may include input from potentially malicious third parties, which can be used for a cross-site scripting (XSS) attack (introduced later in this article). Therefore, the domain in which a server resides does not represent the trustworthiness of its content.
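The string-literal comparison described above can be sketched in a few lines of JavaScript. This is only an illustration, not a browser's actual implementation: the function name sameOrigin is an assumption, and the sketch relies on the standard URL class, whose origin property combines scheme, host, and port (the modern definition of an origin) and ignores the path.

```javascript
// Hypothetical helper: compare two URLs the way the same-origin check does,
// by scheme, host name (as a literal string), and port. Paths are ignored.
function sameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

// Different paths on the same host share one origin, even if the
// directories belong to different users.
console.log(sameOrigin("http://www.abc.com/~alice", "http://www.abc.com/~malroy"));

// A host name and its IP address are compared as strings, with no DNS lookup,
// so they count as different origins.
console.log(sameOrigin("http://www.abc.com/", "http://12.34.56.78/"));
```

The first call reports a match and the second does not, which mirrors the two examples given in the list above.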
Avoiding the same-origin policy: JSON and the dynamic <script> tag
JSON is plain text with a simple bracketed structure, so JSON messages can be exchanged over many channels. Because of the same-origin policy, XMLHttpRequest cannot be used to communicate with external servers. JSONP (JSON with Padding) is a way to bypass the same-origin policy by combining JSON with the <script> tag, as shown in Listing 1.
Listing 1. JSONP example
<script type="text/javascript"
  src="http://travel.com/findItinerary?username=sachiko&reservationNum=1234&output=json&callback=showItinerary">
</script>
When JavaScript code dynamically inserts the <script> tag, the browser accesses the URL in the src attribute, which sends the information in the query string to the server. In Listing 1, the username and reservation number are passed as name-value pairs. In addition, the query string contains the output format requested from the server and the name of the callback function (showItinerary). After the <script> tag is loaded, the callback function is executed, and the information returned from the service is passed to it as its argument.
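The dynamic insertion described above can be sketched in JavaScript roughly as follows. The helper names buildJsonpUrl and requestJsonp, the parameter spelling reservationNum, and the shape of the itinerary object are assumptions for illustration. The travel.com service is expected to answer with a script of the form showItinerary({...}), which the browser then executes.

```javascript
// Hypothetical helper: build the JSONP request URL from name-value pairs.
function buildJsonpUrl(baseUrl, params, callbackName) {
  const url = new URL(baseUrl);
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, value);
  }
  url.searchParams.set("output", "json");
  url.searchParams.set("callback", callbackName);
  return url.toString();
}

// Callback named in the query string; the returned script calls it with the data.
function showItinerary(itinerary) {
  console.log("Itinerary for " + itinerary.username);
}

// Browser only: insert a <script> tag so that the browser fetches the URL.
// The same-origin policy does not apply to the src of a <script> tag.
function requestJsonp(baseUrl, params, callbackName) {
  const script = document.createElement("script");
  script.type = "text/javascript";
  script.src = buildJsonpUrl(baseUrl, params, callbackName);
  document.head.appendChild(script);
}
```

In a page, a call such as requestJsonp("http://travel.com/findItinerary", { username: "sachiko", reservationNum: "1234" }, "showItinerary") would trigger the request. Note that whatever script the remote server returns runs with the full privileges of the embedding page.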
Avoiding the same-origin policy: Ajax proxy
An Ajax proxy is an application-level proxy server that mediates HTTP requests and responses between Web browsers and servers. An Ajax proxy allows the Web browser to bypass the same-origin policy and use XMLHttpRequest to access third-party servers. This bypass can be implemented in either of two ways:
The client-side Web application knows the third-party URL and passes it to the Ajax proxy as a request parameter in the HTTP request. The proxy then forwards the request to the third-party server (www.remoteservice.com in this example). Note that the use of the proxy server can be hidden in the implementation of the Ajax library used by the Web application developer. To the developer, it may look as if there were no same-origin policy at all.
The client-side Web application does not know the third-party URL and accesses resources on the Ajax proxy server through HTTP. Using a predefined encoding rule, the Ajax proxy translates the requested URL into the URL of the third-party server and retrieves the content on behalf of the client. To the Web application developer, it looks as if the application were communicating directly with the proxy server.
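The two approaches above differ mainly in how the proxy learns the third-party URL. The following sketch shows only that URL handling, not a complete proxy server. The function names, the /proxy endpoint, the url parameter, and the encoding rule /proxy/<host>/<path> are all assumptions chosen for illustration.

```javascript
// Approach 1 (client side): pass the third-party URL as a request parameter.
function viaProxyParameter(proxyEndpoint, thirdPartyUrl) {
  return proxyEndpoint + "?url=" + encodeURIComponent(thirdPartyUrl);
}

// Approach 2 (proxy side): translate a proxy path into a third-party URL
// using a predefined encoding rule, here /proxy/<host>/<path>.
function resolveProxiedPath(requestPath) {
  const match = /^\/proxy\/([^\/]+)(\/.*)?$/.exec(requestPath);
  if (!match) {
    return null; // the path does not follow the encoding rule
  }
  return "http://" + match[1] + (match[2] || "/");
}

console.log(viaProxyParameter("/proxy", "http://www.remoteservice.com/data?id=1"));
console.log(resolveProxiedPath("/proxy/www.remoteservice.com/data"));
```

In both cases the browser sends its XMLHttpRequest only to the proxy's own origin, so the same-origin policy is satisfied, while the proxy performs the cross-domain request on the server side.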
Avoiding the same-origin policy: Greasemonkey
Greasemonkey is a Firefox extension that allows users to dynamically modify the style and content of Web pages. Greasemonkey users can associate a user script file with a set of URLs. When the browser loads a page from that URL set, the associated scripts are executed. Greasemonkey gives the user-script API additional privileges compared with scripts that run in the browser sandbox. GM_xmlhttpRequest is one of these APIs; it is essentially an XMLHttpRequest without the same-origin policy. A user script can replace the browser's built-in XMLHttpRequest with GM_xmlhttpRequest, which allows XMLHttpRequest calls to perform cross-origin access.
The use of GM_xmlhttpRequest is protected only by user consent. That is, Greasemonkey asks the user for confirmation only when a new user script is associated with a particular set of URLs. However, it is not hard to imagine that some users could be tricked into accepting an installation without fully understanding the consequences.
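A user script that performs such a cross-origin request might look roughly like the sketch below. In Greasemonkey's documented API the function is spelled GM_xmlhttpRequest and takes a details object with method, url, and onload fields. The script name, the @include pattern, the target URL, and the helper fetchCrossOrigin are assumptions for illustration, and newer Greasemonkey versions expose a differently named variant of this API.

```javascript
// ==UserScript==
// @name     Cross-origin fetch sketch (hypothetical)
// @include  http://www.abc.com/*
// @grant    GM_xmlhttpRequest
// ==/UserScript==

// Hypothetical helper: issue a GET request through the supplied request
// function and hand the response text to the callback.
function fetchCrossOrigin(request, url, onText) {
  request({
    method: "GET",
    url: url,
    onload: function (response) {
      onText(response.responseText);
    }
  });
}

// Inside Greasemonkey, the privileged API ignores the same-origin policy.
if (typeof GM_xmlhttpRequest === "function") {
  fetchCrossOrigin(GM_xmlhttpRequest, "http://www.remoteservice.com/data", function (text) {
    console.log("Received " + text.length + " characters");
  });
}
```

The script runs on every page that matches the @include pattern, which is why the single confirmation at installation time carries so much weight.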
Investigating attack scenarios
Developers open themselves up to attacks from malicious users when they avoid the same-origin policy, but that is not the only risk: an application is also vulnerable when malicious code is inserted into it. Unfortunately, malicious code can enter a Web application in many ways. We will briefly discuss two possible approaches that are increasingly relevant to Web 2.0.
Cross-site scripting (XSS)
XSS is a common attack in which an attacker injects malicious code into an otherwise benign website. XS