Using MicroLark to work with MicroXML

Last Update:2014-12-27 Source: Internet

Author: User

Keywords: MicroLark, MicroXML

MicroLark, developed by John Cowan, is an open source MicroXML parser for the Java™ environment. In this article, we use sample code to learn MicroLark.

MicroXML is a simplified, backward-compatible subset of XML, defined in a new specification. Part 1 of this series, on exploring the basic principles of MicroXML (http://www.aliyun.com/zixun/aggregation/17687.html), introduced the basics of MicroXML and explained how it differs from XML 1.x and related standards. MicroXML was proposed by James Clark, and John Cowan created its first parser, MicroLark, which helped drive the development of the specification. MicroLark is an open source tool (Apache 2.0 license) written in the Java language, and it implements several parsing modes: push mode, pull mode, and tree mode.
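The three parsing modes named above are general XML processing styles, so they can be illustrated without MicroLark. The following is a minimal sketch that uses Python's standard library parsers as stand-ins (an assumption made purely for illustration; none of this is MicroLark's API, and the helper names are hypothetical). Each function collects the element names of the same small document in a different mode.

```python
import xml.sax
from xml.dom import pulldom
import xml.etree.ElementTree as ET

# Sample input: a fragment of Listing 1 (without the DOCTYPE).
DOC = ('<p>welcome to <a href="http://ibm.com/developerworks/">'
       'IBM developerworks</a>.</p>')


class NameCollector(xml.sax.ContentHandler):
    """Push mode: the parser calls this handler for each start tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def push_names(doc):
    handler = NameCollector()
    xml.sax.parseString(doc.encode("utf-8"), handler)
    return handler.names


def pull_names(doc):
    """Pull mode: our own loop asks the parser for the next event."""
    names = []
    for event, node in pulldom.parseString(doc):
        if event == pulldom.START_ELEMENT:
            names.append(node.tagName)
    return names


def tree_names(doc):
    """Tree mode: the whole document is built in memory, then walked."""
    root = ET.fromstring(doc)
    return [element.tag for element in root.iter()]


if __name__ == "__main__":
    print(push_names(DOC), pull_names(DOC), tree_names(DOC))
```

In push mode the parser drives the application, in pull mode the application drives the parser, and in tree mode parsing finishes before the application sees anything. Which mode fits best usually depends on document size and on how much context each processing step needs.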

In this article, we learn to parse the MicroXML format. We explore the MicroLark parser API using the command line and sample code.

Getting started

To follow along with the examples in this article, you need to download:

Microlark.jar (if you wish, you can also download its source code)

The open source Jython interpreter

First, you can run MicroLark on the command line with a MicroXML file as input. Listing 1 is a slightly modified version of the simple file used in Part 1 of this series:

Listing 1. A simple file

<!DOCTYPE html>
<html lang="en">
  
  <head>
    <title>welcome page</title>
  </head>
  <body>
    <p>welcome to <a href="http://ibm.com/developerworks/">IBM developerworks</a>.</p>
  </body>
</html>

Save the sample as listing1.xml and run MicroLark on it using the command shown in Listing 2.

Listing 2. MicroLark

java -jar Microlark.jar listing1.xml

You should see the output shown in Listing 3.

Listing 3. Output results

(html
Alang en
-\n
-  
-\n
-  
(head
-\n
-    
(title
-welcome page
)title
-\n
-  
)head
-\n
-  
(body
-\n
-    
(p
-welcome to 
(a
Ahref http://ibm.com/developerworks/
-IBM developerworks
)a
-.
)p
-\n
-  
)body
-\n
)html

Does this look a little unfamiliar? Listing 3 uses a format called PYX, a line-oriented representation of an XML document that derives from ESIS, a line-oriented output format for SGML documents. PYX presents all the information in an XML document in a way that minimizes the burden of parsing. It is a very useful tool, but unfortunately XML developers often overlook it.

By default, MicroLark converts a MicroXML document to PYX, or more precisely to a subset of PYX, because MicroXML is a subset of XML.
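To make the conversion concrete, here is a minimal sketch of an XML-to-PYX writer. It uses Python's standard xml.sax parser, not MicroLark, so it accepts full XML rather than only MicroXML, and the class and function names are hypothetical. It assumes the output shape seen in Listing 3, where every newline in the text is written as its own -\n line.

```python
import re
import xml.sax


class PyxWriter(xml.sax.ContentHandler):
    """Collect one PYX line per start tag, attribute, text run, and end tag."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._text = []

    def _flush_text(self):
        # SAX may deliver text in several chunks, so join them first.
        text = "".join(self._text)
        self._text = []
        # Assumption based on Listing 3: each newline gets its own "-\n" line.
        for piece in re.split(r"(\n)", text):
            if piece == "\n":
                self.lines.append("-\\n")
            elif piece:
                # Only backslashes are escaped here; other escapes are omitted.
                self.lines.append("-" + piece.replace("\\", "\\\\"))

    def startElement(self, name, attrs):
        self._flush_text()
        self.lines.append("(" + name)
        for attr_name in attrs.getNames():
            self.lines.append("A" + attr_name + " " + attrs.getValue(attr_name))

    def endElement(self, name):
        self._flush_text()
        self.lines.append(")" + name)

    def characters(self, content):
        self._text.append(content)


def to_pyx(document):
    """Return the PYX lines for an XML document given as a string."""
    handler = PyxWriter()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.lines


if __name__ == "__main__":
    sample = ('<p>welcome to <a href="http://ibm.com/developerworks/">'
              'IBM developerworks</a>.</p>')
    print("\n".join(to_pyx(sample)))
```

Buffering the text and flushing it at each tag keeps the output independent of how the parser happens to split character data.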

The PYX format is very simple. The first character of each line identifies the type of content on that line. A single piece of content never spans lines directly, but several consecutive lines can carry the same content type. For attributes, the name and the value are separated by a single space, and no quotation marks are used. Listing 4 shows the prefix characters.

Listing 4. Prefix characters

(  start-tag
)  end-tag
A  attribute
-  character data (content)
?  processing instruction
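Because the first character of each line carries the type, reading PYX takes only a few lines of code. The following sketch (the function and constant names are hypothetical) maps each line to a pair of content type and payload, and splits an attribute payload at its first space.

```python
from typing import Iterable, Iterator, Tuple

# Prefix characters from Listing 4.
PYX_KINDS = {
    "(": "start-tag",
    ")": "end-tag",
    "A": "attribute",
    "-": "character data",
    "?": "processing instruction",
}


def read_pyx(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (content type, payload) for each non-empty PYX line."""
    for line in lines:
        line = line.rstrip("\r\n")  # drop the physical line ending only
        if not line:
            continue
        prefix, payload = line[0], line[1:]
        if prefix not in PYX_KINDS:
            raise ValueError("unknown PYX prefix: %r" % prefix)
        yield PYX_KINDS[prefix], payload


def split_attribute(payload: str) -> Tuple[str, str]:
    """Split an attribute payload into name and value at the first space."""
    name, _, value = payload.partition(" ")
    return name, value


if __name__ == "__main__":
    sample = ["(a", "Ahref http://ibm.com/developerworks/",
              "-IBM developerworks", ")a"]
    for kind, payload in read_pyx(sample):
        print(kind, repr(payload))
```

Splitting at the first space is safe because an attribute name cannot contain a space, while the value can.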

You can match this legend against the output in Listing 3. The biggest advantage of PYX is that it works well with the long-established, extremely useful UNIX® text processing commands, such as grep, awk, sort, and sed.
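The same line-oriented filtering can be done in a script. As a rough Python equivalent of running grep over PYX output, the sketch below (with hypothetical helper names) pulls out attribute values and text content using nothing more than prefix matching on each line.

```python
def attribute_values(pyx_lines, name):
    """Return the values of every attribute called `name`."""
    prefix = "A" + name + " "
    return [line[len(prefix):] for line in pyx_lines if line.startswith(prefix)]


def text_content(pyx_lines):
    """Concatenate all character data lines into one string."""
    pieces = []
    for line in pyx_lines:
        if line.startswith("-"):
            # Simplification: only the \n escape is decoded here.
            pieces.append(line[1:].replace("\\n", "\n"))
    return "".join(pieces)


if __name__ == "__main__":
    lines = [
        "(p",
        "-welcome to ",
        "(a",
        "Ahref http://ibm.com/developerworks/",
        "-IBM developerworks",
        ")a",
        "-.",
        ")p",
    ]
    print(attribute_values(lines, "href"))
    print(text_content(lines))
```

No XML parser is involved at this stage, which is the point of the format: once a document is in PYX, ordinary string operations are enough for many extraction tasks.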

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
