Introduction to 4Suite Server, an XML repository for Python

Source: Internet
Author: User
Tags xslt

Before reading further, you should be familiar with some of the technologies discussed in this column: Extensible Stylesheet Language Transformations (XSLT), XML Path Language (XPath), and the Resource Description Framework (RDF). The references section contains links to information about all of these technologies.
Overview of 4Suite Server

We will use 4Suite Server (4SS), an XML repository developed by the authors of this article, as the basis for the example application. 4Suite Server is an XML repository with many features for managing XML data and metadata. Whether or not you use Python, these features make 4Suite Server well suited to the rapid development of Web services.

The example in this article was written for 4Suite Server 0.11, which requires Python 1.5.2 or later and 4Suite 0.11. Download links for all of these are in the references section.
Online Software Repository

This article is the second installment of the "Python Web Services Developer" column, and the first part of a three-part series about building an online software repository. In this part, we build the infrastructure. In subsequent columns, we will describe how to use various protocols, such as Simple Object Access Protocol (SOAP), HTTP, and WWW Distributed Authoring and Versioning (WebDAV), to search the content, to create indexed content, and to add and retrieve content through agents.

The model for our online software repository service is loosely based on the RDF model used by RPMFind.net. RPMFind is a system that catalogs UNIX and Linux software packages in the popular Red Hat Package Manager (RPM) format. It holds key metadata about each package, including the author, version, and description, in RDF format (see Listing 1). For a brief definition of RDF, read the previous installment of this column, or follow the link in the references section to a basic introduction to this simple format.

The actual XML format is irrelevant. In fact, because the techniques described here apply to any type of XML content, the content does not have to describe software at all. You could use the same techniques for a product catalog, employee information, or even a restaurant's wine list.

All the code and data files used in this example can be downloaded from the link in the references section.
Document Definition

In the 4SS XML repository, a document definition lets you specify a mapping between XML content and RDF metadata. To do so, you define sets of three XPath expressions: a subject expression, a predicate expression, and an object expression. XPath expressions let you address sets of nodes in a document and return a subset of its content based on the relationships between those nodes. Whenever an XML document is added to, modified in, or deleted from the repository, these XPath expressions are evaluated against the document. The resulting statements, also known as triples, are automatically added to or removed from the RDF database. If a document is modified, its triples are updated to reflect the change. If a document is deleted, its triples are removed from the RDF server. A document definition can inherit from other document definitions, which lets you build up complex mappings between XML content and RDF metadata.
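To make the lifecycle described above concrete, here is a minimal sketch in modern Python 3 (not the Python 1.5.2 that 4SS 0.11 targets) of a store that keeps derived statements in sync as documents are added, modified, and deleted. It does not use the 4SS API: the class `StatementStore`, its methods, the `title_only` callback, and the path `/softrepo/other.xml` are all hypothetical names chosen for illustration.

```python
"""Toy model of keeping derived RDF statements in sync with documents.

Illustration only; none of these names come from 4Suite Server.
"""


class StatementStore:
    def __init__(self, extract):
        # extract(uri, content) -> iterable of (subject, predicate, object)
        self._extract = extract
        self._by_doc = {}  # document URI -> set of triples derived from it

    def put(self, uri, content):
        """Add or modify a document: its old triples are replaced."""
        self._by_doc[uri] = set(self._extract(uri, content))

    def delete(self, uri):
        """Delete a document together with every triple derived from it."""
        self._by_doc.pop(uri, None)

    def triples(self):
        """Return all current triples, sorted for stable output."""
        return sorted(t for ts in self._by_doc.values() for t in ts)


def title_only(uri, content):
    # Stand-in for evaluating the XPath expressions of a document
    # definition: here a "document" is a dict and only its title is mapped.
    yield (uri, "Title", content["title"])


store = StatementStore(title_only)
store.put("/softrepo/pong.xml", {"title": "Pong"})
store.put("/softrepo/pong.xml", {"title": "Pong 2"})  # modification
store.put("/softrepo/other.xml", {"title": "Other"})
store.delete("/softrepo/other.xml")                   # deletion
print(store.triples())
```

Keying the triples by the document they came from is what makes modification and deletion cheap in this sketch: replacing or dropping one dictionary entry removes every stale statement at once.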

In our sample application, we will extend one of the default document definitions. The default document definition describes the mapping of Dublin Core tags embedded in XML content to Dublin Core statements. Dublin Core is a metadata initiative that defines a set of standard properties (such as Creator, Title, and Date) for common Web-based objects. The derived document definition adds a statement for each document.

The following simple statement sets the Creator metadata of a document to the result of evaluating an XPath expression:

RdfStatement(subject='$uri', predicate="http://purl.org/dc/elements/1.1#Creator", object="/rdf:RDF/s:Software/dc:Creator")

(The code above is a single statement; it may wrap across lines to fit the page.)
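As a rough illustration of what evaluating such a statement involves, the sketch below uses Python 3's standard `xml.etree.ElementTree` rather than 4SS itself. The sample document, the namespace URI bound to the `s` prefix, the creator name, and the function `evaluate_statement` are assumptions made for the example. ElementTree supports only a subset of XPath and resolves paths relative to the root element, so the absolute path `/rdf:RDF/s:Software/dc:Creator` is written here as `s:Software/dc:Creator`.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "s": "http://example.org/software#",  # hypothetical namespace
}

# Hypothetical software description, not one of the article's sample files.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:s="http://example.org/software#">
  <s:Software>
    <dc:Creator>Jane Example</dc:Creator>
  </s:Software>
</rdf:RDF>
"""


def evaluate_statement(doc_uri, xml_text, predicate, object_path):
    """Return one (subject, predicate, object) triple per matched node.

    doc_uri plays the role of '$uri' in the statement above.
    """
    root = ET.fromstring(xml_text)
    return [
        (doc_uri, predicate, (node.text or "").strip())
        for node in root.findall(object_path, NS)
    ]


triples = evaluate_statement(
    "/softrepo/pong.xml",
    SAMPLE,
    "http://purl.org/dc/elements/1.1#Creator",
    "s:Software/dc:Creator",
)
print(triples)
```

A document with several `dc:Creator` elements would yield several triples with the same subject and predicate, which is the usual way RDF represents multi-valued properties.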

To add or update the system default data, run the populate.py script that ships with 4SS. It downloads useful data from ftp://ftp.fourthought.com to update your server. The downloaded data contains some common items, such as the Dublin Core document definition and the DocBook stylesheets (DocBook is a popular XML format for technical documents).

The script is installed automatically with the demo application when you install 4SS. On Unix-based machines, the populate script is generally found under /usr/doc/4SuiteServer-0.11 or /usr/local/doc/4SuiteServer-0.11. On Windows machines, it is generally under C:\Program Files\Python or C:\Python20. Listing 2 shows how to populate your 4SS installation.
Listing 2: Populating 4SS

[molson@penny example]$ python /usr/doc/4SuiteServer-0.11/demo/populate.py
Downloading XML Documents
Downloading Stylesheets
Downloading DocDefs
Adding XML document: 'null'
Adding stylesheet: 'docbook_html1.xslt'
Adding stylesheet: 'Presentation_toc.xslt'
Adding stylesheet: 'Presentation.xslt'
Adding stylesheet: 'docbook_text1.xslt'
Adding document definition: 'Dublin_core'
Adding document definition: 'docbook1'

Next, we must create a document definition for the software entries. To add a definition, we use the command-line script 4ss deserialize docdef, passing the name of the serialized file as its only argument. For example:

[molson@penny example]$ 4ss deserialize docdef software.docdef

Content

We will use 4ss create document from the command line to add new content to the system. The example download contains two software entries, in XML files named software1.rdf and software2.rdf. To add these files to the system, run the 4ss create document command, specifying the document definition to use, the name of the file to add, and a list of aliases for the resource in the system.

First, we need to create a container for the software repository on our server and set the container's permissions to allow write access for the "uo" group and read access for everyone (because we want to serve Web pages from this directory):

[molson@penny example]$ 4ss create container /softrepo
[molson@penny example]$ 4ss set acl --write=uo --world-read /softrepo

Then we add the downloaded sample files to the repository. Although the 4SS repository can store data in any format, it is highly optimized for storing XML data. When we add the tar file to the repository, we specify the --imt option to set the file's Internet media type (IMT), which here is application/x-gzip. Among other things, the HTTP server uses the IMT when it serves content from the repository. Note that an Internet media type is also known as a MIME type. See Listing 3 for how to add the content. In a more complex project, you might consider placing binary files in a separate container.
Retrieving content

Retrieving content is as simple as adding it. First, however, you must add a stylesheet to the repository. Our example files include a very simple stylesheet. To add it, use 4ss create document and give it the alias software.xslt. For example:

[molson@penny example]$ 4ss create document BASE_XSLT software.xslt softrepo/software.xslt

BASE_XSLT is a special document definition. It tells 4SS to treat the document as an XSLT stylesheet and to optimize it accordingly.

With the documents added, you can point your Web browser at the 4SS HTTP server (both a pure-Python server and Apache are supported) and go to http://localhost:8080/softrepo/pong.xml. This retrieves the pong software description document from the repository. If your browser supports the text/xml IMT (Internet Explorer and Mozilla do, for example), you will see the XML that was added to the repository. To tell the HTTP listener to process the page with XSLT before returning it, specify the xslt URI query parameter: http://localhost:8080/softrepo/pong.xml?xslt=software.xslt.
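The same requests can be made from a script rather than a browser. The following sketch, written for Python 3's standard library, only builds the two URLs discussed above. The helper name `repo_url` is hypothetical, and the default host and port are assumptions taken from the example URLs; adjust them to match your listener.

```python
from urllib.parse import urlencode, urlunsplit


def repo_url(path, stylesheet=None, host="localhost", port=8080):
    """Build a repository URL, optionally asking for XSLT processing."""
    query = urlencode({"xslt": stylesheet}) if stylesheet else ""
    return urlunsplit(("http", f"{host}:{port}", path, query, ""))


raw_url = repo_url("/softrepo/pong.xml")
styled_url = repo_url("/softrepo/pong.xml", stylesheet="software.xslt")
print(raw_url)
print(styled_url)

# With a 4SS HTTP listener running, the rendered page could then be
# fetched with urllib.request.urlopen(styled_url).read()
```

Building the query string with `urlencode` rather than string concatenation keeps the URL valid if a stylesheet alias ever contains characters that need escaping.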

Note that the link to the downloadable package on the page also points to localhost. This link also goes through the HTTP listener, which retrieves the resource we added for pong-0.0.2.tgz. When the resource is returned to the browser, it carries the Internet media type (IMT) that was specified when it was added to the system.

Generating the index page

To generate the index page, we will use some 4SS extension functions that give XSLT access to the RDF model. There are other ways to generate an index page in 4SS. One is to write a custom handler for HTTP GET requests in Python, which would query the RDF model whenever index.html is requested. Another is to use the 4SS event system to update the index.html document whenever a document is added to or deleted from the system.

Because XSLT is always applied to a source document, we will add a dummy source document to the system. The example download contains a source file named index.doc. Use 4ss create document to add this document to the repository, as shown below:

[molson@penny example]$ 4ss create document BASE_XML index.doc softrepo/index.doc
[molson@penny example]$ 4ss set acl --world-read softrepo/index.doc

In the stylesheet, we will use the extension function rdf.complete to collect information about all the software in the system. The extension function calls the complete method on the RDF model. The complete method searches the RDF model for statements that match a specified pattern. It takes up to three parameters: a subject, a predicate, and an optional object. Any of them can be an empty string, which matches anything. It returns all statements that match every value specified. For example, if you pass the subject foo and the object bar, it returns every statement with the subject foo, any predicate, and the object bar.
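The matching rule described above can be sketched in a few lines of Python 3. This is not the 4SS implementation; the free function `complete`, its `obj` parameter name, and the sample statements are assumptions made to show the wildcard behavior of empty strings.

```python
def complete(statements, subject="", predicate="", obj=""):
    """Return every triple matching the pattern; "" matches anything."""
    pattern = (subject, predicate, obj)
    return [
        triple
        for triple in statements
        if all(want == "" or want == have
               for want, have in zip(pattern, triple))
    ]


# Hypothetical statements for illustration.
STATEMENTS = [
    ("foo", "likes", "bar"),
    ("foo", "knows", "bar"),
    ("foo", "likes", "baz"),
    ("qux", "likes", "bar"),
]

# Subject foo, any predicate, object bar.
matches = complete(STATEMENTS, subject="foo", obj="bar")
print(matches)
```

Calling `complete(STATEMENTS)` with no arguments leaves all three positions as wildcards and so returns every statement.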

4SS automatically creates an RDF statement that links each document to its document definition. The subject of such a statement is the document URI, the predicate is http://schemas.4suite.org/4ss#metaxml.docdef, and the object is the name of the document definition. With this in mind, a single complete call that specifies the predicate and the object gives us a list of all documents in our system that use the software document definition.

The stylesheet used to generate the index is called index.xslt. The template that matches the root of the source document first calls rdf.complete. This function performs a complete operation on the RDF model for all statements that have http://schemas.4suite.org/4ss#metaxml.docdef as the predicate and software as the object. The result of calling rdf.complete is a node set of Statement elements. Each Statement element has three child elements: Subject, Predicate, and Object. As shown in Listing 4, we call xsl:apply-templates on the result of the function, and a template that matches Statement displays each software item.
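To show the shape of the data the stylesheet works with, the following Python 3 sketch builds Statement elements with Subject, Predicate, and Object children and then emits one link per statement, roughly what a template matching Statement might do. It is an illustration, not the contents of index.xslt: the wrapper element `Statements`, both function names, and the generated HTML are assumptions.

```python
import xml.etree.ElementTree as ET

DOCDEF = "http://schemas.4suite.org/4ss#metaxml.docdef"


def statements_to_xml(triples):
    """Wrap triples in Statement elements with three children each."""
    root = ET.Element("Statements")  # hypothetical wrapper element
    for subject, predicate, obj in triples:
        st = ET.SubElement(root, "Statement")
        ET.SubElement(st, "Subject").text = subject
        ET.SubElement(st, "Predicate").text = predicate
        ET.SubElement(st, "Object").text = obj
    return root


def render_index(statements_root):
    """Emit one list item per Statement, linking to its Subject URI."""
    ul = ET.Element("ul")
    for st in statements_root.findall("Statement"):
        uri = st.findtext("Subject")
        li = ET.SubElement(ul, "li")
        link = ET.SubElement(li, "a", href=uri + "?xslt=software.xslt")
        link.text = uri
    return ET.tostring(ul, encoding="unicode")


# One statement of the kind 4SS records for a document that uses the
# software document definition.
matched = [("/softrepo/pong.xml", DOCDEF, "software")]
html = render_index(statements_to_xml(matched))
print(html)
```

Only the Subject is needed to build each link here, because the Predicate and Object were already fixed by the query that selected the statements.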

To view the generated index page, go to http://localhost:8080/softrepo/index.doc?xslt=index.xslt.

This index is not static, however, and it is easy to see how to extend this simple page to present the software titles in any style. You can change how each item is displayed in the stylesheet, or add more mappings to the document definition to adjust the data available to the stylesheet.

Conclusion

So far we have not written any Python code, but we have taken a quick tour of some of the features of 4Suite Server. In next month's column, we will build on this example so that the software repository can manage content and search all of the generated metadata.
