Cute Python: My first web-based filter agent

This article describes txt2html, a public-domain working project created by David, to illustrate programming techniques in Python. Txt2html is a web-based filter agent: a program that reads web-based documents on a user's behalf and then displays a modified page in the user's browser. To make this possible, txt2html runs as a CGI program, queries external web resources, and uses regular expressions. David explains, illustrates, and demonstrates these reusable subtasks step by step.

While writing this developerWorks series, I have struggled with the question of which format is best to write in. Word processor formats are proprietary, and converting between them is often imperfect and cumbersome (and each format ties the document to a different proprietary tool, which runs against the spirit of open source). HTML is fairly neutral, and is probably the format you are reading right now, but it adds tags that are easy to mistype (or that tie the writer to an HTML-aware editor). DocBook is an interesting XML format that can be converted into many target formats, and it has the right semantics for technical articles (or books), but like HTML it has many tags to worry about while writing. LaTeX is excellent for complex typesetting, but it also has a lot of tags, and these articles do not need complex typesetting.

For really concentrating on the writing itself, plain ASCII is the best choice, especially given its neutrality across platforms and tools. Beyond completely unformatted text, however, the Internet (especially Usenet) has fostered an informal standard for "smart ASCII" documents (see Resources). Smart ASCII adds only a little extra semantic content and context, and that markup looks "natural" in a plain text display. E-mail, newsgroup messages, FAQs, project READMEs, and other electronic documents commonly include some typographic/semantic elements, such as asterisks around emphasized words, underscores under headings, vertical and horizontal whitespace that shows how pieces of text relate, selective capitalization, and so on. Project Gutenberg (see Resources) is a remarkable effort that put a great deal of thought into its own format and concluded that "smart ASCII" is the best choice for preserving and distributing good books over the long term. Even though these articles will not last as long as the literary classics, I decided to write them in "smart ASCII" and convert them automatically to other formats with handy Python scripts.
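As a rough illustration of how little markup "smart ASCII" needs, the sketch below turns two of the conventions mentioned above (asterisks around emphasized words, and a rule drawn under a heading) into HTML with regular expressions. The function names, the choice of `<em>`, `<h1>`, and `<h2>`, and the decision to accept rows of `=`, `-`, or `_` as heading rules are all assumptions made for this example; none of it is taken from txt2html itself.

```python
import html
import re

# *word* or *several words*: no whitespace just inside the asterisks,
# and no asterisks or newlines in between.
EMPHASIS = re.compile(r"\*([^*\s](?:[^*\n]*[^*\s])?)\*")

# A line of text followed by a line made only of '=', '-', or '_'.
UNDERLINED = re.compile(
    r"^(?P<title>[^\n]+)\n(?P<rule>=+|-+|_+)[ \t]*$", re.MULTILINE
)


def _heading(match):
    """Pick <h1> for '=' rules and <h2> for anything else (an assumption)."""
    level = "h1" if match.group("rule").startswith("=") else "h2"
    return "<{0}>{1}</{0}>".format(level, match.group("title").strip())


def smart_ascii_to_html(text):
    """Convert a tiny subset of "smart ASCII" conventions to HTML."""
    text = html.escape(text, quote=False)  # protect &, <, > before adding tags
    text = UNDERLINED.sub(_heading, text)
    text = EMPHASIS.sub(r"<em>\1</em>", text)
    return text
```

Because the emphasis pattern refuses whitespace directly inside the asterisks, an expression such as `2 * 3 * 4` is left alone, which is the kind of care a real converter needs in much larger quantity.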

Introducing txt2html

Txt2html started out as a simple file converter, as its name suggests. But the Internet suggests a few obvious enhancements to the tool. Because many of the documents that readers want to view in HTML format sit behind http: or ftp: links, the tool should really handle such remote documents directly (without a download/convert/view cycle). And because the target of the conversion is ultimately HTML, all that remains is to view the converted document in a web browser.

Put together, these make txt2html a "web-based filter agent." The phrase is fancy enough that it may even convey what it means. It reflects the idea of a program that reads a web page (or other resource) on your behalf, processes the content in some way, and then presents the page to you in a form that is better than the original (at least for some particular purpose). A good example of such a tool is the Babelfish translation service (see Resources). After you run a URL through Babelfish, the web page you see looks much like the original, but it shows text in a language you can read rather than one you cannot. In a sense, every search engine that displays summaries of search results does the same thing. But search engines (by design) take much more liberty with the format and appearance of the target page, and they leave out a lot of content. Txt2html is certainly not as powerful as Babelfish, but conceptually the two do largely the same thing. See Resources for more examples, some of them humorous.
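The read/transform/display cycle of a filter agent can be sketched in a few lines of current Python 3. This is only an illustration under stated assumptions: `fetch_text`, `filter_page`, and the trivial `<pre>` wrapper are hypothetical names invented here, not the dmtxt2html code, and the default transformation does nothing more than escape the text. Fetching uses `urllib.request.urlopen` from the standard library, which handles http:, ftp:, and file: URLs.

```python
import html
from urllib.request import urlopen


def fetch_text(url, encoding="utf-8"):
    """Read a remote (or file:) resource and decode it to text."""
    with urlopen(url) as response:
        return response.read().decode(encoding, errors="replace")


def filter_page(url, transform=html.escape, fetch=fetch_text):
    """Fetch `url`, run `transform` over its text, and wrap the result as HTML.

    `fetch` is a parameter so that the network access can be swapped out,
    for example by a stub while experimenting offline.
    """
    body = transform(fetch(url))
    return (
        "<html><head><title>{0}</title></head>\n"
        "<body><pre>{1}</pre></body></html>"
    ).format(html.escape(url), body)
```

Keeping the three steps as separate callables means a more ambitious transformation can be dropped in later without touching the fetching or the page wrapper.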

The best thing about txt2html is that it uses many programming techniques that are common to different web-oriented uses of Python. This article walks through those techniques and explains the coding approach and the scope of several Python modules. Note that the actual module in txt2html is named dmtxt2html, to avoid clashing with modules of the same name written by others.

Using the cgi module

The cgi module in the standard Python distribution is a pleasant surprise for anyone developing "Common Gateway Interface" applications in Python. You could write a CGI without it, but you would not want to.

Most often, you interact with a CGI application through an HTML form. You fill out a form that calls the CGI and specifies the operation to perform. For example, the txt2html documentation uses the following HTML form to call it (the form generated by txt2html itself is more complex and may change, but this example works well, even from your own web page):

An HTML form that calls 'txt2html'

<form method="get" action="http://gnosis.cx/cgi/txt2html.cgi">
  URL: <input type="text" name="source" size="40">
  <input type="submit" name="go" value="Display!">
</form>
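On the receiving side, a CGI program has to pick the submitted fields out of the request. The article discusses the `cgi` module for this, but that module was deprecated in Python 3.11 and removed in 3.13, so the sketch below uses `urllib.parse.parse_qs` on the `QUERY_STRING` environment variable instead. That substitution is an assumption made for current Python versions, not a description of what txt2html does. The field name `source` comes from the form above; `get_source` and `build_response` are hypothetical names.

```python
import html
import os
from urllib.parse import parse_qs


def get_source(query_string):
    """Return the value of the form's `source` field, or None if missing."""
    values = parse_qs(query_string).get("source")
    return values[0].strip() if values else None


def build_response(query_string):
    """Build a complete CGI response: header, blank line, then the body."""
    source = get_source(query_string)
    if source:
        body = "<p>Requested document: {}</p>".format(html.escape(source))
    else:
        body = "<p>No URL was supplied.</p>"
    return (
        "Content-Type: text/html\r\n\r\n"
        "<html><body>{}</body></html>".format(body)
    )


if __name__ == "__main__":
    # With method="get", the web server passes the form data in QUERY_STRING.
    print(build_response(os.environ.get("QUERY_STRING", "")))
```

Escaping the submitted value before echoing it back matters even in a toy script, because the field content comes straight from the user.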
