XML
As the internet world rapidly turns its attention to XML and its related technologies, a problem arises: what about Web sites that were previously built with HTML? For organizations that are just starting to build their information systems, it is natural to design Web pages with XML technology, but for established companies with large collections of HTML files, rebuilding the site is expensive and time-consuming. So how should the legacy HTML data be handled?
A newer technology, Extensible HyperText Markup Language (XHTML), is considered the ideal tool for migrating traditional HTML to XML.
Migrating from HTML to XML
HTML is a simple markup language. In practice it has accumulated a number of proprietary tags that are not supported by all browsers. Some elements used purely for visual effect, such as the <font> tag, simply make HTML documents larger. HTML also offers poor support for new small-screen devices such as PDAs and mobile phones. It can be said that HTML is not suitable to continue as the standard language for pages and information delivery.
How can the HTML documents accumulated over the years continue to work in the new environment? The transition to XML is the solution. XML documents carry clear information about document structure and can be flexibly exported to meet a variety of needs. XML is not a fixed, predefined markup language like HTML or WML (Wireless Markup Language), but rather a language standard that lets users define markup suited to their own data and document content. Users can create tags that describe their documents more accurately and appropriately than HTML can.
Extensible Stylesheet Language (XSL) provides a way to produce every required output format from a single stored XML file. Many XSL-based products can take an XML file and, by applying a particular stylesheet, output HTML documents that display correctly in various browsers; the same XML document can be combined with other stylesheets to produce WML documents for wireless devices. All the designer has to do is create a stylesheet for each output format, with no changes to the content of the document. In other words, the "body" stays the same, but the "coat" is free to choose.
How HTML is converted into XHTML
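The "same body, different coat" idea can be sketched in a few lines. The example below is not XSLT: the Python standard library has no XSLT processor, so two plain functions stand in for two stylesheets. The element names (article, title, para) and the sample text are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stored document; the element names are illustrative only.
SOURCE = """<article>
  <title>Migration notes</title>
  <para>Keep the content in one place.</para>
  <para>Choose the presentation at output time.</para>
</article>"""


def to_html(doc):
    """Render the document as a small HTML page (one 'coat')."""
    html = ET.Element("html")
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = doc.findtext("title")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = doc.findtext("title")
    for para in doc.findall("para"):
        ET.SubElement(body, "p").text = para.text
    return ET.tostring(html, encoding="unicode")


def to_wml(doc):
    """Render the same document as a single WML card (another 'coat')."""
    wml = ET.Element("wml")
    card = ET.SubElement(
        wml, "card", {"id": "main", "title": doc.findtext("title")}
    )
    for para in doc.findall("para"):
        ET.SubElement(card, "p").text = para.text
    return ET.tostring(wml, encoding="unicode")


doc = ET.fromstring(SOURCE)   # the content is parsed once
html_out = to_html(doc)       # and rendered for desktop browsers
wml_out = to_wml(doc)         # and for wireless devices
```

The source document is never modified; only the rendering function changes. A real deployment would express the two renderings as XSL stylesheets and run them through an XSLT processor.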
The trouble with migrating traditional HTML to XML is that it is not easy to separate the content of an HTML document from its presentation. So how should an HTML document be modified? One alternative is to use XHTML, which combines the advantages of HTML and XML. Because XHTML is very similar to HTML, existing HTML can be cleaned up and refitted into new XHTML documents with little effort, and the transition from HTML to XHTML is much simpler than rebuilding the content directly as XML documents.
First, XHTML is case-sensitive. Element and attribute names in XHTML must be lowercase, so some of the conventions used to make HTML documents more readable are no longer available. For example, HTML authors often wrote attribute names in uppercase and their values in lowercase; this was somewhat more readable, but it is not allowed in XHTML.
Second, XHTML strictly requires that every element have both a start tag and an end tag. The HTML habit of opening a tag and then moving on to other content without closing it must be corrected. In XHTML, all non-empty elements must be closed. One common habit among developers is to place a single <p> tag between two paragraphs, instead of starting each paragraph with <p> and ending it with </p>. In addition, all XHTML attribute values must be quoted; that is, a statement such as <table border=2> needs to be rewritten as <table border="2">.
Finally, elements such as <head> and <body> are mandatory in XHTML, and the <title> element must be placed inside the <head> section.
Once these changes have been made, the original HTML file not only continues to display properly in HTML browsers, but can also be processed by XML-enabled software.
HTML conversion tools
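The practical effect of these rules can be seen by handing both styles of markup to an ordinary XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; the sample page is hypothetical. Note that this parser checks only XML well-formedness (closed elements, quoted attribute values); it does not validate against the XHTML DTD, so it would not reject uppercase element names on its own.

```python
import xml.etree.ElementTree as ET

# Legacy HTML habits: unquoted attribute value and unclosed <P> elements.
LEGACY_HTML = """<HTML><HEAD><TITLE>Price list</TITLE></HEAD>
<BODY><TABLE BORDER=2><TR><TD>Item</TD></TR></TABLE>
<P>First paragraph
<P>Second paragraph
</BODY></HTML>"""

# The same page rewritten according to the XHTML rules described above.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Price list</title></head>
<body><table border="2"><tr><td>Item</td></tr></table>
<p>First paragraph</p>
<p>Second paragraph</p>
</body></html>"""


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Once the page is well-formed, generic XML tooling can query it.
NS = {"x": "http://www.w3.org/1999/xhtml"}
root = ET.fromstring(XHTML)
title = root.findtext("x:head/x:title", namespaces=NS)
paragraphs = [p.text for p in root.iterfind(".//x:p", NS)]
```

Here is_well_formed(LEGACY_HTML) is False, while the XHTML version parses and its title and paragraphs can be read like any other XML data.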
If your site has only a small number of documents to convert, the work can be done by hand. But if HTML documents accumulated over many years need to be converted, you will need a tool to help. A number of commercial and free tools are available that can be used both to assist with conversion and to edit files directly in the new XHTML format.
HTML Tidy is a very basic but useful tool that runs on a variety of platforms. It can be used to clean up markup errors in HTML files (measured against the XHTML standard) and to reformat HTML files to make them more readable. HTML Tidy has become a widely used tool for converting HTML to XHTML.
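For batch conversion, Tidy can be driven from a script. The following is a minimal sketch that assumes a tidy executable is installed and on the PATH; the helper names tidy_command and convert, and any file names passed to them, are hypothetical. It relies on Tidy's -asxhtml option (convert to well-formed XHTML), -o (output file), and -q (suppress non-essential output).

```python
import shutil
import subprocess


def tidy_command(src, dest):
    """Build the argument list for converting one HTML file to XHTML."""
    return ["tidy", "-q", "-asxhtml", "-o", dest, src]


def convert(src, dest):
    """Run Tidy on src and write XHTML to dest; return True on success."""
    if shutil.which("tidy") is None:
        raise RuntimeError("tidy executable not found on PATH")
    result = subprocess.run(tidy_command(src, dest))
    # Tidy exits with 0 when clean, 1 when there were warnings,
    # and 2 when there were errors it could not repair.
    return result.returncode < 2
```

Calling convert("old.html", "new.xhtml") in a loop over a directory would cover an entire legacy site; files for which it returns False still need manual attention.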
HTML-Kit is a free program that can be run on many platforms. Not only does it help with HTML editing, formatted output, validity checking, previewing, and publishing, but it can also convert HTML to XHTML from its graphical interface. Its user interface has one window that displays the source file, another that displays the converted markup, and a third that shows errors and suggestions for improving the XHTML.
Moving directly to the XML standard
HTML that has been converted into new XHTML documents will no longer cause trouble when it is browsed and displayed. But if you want its content to be usable in every kind of application, consider building XML documents directly. This requires extracting the content from the existing HTML and separating the content markup from the presentation markup.
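The extraction step can be illustrated with a small sketch built on Python's standard html.parser module, which tolerates legacy HTML. The mapping from HTML elements to content-only element names (title, para, document) is an assumption made for this example, as is the sample input; presentational tags such as <font> and <b> are simply dropped while their text is kept.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Assumed mapping from HTML structure to content-only element names.
STRUCTURAL = {"h1": "title", "p": "para"}


class ContentExtractor(HTMLParser):
    """Collect text from structural elements, ignoring presentation tags."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("document")
        self.current = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lowercase.
        if tag in STRUCTURAL:
            # A new structural element implicitly ends an unclosed one.
            self.current = ET.SubElement(self.root, STRUCTURAL[tag])
            self.current.text = ""

    def handle_endtag(self, tag):
        if tag in STRUCTURAL:
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.current.text += data


def extract(html):
    """Return a content-only XML string for a legacy HTML fragment."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    for element in parser.root:
        element.text = " ".join(element.text.split())  # normalize whitespace
    return ET.tostring(parser.root, encoding="unicode")


LEGACY = (
    "<H1><FONT color=red>Price list</FONT></H1>"
    "<P>All prices in <B>euros</B>.<P>Updated weekly."
)
xml_out = extract(LEGACY)
```

The result contains only the title and paragraph text; all presentation decisions are left to a stylesheet applied later.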
XSplit is a new tool introduced by Percussion Software. XSplit enables Web developers to convert HTML documents into corresponding XSL stylesheets. It can also create a DTD (Document Type Definition) file that defines the format of the XML, and can use the static content to create a sample XML document.