The Origin of an Automatic Static Web Page Generator, and Parsing HTML Files


I have seen many large websites serve static pages. For performance, that is clearly a sound choice for sites of that kind. I had long wanted to implement this myself, but with no urgent need it kept getting shelved. Now a project has finally decided to use static page generation, so I made up my mind to solve the problem.

I considered a number of options and rejected them one by one. One was an XML-based scheme, an idea borrowed from CSDN: store the data in XML files and define an XSL stylesheet that the client uses to transform them. Its biggest drawback is that it cannot cope with complex page layouts. For a very complex page it is hard to write a suitable XSL stylesheet, and the processing cost on the client may be unacceptable. Another was to store the data in JS files and render it with JavaScript, but not every client supports JavaScript, and I do not think the result is well structured or easy to manage. The last was to use templates. Many people define special placeholder strings in a template and then replace them one at a time, which is inefficient, error-prone, and makes it hard to accommodate changes in page layout and content.

After weighing these issues I settled on templates, though not the placeholder-substitution kind described above. My plan is to implement custom data controls that can be placed in a template. Generating a static page then means parsing the original template file, recognizing the controls automatically, and binding data to them through a custom data structure. Experienced readers will notice that this resembles the ASP.NET approach. That is indeed where my inspiration came from, although at my level I cannot hope to match it. With this approach the design can be changed freely, within limits, without touching the program. Even if the layout changes a great deal, as long as the data stays the same or changes only slightly, the program needs little or no modification.

My experience is limited, so there are bound to be flaws here. I hope readers will suggest better ideas for generating static pages, and criticism is welcome too.

After that long preamble, let's get to how it is done. As mentioned above, I split the process into two steps: parse the template, then bind the data. So far I have implemented only template parsing. First, the design of the custom data controls: I use HTML-style syntax, <flag name=value>body</flag>. This lets me parse custom controls the same way I parse HTML itself.
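To make the two steps concrete, here is a minimal Python sketch of "parse, then bind". It is not the code from this project: it uses a regular expression as a stand-in for the real parser, and the `data:` tag prefix, the `Control` class, and the rule that a control's `name` attribute selects the value to bind are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Assumption: controls are marked with a hypothetical "data:" prefix so they
# can be told apart from ordinary HTML tags. The article only gives the
# general form <flag name=value>body</flag>.
CONTROL = re.compile(
    r"<(?P<flag>data:\w+)(?P<attrs>[^>]*)>(?P<body>.*?)</(?P=flag)>", re.S)
ATTR = re.compile(r'(\w+)=("[^"]*"|\S+)')


@dataclass
class Control:
    flag: str    # tag name, e.g. "data:text"
    attrs: dict  # attributes from the head, e.g. {"name": "title"}
    body: str    # text between the opening and closing tag


def parse_template(template):
    """Step 1: split the template into literal strings and Control objects."""
    parts, pos = [], 0
    for m in CONTROL.finditer(template):
        if m.start() > pos:
            parts.append(template[pos:m.start()])
        attrs = {k: v.strip('"') for k, v in ATTR.findall(m.group("attrs"))}
        parts.append(Control(m.group("flag"), attrs, m.group("body")))
        pos = m.end()
    if pos < len(template):
        parts.append(template[pos:])
    return parts


def bind(parts, data):
    """Step 2: replace each control with the value bound to its name.

    A control whose name is missing from data falls back to its own body.
    """
    out = []
    for part in parts:
        if isinstance(part, Control):
            out.append(str(data.get(part.attrs.get("name"), part.body)))
        else:
            out.append(part)
    return "".join(out)


template = "<h1><data:text name=title>Untitled</data:text></h1>"
parts = parse_template(template)
page = bind(parts, {"title": "Hello"})
```

Because the parsed parts are kept separate from the data, the same template can be bound again with different data without being parsed a second time.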

For the actual implementation I looked through some books on compiler principles and skimmed the material on lexical analysis. It is complicated and I could not follow all of it. Still, I got something out of it: the state transitions of finite automata in particular gave me the idea. Leaving aside HTML's special syntax, ordinary HTML tags have the form <flag name=value>body</flag>. I defined five states for the character scan: empty, looking for a control, looking for a control's head, looking for a control's content, and looking for the end of the control. (Note: while writing this article it occurred to me that "looking for a control" is superfluous, but it stays for now.) Transitions between the states depend on five classes of characters that I define: non-boundary characters, start boundary characters, end boundary characters, closing boundary characters, and short-form end boundary characters. All of these are defined as enumerations in the source code.

It is getting late and I am heading home, so I will stop here for today. I think the source is fairly clearly structured and well commented, so anyone interested in the details of the implementation should be able to read the source code directly. If that turns out not to be enough, or if I find the time, I will follow up with a write-up of the whole design.

BTW: the parsing part is now basically complete. The structure is done, but there are still many bugs and debugging is tedious, so I would appreciate plenty of feedback.

Blog: http://homer.cnblogs.com/

Supporting source code for this article
