Automatic Static Web Page Generator (II): HTML File Parsing

Source: Internet
Author: User
First of all, I apologize. I have been busy contacting companies about a summer internship over the past few days, so I have not replied to comments or posted updates in time. I have not found a position yet; if you can, please help me look for one.

I am still fairly satisfied with the current HTML parsing algorithm. At this stage I have added support for special HTML syntax such as br, input, img, meta, script, and comments, and my tests give satisfactory results for pages that fully comply with the HTML syntax specification. The weakness is the lack of error tolerance: pages that do not conform to HTML syntax may cause parsing errors or exceptions. To turn this into a complete HTML parser, I think a priority algorithm must be implemented to provide error tolerance. That problem is quite complicated, however, and I cannot solve it well for the time being. For now I have decided on a compromise: parse only custom controls and leave ordinary HTML elements unprocessed. The current algorithm makes this easy to implement.

I think the approach of using a user control and intercepting Render when the control is output is workable. However, I feel that it is not flexible enough in actual use. I prefer to implement a lightweight, easy-to-control static page generation class library that can be easily customized and extended.

For most websites, the homepage and category pages are requested frequently and their content needs frequent updates, so the ASP.NET cache mechanism is a good choice for them. Content pages such as news or software download pages, however, are requested relatively rarely, so neither dynamic generation nor caching suits them. In this case, the best choice is to generate a static page and save it on the hard disk.
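The "generate once, then serve from disk" idea for content pages can be sketched roughly as follows. This is only an illustration in Python, not the author's class library: the function names, the one-file-per-page layout, and the `render` callback are all assumptions made for the example.

```python
from pathlib import Path


def save_static_page(output_dir, page_id, html):
    """Write rendered HTML to <output_dir>/<page_id>.html and return the path."""
    target = Path(output_dir) / f"{page_id}.html"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target


def get_page(output_dir, page_id, render):
    """Return the saved page if it exists; otherwise render it once and save it.

    `render` is a hypothetical callback that builds the HTML for a page id,
    for example by filling a template from a database record.
    """
    target = Path(output_dir) / f"{page_id}.html"
    if target.exists():
        return target.read_text(encoding="utf-8")
    html = render(page_id)
    save_static_page(output_dir, page_id, html)
    return html
```

In a real site the web server would normally serve the saved file directly, so the rendering code only runs when a page is published or regenerated.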
In addition, for homepage or category pages with low real-time requirements, you can also use the shtml include mechanism: generate the static content parts of the template and save them on the hard disk.

As mentioned at the beginning of this article, development has been suspended for the past few days while I contact companies about a summer internship. For now I am posting what I have written so far; the rest is to be continued.

The entire parsing module consists of three classes: staticcontrolfactory, parsestatusmanager, and staticbasecontrol. staticcontrolfactory is mainly responsible for character analysis of the template and for control processing; parsestatusmanager is mainly responsible for handling the parsing state of a control and sending processing commands to staticcontrolfactory; staticbasecontrol is an entity class that stores the related results. The code structure itself is very clear and heavily commented, so I will not explain it in detail and will only illustrate a few key points.

staticcontrolfactory uses a stack to save the control hierarchy. Each time a new control is created, it is added to the child control collection of the current control, the current control is pushed onto the stack, and the new control becomes the current control. After the new control has been parsed, the reverse operation is performed: the previous control is popped from the stack and becomes the current control again.

The state management of parsestatusmanager takes a similar approach but without the hierarchy of the former. Each time staticcontrolfactory reads a boundary character, it calls the changestatus method of the parsestatusmanager object to switch to the corresponding state. parsestatusmanager operates on staticcontrolfactory through a proxy and the Command pattern. The biggest advantage is that parsestatusmanager does not need to know many method signatures of staticcontrolfactory; it only needs to pass in the appropriate command.
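To make the division of work concrete, here is a minimal Python sketch of the three roles, assuming well-formed input made only of paired tags and text. It is not the original code: the class names are CamelCase renderings of the names in the article, and the Command values, the two states, and the plain callable standing in for the proxy are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Command(Enum):
    OPEN_CONTROL = auto()
    CLOSE_CONTROL = auto()
    APPEND_TEXT = auto()


class Status(Enum):
    TEXT = auto()  # between tags
    TAG = auto()   # between '<' and '>'


@dataclass
class StaticBaseControl:
    """Entity class: it only stores parse results."""
    name: str
    text: str = ""
    children: list = field(default_factory=list)


class ParseStatusManager:
    """Tracks the parse state and issues commands through `send`."""

    def __init__(self, send):
        self.send = send  # the only thing the manager knows about the factory
        self.status = Status.TEXT

    def change_status(self, boundary, buffered):
        if boundary == "<" and self.status is Status.TEXT:
            if buffered:
                self.send(Command.APPEND_TEXT, buffered)
            self.status = Status.TAG
        elif boundary == ">" and self.status is Status.TAG:
            if buffered.startswith("/"):
                self.send(Command.CLOSE_CONTROL, buffered[1:])
            else:
                # Attributes are not split out in this sketch.
                self.send(Command.OPEN_CONTROL, buffered)
            self.status = Status.TEXT


class StaticControlFactory:
    def __init__(self):
        self.root = StaticBaseControl("root")
        self.current = self.root
        self.stack = []
        self.handlers = {
            Command.OPEN_CONTROL: self._open,
            Command.CLOSE_CONTROL: self._close,
            Command.APPEND_TEXT: self._append_text,
        }

    def execute(self, command, arg):
        self.handlers[command](arg)

    def _open(self, name):
        control = StaticBaseControl(name)
        self.current.children.append(control)  # child of the current control
        self.stack.append(self.current)        # remember the parent
        self.current = control                 # new control becomes current

    def _close(self, _name):
        self.current = self.stack.pop()        # reverse of _open

    def _append_text(self, text):
        self.current.text += text

    def parse(self, template):
        manager = ParseStatusManager(self.execute)
        buffered = ""
        for ch in template:
            if ch in "<>":  # boundary characters trigger a state change
                manager.change_status(ch, buffered)
                buffered = ""
            else:
                buffered += ch
        if buffered:
            self.execute(Command.APPEND_TEXT, buffered)
        return self.root
```

For example, `StaticControlFactory().parse("<page><title>Hi</title></page>")` returns a root control whose single child is named "page", which in turn holds a "title" child with the text "Hi". Supporting a new kind of processing means adding one Command member and one handler; ParseStatusManager itself still needs only `send`.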
If new processing is needed during a state transition, you only need to make staticcontrolfactory support a new command.

To support tags that do not follow the <flag name = value> body </flag> format, including tags that may have no end tag or that use the short /> ending, such as IMG and BR, several methods were added to staticcontrolfactory. readnextword obtains the next word after a control start tag to determine whether the current tag is one of the defined special tags. processforscriptblock handles script blocks. If you want to change the class library to parse only the special tags that we define, call readnextword each time a control start tag is read to get the control type, and start parsing the control only if the check passes; do the same for each control end tag that is read.
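The look-ahead described here might look roughly like the Python sketch below. It is an illustration under assumptions, not the original readnextword: the set of tags without an end tag, the category names, and the "static:" prefix used to mark custom controls are all hypothetical.

```python
# Tags that may appear without an end tag (assumed list for this sketch).
VOID_TAGS = {"br", "img", "input", "meta"}


def read_next_word(template, start):
    """Return the lower-cased tag name that begins at index `start`.

    `start` is expected to point just past the '<' of a start tag.
    """
    end = start
    while end < len(template) and (template[end].isalnum() or template[end] in ":_-"):
        end += 1
    return template[start:end].lower()


def classify_tag(template, lt_index, custom_prefix="static:"):
    """Decide how the tag whose '<' sits at `lt_index` should be handled."""
    word = read_next_word(template, lt_index + 1)
    if word.startswith(custom_prefix):
        return "custom"  # a control we define: parse it as a control
    if word == "script":
        return "script"  # contents are skipped as a block, not parsed as tags
    if word in VOID_TAGS:
        return "void"    # no end tag should be expected
    return "plain"       # ordinary HTML: pass through untouched
```

With this, `classify_tag('<IMG src="a.png">', 0)` returns "void" and `classify_tag('<static:news id="1">', 0)` returns "custom". A parser that handles only custom controls would start control parsing only for the "custom" result and copy everything else to the output unchanged.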

The current solution is actually my second one. The first used recursion; its design was not very clear and debugging was quite troublesome. The current solution is much clearer. Data binding was partially implemented in my first solution, mainly through traversal and reflection. Unfortunately, it supports only data sources such as object arrays, mainly because my current project uses the petshop3 architecture, which passes data around as data entities, and it does not implement binding in the style of controls such as Repeater. However, both problems should be relatively simple and quick to implement. I will release a usable version as soon as possible.
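As a rough picture of binding by traversal and reflection over an array of entity objects, here is a small Python sketch. The {name} placeholder syntax, the function names, and the NewsInfo entity are all made up for the example; the original binding syntax is not shown in this article.

```python
import re
from dataclasses import dataclass

# Hypothetical placeholder syntax: {property_name}
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def bind_item(item_template, entity):
    """Fill each {name} placeholder with the entity attribute of that name.

    Attributes are looked up by name at run time (reflection); an unknown
    name is replaced with an empty string.
    """
    def lookup(match):
        return str(getattr(entity, match.group(1), ""))

    return PLACEHOLDER.sub(lookup, item_template)


def bind_all(item_template, entities):
    """Traverse the entities and concatenate one rendered item per entity."""
    return "".join(bind_item(item_template, entity) for entity in entities)


@dataclass
class NewsInfo:
    """Example data entity (hypothetical)."""
    title: str
    hits: int
```

For instance, `bind_all("<li>{title} ({hits})</li>", [NewsInfo("A", 3), NewsInfo("B", 5)])` produces `<li>A (3)</li><li>B (5)</li>`.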
