An automated tool for batch-converting Word documents to HTML



An enterprise has many departments, and each one produces documents. Managers need to read some of them. The usual approach is to print the edited documents for managers to read or, to save cost, to copy the files between people or departments. If employees or managers do not manage their documents carefully, multiple versions accumulate and the relevant document becomes hard to find later.

The company therefore appoints a document administrator who is responsible for document management. There are two ways to manage the documents: keeping the .doc (Word) files with the administrator, and sharing them.

In the first method, only the document administrator is involved. The administrator collects the documents of each department and organizes them into one directory per department (the planning department, for example).

Within each department, the documents are then organized into one directory per person.

The first method solves the storage problem: only the latest version of each document is kept, and the documents are gathered in one place. Anyone who needs a document asks the document administrator to print it and hand it over.

But which documents do managers or employees need? A manager is not a machine with perfect recall and does not know which documents the company has. And who will spot errors in a document? Solving these problems requires sharing the documents, which is the second method.

The solution is to set up a website on the LAN so that anyone in the enterprise can browse the company's documents. Through this website, everyone knows which documents exist, and information is easier to find.

The website is simple. It copies the directory structure (department, then person), converts each .doc file to HTML so that the page name matches the document name, and generates a homepage automatically. With that, the website is complete.
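The mapping from a document to its page can be sketched as follows. This is a minimal illustration, not the tool's actual code: the class and method names (`PagePaths`, `ToPagePath`) and the example paths are hypothetical, and it assumes every document lives somewhere under a single source root.

```csharp
using System;
using System.IO;

static class PagePaths
{
    // Maps a Word file under sourceRoot to an .html page at the same
    // relative location (department\person\name) under outputRoot.
    public static string ToPagePath(string sourceRoot, string outputRoot, string docPath)
    {
        string root = Path.GetFullPath(sourceRoot).TrimEnd(Path.DirectorySeparatorChar)
                      + Path.DirectorySeparatorChar;
        string full = Path.GetFullPath(docPath);
        if (!full.StartsWith(root, StringComparison.OrdinalIgnoreCase))
            throw new ArgumentException("Document is not under the source root.", nameof(docPath));

        // Keep the department/person part of the path, swap only the extension.
        string relative = full.Substring(root.Length);
        return Path.Combine(outputRoot, Path.ChangeExtension(relative, ".html"));
    }
}
```

On Windows, `PagePaths.ToPagePath(@"D:\docs", @"D:\site", @"D:\docs\Planning\Alice\report.doc")` yields `D:\site\Planning\Alice\report.html`. The target directory still has to be created (for example with `Directory.CreateDirectory`) before the page is saved.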

Technical implementation:
This is only an overview. For the details, download the source code linked at the end.

1. Specify the directory to convert (the one containing the Word documents) and the output directory.
2. Read all Word documents in that directory.
3. Convert each Word file to an HTML file and save it to the corresponding directory (department and person) under the output directory. The code is as follows:

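    // Requires: using Word = Microsoft.Office.Interop.Word;
    // The string literals were lost from the source. "Open", "SaveAs", "Close" and "Quit" follow the summary below;
    // the two boolean Open arguments and the filtered-HTML save format are assumptions.
    private void ToHtml(string FilePath, string TargetFilePath)
    {
        Word.Application word = new Word.Application();
        Type wordType = word.GetType();
        Word.Documents docs = word.Documents;
        Type docsType = docs.GetType();
        // Open the document: Open(FileName, ConfirmConversions, ReadOnly)
        Word.Document doc = (Word.Document)docsType.InvokeMember("Open", System.Reflection.BindingFlags.InvokeMethod, null, docs, new Object[] { (object)FilePath, true, true });
        Type docType = doc.GetType();
        string strSaveFileName = TargetFilePath;
        object saveFileName = (object)strSaveFileName;
        // Save as HTML, then close the document and quit Word
        docType.InvokeMember("SaveAs", System.Reflection.BindingFlags.InvokeMethod, null, doc, new object[] { saveFileName, Word.WdSaveFormat.wdFormatFilteredHTML });
        docType.InvokeMember("Close", System.Reflection.BindingFlags.InvokeMethod, null, doc, null);
        wordType.InvokeMember("Quit", System.Reflection.BindingFlags.InvokeMethod, null, word, null);
    }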
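Step 2, reading all Word documents in the directory, might look like the sketch below. The names `WordFileFinder` and `FindWordDocuments` are hypothetical, and treating `.doc` and `.docx` as the only Word extensions is an assumption. The extension is compared explicitly rather than through a `*.doc` search pattern, and the `~$` lock files that Word creates next to an open document are skipped so they are never handed to the converter.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class WordFileFinder
{
    // Returns every .doc/.docx file under sourceRoot, including subdirectories.
    public static List<string> FindWordDocuments(string sourceRoot)
    {
        var result = new List<string>();
        foreach (string path in Directory.EnumerateFiles(sourceRoot, "*", SearchOption.AllDirectories))
        {
            string name = Path.GetFileName(path);
            string ext = Path.GetExtension(path);
            bool isWord = ext.Equals(".doc", StringComparison.OrdinalIgnoreCase)
                       || ext.Equals(".docx", StringComparison.OrdinalIgnoreCase);

            // "~$report.doc" is a temporary lock file, not a real document.
            if (isWord && !name.StartsWith("~$", StringComparison.Ordinal))
                result.Add(path);
        }
        return result;
    }
}
```

Each returned path can then be passed to the conversion routine together with its target path in the output directory.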

After converting a Word document to HTML, check the style of the resulting page.


4. Generate a homepage and save it to the output directory. The jQuery treeview plug-in is used to present the homepage as a clear tree structure.

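    // Requires: using System; using System.IO;
    // Reconstructed from a damaged listing. The method name, the "*.html" filter and the <ul>/<li> markup are assumptions;
    // _sb (a StringBuilder) and DirectoryPath (the output root) are fields of the enclosing class.
    private void AppendDirectory(DirectoryInfo directory)
    {
        _sb.AppendLine("<ul>");
        foreach (FileInfo fi in directory.GetFiles("*.html"))
        {
            // Link relative to the output root, with forward slashes, escaped for use in a URL
            string relativelinks = Uri.EscapeUriString(fi.FullName.Replace(string.Format("{0}\\", DirectoryPath), "").Replace("\\", "/"));
            _sb.AppendLine(string.Format("<li><a href=\"{0}\">{1}</a></li>", relativelinks, Path.GetFileNameWithoutExtension(fi.Name)));
        }
        foreach (DirectoryInfo dirInfo in directory.GetDirectories())
        {
            // One nested list per department or person directory
            _sb.AppendLine(string.Format("<li>{0}", dirInfo.Name));
            AppendDirectory(dirInfo);
            _sb.AppendLine("</li>");
        }
        _sb.AppendLine("</ul>");
    }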
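To turn the generated list into a page, the markup has to be wrapped in an HTML document that loads jQuery and the treeview plug-in. The sketch below shows one way to do that. `HomePage`, `Wrap`, the `tree` element id and the three file paths are assumptions for illustration; the script and stylesheet references must match wherever the plug-in files are actually copied.

```csharp
using System.Text;

static class HomePage
{
    // Wraps the generated tree markup (a nested <ul> list) in a full page.
    public static string Wrap(string treeMarkup)
    {
        var sb = new StringBuilder();
        sb.AppendLine("<!DOCTYPE html>");
        sb.AppendLine("<html>");
        sb.AppendLine("<head>");
        sb.AppendLine("<meta charset=\"utf-8\">");
        sb.AppendLine("<title>Documents</title>");
        // Assumed locations of the plug-in files, relative to the homepage.
        sb.AppendLine("<link rel=\"stylesheet\" href=\"jquery.treeview.css\">");
        sb.AppendLine("<script src=\"jquery.js\"></script>");
        sb.AppendLine("<script src=\"jquery.treeview.js\"></script>");
        // Turn the outermost list into a collapsible tree once the page loads.
        sb.AppendLine("<script>$(function () { $(\"#tree > ul\").treeview(); });</script>");
        sb.AppendLine("</head>");
        sb.AppendLine("<body>");
        sb.AppendLine("<div id=\"tree\">");
        sb.AppendLine(treeMarkup);
        sb.AppendLine("</div>");
        sb.AppendLine("</body>");
        sb.AppendLine("</html>");
        return sb.ToString();
    }
}
```

The result can be written to the output directory with `File.WriteAllText`, passing `Encoding.UTF8` so that non-ASCII document names display correctly.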

The generated homepage was tested with my own documents rather than company documents.




Summary:
The most important part of this tool is the conversion from Word to HTML. I use COM to call Word's Open, SaveAs, Close, and Quit methods. This approach leaves entries in Word's list of recently opened documents.
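One thing that may be worth trying for the recent-documents problem: both `Documents.Open` and `Document.SaveAs` in the Word object model take an `AddToRecentFiles` argument. The sketch below passes `false` for it, using late binding through `dynamic` so that no interop assembly is referenced. It is an untested assumption that this is enough to keep the list clean on every Office version; the class and method names are hypothetical, and the code runs only on Windows with Word installed.

```csharp
using System;

static class WordHtmlExport
{
    const int WdFormatFilteredHtml = 10; // value of WdSaveFormat.wdFormatFilteredHTML
    const int WdDoNotSaveChanges = 0;    // value of WdSaveOptions.wdDoNotSaveChanges

    public static void Convert(string sourcePath, string targetPath)
    {
        Type wordType = Type.GetTypeFromProgID("Word.Application");
        if (wordType == null)
            throw new InvalidOperationException("Word is not installed.");

        dynamic word = Activator.CreateInstance(wordType);
        try
        {
            // Open(FileName, ConfirmConversions, ReadOnly, AddToRecentFiles)
            dynamic doc = word.Documents.Open(sourcePath, false, true, false);
            try
            {
                // SaveAs(FileName, FileFormat, LockComments, Password, AddToRecentFiles)
                doc.SaveAs(targetPath, WdFormatFilteredHtml, false, "", false);
            }
            finally
            {
                doc.Close(WdDoNotSaveChanges);
            }
        }
        finally
        {
            // Always quit, otherwise a hidden WINWORD.EXE process stays behind.
            word.Quit();
        }
    }
}
```

Starting and quitting Word once per document is slow. For a large batch, a single Word instance could be reused across all files.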


In addition, both the generation of the HTML and its style are controlled entirely by Word. Some documents do not look good after conversion. I tested with different Office versions installed: Office 2003 (which cannot convert documents from newer versions), Office 2007, and Office 2010. The newer the version, the better the HTML style after conversion.

I also searched online (Chinese and foreign sites) for ways to control the Word-to-HTML conversion, and it seems impossible to control it completely. There are of course other document conversion tools, such as Google Docs. What do you think?

 

Source code download: convertor.rar

