Basic idea:
Upload the Word file to the server, read its contents and save them as HTML, then load that HTML content.
1: Using the Microsoft.Office.Interop.Word component
This is the more common approach. Plenty of complete examples are available on the web, so the full code is not reproduced here.
Disadvantage: Word must be installed on the server, and permissions for the DCOM object must be configured on the server. That is manageable for a single server, but becomes cumbersome when the project is deployed to several different servers.
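For orientation, a minimal sketch of the Interop approach might look like the following. It assumes Word 2010 or later is installed on the server (for SaveAs2), a reference to Microsoft.Office.Interop.Word, and C# 4 or later so that the optional COM parameters can be omitted. The class name InteropWordToHtml and the parameter names docPath and htmlPath are hypothetical.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

static class InteropWordToHtml
{
    // Opens a Word document through COM automation and saves it as filtered HTML.
    // Assumption: Word 2010+ is installed and the process identity has DCOM launch rights.
    public static void Convert(string docPath, string htmlPath)
    {
        Word.Application app = new Word.Application();
        Word.Document doc = null;
        try
        {
            app.Visible = false;
            doc = app.Documents.Open(FileName: docPath, ReadOnly: true);
            doc.SaveAs2(FileName: htmlPath,
                        FileFormat: Word.WdSaveFormat.wdFormatFilteredHTML);
        }
        finally
        {
            // Always close the document and quit Word, otherwise WINWORD.EXE
            // processes can be left running on the server.
            if (doc != null)
            {
                doc.Close(SaveChanges: Word.WdSaveOptions.wdDoNotSaveChanges);
            }
            app.Quit();
        }
    }
}
```

The try/finally is the important part of the design: each call starts a real Word process, so cleanup has to happen even when the conversion throws.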
2: Open XML API
This can read .docx files as XML (the Word 97-2003 .doc format is not supported). Once you have the XML, converting it to HTML or other formats is no longer a problem. This API requires .NET Framework 3.5+ and Office 2007+.
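As a rough illustration of this route, the sketch below uses the Open XML SDK (the DocumentFormat.OpenXml package) to read the top-level paragraphs of a .docx file and wrap their text in <p> tags. It is deliberately naive: formatting, tables, and images are ignored, and a real converter would have to map those elements too. The class name OpenXmlToHtml is hypothetical, and WebUtility.HtmlEncode assumes .NET Framework 4.0 or later.

```csharp
using System.Net;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class OpenXmlToHtml
{
    // Returns a very simple HTML page containing the plain text of each
    // top-level paragraph in the document body.
    public static string ConvertParagraphs(string docxPath)
    {
        var html = new StringBuilder("<html><body>\n");

        // Open read-only; no Word installation is needed for this.
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, false))
        {
            Body body = doc.MainDocumentPart.Document.Body;
            foreach (Paragraph p in body.Elements<Paragraph>())
            {
                // InnerText concatenates the text of all runs in the paragraph.
                html.Append("<p>")
                    .Append(WebUtility.HtmlEncode(p.InnerText))
                    .Append("</p>\n");
            }
        }

        html.Append("</body></html>");
        return html.ToString();
    }
}
```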
3: Third party: e.g. Aspose.Words (tested, recommended)
Aspose offers conversion solutions for a variety of formats, for both .NET and Java; if you are interested, take a closer look at their site. Among them, the Aspose.Words DLL can convert Word documents without Microsoft Office being installed (Converting DOC, DOCX to HTML without MS Office Word in .NET).
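The code is as follows:
// wordPhysicalPath is the physical path of the uploaded Word file on the server
Aspose.Words.Document d = new Aspose.Words.Document(wordPhysicalPath);
// Save the document as HTML
d.Save("d:\\1.html", Aspose.Words.SaveFormat.Html);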
This saves the document as HTML. (Note that the pictures in the Word document are stored in the same directory as the HTML file, so the image paths in the HTML may need to be replaced.)
Advantages: Microsoft Office does not need to be installed; a single DLL of about 2 MB is enough to do the job.
Disadvantage: Aspose is not an open source component. Cracked versions circulate domestically, and it is also possible to decompile it and make your own changes, but copyright is a factor that really has to be considered.
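On the earlier note about pictures: if the image references in the generated HTML need to point at a location the web server can actually serve, one option is to rewrite the relative src attributes after conversion. The sketch below uses only the .NET Regex class and does not touch Aspose itself. The class name HtmlImagePaths, the base URL "/uploads/doc1", and the assumption that images are referenced by relative file name are all illustrative.

```csharp
using System.IO;
using System.Text.RegularExpressions;

static class HtmlImagePaths
{
    // Prefixes relative <img src="..."> values with baseUrl.
    // Absolute URLs (scheme:...), data URIs and root-relative paths are left alone.
    public static string Rebase(string html, string baseUrl)
    {
        string prefix = baseUrl.TrimEnd('/') + "/";
        return Regex.Replace(
            html,
            "(<img\\b[^>]*?\\bsrc\\s*=\\s*[\"'])(?![a-z][a-z0-9+.-]*:|/)([^\"']+)",
            m => m.Groups[1].Value + prefix + m.Groups[2].Value,
            RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        // Hypothetical paths: the HTML produced by the conversion step,
        // and the URL under which its sibling image files are published.
        string path = "d:\\1.html";
        string html = File.ReadAllText(path);
        File.WriteAllText(path, Rebase(html, "/uploads/doc1"));
    }
}
```

A regular expression is adequate for the regular markup a converter emits; for arbitrary HTML a real parser would be the safer choice.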
There are a number of other third-party products, most of them paid, which are not listed here.