Three ways to save a web page as a Word file in PHP
I. Two approaches to generating Word files in PHP
1. Use COM components on Windows
2. Use PHP to write content into a .doc file
The specific implementations are as follows.
II. Use COM components on Windows
Principle: COM is a PHP extension class. On a server with Office installed, instantiating the word.application COM object starts Word, which can then generate documents automatically. See the official PHP manual: http://www.php.net/manual/en/class.com.php
The official example:
The code is as follows:
<?php
// Start Word
$word = new COM("word.application") or die("Unable to instantiate Word");
echo "Loaded Word, version {$word->Version}\n";
// Bring it to the front
$word->Visible = 1;
// Open an empty document
$word->Documents->Add();
// Do some weird stuff
$word->Selection->TypeText("This is a test...");
$word->Documents[1]->SaveAs("Useless test.doc");
// Close Word
$word->Quit();
// Free the object
$word = null;
?>
Personal suggestion: to know what the methods on the COM instance do, you have to look them up in the official documentation, and the editor offers no code completion, which is inconvenient. This approach is also not very efficient, so it is not recommended.
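Because the COM class is only present on Windows builds of PHP with the com_dotnet extension loaded, it can help to check for it before trying to start Word. The following is a minimal sketch; the function name canUseWordCom is made up for illustration, and the check says nothing about whether Word itself is installed.

```php
<?php
// Hypothetical guard, not part of the official example.
function canUseWordCom(): bool
{
    // The COM class exists only on Windows with the com_dotnet extension loaded.
    return PHP_OS_FAMILY === 'Windows' && class_exists('COM');
}

if (!canUseWordCom()) {
    echo "COM is not available; use one of the file-based approaches instead.\n";
}
```

Even when the check passes, creating the word.application object can still fail if Office is missing, so the "or die" guard in the official example remains useful.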
III. Use PHP to write content into a .doc file
This approach can be implemented in two ways.
1. Generate MHT format (similar to HTML) and write it to a Word file
2. Write the Word file as pure HTML
1) Generate MHT format (similar to HTML) and write it to a Word file
The code is as follows:
/**
 * Get the Word document content based on HTML code.
 * Creates a document that is essentially MHT. The function parses the content and downloads the image resources referenced by the page.
 * This function depends on the MhtFileMaker class.
 * This function parses img tags and extracts the src attribute value. The src value must be enclosed in quotation marks; otherwise it cannot be extracted.
 *
 * @param string $content      HTML content
 * @param string $absolutePath Absolute path of the web page. If image paths in the HTML content are relative, fill in this parameter so the function can complete them into absolute paths. It must end with a slash (/).
 * @param bool   $isEraseLink  Whether to remove links from the HTML content
 */
function getWordDocument($content, $absolutePath = "", $isEraseLink = true)
{
    $mht = new MhtFileMaker();
    if ($isEraseLink) {
        $content = preg_replace('/<a\b[^>]*>(\s*.*?\s*)<\/a>/i', '$1', $content); // remove the links
    }
    $images = array();
    $files = array();
    $matches = array();
    // This algorithm requires the src attribute value to be enclosed in quotation marks.
    if (preg_match_all('/<img[^>]*?\ssrc\s*=\s*["\'](.*?)["\'][^>]*>/i', $content, $matches)) {
        $arrPath = $matches[1];
        for ($i = 0; $i < count($arrPath); $i++) {
            $path = $arrPath[$i];
            $imgPath = trim($path);
            if ($imgPath != "") {
                $files[] = $imgPath;
                if (substr($imgPath, 0, 7) == 'http://') {
                    // Absolute link: no prefix needed
                } else {
                    $imgPath = $absolutePath . $imgPath;
                }
                $images[] = $imgPath;
            }
        }
    }
    $mht->AddContents("tmp.html", $mht->GetMimeType("tmp.html"), $content);
    for ($i = 0; $i < count($images); $i++) {
        $image = $images[$i];
        if (@fopen($image, 'r')) {
            $imgcontent = @file_get_contents($image);
            if ($imgcontent) {
                $mht->AddContents($files[$i], $mht->GetMimeType($image), $imgcontent);
            }
        } else {
            echo "file: " . $image . " does not exist!<br />";
        }
    }
    return $mht->GetFile();
}
The main job of this function is to find every image address in the HTML code and download the images one by one. Once it has the image content, it calls the MhtFileMaker class to add each image to the MHT file. The details of adding a file are encapsulated in the MhtFileMaker class.
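The image-address step can be looked at in isolation. The sketch below is not the article's code: resolveImageUrls is a hypothetical helper that collects quoted src values from img tags and prefixes relative ones with a base URL. Unlike the function above, it assumes https:// addresses should also count as absolute.

```php
<?php
// Hypothetical helper for illustration only.
function resolveImageUrls(string $html, string $baseUrl): array
{
    $urls = [];
    // Only quoted src values are matched, the same restriction the article states.
    if (preg_match_all('/<img\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1/i', $html, $m)) {
        foreach ($m[2] as $src) {
            $src = trim($src);
            if ($src === '') {
                continue;
            }
            // Absolute http/https URLs stay unchanged; anything else gets the base prefix.
            $urls[] = preg_match('#^https?://#i', $src) ? $src : $baseUrl . $src;
        }
    }
    return $urls;
}
```

As with $absolutePath in getWordDocument, the base URL is expected to end with a slash, because the two strings are simply concatenated.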
Method 1: Remote Call
The code is as follows:
$url = "http://www.***.com";
$content = file_get_contents($url);
$fileContent = getWordDocument($content, "http://www.yoursite.com/Music/etc/");
$fp = fopen("test.doc", 'w');
fwrite($fp, $fileContent);
fclose($fp);
Here, the $content variable should hold the HTML source code, and the URL passed as the second argument should be the address used to complete relative image paths in that HTML.
Method 2: Generate locally
The code is as follows:
header("Cache-Control: no-cache, must-revalidate");
header("Pragma: no-cache");
$wordStr = 'php tutorial website --jb51.net';
$fileContent = getWordDocument($wordStr);
$fileName = iconv("UTF-8", "GBK", 'php tutorial' . '_' . $intro . '_' . rand(100, 999));
header("Content-Type: application/doc");
header("Content-Disposition: attachment; filename=" . $fileName . ".doc");
echo $fileContent;
Note: Before using this function, you must first include the MhtFileMaker class, which can help us generate Mht documents.
The code is as follows:
<?php
/***********************************************************************
Class:        Mht File Maker
Version:      1.2 beta
Date:         02/11/2007
Author:       Wudi
Description:  The class can make .mht file.
***********************************************************************/
class MhtFileMaker {
    var $config = array();
    var $headers = array();
    var $headers_exists = array();
    var $files = array();
    var $boundary;
    var $dir_base;
    var $page_first;

    function MhtFile($config = array()) {
    }

    function SetHeader($header) {
        $this->headers[] = $header;
        $key = strtolower(substr($header, 0, strpos($header, ':')));
        $this->headers_exists[$key] = TRUE;
    }

    function SetFrom($from) {
        $this->SetHeader("From: $from");
    }

    function SetSubject($subject) {
        $this->SetHeader("Subject: $subject");
    }

    function SetDate($date = NULL, $istimestamp = FALSE) {
        if ($date == NULL) {
            $date = time();
        }
        if ($istimestamp == TRUE) {
            $date = date('D, d M Y H:i:s O', $date);
        }
        $this->SetHeader("Date: $date");
    }

    function SetBoundary($boundary = NULL) {
        if ($boundary == NULL) {
            $this->boundary = '--' . strtoupper(md5(mt_rand())) . '_MULTIPART_MIXED';
        } else {
            $this->boundary = $boundary;
        }
    }

    function SetBaseDir($dir) {
        $this->dir_base = str_replace("\\", "/", realpath($dir));
    }

    function SetFirstPage($filename) {
        $this->page_first = str_replace("\\", "/", realpath("{$this->dir_base}/$filename"));
    }

    function AutoAddFiles() {
        if (!isset($this->page_first)) {
            exit('Not set the first page.');
        }
        $filepath = str_replace($this->dir_base, '', $this->page_first);
        $filepath = 'http://mhtfile' . $filepath;
        $this->AddFile($this->page_first, $filepath, NULL);
        $this->AddDir($this->dir_base);
    }

    function AddDir($dir) {
        $handle_dir = opendir($dir);
        while ($filename = readdir($handle_dir)) {
            if (($filename != '.') && ($filename != '..') && ("$dir/$filename" != $this->page_first)) {
                if (is_dir("$dir/$filename")) {
                    $this->AddDir("$dir/$filename");
                } elseif (is_file("$dir/$filename")) {
                    $filepath = str_replace($this->dir_base, '', "$dir/$filename");
                    $filepath = 'http://mhtfile' . $filepath;
                    $this->AddFile("$dir/$filename", $filepath, NULL);
                }
            }
        }
        closedir($handle_dir);
    }

    function AddFile($filename, $filepath = NULL, $encoding = NULL) {
        if ($filepath == NULL) {
            $filepath = $filename;
        }
        $mimetype = $this->GetMimeType($filename);
        $filecont = file_get_contents($filename);
        $this->AddContents($filepath, $mimetype, $filecont, $encoding);
    }

    function AddContents($filepath, $mimetype, $filecont, $encoding = NULL) {
        if ($encoding == NULL) {
            $filecont = chunk_split(base64_encode($filecont), 76);
            $encoding = 'base64';
        }
        $this->files[] = array('filepath' => $filepath,
                               'mimetype' => $mimetype,
                               'filecont' => $filecont,
                               'encoding' => $encoding);
    }

    function CheckHeaders() {
        if (!array_key_exists('date', $this->headers_exists)) {
            $this->SetDate(NULL, TRUE);
        }
        if ($this->boundary == NULL) {
            $this->SetBoundary();
        }
    }

    function CheckFiles() {
        if (count($this->files) == 0) {
            return FALSE;
        } else {
            return TRUE;
        }
    }

    function GetFile() {
        $this->CheckHeaders();
        if (!$this->CheckFiles()) {
            exit('No file was added.');
        }
        $contents = implode("\r\n", $this->headers);
        $contents .= "\r\n";
        $contents .= "MIME-Version: 1.0\r\n";
        $contents .= "Content-Type: multipart/related;\r\n";
        $contents .= "\tboundary=\"{$this->boundary}\";\r\n";
        $contents .= "\ttype=\"" . $this->files[0]['mimetype'] . "\"\r\n";
        $contents .= "X-MimeOLE: Produced By Mht File Maker v1.0 beta\r\n";
        $contents .= "\r\n";
        $contents .= "This is a multi-part message in MIME format.\r\n";
        $contents .= "\r\n";
        foreach ($this->files as $file) {
            $contents .= "--{$this->boundary}\r\n";
            $contents .= "Content-Type: $file[mimetype]\r\n";
            $contents .= "Content-Transfer-Encoding: $file[encoding]\r\n";
            $contents .= "Content-Location: $file[filepath]\r\n";
            $contents .= "\r\n";
            $contents .= $file['filecont'];
            $contents .= "\r\n";
        }
        $contents .= "--{$this->boundary}--\r\n";
        return $contents;
    }

    function MakeFile($filename) {
        $contents = $this->GetFile();
        $fp = fopen($filename, 'w');
        fwrite($fp, $contents);
        fclose($fp);
    }

    function GetMimeType($filename) {
        $pathinfo = pathinfo($filename);
        switch ($pathinfo['extension']) {
            case 'htm':  $mimetype = 'text/html'; break;
            case 'html': $mimetype = 'text/html'; break;
            case 'txt':  $mimetype = 'text/plain'; break;
            case 'cgi':  $mimetype = 'text/plain'; break;
            case 'php':  $mimetype = 'text/plain'; break;
            case 'css':  $mimetype = 'text/css'; break;
            case 'jpg':  $mimetype = 'image/jpeg'; break;
            case 'jpeg': $mimetype = 'image/jpeg'; break;
            case 'jpe':  $mimetype = 'image/jpeg'; break;
            case 'gif':  $mimetype = 'image/gif'; break;
            case 'png':  $mimetype = 'image/png'; break;
            default:     $mimetype = 'application/octet-stream'; break;
        }
        return $mimetype;
    }
}
?>
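To make the output of GetFile() easier to picture, here is a stripped-down sketch of the same multipart/related layout. buildMinimalMht and its array keys are hypothetical names used only for this illustration; it assumes at least one part is passed, always uses base64, and omits the Date, From and Subject headers that the class can add.

```php
<?php
// Hypothetical helper for illustration; MhtFileMaker does the same job with more options.
function buildMinimalMht(array $parts, string $boundary): string
{
    // Top-level headers: the first part's MIME type names the root document.
    $out  = "MIME-Version: 1.0\r\n";
    $out .= "Content-Type: multipart/related;\r\n";
    $out .= "\tboundary=\"{$boundary}\";\r\n";
    $out .= "\ttype=\"{$parts[0]['mimetype']}\"\r\n\r\n";
    foreach ($parts as $part) {
        // Each part starts with "--boundary" and carries its own headers.
        $out .= "--{$boundary}\r\n";
        $out .= "Content-Type: {$part['mimetype']}\r\n";
        $out .= "Content-Transfer-Encoding: base64\r\n";
        $out .= "Content-Location: {$part['location']}\r\n\r\n";
        // The body is base64 text wrapped at 76 characters per line.
        $out .= chunk_split(base64_encode($part['content']), 76, "\r\n");
    }
    // The closing delimiter has two extra dashes at the end.
    $out .= "--{$boundary}--\r\n";
    return $out;
}
```

The Content-Location of each image part is what ties it to the src value in the HTML part, which is why getWordDocument registers images under the path exactly as it appeared in the page.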
Comment: The drawback of this method is that it does not support batch downloads, because a page can send only one set of headers (whether it is called remotely or generated locally, only one header declaration can be output). Even if you generate the result in a loop, only one Word file is produced (you can modify the method above to work around this).
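One possible modification is to write each document to disk instead of sending it as a download, since files on disk need no HTTP headers. The sketch below assumes this reading of the comment; saveDocuments is a hypothetical helper, and $render stands in for a function such as getWordDocument.

```php
<?php
// Hypothetical helper; $render stands in for a function such as getWordDocument().
function saveDocuments(array $pages, string $dir, callable $render): array
{
    $written = [];
    foreach ($pages as $name => $html) {
        $path = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name . '.doc';
        // Writing to disk needs no HTTP headers, so any number of files can be produced.
        if (file_put_contents($path, $render($html)) !== false) {
            $written[] = $path;
        }
    }
    return $written;
}
```

A call might look like saveDocuments(array('page1' => $html1, 'page2' => $html2), '/tmp', 'getWordDocument'), where the variable names and the target directory are placeholders.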
2) Write the Word file as pure HTML
Principle:
Use ob_start to buffer the HTML page first (this avoids the one-header-per-page problem, so files can be generated in batches), and then write the buffered content to a .doc file.
The code is as follows:
class word
{
    function start()
    {
        ob_start();
        echo '<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">';
    }
    function save($path)
    {
        echo "</html>";
        $data = ob_get_contents();
        ob_end_clean();
        $this->writefile($path, $data);
    }
    function writefile($fn, $data)
    {
        $fp = fopen($fn, "wb");
        fwrite($fp, $data);
        fclose($fp);
    }
}
The code is as follows:
$html = '
<table>
<tr><td>PHP10086</td><td><a href="http://www.jb51.net">http://www.jb51.net</a></td></tr>
<tr><td>PHP10086</td><td><a href="http://www.jb51.net">http://www.jb51.net</a></td></tr>
<tr><td colspan="2">PHP10086 The most reliable PHP technology sharing website</td></tr>
</table>
';
// Batch generate
for ($i = 1; $i <= 3; $i++) {
    $word = new word();
    $word->start();
    // $html = "aaa" . $i;
    $wordname = 'php tutorial website --jb51.net' . $i . ".doc";
    echo $html;
    $word->save($wordname);
    ob_flush(); // flush the output buffer after each document
    flush();
}
Personal comment: This method works best, for three reasons:
First, the code is concise and easy to understand.
Second, it supports batch generation of Word files (this is important).
Third, it supports complete HTML code.
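The buffering idea behind this method can be condensed into a single function. This is only a sketch under assumptions: captureToDoc is a hypothetical name, $render is any callable that echoes HTML, and file_put_contents is used in place of the fopen/fwrite/fclose sequence.

```php
<?php
// Hypothetical helper for illustration; $render is any callable that echoes HTML.
function captureToDoc(callable $render, string $path): bool
{
    ob_start(); // everything echoed from here on is buffered, not sent to the browser
    echo '<html xmlns:w="urn:schemas-microsoft-com:office:word" xmlns="http://www.w3.org/TR/REC-html40">';
    $render();
    echo '</html>';
    $data = ob_get_clean(); // fetch the buffered HTML and discard the buffer
    return file_put_contents($path, $data) !== false;
}
```

Because nothing reaches the client while the buffer is active, the function can be called in a loop to produce one file per iteration.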