Three ways to save a Web page as a Word file in PHP

Source: Internet
Author: User

I recently ran into questions about generating Word documents from PHP. This article summarizes three ways to do it, for anyone who needs a reference.

I. Two basic approaches to generating Word documents in PHP

1. Use the COM component available on Windows.
2. Use PHP to write content directly into a .doc file.

Both are described below.

II. Using the Windows COM component

Principle: COM is available to PHP as an extension class. On a server that has Office installed, PHP can instantiate the Word.Application COM object and drive Word to generate the document. PHP official manual: http://www.php.net/manual/en/class.com.php

The example from the official manual is as follows:

    <?php
    // starting word
    $word = new COM("word.application") or die("Unable to instantiate Word");
    echo "Loaded Word, version {$word->Version}\n";

    // bring it to front
    $word->Visible = 1;

    // open an empty document
    $word->Documents->Add();

    // do some weird stuff
    $word->Selection->TypeText("This is a test...");
    $word->Documents[1]->SaveAs("Useless test.doc");

    // closing word
    $word->Quit();

    // free the object
    $word = null;
    ?>

Personal recommendation: after instantiating the COM object you have to look up every method in the official documentation to learn what it does, and the editor gives no code hints, which is inconvenient. This approach is also not very efficient, so I do not recommend it.

III. Using PHP to write content into a .doc file

This approach comes in two variants:

1. Generate MHT format (very similar to HTML) and write it to the Word file.
2. Write plain HTML to the Word file.

1. Generating MHT format and writing it to Word

The code is as follows:

    /**
     * Builds Word document content from HTML code.
     * The result is essentially an MHT document. The function parses the
     * content and downloads the image resources of the page from the remote host.
     * It depends on the class MhtFileMaker.
     * The function parses img tags to extract the src attribute value. The src
     * value must be enclosed in quotes, otherwise it cannot be extracted.
     *
     * @param string $content      HTML content
     * @param string $absolutePath Absolute path of the web page. If the image
     *                             paths in the HTML are relative, fill in this
     *                             parameter so the function can complete them.
     *                             It must end with "/".
     * @param bool   $isEraseLink  Whether to remove links from the HTML content
     */
    function getWordDocument($content, $absolutePath = "", $isEraseLink = true)
    {
        $mht = new MhtFileMaker();
        if ($isEraseLink)
            $content = preg_replace('/<a\b[^>]*>(.*?)<\/a>/is', '$1', $content);   // remove links, keep the link text

        $images = array();
        $files = array();
        $matches = array();
        // The src value must be enclosed in quotes; [^>]*? skips any attributes before src
        if (preg_match_all('/<img\b[^>]*?\bsrc\s*=\s*["\']([^"\']*)["\'][^>]*>/i', $content, $matches))
        {
            $arrPath = $matches[1];
            for ($i = 0; $i < count($arrPath); $i++)
            {
                $path = $arrPath[$i];
                $imgPath = trim($path);
                if ($imgPath != "")
                {
                    $files[] = $imgPath;
                    if (preg_match('#^https?://#i', $imgPath))
                    {
                        // absolute link, no prefix needed
                    }
                    else
                    {
                        $imgPath = $absolutePath . $imgPath;
                    }
                    $images[] = $imgPath;
                }
            }
        }
        $mht->AddContents("tmp.html", $mht->GetMimeType("tmp.html"), $content);

        for ($i = 0; $i < count($images); $i++)
        {
            $image = $images[$i];
            // file_get_contents() returns false when the image cannot be read
            $imgcontent = @file_get_contents($image);
            if ($imgcontent !== false)
            {
                $mht->AddContents($files[$i], $mht->GetMimeType($image), $imgcontent);
            }
            else
            {
                echo "file:" . $image . " not exist!<br />";
            }
        }

        return $mht->GetFile();
    }

The main job of this function is to parse all image addresses in the HTML code and download them one by one.

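The image-collection step of getWordDocument can be tried on its own. The following is only a minimal sketch of that step, not the article's code: the helper name collectImageUrls, the sample HTML, and the example host are hypothetical, and it assumes, like the function above, that every src value is quoted.

```php
<?php
/**
 * Hypothetical helper (not part of the article's code): collects the src
 * values of quoted <img> tags and prefixes relative ones with $absolutePath.
 */
function collectImageUrls(string $html, string $absolutePath = ""): array
{
    $urls = array();
    // Matches src="..." or src='...' inside an <img ...> tag.
    if (preg_match_all('/<img\b[^>]*?\bsrc\s*=\s*["\']([^"\']*)["\']/i', $html, $matches)) {
        foreach ($matches[1] as $src) {
            $src = trim($src);
            if ($src === "") {
                continue; // skip empty src attributes
            }
            // Absolute http(s) links are kept; anything else is treated as relative.
            if (preg_match('#^https?://#i', $src)) {
                $urls[] = $src;
            } else {
                $urls[] = $absolutePath . $src;
            }
        }
    }
    return $urls;
}

// Sample input (made up for illustration).
$sample = '<p><img src="logo.gif" /><img alt="x" src=\'http://example.com/a.png\'></p>';
print_r(collectImageUrls($sample, "http://www.yoursite.com/Music/etc/"));
```

Because relative paths are completed by plain string concatenation, the base path has to end with "/", which is the same requirement the $absolutePath parameter documents.
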
After the image content has been fetched, the function calls the MhtFileMaker class to add the image to the MHT file. The details of how content is added are encapsulated in the MhtFileMaker class.

Usage 1: remote call

The code is as follows:

    $url = "http://www.***.com";
    $content = file_get_contents($url);
    $fileContent = getWordDocument($content, "http://www.yoursite.com/Music/etc/");
    $fp = fopen("test.doc", 'w');
    fwrite($fp, $fileContent);
    fclose($fp);

Here the $content variable should hold the HTML source code, and the URL passed after it should be the address used to complete the relative image paths in that HTML.

Usage 2: local generation

The code is as follows:

    header("Cache-Control: no-cache, must-revalidate");
    header("Pragma: no-cache");
    $wordStr = 'PHP tutorial website--jb51.net';
    $fileContent = getWordDocument($wordStr);
    // $intro is not defined in this snippet; it must be set elsewhere
    $fileName = iconv("utf-8", "GBK", 'php tutorials' . '_' . $intro . '_' . rand(100, 999));
    header("Content-Type: application/doc");
    header("Content-Disposition: attachment; filename=" . $fileName . ".doc");
    echo $fileContent;

Note that before using this function you must include the class MhtFileMaker, which generates the MHT document.

The code is as follows:

    <?php
    /***********************************************************************
    Class:        Mht File Maker
    Version:      1.2 beta
    Date:         02/11/2007
    Author:       Wudi <wudicgi@yahoo.de>
    Description:  The class can make .mht file.
    ***********************************************************************/

    class MhtFileMaker{
        var $config = array();
        var $headers = array();
        var $headers_exists = array();
        var $files = array();
        var $boundary;
        var $dir_base;
        var $page_first;

        function MhtFile($config = array()){

        }

        // Stores a raw header line and records its lower-cased name
        function SetHeader($header){
            $this->headers[] = $header;
            $key = strtolower(substr($header, 0, strpos($header, ':')));
            $this->headers_exists[$key] = TRUE;
        }

        function SetFrom($from){
            $this->SetHeader("From: $from");
        }

        function SetSubject($subject){
            $this->SetHeader("Subject: $subject");
        }

        function SetDate($date = NULL, $istimestamp = FALSE){
            if ($date == NULL){
                $date = time();
            }
            if ($istimestamp == TRUE){
                $date = date('D, d M Y H:i:s O', $date);
            }
            $this->SetHeader("Date: $date");
        }

        function SetBoundary($boundary = NULL){
            if ($boundary == NULL){
                $this->boundary = '--' . strtoupper(md5(mt_rand())) . '_MULTIPART_MIXED';
            } else {
                $this->boundary = $boundary;
            }
        }

        function SetBaseDir($dir){
            $this->dir_base = str_replace("\\", "/", realpath($dir));
        }

        function SetFirstPage($filename){
            $this->page_first = str_replace("\\", "/", realpath("{$this->dir_base}/$filename"));
        }

        function AutoAddFiles(){
            if (!isset($this->page_first)){
                exit ('Not set the first page.');
            }
            $filepath = str_replace($this->dir_base, '', $this->page_first);
            $filepath = 'http://mhtfile' . $filepath;
            $this->AddFile($this->page_first, $filepath, NULL);
            $this->AddDir($this->dir_base);
        }

        // Recursively adds every file under $dir except the first page
        function AddDir($dir){
            $handle_dir = opendir($dir);
            while (($filename = readdir($handle_dir)) !== false){
                if (($filename != '.') && ($filename != '..') && ("$dir/$filename" != $this->page_first)){
                    if (is_dir("$dir/$filename")){
                        $this->AddDir("$dir/$filename");
                    } elseif (is_file("$dir/$filename")){
                        $filepath = str_replace($this->dir_base, '', "$dir/$filename");
                        $filepath = 'http://mhtfile' . $filepath;
                        $this->AddFile("$dir/$filename", $filepath, NULL);
                    }
                }
            }
            closedir($handle_dir);
        }

        function AddFile($filename, $filepath = NULL, $encoding = NULL){
            if ($filepath == NULL){
                $filepath = $filename;
            }
            $mimetype = $this->GetMimeType($filename);
            $filecont = file_get_contents($filename);
            $this->AddContents($filepath, $mimetype, $filecont, $encoding);
        }

        // Content without an explicit encoding is base64-encoded in 76-character lines
        function AddContents($filepath, $mimetype, $filecont, $encoding = NULL){
            if ($encoding == NULL){
                $filecont = chunk_split(base64_encode($filecont), 76);
                $encoding = 'base64';
            }
            $this->files[] = array('filepath' => $filepath,
                                   'mimetype' => $mimetype,
                                   'filecont' => $filecont,
                                   'encoding' => $encoding);
        }

        function CheckHeaders(){
            // Header names are stored lower-cased by SetHeader()
            if (!array_key_exists('date', $this->headers_exists)){
                $this->SetDate(NULL, TRUE);
            }
            if ($this->boundary == NULL){
                $this->SetBoundary();
            }
        }

        function CheckFiles(){
            if (count($this->files) == 0){
                return FALSE;
            } else {
                return TRUE;
            }
        }

        // Assembles the headers and all added parts into one MIME multipart message
        function GetFile(){
            $this->CheckHeaders();
            if (!$this->CheckFiles()){
                exit ('No file was added.');
            }
            $contents = implode("\r\n", $this->headers);
            $contents .= "\r\n";
            $contents .= "MIME-Version: 1.0\r\n";
            $contents .= "Content-Type: multipart/related;\r\n";
            $contents .= "\tboundary=\"{$this->boundary}\";\r\n";
            $contents .= "\ttype=\"" . $this->files[0]['mimetype'] . "\"\r\n";
            $contents .= "X-MimeOLE: Produced By Mht File Maker v1.0 beta\r\n";
            $contents .= "\r\n";
            $contents .= "This is a multi-part message in MIME format.\r\n";
            $contents .= "\r\n";
            foreach ($this->files as $file){
                $contents .= "--{$this->boundary}\r\n";
                $contents .= "Content-Type: $file[mimetype]\r\n";
                $contents .= "Content-Transfer-Encoding: $file[encoding]\r\n";
                $contents .= "Content-Location: $file[filepath]\r\n";
                $contents .= "\r\n";
                $contents .= $file['filecont'];
                $contents .= "\r\n";
            }
            $contents .= "--{$this->boundary}--\r\n";
            return $contents;
        }

        function MakeFile($filename){
            $contents = $this->GetFile();
            $fp = fopen($filename, 'w');
            fwrite($fp, $contents);
            fclose($fp);
        }

        function GetMimeType($filename){
            $pathinfo = pathinfo($filename);
            // A path without an extension falls through to the default type
            $extension = isset($pathinfo['extension']) ? strtolower($pathinfo['extension']) : '';
            switch ($extension){
                case 'htm':  $mimetype = 'text/html'; break;
                case 'html': $mimetype = 'text/html'; break;
                case 'txt':  $mimetype = 'text/plain'; break;
                case 'cgi':  $mimetype = 'text/plain'; break;
                case 'php':  $mimetype = 'text/plain'; break;
                case 'css':  $mimetype = 'text/css'; break;
                case 'jpg':  $mimetype = 'image/jpeg'; break;
                case 'jpeg': $mimetype = 'image/jpeg'; break;
                case 'jpe':  $mimetype = 'image/jpeg'; break;
                case 'gif':  $mimetype = 'image/gif'; break;
                case 'png':  $mimetype = 'image/png'; break;
                default:     $mimetype = 'application/octet-stream'; break;
            }
            return $mimetype;
        }
    }
    ?>

Comment: the drawback of this method is that it does not support batch downloads, because one page can send only one set of headers (whether the content is fetched remotely or built locally, a page that declares download headers can output only one file). Even if you generate documents in a loop, only one Word file comes out. You can of course change the approach to work around this.

2. Writing plain HTML to Word

Principle: use ob_start to buffer the HTML page first (this avoids the multiple-header problem, so files can be generated in batches), then write the buffered content to a .doc file.

The code is as follows:

    <?php
    class word
    {
        // Starts buffering and opens the Word-flavored <html> element
        function start()
        {
            ob_start();
            echo '<html xmlns:o="urn:schemas-microsoft-com:office:office"
            xmlns:w="urn:schemas-microsoft-com:office:word"
            xmlns="http://www.w3.org/TR/REC-html40">';
        }

        // Closes the document, takes the buffered output and writes it to $path
        function save($path)
        {
            echo "</html>";
            $data = ob_get_contents();
            ob_end_clean();

            $this->wirtefile($path, $data);
        }

        function wirtefile($fn, $data)
        {
            $fp = fopen($fn, "wb");
            fwrite($fp, $data);
            fclose($fp);
        }
    }

The calling code is as follows:

    $html = '
    <table width=600 cellpadding="6" cellspacing="1" bgcolor="#336699">
    <tr bgcolor="White">
      <td>PHP10086</td>
      <td><a href="http://www.jb51.net" target="_blank">http://www.jb51.net</a></td>
    </tr>
    <tr bgcolor="red">
      <td>PHP10086</td>
      <td><a href="http://www.jb51.net" target="_blank">http://www.jb51.net</a></td>
    </tr>
    <tr bgcolor="White">
      <td colspan=2>
      PHP10086<br>
      Most reliable PHP technology sharing site
      <img src="http://www.jb51.net/wp-content/themes/WPortal-Blue/images/logo.gif">
      </td>
    </tr>
    </table>
    ';

    // batch generation
    for ($i = 1; $i <= 3; $i++) {
        $word = new word();
        $word->start();
        //$html = "aaa" . $i;
        $wordname = 'PHP Tutorial website--jb51.net' . $i . ".doc";
        echo $html;
        $word->save($wordname);
        // flush any outer buffer before the next iteration
        if (ob_get_level() > 0) {
            ob_flush();
        }
        flush();
    }

Personal comment: this method works best, for three reasons:

First, the code is relatively concise and easy to understand.
Second, it supports batch generation of Word files (this is very important).
Third, it supports complete HTML code.
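
The buffering idea behind the plain-HTML method can also be written as a single function. This is a hedged sketch rather than the article's class: the helper name captureAsWordHtml, the file names, and the use of the system temp directory are assumptions made for illustration.

```php
<?php
/**
 * Hypothetical helper (not from the article): captures whatever $render
 * echoes and returns it wrapped in the Word-flavored <html> element.
 */
function captureAsWordHtml(callable $render): string
{
    ob_start();                       // start a new output buffer
    echo '<html xmlns:o="urn:schemas-microsoft-com:office:office" '
       . 'xmlns:w="urn:schemas-microsoft-com:office:word" '
       . 'xmlns="http://www.w3.org/TR/REC-html40">';
    $render();                        // anything echoed here lands in the buffer
    echo '</html>';
    return ob_get_clean();            // return the buffer contents and discard the buffer
}

// Batch generation: one .doc per loop iteration, no HTTP headers involved.
$dir = sys_get_temp_dir();            // assumption: a writable temp directory
for ($i = 1; $i <= 3; $i++) {
    $doc = captureAsWordHtml(function () use ($i) {
        echo "<body><p>Report number $i</p></body>";
    });
    file_put_contents("$dir/word_sketch_$i.doc", $doc);
}
```

Since nothing is sent to the browser, each iteration is independent, which is why this variant can produce several files in one request while the header-based download cannot.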
