Introduction to PHP code for exporting Web pages as Word documents

Source: Internet
Author: User
    /**
     * Build Word document content from HTML code.
     * The generated document is essentially an MHT file: the function parses the
     * HTML content and downloads the image resources that the page references.
     * This function depends on the class MhtFileMaker.
     * It parses IMG tags and extracts their SRC attribute values. A SRC value must
     * be enclosed in quotation marks, otherwise it cannot be extracted.
     *
     * @param string $content      HTML content
     * @param string $absolutePath Absolute URL of the web page. If image paths in the
     *                             HTML content are relative, supply this so the function
     *                             can complete them. It must end with "/".
     * @param bool   $isEraseLink  Whether to remove links from the HTML content
     * @return string The generated MHT content
     */
    function getWordDocument($content, $absolutePath = "", $isEraseLink = true)
    {
        $mht = new MhtFileMaker();
        if ($isEraseLink) {
            // Remove links but keep their inner content.
            // The tag part of the original pattern was lost; this one is a reconstruction.
            $content = preg_replace('/<a\b[^>]*>(.*?)<\/a>/is', '$1', $content);
        }
        $images = array(); // URLs the images are downloaded from
        $files = array();  // image paths exactly as they appear in the HTML
        $matches = array();
        // This algorithm requires src attribute values to be enclosed in quotation marks.
        // The original pattern was lost; this one is a reconstruction.
        if (preg_match_all('/<img\b[^>]*?\bsrc\s*=\s*["\']([^"\']*)["\']/i', $content, $matches)) {
            $arrPath = $matches[1];
            for ($i = 0; $i < count($arrPath); $i++) {
                $path = $arrPath[$i];
                $imgPath = trim($path);
                if ($imgPath != "") {
                    $files[] = $imgPath;
                    if (!preg_match('#^https?://#i', $imgPath)) {
                        // Relative path: prepend the base URL.
                        // Absolute links are used without a prefix.
                        $imgPath = $absolutePath . $imgPath;
                    }
                    $images[] = $imgPath;
                }
            }
        }
        $mht->addContents("tmp.html", $mht->getMimeType("tmp.html"), $content);
        for ($i = 0; $i < count($images); $i++) {
            $image = $images[$i];
            // Download the image; a failed download is reported and skipped.
            $imgContent = @file_get_contents($image);
            if ($imgContent !== false) {
                $mht->addContents($files[$i], $mht->getMimeType($image), $imgContent);
            } else {
                echo "File: " . $image . " does not exist!<br />\n";
            }
        }
        return $mht->getFile();
    }
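
The quoting requirement and the handling of relative paths can be illustrated in isolation. The sketch below is not part of the original code: the helper name extractImageSources and the sample HTML are made up for illustration, and the pattern is an assumed one that only recognises quoted src values.

```php
<?php
/**
 * Illustrative helper (hypothetical, not part of the original code):
 * list the image URLs that a quoted-src pattern can see in an HTML fragment.
 */
function extractImageSources(string $html, string $absolutePath = ''): array
{
    // Make sure a supplied base ends with exactly one slash.
    if ($absolutePath !== '') {
        $absolutePath = rtrim($absolutePath, '/') . '/';
    }

    $sources = [];
    // Assumed pattern: captures src values wrapped in single or double quotes.
    if (preg_match_all('/<img\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1/is', $html, $matches)) {
        foreach ($matches[2] as $src) {
            $src = trim($src);
            if ($src === '') {
                continue;
            }
            // Absolute URLs are kept as they are; relative paths get the base prepended.
            $sources[] = preg_match('#^https?://#i', $src) ? $src : $absolutePath . $src;
        }
    }
    return $sources;
}

// The third image has an unquoted src, so the pattern does not pick it up.
$html = '<p><img src="images/a.png"> <img src=\'http://example.com/b.jpg\' /> <img src=images/c.gif></p>';
print_r(extractImageSources($html, 'http://www.yoursite.com/Music/etc'));
```

Normalising the trailing slash inside the helper is a design choice of this sketch; the function above instead expects the caller to pass a base that already ends with "/".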

How to use:

    $fileContent = getWordDocument($content, "http://www.yoursite.com/Music/etc/");
    $fp = fopen("test.doc", 'w');
    fwrite($fp, $fileContent);
    fclose($fp);

Here, $content should hold the HTML source code, and the URL passed as the second argument is the base address used to complete relative image paths in that HTML. Note: before using this function, you need to include the class MhtFileMaker, which generates the MHT document.
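
If the document should be offered as a browser download rather than written to disk, one possible approach is sketched below. The helper name sendAsWordDownload and the default file name export.doc are hypothetical, and the sketch assumes the content has already been generated as a string.

```php
<?php
/**
 * Hypothetical helper: send already-generated MHT content to the browser
 * as a .doc download. Not part of the original code.
 */
function sendAsWordDownload(string $fileContent, string $downloadName = 'export.doc'): void
{
    // Keep only characters that are safe inside a quoted filename parameter.
    $safeName = preg_replace('/[^A-Za-z0-9._-]/', '_', $downloadName);

    // Headers can only be set before any output has been sent.
    if (!headers_sent()) {
        header('Content-Type: application/msword');
        header('Content-Disposition: attachment; filename="' . $safeName . '"');
        header('Content-Length: ' . strlen($fileContent));
    }
    echo $fileContent;
}

// Usage (assumes $fileContent holds the generated document):
// sendAsWordDownload($fileContent, 'test.doc');
```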

    <?php
    /***********************************************************************
    Class:       Mht File Maker
    Version:     1.2 beta
    Link:        http://bbs.it-home.org
    Author:      wudi
    Description: The class can make .mht files.
    ***********************************************************************/
    class MhtFileMaker
    {
        public $config = array();
        public $headers = array();
        public $headers_exists = array();
        public $files = array();
        public $boundary;
        public $dir_base;
        public $page_first;

        // $config is accepted but currently unused.
        function __construct($config = array())
        {
        }

        // Add a raw header line and remember its lower-cased name.
        function setHeader($header)
        {
            $this->headers[] = $header;
            $key = strtolower(substr($header, 0, strpos($header, ':')));
            $this->headers_exists[$key] = true;
        }

        function setFrom($from)
        {
            $this->setHeader("From: $from");
        }

        function setSubject($subject)
        {
            $this->setHeader("Subject: $subject");
        }

        // With $istimestamp set, $date is a Unix timestamp and is formatted as an RFC 2822 date.
        function setDate($date = null, $istimestamp = false)
        {
            if ($date == null) {
                $date = time();
            }
            if ($istimestamp == true) {
                $date = date('D, d M Y H:i:s O', $date);
            }
            $this->setHeader("Date: $date");
        }

        function setBoundary($boundary = null)
        {
            if ($boundary == null) {
                $this->boundary = '--' . strtoupper(md5(mt_rand())) . '_MULTIPART_MIXED';
            } else {
                $this->boundary = $boundary;
            }
        }

        function setBaseDir($dir)
        {
            $this->dir_base = str_replace("\\", "/", realpath($dir));
        }

        function setFirstPage($filename)
        {
            $this->page_first = str_replace("\\", "/", realpath("{$this->dir_base}/$filename"));
        }

        // Add the first page, then every other file under the base directory.
        function autoAddFiles()
        {
            if (!isset($this->page_first)) {
                exit('Not set the first page.');
            }
            $filepath = str_replace($this->dir_base, '', $this->page_first);
            $filepath = 'http://mhtfile' . $filepath;
            $this->addFile($this->page_first, $filepath, null);
            $this->addDir($this->dir_base);
        }

        // Recursively add all files in $dir, skipping the first page.
        function addDir($dir)
        {
            $handle_dir = opendir($dir);
            while (($filename = readdir($handle_dir)) !== false) {
                if (($filename != '.') && ($filename != '..') && ("$dir/$filename" != $this->page_first)) {
                    if (is_dir("$dir/$filename")) {
                        $this->addDir("$dir/$filename");
                    } elseif (is_file("$dir/$filename")) {
                        $filepath = str_replace($this->dir_base, '', "$dir/$filename");
                        $filepath = 'http://mhtfile' . $filepath;
                        $this->addFile("$dir/$filename", $filepath, null);
                    }
                }
            }
            closedir($handle_dir);
        }

        function addFile($filename, $filepath = null, $encoding = null)
        {
            if ($filepath == null) {
                $filepath = $filename;
            }
            $mimetype = $this->getMimeType($filename);
            $filecont = file_get_contents($filename);
            $this->addContents($filepath, $mimetype, $filecont, $encoding);
        }

        // Without an explicit encoding, the content is base64-encoded in 76-character lines.
        function addContents($filepath, $mimetype, $filecont, $encoding = null)
        {
            if ($encoding == null) {
                $filecont = chunk_split(base64_encode($filecont), 76);
                $encoding = 'base64';
            }
            $this->files[] = array('filepath' => $filepath,
                                   'mimetype' => $mimetype,
                                   'filecont' => $filecont,
                                   'encoding' => $encoding);
        }

        function checkHeaders()
        {
            // Header names are stored lower-cased by setHeader().
            if (!array_key_exists('date', $this->headers_exists)) {
                $this->setDate(null, true);
            }
            if ($this->boundary == null) {
                $this->setBoundary();
            }
        }

        function checkFiles()
        {
            return count($this->files) != 0;
        }

        // Assemble the complete multipart/related document and return it as a string.
        function getFile()
        {
            $this->checkHeaders();
            if (!$this->checkFiles()) {
                exit('No file was added.');
            }
            $contents = implode("\r\n", $this->headers);
            $contents .= "\r\n";
            $contents .= "MIME-Version: 1.0\r\n";
            $contents .= "Content-Type: multipart/related;\r\n";
            $contents .= "\tboundary=\"{$this->boundary}\";\r\n";
            $contents .= "\ttype=\"" . $this->files[0]['mimetype'] . "\"\r\n";
            $contents .= "X-MimeOLE: Produced By Mht File Maker v1.0 beta\r\n";
            $contents .= "\r\n";
            $contents .= "This is a multi-part message in MIME format.\r\n";
            $contents .= "\r\n";
            foreach ($this->files as $file) {
                $contents .= "--{$this->boundary}\r\n";
                $contents .= "Content-Type: {$file['mimetype']}\r\n";
                $contents .= "Content-Transfer-Encoding: {$file['encoding']}\r\n";
                $contents .= "Content-Location: {$file['filepath']}\r\n";
                $contents .= "\r\n";
                $contents .= $file['filecont'];
                $contents .= "\r\n";
            }
            $contents .= "--{$this->boundary}--\r\n";
            return $contents;
        }

        function makeFile($filename)
        {
            $contents = $this->getFile();
            $fp = fopen($filename, 'w');
            fwrite($fp, $contents);
            fclose($fp);
        }

        // Map a file extension to a MIME type; unknown or missing extensions fall back to a binary type.
        function getMimeType($filename)
        {
            $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
            switch ($extension) {
                case 'htm':  $mimetype = 'text/html'; break;
                case 'html': $mimetype = 'text/html'; break;
                case 'txt':  $mimetype = 'text/plain'; break;
                case 'cgi':  $mimetype = 'text/plain'; break;
                case 'php':  $mimetype = 'text/plain'; break;
                case 'css':  $mimetype = 'text/css'; break;
                case 'jpg':  $mimetype = 'image/jpeg'; break;
                case 'jpeg': $mimetype = 'image/jpeg'; break;
                case 'jpe':  $mimetype = 'image/jpeg'; break;
                case 'gif':  $mimetype = 'image/gif'; break;
                case 'png':  $mimetype = 'image/png'; break;
                default:     $mimetype = 'application/octet-stream'; break;
            }
            return $mimetype;
        }
    }
    ?>
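
To see what the class assembles, the structure of a multipart/related message can be reduced to a short standalone sketch. The function name buildMhtSketch, the array keys, and the boundary string below are illustrative assumptions and not the class's API; every part is base64-encoded, which mirrors the default used by the class.

```php
<?php
/**
 * Minimal sketch of a multipart/related (MHT-style) message.
 * Names are illustrative; this is not the MhtFileMaker API.
 */
function buildMhtSketch(array $parts, string $boundary): string
{
    // Top-level headers announce the boundary and the type of the root part.
    $out  = "MIME-Version: 1.0\r\n";
    $out .= "Content-Type: multipart/related;\r\n";
    $out .= "\tboundary=\"{$boundary}\";\r\n";
    $out .= "\ttype=\"{$parts[0]['mimetype']}\"\r\n";
    $out .= "\r\n";

    foreach ($parts as $part) {
        // Each part starts with "--" followed by the boundary, then its own headers.
        $out .= "--{$boundary}\r\n";
        $out .= "Content-Type: {$part['mimetype']}\r\n";
        $out .= "Content-Transfer-Encoding: base64\r\n";
        $out .= "Content-Location: {$part['location']}\r\n";
        $out .= "\r\n";
        // Base64 body, wrapped at 76 characters per line.
        $out .= chunk_split(base64_encode($part['content']), 76, "\r\n");
        $out .= "\r\n";
    }

    // The closing delimiter has "--" on both sides of the boundary.
    $out .= "--{$boundary}--\r\n";
    return $out;
}

echo buildMhtSketch([
    ['location' => 'tmp.html', 'mimetype' => 'text/html', 'content' => '<p>Hello</p>'],
    ['location' => 'images/a.gif', 'mimetype' => 'image/gif', 'content' => 'GIF89a'],
], 'EXAMPLE_BOUNDARY');
```

The Content-Location of each image part should match the path used in the HTML part, which is why the function above keeps the original src values in $files.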