Automatically convert plain text to Web pages with PHP


Recently, an old friend of mine called me for help. He has worked as a journalist for many years and had just gained the right to publish many of his early columns. He wanted to put his work on the Web, but his columns were saved as plain text files, and he had neither the time nor the desire to learn HTML in order to convert them into Web pages. Since I was the only computer-savvy person in his phone book, he called to see if I could help.

"Leave it to me," I said, "and call me back in an hour." Sure enough, when he called back a few hours later, I had a solution ready for him. It took a little bit of PHP, and it earned me his endless thanks and a case of red wine.

So what did I do in that hour? That is the subject of this article: I'll show you how to use PHP to quickly transform plain ASCII text into readable HTML markup.

First, let's look at an example of a plain text file that my friend wants to convert:


Green for Mars!
John R. Doe

The idea of little green men from Mars, long a staple of science fiction, may soon turn out to be less fantasy and more fact.

Recent samples sent by the latest Mars exploration team indicate a high presence of chlorophyll in the atmosphere. Chlorophyll, you'll recall, is what makes plants green. It's quite likely, therefore, that organisms on Mars would have, through continued exposure to the green stuff, developed a greenish tinge on their outer exoskeleton.

An interview with Dr. Rushel Bunter, the head of ASDA's Mars Colonization Project blah blah ...

What does this mean for you? Well, it means blah blah blah ...

Track follow-ups to the story online at http://www.mars-connect.dom/. To see pictures of the latest samples, log on to http://www.asdamcp.dom/galleries/220/

This is fairly standard text: it has a title, a byline, and several paragraphs of body text. To turn this document into HTML, you need to preserve the layout of the original text on the Web page using HTML's line-break and paragraph elements. Special punctuation marks need to be converted into the corresponding HTML entities, and URLs need to become clickable hyperlinks.

The following PHP code (Listing A) completes all of the above tasks:

Listing A

Let's take a look at how it works:


<?php
// Set source file name and path
$source = "Toi200686.txt";

// Read raw text as array
$raw = file($source) or die("Cannot read file");

// Retrieve first and second lines (title and author)
$slug = array_shift($raw);
$byline = array_shift($raw);

// Join remaining data into string
$data = join('', $raw);

// Replace special characters with HTML entities
// Replace line breaks with <br />
$html = nl2br(htmlspecialchars($data));

// Replace multiple spaces with single spaces
$html = preg_replace('/\s\s+/', ' ', $html);

// Replace URLs with <a href...> elements
$html = preg_replace('/\s(\w+:\/\/)(\S+)/', ' <a href="$1$2">$1$2</a>', $html);

// Start building output page
// Add page header (a minimal skeleton; adjust the style rules as needed)
$output = <<<HEADER
<html>
<head>
<style>
.slug { font-size: 15pt; font-weight: bold; }
.byline { font-style: italic; }
</style>
</head>
<body>
HEADER;

// Add page content
$output .= "<div class='slug'>$slug</div>";
$output .= "<div class='byline'>by $byline</div>";
$output .= "<div>$html</div>";

// Add page footer
$output .= <<<FOOTER
</body>
</html>
FOOTER;

// Display in browser
echo $output;

// and/or

// Write output to a new .html file
file_put_contents(basename($source, substr($source, strpos($source, '.'))) . ".html", $output) or die("Cannot write file");
?>

The first step is to read the plain ASCII file into a PHP array. This is easily done with the file() function, which turns each line of the file into an element of a numerically indexed array.
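To make the behavior of file() concrete, here is a minimal sketch. It writes a three-line sample to a temporary file so that it can run on its own; the temporary file and its contents are assumptions made for illustration and are not part of the original script.

```php
<?php
// Create a small sample file so the example is self-contained.
// The file and its three lines are made up for this illustration.
$source = tempnam(sys_get_temp_dir(), 'txt');
file_put_contents($source, "Green for Mars!\nJohn R. Doe\nFirst paragraph of the story.\n");

// file() returns one array element per line; each element keeps its trailing newline.
$raw = file($source);
if ($raw === false) {
    die("Cannot read file");
}

echo count($raw), "\n";     // number of lines read
echo rtrim($raw[0]), "\n";  // first line with the trailing newline removed

unlink($source);            // clean up the temporary file
```

Because every element keeps its line ending, the lines can later be glued back together without inserting any separator.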

Next, the title and the author line (I assume these are the first two lines of the file) are pulled off the array with the array_shift() function and placed in separate variables. The remaining elements of the array are then joined into a single string, which now holds the entire body of the article.
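The extraction step can be sketched in isolation as follows. The array below stands in for the result of file(), and the trim() calls are an addition of this sketch (the listing keeps the trailing newline on the title and byline).

```php
<?php
// Stand-in for the array returned by file(); the contents are illustrative only.
$raw = [
    "Green for Mars!\n",
    "John R. Doe\n",
    "First paragraph.\n",
    "\n",
    "Second paragraph.\n",
];

// array_shift() removes and returns the first element, so two calls
// peel off the title and then the byline.
$slug   = trim(array_shift($raw));
$byline = trim(array_shift($raw));

// Each remaining element still ends in a newline, so joining with an
// empty string restores the body exactly as it appeared in the file.
$data = implode('', $raw);

echo $slug, " / ", $byline, "\n";
echo $data;
```

join() in the listing is an alias of implode(), so the two are interchangeable here.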

Special characters such as "&", "<" and ">" are converted into the corresponding HTML entities by the htmlspecialchars() function. To preserve the original formatting of the article, line and paragraph breaks are converted into HTML <br /> elements by the nl2br() function. Runs of multiple spaces within the article are compressed into a single space by a simple regular-expression substitution.
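A short sketch of these three transformations applied to a made-up input string (the sample text is an assumption chosen to exercise each step):

```php
<?php
// Illustrative input: markup-significant characters, a line break, and a run of spaces.
$data = "Tom & Jerry <live>\nshow    tonight";

// Escape first, so that the <br /> tags added by nl2br() are not escaped themselves.
$html = nl2br(htmlspecialchars($data));

// Collapse any run of two or more whitespace characters into one space.
$html = preg_replace('/\s\s+/', ' ', $html);

echo $html, "\n";
```

The order of the calls matters: running htmlspecialchars() after nl2br() would turn the inserted <br /> tags into visible text.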

URLs in the body of the article are detected with a regular expression and wrapped in <a> elements. When the page is displayed in a Web browser, each URL then appears as a clickable hyperlink.
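The pattern can be tried on its own as shown below. The sample sentence and the exact anchor markup in the replacement string are assumptions of this sketch.

```php
<?php
// Illustrative body text containing a URL preceded by whitespace.
$html = "Track follow-ups at http://www.mars-connect.dom/ today";

// \s          leading whitespace (the pattern only matches URLs that follow one)
// (\w+:\/\/)  the scheme, such as http://
// (\S+)       everything up to the next whitespace character
$html = preg_replace(
    '/\s(\w+:\/\/)(\S+)/',
    ' <a href="$1$2">$1$2</a>',
    $html
);

echo $html, "\n";
```

Note that (\S+) also swallows punctuation that directly follows a URL, so a sentence ending in "http://www.mars-connect.dom/." would include the final period in the link. A stricter pattern may be worth considering if that matters for your text.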

The output page is then assembled from standard HTML. The title, byline, and body of the article are styled with CSS rules. Although this script does not do so, this is the place to customize the appearance of the final page: you can add graphics, colors, or other eye-catching content to the template.
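One possible way to keep the template in a single place is to wrap it in a function, as in the sketch below. render_page() is a hypothetical helper that does not appear in the original script, and the class names and style rules are placeholders to adapt.

```php
<?php
// render_page() is a hypothetical helper, not part of the original listing.
// The class names and style rules are placeholders.
function render_page(string $slug, string $byline, string $html): string
{
    // Escape the title and byline; $html is assumed to be escaped already.
    $slug   = htmlspecialchars($slug);
    $byline = htmlspecialchars($byline);

    return <<<PAGE
<html>
<head>
<title>$slug</title>
<style>
.slug   { font-size: 15pt; font-weight: bold; }
.byline { font-style: italic; }
</style>
</head>
<body>
<div class="slug">$slug</div>
<div class="byline">By $byline</div>
<div class="body">$html</div>
</body>
</html>
PAGE;
}

echo render_page('Green for Mars!', 'John R. Doe', 'First paragraph.<br />');
```

Changing the look of every converted column then only requires editing the style rules inside this one function.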

Once the HTML page is built, it can be sent to the browser or saved as a static file with file_put_contents(). Note that when saving, the original file name is split at its extension, and a new file name (filename.html) is created for the new Web page. You can then publish the page to a Web server, save it to a CD, or edit it further.
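The file-name handling can be broken into steps as in the sketch below. The second part uses pathinfo(), which the original script does not use; it is shown only as an alternative, and "column.v2.txt" is a made-up name.

```php
<?php
$source = "Toi200686.txt";

// strpos() finds the first dot, substr() returns everything from that dot
// onward (".txt"), and basename() strips that suffix from the name.
$suffix = substr($source, strpos($source, '.'));
$target = basename($source, $suffix) . ".html";

echo $target, "\n";

// pathinfo() splits on the last dot instead, which matters for
// names that contain more than one dot.
$alternative = pathinfo("column.v2.txt", PATHINFO_FILENAME) . ".html";

echo $alternative, "\n";
```

For "Toi200686.txt" the result is "Toi200686.html". For a name with several dots the two approaches differ: the strpos() version treats everything after the first dot as the extension, while pathinfo() keeps "column.v2" as the base name.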
