When an article is too long, paging makes it easier to read. Below I introduce how to use the editor's page-break marker directly: when the article is read on the server side, the program splits it into an array and then outputs each part separately. Refer to this article if you need it.
Now a problem arises. The content of an article contains a lot of HTML tags. If you cut the content with Substring, a tag may be cut in half, or the cut may fall inside a tag's attributes, and the strings you get are broken markup. Therefore, HTML tags have to be filtered out when cutting.
My ability to explain is limited, so let's go straight to the code.
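To make the problem concrete, here is a minimal sketch of what a plain Substring cut does to markup. The names NaiveCut and EndsInsideTag are hypothetical helpers made up for this illustration; they are not part of the article's code.

```csharp
using System;

public static class NaiveCutDemo
{
    // Hypothetical helper: returns the first `count` characters of `html`
    // with no awareness of markup.
    public static string NaiveCut(string html, int count)
    {
        return html.Substring(0, Math.Min(count, html.Length));
    }

    // Hypothetical helper: true when the last '<' has no '>' after it,
    // i.e. the string stops in the middle of a tag.
    public static bool EndsInsideTag(string s)
    {
        return s.LastIndexOf('<') > s.LastIndexOf('>');
    }

    public static void Main()
    {
        string html = "<p>Hello <a href=\"/page\">world</a></p>";
        string cut = NaiveCut(html, 15);
        Console.WriteLine(cut);                 // <p>Hello <a hre
        Console.WriteLine(EndsInsideTag(cut));  // True
    }
}
```

The cut lands inside the href attribute, which is exactly the case the truncation code has to avoid.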
The code is as follows:
    using System.Collections.Generic;
    using System.Text;
    using System.Text.RegularExpressions;

    /// <summary>
    /// Obtain the paging data
    /// </summary>
    /// <param name="param">Content</param>
    /// <param name="size">Size of each page (excluding HTML)</param>
    /// <returns>One string per page</returns>
    public static List<string> SubstringTo(string param, int size)
    {
        // NoHTML is a helper defined elsewhere (not shown here);
        // it filters HTML tags that cannot be displayed on WAP.
        param = NoHTML(param);
        int length = param.Length;
        int being = 0; // index at which the next page starts
        var list = new List<string>();
        while (true)
        {
            string str = SubstringToHTML(param, being, size, "", out being);
            list.Add(str);
            if (length <= being)
            {
                break;
            }
        }
        return list;
    }

    /// <summary>
    /// Truncate a string by byte length (supports strings that contain HTML markup)
    /// </summary>
    /// <param name="param">String to be truncated</param>
    /// <param name="being">Index at which to start reading</param>
    /// <param name="length">Number of bytes to keep</param>
    /// <param name="end">String added at the end of the result</param>
    /// <param name="index">Index at which the next call should start</param>
    /// <returns>The truncated string</returns>
    public static string SubstringToHTML(string param, int being, int length, string end, out int index)
    {
        var result = new StringBuilder();
        int n = 0;           // bytes of visible text counted so far
        bool isCode = false; // inside an HTML tag?
        bool isHTML = false; // just read '&', the start of an entity such as &nbsp;
        int i;
        for (i = being; i < param.Length; i++)
        {
            char temp = param[i];
            if (temp == '<')
            {
                isCode = true;
            }
            else if (temp == '&')
            {
                isHTML = true;
            }
            else if (temp == '>' && isCode)
            {
                // n = n - 1; (disabled in the original, so the closing '>' is counted below)
                isCode = false;
            }
            else if (isHTML)
            {
                isHTML = false;
            }
            if (!isCode && !isHTML)
            {
                n = n + 1;
                // A character that encodes to more than one byte counts twice.
                if (Encoding.Default.GetByteCount(temp.ToString()) > 1)
                {
                    n = n + 1;
                }
            }
            result.Append(temp);
            if (n >= length)
            {
                break;
            }
        }
        index = i + 1;
        result.Append(end);

        // Remove the paired HTML tags from a working copy so that only unpaired tags remain.
        // My regular expressions are not good, so this part is not well written;
        // you can write a single regular expression that removes all pairs.
        // The replacement is an empty string: the original "$2" refers to a group that does not exist.
        string temp_result = result.ToString();
        temp_result = Regex.Replace(temp_result, @"(?is)<p\b[^>]*?>.*?</p>", "");
        temp_result = Regex.Replace(temp_result, @"(?is)<a\b[^>]*?>.*?</a>", "");
        // The source has one more pattern here for a single tag, but its tag name
        // did not survive (only "]*>" remains), so it is left out.
        temp_result = Regex.Replace(temp_result, @"(?is)<br\b[^>]*>", "");

        // Use a regular expression to find closing tags that have no opening tag.
        var bengHTML = new List<string>();
        foreach (Match mt in Regex.Matches(temp_result, @"</([a-zA-Z]+)>"))
        {
            bengHTML.Add(mt.Groups[1].Value);
        }
        // Complete the unpaired tags. Each opener is inserted at the front,
        // so the first stray closing tag ends up with the innermost opener.
        foreach (string tag in bengHTML)
        {
            result.Insert(0, "<" + tag + ">");
        }

        // Use a regular expression to find opening tags that have no closing tag.
        var endHTML = new List<string>();
        foreach (Match mt in Regex.Matches(temp_result, @"<([a-zA-Z]+)[^<>]*>"))
        {
            endHTML.Add(mt.Groups[1].Value);
        }
        // Complete the unclosed tags, innermost first.
        for (int nn = endHTML.Count - 1; nn >= 0; nn--)
        {
            result.Append("</").Append(endHTML[nn]).Append(">");
        }
        return result.ToString();
    }
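The tag-completion step at the end of SubstringToHTML can also be written with a stack instead of several regular-expression passes. The sketch below is an alternative technique, not the author's method. TagCloser, CloseOpenTags and the VoidTags list are hypothetical names chosen for this example, and it only appends missing closing tags; it does not restore opening tags that were cut off at the start of a page.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class TagCloser
{
    // Assumption: a short, incomplete list of tags that never take a closing tag.
    private static readonly HashSet<string> VoidTags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "br", "img", "hr", "input" };

    // Appends closing tags for every element still open at the end of `fragment`.
    public static string CloseOpenTags(string fragment)
    {
        var open = new Stack<string>();
        foreach (Match m in Regex.Matches(fragment, @"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^<>]*>"))
        {
            string name = m.Groups[2].Value;
            if (VoidTags.Contains(name))
            {
                continue;
            }
            if (m.Groups[1].Value == "/")
            {
                // A closing tag pops its opener only when it matches the innermost open element.
                if (open.Count > 0 && string.Equals(open.Peek(), name, StringComparison.OrdinalIgnoreCase))
                {
                    open.Pop();
                }
            }
            else if (!m.Value.EndsWith("/>", StringComparison.Ordinal))
            {
                open.Push(name);
            }
        }

        var sb = new StringBuilder(fragment);
        while (open.Count > 0)
        {
            sb.Append("</").Append(open.Pop()).Append('>');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(CloseOpenTags("<p>Hello <b>wor")); // <p>Hello <b>wor</b></p>
    }
}
```

Because the stack remembers nesting order, the closing tags come out innermost first without a separate pass to strip matched pairs.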
Summary:
Paging an article is somewhat different from paging database records, and there are several ways to do it. One is to save the article to the database in multiple parts and decide which part to show when reading it. Another is to insert the editor's page-break marker wherever a page should end, then use a split function to separate the pages and output them in a for loop.
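The second approach can be sketched roughly as follows. The marker string is an assumption, since the actual page-break marker depends on the editor in use, and PageBreakSplitter, SplitPages and GetPage are hypothetical names for this illustration.

```csharp
using System;

public static class PageBreakSplitter
{
    // Assumption: a placeholder marker; replace it with the one your editor inserts.
    public const string PageBreakMarker = "<!--pagebreak-->";

    // Splits the stored article into pages at each marker, dropping empty pages.
    public static string[] SplitPages(string content)
    {
        return content.Split(new[] { PageBreakMarker }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Returns the requested page (1-based), clamped to the valid range.
    public static string GetPage(string content, int pageNumber)
    {
        string[] pages = SplitPages(content);
        if (pages.Length == 0)
        {
            return string.Empty;
        }
        int index = Math.Max(1, Math.Min(pageNumber, pages.Length)) - 1;
        return pages[index];
    }

    public static void Main()
    {
        string article = "<p>Page one.</p>" + PageBreakMarker + "<p>Page two.</p>";
        string[] pages = SplitPages(article);
        for (int i = 0; i < pages.Length; i++)
        {
            Console.WriteLine("Page " + (i + 1) + " of " + pages.Length + ": " + pages[i]);
        }
    }
}
```

Since the author decides where each marker goes, this approach never cuts through a tag, at the cost of requiring the markers to be placed by hand.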