A long time ago I wrote an article, "Using C# to truncate a string of mixed Chinese and English characters to a specified length", but I never tested its performance. Some readers said the method I wrote performs poorly. Thinking about it later, I realized that someone might pass in an extreme input, a string of tens of thousands of KB or even several MB, and that would slow down the regular expression matching. This is quite likely to happen in an article system, for example. I had a little time today, so I improved it. The code is as follows:
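// Requires: using System.Text.RegularExpressions;
public static string Getstr(string s, int l, string endStr)
{
    // Cut to at most l characters first, so the regex never sees the whole input.
    string temp = s.Substring(0, (s.Length < l) ? s.Length : l);
    // Replace each Chinese character with two letters to measure the width.
    if (Regex.Replace(temp, "[\u4e00-\u9fa5]", "zz", RegexOptions.IgnoreCase).Length <= l)
    {
        // Note: if s is longer than l but its first l characters contain no
        // Chinese characters, the cut text is returned here without endStr.
        return temp;
    }
    // Drop characters from the end until the text plus endStr fits in l.
    for (int i = temp.Length; i >= 0; i--)
    {
        temp = temp.Substring(0, i);
        if (Regex.Replace(temp, "[\u4e00-\u9fa5]", "zz", RegexOptions.IgnoreCase).Length <= l - endStr.Length)
        {
            return temp + endStr;
        }
    }
    return endStr;
}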
The modified method adds a parameter, "string endStr": when the string "string s" exceeds the specified length "int l", this is the text appended to the end, such as an ellipsis "..." or other characters.
In addition, when an ellipsis is appended, its length is counted as part of the result length.
Usage example:
Getstr("China 1 China Medium 1111 China", 23, "")
// Output: China 1 China Medium 1111 China
Getstr("China 1 China Medium 1111 China", 23, "...")
// Output: China 1. China (China) 1111...
Getstr("China 1 China Middle 1111 China", 23, "")
// Output: China 1 China Medium 1111 China
Getstr("China 1 China Medium 1111 China", 23, "...")
// Output: China 1. China (China) 1111...
----------------------------------------------------------------------
Supplement: "Kpz" replied that the method above can truncate incorrectly. I could not test it exhaustively, so I switched to another approach. Because I was also trying to keep performance in mind, the logic became a little convoluted, and I tested it many times. The code is as follows:
public static string Getstr2(string s, int l, string endStr)
{
    // Take one character more than l so that overflow can be detected.
    string temp = s.Substring(0, (s.Length < l + 1) ? s.Length : l + 1);
    // ASCII encoding turns every non-ASCII character into '?' (byte 63).
    byte[] encodedBytes = System.Text.ASCIIEncoding.ASCII.GetBytes(temp);
    string outputStr = "";
    int count = 0;
    for (int i = 0; i < temp.Length; i++)
    {
        // Note: a literal '?' in the input is also byte 63, so it is counted as 2.
        if ((int)encodedBytes[i] == 63)
            count += 2;
        else
            count += 1;
        if (count <= l - endStr.Length)
            outputStr += temp.Substring(i, 1);
        else if (count > l)
            break;
    }
    // Everything fits: return the text unchanged, without endStr.
    if (count <= l)
    {
        outputStr = temp;
        endStr = "";
    }
    outputStr += endStr;
    return outputStr;
}
The usage and the meaning of the parameters are the same as before. Note that the ellipsis also takes up space and is counted in the length.
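For comparison, here is a minimal single-pass sketch of the same idea. It is not the method above: it swaps the ASCII-encoding test for a direct check against the range U+4E00..U+9FA5 used by the regex in the first method, so a literal "?" counts as 1. The names TruncateSketch, Truncate, CharWidth, and maxWidth are hypothetical, and the sketch assumes the input contains no surrogate pairs.

```csharp
// Hypothetical sketch, not the author's method.
public static class TruncateSketch
{
    // Assumption: only characters in U+4E00..U+9FA5 count as width 2.
    private static int CharWidth(char c)
    {
        return (c >= '\u4e00' && c <= '\u9fa5') ? 2 : 1;
    }

    public static string Truncate(string s, int maxWidth, string endStr)
    {
        // Width left for the text once endStr has been reserved.
        int budget = maxWidth - endStr.Length;
        int width = 0;
        int keep = 0; // number of leading characters that fit in the budget

        for (int i = 0; i < s.Length; i++)
        {
            width += CharWidth(s[i]);
            if (width > maxWidth)
            {
                // Overflow: stop scanning here and return the part that fits plus endStr.
                return s.Substring(0, keep) + endStr;
            }
            if (width <= budget)
            {
                keep = i + 1;
            }
        }
        // The whole string fits, so it is returned unchanged and endStr is not added.
        return s;
    }
}
```

With this sketch, Truncate("abcdefgh", 5, "...") gives "ab...", Truncate("中文中文", 6, "...") gives "中...", and Truncate("中文a", 5, "...") returns "中文a" unchanged. The loop stops as soon as the width exceeds maxWidth, so a very long input is never scanned past that point.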