Java string truncation

Source: Internet
Author: User

Java string truncation by byte length (handling a cut that lands in the middle of a character). Method 2 is the one used in the project.

Method 1 is adapted from someone else's code; personally I find Method 1 more concise.

package everyday;

import java.io.UnsupportedEncodingException;


/**
* * Title:
Write a function that intercepts a string, enter it as a string and number of bytes, and output a string that is truncated by bytes. But to ensure that Chinese characters are not truncated half, such as "I abc" 4, should be cut to "I ab", input "I ABC Han def", 6, should be output as "I abc" rather than "I abc+ Han half."
GB2312, GBK, gb18030,cp936, and CNS11643 all meet the requirements-Chinese is 2 bytes, and English is 11 bytes.
Because Chinese is converted to byte bytes, the length of the converted bytes will not pass, as encoding is UTF-8, and a Chinese string converted to byte takes three bytes.
*
*/
public class Learncsplit {
    /**
     * Method 1, simpler than Method 2.
     *
     * @param str     target string
     * @param length1 maximum length in bytes
     * @param code    name of the encoding to use
     * @return the longest prefix of str whose encoded form fits in length1 bytes
     * @throws UnsupportedEncodingException if the named encoding is not supported
     */
    private static String substring(String str, int length1, String code) throws UnsupportedEncodingException {
        if (str == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        int currentLength = 0;
        // Encode one char at a time and stop before the byte budget is exceeded.
        for (char c : str.toCharArray()) {
            currentLength += String.valueOf(c).getBytes(code).length;
            if (currentLength <= length1) {
                sb.append(c);
            } else {
                break;
            }
        }

        return sb.toString();
    }
public static void Main (string[] args) throws Unsupportedencodingexception {
//stringbuilder sb=null;//Thread is unsafe, high performance
String str= "I abc Han def";
int length1=3;
int length2=6;
String [] codes=new string[]{"GB2312", "GBK", "GB18030", "CP936", "CNS11643", "UTF-8"};
For (String code:codes) {
System.out.println (New StringBuilder (). Append ("with"). Append (code)
. Append ("encoded intercept string--" ""). Append (str). Append ( "" ")
. Append (length1). Append ("The result of a byte is" ")
. Append (substring (str,length1,code)). Append ("" "). toString ());

System.out.println (New StringBuilder (). Append ("with"). Append (code)
. Append ("encoded intercept string--" ""). Append (str). Append ( "" ")
. Append (length2). Append ("The result of a byte is" ")
. Append (substring (str,length2,code)). Append ("" "). toString ());
}

The above is Method 1
String value= "Urumqi Test and Test Development Resource Service Co., Ltd. Dabancheng branch 1A2B3";
Number of statistics bytes
int Countbytes=conutbyte (value);
40 bytes of known field length
if (countbytes>40) {
Value=substr (value,0,40);
SYSTEM.OUT.PRINTLN ("Output a string of the specified field length:" +value);

}






}

    /**
     * Counts the bytes of a string encoded as GB18030.
     *
     * @param value the string to measure
     * @return the number of bytes, or 0 if value is null or the encoding is unsupported
     */
    private static int conutByte(String value) {
        if (value == null) {
            return 0;
        }
        byte[] bs;
        try {
            bs = value.getBytes("GB18030");
            int lenbs = bs.length;
            return lenbs;
        } catch (UnsupportedEncodingException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        return 0;
    }
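    /**
     * Truncates a string by bytes.
     *
     * @param str   the string to truncate
     * @param begin byte offset to start from (assumed to fall on a character boundary)
     * @param zdcd  maximum length in bytes
     * @return the truncated string, or "" if the encoding is unsupported
     */
    private static String substr(String str, int begin, int zdcd) {
        if (str == null) {
            return str;
        }
        String str2;

        str = getSubstring(str, zdcd); // truncate to the given byte length without returning half a Chinese character, e.g. a limit of 20
        zdcd = conutByte(str); // recompute the byte count, e.g. 19
        // I'm going to bad fun 123 I'm going to
        byte[] bs;
        try {
            bs = str.getBytes("GB18030");
            // Decode only the bytes from begin to the end of the truncated string,
            // so a non-zero begin does not run past the array.
            str2 = new String(bs, begin, zdcd - begin, "GB18030");
            return str2;
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
        return "";

    }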
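    /**
     * <b>Truncates a string to the given byte length without returning half a Chinese character.</b>
     *
     * @param str  the string to truncate
     * @param zdcd maximum length in bytes
     * @return the truncated string
     */
    private static String getSubstring(String str, int zdcd) {
        if (zdcd <= 0) {
            return "";
        }
        int count = 0;
        int offset = 0;
        char[] c = str.toCharArray();

        for (int i = 0; i < c.length; i++) {
            // Heuristic: chars above 256 are counted as 2 bytes, all others as 1 byte.
            // This approximates GBK-style encodings and does not query the real encoder.
            if (c[i] > 256) {
                offset = 2;
                count += 2;
            } else {
                offset = 1;
                count++;
            }

            // The budget is used up exactly: keep the current char.
            if (count == zdcd) {
                return str.substring(0, i + 1);
            }
            // A 2-byte char overshot the budget by one byte: drop it.
            if ((count == zdcd + 1 && offset == 2)) {
                return str.substring(0, i);
            }

        }
        // The loop ended without reaching the budget, so the whole string fits.
        return str;
    }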

}

Console output Results:

Intercept a string with GB2312 encoding--"I am ABC def" The result of 3 bytes is "I A"
Using GB2312 encoding to intercept a string--"I abc def" 6 bytes result is "I abc"
Intercept a string with GBK encoding--"I am ABC def" The result of 3 bytes is "I A"
Using GBK encoding to intercept a string--"I abc def" 6 bytes result is "I abc"
Intercept a string with GB18030 encoding--"I am ABC def" The result of 3 bytes is "I A"
Using GB18030 encoding to intercept a string--"I abc def" 6 bytes result is "I abc"
Intercept a string with CP936 encoding--"I am ABC def" The result of 3 bytes is "I A"
Using CP936 encoding to intercept a string--"I abc def" 6 bytes result is "I abc"
Intercept a string with CNS11643 encoding--"I am ABC def" The result of 3 bytes is "I A"
Using CNS11643 encoding to intercept a string--"I abc def" 6 bytes result is "I abc"
Intercept string with UTF-8 encoding--"I abc def" 3 bytes result is "I"
Using UTF-8 encoding to intercept a string--"I abc def" 6 bytes result is "I abc"
Output a string specifying the length of a field: Urumqi Test and Test Development Resource Service Co., Ltd.

Method 3: Truncate a string to a specified length

public class Characterssplit {
    public static void main(String[] args) {
        // The original literal was a Chinese company name; it appears here in translation.
        String value = "Ulu a Muzzi co-sheng Human Resources Services Limited liability company Dabancheng Branch 1a2b3"; // 24+6+2=32 (length of the original Chinese literal)
        value = getSubstring(value, value.toCharArray().length); // len equals the string length, so the string is returned unchanged
        value = getSubstring(value, 89);
        value = value.substring(0, 6); // with 89 instead of 6, String.substring would throw an index-out-of-bounds exception
        System.out.println(value);
    }

    /**
     * Description: truncates a string to a specified length.
     * Unlike String.substring, it does not throw an index-out-of-bounds exception when the string is shorter than the requested length.
     */
    public static String getSubstring(String source, int len) {
        if (source == null || source.isEmpty()) {
            return "";
        }
        if (source.length() <= len) { // 32=32
            // source.length() == value.toCharArray().length
            return source;
        }
        return source.substring(0, len);
    }
}

Run output: Ulu a Muzzi Human Resources Service Co., Ltd. Dabancheng branch 1A2B3
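
Method 3 guards against an out-of-range length by comparing it with `source.length()`. A minimal sketch of the same idea, under the assumption that a negative length should also yield an empty string, clamps the end index with `Math.min`. The second helper counts Unicode code points instead of `char` values, so a surrogate pair is never split. `SafePrefixSketch`, `prefix` and `prefixByCodePoints` are hypothetical names, not part of the original post.

```java
// Hypothetical helper class, for illustration only.
class SafePrefixSketch {

    // Returns at most len chars from the start of source; never throws for a large len.
    public static String prefix(String source, int len) {
        if (source == null || len <= 0) {
            return "";
        }
        return source.substring(0, Math.min(len, source.length()));
    }

    // Same idea, but the limit is counted in code points rather than chars.
    public static String prefixByCodePoints(String source, int maxCodePoints) {
        if (source == null || maxCodePoints <= 0) {
            return "";
        }
        int total = source.codePointCount(0, source.length());
        if (total <= maxCodePoints) {
            return source;
        }
        // offsetByCodePoints converts a code point count into a char index.
        return source.substring(0, source.offsetByCodePoints(0, maxCodePoints));
    }

    public static void main(String[] args) {
        System.out.println(prefix("abcdef", 3));  // abc
        System.out.println(prefix("abcdef", 89)); // abcdef
    }
}
```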
