I recently ran into a problem that requires replacing substrings of a particular structure inside a string. For example:
String str = "confirmed/V 30/m example/q,/wd of which/rz death/vi 5/f case/N,/wd Cure/V discharged/vi 3/n case/n"; This is the output of Chinese word segmentation and part-of-speech annotation, and I want to correct it. Numerals such as "30, 5, 3" are tagged inconsistently, which makes later processing awkward, so the tags need to be unified. So:
Requirement: the "/letter" tag that follows each number should be unified into the form "number/m".
This seemingly simple task troubled me, a beginner, for a whole morning. My first thought was a regular expression that matches number + slash + letters, and the expression is easy to write: [0-9]+/[a-z]. However, I did not know what to do with the match: I want to keep the number in front and replace only the letters after the slash. At first I used matcher.group(1) to get the number and then tried to replace the letters:
word = matcher.group(1); str = str.replace("([0-9]+)/[a-z]*", word + "/m");
This does not do the job. String.replace treats its first argument as literal text rather than a regular expression, so nothing is matched, and replaceAll with this replacement does not help either, because the number in front is different for every match.
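To make the failure concrete, here is a minimal sketch of the two dead ends described above. The class name FirstAttemptDemo, the method names, and the shortened input "30/m 5/f 3/n" are assumptions made up for illustration; only String.replace and String.replaceAll are real library calls.

```java
public class FirstAttemptDemo {
    static final String REGEX = "([0-9]+)/[a-z]+";

    // String.replace looks for the pattern text literally, so nothing matches
    // and the input comes back unchanged.
    static String literalReplace(String s) {
        return s.replace(REGEX, "30/m");
    }

    // replaceAll does interpret the regex, but a fixed replacement string
    // overwrites every number with the same one.
    static String fixedReplaceAll(String s) {
        return s.replaceAll(REGEX, "30/m");
    }

    public static void main(String[] args) {
        String s = "30/m 5/f 3/n"; // hypothetical shortened input
        System.out.println(literalReplace(s));  // prints "30/m 5/f 3/n"
        System.out.println(fixedReplaceAll(s)); // prints "30/m 30/m 30/m"
    }
}
```

Neither call can carry the number of each individual match into its own replacement, which is the missing piece.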
So I went searching. Java's Matcher class has two more methods for string substitution: appendReplacement(StringBuffer sb, String replacement) and appendTail(StringBuffer sb).
appendReplacement(StringBuffer sb, String replacement)
Replaces the current match with the specified string: it appends to the StringBuffer the text between the previous match and the current one, followed by the replacement. appendTail(StringBuffer sb) then appends whatever text remains after the last match to the StringBuffer.
For example, take the string fatcatfatcatfat and the regular expression "cat". After the first match, calling appendReplacement(sb, "dog") leaves sb containing fatdog: the text before the match is copied into sb and the matched cat is replaced with dog. After the second match, calling appendReplacement(sb, "dog") again makes sb contain fatdogfatdog. Finally, calling appendTail(sb) makes the content of sb fatdogfatdogfat.
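The walk-through above can be reproduced with a short program. This is only a sketch: the class name AppendReplacementTrace and the helper trace are hypothetical names chosen for the example, and the helper records the buffer contents after each call so the intermediate states can be inspected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendReplacementTrace {
    // Returns the buffer contents after each appendReplacement call,
    // followed by the contents after the final appendTail call.
    static List<String> trace(String input, String regex, String replacement) {
        List<String> steps = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            matcher.appendReplacement(sb, replacement);
            steps.add(sb.toString());
        }
        matcher.appendTail(sb);
        steps.add(sb.toString());
        return steps;
    }

    public static void main(String[] args) {
        // Prints fatdog, fatdogfatdog, fatdogfatdogfat on separate lines.
        for (String step : trace("fatcatfatcatfat", "cat", "dog")) {
            System.out.println(step);
        }
    }
}
```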
So the code is as follows:
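// requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String word = null;
Pattern pattern = Pattern.compile("([0-9]+)/[a-z]+");
Matcher matcher = pattern.matcher(str);
StringBuffer sb = new StringBuffer();
while (matcher.find()) {
    // group(1) is the number captured in front of the slash
    word = matcher.group(1);
    // copy the text up to this match, then the number with the unified tag
    matcher.appendReplacement(sb, word + "/m");
}
// copy the text that follows the last match
matcher.appendTail(sb);
return sb.toString();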
Result: confirmed/V 30/m example/q,/wd of which/rz death/vi 5/m case/N,/wd Cure/V discharged/vi 3/m case/n
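For anyone who wants to run this, below is a self-contained sketch that wraps the loop in a method. The class name NumeralTagFixer and both method names are assumptions for illustration. The second method is an alternative that is not part of the approach above: in a replaceAll replacement string, $1 refers to the first capture group, so for this particular task a single call appears to give the same result.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumeralTagFixer {
    private static final Pattern NUMBER_TAG = Pattern.compile("([0-9]+)/[a-z]+");

    // The appendReplacement/appendTail loop from the text, wrapped in a method.
    static String unifyNumeralTags(String str) {
        Matcher matcher = NUMBER_TAG.matcher(str);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            // keep the captured number, replace only the tag after the slash
            matcher.appendReplacement(sb, matcher.group(1) + "/m");
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    // Alternative sketch: $1 in the replacement stands for capture group 1.
    static String unifyWithBackReference(String str) {
        return str.replaceAll("([0-9]+)/[a-z]+", "$1/m");
    }

    public static void main(String[] args) {
        String str = "confirmed/V 30/m example/q,/wd of which/rz death/vi 5/f case/N,"
                + "/wd Cure/V discharged/vi 3/n case/n";
        System.out.println(unifyNumeralTags(str));
        System.out.println(unifyWithBackReference(str));
    }
}
```

The explicit loop is still the more flexible form, since the replacement for each match can be computed by arbitrary code rather than by a fixed template.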