Java Error Series--split

Source: Internet
Author: User

Objective

In Java projects we often use the split method of String to process text. In MapReduce jobs, too, the contents of an HDFS file usually have to be split after they are read. A few days ago, a program that had always worked suddenly failed at a split call. The cause turned out to be a piece of dirty data: it made split return something the code did not expect, which triggered an "impossible" exception. This shows how much the basics matter, so I read the source of split in the String class and found that even this small method has quite a lot to it.


split in String

In String, the split method is as follows:


[Image: String0.png]

[Image: String1.png]

As you can see, the core of split is Pattern.compile(regex).split(this, limit).
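The delegation noted above can be checked with a small sketch. The class name SplitDelegation, the helper sameResult and the sample input are hypothetical and used only for illustration. Newer JDKs also take a fast path for simple single-character delimiters, but the returned array should be the same either way.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitDelegation {
    // Splits the same text via String.split and via Pattern.split,
    // and reports whether both calls return equal arrays.
    static boolean sameResult(String text, String regex, int limit) {
        String[] viaString = text.split(regex, limit);
        String[] viaPattern = Pattern.compile(regex).split(text, limit);
        return Arrays.equals(viaString, viaPattern);
    }

    public static void main(String[] args) {
        String text = "a,b,,c,,"; // hypothetical sample input
        for (int limit : new int[] {-1, 0, 2}) {
            System.out.println("limit=" + limit + " same=" + sameResult(text, ",", limit));
        }
    }
}
```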


Java provides Pattern and Matcher to support regular expressions. Here is an example:


[Image: String3.png]


The output is:

0,1
||
3,4
|ab|
7,8
|cef|
8,9
||
11,12
|kk|
13,14
|a|


It is important to note that:


Given a pattern (regex) compiled by Pattern, a Matcher can repeatedly call find() to match the text. For each match, start() gives the index where the matched content begins, and end() gives the end index, that is, start() plus the length of the matched text.

subSequence(begin, end) includes begin and excludes end (a half-open range).
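A minimal sketch of such a find() loop follows. The original example code is not available here, so the class name FindLoopDemo, the helper trace and the input string ",ab,cef,,kk,a,tail" are assumptions, chosen so that the loop produces the start/end pairs and pieces listed in the output earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindLoopDemo {
    // For each match, records "start,end" and then the text between the
    // previous match and this one, wrapped in '|' characters.
    static List<String> trace(String text, String regex) {
        List<String> lines = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        int index = 0; // end index of the previous match
        while (m.find()) {
            lines.add(m.start() + "," + m.end());
            // subSequence includes index and excludes m.start()
            lines.add("|" + text.subSequence(index, m.start()) + "|");
            index = m.end();
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical input: the text after the last comma is never
        // visited by the loop, because no further match follows it.
        for (String line : trace(",ab,cef,,kk,a,tail", ",")) {
            System.out.println(line);
        }
    }
}
```

Note that "tail" never appears in the trace: the loop stops at the end of the last match, so whatever follows it has to be handled separately.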



Questions:

In the while loop above, the largest index we can reach is the end of the last match, yet there may still be content after that end. How should it be handled?

If the regex splits the text into many parts, what if we only need some of them?

If some of the resulting parts are empty strings, how does split treat them?



With these questions in mind, let's look at the source code:


[Image: String4.png]

[Image: String5.png]


First, the effect of limit on matchLimited:


limit <= 0 (split(regex) is equivalent to split(regex, 0)) ==> matchLimited: false

limit > 0 ==> matchLimited: true


The while loop means that if limit > 0, matchList receives only a limited number of pieces: at most limit - 1 pieces are cut off at matches, and the rest of the text becomes the final element.

If nothing in the text matches, an array of length 1 containing the original string is returned.

If limit is 0, trailing empty strings are removed from the result before it is returned.
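The three rules above can be put together in a simplified sketch. This is not the JDK source: the class name SimpleSplit is hypothetical, and the sketch assumes the delimiter never matches an empty string, so it skips the special handling real JDKs apply to zero-width matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SimpleSplit {
    // Simplified re-implementation of the split logic described above.
    static String[] split(String input, String regex, int limit) {
        boolean matchLimited = limit > 0;
        List<String> matchList = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int index = 0; // end index of the last match that was consumed

        while (m.find()) {
            if (matchLimited && matchList.size() >= limit - 1) {
                break; // limit reached: the rest stays in one piece
            }
            matchList.add(input.substring(index, m.start()));
            index = m.end();
        }

        // Nothing was cut off: return the input itself.
        if (index == 0) {
            return new String[] {input};
        }

        // Content after the last consumed match becomes the final element.
        matchList.add(input.substring(index));

        // limit == 0: drop trailing empty strings.
        int resultSize = matchList.size();
        if (limit == 0) {
            while (resultSize > 0 && matchList.get(resultSize - 1).isEmpty()) {
                resultSize--;
            }
        }
        return matchList.subList(0, resultSize).toArray(new String[0]);
    }

    public static void main(String[] args) {
        String text = ",ab,cef,,kk,a,,"; // hypothetical sample input
        for (int limit : new int[] {-1, 0, 3}) {
            System.out.println(limit + " -> " + String.join("|", split(text, ",", limit)));
        }
    }
}
```

The sketch also answers the earlier questions: the text after the last match is appended once the loop ends, a positive limit caps the number of pieces, and empty strings are kept unless they are trailing and limit is 0.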




Conclusion

In practice, we generally use one of the following:

str.split(","): limit is 0 here; just remember that trailing empty strings are removed.

str.split(",", -1): trailing empty strings are preserved.

str.split(",", 2): the result has at most 2 elements, and trailing empty strings are preserved.
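The three forms can be compared side by side. In this sketch the class name SplitLimitDemo, the helper parseRecord and the sample record are hypothetical; parseRecord shows one possible way to catch dirty data early by checking the field count, not the fix used in the original program.

```java
import java.util.Arrays;

public class SplitLimitDemo {
    // Splits one comma-separated record and fails fast when the number
    // of fields is not what the caller expects.
    static String[] parseRecord(String line, int expectedFields) {
        String[] fields = line.split(",", -1); // -1 keeps trailing empty fields
        if (fields.length != expectedFields) {
            throw new IllegalArgumentException(
                "expected " + expectedFields + " fields but got " + fields.length + ": " + line);
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "a,b,,"; // hypothetical record whose last two fields are empty
        System.out.println(Arrays.toString(line.split(",")));     // [a, b]
        System.out.println(Arrays.toString(line.split(",", -1))); // [a, b, , ]
        System.out.println(Arrays.toString(line.split(",", 2)));  // [a, b,,]
        System.out.println(parseRecord(line, 4).length);          // 4
    }
}
```

With the default form, code that reads fields[3] from this record would fail with an ArrayIndexOutOfBoundsException, because only two elements come back.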






This article is from the "Hard Struggle" blog, please make sure to keep this source http://zhangfengzhe.blog.51cto.com/8855103/1661091
