Perl split Function Usage Guide



This article covers Perl's split function, one of the most useful functions in the language: it divides a string into pieces and returns them as a list, usually stored in an array. split takes a regular expression (RE) as its separator and, if no string is specified, works on the $_ variable.



The Perl split function






The Perl split function can be used like this:

$info = "Caine:michael:actor:14,leafydrive";
@personal = split(/:/, $info);

The result is: @personal = ("Caine", "michael", "actor", "14,leafydrive");





If the information is already stored in the $_ variable, the string argument can be omitted:

@personal = split(/:/);





If the fields are separated by one or more colons, split on the RE /:+/ instead:

$_ = "Capes:geoff::shotputter:::bigavenue";
@personal = split(/:+/);

The result is: @personal = ("Capes", "geoff", "shotputter", "bigavenue");



But the following code, which splits on a single colon, keeps an empty field for every pair of adjacent colons:

$_ = "Capes:geoff::shotputter:::bigavenue";
@personal = split(/:/);

The result is: @personal = ("Capes", "geoff", "", "shotputter", "", "", "bigavenue");



With the Perl split function, a word can be divided into characters, a sentence into words, and a paragraph into sentences:

@chars = split(//, $word);
@words = split(/ /, $sentence);
@sentences = split(/\./, $paragraph);

In the first statement the empty pattern matches between every pair of characters, so @chars is an array of single characters.
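The three statements above can be tried out with a short script. This is a minimal sketch; the sample strings are made up for illustration and are not from the original article.

```perl
use strict;
use warnings;

# Hypothetical sample data
my $word      = "split";
my $sentence  = "Perl splits strings";
my $paragraph = "One sentence. Another sentence.";

my @chars     = split //, $word;          # empty pattern: one element per character
my @words     = split / /, $sentence;     # a single space separates the words
my @sentences = split /\./, $paragraph;   # literal full stop; the dot must be escaped

print join("|", @chars), "\n";       # s|p|l|i|t
print join("|", @words), "\n";       # Perl|splits|strings
print join("|", @sentences), "\n";   # One sentence| Another sentence
```

Note that the second sentence keeps its leading space, because only the full stop is treated as the separator.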



About the regular expression used in split (the separator rule):
\s is a character class that matches a single whitespace character (space, tab, newline and so on).
+ means one or more repetitions of the preceding item.
So \s+ matches one or more whitespace characters.
split(/\s+/, $line) splits the string $line on runs of whitespace.
For example, given $line = "Hello friend Welcome to my website jb51.net";
split(/\s+/, $line) returns:
("Hello", "friend", "Welcome", "to", "my", "website", "jb51.net")
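One detail worth seeing in code is how leading whitespace is handled. The sketch below uses a made-up input line to compare the pattern /\s+/ with the special single-space string ' ', which Perl treats as "split on whitespace and skip any leading whitespace".

```perl
use strict;
use warnings;

# Hypothetical input with leading, repeated and trailing whitespace
my $line = "  Hello   friend\tWelcome ";

my @by_regex = split /\s+/, $line;   # leading whitespace yields an empty first field
my @by_space = split ' ',  $line;    # special case: leading whitespace is skipped

print scalar(@by_regex), "\n";   # 4: "", "Hello", "friend", "Welcome"
print scalar(@by_space), "\n";   # 3: "Hello", "friend", "Welcome"
```

In both cases the trailing empty field is dropped, since no LIMIT is given.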



General usage: @somearray = split /:+/, $string;  # the parentheses are optional
If $string is omitted, split operates on the default variable $_. The separator between the two slashes can be any regular expression, which makes split extremely flexible.



The Perl manual also documents a less common form: split /PATTERN/, EXPR, LIMIT. The key is the LIMIT parameter, which can save a lot of work. If LIMIT is specified and positive, the string is split into no more than LIMIT fields, and the last field holds the unsplit remainder of the string. If LIMIT is unspecified or zero, trailing null fields are stripped (which potential users of pop would do well to remember). If LIMIT is negative, it is treated as if an arbitrarily large LIMIT had been specified. Note that splitting an EXPR that evaluates to the empty string always returns the empty list, regardless of the LIMIT specified.
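The LIMIT rules are easier to follow with a concrete string. A small sketch, using a made-up record with three trailing empty fields:

```perl
use strict;
use warnings;

my $record = "a:b:c:::";   # hypothetical record ending in empty fields

my @default  = split /:/, $record;       # ("a", "b", "c"): trailing empty fields stripped
my @two      = split /:/, $record, 2;    # ("a", "b:c:::"): at most 2 fields
my @negative = split /:/, $record, -1;   # ("a", "b", "c", "", "", ""): nothing stripped
my @empty    = split /:/, "", -1;        # (): an empty string gives an empty list

print scalar(@default), "\n";    # 3
print "$two[1]\n";               # b:c:::
print scalar(@negative), "\n";   # 6
print scalar(@empty), "\n";      # 0
```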



Setting LIMIT lets a split on a very long line (one that would otherwise produce tens of thousands of fields) return only the first few columns, which reduces memory usage and run time. For example, in typical genotype data the first column is usually the material name, and rows often have to be selected by that name. Because the last field returned holds the unsplit remainder, LIMIT must be one more than the number of fields you want:

my ($firstfield) = split /\t/, $someline, 2;

This works well for large files. If you need values from several of the first columns:

my (undef, $var1, undef, undef, undef, $var2) = split /\t/, $someline, 7;
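The interaction between LIMIT and the number of fields wanted can be checked with a short script. This is only a sketch; the row contents are hypothetical. The point to watch is that the last field returned always holds the unsplit remainder, so a LIMIT of 1 does not split at all.

```perl
use strict;
use warnings;

# Hypothetical tab-separated row: a name followed by six data columns
my $someline = join "\t", "sample01", "f1", "f2", "f3", "f4", "f5", "f6";

my @one = split /\t/, $someline, 1;            # LIMIT 1: the whole line comes back as one field
my ($firstfield) = split /\t/, $someline, 2;   # first column; the remainder is discarded
my (undef, $var1, undef, undef, undef, $var2) = split /\t/, $someline, 7;

print scalar(@one), "\n";   # 1
print "$firstfield\n";      # sample01
print "$var1 $var2\n";      # f1 f5
```

With LIMIT 7 the first six fields are split off cleanly and everything after them lands in a seventh field that is simply not assigned.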



Another user benchmarked this approach and found it faster. Their report follows:
A file in which each line has 18 tab-separated items; only one item (index 6 in the code below) is needed. Several variants were tried:





my @array = split("\t", $_); my $var = $array[6];                      # variant 1: average 8.2s
my ($var) = (split("\t", $_))[6];                                      # variant 2: average 5.1s
my (undef,undef,undef,undef,undef,undef, $var) = split("\t", $_);      # variant 3: average 3.53s
my (undef,undef,undef,undef,undef,undef, $var) = split("\t", $_, 7);   # variant 4: average 3.52s
my $var = (split("\t", $_, 7))[6];                                     # variant 5: average 3.53s
# Note: with LIMIT 7, as reported, $var in variants 4 and 5 also holds everything after index 6; use LIMIT 8 to isolate that item.





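The last three variants are clearly the way to go, and they can be adapted if you need several items. Variant 3 keeps up with the explicit-LIMIT variants because, when split is assigned to a list of variables and LIMIT is omitted, Perl implicitly supplies a LIMIT one larger than the number of variables. If two needed items are far apart, variants 3 and 4 are good choices; variant 5 would need a list slice with both indices, for example [1, 6], instead of a single subscript.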
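To repeat this kind of comparison on your own machine, the core Benchmark module can time the variants side by side. The following is a rough sketch, not the original tester's script: the test line and the variant names are made up, and LIMIT 8 is used so that the item at index 6 comes back without the rest of the line attached.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: one line of 18 tab-separated items
my $line = join "\t", map { "item$_" } 0 .. 17;

my %variant = (
    full_array => sub { my @array = split /\t/, $line; my $var = $array[6]; $var },
    list_slice => sub { my ($var) = (split /\t/, $line)[6]; $var },
    undef_list => sub {
        my (undef, undef, undef, undef, undef, undef, $var) = split /\t/, $line;
        $var;
    },
    with_limit => sub { my $var = (split /\t/, $line, 8)[6]; $var },
);

# Sanity check: every variant must return the same item before it is timed
for my $name (sort keys %variant) {
    die "$name returned the wrong item" unless $variant{$name}->() eq "item6";
}

# Run each variant for at least one CPU second and print a comparison table
cmpthese(-1, \%variant);
```

Absolute numbers will differ from the figures quoted above, since they depend on the Perl version, the hardware and the shape of the data.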



Test it yourself.

