When processing text, we often need to compare two adjacent lines and then output them selectively. By default, a while (<FILEHANDLE>) { ... } loop reads only one line per iteration, so the next line is not available inside the loop body. Here is a simple example showing how to handle this situation.
Suppose the input text looks like this:
a 1 2 3 4
a 5 6 7 8
a 6 7 8 9
a 7 8 9 11
a 7 8 9 12
a 13 12 14 15
a 18 14 16 17
a 2 3 4 65
The requirement is as follows: if the number in the last column of one row is greater than the number in the second column of the next row (that row's first number), both rows are output.
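To pin down which fields are compared, here is a minimal sketch of the test as the scripts in this article implement it: field index 4 of the earlier row against field index 1 of the later row. The helper name pair_matches is hypothetical and does not appear in the original scripts.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the last number (field index 4) of the
# earlier row is greater than the first number (field index 1) of the later row.
sub pair_matches {
    my ($earlier, $later) = @_;
    my $last_of_earlier = (split ' ', $earlier)[4];
    my $first_of_later  = (split ' ', $later)[1];
    return $last_of_earlier > $first_of_later;
}

# 8 > 6, so this pair would be output.
print pair_matches('a 5 6 7 8', 'a 6 7 8 9') ? "output both\n" : "skip\n";
# 4 > 5 is false, so this pair would be skipped.
print pair_matches('a 1 2 3 4', 'a 5 6 7 8') ? "output both\n" : "skip\n";
```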
Approach 1: store the input text in an array, then use a for loop to examine two rows at a time.
The script is as follows:
#!/usr/bin/perl -w
use strict;

chomp(my @a = <DATA>);
my @out;
for (my $i = 0; $i < $#a; $i++) {
    # Field 1 is the first number in a row, field 4 the last.
    my ($a1, $a2) = (split /\s+/, $a[$i])[1, 4];
    my ($b1, $b2) = (split /\s+/, $a[$i + 1])[1, 4];
    # Keep both rows when the last number of this row exceeds
    # the first number of the next row.
    push @out, @a[$i, $i + 1] if $a2 > $b1;
}

# Drop rows that were pushed twice, keeping the first occurrence.
my %ha;
my @new = grep { $ha{$_}++ < 1 } @out;
print $_, "\n" for @new;

__DATA__
a 1 2 3 4
a 5 6 7 8
a 6 7 8 9
a 7 8 9 11
a 7 8 9 12
a 13 12 14 15
a 18 14 16 17
a 2 3 4 65
Approach 1 is simple, but it holds the whole input in memory, so memory consumption is high when the input is large. Using the Tie::File module would avoid this, but that is another matter.
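For completeness, here is a rough sketch of what the Tie::File route might look like. It is not the author's script: the temporary file and the variable names are assumptions made only to keep the example self-contained. Tie::File presents the file as an array and fetches records from disk on demand instead of loading the whole file.

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use File::Temp 'tempfile';
use Tie::File;

# Assumption for the sketch: write a few sample rows to a temporary file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "$_\n" for 'a 1 2 3 4', 'a 5 6 7 8', 'a 6 7 8 9', 'a 7 8 9 11';
close $tmp;

# Tie the file to an array; records come back without the trailing newline.
tie my @rows, 'Tie::File', $path, mode => O_RDONLY
    or die "Cannot tie $path: $!";

my %seen;
my @out;
for my $i (0 .. $#rows - 1) {
    my $last_of_current = (split ' ', $rows[$i])[4];
    my $first_of_next   = (split ' ', $rows[$i + 1])[1];
    next unless $last_of_current > $first_of_next;
    # Keep each row only the first time it is selected.
    push @out, grep { !$seen{$_}++ } @rows[$i, $i + 1];
}
untie @rows;

print "$_\n" for @out;
```

The loop has the same shape as in approach 1; only the source of the array changes. Note that @out and %seen still grow with the number of matching rows.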
Instead, I use the tell and seek functions to reposition the filehandle: tell records the current offset, and seek jumps back to it. This way, several rows can be read and compared within a single iteration of the while loop. Convenient, isn't it?
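The record-then-rewind step can be isolated in a small helper. This is a minimal sketch: peek_line is a hypothetical name, and an in-memory filehandle stands in for a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: return the next line without consuming it.
sub peek_line {
    my ($fh) = @_;
    my $pos  = tell($fh);    # offset at which the next read would start
    my $line = <$fh>;        # read ahead one line (undef at end of input)
    seek($fh, $pos, 0)       # 0 means an absolute offset from the start
        or die "seek failed: $!";
    return $line;
}

# Assumption for the sketch: read from a string instead of a file on disk.
my $text = "first\nsecond\nthird\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

while (my $line = <$fh>) {
    chomp $line;
    my $next = peek_line($fh);
    chomp $next if defined $next;
    printf "%s -> %s\n", $line, defined $next ? $next : '(end of input)';
}
close $fh;
```

This only works on seekable handles such as regular files. A pipe or terminal input cannot be rewound, so seek fails there.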
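The code is as follows:
#!/usr/bin/perl -w
use strict;

my @out;
while (<DATA>) {
    chomp;
    # Remember where the next row starts so we can come back to it.
    my $pos = tell(DATA);
    my @a   = split /\s+/, $_;
    # Read ahead one row; undef means the current row was the last one.
    my $sec = <DATA>;
    if (defined $sec) {
        chomp $sec;
        my @b = split /\s+/, $sec;
        if ($a[4] > $b[1]) {
            push @out, $_, $sec;
        }
    }
    # Rewind so the next iteration re-reads the row we just peeked at.
    seek(DATA, $pos, 0);
}

# Drop rows that were pushed twice, keeping the first occurrence.
my %ha;
my @new = grep { $ha{$_}++ < 1 } @out;
print $_, "\n" for @new;

__DATA__
a 1 2 3 4
a 5 6 7 8
a 6 7 8 9
a 7 8 9 11
a 7 8 9 12
a 13 12 14 15
a 18 14 16 17
a 2 3 4 65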
The output is as follows:
a 5 6 7 8
a 6 7 8 9
a 7 8 9 11
a 7 8 9 12
a 18 14 16 17
a 2 3 4 65
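As a point of comparison, here is an alternative that this article does not use: carry the previous row in a variable and print matches as they are found, so neither seek nor an output array is needed. The name print_matching_pairs and the in-memory input are assumptions made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: read rows from $in and print matching pairs to $out,
# holding only the previous row in memory.
sub print_matching_pairs {
    my ($in, $out) = @_;
    my ($prev, $prev_printed);
    while (my $line = <$in>) {
        chomp $line;
        my $matched = 0;
        if (defined $prev) {
            my $last_of_prev     = (split ' ', $prev)[4];
            my $first_of_current = (split ' ', $line)[1];
            if ($last_of_prev > $first_of_current) {
                # Print the previous row unless an earlier pair already did.
                print {$out} "$prev\n" unless $prev_printed;
                print {$out} "$line\n";
                $matched = 1;
            }
        }
        ($prev, $prev_printed) = ($line, $matched);
    }
    return;
}

my $sample = join("\n",
    'a 1 2 3 4', 'a 5 6 7 8', 'a 6 7 8 9', 'a 7 8 9 11',
    'a 7 8 9 12', 'a 13 12 14 15', 'a 18 14 16 17', 'a 2 3 4 65') . "\n";
open my $in, '<', \$sample or die "Cannot open in-memory handle: $!";
print_matching_pairs($in, \*STDOUT);
close $in;
```

For the sample data this prints the same six rows as above. One difference in behavior: duplicates are suppressed by position rather than by content, so two identical rows at different places in the input would both be printed if each takes part in a matching pair.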