Perl reads the path of the required file, and then opens the corresponding file

Last Update:2018-12-08 Source: Internet

Author: User


The following DNA sequence is stored in F:\perl\data.txt on Windows:

AAAAAAAAAAAAAAGGGGGTTTTCCCCCCCC
CCCCCGTCGTAGTAAAGTATGCAGTAGCVG
CCCCCCCCGGGGGGGGAAAAAAAAAAAAATTTTTTAT
AAACG

The following is a program:

# The following program counts the number of A, T, G, and C bases in a DNA sequence.

# First initialize the count of each of the four bases to 0
$count_A = 0;
$count_T = 0;
$count_C = 0;
$count_G = 0;

# Get the path and file name of the file to process (on Windows, write it as in the example below)
# F:\perl\data.txt
print "please input the Path just like this f:\\perl\\data.txt\n";
chomp($DNA_filename = <STDIN>);
# Open the file
open(DNAFILENAME, $DNA_filename) || die("can not open the file!");
# Read the lines of the file into an array
@DNA = <DNAFILENAME>;
close(DNAFILENAME);

# Merge all lines into one string, then remove all whitespace characters
$DNA = join('', @DNA);
$DNA =~ s/\s//g;

# Split the DNA string into single characters and assign them to the array
@DNA = split('', $DNA);

# Then read the elements of the array in order and count each of the four bases
foreach $base (@DNA)
{
    if ($base eq 'A')
    {
        $count_A = $count_A + 1;
    }
    elsif ($base eq 'T')
    {
        $count_T = $count_T + 1;
    }
    elsif ($base eq 'C')
    {
        $count_C = $count_C + 1;
    }
    elsif ($base eq 'G')
    {
        $count_G = $count_G + 1;
    }
    else
    {
        # Any character other than A, T, C, or G is reported as an error
        print "error\n";
    }
}
# Output the final result
print "A = $count_A\n";
print "T = $count_T\n";
print "C = $count_C\n";
print "G = $count_G\n";

The output is as follows:

F:\>perl\a.pl
please input the Path just like this f:\perl\data.txt
F:\perl\data.txt
error
A = 40
T = 17
C = 27
G = 24

F:\>
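
The file-reading step can also be written a little more defensively. The following is only a sketch, not the original author's code: it assumes Perl 5 with the three-argument form of open and a lexical filehandle, and the helper name read_sequence is made up for this illustration. It includes $! in the message so that a failure says why the file could not be opened.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original program): read a sequence
# file and return its contents as one string with all whitespace removed.
sub read_sequence {
    my ($path) = @_;

    # Three-argument open with a lexical filehandle; $! holds the OS error text.
    open(my $fh, '<', $path) or die "can not open the file '$path': $!";
    my @lines = <$fh>;
    close($fh);

    my $dna = join('', @lines);   # merge all lines into one string
    $dna =~ s/\s//g;              # strip newlines, spaces, and tabs
    return $dna;
}
```

In a script it could be called right after reading the path, for example with chomp(my $path = <STDIN>); my $DNA = read_sequence($path);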

Notice that the output contains an error line. Why?

Look closely at the DNA sequence at the top: its second line contains a V near the end. V is not one of the four bases, so the program prints error.
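
To make such a stray character easier to locate, the counting loop could report which character it rejected and where. The sketch below is an illustration under assumptions rather than the article's program: count_bases is a hypothetical helper, positions are counted from 0, and the sample string is made up instead of taken from data.txt.

```perl
use strict;
use warnings;

# Hypothetical helper: count A, T, C, and G, and collect every other
# character together with its 0-based position in the sequence.
sub count_bases {
    my ($dna) = @_;
    my %count = (A => 0, T => 0, C => 0, G => 0);
    my @unexpected;

    my @bases = split('', $dna);
    for my $position (0 .. $#bases) {
        my $base = $bases[$position];
        if (exists $count{$base}) {
            $count{$base}++;
        }
        else {
            push @unexpected, "$base at position $position";
        }
    }
    return (\%count, \@unexpected);
}

# Made-up sample: the V is the sixth character, so its position is 5.
my ($count, $unexpected) = count_bases('AACGTVG');
print "error: unexpected $_\n" for @$unexpected;
print "$_ = $count->{$_}\n" for qw(A T C G);
```

Keeping the counts in a hash replaces the four separate counters and the if/elsif chain, at the cost of departing slightly from the variable names used in the article.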

Here, the DNA sequence is merged into one line and all whitespace is removed. Then $DNA is converted into an array with the split function, and the array is analyzed element by element. Is there a better way?

In fact, Perl has a function for this: substr.

Let's look at how this function is used. substr operates on part of a larger string ("The substr function works with only a part of a larger string"): it takes a long string and extracts a piece of it. That is the feature used here.

$little_string = substr($large_string, $start_position, $length)

That is: small fragment = substr(large string, starting position of the fragment to extract, length of the fragment to extract)

Here we want to count each kind of base in the DNA, so the piece to extract each time is a single base. We therefore set $length to 1, which is all we need.
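
A small sketch of how substr behaves may help here. The sample string and the helper name base_at are made up for illustration; the only thing taken from the article is the call form substr(string, start position, length). Note that positions start at 0.

```perl
use strict;
use warnings;

# Hypothetical helper: return the single character at a 0-based position.
sub base_at {
    my ($dna, $position) = @_;
    return substr($dna, $position, 1);
}

my $dna = 'ATGCV';   # made-up sample string, not the article's data

my $first  = substr($dna, 0, 1);    # 'A'
my $middle = substr($dna, 1, 3);    # 'TGC'
my $last   = substr($dna, -1, 1);   # a negative position counts from the end: 'V'

# Walk the string one character at a time, without splitting it into an array.
for (my $position = 0; $position < length $dna; ++$position) {
    my $base = base_at($dna, $position);
    print "$position: $base\n";
}
```

Because substr reads directly from the string, the separate split step and the second copy of the data in @DNA are no longer needed.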

The modified code is as follows:

# The following program counts the number of A, T, G, and C bases in a DNA sequence.

# First initialize the count of each of the four bases to 0
$count_A = 0;
$count_T = 0;
$count_C = 0;
$count_G = 0;

# Read the file exactly as in the first program
print "please input the Path just like this f:\\perl\\data.txt\n";
chomp($DNA_filename = <STDIN>);
open(DNAFILENAME, $DNA_filename) || die("can not open the file!");
@DNA = <DNAFILENAME>;
close(DNAFILENAME);

# Merge all lines into one string, then remove all whitespace characters
$DNA = join('', @DNA);
$DNA =~ s/\s//g;

# Then read the characters of the string in order and count each of the four bases
for ($position = 0; $position < length $DNA; ++$position)
{
    # Extract one base at the current position
    $base = substr($DNA, $position, 1);
    if ($base eq 'A')
    {
        $count_A = $count_A + 1;
    }
    elsif ($base eq 'T')
    {
        $count_T = $count_T + 1;
    }
    elsif ($base eq 'C')
    {
        $count_C = $count_C + 1;
    }
    elsif ($base eq 'G')
    {
        $count_G = $count_G + 1;
    }
    else
    {
        print "error\n";
    }
}
# Output the final result
print "A = $count_A\n";
print "T = $count_T\n";
print "C = $count_C\n";
print "G = $count_G\n";

The result is as follows:

F:\>perl\a.pl
please input the Path just like this f:\perl\data.txt
F:\perl\data.txt
error
A = 40
T = 17
C = 27
G = 24

F:\>

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
