There is a lot of fun to be had on the Linux command line, and many tedious tasks can be done easily and elegantly. For example, calculating how often words and characters appear in a text file, which is what we intend to cover in this article.
The command that immediately comes to mind for counting words and characters in a text file is wc.
Before we can parse a text file with a script, we need a text file. To keep the results consistent, we will create a text file from the output of the man command, as described below.
The code is as follows:
$ man man > man.txt
The above command dumps the man page of the man command itself into the file man.txt.
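Since wc was the first command that came to mind, we can use it right away for a quick sanity check on the new file (the exact line, word, and byte counts will vary with your version of man):
The code is as follows:
$ wc man.txt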
To get the most common words, we run the following script against the newly created file.
The code is as follows:
$ cat man.txt | tr ' ' '\012' | tr '[:upper:]' '[:lower:]' | tr -d '[:punct:]' | grep -v '[^a-z]' | sort | uniq -c | sort -rn | head
Sample Output
7557
262 the
163 to
112 is
112 a
...
The script above outputs the 10 most frequently used words.
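Stage by stage, here is the same pipeline split over several lines, with a comment on what each step does (bash allows a newline after each |):
The code is as follows:
$ cat man.txt |                    # read the file
  tr ' ' '\012' |                  # turn each space into a newline (\012 is octal for newline): one word per line
  tr '[:upper:]' '[:lower:]' |     # fold everything to lowercase so 'The' and 'the' are counted together
  tr -d '[:punct:]' |              # strip punctuation stuck to the words
  grep -v '[^a-z]' |               # drop lines containing anything but lowercase letters (empty lines still pass, hence the large count on top)
  sort |                           # group identical words next to each other for uniq
  uniq -c |                        # collapse each group into 'count word'
  sort -rn |                       # order numerically, highest count first
  head                             # keep the top 10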
What about looking at single letters? Use the following command.
The code is as follows:
$ echo 'tecmint team' | fold -w1
Sample Output
t
e
c
m
i
n
t
t
e
a
m
Note: -w1 simply sets the column width to one character.
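If fold happens to be unavailable, GNU grep can produce a similar one-character-per-line split with its -o option (a near equivalent, not an exact one: unlike fold, it skips empty lines):
The code is as follows:
$ echo 'tecmint team' | grep -o '.'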
Now we'll break the text file down into single letters, then sort and count the results to get the 10 most common characters.
$ fold -w1 < man.txt | sort | uniq -c | sort -rn | head
Sample Output
8579
2413 e
1987 a
1875 t
1644 i
1553 n
1522 o
1514 s
1224 r
1021 l
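The large number on the first line is the whitespace bucket, counted just like any other character. If you would rather leave it out, here is one sketch of a variant that deletes blank and space-only lines with sed before counting:
The code is as follows:
$ fold -w1 < man.txt | sed '/^[[:space:]]*$/d' | sort | uniq -c | sort -rn | head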
What if we want to ignore case? Until now, uppercase and lowercase letters have been counted separately. So, use the following command.
$ fold -w1 < man.txt | sort | tr '[:lower:]' '[:upper:]' | uniq -c | sort -rn | head -20
Sample Output
11636
2504 E
2079 A
... T
1729 I
1645 N
1632 S
1580 O
1269 R
1055 L
836 H
791 P
766 D
753 C
725 M
690 U
605 F
504 G
352 Y
344 .
Check the output above: punctuation is included. Let's get rid of it with the tr command. Here we go:
The code is as follows:
$ fold -w1 < man.txt | tr '[:lower:]' '[:upper:]' | sort | tr -d '[:punct:]' | uniq -c | sort -rn | head -20
Sample Output
11636
2504 E
2079 A
... T
1729 I
1645 N
1632 S
1580 O
1550
1269 R
1055 L
836 H
791 P
766 D
753 C
725 M
690 U
605 F
504 G
352 Y
Now, if we have three text files, we can look at the results for all of them at once with the following command.
The code is as follows:
$ cat *.txt | fold -w1 | tr '[:lower:]' '[:upper:]' | sort | tr -d '[:punct:]' | uniq -c | sort -rn | head -8
Sample Output
11636
2504 E
2079 A
... T
1729 I
1645 N
1632 S
1580 O
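This assumes the current directory actually contains several .txt files. If man.txt is still the only one, more can be created the same way first; ls and cp here are just example pages picked for illustration:
The code is as follows:
$ man ls > ls.txt
$ man cp > cp.txt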
Next we'll dig out the rare words that are at least ten characters long. Here's a simple script:
The code is as follows:
$ cat man.txt | tr ' ' '\012' | tr '[:upper:]' '[:lower:]' | tr -d '[:punct:]' | tr -d '[0-9]' | sort | uniq -c | sort -n | grep -E '..........' | head
Sample Output
1 ──────────────────────────────────────────
1 a all
1 abc or all arguments within are optional
1 able setlocale for precise details
1 ab options delimited by cannot be used together
1 achieved by using the less environment variable
1 a child process returned a nonzero exit status
1 act as if this option is supplied using the name as a filename
1 activate local mode format and display local manual files
1 acute accent
Note: the ten dots in the grep pattern above match any ten consecutive characters; in fact, the regular expression '.{10}' achieves the same effect.
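For reference, here is the same script with the brace form substituted for the ten dots; -E switches grep to the extended regular expression syntax in which '.{10}' is valid:
The code is as follows:
$ cat man.txt | tr ' ' '\012' | tr '[:upper:]' '[:lower:]' | tr -d '[:punct:]' | tr -d '[0-9]' | sort | uniq -c | sort -n | grep -E '.{10}' | head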
These simple scripts let us discover the most frequently occurring words and characters in English text.
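If you find yourself running these pipelines often, they are easy to bundle into a small script. Below is a minimal sketch; the file name wordfreq.sh and its argument handling are our own additions, not part of the commands shown above:
The code is as follows:
#!/bin/sh
# wordfreq.sh -- print the top N words and characters of a text file
# usage: ./wordfreq.sh FILE [N]    e.g.  ./wordfreq.sh man.txt 5
file=${1:?usage: $0 FILE [N]}
n=${2:-10}

echo "== top $n words =="
tr ' ' '\012' < "$file" | tr '[:upper:]' '[:lower:]' | tr -d '[:punct:]' \
    | grep -v '[^a-z]' | sort | uniq -c | sort -rn | head -n "$n"

echo "== top $n characters =="
fold -w1 < "$file" | sort | uniq -c | sort -rn | head -n "$n"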
That's all for now. Next time I'll be back with another interesting topic that you should enjoy reading. And don't forget to leave us your valuable feedback.