The Three Musketeers of Linux Text Processing: grep

Source: Internet
Author: User
Tags add numbers control characters diff expression engine egrep

cat: concatenate files; a text file viewing tool
cat [OPTION]... [FILE]...
-n: number all displayed lines
-b: number only non-empty lines
-v: show non-printing characters using ^ notation
-E: display $ at the end of each line
-T: show tabs as ^I
-A: show all control characters; -A is equivalent to -vET
-s: squeeze consecutive blank lines into one
Example: cat /etc/fstab /etc/passwd   view multiple files one after another
df | cut -d:
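
The numbering and squeezing options of cat are easiest to see on a small file. The following is a minimal sketch; the file demo.txt and its contents are made up for illustration, and GNU coreutils is assumed.

```shell
# Hypothetical sample file: two text lines separated by three blank lines.
tmp=$(mktemp -d)
printf 'first\n\n\n\nlast\n' > "$tmp/demo.txt"

cat -n "$tmp/demo.txt"    # numbers all five lines, blank ones included
cat -b "$tmp/demo.txt"    # numbers only the two non-empty lines
cat -s "$tmp/demo.txt"    # squeezes the three blank lines into one

# Count the lines that remain after squeezing.
squeezed=$(cat -s "$tmp/demo.txt" | wc -l)
echo "lines after cat -s: $squeezed"
```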

tac: text file viewing tool
Used like cat, but displays the lines of the file in reverse order

rev: text file viewing tool
Used like cat, but displays the characters of each line in reverse order
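
The difference between the two commands is easy to confuse, so here is a short side-by-side sketch on three made-up lines. It assumes tac (GNU coreutils) and rev (util-linux) are installed.

```shell
# tac reverses the order of the lines; rev reverses the characters in each line.
printf 'abc\ndef\nghi\n' | tac    # ghi, def, abc
printf 'abc\ndef\nghi\n' | rev    # cba, fed, ihg

first_tac=$(printf 'abc\ndef\nghi\n' | tac | head -n 1)
first_rev=$(printf 'abc\ndef\nghi\n' | rev | head -n 1)
echo "$first_tac $first_rev"
```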

head: view the first lines of a file (default 10 lines)
head [OPTION] FILE
head -n #: specify the number of lines to display; may be abbreviated as -#

tail: view the last lines of a file (default 10 lines)
tail [OPTION] FILE
tail -n #: specify the number of lines to display; may be abbreviated as -#
-f: do not exit when the end of the file is reached
Keeps following the file and displays newly appended lines, which is useful for monitoring log file growth
tail -f -n0 test.txt &   monitor a log file in the background
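
A short sketch of head and tail on a generated file; numbers.txt is a hypothetical name, and seq is assumed to be available. tail -f is left out because it does not terminate on its own.

```shell
tmp=$(mktemp -d)
seq 1 20 > "$tmp/numbers.txt"    # hypothetical 20-line sample file

head "$tmp/numbers.txt"          # default: first 10 lines
head -n 3 "$tmp/numbers.txt"     # first 3 lines: 1, 2, 3
tail -n 3 "$tmp/numbers.txt"     # last 3 lines: 18, 19, 20

# Combine both to print a line range, here lines 5 through 7.
middle=$(head -n 7 "$tmp/numbers.txt" | tail -n 3)
echo "$middle"
```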

Paged viewing commands: more, less
more: more FILE
Feature: exits automatically when paging reaches the end of the file

less: less FILE
man typically displays its pages through less, so the keys used in man pages work the same way in less

cut: cutting command; filters and displays parts of each line of a file
cut [OPTION]... [FILE]...
-d: specify the delimiter (default tab)
-f: specify the fields to display
#: field number #
m,n: fields m and n
m-n: fields m through n
Mixed use: m-n,#
-c: cut by character
--output-delimiter=STRING: specify the output delimiter
For example: cut -d: -f1,5,7 /etc/passwd
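
The field and character selections can be tried on a couple of passwd-style records. This is only a sketch: the sample records are invented so that the result does not depend on the local /etc/passwd, and --output-delimiter assumes GNU cut.

```shell
# Hypothetical passwd-style records.
tmp=$(mktemp -d)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'alice:x:1000:1000:Alice:/home/alice:/bin/zsh' > "$tmp/passwd.sample"

cut -d: -f1,7 "$tmp/passwd.sample"      # fields 1 and 7
cut -d: -f1-3,7 "$tmp/passwd.sample"    # mixed form: fields 1 to 3, plus 7
cut -c1-4 "$tmp/passwd.sample"          # first four characters of each line

# --output-delimiter is a GNU coreutils extension.
shells=$(cut -d: -f1,7 --output-delimiter=' uses ' "$tmp/passwd.sample")
echo "$shells"
```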

paste: merge two files, joining the lines that share the same line number into a single line
paste [OPTION]... [FILE1] [FILE2]...
-d DELIMITER: specify the delimiter (default tab)
-s: merge all lines of each file into a single line, one file after another
paste file1 file2 > file   saves the merged result to a file
paste -s file1 file2
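
The two merge directions are easier to tell apart with concrete input. A minimal sketch with two invented three-line files:

```shell
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/file1"
printf '1\n2\n3\n' > "$tmp/file2"

paste "$tmp/file1" "$tmp/file2"        # a<TAB>1, b<TAB>2, c<TAB>3
paste -d: "$tmp/file1" "$tmp/file2"    # a:1, b:2, c:3
paste -s "$tmp/file1" "$tmp/file2"     # a<TAB>b<TAB>c, then 1<TAB>2<TAB>3

joined=$(paste -d: "$tmp/file1" "$tmp/file2" | head -n 1)
serial=$(paste -s -d, "$tmp/file1" "$tmp/file2" | tail -n 1)
echo "$joined $serial"
```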

wc: word count, a text statistics command
wc [OPTION] FILE
-l: show only the number of lines in the file
-w: show only the number of words in the file
-c: show only the number of bytes in the file
-m: show only the number of characters in the file
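
A small sketch of the counters on an invented two-line file. For plain ASCII text -m and -c report the same number; they can differ for multi-byte text in a UTF-8 locale.

```shell
tmp=$(mktemp -d)
printf 'one two\nthree\n' > "$tmp/words.txt"

wc "$tmp/words.txt"       # lines, words and bytes, followed by the file name

# Reading from standard input keeps the file name out of the output.
lines=$(wc -l < "$tmp/words.txt")
words=$(wc -w < "$tmp/words.txt")
bytes=$(wc -c < "$tmp/words.txt")
echo "lines=$lines words=$words bytes=$bytes"
```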

sort: sort and display file contents
sort [OPTION] FILE
-n: sort by numeric value
-u: remove duplicate lines after sorting
-f: ignore character case
-r: reverse order
-t: specify the field delimiter
-k: specify the sort field
For example: sort -n -t: -k3 /etc/passwd
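
The numeric sort on the third field can be checked against invented passwd-style records, where that field is the UID. This is a sketch under that assumption; on a real system the input would be /etc/passwd.

```shell
tmp=$(mktemp -d)
printf '%s\n' \
  'alice:x:1000:1000::/home/alice:/bin/zsh' \
  'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:2:2::/sbin:/sbin/nologin' > "$tmp/passwd.sample"

sort -t: -k3 -n "$tmp/passwd.sample"     # ascending UID: root, daemon, alice
sort -t: -k3 -nr "$tmp/passwd.sample"    # descending UID: alice, daemon, root

lowest=$(sort -t: -k3 -n "$tmp/passwd.sample" | head -n 1 | cut -d: -f1)
unique=$(printf 'b\na\nb\n' | sort -u | tr -d '\n')    # duplicates removed
echo "$lowest $unique"
```

Without -n the UIDs would be compared as text, so 1000 would sort before 2.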

uniq: collapses duplicate lines in a file (lines count as duplicates only when they are identical and adjacent)
uniq [OPTION] FILE
-d: show only the duplicated lines
-c: show how many times each line is repeated
-u: show only the lines that are not duplicated
Commonly used together with the sort command:
sort userlist.txt | uniq -c
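
Because uniq only compares adjacent lines, the sort step matters. A minimal sketch with an invented userlist.txt in which no two equal names are next to each other:

```shell
tmp=$(mktemp -d)
printf 'bob\nalice\nbob\ncarol\nalice\nbob\n' > "$tmp/userlist.txt"

uniq "$tmp/userlist.txt"              # no effect: no duplicates are adjacent
sort "$tmp/userlist.txt" | uniq -c    # counts: 2 alice, 3 bob, 1 carol
sort "$tmp/userlist.txt" | uniq -d    # repeated lines: alice, bob
sort "$tmp/userlist.txt" | uniq -u    # lines occurring once: carol

# Most frequent entry: count, sort by count descending, keep the top name.
top=$(sort "$tmp/userlist.txt" | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
only=$(sort "$tmp/userlist.txt" | uniq -u)
echo "$top $only"
```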

diff: compare the differences between two files
diff [OPTION] file1 file2
For example: diff foo.conf-broken foo.conf-works
5c5
< use_widgets = no
---
> use_widgets = yes
Note: 5c5 means that line 5 differs between the two files (c stands for change)

The output of the diff command can be saved in a file called a "patch". The -u option produces the unified diff format, which is the best choice for patch files.
The patch command applies the changes recorded in a patch file to other files (use with caution!)
The -b option of patch automatically backs up the files it changes
diff -u foo.conf-broken foo.conf-works > foo.patch
patch -b foo.conf-broken foo.patch
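
The whole round trip can be rehearsed in a scratch directory. This is a sketch with invented two-line configuration files, assuming GNU diff and GNU patch are installed.

```shell
tmp=$(mktemp -d)
printf 'name = demo\nuse_widgets = no\n'  > "$tmp/foo.conf-broken"
printf 'name = demo\nuse_widgets = yes\n' > "$tmp/foo.conf-works"

# diff exits with status 1 when the files differ, which is not an error here.
diff -u "$tmp/foo.conf-broken" "$tmp/foo.conf-works" > "$tmp/foo.patch" || true
cat "$tmp/foo.patch"

# Apply the patch; with -b, GNU patch keeps the old content as foo.conf-broken.orig.
patch -b "$tmp/foo.conf-broken" "$tmp/foo.patch"
cmp "$tmp/foo.conf-broken" "$tmp/foo.conf-works" && echo "files are now identical"
```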

The Three Musketeers of Linux text processing
grep: text filtering tool
sed: stream editor
awk: text report generator; its implementation on Linux is gawk

grep: Global search REgular expression and Print out the line
Purpose of grep: searches text according to a pattern and displays the lines that match it
Pattern: the matching criteria, written as a combination of regular expression metacharacters and literal characters
grep, egrep, fgrep (fgrep does not support regular expression search)

grep [OPTIONS] PATTERN [FILE ...]
--color=auto: highlight the matched text in color
-v: display the lines that do not match the pattern
-i: ignore character case
-n: show the line numbers of matching lines
-c: count the number of matching lines
-o: show only the matched string
-q: quiet mode; does not output anything
-A #: after; also shows the # lines that follow each matching line
-B #: before; also shows the # lines that precede each matching line
-C #: context; also shows the # lines before and after each matching line
-w: match whole words only
-E: use extended regular expressions; equivalent to egrep
-e: specify a pattern; multiple -e options are combined with logical OR
Example: grep -e 'cat' -e 'dog' file
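
The options above can be tried one by one on a small file. A minimal sketch; pets.txt and its five lines are invented for illustration.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'a cat sat' 'a Dog ran' 'concatenate' 'dog and cat' 'bird' > "$tmp/pets.txt"

grep -n 'cat' "$tmp/pets.txt"        # lines 1, 3 and 4, with line numbers
grep -w 'cat' "$tmp/pets.txt"        # whole word only: skips "concatenate"
grep -i 'dog' "$tmp/pets.txt"        # also matches "Dog"
grep -v 'cat' "$tmp/pets.txt"        # lines without "cat"
grep -o 'c[a-z]t' "$tmp/pets.txt"    # prints only the matched text
grep -A 1 'Dog' "$tmp/pets.txt"      # the match plus one line after it

count=$(grep -c -e 'cat' -e 'dog' "$tmp/pets.txt")    # lines matching either pattern
if grep -q 'bird' "$tmp/pets.txt"; then found=yes; else found=no; fi
echo "count=$count found=$found"
```

With -q only the exit status carries the result, which is why it is normally used inside a condition as shown.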

REGEXP: a pattern written with special characters and literal text characters. Some of the characters (metacharacters) do not stand for themselves but express control or wildcard functions
Programs that support it: grep, vim, less, nginx, etc.
Divided into two categories:
Basic regular expressions: BRE
Extended regular expressions: ERE
Regular expression engine:
The software module that checks and processes regular expressions; different engines use different algorithms
PCRE (Perl Compatible Regular Expressions)
Metacharacter categories: character matching, number of matches, position anchoring, grouping

Character matching: (the [:name:] classes below must be placed inside another pair of [], for example [[:digit:]])
. : matches any single character
[]: matches any single character within the specified set
[^]: matches any single character outside the specified set
[:digit:] all digits
[:upper:] all uppercase letters
[:lower:] all lowercase letters
[:alpha:] all uppercase and lowercase letters
[:alnum:] digits and uppercase and lowercase letters
[:punct:] punctuation
[:space:] whitespace characters
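
The named classes only work inside a bracket expression, hence the doubled brackets in the sketch below. The sample lines are invented.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'abc' 'ABC' 'a1b2' 'x y' 'wow!' > "$tmp/classes.txt"

grep '[[:digit:]]' "$tmp/classes.txt"        # a1b2
grep '[[:upper:]]' "$tmp/classes.txt"        # ABC
grep '[[:space:]]' "$tmp/classes.txt"        # x y
grep '[[:punct:]]' "$tmp/classes.txt"        # wow!
grep 'a.c' "$tmp/classes.txt"                # "a", any one character, "c": abc
grep '^[^[:digit:]]*$' "$tmp/classes.txt"    # every line that contains no digit

digits=$(grep -c '[[:digit:]]' "$tmp/classes.txt")
nodigits=$(grep -c '^[^[:digit:]]*$' "$tmp/classes.txt")
echo "$digits $nodigits"
```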

Number of matches: placed after the character whose repetition is being specified, to set how many times the preceding character may occur
*: matches the preceding character any number of times, including 0 times
Greedy mode: matches as much as possible
\: escape character
.*: any characters, of any length
\?: matches the preceding character 0 or 1 times
\+: matches the preceding character at least 1 time
\{m\}: matches the preceding character exactly m times
\{m,n\}: matches the preceding character at least m times and at most n times
\{,n\}: matches the preceding character at most n times (<=n)
\{m,\}: matches the preceding character at least m times (>=m)
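
A sketch of the repetition operators in basic regular expressions, run against an invented word list. GNU grep is assumed, since \? and \+ are GNU extensions to the basic syntax.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'gd' 'god' 'good' 'goood' 'gooood' > "$tmp/goods.txt"

grep 'go*d' "$tmp/goods.txt"          # zero or more "o": all five lines
grep 'go\{2\}d' "$tmp/goods.txt"      # exactly two: good
grep 'go\{2,3\}d' "$tmp/goods.txt"    # two or three: good, goood
grep 'go\{3,\}d' "$tmp/goods.txt"     # three or more: goood, gooood
grep 'g.*d' "$tmp/goods.txt"          # anything in between: all five lines

# \? and \+ are GNU extensions to basic regular expressions.
optional=$(grep -c 'go\?d' "$tmp/goods.txt")    # gd and god
atleast1=$(grep -c 'go\+d' "$tmp/goods.txt")    # everything except gd
echo "$optional $atleast1"
```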

Position anchoring: fixes where the match must occur
^: beginning-of-line anchor; placed at the leftmost side of the pattern
$: end-of-line anchor; placed at the rightmost side of the pattern
^pattern$: the pattern must match the entire line
^$: empty line
^[[:space:]]*$: blank line (empty or containing only whitespace)
\< or \b: beginning-of-word anchor; placed on the left side of the word pattern
\> or \b: end-of-word anchor; placed on the right side of the word pattern
\<pattern\>: matches a whole word
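
The anchors can be compared on a few invented lines, including one empty line and one that holds only spaces. GNU grep is assumed for the \< and \> word anchors.

```shell
tmp=$(mktemp -d)
printf 'root is here\nchroot jail\n\n   \nthe root\nroot\n' > "$tmp/anchors.txt"

grep '^root' "$tmp/anchors.txt"       # starts with root: lines 1 and 6
grep 'root$' "$tmp/anchors.txt"       # ends with root: lines 5 and 6
grep '^root$' "$tmp/anchors.txt"      # the whole line is root: line 6
grep '\<root\>' "$tmp/anchors.txt"    # root as a word: not "chroot"

empty=$(grep -c '^$' "$tmp/anchors.txt")                # truly empty lines
blank=$(grep -c '^[[:space:]]*$' "$tmp/anchors.txt")    # empty or whitespace only
words=$(grep -c '\<root\>' "$tmp/anchors.txt")
echo "$empty $blank $words"
```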

Grouping: \(\): binds one or more characters together so that they are treated as a unit, for example: \(root\)\+
The text matched by the pattern inside the grouping parentheses is recorded by the regular expression engine in internal variables named \1, \2, \3, ...
\1: the characters matched by the pattern between the first left parenthesis, counting from the left, and its matching right parenthesis

Back reference: refers to the characters matched by the pattern in an earlier group, not to the pattern itself
Example: find the lines in the account list where the user name and the shell name are the same
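
One possible way to write that search is sketched below. The pattern is an illustration rather than the original author's command, and the passwd-style records are invented; on a real system the input would be /etc/passwd.

```shell
# Hypothetical passwd-style records.
tmp=$(mktemp -d)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'sync:x:5:0:sync:/sbin:/bin/sync' \
  'halt:x:7:0:halt:/sbin:/sbin/halt' \
  'bash:x:1001:1001::/home/bash:/bin/bash' > "$tmp/passwd.sample"

# \([^:]*\) captures the user name (field 1); \1 must then reappear as the
# last path component of the final field, the login shell.
same=$(grep '^\([^:]*\):.*/\1$' "$tmp/passwd.sample" | cut -d: -f1)
echo "$same"
```

The root record is not reported: its shell is /bin/bash, so \1 (root) does not occur at the end of the line.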

egrep and extended regular expressions
egrep = grep -E
egrep [OPTIONS] PATTERN [FILE ...]
Metacharacters of extended regular expressions:
Character matching:
. any single character
[] any single character in the specified set
[^] any single character not in the specified set

Number of matches:
*: matches the preceding character any number of times
?: 0 or 1 times
+: 1 or more times
{m}: exactly m times
{m,n}: at least m times and at most n times

Position anchoring:
^: beginning of line
$: end of line
\<, \b: beginning of word
\>, \b: end of word
Grouping:
()

Back reference: \1, \2, ...
Or:
a|b
C|cat: C or cat
(C|c)at: Cat or cat
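
The extended syntax mainly drops the backslashes. The sketch below uses grep -E rather than egrep, because recent GNU grep releases print a warning that egrep is obsolescent; the sample words are invented.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'Cat' 'cat' 'C' 'bat' 'gooood' 'abab' > "$tmp/ere.txt"

grep -E 'C|cat' "$tmp/ere.txt"        # "C" or "cat": Cat, cat, C
grep -E '^(C|c)at$' "$tmp/ere.txt"    # Cat or cat
grep -E 'go{2,4}d' "$tmp/ere.txt"     # braces need no backslash: gooood
grep -E '^(ab)+$' "$tmp/ere.txt"      # a repeated group: abab

# The same alternation in basic syntax relies on the GNU \| extension.
ere=$(grep -c -E '^(C|c)at$' "$tmp/ere.txt")
bre=$(grep -c '^\(C\|c\)at$' "$tmp/ere.txt")
echo "$ere $bre"
```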

This article is from the "Love Firewall" blog, be sure to keep this source http://183530300.blog.51cto.com/894387/1834527
