Linux is a file-based system: almost every Linux command can be thought of as an operation on files (some operate on variables instead). Proficiency with the Linux text-processing tools is therefore essential for anyone learning Linux, and especially for those preparing to work with Linux professionally, where very large text files are common and batch processing and precise searching are part of the daily routine. Even after learning all of the text-processing tools, it is normal to face a complicated piece of text and not know where to start. Text-processing skill takes time and practice to build, so do not be discouraged. The commands used most often day to day are summarized below.
First, the basic text-processing commands of Linux:
1, cat: concatenate files and print them to standard output
cat [OPTION]... [FILE]...
-A: equivalent to -vET
-b: number non-empty lines only; overrides -n
-e: equivalent to -vE
-E: display $ at the end of each line
-n: number all output lines
-s: squeeze consecutive empty lines into one (a line that contains spaces before the line break is not empty, so it is not squeezed)
-t: equivalent to -vT
-T: display TAB characters as ^I
-v: show non-printing characters using ^ and M- notation, except for LFD and TAB
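As a small sketch of the options above, assuming GNU coreutils (the sample file and its contents are made up for illustration):

```shell
# Throwaway sample file: a TAB, a trailing space, and a run of empty lines.
tmpdir=$(mktemp -d)
printf 'a\tb\nline2 \n\n\n\nend\n' > "$tmpdir/sample.txt"

# -n numbers every line; -b numbers only the non-empty ones.
cat -n "$tmpdir/sample.txt"
cat -b "$tmpdir/sample.txt"

# -s squeezes the three consecutive empty lines into a single one.
squeezed=$(cat -s "$tmpdir/sample.txt")
printf '%s\n' "$squeezed"

# -A (same as -vET) shows TAB as ^I and marks each line end with $.
visible=$(cat -A "$tmpdir/sample.txt" | head -n 2)
printf '%s\n' "$visible"

rm -r "$tmpdir"
```

The trailing space on the second line is invisible in normal output, but shows up in front of the $ marker when -A is used.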
Commands similar to cat:
tac: display the lines of a text in reverse order (last line first)
rev: display each line with its characters reversed
Viewing text page by page: less and more (less is the better of the two)
head and tail commands:
head: displays the first 10 lines of a text by default
-c: display only the first N bytes
-n: display the first N lines
tail: displays the last 10 lines by default
-c and -n: same meaning as for head, but counted from the end
-f: follow the file and print new lines as they are appended
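A minimal sketch of -n and -c, using generated input rather than a real file (the variable names are only for illustration):

```shell
# seq produces the numbers 1..20, one per line, as sample input.
first3=$(seq 1 20 | head -n 3)                 # first 3 lines
last2=$(seq 1 20 | tail -n 2)                  # last 2 lines
first5b=$(printf 'abcdefghij\n' | head -c 5)   # first 5 bytes

printf '%s\n' "$first3" "$last2" "$first5b"
```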
How do I monitor /var/log/secure and print only the new entries as they appear?
# tail -f -n 0 /var/log/secure &
cut: extract selected parts of each line
-d: specify the field delimiter (TAB by default)
-c: cut by character position
-f: select fields
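For instance, a sketch on a made-up passwd-style record (the record is hypothetical, not taken from a real system):

```shell
# A made-up record with ":" as the field separator.
record='alice:x:1001:1001:Alice:/home/alice:/bin/bash'

# -d sets the delimiter, -f picks fields 1 and 7 (name and shell).
name_shell=$(printf '%s\n' "$record" | cut -d: -f1,7)

# -c picks by character position: characters 1 through 5.
first_chars=$(printf '%s\n' "$record" | cut -c1-5)

printf '%s\n' "$name_shell" "$first_chars"
```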
paste: merge the corresponding lines of two files side by side
-d: specify the delimiter (TAB by default)
-s: join all lines of each file into a single line
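A short sketch of both modes, with two made-up three-line files created in a temporary directory:

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/letters.txt"
printf '1\n2\n3\n' > "$tmpdir/numbers.txt"

# Join corresponding lines, using ":" instead of the default TAB.
side_by_side=$(paste -d: "$tmpdir/letters.txt" "$tmpdir/numbers.txt")

# -s turns all lines of one file into a single line.
one_line=$(paste -s -d, "$tmpdir/letters.txt")

printf '%s\n' "$side_by_side" "$one_line"
rm -r "$tmpdir"
```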
wc: print counts for a file
-l: count lines
-w: count words
-c: count bytes
-m: count characters
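A quick sketch on two lines of made-up ASCII text (for ASCII input, -c and -m give the same number; they differ only for multibyte characters):

```shell
text='one two three
four five'

lines=$(printf '%s\n' "$text" | wc -l)   # number of lines
words=$(printf '%s\n' "$text" | wc -w)   # number of words
bytes=$(printf '%s\n' "$text" | wc -c)   # number of bytes, newlines included

printf 'lines=%d words=%d bytes=%d\n' "$lines" "$words" "$bytes"
```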
sort: sort the lines of a file
-b: ignore leading blanks and start comparing at the first non-blank character
-f: ignore case by folding lowercase letters to uppercase before comparing
-g: general numeric sort (also understands forms such as 1e3); larger values come last
-n: compare by the numeric value at the start of the string; unlike -g, it handles only plain decimal numbers
-r: reverse the order
-u: output only the first of each run of equal lines
-t: specify the field delimiter
-k: select the field (key) to compare
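The difference between the default comparison and -n, and the combination of -t with -k, can be sketched as follows (the records are made up; LC_ALL=C is set only to keep the ordering predictable):

```shell
# Without -n the comparison is textual, so "10" sorts before "2".
lexical=$(printf '10\n9\n2\n' | LC_ALL=C sort | tr '\n' ' ')
numeric=$(printf '10\n9\n2\n' | LC_ALL=C sort -n | tr '\n' ' ')

# Sort ":"-separated records on field 3 only, numerically, largest first.
by_uid=$(printf 'bob:x:1002\nroot:x:0\nalice:x:1001\n' | sort -t: -k3,3nr | head -n 1)

printf '%s\n' "$lexical" "$numeric" "$by_uid"
```

Writing the key as -k3,3 limits the comparison to field 3; a bare -k3 would compare from field 3 to the end of the line.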
uniq: report or omit repeated lines (only adjacent identical lines count as repeats)
-c: prefix each line with the number of times it occurs
-d: show only the lines that are repeated
-u: show only the lines that are not repeated
In general, sort first so that identical lines are adjacent, then use uniq to deduplicate and count them.
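The sort-then-uniq idiom can be sketched like this (the word list is made up; sed is used here only to strip the leading padding that uniq -c prints):

```shell
fruit() { printf 'apple\npear\napple\nfig\napple\npear\n'; }

# Count each distinct line, most frequent first.
counts=$(fruit | sort | uniq -c | sort -nr | sed 's/^ *//')

dupes=$(fruit | sort | uniq -d)     # lines that occur more than once
singles=$(fruit | sort | uniq -u)   # lines that occur exactly once

printf '%s\n' "$counts" "$dupes" "$singles"
```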
diff: compare two files line by line; vimdiff shows the same comparison more intuitively
-u: output the differences in unified format, which the patch command can use to rebuild a file
patch: apply a diff file to the original file; the result is written under the original file name, so be sure to add the -b option
-b: automatically back up the original file as file.orig
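A minimal round trip with diff -u and patch -b, assuming GNU diffutils and patch are installed (the file names old.txt, new.txt, and change.patch are hypothetical):

```shell
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$tmpdir/old.txt"
printf 'alpha\nBETA\ngamma\n' > "$tmpdir/new.txt"

# diff exits with status 1 when the files differ, which is not an error here.
diff -u "$tmpdir/old.txt" "$tmpdir/new.txt" > "$tmpdir/change.patch" || true

# Apply the patch to old.txt; -b keeps the previous content as old.txt.orig.
patch -b "$tmpdir/old.txt" "$tmpdir/change.patch"

patched=$(cat "$tmpdir/old.txt")
backup=$(cat "$tmpdir/old.txt.orig")
printf '%s\n' "$patched" "$backup"
rm -r "$tmpdir"
```

After the patch is applied, old.txt has the content of new.txt, and the backup still holds the earlier version.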
1. Find all the IPv4 addresses of this machine in the output of the ifconfig command:
# ifconfig | grep "netmask" | cut -d 'n' -f2 | grep -o "\b[[:digit:]]\{1,3\}\.[[:digit:]]\{1,3\}\.[[:digit:]]\{1,3\}\.[[:digit:]]\{1,3\}\b"
10.1.70.102
127.0.0.1
192.168.122.1
2. Find the highest partition usage percentage:
# df | tr -s ' ' | cut -d ' ' -f5 | egrep -o '[0-9]{1,3}' | sort -nr | head -1
38
3. Find the user with the largest UID and show the user name, UID, and shell type:
# sort -nr -t: -k3 /etc/passwd | head -1 | cut -d: -f1,3,7
nfsnobody:65534:/sbin/nologin
4. Find the permissions of /tmp and display them as digits:
# stat /tmp/ | grep "Access" | head -1 | cut -c11-13
777
5. Count the connections from each remote host IP currently connected to this machine, sorted from most to fewest:
# netstat -nt | tr -s ' ' | cut -d ' ' -f5 | egrep -o '\b([0-9]{1,3}\.){3}[0-9]{1,3}\b' | sort | uniq -c | sort -nr
      2 10.1.250.91
Second, grep, one of the Linux "Three Musketeers" (grep, sed, and awk)
grep (Global search REgular expression and Print out the line) is a text search tool: it checks each line of the target text against a user-specified pattern and prints the lines that match.
Pattern: a filter condition written with regular-expression metacharacters and literal text characters
The Unix grep family consists of grep, egrep, and fgrep. egrep and fgrep differ from grep only slightly. egrep is extended grep and supports more regular-expression metacharacters; fgrep is fixed grep (or fast grep) and treats every character as literal, so metacharacters lose their special meaning and match only themselves. Linux uses the GNU version of grep, which is more powerful and selects among these behaviors with the -G, -E, and -F command-line options (-G is basic grep, the default; -E is egrep; -F is fgrep).
grep has many options; for the full list see
http://www.lampweb.org/linux/3/27.html
The more commonly used options are listed below:
--color=auto: highlight the matched text in color
-v: show the lines that do not match the pattern
-i: ignore case
-n: show the line number of each matching line
-c: print the number of matching lines
-o: show only the matched part of each line
-q: quiet mode; print nothing, commonly used in scripts to test the exit status
-e: specify a pattern; repeat it to combine several patterns with a logical OR
-w: match whole words only (a word consists of letters, digits, and underscores)
-A N: show each matching line and the N lines after it
-B N: show each matching line and the N lines before it
-C N: show each matching line and the N lines before and after it
-l: print only the names of the files that contain a match
-H: print the file name before each matching line
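Several of these options can be tried on a small made-up log file (auth.log and its four lines are hypothetical sample data):

```shell
tmpdir=$(mktemp -d)
log="$tmpdir/auth.log"
printf 'Root login ok\nuser bob failed\nroot failed\nrooted device\n' > "$log"

n_root=$(grep -ci 'root' "$log")                 # -c count, -i ignore case
whole=$(grep -nw 'root' "$log")                  # -n line number, -w whole word
not_failed=$(grep -v 'failed' "$log")            # -v lines that do not match
either=$(grep -c -e 'bob' -e 'device' "$log")    # two patterns, logical OR
after=$(grep -A 1 'bob' "$log")                  # match plus 1 line after it

# -q prints nothing; only the exit status is used.
if grep -q 'failed' "$log"; then status=found; else status=missing; fi

printf '%s\n' "$n_root" "$whole" "$not_failed" "$either" "$after" "$status"
rm -r "$tmpdir"
```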
Third, regular expressions
REGEXP: a pattern written with special characters and text characters, in which some characters (metacharacters) do not stand for themselves but act as control characters or wildcards
Regular expressions are generally used to match text content, whereas shell wildcards are used to match file paths
Help is available through man 7 regex
Regular-expression metacharacters fall into four groups: character matching, number of matches, position anchoring, and grouping
Regular expressions are divided into two categories:
1. Basic regular expressions (BRE)
Character matching:
. : matches any single character
[]: matches any single character listed inside the brackets
[^]: matches any single character not listed inside the brackets
[[:digit:]]: matches a single digit, same as [0-9]; both pairs of brackets are required
[[:alpha:]]: matches any single letter, upper or lower case
[[:lower:]]: matches any single lowercase letter
[[:upper:]]: matches any single uppercase letter
[[:alnum:]]: matches any single letter or digit
[[:punct:]]: matches any single punctuation character
[[:space:]]: matches a single whitespace character (space, TAB, and so on)
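To see which characters each class picks out, here is a sketch using grep -o on a made-up string:

```shell
sample='Host-01: up, 42 users'

# grep -o prints every match on its own line; tr joins them back together.
digits=$(printf '%s\n' "$sample" | grep -o '[[:digit:]]' | tr -d '\n')
uppers=$(printf '%s\n' "$sample" | grep -o '[[:upper:]]' | tr -d '\n')
puncts=$(printf '%s\n' "$sample" | grep -o '[[:punct:]]' | tr -d '\n')

printf '%s\n' "$digits" "$uppers" "$puncts"
```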
Number of matches:
*: matches the preceding character any number of times, including zero (matching is greedy; the *? form that turns greediness off exists only in Perl-style expressions, that is grep -P, not in BRE or ERE)
.*: any character, any number of times
\?: matches the preceding character 0 or 1 times
\+: matches the preceding character 1 or more times
\{m,n\}: matches the preceding character at least m and at most n times
If m is omitted, as in \{,n\}, the match is at most n times
If n is omitted, as in \{m,\}, the match is at least m times
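The quantifiers can be compared by counting how many of a few made-up words each pattern matches. This sketch assumes GNU grep, since \? and \+ in basic regular expressions are GNU extensions:

```shell
# Words with 0, 1, 2, 3, and 4 letters "o" between the two "g"s.
words() { printf '%s\n' ggle gogle google gooogle goooogle; }

star=$(words | grep -c 'go*gle')            # zero or more "o"
opt=$(words | grep -c 'go\?gle')            # zero or one "o"
plus=$(words | grep -c 'go\+gle')           # one or more "o"
range=$(words | grep -c 'go\{2,3\}gle')     # two or three "o"
atleast=$(words | grep -c 'go\{2,\}gle')    # two or more "o"

printf '%s\n' "$star" "$opt" "$plus" "$range" "$atleast"
```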
Position anchoring:
^: anchors the match at the beginning of the line
$: anchors the match at the end of the line
^$: matches empty lines, that is, lines that contain nothing but the line break
^[[:space:]]*$: matches blank lines, including lines that contain only whitespace
\< or \b: anchors the beginning of a word
\> or \b: anchors the end of a word
Grouping:
\(\): binds one or more characters together so that they are treated as a unit
The text matched by the pattern inside each pair of grouping parentheses is saved by the regular-expression engine in internal variables named \1, \2, \3, ...
\1: the text matched by the pattern between the first opening parenthesis, counting from the left, and its matching closing parenthesis
Back reference: refers to the text that the earlier group actually matched, not to the pattern itself
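Position anchors and back references can both be sketched on small made-up inputs, assuming GNU grep (the helper function sample_lines is only for illustration):

```shell
# Five lines: a record, an empty line, a whitespace-only line, and two more.
sample_lines() { printf 'root:x:0\n\n   \nchroot jail\nroot\n'; }

starts=$(sample_lines | grep -c '^root')            # lines beginning with root
ends=$(sample_lines | grep -c 'root$')              # lines ending with root
empty=$(sample_lines | grep -c '^$')                # truly empty lines
blank=$(sample_lines | grep -c '^[[:space:]]*$')    # empty or whitespace-only
word=$(sample_lines | grep -c '\<root\>')           # root as a whole word

# \1 must repeat the exact text captured by \(l..e\), not just the pattern.
repeated=$(printf '%s\n' 'he loves his lover' 'he likes his lover' \
    'she likes her liker' 'she loves her liker' | grep '\(l..e\).*\1')

printf '%s\n' "$starts" "$ends" "$empty" "$blank" "$word" "$repeated"
```

The second and fourth sentences are rejected because the group captures "like" or "love" and the rest of the line does not contain that same text again, even though it does contain another string of the form l..e.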
2. Extended regular expressions (ERE)
grep -E or egrep supports extended regular expressions
Number of matches:
*: matches the preceding character any number of times
?: 0 or 1 times
+: 1 or more times
{m}: exactly m times
{m,n}: at least m and at most n times
Position anchoring:
^: beginning of the line
$: end of the line
\<, \b: beginning of a word
\>, \b: end of a word
Grouping:
(): grouping, written without backslashes; the back references \1, \2, ... are supported
"Or" matching:
a|b: matches a or b
C|cat: matches C or cat
(C|c)at: matches Cat or cat
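The effect of the parentheses on "or" matching, together with the {m,n} counts, can be sketched with grep -E (the word list and the ifconfig-style line are made-up sample input):

```shell
names() { printf '%s\n' Cat cat bat C; }

# With grouping, the alternation covers only the first letter.
grouped=$(names | grep -E -c '^(C|c)at$')

# Without grouping, the whole left side (^C) and right side (cat$) are alternatives.
ungrouped=$(names | grep -E -c '^C|cat$')

# {1,3} and {3} need no backslashes in ERE; -o prints only the matched text.
ip=$(printf 'inet 10.1.70.102  netmask 255.255.0.0\n' \
    | grep -E -o '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n 1)

printf '%s\n' "$grouped" "$ungrouped" "$ip"
```

The ungrouped pattern also matches the single-letter line C, because ^C on its own is a complete alternative.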
For details, please see the following blog post:
Text-processing regular expressions and grep