One: Linux text processing tools
Tools for extracting text
File contents: cat and less
cat
-E: Show a $ at the end of each line
-n: Number every output line
-A: Show all control characters
-b: Number non-empty lines only
-s: Squeeze consecutive blank lines into a single blank line
tac: Print the file with its lines in reverse order (last line first)
rev: Print each line with its characters reversed (right to left)
less: View a file or stdin one page at a time, paging forward or backward
more: A filter that shows a file one full screen at a time
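As a rough illustration of the options above, the sketch below runs cat, tac, and rev against a small throwaway file. The file name sample.txt and its contents are invented for the example, and GNU coreutils (plus rev from util-linux) are assumed.

```shell
# Hypothetical sample file: two consecutive blank lines and one tab.
tmpdir=$(mktemp -d)
printf 'alpha\n\n\nbeta\tgamma\n' > "$tmpdir/sample.txt"

cat -n "$tmpdir/sample.txt"    # number every line
cat -b "$tmpdir/sample.txt"    # number non-empty lines only

# -s squeezes the run of blank lines into one, leaving 3 lines.
squeezed_lines=$(cat -s "$tmpdir/sample.txt" | wc -l)

# -A makes the tab visible as ^I and marks each line end with $.
visible=$(cat -A "$tmpdir/sample.txt" | tail -n 1)

# tac prints the last line first; rev mirrors the characters of each line.
first_from_tac=$(tac "$tmpdir/sample.txt" | head -n 1)
mirrored=$(printf 'abc\n' | rev)

printf '%s\n' "$squeezed_lines" "$visible" "$first_from_tac" "$mirrored"
rm -r "$tmpdir"
```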
File excerpts: head and tail
head
-c #: Output the first # bytes (handy for taking a fixed amount of random data)
-n #: Output the first # lines
tail
-c #: Output the last # bytes
-n #: Output the last # lines
-f: Follow the open file descriptor and print data as it is appended; commonly used for log monitoring; equivalent to --follow=descriptor (★)
-F: Follow the file by name; equivalent to --follow=name --retry (if the followed file is deleted, tail reports it and keeps retrying until the name reappears)
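A minimal sketch of head and tail, assuming GNU coreutils. The file numbers.txt is generated just for the example, and app.log in the closing comments is a hypothetical name; following a log is interactive, so it is shown only as comments.

```shell
tmpdir=$(mktemp -d)
seq 1 10 > "$tmpdir/numbers.txt"

first_three=$(head -n 3 "$tmpdir/numbers.txt")   # lines 1, 2, 3
last_two=$(tail -n 2 "$tmpdir/numbers.txt")      # lines 9, 10

# Byte-oriented variants on a single line of text.
first_bytes=$(printf 'abcdefgh\n' | head -c 3)   # abc
last_bytes=$(printf 'abcdefgh\n' | tail -c 4)    # fgh plus the trailing newline

# Take 8 random bytes and print them as 16 hex digits.
random_hex=$(head -c 8 /dev/urandom | od -An -tx1 | tr -d ' \n')

echo "$first_three"
echo "$last_two"
echo "$first_bytes $last_bytes $random_hex"
rm -r "$tmpdir"

# Log monitoring (not run here):
#   tail -f app.log    # follow the open file descriptor
#   tail -F app.log    # follow the name, so it survives log rotation
```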
Extract by column: cut
-d DELIMITER: Field delimiter; the default is tab (commonly used)
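The delimiter option is normally paired with -f, which selects the fields to print; -f is a standard cut option that the list above does not mention. A small sketch, using an illustrative passwd-style record:

```shell
# A passwd-style record; the values are illustrative.
record='root:x:0:0:root:/root:/bin/bash'

# -d sets the delimiter, -f selects fields (here the 1st and 7th).
user_and_shell=$(printf '%s\n' "$record" | cut -d: -f1,7)

# With no -d, fields are split on tabs.
second_tab_field=$(printf 'a\tb\tc\n' | cut -f2)

echo "$user_and_shell"
echo "$second_tab_field"
```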
Collect text statistics: wc
-l: Count lines only
-w: Count words only
-c: Count bytes only (often used to print the length of a word)
-m: Count characters only
-L: Print the length of the longest line in the file
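A short sketch of the wc counters on a two-line file. The file name and contents are made up, and -L assumes GNU wc. Input is fed through stdin so that only the number is printed, without the file name.

```shell
tmpdir=$(mktemp -d)
printf 'one two\nthree\n' > "$tmpdir/words.txt"

lines=$(wc -l < "$tmpdir/words.txt")     # 2 lines
words=$(wc -w < "$tmpdir/words.txt")     # 3 words
bytes=$(wc -c < "$tmpdir/words.txt")     # 14 bytes, newlines included
chars=$(wc -m < "$tmpdir/words.txt")     # same as bytes for plain ASCII
longest=$(wc -L < "$tmpdir/words.txt")   # 7, the length of "one two"

# Length of a single word: print it without a newline, then count bytes.
word_length=$(printf '%s' 'hello' | wc -c)

echo "$lines $words $bytes $chars $longest $word_length"
rm -r "$tmpdir"
```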
Sorting text: sort
-r: Reverse the sort order (frequently used when ranking IP access counts)
-R: Random order (can be used to pick lines at random)
-n: Sort by numeric value (often combined with -r)
-f: Fold case, ignoring upper/lower case when comparing
-u: Unique; drop duplicate lines from the output
-t c: Use c as the field delimiter
-k X: Sort by field X, as delimited by the -t character; may be given multiple times
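The following sketch combines several of these options on a small name:score file. The data is invented for the example, and -R assumes GNU sort.

```shell
tmpdir=$(mktemp -d)
printf 'carol:30\nalice:4\nbob:30\nalice:4\n' > "$tmpdir/scores.txt"

# Field 2 numerically, largest first; ties broken by field 1 ascending.
by_score=$(sort -t: -k2,2nr -k1,1 "$tmpdir/scores.txt")

# -u drops the duplicate alice:4 line.
unique_count=$(sort -u "$tmpdir/scores.txt" | wc -l)

# Without -n, "10" sorts before "2" because comparison is character by character.
lexical_first=$(printf '10\n9\n2\n' | sort | head -n 1)
numeric_first=$(printf '10\n9\n2\n' | sort -n | head -n 1)

# -R shuffles the lines; the order varies, the line count does not.
shuffled_count=$(sort -R "$tmpdir/scores.txt" | wc -l)

echo "$by_score"
echo "$unique_count $lexical_first $numeric_first $shuffled_count"
rm -r "$tmpdir"
```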
uniq command: remove adjacent duplicate lines from the input
-c: Prefix each line with its number of occurrences
-d: Show only lines that are repeated
-u: Show only lines that are not repeated
Note: lines count as duplicates only when they are adjacent and exactly identical
It is commonly used together with the sort command:
sort userlist.txt | uniq -c
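To show why sort usually comes first, here is a small sketch on an invented userlist.txt in which the repeated name is never on adjacent lines:

```shell
tmpdir=$(mktemp -d)
printf 'bob\nalice\nbob\ncarol\nbob\n' > "$tmpdir/userlist.txt"

# Unsorted input: no duplicates are adjacent, so uniq removes nothing.
unsorted_count=$(uniq "$tmpdir/userlist.txt" | wc -l)

# Sorted input: count occurrences and put the most frequent name first.
top=$(sort "$tmpdir/userlist.txt" | uniq -c | sort -nr | head -n 1)

repeated=$(sort "$tmpdir/userlist.txt" | uniq -d)   # names seen more than once
single=$(sort "$tmpdir/userlist.txt" | uniq -u)     # names seen exactly once

echo "$unsorted_count"
echo "$top"
echo "$repeated"
echo "$single"
rm -r "$tmpdir"
```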
The "Three Musketeers" of Linux text processing
grep: text filtering (pattern matching) tool
Variants: grep, egrep, fgrep (fgrep does not support regular expression search)
sed: stream editor, a text editing tool
awk: text report generator; implemented on Linux by gawk
grep command options
--color=auto: Highlight the matched text
-v: Show lines that do NOT match the pattern (inverted match)
-i: Ignore case
-n: Show the line numbers of matches
-c: Count the matching lines
-o: Show only the matched string
-q: Quiet mode; print nothing (the exit status reports whether a match was found)
-A #: After; also print the # lines that follow each match
-B #: Before; also print the # lines that precede each match
-C #: Context; also print # lines before and after each match
-e: Specify a pattern; multiple -e options are combined with a logical OR
grep -e 'cat' -e 'dog' file
-w: Match whole words only
-E: Use extended regular expressions (ERE)
-F: Equivalent to fgrep; the pattern is a fixed string, not a regular expression
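A sketch exercising most of these options against a four-line file. The file animals.txt and its contents are invented for the example, and GNU grep is assumed.

```shell
tmpdir=$(mktemp -d)
f="$tmpdir/animals.txt"
printf 'Cat sat\ndog ran\ncatalog\nbird flew\n' > "$f"

count_ci=$(grep -ci 'cat' "$f")              # 2: "Cat sat" and "catalog"
whole_word=$(grep -iw 'cat' "$f")            # only "Cat sat"
either=$(grep -e 'Cat' -e 'dog' "$f")        # lines matching Cat OR dog
neither=$(grep -vc -e 'Cat' -e 'dog' "$f")   # 2 lines match neither pattern
numbered=$(grep -n 'bird' "$f")              # 4:bird flew
only_match=$(grep -o 'cat' "$f")             # just "cat", taken from "catalog"
with_after=$(grep -A 1 'dog' "$f")           # the match plus one following line

# -q prints nothing; only the exit status is used.
if grep -q 'dog' "$f"; then found=yes; else found=no; fi

# -F treats the dot literally, so "abc" no longer matches.
fixed=$(printf 'a.c\nabc\n' | grep -F 'a.c')

echo "$count_ci $neither $found $numbered $only_match $fixed"
rm -r "$tmpdir"
```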
Basic regular expression metacharacters:
Character matching:
. matches any single character
[] matches any single character in the specified set
[^] matches any single character not in the specified set
[:alnum:] letters and digits
[:alpha:] any English letter, upper or lower case, i.e. A-Z, a-z
[:lower:] lowercase letters; [:upper:] uppercase letters
[:blank:] blank characters (space and tab)
[:space:] horizontal and vertical whitespace characters (a wider set than [:blank:])
[:cntrl:] non-printable control characters (backspace, delete, bell, ...)
[:digit:] decimal digits; [:xdigit:] hexadecimal digits
[:graph:] printable non-whitespace characters
[:print:] printable characters
[:punct:] punctuation
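Note that a named class is written inside a bracket expression, so the working form is [[:digit:]] rather than [:digit:] on its own. A few one-liners as a sketch, with inputs invented for the example:

```shell
# . matches exactly one character, so "ct" does not match c.t
dot_count=$(printf 'cat\nct\n' | grep -c 'c.t')

# A bracket set: c, then a or o, then t.
set_count=$(printf 'cat\ncot\ncut\n' | grep -c 'c[ao]t')

# A negated set: x followed by anything that is not a digit.
negated=$(printf 'x1\nxy\n' | grep 'x[^[:digit:]]')

# Named classes inside brackets.
digits=$(printf 'abc123\n' | grep -o '[[:digit:]][[:digit:]]*')
capitalised=$(printf 'hello\nHello\n' | grep '^[[:upper:]]')
has_blank=$(printf 'a b\nab\n' | grep -c '[[:blank:]]')

echo "$dot_count $set_count $negated $digits $capitalised $has_blank"
```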
Repetition: placed after a character to specify how many times that character may occur
* matches the preceding character any number of times, including zero (greedy mode: it matches as much as possible)
.* matches any characters of any length
\? matches the preceding character 0 or 1 times
\+ matches the preceding character at least once
\{n\} matches the preceding character exactly n times
\{m,n\} matches the preceding character at least m and at most n times
\{,n\} matches the preceding character at most n times
\{n,\} matches the preceding character at least n times
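The sketch below tries the repetition operators on a few invented words. In basic regular expressions (plain grep) the ?, +, and {} operators take a backslash, with \? and \+ being GNU extensions; with grep -E they are written without it.

```shell
tmpdir=$(mktemp -d)
w="$tmpdir/words.txt"
printf 'gogle\ngoogle\ngooogle\nggle\n' > "$w"

star=$(grep -c 'go*gle' "$w")            # 4: zero or more o, so "ggle" matches too
optional=$(grep 'go\?gle' "$w")          # zero or one o: gogle and ggle
plus=$(grep -c 'go\+gle' "$w")           # 3: at least one o
exactly_two=$(grep 'go\{2\}gle' "$w")    # only google
two_to_three=$(grep -c 'go\{2,3\}gle' "$w")   # google and gooogle
ere_two_plus=$(grep -E -c 'go{2,}gle' "$w")   # same count, ERE syntax

echo "$star $plus $two_to_three $ere_two_plus"
echo "$optional"
echo "$exactly_two"
rm -r "$tmpdir"
```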
Position anchors: fix where the pattern must appear
^ start-of-line anchor; placed at the leftmost end of the pattern
$ end-of-line anchor; placed at the rightmost end of the pattern
^pattern$ the pattern must match the entire line
^$ empty line
^[[:space:]]*$ blank line (empty or whitespace only)
\< or \b start-of-word anchor; placed on the left side of a word pattern
\> or \b end-of-word anchor; placed on the right side of a word pattern
\<pattern\> matches a whole word
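A sketch of the anchors on a five-line invented file, where the second line is empty and the third holds only spaces. The word anchors \< and \> assume GNU grep.

```shell
tmpdir=$(mktemp -d)
f="$tmpdir/anchors.txt"
printf 'root\n\n   \nroot user\nchroot\n' > "$f"

starts=$(grep -c '^root' "$f")            # "root" and "root user"
ends=$(grep -c 'root$' "$f")              # "root" and "chroot"
whole_line=$(grep -c '^root$' "$f")       # only the line that is exactly "root"
empty=$(grep -c '^$' "$f")                # the truly empty line
blank=$(grep -c '^[[:space:]]*$' "$f")    # empty line plus the spaces-only line
whole_word=$(grep -c '\<root\>' "$f")     # "chroot" is excluded

echo "$starts $ends $whole_line $empty $blank $whole_word"
rm -r "$tmpdir"
```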
Grouping: \(\) binds one or more characters together so they are treated as a unit, e.g. \(root\)\+
The text matched by the pattern inside each group is recorded by the regular expression engine in internal variables,
which are named \1, \2, \3, ...
\1 is the text matched by the pattern between the first opening parenthesis (counting from the left) and its matching closing parenthesis.
Example: \(string1\+\(string2\)\)
\1: string1\+\(string2\)
\2: string2
Back reference: refers to the text that the pattern in an earlier group matched, not to the pattern itself
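The difference between "the matched text" and "the pattern" is easiest to see in a small test. The inputs below are invented; the sed substitution is used only to make the contents of \1 and \2 visible.

```shell
# \1 repeats the text the group matched, so "ab" fails even though
# both a and b would satisfy the pattern [ab] on their own.
same_char=$(printf 'aa\nab\nbb\n' | grep '\([ab]\)\1')

# A group followed by a back reference to it: "abc" twice in a row.
doubled=$(printf 'abcabc\nabcabd\n' | grep '\(abc\)\1')

# Nested groups are numbered by their opening parenthesis, left to right:
# \1 is the outer group, \2 the inner one.
nested=$(printf 'ab12\n' | sed 's/\(ab\([0-9]*\)\)/[\1][\2]/')

echo "$same_char"
echo "$doubled"
echo "$nested"
```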
OR: \| (written as a plain | with grep -E)
Example: a\|b: a or b; C\|cat: C or cat; \(C\|c\)at: Cat or cat
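A quick sketch of how grouping changes what the alternation covers, using four invented lines. The BRE form \| assumes GNU grep.

```shell
tmpdir=$(mktemp -d)
f="$tmpdir/alt.txt"
printf 'Cat\ncat\nC\nbat\n' > "$f"

# Without a group the alternatives are "C" and "cat", so the bare "C" line matches.
ungrouped=$(grep -E -c 'C|cat' "$f")

# With a group only the first letter alternates: Cat or cat.
grouped=$(grep -E '(C|c)at' "$f")

# The same ungrouped pattern in basic regular expression syntax.
bre=$(grep -c 'C\|cat' "$f")

echo "$ungrouped $bre"
echo "$grouped"
rm -r "$tmpdir"
```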
Example: add the users bash, basher, and nologin (the last with its shell set via -s /sbin/nologin), then find the lines where the user name is the same as the shell name.
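One possible solution, sketched against a stand-in file rather than the real /etc/passwd so that no users need to be created. The file passwd.sample, the UIDs, and the home directories are all hypothetical. The group captures the user name and the back reference requires the same text after the last slash of the line.

```shell
tmpdir=$(mktemp -d)
p="$tmpdir/passwd.sample"
# Stand-in records; only the name and shell fields matter here.
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'bash:x:1001:1001::/home/bash:/bin/bash' \
  'basher:x:1002:1002::/home/basher:/bin/bash' \
  'nologin:x:1003:1003::/home/nologin:/sbin/nologin' > "$p"

# ^\([^:]*\): captures everything up to the first colon (the user name);
# /\1$ requires the line to end with a slash followed by that same name.
same_name=$(grep '^\([^:]*\):.*/\1$' "$p" | cut -d: -f1)

echo "$same_name"
rm -r "$tmpdir"
```

The user basher is not reported: its shell ends in /bash, while \1 holds the full name basher.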