Command line for data analysis: file operations

Source: Internet
Author: User

wc--Count lines, words, and bytes

sort--Sorting

uniq--Remove duplicate lines

$ sort file.txt | uniq -c | sort -nr | head -5

This is equivalent to the following SQL query:

select word, count(1) cnt from file group by word order by cnt desc limit 5;
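A minimal sketch of the counting pipeline above, run on a throwaway sample file. The file name and the words in it are made up for illustration.

```shell
# Create a temporary sample file; the contents are illustrative only.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana apple cherry banana apple > "$tmpdir/file.txt"

# sort groups identical lines, uniq -c counts each group,
# sort -nr orders by count (largest first), head -5 keeps the top five.
top=$(sort "$tmpdir/file.txt" | uniq -c | sort -nr | head -5)
echo "$top"

# The most frequent word is on the first line; awk ignores the count padding.
most_common=$(echo "$top" | head -1 | awk '{print $2}')
most_common_count=$(echo "$top" | head -1 | awk '{print $1}')
rm -rf "$tmpdir"
```

The initial sort matters: uniq only collapses adjacent duplicate lines, so unsorted input would be undercounted.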


gzip/tar--Compression tools

cat/zcat--View files

less/more--Page through files; can view gz-compressed files directly

head/tail--View the beginning and end of a file

du -h -c -s--View disk space usage


awk--Database-style text manipulation tool

join/cut/paste--Join files on a common field / extract fields / merge files line by line

fgrep/grep/egrep--Global regular expression search

find--Finds files and runs tasks in batch on the search results

sed--Stream editor; batch modify and replace file content

split--Split large files, by number of lines per file or number of bytes per file

rename--Batch rename
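The difference between join, cut, and paste is easiest to see on a small example. The sketch below uses two made-up files keyed by an ID in the first column; the file names and data are assumptions for illustration.

```shell
tmpdir=$(mktemp -d)
# Two small sample files that share an ID in the first column.
printf '1 alice\n2 bob\n3 carol\n' > "$tmpdir/names.txt"
printf '1 30\n2 25\n3 41\n'        > "$tmpdir/ages.txt"

# join: merge lines with the same first field (both inputs must be sorted on it).
joined=$(join "$tmpdir/names.txt" "$tmpdir/ages.txt")

# cut: keep only the second space-separated field.
names=$(cut -d' ' -f2 "$tmpdir/names.txt")

# paste: glue the two files together line by line, separated by a comma.
pasted=$(paste -d, "$tmpdir/names.txt" "$tmpdir/ages.txt")

echo "$joined"
echo "$names"
echo "$pasted"
rm -rf "$tmpdir"
```

join matches on key values, while paste pairs lines purely by position, so paste only gives meaningful rows when both files are in the same order.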


zcat--View the contents of a compressed file directly

zgrep/zfgrep/zegrep--Search directly in compressed files
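As a sketch of working with compressed files without unpacking them, the following compresses a small made-up log and reads it back with zcat and zgrep. It assumes the gzip versions of these tools (as found on typical Linux systems); the log content is illustrative.

```shell
tmpdir=$(mktemp -d)
printf 'ok start\nerror disk full\nok stop\n' > "$tmpdir/app.log"
gzip "$tmpdir/app.log"    # produces app.log.gz and removes the original

# zcat prints the decompressed content without writing a file to disk.
line_count=$(zcat "$tmpdir/app.log.gz" | wc -l | tr -d ' ')

# zgrep searches inside the compressed file directly.
matches=$(zgrep error "$tmpdir/app.log.gz")

echo "$line_count lines; match: $matches"
rm -rf "$tmpdir"
```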


date--Date and time operations

sort/uniq--Sorting, deduplication, and counting

comm--Compare two sorted files (lines in common, lines only in the left file, lines only in the right file)

diff--Compare files and show their differences; with cdiff, the display is similar to GitHub's

curl/w3m/httpie--Network requests from the command line

iconv--File encoding conversion

seq--Generates a sequence of numbers, for example for a for loop
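A short sketch combining two of the tools above: seq generates two overlapping number ranges, and comm reports what the files share and what is unique to each. The file names are made up for illustration.

```shell
tmpdir=$(mktemp -d)
seq 1 5 > "$tmpdir/left.txt"    # 1 2 3 4 5, one number per line
seq 3 7 > "$tmpdir/right.txt"   # 3 4 5 6 7

# comm prints three columns: only-left, only-right, common.
# -12 hides columns 1 and 2, -23 hides 2 and 3, -13 hides 1 and 3.
common=$(comm -12 "$tmpdir/left.txt" "$tmpdir/right.txt")
only_left=$(comm -23 "$tmpdir/left.txt" "$tmpdir/right.txt")
only_right=$(comm -13 "$tmpdir/left.txt" "$tmpdir/right.txt")

echo "common:" $common
echo "only left:" $only_left
echo "only right:" $only_right
rm -rf "$tmpdir"
```

comm expects both inputs to be sorted as text. Single-digit numbers happen to satisfy that here; for larger ranges, pass the files through sort first.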


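Shell conditionals and loops

The skeletons below show the syntax; condition and pass stand for the test and the commands to run.

if [ condition ]; then
    pass
fi

while [ condition ]
do
    pass
done

for i in xxxxx
do
    pass
done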
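A runnable sketch of the three constructs, with concrete conditions filled in. The variable names and values are chosen only for illustration.

```shell
# if: test a condition with [ ... ]
count=0
status="unset"
if [ "$count" -eq 0 ]; then
    status="empty"
fi

# while: repeat as long as the condition holds
while [ "$count" -lt 3 ]
do
    count=$((count + 1))
done

# for: iterate over a list of words
joined=""
for i in a b c
do
    joined="$joined$i"
done

echo "$status $count $joined"
```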


Run long-running tasks with nohup
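A minimal sketch of the usual nohup pattern: start the task in the background, redirect its output to a log file, and keep its process ID. The one-second task and the log file name are stand-ins for a real long-running job.

```shell
tmpdir=$(mktemp -d)

# nohup makes the task ignore the hangup signal sent when the terminal closes.
# Redirecting stdout and stderr avoids the default nohup.out file.
nohup sh -c 'sleep 1; echo finished' > "$tmpdir/task.log" 2>&1 &
task_pid=$!

# Waiting is only done here so the log can be inspected right away;
# normally you would log out and check the log file later.
wait "$task_pid"
result=$(grep -c '^finished$' "$tmpdir/task.log")
echo "lines reporting completion: $result"
rm -rf "$tmpdir"
```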




Using command combinations

1 Delete 0-byte files: find . -type f -size 0 -exec rm -rf {} \;

find . -type f -size 0 -delete
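Because deletion is irreversible, it can help to preview the matches first. The sketch below does that in a temporary directory with made-up files, assuming a find that supports -delete (GNU and BSD find both do).

```shell
tmpdir=$(mktemp -d)
: > "$tmpdir/empty1.log"          # create two empty files
: > "$tmpdir/empty2.log"
echo data > "$tmpdir/keep.txt"    # and one non-empty file

# Preview: the same expression without -delete only lists the matches.
before=$(find "$tmpdir" -type f -size 0 | wc -l | tr -d ' ')

# Delete the empty files, then count what is left.
find "$tmpdir" -type f -size 0 -delete
remaining=$(find "$tmpdir" -type f | wc -l | tr -d ' ')

echo "empty files found: $before, files remaining: $remaining"
rm -rf "$tmpdir"
```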

2 View processes, sorted by memory from large to small: ps -e -o "%C : %p : %z : %a" | sort -k5 -nr

3 View processes, sorted by CPU utilization from large to small: ps -e -o "%C : %p : %z : %a" | sort -nr

4 Print the URLs in the cache: grep -r -a jpg /data/cache/* | strings | grep "http:" | awk -F'http:' '{print "http:"$2;}'

5 View the number of concurrent HTTP requests and their TCP connection states: netstat -n | awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}'
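To show how the awk counting idiom works without depending on live connections, the sketch below feeds it a few lines shaped like netstat -n output. The addresses and states are made up.

```shell
# Sample lines in the shape of `netstat -n` output (illustrative data).
sample='tcp 0 0 10.0.0.1:80 10.0.0.2:5001 ESTABLISHED
tcp 0 0 10.0.0.1:80 10.0.0.3:5002 TIME_WAIT
tcp 0 0 10.0.0.1:80 10.0.0.4:5003 ESTABLISHED
udp 0 0 10.0.0.1:53 0.0.0.0:*'

# For every line starting with tcp, use the last field ($NF, the state)
# as an array key and increment its counter; print all counters at the end.
states=$(printf '%s\n' "$sample" |
    awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}' | sort)
echo "$states"
```

The trailing sort is only there to make the order predictable, since awk does not guarantee the iteration order of for (a in S).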

6 In the lines of the file that match root, replace no with yes: sed -i '/root/s/no/yes/' /etc/ssh/sshd_config
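The address-plus-substitution form can be tried safely on piped text instead of a real configuration file. The two input lines below are made up for illustration.

```shell
# /root/ selects only the lines containing "root";
# s/no/yes/ then replaces the first "no" on those lines.
out=$(printf 'root login no\nguest login no\n' | sed '/root/s/no/yes/')
echo "$out"

first=$(echo "$out" | sed -n 1p)
second=$(echo "$out" | sed -n 2p)
```

Without -i, sed writes the result to standard output and leaves the input untouched, which makes it a convenient dry run before editing a file in place.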

7 How to kill the MySQL processes: ps aux | grep mysql | grep -v grep | awk '{print $2}' | xargs kill -9

killall -TERM mysqld

kill -9 `cat /usr/local/apache2/logs/httpd.pid`

8 Display the services enabled at runlevel 3 (this shows how cut extracts part of each line): ls /etc/rc3.d/S* | cut -c 15-
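To see why the offset is 15, the sketch below applies the same cut to a single path. The script name S10network is a hypothetical example of the usual S plus two-digit priority naming.

```shell
# "/etc/rc3.d/S" is 12 characters and the priority "10" fills positions 13-14,
# so the service name starts at character 15.
path="/etc/rc3.d/S10network"
service=$(echo "$path" | cut -c 15-)
echo "$service"
```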

9 How to display multiple lines of text in a shell script, using EOF: cat << EOF

+------------------------------------+

| === Welcome to tunoff services === |

+------------------------------------+

EOF
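One detail worth knowing about this construct is that quoting the delimiter changes whether variables are expanded. A minimal sketch, with a made-up variable:

```shell
name="world"

# Unquoted delimiter: $name is expanded inside the text.
expanded=$(cat << EOF
Hello, $name
EOF
)

# Quoted delimiter: the text is taken literally.
literal=$(cat << 'EOF'
Hello, $name
EOF
)

echo "$expanded"
echo "$literal"
```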

Using for (for example, to create soft links to the MySQL binaries): cd /usr/local/mysql/bin

for i in *

do ln -s /usr/local/mysql/bin/$i /usr/bin/$i

done

Get the IP address: ifconfig eth0 | grep "inet addr:" | awk '{print $2}' | cut -c 6-

ifconfig | grep 'inet addr:' | grep -v '127.0.0.1' | cut -d: -f2 | awk '{print $1}'

Memory size: free -m | grep "Mem" | awk '{print $2}'

View connections on port 80 and sort them: netstat -an -t | grep ":80" | grep ESTABLISHED | awk '{printf "%s %s\n",$5,$6}' | sort

View the number of concurrent Apache requests and their TCP connection states: netstat -n | awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}'

Check the total size of all JPG files on the server: find / -name '*.jpg' -exec wc -c {} \; | awk '{print $1}' | awk '{a+=$1} END {print a}'
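The size-summing pipeline can be checked on a small scale before running it against the whole filesystem. The sketch below uses a temporary directory with made-up files of known size.

```shell
tmpdir=$(mktemp -d)
printf '12345'      > "$tmpdir/a.jpg"      # 5 bytes
printf '1234567890' > "$tmpdir/b.jpg"      # 10 bytes
printf 'xyz'        > "$tmpdir/notes.txt"  # not a JPG, must be ignored

# wc -c prints "<bytes> <file>" for each match; awk adds up the first column.
# Quoting '*.jpg' stops the shell from expanding the pattern before find sees it.
total=$(find "$tmpdir" -name '*.jpg' -exec wc -c {} \; | awk '{a += $1} END {print a}')
echo "total bytes: $total"
rm -rf "$tmpdir"
```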

Number of CPUs: cat /proc/cpuinfo | grep -c processor

CPU load: cat /proc/loadavg

CPU load: mpstat 1 1

Memory usage: free

Disk space: df -h

If a partition is nearly full, go to its mount point and use the following command to find the files or directories that take up the most space: du -cks * | sort -rn | head -n 10

Disk I/O load: iostat -x 1 2

Network load: sar -n DEV

Network errors: netstat -i

cat /proc/net/dev

Number of network connections: netstat -an | grep -E "^(tcp)" | cut -c 68- | sort | uniq -c | sort -n

Total number of processes: ps aux | wc -l

View the process tree: ps aufx

Number of running processes: vmstat 1 5

Check that a DNS server is working properly, taking 61.139.2.69 as an example: dig www.baidu.com @61.139.2.69

Check the number of currently logged-in users: who | wc -l

Log viewing and searching: cat /var/log/rflogview/*errors

grep -i error /var/log/messages

grep -i fail /var/log/messages

tail -f -n 2000 /var/log/messages

Kernel log: dmesg

Time: date

Number of open handles: lsof | wc -l

Capture network packets and write the summary information directly to a file (PORT stands for the destination port to capture): tcpdump -c 10000 -i eth0 -n dst port PORT > /root/pkts

Then count how often each IP address appears and order the result from small to large, using less to read the file. Note that there are two spaces in the middle of "-t\  +0": less pkts | awk {'printf $3"\n"'} | cut -d. -f 1-4 | sort | uniq -c | awk {'printf $1" "$2"\n"'} | sort -n -t\  +0
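The counting step can be sketched on a few lines shaped like tcpdump -n summary output; the timestamps and addresses below are made up. This version uses a plain sort -n in place of the older +0 key syntax, which is an assumption about what the final sort is meant to do.

```shell
# Sample lines in the shape of `tcpdump -n` output (illustrative data).
sample='12:00:00.000001 IP 10.0.0.2.5001 > 10.0.0.1.80: Flags [S], length 0
12:00:00.000002 IP 10.0.0.3.5002 > 10.0.0.1.80: Flags [S], length 0
12:00:00.000003 IP 10.0.0.2.5003 > 10.0.0.1.80: Flags [S], length 0'

# Field 3 is "source-ip.port"; cut keeps the first four dot-separated parts
# (the IP address), uniq -c counts each address, sort -n orders by count.
counts=$(printf '%s\n' "$sample" |
    awk '{print $3}' | cut -d. -f1-4 |
    sort | uniq -c | sort -n | awk '{print $1, $2}')
echo "$counts"
```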

View the NIC model with kudzu: kudzu --probe --class=network























