Tag: Shell
Small programs for colleagues
140822,1406181801491716879,221.203.75.168,20140822000014
140823,1408051715321587060,101.28.174.242,20140822000127
140823,1408051715321587060,101.28.174.242,20140822000129
140824,1408051715321587060,101.28.174.242,20140822000139
140824,1406031640261808247,110.254.245.82,20140822000205
140825,1305230023521467300,210.73.6.180,20140822000216
140825,1408181402431171048,110.243.255.56,20140822000216
140825,1408131900341325654,110.248.233.239,20140822000216
140825,1407071756131811923,27.213.51.178,20140822000228
140826,1408171201311863011,124.67.26.134,20140822000238
The data file looks as shown above: the first field of each line is a date, e.g. 140822 means August 22, 2014.
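For reference, the date field can be pulled out of a record with shell parameter expansion, the same trick the script below relies on. A minimal sketch on one sample record from the file:

```shell
# One sample record in the format described above
line="140822,1406181801491716879,221.203.75.168,20140822000014"

# ${line%%,*} strips everything from the first comma onward,
# leaving only the leading date field.
echo "${line%%,*}"    # prints 140822

# ${line##*,} strips everything up to and including the last comma,
# leaving the trailing timestamp.
echo "${line##*,}"    # prints 20140822000014
```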
To split the file into separate files by date, the first shell script was as follows:
cat data.txt | while read line
do
    if [ -n "$line" ]
    then
        echo "$line"
        echo "${line%%,*}"
        echo "$line" >> "${line%%,*}.txt"
        echo "${line##*,}"
    fi
done
In actual use this was intolerably slow, given that the data covers dates from 140822 to 140826.
So the script was rewritten:
date
for ((i=140818; i<=140826; i++))
do
    echo $i
    awk '{if(/^'$i'/)print $0;}' data.txt > $i.data.csv
done
date
This is significantly faster. The speedup comes from doing fewer I/O operations and from creating and destroying far fewer subprocesses: the first script opened the output file anew and ran builtins per line, while this one launches only one awk process per date.
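The subprocess-overhead point can be felt with a small (hypothetical) micro-benchmark: the same number of loop iterations, once with a pipeline spawning processes per iteration and once with shell builtins only:

```shell
# Per-iteration pipeline: each iteration forks processes for the pipe,
# similar in spirit to the per-line work in the first script.
time for i in $(seq 1 1000); do echo "$i" | wc -c > /dev/null; done

# Builtins only: no fork per iteration, so this finishes far sooner.
time for i in $(seq 1 1000); do : "${#i}"; done
```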
Next, grep was tried in place of awk, to see whether it would be faster still:
date
for ((i=140818; i<=140826; i++))
do
    echo $i
    grep "^$i," data.txt > "${i}.txt"
done
date
The awk loop takes about 6 minutes, while the grep loop takes about 3 minutes.
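As an aside, both loops rescan data.txt once per date. A single pass is also possible with awk alone, letting it pick the output file name from the first field; a hedged sketch, assuming the comma-separated format shown above:

```shell
# One pass over data.txt: use the first comma-separated field as the
# output file name, so each date's records land in their own file.
awk -F, '{ print > ($1 ".txt") }' data.txt
```

This avoids knowing the date range in advance, at the cost of awk keeping one open file handle per distinct date.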
Even small shell scripts like these are very interesting.