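#!/bin/sh
# Count how often each http:// URL occurs in a file and save the most frequent ones.
foo ()
{
    # Require exactly one argument: the input file name.
    if [ $# -ne 1 ]; then
        echo "usage: $0 filename"
        exit 1
    fi
    # grep extracts the URLs, awk tallies them, sort orders by count (descending),
    # head keeps the top 10 (10 is inferred from the ten-line result shown below).
    grep -Eo "http://[a-zA-Z0-9.]+\.[a-zA-Z]{2,3}" "$1" |
        awk '{ count[$0]++ } END { printf("%-30s%s\n", "Website", "Count"); for (ind in count) { printf("%-30s%d\n", ind, count[ind]) } }' |
        sort -nrk 2 | head -n 10 > websorted2.txt
}
foo "$@"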
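
Because awk prints the header line before sort runs, sort -n treats the non-numeric "Count" field as 0, so the header sinks to the bottom and is usually cut off by head. A minimal sketch of one way around this is to print the header outside the sorted stream. The name top_sites and its optional second argument (number of rows, default 10) are hypothetical and not part of the original script:

```shell
#!/bin/sh
# top_sites FILE [N]: hypothetical variant that keeps the header on top.
top_sites() {
    # Header is written first, outside the sort, so it cannot be reordered.
    printf '%-30s%s\n' 'Website' 'Count'
    grep -Eo 'http://[a-zA-Z0-9.]+\.[a-zA-Z]{2,3}' "$1" |
        awk '{ count[$0]++ } END { for (u in count) printf("%-30s%d\n", u, count[u]) }' |
        sort -nrk 2 |
        head -n "${2:-10}"
}

# Usage (assumes a file named "website" in the current directory).
if [ -f website ]; then
    top_sites website > websorted2.txt
fi
```

With this layout the output file has one header row followed by at most N data rows.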
Example:
Contents of the file website:
http://www.google.com
http://www.baidu.com
http://www.sina.com
http://www.bjtu.edu.cn
http://www.codeproject.com
http://www.csdn.com
http://www.sohu.com
http://www.yahoo.com
http://mail.163.com
http://www.bjtu.edu.cn
http://www.codeproject.com
http://www.csdn.com
http://www.sohu.com
http://www.yahoo.com
http://mail.163.com
http://www.codeproject.com
http://www.csdn.com
http://www.sohu.com
http://www.yahoo.com
http://mail.163.com
http://www.qq.com
http://www.hao123.com
http://www.163.com
http://youku.com
http://taobao/com
http://www.bjtu.edu.cn
http://www.codeproject.com
http://www.csdn.com
http://www.sohu.com
http://www.yahoo.com
http://mail.163.com
http://www.codeproject.com
http://www.csdn.com
http://www.sohu.com
http://www.yahoo.com
http://mail.163.com
http://www.qq.com
http://www.hao123.com
http://www.163.com
http://youku.com
http://taobao/com
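
Not every line contributes to the counts: the pattern requires a host name ending in a dot plus two or three letters, so a line such as http://taobao/com never matches, and anything after the top-level domain (a path, for instance) is dropped. A minimal sketch of the extraction step on its own follows. The name extract_urls is a hypothetical helper, and the sample lines are made up for illustration:

```shell
#!/bin/sh
# extract_urls FILE: hypothetical helper, prints only the parts of each line
# that match the URL pattern, one match per output line.
extract_urls() {
    grep -Eo 'http://[a-zA-Z0-9.]+\.[a-zA-Z]{2,3}' "$1"
}

# Throwaway sample input, for illustration only.
sample=$(mktemp)
printf '%s\n' \
    'http://www.yahoo.com' \
    'http://taobao/com' \
    'see http://mail.163.com/inbox now' > "$sample"

# Prints http://www.yahoo.com and http://mail.163.com; the taobao line is skipped.
extract_urls "$sample"
rm -f "$sample"
```

The {2,3} bound also means a longer top-level domain is truncated: http://example.info would be reported as http://example.inf.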
Contents of the resulting file websorted2.txt:
http://www.yahoo.com          5
http://www.sohu.com           5
http://www.csdn.com           5
http://www.codeproject.com    5
http://mail.163.com           5
http://www.bjtu.edu.cn        3
http://youku.com              2
http://www.qq.com             2
http://www.hao123.com         2
http://www.163.com            2
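
The counts can be cross-checked with sort | uniq -c, a common alternative to the awk associative array. This is a minimal sketch under the same URL pattern. The name count_urls is hypothetical, and the explicit tie-breaking on the URL is an assumption made so the order is deterministic:

```shell
#!/bin/sh
# count_urls FILE: hypothetical helper, prints "url count" rows, highest count first.
count_urls() {
    grep -Eo 'http://[a-zA-Z0-9.]+\.[a-zA-Z]{2,3}' "$1" |
        sort |                      # group identical URLs on adjacent lines
        uniq -c |                   # collapse each group to "count url"
        sort -k1,1nr -k2,2r |       # count descending, ties by URL descending
        awk '{ printf("%-30s%d\n", $2, $1) }'   # same column layout as the script
}

# Usage (assumes a file named "website" in the current directory).
if [ -f website ]; then
    count_urls website | head -n 10
fi
```

Unlike the awk version, this approach sorts the input twice, which is simpler to read but does more work on large files.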