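I first used this embedded web server during the second refactoring of EasyHadoop, to monitor server status. It is useful enough to pull out and write up on its own.
The idea is to have a Python script collect various status information from a Linux server and expose it as JSON over HTTP through a web server. The master node reads the JSON returned by each web server and generates the system monitoring reports. The code is simple, and both development and deployment are easy.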
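The main piece is CherryPy, a third-party embedded web module for Python. I chose it mainly because it is quick to develop with and quick to learn: I had the basics of writing handlers down in a little over a day. You could also use Python's built-in SimpleHTTPServer, but that one really is too simple. CherryPy's strengths are that it is multithreaded and handles many concurrent requests, yet it is not as heavyweight as Tornado or Django. Since we only return JSON, we need neither HTML templates nor database features. web.py is another option, but in my opinion CherryPy is still the better of the two, and web.py borrowed quite a few ideas from CherryPy.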
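For comparison, here is a minimal sketch of the same idea built only on the standard library's basic HTTP server classes. It is an illustration, not part of the EasyHadoop code: the names `StatusHandler`, `read_loadavg`, `make_server` and the `/loadavg` path are made up for this example. The plain `HTTPServer` class handles one request at a time, which is one concrete reason to prefer a threaded server such as CherryPy.

```python
import json

try:
    from http.server import BaseHTTPRequestHandler, HTTPServer  # Python 3
except ImportError:
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer  # Python 2


def read_loadavg(path="/proc/loadavg"):
    """Read the 1, 5 and 15 minute load averages (Linux only)."""
    with open(path) as f:
        fields = f.read().split()
    return {"lavg_1": fields[0], "lavg_5": fields[1], "lavg_15": fields[2]}


class StatusHandler(BaseHTTPRequestHandler):
    # Maps a URL path to a function that returns a JSON-serialisable object.
    # The "/loadavg" path is hypothetical, chosen for this sketch.
    collectors = {"/loadavg": read_loadavg}

    def do_GET(self):
        collector = self.collectors.get(self.path.rstrip("/"))
        if collector is None:
            self.send_error(404)
            return
        body = json.dumps(collector()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        # Silence the default per-request logging to stderr.
        pass


def make_server(host="127.0.0.1", port=0):
    """Create the server; port 0 lets the OS pick a free port."""
    return HTTPServer((host, port), StatusHandler)
```

Calling `make_server(port=60090).serve_forever()` would serve `/loadavg` until interrupted. Everything CherryPy provides out of the box (routing by method name, a thread pool, configuration) has to be written by hand here.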
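My clusters already run Ganglia, Cacti and Nagios, but I wanted charts generated in real time, so I wrote this program myself. Anyone doing operations automation can use it as a reference and add their own methods to monitor other things.
The main flow is:
Read the various data from the Linux system ---> parse the data, convert it to JSON and hand it to the HTTP server ---> the monitoring server polls the data and builds the charts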
#!/usr/bin/python
# -*- coding: utf8 -*-
import os
import platform

import cherrypy

# Python 2.4 needs simplejson; Python 2.6 and later ship json.
try:
    import json
except ImportError:
    import simplejson as json


# A placeholder index page.
class Index(object):
    # @cherrypy.expose makes the index method reachable through the HTTP server.
    @cherrypy.expose
    def index(self):
        return "hello cherrypy"


class Node(object):
    # URL: /node/dist/
    # Return the distribution, version, architecture, hostname and package
    # tools of the target machine as JSON.
    @cherrypy.expose
    def dist(self):
        sysinstaller = ''
        installer = ''
        # platform.dist() is available in Python 2; it was removed in Python 3.8.
        ostype = platform.dist()
        if ostype[0] in ['Ubuntu', 'debian', 'ubuntu', 'Debian']:
            sysinstaller = 'apt-get'
            installer = 'dpkg'
        elif ostype[0] in ['SuSE']:
            sysinstaller = 'zypper'
            installer = 'rpm'
        elif ostype[0] in ['CentOS', 'centos', 'redhat', 'RedHat']:
            sysinstaller = 'yum'
            installer = 'rpm'
        dist_json = {'os.system': ostype[0],
                     'os.version': ostype[1],
                     'os.release': ostype[2],
                     'os.sysinstall': sysinstaller,
                     'os.installer': installer,
                     'os.arch': platform.machine(),
                     'os.hostname': platform.node()}
        return json.dumps(dist_json, sort_keys=False, indent=4, separators=(',', ': '))

    # URL: /node/GetCpuInfo/
    # Return the CPU model and related fields as a JSON list, one object per processor.
    @cherrypy.expose
    def GetCpuInfo(self):
        cpu = []
        cpuinfo = {}
        f = open("/proc/cpuinfo")
        lines = f.readlines()
        f.close()
        for line in lines:
            # A blank line ends the block that describes one processor.
            if line == '\n':
                if cpuinfo:
                    cpu.append(cpuinfo)
                cpuinfo = {}
                continue
            if ':' not in line:
                continue
            name, var = line.split(':', 1)
            cpuinfo[name.strip().replace(' ', '_')] = var.strip()
        # Keep the last block if the file does not end with a blank line.
        if cpuinfo:
            cpu.append(cpuinfo)
        return json.dumps(cpu, sort_keys=False, indent=4, separators=(',', ': '))

    # URL: /node/GetMemInfo/
    # Return detailed memory usage as JSON.
    @cherrypy.expose
    def GetMemInfo(self):
        mem = {}
        f = open("/proc/meminfo")
        lines = f.readlines()
        f.close()
        for line in lines:
            if len(line) < 2:
                continue
            name = line.split(':')[0]
            var = line.split(':')[1].split()[0]
            # /proc/meminfo reports sizes in kB; convert them to bytes.
            mem[name] = int(var) * 1024.0
        mem['MemUsed'] = mem['MemTotal'] - mem['MemFree'] - mem['Buffers'] - mem['Cached']
        return json.dumps(mem, sort_keys=False, indent=4, separators=(',', ': '))

    # URL: /node/GetLoadAvg/
    # Return the system load averages as JSON.
    @cherrypy.expose
    def GetLoadAvg(self):
        loadavg = {}
        f = open("/proc/loadavg")
        con = f.read().split()
        f.close()
        loadavg['lavg_1'] = con[0]
        loadavg['lavg_5'] = con[1]
        loadavg['lavg_15'] = con[2]
        # con[3] has the form "running/total" scheduling entities.
        loadavg['nr'] = con[3]
        loadavg['last_pid'] = con[4]
        return json.dumps(loadavg, sort_keys=False, indent=4, separators=(',', ': '))

    # URL: /node/GetIfInfo/eth(x)
    # Return the traffic counters of the given network interface as JSON.
    @cherrypy.expose
    def GetIfInfo(self, interface):
        # Key names follow the 16 counter columns of /proc/net/dev in order.
        # The kernel labels transmit columns 6 to 8 colls, carrier and compressed.
        keys = ['ReceiveBytes', 'ReceivePackets', 'ReceiveErrs', 'ReceiveDrop',
                'ReceiveFifo', 'ReceiveFrames', 'ReceiveCompressed', 'ReceiveMulticast',
                'TransmitBytes', 'TransmitPackets', 'TransmitErrs', 'TransmitDrop',
                'TransmitFifo', 'TransmitFrames', 'TransmitCompressed', 'TransmitMulticast']
        f = open("/proc/net/dev")
        lines = f.readlines()
        f.close()
        intf = {}
        for line in lines[2:]:
            # Under heavy traffic CentOS prints "eth0:12345" with no space after
            # the colon, while Ubuntu and other well-formatted systems print
            # "eth0: 12345". Splitting on the colon first handles both layouts.
            name, counters = line.split(':', 1)
            if name.strip() != interface:
                continue
            intf['interface'] = name.strip()
            for key, value in zip(keys, counters.split()):
                intf[key] = str(value)
        return json.dumps(intf, sort_keys=False)

    # URL: /node/GetIfTraffic/
    # Return the traffic counters of every interface except the loopback.
    @cherrypy.expose
    def GetIfTraffic(self):
        ifs = []
        nettraffic = {}
        f = open("/proc/net/dev")
        lines = f.readlines()
        f.close()
        for line in lines[2:]:
            ifname = line.split(':')[0].strip()
            if ifname != 'lo':
                ifs.append(ifname)
        for interface in ifs:
            # Each value is the JSON string returned by GetIfInfo.
            nettraffic[interface] = self.GetIfInfo(interface)
        return json.dumps(nettraffic)

    # URL: /node/GetHddInfo/
    # Return the disk partitions and their usage as JSON.
    @cherrypy.expose
    def GetHddInfo(self):
        columns = ['file_system', 'type', 'size', 'used', 'avail', 'used_percent', 'mounted_on']
        mount = dict((name, []) for name in columns)
        # sed drops the header line and joins entries that df wrapped onto two lines.
        hdds = os.popen("df -lhT | grep -v tmpfs | grep -v boot | grep -v usr | grep -v tmp | sed '1d;/ /!N;s/\\n//;s/[ ]*[ ]/\\t/g;'").readlines()
        for line in hdds:
            fields = line.split()
            if len(fields) < len(columns):
                continue
            for name, value in zip(columns, fields):
                mount[name].append(value)
        return json.dumps(mount)

    # URL: /node/GetCpuDetail/
    # Return CPU utilisation as JSON. Requires the sysstat package (mpstat).
    @cherrypy.expose
    def GetCpuDetail(self):
        dist = json.loads(self.dist())
        # %idle is field 10 of the mpstat summary line on CentOS/RedHat releases
        # before 6, and field 11 elsewhere.
        idle_field = 11
        if dist['os.system'] in ['CentOS', 'centos', 'redhat', 'RedHat']:
            if int(dist['os.version'].split('.')[0]) < 6:
                idle_field = 10
        # Drop the first four lines of mpstat output and keep the summary line.
        fields = os.popen("mpstat 1 1 | sed '1d;2d;3d;4d'").readline().split()
        cpu = {}
        if len(fields) >= idle_field:
            # Fields are counted from 1 as in awk: user to steal are fields 3 to 9.
            names = ['user', 'nice', 'sys', 'iowait', 'irq', 'soft', 'steal']
            for offset, name in enumerate(names):
                cpu[name] = fields[2 + offset]
            cpu['idle'] = fields[idle_field - 1]
        return json.dumps(cpu)


if "__main__" == __name__:
    # Server configuration
    settings = {
        'global': {
            # Port to bind
            'server.socket_port': 60090,
            # Listen address: use 0.0.0.0 if you consider that safe enough,
            # otherwise write the IP of this particular server
            'server.socket_host': '0.0.0.0',
            'server.socket_file': '',
            'server.socket_queue_size': 100,
            'server.protocol_version': 'HTTP/1.1',
            'server.log_to_screen': True,
            'server.log_file': '',
            'server.reverse_dns': False,
            'server.thread_pool': 200,
            'server.environment': 'production',
            'engine.timeout_monitor.on': False
        }
    }
    # Apply the configuration, mount the routes and start the web server
    cherrypy.config.update(settings)
    cherrypy.tree.mount(Index(), '/')
    cherrypy.tree.mount(Node(), '/node')
    cherrypy.engine.start()
    # Keep the main thread alive until the engine is stopped.
    cherrypy.engine.block()
The chart-generating side can be written in whatever language you like, since all the data is JSON.
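As one possible shape for that side, the sketch below polls the `/node/GetLoadAvg` endpoint on port 60090 (both taken from the script above) and collects a short time series of the one-minute load. The function names `fetch_json` and `poll_loadavg`, the host address and the sampling parameters are assumptions made for this example; the charting itself is left out.

```python
import json
import time

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2


def fetch_json(url, timeout=5):
    """GET a URL and decode its JSON body."""
    resp = urlopen(url, timeout=timeout)
    try:
        return json.loads(resp.read().decode("utf-8"))
    finally:
        resp.close()


def poll_loadavg(host, samples=3, interval=1.0, port=60090,
                 fetch=fetch_json, sleep=time.sleep):
    """Return a list of (timestamp, one-minute load) pairs for one node.

    `fetch` and `sleep` are injectable so the loop can be tested offline.
    """
    url = "http://%s:%d/node/GetLoadAvg" % (host, port)
    series = []
    for i in range(samples):
        data = fetch(url)
        # The agent returns the load averages as strings, e.g. "0.42".
        series.append((time.time(), float(data["lavg_1"])))
        if i < samples - 1:
            sleep(interval)
    return series
```

A call such as `poll_loadavg("192.168.1.10", samples=60)` (the address is a placeholder) would yield one minute of points that can be fed to any charting library.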
I also use it to monitor Hadoop and HBase; it only takes adding some Hadoop- and HBase-related code.
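The Hadoop and HBase handlers themselves are not shown in this post. As a purely hypothetical illustration of what such an addition could look like, the helper below scans `/proc` for processes whose command line contains a keyword, for example the main class name of a daemon. The function name `find_processes` and the choice of keyword are assumptions, not the code actually used in EasyHadoop.

```python
import os


def find_processes(keyword, proc_root="/proc"):
    """Return [{'pid': int, 'cmdline': str}, ...] for matching processes.

    A process matches when `keyword` occurs in its command line, as read
    from <proc_root>/<pid>/cmdline (Linux only).
    """
    matches = []
    for entry in os.listdir(proc_root):
        # Only numeric directory names under /proc are process IDs.
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except (IOError, OSError):
            # The process exited between listdir() and open().
            continue
        # Arguments in cmdline are separated by NUL bytes.
        cmdline = raw.replace(b"\0", b" ").decode("utf-8", "replace").strip()
        if keyword in cmdline:
            matches.append({"pid": int(entry), "cmdline": cmdline})
    return sorted(matches, key=lambda m: m["pid"])
```

Inside the `Node` class, a method decorated with `@cherrypy.expose` could then return `json.dumps(find_processes(...))` with a keyword such as `DataNode` or `HRegionServer`, so the master node can see whether the daemon is running.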
This article comes from the “實踐檢驗真理” blog. Please do not repost.