1.Python mini-project: big data statistics

Last updated: 2017-06-11 Source: Internet

Uploaded by: User



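Big data statistics

1.Project requirement: compute the probability distribution of a single parameter across a massive dataset.

2.Implementation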

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import re


def preprocess(fileName, pattern):
    '''
    Preprocess the dataset, for example by extracting the RSSI column.
    :param fileName: relative path of the input file
    :param pattern:  regular expression used to extract the value
    :return:         path of the region-of-interest (ROI) dataset
    '''
    with open(fileName, 'r', encoding='utf-8') as f, open('laterText.txt', 'w', encoding='utf-8') as f2:
        for line in f:
            result = re.findall(pattern, line)    # e.g. '.*(-\d{2}),'
            if result:
                newContent = result[0] + '\n'
                f2.write(newContent)
    return 'laterText.txt'


def sort(fileName):
    '''
    Read the ROI dataset into a list, sort the list,
    then count how many times each value occurs.
    :param fileName: path of the ROI dataset
    :return:         list of 'value:count' strings, in sorted order
    '''
    s1 = []
    with open(fileName, 'r', encoding='utf-8') as f:
        for line in f:
            parts = line.split()
            if parts:    # skip blank lines
                s1.append(parts[0])
    s1 = sorted(s1)
    # Count exact values. A dict keeps insertion order, so the sorted order
    # is preserved, and values are compared exactly rather than by substring.
    counts = {}
    for i in s1:
        counts[i] = counts.get(i, 0) + 1
    s_result = [value + ':' + str(n) for value, n in counts.items()]
    return s_result


def finalText(list1):
    '''
    Write the counted list to a file so the result is easier to read.
    :param list1: list produced by the counting step
    :return: True
    '''
    with open('result.txt', 'w', encoding='utf-8') as f2:
        for i in list1:
            new_line = i + '\n'
            f2.write(new_line)
    return True


if __name__ == '__main__':
    inputFile = input('Enter a file path:')  # relative path of the input file, e.g. trainText.csv
    pattern = input('Enter a re expression:')  # regular expression, e.g. .*(-\d{2}),
    laterText = preprocess(inputFile, pattern)  # path of the preprocessed file, 'laterText.txt'
    list1 = sort(laterText)  # read the preprocessed file, sort the values, and count each one
    if finalText(list1):  # write the counts to result.txt
        print('Done. See result.txt for the results.')
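The extract-and-count steps above can also be sketched more compactly with collections.Counter from the standard library. This is an alternative sketch, not the original author's code: the helper name count_values and the in-memory sample lines are assumptions made for illustration.

```python
import re
from collections import Counter


def count_values(lines, pattern):
    """Count how often the first regex match occurs across the given lines.

    count_values is a hypothetical helper, not part of the original script.
    """
    regex = re.compile(pattern)
    counts = Counter()
    for line in lines:
        matches = regex.findall(line)
        if matches:
            counts[matches[0]] += 1
    return counts


# Made-up sample rows; a real run would iterate over the input file instead.
sample = ["a,-47,x", "b,-48,y", "c,-48,z", "no match here"]
counts = count_values(sample, r".*(-\d{2}),")
for value, n in sorted(counts.items()):
    print(f"{value}:{n}")
```

Because the counts are held in memory rather than in an intermediate file, this variant skips laterText.txt. Whether that matters depends on how many distinct values the parameter can take.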

3.Demo

-47:1
-48:2
-49:7
-50:7
-51:23
-52:22
-53:33
-54:58
-55:157
-56:81
-57:200
-58:149
-59:214
-60:269
-61:603
-62:256
-63:636
-64:427
-65:525
-66:585
-67:1233
-68:483
-69:1127
-70:654
-71:676
-72:735
-73:1133
-74:432
-75:766
-76:418
-77:411
-78:395
-79:519
-80:184
-81:321
-82:137
-83:146
-84:138
-85:128
-86:110
-87:96
-88:36
-89:38
-90:20
-91:7
-92:11
-93:1
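The demo output lists raw counts, while the stated requirement is a probability distribution. One way to bridge the two is to divide each count by the total. The sketch below assumes input lines in the value:count form written to result.txt; the helper name to_probabilities and the three sample lines are hypothetical and chosen only for illustration.

```python
def to_probabilities(result_lines):
    """Convert 'value:count' lines into a {value: probability} dict.

    to_probabilities is a hypothetical helper, not part of the original script.
    """
    counts = {}
    for line in result_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        value, count = line.rsplit(":", 1)
        counts[value] = int(count)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing to normalize
    return {value: n / total for value, n in counts.items()}


# Three made-up lines; a real run would read every line of result.txt.
sample = ["-47:1", "-48:2", "-49:7"]
probs = to_probabilities(sample)
for value, p in probs.items():
    print(f"{value}: {p:.3f}")
```

The probabilities sum to 1 by construction, which gives a quick sanity check on a real result file.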
