Write a script in Python to extract data from the data file exported by Wireshark

Source: Internet
Author: User

The previous article laid the groundwork for a UDP multicast program. By "groundwork" I mean reading up on the topic until I could write a simple multicast program and get started on the real work.

 

Where does the multicast content come from, and what will be sent? There is a device with no documented communication protocol. The task behind this multicast work is to capture its packets with Wireshark, analyze the protocol, and implement it in code.

 

I started Wireshark, captured the packets, and exported them as a text file of about 3 MB. Searching through it in UltraEdit left me cross-eyed: the device's traffic was mixed in with web browsing, QQ, MSN, and some data I could not identify. It made me dizzy, but the device's communication data did seem to follow a regular structure. I am a programmer, so why not solve this with a program? Extracting the data with Python makes the analysis easier, and the extracted data can also be used for testing.

 

Let's look at the Wireshark export format. Each packet starts with a line beginning with "No.", the following line holds the source and destination IP addresses, and further down comes a data section marked by "Data:". That is regular enough to extract with Python. If the pattern had not been this obvious, I would have been stuck.
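As a rough illustration of that layout, the sketch below splits an exported text into packet blocks, starting a new block at each line that begins with "No.". The function name `split_packets` and the sample text are made up for this example; a real export contains more detail lines between the summary line and the hex dump.

```python
def split_packets(lines):
    """Group lines into packet blocks; each block starts at a line beginning with 'No.'."""
    packets = []
    current = None
    for line in lines:
        if line.startswith("No."):
            current = [line]
            packets.append(current)
        elif current is not None:
            current.append(line)
    return packets


# Hypothetical, shortened sample in the shape described above.
SAMPLE = """\
No.     Time        Source          Destination     Protocol Info
     16 58.5        192.168.0.66    234.5.6.7       UDP      Source port: 1024
Data (40 bytes)
0020  02 00 00 00 33 00 00 00                           ....3...
No.     Time        Source          Destination     Protocol Info
     17 58.6        192.168.0.1     192.168.0.66    TCP      http > 1050
"""

packets = split_packets(SAMPLE.splitlines())
for block in packets:
    # block[1] is the summary line that carries the source and destination IPs.
    fields = block[1].split()
    print(fields[2], "->", fields[3])
```

Once the packets are grouped this way, filtering by IP address only needs to look at the second line of each block.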

 

What the Python program needs to do:

• Extract the packets for a specified IP address and save them to file A. This makes it easier to study the data the device sends and to define the corresponding structures.

• Then extract the data portion from file A and save it as file B. I read file B in a VC program and try to parse the data. If the parsing succeeds, my work is almost done.
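To give an idea of what reading file B involves, here is a minimal sketch in Python (the article's own parser is a VC program, which is not shown). It assumes file B contains only lines of space-separated hex bytes, with a "----" line before each packet, which is the layout the program further down is meant to produce. The name `parse_hex_blocks` is a hypothetical helper.

```python
def parse_hex_blocks(lines):
    """Turn '----'-separated lines of hex bytes into a list of bytes objects."""
    packets = []
    current = None
    for line in lines:
        text = line.strip()
        if text == "----":
            # A separator line starts a new packet.
            current = bytearray()
            packets.append(current)
        elif text and current is not None:
            # bytes.fromhex ignores the spaces between byte values.
            current.extend(bytes.fromhex(text))
    return [bytes(p) for p in packets]


# Assumed sample content of file B.
sample = [
    "----",
    "14 00 00 00 00 42 03 01 00 00 00 00 05 00 00 00",
    "02 00 00 00 33 00 00 00",
    "----",
    "02 00 00 00 33 00 00 00",
]
packets = parse_hex_blocks(sample)
print([len(p) for p in packets])
```

Any line that is not clean hex raises a `ValueError` in `bytes.fromhex`, which is a quick way to notice stray text in file B.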

 

Here is the program.

 

    import fileinput

    # Known bug: when exporting the binary data, the start of a hex dump is found
    # by matching the string "0000". That string also occurs in the time 0.000000
    # of the first packet in the Wireshark export file, so first change 0.000000
    # to 0.000001 by hand. This program only exports data for analysis, so the bug
    # is not worth half a day of fixing.
    #
    # Format of the Wireshark export file:
    # a line starting with "No." marks the start of a packet. The next line holds
    # the IP addresses and protocol information, followed by Wireshark's protocol
    # analysis output, and finally the data.
    #
    #   No.     Time        Source        Destination   Protocol Info
    #        16 58.5        192.168.0.66  234.5.6.7     UDP      Source port: 1024  Destination port: synchronet-db
    #   (Wireshark's protocol analysis output appears here)
    #   Data (40 bytes)
    #   0000  E9 24 00 00 FF 01 00 02 00 00 00 D2 04 00   .$..............
    #   0010  14 00 00 00 00 42 03 01 00 00 00 00 05 00 00 00   .....B..........
    #   0020  02 00 00 00 33 00 00 00                           ....3...
    #
    # -----------------------------------------------------
    # This program reads a Wireshark export file and filters the data of the
    # specified IP addresses.
    # 1. print_data_text exports everything from one "No." line to the next.
    # 2. print_data_bin exports the contents of the data section, one line of hex
    #    bytes per dump line, with a "----" line marking each packet:
    #      ----
    #      E9 24 00 00 FF 01 00 02 00 00 00 D2 04 00
    #      14 00 00 00 00 42 03 01 00 00 00 00 05 00 00 00
    #      02 00 00 00 33 00 00 00
    #      ----
    #      (next packet)
    # 3. print_ip_line exports only the lines that contain the IP addresses,
    #    that is, the line following each "No." line.
    # -----------------------------------------------------
    # How to set the export conditions:
    #   every string in the "and" field must be present;
    #   at least one string in the "or" field must be present;
    #   both the "and" field and the "or" field must be satisfied;
    #   a missing "and" field counts as satisfied;
    #   a missing "or" field counts as satisfied.
    #
    # Example: export the packets whose line after "No." contains "234.5.6.7"
    # and "UDP", and also contains "192.168.0.202" or "192.168.0.22":
    #   No.     Time        Source        Destination   Protocol Info
    #        14 9.949685    192.168.0.66  234.5.6.7     UDP      Source port: onehome-remote  Destination port: synchronet-db
    #
    #   cond = []
    #   cond.append({
    #       "and": ["234.5.6.7", "UDP"],
    #       "or": ["192.168.0.202", "192.168.0.22"],
    #   })
    # -----------------------------------------------------


    # Export the packets that meet the conditions. So that exported data can be
    # traced back to the source file, each line can be prefixed with its line
    # number in the source file.
    # filename:       name of the original data file
    # saveto:         name of the file the exported data is saved to
    # cond:           conditions the packet must meet
    # with_ln_number: whether to prefix each line with its line number;
    #                 print_data_bin with its default settings expects the prefix
    def print_data_text(filename, saveto, cond, with_ln_number=True):
        ln = list(fileinput.input(filename))
        start_flag = 0
        find_count = 0
        with open(saveto, 'w+') as f1:
            for i in range(len(ln)):
                if ln[i].startswith("No."):
                    # A new "No." line ends the packet being exported; the line
                    # after it decides whether the next packet is exported.
                    start_flag = 0
                    if i + 1 < len(ln) and check_expr(ln[i + 1], cond):
                        start_flag = 1
                        find_count = find_count + 1
                        # Progress counter, overwritten in place.
                        print("\b\b%02d" % find_count, end="")

                if start_flag == 1:
                    if with_ln_number:
                        msg = "%08d:\t%s" % (i + 1, ln[i])
                    else:
                        msg = ln[i]
                    f1.write(msg)
        print()


    # Export the contents of the data section of each packet.
    # data_start: condition marking the start (a line containing "0000")
    # data_end:   condition marking the end (a line containing "Data:")
    def print_data_bin(filename, saveto, data_start, data_end, with_ln_number=True):
        ln = list(fileinput.input(filename))
        start_flag = 0
        with open(saveto, 'w+') as f1:
            for i in range(len(ln)):
                if start_flag == 0 and check_expr(ln[i], data_start):
                    start_flag = 1
                    f1.write("----\n")
                if start_flag == 1 and check_expr(ln[i], data_end):
                    start_flag = 0

                if start_flag == 1:
                    # The hex bytes sit in columns 6 to 52 of a dump line; the
                    # 10-character line-number prefix shifts them to 16 to 62.
                    if with_ln_number:
                        msg = "%s\n" % ln[i][16:63]
                    else:
                        msg = "%s\n" % ln[i][6:53]
                    f1.write(msg)


    # Return True if the line satisfies at least one condition in expr.
    def check_expr(ln, expr):
        expr_flag = False
        for x in expr:
            and_flag = True
            or_flag = False
            if 'and' in x:
                for y in x["and"]:
                    if ln.find(y) == -1:
                        and_flag = False
                        break

            if 'or' in x:
                for y in x["or"]:
                    if ln.find(y) != -1:
                        or_flag = True
                        break
            else:
                or_flag = True

            if and_flag and or_flag:
                expr_flag = True
                break
        return expr_flag


    # To make the traffic easier to survey, print only the lines that contain
    # the IP addresses.
    def print_ip_line(filename, saveto, cond):
        ln = list(fileinput.input(filename))
        find_count = 0
        with open(saveto, 'w+') as f1:
            for i in range(len(ln)):
                if ln[i].startswith("No.") and i + 1 < len(ln):
                    if check_expr(ln[i + 1], cond):
                        start_ln = i + 1
                        find_count = find_count + 1
                        print("\b\b%02d" % find_count, end="")
                        f1.write("%08d:\t%s" % (start_ln + 1, ln[start_ln]))
        print()


    if __name__ == '__main__':

        src_data_file = "C:\\ws.txt"
        txt_data_file_11 = "D:\\re_11.txt"
        bin_data_file_11 = "D:\\bin_11.txt"
        txt_data_file_66 = "D:\\re_66.txt"
        bin_data_file_66 = "D:\\bin_66.txt"

        # Export conditions; see the explanation at the top of the file.
        cond = []
        cond.append({
            "and": ["234.5.6.7", "UDP"],
            "or": ["192.168.0.202", "192.168.0.22"],
        })

        getdata_start = []
        getdata_start.append({
            "and": ["0000"],
        })
        getdata_end = []
        getdata_end.append({
            "and": ["Data:"],
        })

        # print getdata_start
        # print_ip_line('C:\\re.txt', cond)
        filter_data_11 = []
        filter_data_11.append({
            "and": ["192.168.0.22", "234.5.6.7", "UDP"],
        })
        filter_data_66 = []
        filter_data_66.append({
            "and": ["192.168.0.202", "234.5.6.7", "UDP"],
        })

        print_data_text(src_data_file, txt_data_file_11, filter_data_11)
        print_data_bin(txt_data_file_11, bin_data_file_11, getdata_start, getdata_end)
        print_data_text(src_data_file, txt_data_file_66, filter_data_66)
        print_data_bin(txt_data_file_66, bin_data_file_66, getdata_start, getdata_end)
        print_data_bin(src_data_file, "D:\\ttxx1.txt", getdata_start, getdata_end, False)
        print_ip_line(src_data_file, "D:\\ttxx2.txt", cond)
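The comments in the program mention a known limitation: looking for the substring "0000" also matches the timestamp 0.000000 of the first packet. One possible way around it, sketched here as an alternative and not as the author's method, is to recognise hex dump lines by their shape with a regular expression: a four-digit hex offset at the start of the line, two spaces, then up to 16 byte values. The pattern and the name `extract_hex` are assumptions for illustration, and the sketch expects lines without the line-number prefix.

```python
import re

# Offset (4 hex digits), two spaces, then 1 to 16 two-digit hex bytes.
HEX_LINE = re.compile(r"^([0-9a-fA-F]{4})  ((?:[0-9a-fA-F]{2} ?){1,16})")


def extract_hex(lines):
    """Return one hex string per packet, built from hex dump lines only."""
    packets = []
    for line in lines:
        m = HEX_LINE.match(line)
        if not m:
            continue
        if m.group(1) == "0000":
            # Offset 0000 marks the first dump line of a new packet.
            packets.append([])
        if packets:
            packets[-1].append(m.group(2).strip())
    return [" ".join(p) for p in packets]


# Hypothetical sample; note the 0.000000 timestamp that trips up substring matching.
sample = [
    "No.     Time        Source        Destination   Protocol Info",
    "      1 0.000000    192.168.0.66  234.5.6.7     UDP      Source port: 1024",
    "Data (24 bytes)",
    "0000  14 00 00 00 00 42 03 01 00 00 00 00 05 00 00 00   .....B..........",
    "0010  02 00 00 00 33 00 00 00                           ....3...",
]
result = extract_hex(sample)
for packet in result:
    print(packet)
```

Because the pattern is anchored at the start of the line, the summary line with the 0.000000 timestamp is skipped without any manual editing of the export file.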

 
