Faster than 12306! Write a Train Ticket Viewer in Python


Author: Wayne Shi
Link: https://zhuanlan.zhihu.com/p/22235740


When you plan a trip and want to check train ticket information, do you still go to the official 12306 website? Here we will use Python to write a command-line train ticket viewer. You only need to type one command to get the ticket information you want!


1. Experiment Introduction

1.1 Knowledge Points

  • Comprehensive application of basic Python3 knowledge

  • Use of the docopt, requests, and prettytable libraries

  • Use of setuptools


1.2 Results


2. Interface Design

Let's name this small application first. Since it queries ticket information, we can call it tickets. We want users to get the information they need by entering only the departure station, the arrival station, and the date. Therefore, tickets should be used as follows:

$ tickets from to date

In addition, there are several types of trains: high-speed trains, EMU (bullet) trains, express trains, fast trains, and direct trains. We want to let users query only one or more specific types, so we should provide the following options:

  • -g high-speed trains

  • -d EMU trains

  • -t express trains

  • -k fast trains

  • -z direct trains

These options can be combined. Therefore, our interface should look like this:

$ tickets [-gdtkz] from to date

The interface is now fixed; what remains is to implement it.


3. Code Implementation

A good practice when writing Python programs is to use virtualenv to create a virtual environment. Our program is developed with Python3. Create a folder named tickets in your working directory, then create a virtual environment inside it and activate it:


$ virtualenv -p /usr/bin/python3 venv
$ . venv/bin/activate


Install the libraries required for the experiment:

$ pip install requests prettytable docopt

  • requests needs no introduction: it is the essential Python library for accessing HTTP resources.

  • docopt, a command-line argument parsing tool for Python3.

  • prettytable, a formatting tool that prints data in tables the way MySQL does.


3.1 Parsing Parameters

Python has many command-line argument parsing tools, such as argparse, docopt, and click. Here we use docopt, which is simple and easy to use.

docopt parses the arguments according to the format defined in the module's docstring. For example, in tickets.py:

# coding: utf-8
"""Train tickets query via command-line.

Usage:
    tickets [-gdtkz] <from> <to> <date>

Options:
    -h,--help   Display the help menu
    -g          High-speed trains
    -d          EMU trains
    -t          Express trains
    -k          Fast trains
    -z          Direct trains

Example:
    tickets shanghai beijing 2017-12-05
"""
from docopt import docopt


def cli():
    """Command-line interface"""
    arguments = docopt(__doc__)
    print(arguments)


if __name__ == '__main__':
    cli()

Run the program:

$ python3 tickets.py shanghai beijing 2017-12-05

We get the following result:

{'-d': False, '-g': False, '-k': False, '-t': False, '-z': False, '<date>': '2017-12-05', '<from>': 'shanghai', '<to>': 'beijing'}
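The parsed dictionary could later drive filtering by train type. The sketch below is one possible approach and not part of the original program: selected_types and filter_rows are hypothetical helpers, the sample rows are made up, and it assumes every train number starts with its type letter (for example G101 or D301).

```python
def selected_types(arguments):
    """Return the upper-case type letters whose flags are set, e.g. {'G', 'D'}."""
    return {
        key[1:].upper()
        for key, value in arguments.items()
        if key.startswith('-') and len(key) == 2 and value is True
    }


def filter_rows(rows, types):
    """Keep rows whose train number starts with a selected letter.

    An empty selection means no filtering. Assumes every row has a
    'station_train_code' key (hypothetical sample data below).
    """
    if not types:
        return list(rows)
    return [row for row in rows if row['station_train_code'][:1] in types]


# Made-up sample input, shaped like the docopt result shown above.
arguments = {'-d': True, '-g': True, '-k': False, '-t': False, '-z': False,
             '<date>': '2017-12-05', '<from>': 'shanghai', '<to>': 'beijing'}
rows = [{'station_train_code': 'G101'},
        {'station_train_code': 'K512'},
        {'station_train_code': 'D301'}]

kept = filter_rows(rows, selected_types(arguments))
print(sorted(selected_types(arguments)))
print([row['station_train_code'] for row in kept])
```

Treating an empty selection as "show everything" keeps the plain "tickets from to date" form working without any flags.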


3.2 Obtaining Data

The parameters have been parsed. Next comes the most important part: obtaining the data. First, open the remaining-ticket query page on 12306. If you use Chrome, press F12 to open the developer tools and select the Network tab. Enter Shanghai to Beijing with the date December 5 in the query form and click Query. In the debugging tool we find that the query system actually requests this URL:

https://kyfw.12306.cn/otn/lcxxcx/query?purpose_codes=ADULT&queryDate=2017-12-05&from_station=SHH&to_station=BJP

The returned data is in JSON format!


Now the problem is simple: we only need to construct the request URL and parse the returned JSON data. However, from_station and to_station in the URL are neither Chinese characters nor pinyin, but codes, while what we want to type is Chinese characters or pinyin. How do we get the codes? Let's look at the page source for clues.

Sure enough, the page contains this link: https://kyfw.12306.cn/otn/resources/js/framework/station_name.js?station_version=1.8955. It appears to contain the Chinese names, pinyin, abbreviations, codes, and other information for all stations. But this information is all packed together, and we only want each station's pinyin and its uppercase letter code. What should we do?


Regular expressions are the answer. Let's write a small script, parse_station.py, to match and extract the information we want:

# coding: utf-8
import re
from pprint import pprint

import requests

url = 'https://kyfw.12306.cn/otn/resources/js/framework/station_name.js?station_version=1.8955'
# verify=False skips certificate verification
text = requests.get(url, verify=False).text
# Each match is a (code, pinyin) pair
stations = re.findall(r'([A-Z]+)\|([a-z]+)', text)
stations = dict(stations)
# Swap keys and values so that the pinyin name maps to the code
stations = dict(zip(stations.values(), stations.keys()))
pprint(stations, indent=4)

Note: When the matches of the regular expression above are converted into a dictionary, the keys are the uppercase letter codes and the values are the pinyin names. That is the opposite of what we need, so we swap the keys and values.
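To see the matching and the key/value swap without touching the network, here is a small sketch run on a made-up sample string. The layout of the sample is an assumption for illustration only; the real station_name.js file is much longer and may differ in detail.

```python
import re

# Illustrative sample only: the exact layout of the real file is assumed here.
sample = '@sha|上海|SHH|shanghai|sh|1@bjp|北京|BJP|beijing|bj|2'

# Each match is a (code, pinyin) tuple: uppercase letters, a '|', then lowercase letters.
pairs = re.findall(r'([A-Z]+)\|([a-z]+)', sample)

# Build the dictionary directly in the direction we need: pinyin -> code.
stations = {pinyin: code for code, pinyin in pairs}

print(pairs)
print(stations)
```

A dictionary comprehension does the swap in one step, which is equivalent to the dict(zip(...)) form as long as the pinyin names are unique.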


Running this script prints a dictionary of all stations, mapping each pinyin name to its uppercase letter code. We redirect the output to stations.py:

$ python3 parse_station.py > stations.py

In stations.py we assign the dictionary to a variable named stations. Then, given the pinyin name of a station, we can look up its letter code in the dictionary:

...
from stations import stations


def cli():
    arguments = docopt(__doc__)
    from_station = stations.get(arguments['<from>'])
    to_station = stations.get(arguments['<to>'])
    date = arguments['<date>']
    # Build the request URL
    url = 'https://kyfw.12306.cn/otn/lcxxcx/query?purpose_codes=ADULT&queryDate={}&from_station={}&to_station={}'.format(
        date, from_station, to_station
    )
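Note that stations.get returns None for a name that is not in the dictionary, which would silently end up in the URL. A minimal sketch of a stricter variant follows; build_query_url is a hypothetical helper, not part of the original program, and it uses urllib.parse.urlencode from the standard library to assemble the query string.

```python
from urllib.parse import urlencode

QUERY_URL = 'https://kyfw.12306.cn/otn/lcxxcx/query'


def build_query_url(stations, from_name, to_name, date):
    """Build the ticket query URL, failing loudly on unknown station names."""
    codes = []
    for name in (from_name, to_name):
        code = stations.get(name)
        if code is None:
            raise ValueError('unknown station: {}'.format(name))
        codes.append(code)
    # Parameter order follows the URL observed in the browser.
    params = {
        'purpose_codes': 'ADULT',
        'queryDate': date,
        'from_station': codes[0],
        'to_station': codes[1],
    }
    return QUERY_URL + '?' + urlencode(params)


# Tiny stand-in for the generated stations dictionary.
sample_stations = {'shanghai': 'SHH', 'beijing': 'BJP'}
print(build_query_url(sample_stations, 'shanghai', 'beijing', '2017-12-05'))
```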

Everything is ready. Let's request this URL to get the data! Here we use the requests library, which provides a very easy-to-use interface:

...
import requests


def cli():
    ...
    # Pass verify=False to skip certificate verification
    r = requests.get(url, verify=False)
    print(r.json())

From the result we can see that the ticket information needs to be extracted one level further:

def cli():
    ...
    r = requests.get(url, verify=False)
    rows = r.json()['data']['datas']
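Indexing with ['data']['datas'] raises an exception whenever the response lacks those keys, for example when a query returns no result. The sketch below shows one defensive alternative; extract_rows is a hypothetical helper and the sample payloads are made up, with only the 'data' and 'datas' key names taken from the code above.

```python
def extract_rows(payload):
    """Return the list under payload['data']['datas'], or [] if it is missing."""
    data = payload.get('data')
    if not isinstance(data, dict):
        return []
    rows = data.get('datas')
    return rows if isinstance(rows, list) else []


# Made-up payloads for illustration; a real response has many more fields.
ok_payload = {'data': {'datas': [{'station_train_code': 'G101'}]}}
empty_payload = {'data': {}}
error_payload = {'data': None}

print(extract_rows(ok_payload))
print(extract_rows(empty_payload))
print(extract_rows(error_payload))
```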

3.3 Parsing Data

We encapsulate a simple class to parse data:

from prettytable import PrettyTable


class TrainCollection(object):

    # Columns: train number, departure/arrival stations, departure/arrival times,
    # duration, first-class seat, second-class seat, soft sleeper, hard sleeper, hard seat
    header = 'train station time duration first second softsleep hardsleep hardsit'.split()

    def __init__(self, rows):
        self.rows = rows

    def _get_duration(self, row):
        """Get the train's running time."""
        duration = row.get('lishi').replace(':', 'h') + 'm'
        if duration.startswith('00'):
            # Less than one hour: drop the leading '00h'
            return duration[3:]
        if duration.startswith('0'):
            return duration[1:]
        return duration

    @property
    def trains(self):
        for row in self.rows:
            train = [
                # Train number
                row['station_train_code'],
                # Departure and arrival stations
                '\n'.join([row['from_station_name'], row['to_station_name']]),
                # Departure and arrival times
                '\n'.join([row['start_time'], row['arrive_time']]),
                # Duration
                self._get_duration(row),
                # First-class seat
                row['zy_num'],
                # Second-class seat
                row['ze_num'],
                # Soft sleeper
                row['rw_num'],
                # Hard sleeper
                row['yw_num'],
                # Hard seat
                row['yz_num']
            ]
            yield train

    def pretty_print(self):
        """The data has been obtained; what remains is to extract the
        information we want and display it. The prettytable library lets
        us format and display data the way the MySQL client does.
        """
        pt = PrettyTable()
        # Set the title of each column
        pt.field_names = self.header
        for train in self.trains:
            pt.add_row(train)
        print(pt)
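The duration formatting is easiest to follow in isolation. Here is a standalone sketch of the same idea, assuming the 'lishi' field is an 'HH:MM' string; format_duration is a hypothetical name used only for this illustration.

```python
def format_duration(lishi):
    """Turn an 'HH:MM' running time into a compact label such as '5h12m'."""
    duration = lishi.replace(':', 'h') + 'm'
    if duration.startswith('00'):
        # Under one hour: drop the leading '00h'
        return duration[3:]
    if duration.startswith('0'):
        # Single-digit hours: drop the leading zero
        return duration[1:]
    return duration


for sample in ('00:30', '05:12', '12:05'):
    print(sample, '->', format_duration(sample))
```

With these inputs the labels are 30m, 5h12m, and 12h05m.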

3.4 Display Results

Finally, we put the steps above together and print the result to the screen:

...

class TrainCollection:
    ...


def cli():
    arguments = docopt(__doc__)
    from_station = stations.get(arguments['<from>'])
    to_station = stations.get(arguments['<to>'])
    date = arguments['<date>']
    # Build the request URL
    url = 'https://kyfw.12306.cn/otn/lcxxcx/query?purpose_codes=ADULT&queryDate={}&from_station={}&to_station={}'.format(
        date, from_station, to_station
    )
    r = requests.get(url, verify=False)
    rows = r.json()['data']['datas']
    trains = TrainCollection(rows)
    trains.pretty_print()


if __name__ == '__main__':
    cli()

That's all for today's experiment. Try it yourself!

