Train tickets for the 2018 Spring Festival go on sale today. You can use Python to grab a ticket and go home for the New Year.
Author: protream
Original article: http://www.jianshu.com/p/f411d7e10c41
Note: This article was put together from articles by protream and marvin.
First, let's look at how to quickly check the remaining train tickets.
When you want to look up train ticket information, do you still go to the official 12306 website, or open the app on your phone? Let's use Python to write a command-line train ticket viewer instead. You only need to type one command to get the ticket information you want. If you have just learned the basics of Python, this makes a good small exercise.
Interface Design
An application is ultimately meant to be used, even if only by yourself. So the first thing to think about is how you want to use it. Let's give this small application a name first. Since it is used to query ticket information, let's call it tickets. We want users to get the information they need just by entering the departure station, the arrival station, and the date, so tickets should be used as follows:
$ tickets from to date
In addition, there are several types of trains: high-speed trains, bullet trains (EMU), express trains, fast trains, and direct trains. We want to let users query only one or more specific types, so we provide the following options:
-g high-speed trains
-d EMU
-t express
-k fast
-z direct
These options can be combined, so our interface should look like this:
$ tickets [-gdtkz] from to date
The interface is now settled; what remains is to implement it.
Development Environment
A good practice when writing Python programs is to use virtualenv to create a virtual environment. Our program is developed with Python 3. Create a folder named tickets in your working directory, then create a virtual environment inside it:
$ virtualenv -p /usr/bin/python3 venv
Run the following command to activate it:
$ . venv/bin/activate
Parsing Arguments
Python has many tools for writing command-line applications, such as argparse, docopt, options... Here we use docopt, which is simple and easy to use. Install it first:
$ pip3 install docopt
docopt parses arguments according to the format defined in the docstring. In tickets.py:
# coding: utf-8
"""Train tickets query by command-line.

Usage:
    tickets [-gdtkz] <from> <to> <date>

Options:
    -h,--help   display help menu
    -g          high-speed trains
    -d          EMU
    -t          express
    -k          fast
    -z          direct

Example:
    tickets Nanjing Beijing 2016-07-01
    tickets -dg Nanjing Beijing 2016-07-01
"""
from docopt import docopt


def cli():
    """command-line interface"""
    arguments = docopt(__doc__)
    print(arguments)


if __name__ == '__main__':
    cli()
Run the program:
$ python3 tickets.py Shanghai Beijing 2017-07-01
The arguments are parsed into the following result:
{'-d': False, '-g': False, '-k': False, '-t': False, '-z': False, '<date>': '2017-07-01', '<from>': 'Shanghai', '<to>': 'Beijing'}
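One way to turn these parsed flags into a train-type filter is sketched below. It assumes the dictionary has the shape printed above; the helper name selected_train_types is hypothetical and is not part of the original program.

```python
# Hypothetical helper: map docopt's boolean flags to train-type letters.
TRAIN_FLAGS = ('-g', '-d', '-t', '-k', '-z')


def selected_train_types(arguments):
    """Return the set of train-type letters the user asked for.

    An empty selection is treated as "no filter", so all types are returned.
    """
    chosen = {flag[1].upper() for flag in TRAIN_FLAGS if arguments.get(flag)}
    return chosen or {flag[1].upper() for flag in TRAIN_FLAGS}


if __name__ == '__main__':
    example = {'-d': True, '-g': True, '-k': False, '-t': False, '-z': False,
               '<date>': '2017-07-01', '<from>': 'Shanghai', '<to>': 'Beijing'}
    print(sorted(selected_train_types(example)))  # ['D', 'G']
```

Treating "no flags" as "all types" is a design choice made here so that a plain query shows every train; the original article does not state what it does in that case.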
Get Data
The arguments have been parsed. Next comes the most important part: getting the data. First, open 12306 and go to the remaining-ticket query page. If you use Chrome, press F12 to open the developer tools and select the Network tab. In the query form, enter Shanghai to Beijing with the date 2016-07-01, then click query. The developer tools show that the query actually requests this URL:
https://kyfw.12306.cn/otn/lcxxcx/query?purpose_codes=ADULT&queryDate=2016-07-01&from_station=SHH&to_station=BJP
The returned result is JSON data! From here the problem is simple: we only need to construct the request URL and parse the returned JSON. But notice that from_station and to_station are not Chinese names but codes, while the user enters Chinese names. How do we get the codes? Let's look through the page source to see what we can find.
Aha! Sure enough, the page source contains this link:
https://kyfw.12306.cn/otn/resources/js/framework/station_name.js?station_version=1.8955
It appears to contain the Chinese names, pinyin, abbreviations, and codes of all stations. We save it as stations.html. However, the information is all run together, and we only want the Chinese names and the uppercase letter codes. What should we do?
Bingo! Regular expressions. Let's write a small script to match and extract what we want. In parse.py:
# coding: utf-8
import re
from pprint import pprint
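
# Read the saved file as UTF-8 so the Chinese names decode correctly
with open('stations.html', 'r', encoding='utf-8') as f:
    text = f.read()
    # A run of Chinese characters, a literal "|", then an uppercase code
    stations = re.findall(r'([\u4e00-\u9fa5]+)\|([A-Z]+)', text)
    pprint(dict(stations), indent=4)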
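To see what the pattern captures, here is a small self-contained sketch that runs it on an in-memory sample instead of the saved file. The sample string only imitates an "@abbr|name|CODE|pinyin|..." layout, which is an assumption about how station_name.js is organized; the function name parse_stations is hypothetical.

```python
import re

# Illustrative sample imitating the assumed layout of station_name.js.
SAMPLE = "var station_names ='@bji|北京|BJP|beijing|bj|0@sha|上海|SHH|shanghai|sh|1';"

# A run of Chinese characters, a literal "|", then a run of uppercase letters.
STATION_RE = re.compile(r'([\u4e00-\u9fa5]+)\|([A-Z]+)')


def parse_stations(text):
    """Return a {Chinese name: code} dict built from every match in text."""
    return dict(STATION_RE.findall(text))


if __name__ == '__main__':
    print(parse_stations(SAMPLE))
```

Because the pattern requires Chinese characters immediately before the "|", the lowercase pinyin and abbreviation fields are skipped without any extra filtering.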
Running this script prints all stations and their uppercase letter codes as a dictionary. We redirect the output into stations.py:
$ python3 parse.py > stations.py
We give the dictionary a name, stations, so that the stations.py file ends up like this:
stations = {
    '一间堡': 'YJT',
    '一面坡': 'YPB',
    ...
    '龙镇': 'LZA',
    '龙骨甸': 'LGM'
}
Now, if you enter the Chinese name of the station, we can directly obtain its letter code from the dictionary:
...
from stations import stations


def cli():
    arguments = docopt(__doc__)
    # Look up the letter code for each station name
    from_station = stations.get(arguments['<from>'])
    to_station = stations.get(arguments['<to>'])
    date = arguments['<date>']
    # Build the URL
    url = 'https://kyfw.12306.cn/otn/lcxxcx/query?purpose_codes=ADULT&queryDate={}&from_station={}&to_station={}'.format(
        date, from_station, to_station
    )
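Note that stations.get() returns None for a name that is not in the dictionary, which would silently put "None" into the URL. A minimal sketch of a stricter URL builder follows; the function name build_query_url and the two-entry demo dictionary are illustrative assumptions, while the endpoint and parameter names are copied from the URL observed earlier.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL observed in the browser's developer tools.
QUERY_ENDPOINT = 'https://kyfw.12306.cn/otn/lcxxcx/query'


def build_query_url(stations, from_name, to_name, date):
    """Build the query URL, failing loudly on an unknown station name."""
    codes = {}
    for label, name in (('from', from_name), ('to', to_name)):
        code = stations.get(name)
        if code is None:
            raise ValueError('unknown {} station: {!r}'.format(label, name))
        codes[label] = code
    # A list of pairs keeps the parameters in the order the site used.
    params = [
        ('purpose_codes', 'ADULT'),
        ('queryDate', date),
        ('from_station', codes['from']),
        ('to_station', codes['to']),
    ]
    return QUERY_ENDPOINT + '?' + urlencode(params)


if __name__ == '__main__':
    demo_stations = {'上海': 'SHH', '北京': 'BJP'}
    print(build_query_url(demo_stations, '上海', '北京', '2016-07-01'))
```

Using urlencode instead of string formatting also takes care of escaping should a parameter ever contain characters that are not URL-safe.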
Everything is ready. Let's request this URL to get the data! Here we use the requests library. Install it first:
$ pip3 install requests
It provides a very easy-to-use interface:
...
import requests


def cli():
    ...
    # Pass verify=False so that the certificate is not verified
    r = requests.get(url, verify=False)
    print(r.json())
From the result we can see that the ticket information sits inside the response and needs to be extracted further:
def cli():
    ...
    r = requests.get(url, verify=False)
    # The list of trains is nested under 'data' -> 'datas'
    rows = r.json()['data']['datas']
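Direct indexing raises an exception if the response does not have the expected shape, for example when the query fails. The sketch below shows a more defensive extraction on a stand-in payload. Only the 'data' -> 'datas' nesting comes from the code above; the key inside each row and the helper name extract_rows are made up for illustration.

```python
import json

# Stand-in for r.text. Only the 'data' -> 'datas' nesting comes from the
# article; the key inside each row is made up for illustration.
SAMPLE_BODY = '{"data": {"datas": [{"station_train_code": "G102"}, {"station_train_code": "T110"}]}}'


def extract_rows(payload):
    """Return the list under payload['data']['datas'], or [] if it is missing."""
    data = payload.get('data')
    if not isinstance(data, dict):
        return []
    rows = data.get('datas')
    return rows if isinstance(rows, list) else []


if __name__ == '__main__':
    rows = extract_rows(json.loads(SAMPLE_BODY))
    print(len(rows))                     # 2
    print(extract_rows({'data': None}))  # []
```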
Display Result
The data has been obtained; what remains is to extract the information we need and display it. The prettytable library lets us format and display data as a table, the way the MySQL client does.
$ pip3 install prettytable
Use it like this:
...
from prettytable import PrettyTable


def cli():
    ...
    # Column headers; a list is used because some headers contain spaces
    headers = ['train', 'station', 'time', 'duration', 'business',
               'first class', 'second class', 'soft sleeper',
               'hard sleeper', 'soft seat', 'no seat']
    pt = PrettyTable()
    pt.field_names = headers
    for row in rows:
        # Pick the values named in headers out of row, then call
        # pt.add_row() to add them to the table
        ...
    print(pt)
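The filtering step left open above could look roughly like the following sketch. It uses only the standard library and produces one list of cell values per train, each of which could then be passed to pt.add_row(). The key names in COLUMNS and the helper name build_table_rows are assumptions; the real keys have to be read off the JSON the site returns.

```python
# Hypothetical mapping from column header to the key holding that value in a
# row; the real key names must be read off the JSON the site returns.
COLUMNS = [
    ('train', 'station_train_code'),
    ('time', 'start_time'),
    ('duration', 'lishi'),
]


def build_table_rows(rows, wanted_types=None):
    """Turn row dicts into lists of cell values, optionally filtered by type.

    wanted_types is a set of initial letters such as {'G', 'D'}; None or an
    empty set keeps every row. Missing values are shown as '--'.
    """
    table_rows = []
    for row in rows:
        code = row.get('station_train_code', '')
        if wanted_types and code[:1] not in wanted_types:
            continue
        table_rows.append([row.get(key, '--') for _, key in COLUMNS])
    return table_rows


if __name__ == '__main__':
    sample = [
        {'station_train_code': 'G102', 'start_time': '06:30', 'lishi': '05:48'},
        {'station_train_code': 'T110', 'start_time': '18:02'},
    ]
    for cells in build_table_rows(sample, {'G'}):
        print(cells)  # each list could be passed to pt.add_row(cells)
```

Keeping the header-to-key mapping in one list means the column order of the table and the order of the extracted values cannot drift apart.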
Next, let's look at how to grab tickets automatically.
Source: Python Chinese Community (account ID: python-china)
Author: marvin, Internet practitioner, now lives in Zhangjiang, Shanghai
You can use Python + Splinter to refresh the query automatically and grab a ticket. (How well this works depends heavily on your network connection and on your machine.)
Splinter is an open-source web application testing tool written in Python. It lets you browse a site and interact with it automatically. When a Splinter script runs, it opens the browser you specify and visits the URL you specify; every simulated action you scripted is then carried out automatically. You just sit in front of your computer, watch the actions play out on screen like a movie, and collect the result.
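The "refresh until a ticket appears" part of such a script can be sketched independently of the browser. The example below uses only the standard library; poll_until and its parameters are hypothetical names, and the check function stands in for whatever browser actions reload the page and look for a bookable train.

```python
import time


def poll_until(check, interval=1.0, max_attempts=10, sleep=time.sleep):
    """Call check() until it returns a truthy value or attempts run out.

    Returns the truthy result, or None when max_attempts is exhausted.
    The sleep function is injectable so the loop can be tested without
    actually waiting.
    """
    for attempt in range(1, max_attempts + 1):
        result = check()
        if result:
            return result
        if attempt < max_attempts:
            sleep(interval)
    return None


if __name__ == '__main__':
    # Stand-in for "refresh the page and look for a bookable train":
    # pretend a ticket shows up on the third refresh.
    answers = iter([None, None, 'G102'])
    print(poll_until(lambda: next(answers), interval=0.01))  # G102
```

Capping the number of attempts and pausing between them keeps the script from hammering the site indefinitely.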
Python code snippet for 12306 ticket snatching
1. Functions for Automatic Logon:
2. Functions for ticket purchase
Source code download: https://pan.baidu.com/s/1eSClOXW