I recently came across a train ticket query tool written in Python and implemented it myself. I learned a lot from it, so below I share each step in detail. (Note that Python 3 is used.)
First, the final result.
Run at the cmd command line: python tickets.py -dk shanghai chengdu 20161007 > result.txt
This queries the D and K trains from Shanghai to Chengdu on 2016-10-07 and saves the result, a table of the matching trains, to the file result.txt.
The implementation steps are as follows:
1, Install the third-party libraries with pip install: requests, docopt, prettytable
2, docopt parses the arguments entered on the command line:
"""
Usage:
    test [-gdtkz] <from> <to> <date>

Options:
    -h, --help    Show this help menu
    -g            High-speed rail (G trains)
    -d            Bullet train (D trains)
    -t            Express (T trains)
    -k            Fast (K trains)
    -z            Direct (Z trains)

Example:
    tickets -gdt beijing shanghai 2016-08-25
"""
import docopt

args = docopt.docopt(__doc__)
print(args)

# The docstring above must contain:
# Usage:
#     test [-gdtkz] <from> <to> <date>
# This part is required. The program name "test" can be anything and does not affect parsing.
The final print outputs a dictionary, which is easy to use later.
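For a command line such as `-dk shanghai chengdu 20161007`, the dictionary has one key per option and per positional argument. The sketch below uses a dictionary written by hand to show that shape (it is not captured docopt output), and `selected_types` is a hypothetical helper name used only for illustration.

```python
# Hand-written stand-in for the dictionary docopt returns for
# "-dk shanghai chengdu 20161007"; an assumption for illustration.
args = {
    '-d': True, '-g': False, '-k': True, '-t': False, '-z': False,
    '<date>': '20161007', '<from>': 'shanghai', '<to>': 'chengdu',
}


def selected_types(args):
    """Return the upper-case letters of the options that are switched on."""
    flags = ['-g', '-d', '-t', '-k', '-z']
    return [flag[-1].upper() for flag in flags if args[flag]]


print(selected_types(args))
print(args['<from>'], args['<to>'], args['<date>'])
```

Options are read as booleans and positional arguments as strings, which is all the later steps rely on.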
3, Get the train information
We use the remaining-ticket query interface of 12306:
URL: https://kyfw.12306.cn/otn/lcxxcx/query?purpose_codes=ADULT&queryDate=2016-10-05&from_station=CDW&to_station=SHH
Method: GET
Parameters sent: queryDate: 2016-10-05, from_station: CDW, to_station: SHH
The station code for each city has to be looked up through a separate interface.
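The query URL can be assembled from its parameters instead of being typed by hand. This is a minimal sketch using only the standard library; `build_query_url` is a hypothetical helper name, and the capitalisation of `ADULT`, `queryDate` and the station codes is an assumption about what the interface expects.

```python
from urllib.parse import urlencode

BASE_URL = 'https://kyfw.12306.cn/otn/lcxxcx/query'


def build_query_url(date, from_code, to_code):
    """Build the remaining-ticket query URL (hypothetical helper)."""
    # A list of pairs keeps the parameters in a fixed order.
    params = [
        ('purpose_codes', 'ADULT'),
        ('queryDate', date),
        ('from_station', from_code),
        ('to_station', to_code),
    ]
    return BASE_URL + '?' + urlencode(params)


url = build_query_url('2016-10-05', 'CDW', 'SHH')
print(url)
```

Only the date and the two station codes change between queries, so they are the only arguments.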
3.1 Query the station code of each city:
The URL of this interface: url = 'https://kyfw.12306.cn/otn/resources/js/framework/station_name.js?station_version=1.8968'
The method is GET. A regular expression is applied to the returned text to extract the station name and its code. The returned text looks like 7@cqn|Chongqing South|CRW|chongqingnan|cqn|, and what we need is the code CRW and the name chongqingnan. The code is as follows:
parse_stations.py:
The pprint module is used to print the information in a form that is easier to read.
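A minimal sketch of the parsing step follows. It assumes a regular expression that matches an upper-case station code followed by the lower-case pinyin name, and it runs on a hand-written sample modelled on the snippet above instead of downloading station_name.js. The sample entries and the name `parse_stations` are illustrative assumptions.

```python
import re
from pprint import pprint

# Hand-written sample in the shape of the returned text; the real data
# would come from a GET request to the station_name.js URL.
SAMPLE_TEXT = ("var station_names ='@cqn|Chongqing South|CRW|chongqingnan|cqn|7"
               "@sha|Shanghai|SHH|shanghai|sh|8"
               "@cdu|Chengdu|CDW|chengdu|cd|9';")


def parse_stations(text):
    """Map each pinyin station name to its station code."""
    # Each match is a (code, name) pair such as ('CRW', 'chongqingnan').
    pairs = re.findall(r'([A-Z]+)\|([a-z]+)', text)
    return {name: code for code, name in pairs}


stations = parse_stations(SAMPLE_TEXT)
pprint(stations, indent=4)
```

Keying the dictionary by name means a station typed on the command line can be turned into its code with a single lookup.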
Run in cmd: python parse_stations.py > stations.py
This creates a stations.py file in the current directory containing the station names and their codes. Add "stations = " at the start of the file so that its content becomes a dictionary named stations, which makes later lookups easy.
3.2 The parameters for the train query are now ready. The next step is to fetch the returned train data and parse out the information we need, such as the train number, the number of first-class tickets, and so on. myprettytable.py:
# coding=utf-8
from prettytable import PrettyTable


class TrainCollection(object):
    """Parse and display the train information."""

    # Columns: serial number, train number, departure/arrival station,
    # departure/arrival time, duration, then the tickets left per seat class
    header = ('No. Train From/To Depart/Arrive Duration Business First-class '
              'Second-class Soft-sleeper Hard-sleeper Hard-seat No-seat').split()

    def __init__(self, rows, traintypes):
        self.rows = rows
        self.traintypes = traintypes

    def _get_duration(self, row):
        """Get the running time of the train."""
        # 'lishi' holds the duration as 'HH:MM'
        hours, minutes = row.get('lishi').split(':')
        if int(hours) == 0:
            return minutes + ' min'
        return '{0} h {1} min'.format(int(hours), minutes)

    @property
    def trains(self):
        result = []
        flag = 0
        for row in self.rows:
            # Keep only the trains whose type letter was requested
            if row['station_train_code'][0] in self.traintypes:
                flag += 1
                train = [
                    # Serial number
                    flag,
                    # Train number
                    row['station_train_code'],
                    # Departure/arrival station
                    '/'.join([row['from_station_name'], row['to_station_name']]),
                    # Departure/arrival time
                    '/'.join([row['start_time'], row['arrive_time']]),
                    # Duration
                    self._get_duration(row),
                    # Business seat
                    row['swz_num'],
                    # First-class seat
                    row['zy_num'],
                    # Second-class seat
                    row['ze_num'],
                    # Soft sleeper
                    row['rw_num'],
                    # Hard sleeper
                    row['yw_num'],
                    # Hard seat
                    row['yz_num'],
                    # No seat
                    row['wz_num'],
                ]
                result.append(train)
        return result

    def print_pretty(self):
        """Print the train information as a table."""
        pt = PrettyTable()
        pt.field_names = self.header
        for train in self.trains:
            pt.add_row(train)
        print(pt)


if __name__ == '__main__':
    # Demo with no data: prints only the table header
    TrainCollection([], ['G', 'D', 'T', 'K', 'Z']).print_pretty()
The prettytable library prints data as a table, in a format similar to the output of a MySQL query.
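The duration handling in _get_duration can be tried on its own. The sketch below is one possible way to write it, assuming the 'lishi' field always has the form 'HH:MM'; `format_duration` is a hypothetical standalone name.

```python
def format_duration(lishi):
    """Turn an 'HH:MM' duration into a short readable string."""
    hours, minutes = lishi.split(':')
    if int(hours) == 0:
        # Under one hour: show the minutes only.
        return '{0} min'.format(minutes)
    # int() drops a leading zero from the hour part.
    return '{0} h {1} min'.format(int(hours), minutes)


for value in ('00:45', '05:30', '12:05'):
    print(value, '->', format_duration(value))
```

Splitting on the colon avoids slicing the string at fixed offsets, which would depend on the exact length of the unit text that is inserted.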
4, Next, integrate the modules: tickets.py
"""Train tickets query via command-line.

Usage:
    tickets [-gdtkz] <from> <to> <date>

Options:
    -h, --help    Show this help menu
    -g            High-speed rail (G trains)
    -d            Bullet train (D trains)
    -t            Express (T trains)
    -k            Fast (K trains)
    -z            Direct (Z trains)

Example:
    tickets -gdt beijing shanghai 2016-08-25
"""
import requests
from docopt import docopt

from stations import stations
# from pprint import pprint
from myprettytable import TrainCollection


class SelectTrain(object):

    def __init__(self):
        """Get the command-line input parameters."""
        # docopt parses all command-line arguments and returns a dictionary
        self.args = docopt(__doc__)

    def cli(self):
        """Command-line interface."""
        # Look up the codes of the departure and destination stations
        from_station = stations.get(self.args['<from>'])  # departure station
        to_station = stations.get(self.args['<to>'])  # destination station
        leave_time = self._get_leave_time()  # departure date
        # Build the URL of the train information request
        url = ('https://kyfw.12306.cn/otn/lcxxcx/query?purpose_codes=ADULT&'
               'queryDate={0}&from_station={1}&to_station={2}').format(
                   leave_time, from_station, to_station)
        # Fetch the query result (verify=False skips certificate verification)
        r = requests.get(url, verify=False)
        # Decode the JSON response and take out 'datas' for the analysis below
        traindatas = r.json()['data']['datas']
        # Analyse the train information
        traintypes = self._get_traintype()
        views = TrainCollection(traindatas, traintypes)
        views.print_pretty()

    def _get_traintype(self):
        """Get the train types to show.

        With -g only high-speed trains are returned, with -gd both D and G
        trains are returned, and with no option all train types are returned.
        """
        traintypes = ['-g', '-d', '-t', '-k', '-z']
        # result = []
        # for traintype in traintypes:
        #     if self.args[traintype]:
        #         result.append(traintype[-1].upper())
        trains = [traintype[-1].upper() for traintype in traintypes
                  if self.args[traintype]]
        if trains:
            return trains
        return ['G', 'D', 'T', 'K', 'Z']

    def _get_leave_time(self):
        """Get the departure date.

        Two input formats are accepted: 2016-10-05 and 20161005.
        """
        leave_time = self.args['<date>']
        if len(leave_time) == 8:
            return '{0}-{1}-{2}'.format(leave_time[:4], leave_time[4:6],
                                        leave_time[6:])
        if '-' in leave_time:
            return leave_time


if __name__ == '__main__':
    cli = SelectTrain()
    cli.cli()
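_get_leave_time accepts the two date formats but does not check that the value is a real date. As an alternative sketch (not the method used in tickets.py), the standard datetime module can normalise and validate the date in one step; `normalize_date` is a hypothetical name.

```python
from datetime import datetime


def normalize_date(text):
    """Accept 20161005 or 2016-10-05 and return 2016-10-05."""
    for fmt in ('%Y%m%d', '%Y-%m-%d'):
        try:
            # strptime raises ValueError when the text does not fit fmt
            return datetime.strptime(text, fmt).strftime('%Y-%m-%d')
        except ValueError:
            continue
    raise ValueError('unsupported date: {0!r}'.format(text))


print(normalize_date('20161005'))
print(normalize_date('2016-10-05'))
```

With this variant a mistyped date fails with a clear error before any request is sent.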
That is basically everything. Run the command shown at the beginning and you can query the information you want.