Parsing SRT in Python

Source: Internet
Author: User

Desperate Housewives season 1 has only Chinese subtitles on iQiYi, which is a drawback for anyone who wants to practice English listening. I could not find a suitable tool online for displaying external subtitles. I happened to be learning Python recently, so I decided to write one myself.

You have to start somewhere. My plan is as follows:
1. Parse the SRT file;
2. Extract the timing information and the text to be displayed. This is the most important part. The best way is to use Python regular expressions to extract the relevant information;
3. Call the pyosd display function to show the text, similar to the lyrics display of the QQ Music player.
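
Step 2 can be sketched with a single regular expression. This is only a minimal illustration written for Python 3 and the standard library; the names TIMESTAMP_RE, to_milliseconds and parse_timestamp_line are made up for this example and are not part of the script shown later.

```python
import re

# Matches a timing line such as "00:00:01,500 --> 00:00:04,250".
# The pattern name and the helpers below are illustrative only.
TIMESTAMP_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
)


def to_milliseconds(hours, minutes, seconds, millis):
    """Convert the four captured fields to a single millisecond count."""
    total_seconds = (int(hours) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)


def parse_timestamp_line(line):
    """Return (start_ms, end_ms), or None if the line is not a timing line."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        return None
    fields = match.groups()
    return to_milliseconds(*fields[:4]), to_milliseconds(*fields[4:])
```

Keeping milliseconds rather than whole seconds avoids losing the fractional part of each timestamp, which matters once the display has to be synchronized with the video.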

For more information about SRT, see http://en.wikipedia.org/wiki/subrip. Because I often work with external subtitles at work, I already have some understanding of SRT.
The SubRip file format is "perhaps the most basic of all subtitle formats."[10] SubRip files are named with the extension .srt, and contain formatted plain text. The time format used is hours:minutes:seconds,milliseconds. The decimal separator used is the comma, since the program was written in France. The line break used is often the CR+LF pair. Subtitles are numbered sequentially, starting at 1.

Subtitle number // the index, i.e. the sequence number of the subtitle
Start time --> End time // the start and end time; the duration can be calculated from them
Text of subtitle (one or more lines) // the subtitle text
Blank line [11][10] // a blank line that ends the entry
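
The four-part layout above can be turned into a small block parser. The following is a sketch in Python 3 under the assumption that the file follows this layout exactly; Cue and parse_srt are hypothetical names chosen for the example, and the timestamps are kept as plain strings.

```python
import re
from collections import namedtuple

# One subtitle entry: number, start/end timestamps (as strings) and text.
Cue = namedtuple("Cue", "index start end text")


def parse_srt(content):
    """Split SRT text into Cue records, skipping malformed blocks."""
    # Drop a leading byte-order mark and normalize CR+LF line breaks.
    content = content.lstrip("\ufeff").replace("\r\n", "\n")
    cues = []
    # Entries are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.split("\n")
        if len(lines) < 2 or "-->" not in lines[1]:
            continue
        if not lines[0].strip().isdigit():
            continue
        start, end = (part.strip() for part in lines[1].split("-->"))
        cues.append(Cue(int(lines[0]), start, end, "\n".join(lines[2:])))
    return cues
```

Parsing block by block, rather than guessing the role of each line in isolation, keeps multi-line subtitles together and avoids mistaking a subtitle that begins with a digit for an index line.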

The following is my implementation. The code is very rough and I am still modifying it, but some of the functionality already works:

    import re
    import pyosd
    import sys
    import getopt
    import time


    class srtparsing(object):

        def __init__(self):
            self.index = 0
            self.duration = 0
            # One shared OSD object, so that the timeout set while parsing a
            # timing line applies to the subtitle text that follows it.
            self.osd = pyosd.osd()
            self.osd.set_pos(pyosd.POS_BOT)
            self.osd.set_colour("yellow")
            self.osd.set_align(1)
            # self.osd.set_shadow_offset(10)
            self.osd.set_vertical_offset(100)
            print(time.time())

        def srtgetindex(self, line):
            # A subtitle number is a line that contains digits only.
            reg = re.compile(r'^\d+\s*$')
            if reg.search(line):
                self.index = int(line)
                print(line)

        def srtgettimestamp(self, line):
            reg = re.compile(r'-->')
            if reg.search(line):
                print(line)
                # Named "stamps" so that the time module is not shadowed.
                stamps = line.split('-->')

                # End time: "HH:MM:SS,mmm"
                hour_end = stamps[1].split(':')
                minute_end = int(hour_end[1])
                sec_end = hour_end[2].split(',')
                hour_end = int(hour_end[0])
                mis_end = int(sec_end[1])
                sec_end = int(sec_end[0])
                print("end --> H: %d M: %d S: %d, MIS: %d"
                      % (hour_end, minute_end, sec_end, mis_end))

                # Start time: "HH:MM:SS,mmm"
                hour_start = stamps[0].split(':')
                minute_start = int(hour_start[1])
                sec_start = hour_start[2].split(',')
                hour_start = int(hour_start[0])
                mis_start = int(sec_start[1])
                sec_start = int(sec_start[0])

                # Whole seconds only; the milliseconds are not used yet.
                time_start = hour_start * 60 * 60 + minute_start * 60 + sec_start
                print("Start Time: %d" % time_start)
                time_end = hour_end * 60 * 60 + minute_end * 60 + sec_end
                print("End Time: %d" % time_end)
                self.duration = time_end - time_start
                print(self.duration)
                self.osd.set_timeout(self.duration)

        def srtgetsubinfo(self, line):
            # Treat any line that starts with a letter as subtitle text.
            reg = re.compile(r'^[A-Za-z]')
            if reg.search(line):
                print(line)
                self.osd.display(line.strip())
                self.osd.wait_until_no_display()


    if __name__ == "__main__":
        srt = srtparsing()
        f = open("/home/workspace/subtitle/src/DH.srt")
        for line in f:
            srt.srtgettimestamp(line)
            srt.srtgetsubinfo(line)
        f.close()

The next step is to control the timing: obtain the current time from the system and compare it with the timestamps in the file, so that each subtitle is displayed at precisely the right moment.
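
One possible shape for that timing loop is sketched below in Python 3. It is an assumption about how the comparison could be done, not the finished design: play, show and hide are hypothetical names, the cues are assumed to be (start, end, text) tuples in seconds sorted by start time, and the clock and sleep functions are parameters so the loop can be exercised without actually waiting.

```python
import time


def play(cues, show, hide, clock=time.monotonic, sleep=time.sleep):
    """Show each cue at its start time and hide it at its end time.

    cues: iterable of (start_seconds, end_seconds, text), sorted by start.
    show/hide: callbacks supplied by the caller (hypothetical interface).
    """
    origin = clock()  # moment that corresponds to 00:00:00,000
    for start, end, text in cues:
        # Wait until the elapsed time reaches the cue's start time.
        delay = start - (clock() - origin)
        if delay > 0:
            sleep(delay)
        show(text)
        # Recompute against the clock so that delays do not accumulate.
        delay = end - (clock() - origin)
        if delay > 0:
            sleep(delay)
        hide()
```

Measuring every wait against one fixed origin, instead of sleeping for each duration in turn, keeps small delays in the display calls from adding up over a whole episode. In the real script, show would presumably wrap the pyosd display call.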
