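Lately I've been watching Desperate_Housewives_-_Season_1, but Qiyi (奇藝) only offers Chinese subtitles, which is a real drawback for those of us who want to practice English listening. I searched all over the web and couldn't find a suitable tool for displaying external subtitles. Since I happen to be learning Python at the moment, I figured it was better to rely on myself than on others, so I decided to build one.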
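Everything needs a plan. Mine is as follows:
1. Parse the SRT file format;
2. Extract the timing information and the text to display. This is the most important part, and the best approach is to use Python's regular expressions (the re module) to pull out the relevant fields;
3. Call pyosd to display the text, similar to the lyrics display in the QQ Music player.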
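As a minimal sketch of step 2, assuming the timing line has the form `hh:mm:ss,mmm --> hh:mm:ss,mmm`, a single regular expression can capture all eight numeric fields at once. The names `TIMING_RE`, `to_ms`, and `parse_timing` are hypothetical helpers made up for illustration; they are not part of pyosd or of the script in this post.

```python
import re

# Hypothetical helper names; the pattern assumes "hh:mm:ss,mmm --> hh:mm:ss,mmm".
TIMING_RE = re.compile(
    r'(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})'
)


def to_ms(h, m, s, ms):
    """Convert hour/minute/second/millisecond fields to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_timing(line):
    """Return (start_ms, end_ms) for a timing line, or None for any other line."""
    match = TIMING_RE.search(line)
    if match is None:
        return None
    fields = match.groups()
    return to_ms(*fields[:4]), to_ms(*fields[4:])


print(parse_timing("00:00:20,000 --> 00:00:24,400"))  # prints (20000, 24400)
```

Keeping the milliseconds means the display duration does not have to be rounded down to whole seconds later on.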
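For a description of SRT, see http://en.wikipedia.org/wiki/SubRip. I also deal with external subtitles regularly at work, so I already have some familiarity with the format. Wikipedia describes it as follows: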
The SubRip file format is "perhaps the most basic of all subtitle formats."[10] SubRip files are named with the extension .srt, and contain formatted plain text. The time format used is hours:minutes:seconds,milliseconds. The decimal separator used is the comma, since the program was written in France. The line break used is often the CR+LF pair. Subtitles are numbered sequentially, starting at 1.
Subtitle number //equivalent to an index; marks the subtitle's sequence number
Start time --> End time //start and end times; the duration can be computed from them
Text of subtitle (one or more lines) //the subtitle text
Blank line[11][10] //blank line
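The four-part layout above can be sketched as a small block splitter. This is only an illustration under the assumption that the input follows the layout exactly (index line, timing line, text lines, blank line); `parse_srt` and the dictionary keys are hypothetical names, and the timing line is kept as a raw string rather than decoded.

```python
import re


def parse_srt(text):
    """Split SRT text into cues: dicts with 'index', 'timing', and 'text' keys."""
    # Normalise CR+LF (and bare CR) line breaks to LF first.
    text = text.replace('\r\n', '\n').replace('\r', '\n')
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r'\n\s*\n', text.strip()):
        lines = block.split('\n')
        # Skip anything that does not start with a number followed by a timing line.
        if len(lines) < 2 or not lines[0].strip().isdigit() or '-->' not in lines[1]:
            continue
        cues.append({
            'index': int(lines[0].strip()),
            'timing': lines[1].strip(),
            'text': lines[2:],
        })
    return cues


sample = (
    "1\r\n00:00:20,000 --> 00:00:24,400\r\nHello there.\r\nSecond line.\r\n\r\n"
    "2\r\n00:00:24,600 --> 00:00:27,800\r\nGoodbye.\r\n"
)
for cue in parse_srt(sample):
    print(cue['index'], cue['timing'], cue['text'])
```

Working block by block, instead of line by line, keeps multi-line subtitles together and avoids guessing what kind of line each one is.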
Here is the code. It is very rough and I'm still revising it; only part of the functionality is implemented so far:
import re
import pyosd


class srtParsing():
    def __init__(self):
        self.duration = 0
        # One shared OSD object, so the timeout set from a timing line
        # applies to the subtitle text displayed right after it.
        self.osd = pyosd.osd()
        self.osd.set_pos(pyosd.POS_BOT)
        self.osd.set_colour("YELLOW")
        self.osd.set_align(1)
        #self.osd.set_shadow_offset(10)
        self.osd.set_vertical_offset(100)

    def srtGetIndex(self, line):
        # An index line consists of digits only.
        if re.match(r'^\d+\s*$', line):
            print(line.strip())

    def srtToSeconds(self, stamp):
        # "hh:mm:ss,mmm" -> whole seconds; the milliseconds are ignored for now.
        hour, minute, rest = stamp.strip().split(':')
        sec = rest.split(',')[0]
        return int(hour) * 60 * 60 + int(minute) * 60 + int(sec)

    def srtGetTimeStamp(self, line):
        if '-->' in line:
            stamps = line.split('-->')
            time_start = self.srtToSeconds(stamps[0])
            time_end = self.srtToSeconds(stamps[1])
            print("start time: %d" % time_start)
            print("end time: %d" % time_end)
            # Show for at least one second, even if both stamps fall in the same second.
            self.duration = max(time_end - time_start, 1)
            print(self.duration)
            self.osd.set_timeout(self.duration)

    def srtGetSubInfo(self, line):
        # Still only matches subtitle lines that start with a letter.
        if re.search(r'^[a-zA-Z]', line):
            print(line.strip())
            self.osd.display(line.strip())
            self.osd.wait_until_no_display()


if __name__ == "__main__":
    srt = srtParsing()
    with open("/home/workspace/subtitle/src/dh.srt") as f:
        for line in f:
            srt.srtGetTimeStamp(line)
            srt.srtGetSubInfo(line)
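The next step is timing control: the program needs to read the system time and compare it against the timestamps so that the display can be controlled precisely.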
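One possible shape for that timing loop, offered as a sketch and not as the finished design: record the system time when playback starts, then compare the elapsed time against each cue's start and end offsets. The function `play` and its `show`, `hide`, `clock`, and `sleep` parameters are all hypothetical names; `show` and `hide` stand in for whatever display calls end up being used, and the cues are assumed to be `(start_s, end_s, text)` tuples sorted by start time.

```python
import time


def play(cues, show, hide, clock=time.time, sleep=time.sleep):
    """Show each (start_s, end_s, text) cue at its offset from the moment play() starts.

    show and hide are caller-supplied callbacks; clock and sleep are injectable
    so the loop can be exercised without real waiting.
    """
    origin = clock()
    for start_s, end_s, text in cues:
        # Wait until the cue's start offset, measured against the system clock.
        delay = start_s - (clock() - origin)
        if delay > 0:
            sleep(delay)
        show(text)
        # Keep the text visible until the cue's end offset.
        delay = end_s - (clock() - origin)
        if delay > 0:
            sleep(delay)
        hide()
```

Because every wait is recomputed from the clock instead of summing durations, small delays in displaying one subtitle do not accumulate across the rest of the file.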