Capturing the XML file is easy: look at the source of the playback page to find the URL of the XML file, then download it.
This article mainly describes how to convert that XML into subtitles.
Apart from a few things at the start and end of the XML file, the body of the bullet screen data looks like this:
<d p="51.593,576008622,16742580,1408852480,0,7fa769b4,">try again!</d>
<d p="10.286,576011065,16777215,1408852600,0,a3af4d0d,">Yan Yi?</d>
<d p="12.65,576014281,16777215,1408852761,0,24570b5a,">my goddess!</d>
<d p="19.033,576014847,16777215,1408852789,0,cb20d1c7,">before !!!</d>
<d p="66.991,576016806,16777215,1408852886,0,a78e484d,">renewed</d>
If the XML expressed each attribute of a bullet screen separately, I would decode it with the encoding/xml package. But all the attributes are packed into p, so I use a regular expression to extract them.
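For comparison, here is a minimal sketch of what decoding with encoding/xml could look like. It is not the approach used in this article. The type names `comment` and `document`, the root element `<i>`, and the sample values are assumptions made for illustration. Note that the p attribute still has to be split by hand afterwards, which is why a single regular expression ends up being shorter.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// comment mirrors one <d> element: the packed p attribute and the text.
type comment struct {
	P    string `xml:"p,attr"`
	Text string `xml:",chardata"`
}

// document is a hypothetical wrapper. It declares no XMLName field,
// so Unmarshal does not check the name of the root element.
type document struct {
	Comments []comment `xml:"d"`
}

// decode returns every <d> element directly under the root.
func decode(data []byte) ([]comment, error) {
	var doc document
	if err := xml.Unmarshal(data, &doc); err != nil {
		return nil, err
	}
	return doc.Comments, nil
}

func main() {
	// Assumed sample with all eight fields present.
	const sample = `<i><d p="12.65,1,25,16777215,1408852761,0,24570b5a,576014281">my goddess!</d></i>`
	cs, err := decode([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, c := range cs {
		// The attributes are still one comma-separated string.
		fmt.Println(strings.Split(c.P, ","), c.Text)
	}
}
```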
Take the first bullet screen above as an example. The floating-point number at the start of the p attribute can be compared with the playback position; it is clearly the time at which the bullet screen appears.
Skip the next two fields, 1 and 25, for now;
16777215 looks like a color (in hexadecimal it is ffffff);
1408852480 increases from one bullet screen to the next, so it should be UNIX time. Calling this number d and computing d/86400/365.2425 + 1970 gives about 2014.6, so it does appear to be UNIX time, presumably the time the bullet screen was created;
0: I don't know. I captured the bullet screens of many videos and this field was always 0, so I ignore it for now;
7fa769b4 is presumably the creator's ID, because the same value appears several times within one XML file. It looks like a hexadecimal number, as if some hash function returned a 4-byte integer;
576008622 also increases steadily. No guessing needed: this is certainly the ID of the bullet screen.
Checking back afterwards confirmed it: 1 is the type of the bullet screen (scrolling from right to left, fixed at the bottom, fixed at the top, ...), 25 is the font size, and 16777215 is the font color.
Therefore, we only need to capture the time, type, size, color, and text of each bullet screen.
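The two guesses above (UNIX time and hexadecimal color) are easy to check. The following is a small sketch; the function names `approxYear`, `createdYear` and `colorHex` are made up for this illustration and do not appear in the converter itself.

```go
package main

import (
	"fmt"
	"time"
)

// approxYear applies the rough formula from the text: d/86400/365.2425 + 1970.
func approxYear(d float64) float64 {
	return d/86400/365.2425 + 1970
}

// createdYear interprets the field as UNIX seconds and returns the UTC year.
func createdYear(sec int64) int {
	return time.Unix(sec, 0).UTC().Year()
}

// colorHex formats the decimal color field as six hexadecimal digits.
func colorHex(c int) string {
	return fmt.Sprintf("%06x", c)
}

func main() {
	fmt.Printf("%.1f\n", approxYear(1408852480))
	fmt.Println(createdYear(1408852480))
	fmt.Println(colorHex(16777215))
}
```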
Regular Expression:
<d\sp="([\d\.]+),([145]),(\d+),(\d+),\d+,\d+,\w+,\d+">([^<>]+?)</d>
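As a quick sketch of how this expression behaves, the snippet below applies it to two sample elements. The sample values and the helper name `capture` are assumptions for illustration; the second element uses a type other than 1, 4 or 5, so it is not matched.

```go
package main

import (
	"fmt"
	"regexp"
)

// The regular expression shown above, compiled once.
var line = regexp.MustCompile(`<d\sp="([\d\.]+),([145]),(\d+),(\d+),\d+,\d+,\w+,\d+">([^<>]+?)</d>`)

// capture returns time, type, size, color and text for every match.
func capture(xml string) [][]string {
	var out [][]string
	for _, m := range line.FindAllStringSubmatch(xml, -1) {
		out = append(out, m[1:]) // drop m[0], the whole match
	}
	return out
}

func main() {
	// Assumed samples with all eight fields present.
	const sample = `<d p="12.65,1,25,16777215,1408852761,0,24570b5a,576014281">my goddess!</d>` +
		`<d p="3.0,7,25,16777215,1408852761,0,24570b5a,576014282">skipped</d>`
	for _, f := range capture(sample) {
		fmt.Printf("time=%s type=%s size=%s color=%s text=%q\n", f[0], f[1], f[2], f[3], f[4])
	}
}
```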
The key to converting bullet screens is the algorithm that arranges them as subtitles.
I am not certain how the original player does this; my algorithm uses a fixed scrolling speed and a minimum-overlap layout.
When a scrolling bullet screen enters the screen, it prefers the row below the one used last. If that row would overlap, it tries the next row, and so on (after the bottom row it wraps around to the top row). If every row would overlap, the row with the least overlap is chosen.
A bullet screen fixed at the top or bottom takes the free row closest to the top or bottom, respectively. If no row is free, it takes the row with the shortest overlap time. The subtitle is centered horizontally.
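The row selection described above can be sketched as one small function. This is a simplified illustration, not the exact code of the converter: the name `pickRow`, the `busyUntil` slice and the three-row example are assumptions. For scrolling bullet screens `start` would be the row after the one used last; for fixed ones the probing order is the same every time.

```go
package main

import "fmt"

// pickRow chooses a row for a bullet screen that arrives at time now.
// busyUntil[r] is the moment row r becomes free again. Rows are probed
// from start, wrapping around after the last row.
func pickRow(busyUntil []float64, start int, now float64) int {
	n := len(busyUntil)
	best := start % n
	for i := 0; i < n; i++ {
		r := (start + i) % n
		if busyUntil[r] <= now {
			return r // the first free row wins
		}
		if busyUntil[r] < busyUntil[best] {
			best = r // otherwise remember the row that frees up soonest
		}
	}
	return best
}

func main() {
	rows := []float64{5, 3, 9} // hypothetical: three rows instead of twelve
	fmt.Println(pickRow(rows, 0, 4))
	fmt.Println(pickRow(rows, 0, 1))
	fmt.Println(pickRow(rows, 2, 6))
}
```

After choosing a row, the caller would set `busyUntil[row]` to the time at which the new bullet screen stops occupying it.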
The default font is Microsoft YaHei, the default size is 25, and the default color is white with a black outline. By default 12 rows fill the screen, and the default screen size is 640x360.
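These defaults determine the geometry of the layout. The sketch below derives a few numbers from them. The scrolling speed of 90 pixels per second, the vertical offset of 27 and the half-width rule for ASCII characters are taken from the Go source in this article; the helper names are assumptions for illustration.

```go
package main

import "fmt"

const (
	screenW  = 640
	screenH  = 360
	rows     = 12
	fontSize = 25
	speed    = 90.0 // pixels per second, as used in the source code
)

// textWidth estimates the width in pixels, counting an ASCII character
// as half a character and everything else as a full one.
func textWidth(s string) float64 {
	n := 0.0
	for _, r := range s {
		if r < 127 {
			n += 0.5
		} else {
			n += 1
		}
	}
	return n * fontSize
}

// rowY returns the y coordinate used for row k, counted from the top.
func rowY(k int) int {
	return k*(screenH/rows) + 27
}

// scrollSeconds is the time a text needs to cross the whole screen.
func scrollSeconds(s string) float64 {
	return (textWidth(s) + screenW) / speed
}

func main() {
	fmt.Println(textWidth("renewed"), rowY(0), rowY(rows-1))
	fmt.Printf("%.2f\n", scrollSeconds("renewed"))
}
```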
All of this is mainly meant to bring the ASS subtitles as close as possible to the original bullet screen effect.
Advanced bullet screens are beyond my ability, so they are ignored.
The Go source code is as follows:
// Convert a bilibili XML bullet screen file into an ASS subtitle file.
// In the XML file, a bullet screen looks like this:
// <d p="32.066,16777215,1409046965,0,017d3f58,579516441">floor praise</d>
// The p attribute holds: time, bullet screen type, font size, font color,
// creation time, ?, creator ID, bullet screen ID.
// The last four items are useless for ASS subtitles and are discarded.
// The text between <d> and </d> is the bullet screen text.
// Only ordinary bullet screens are handled: right to left, fixed at the top,
// and fixed at the bottom.
package main

import (
	"fmt"
	"io"
	"io/ioutil"
	"math"
	"os"
	"regexp"
	"sort"
	"strconv"
	"strings"
)

// ASS file header.
const header = `[Script Info]
ScriptType: v4.00+
Collisions: Normal
PlayResX: 640
PlayResY: 360

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default, Microsoft YaHei, 25, &H00FFFFFF, &H00FFFFFF, &H00000000, &H00000000, 0, 0, 0, 0, 100, 100, 0.00, 0.00, 1, 1, 0, 2, 10, 10, 10, 0

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
`

// Regular expression that extracts the raw bullet screen information.
var line = regexp.MustCompile(`<d\sp="([\d\.]+),([145]),(\d+),(\d+),\d+,\d+,\w+,\d+">([^<>]+?)</d>`)

// Danmu holds the information of one bullet screen.
type Danmu struct {
	text  string
	time  float64
	kind  byte
	size  int
	color int
}

// Danmus implements sort.Interface so that bullet screens can be sorted by time.
type Danmus []Danmu

func (d Danmus) Len() int           { return len(d) }
func (d Danmus) Less(i, j int) bool { return d[i].time < d[j].time }
func (d Danmus) Swap(i, j int)      { d[i], d[j] = d[j], d[i] }

// fill stores the data matched by the regular expression in a Danmu.
func fill(d *Danmu, s [][]byte) {
	d.time, _ = strconv.ParseFloat(string(s[1]), 64)
	d.kind = s[2][0] - '0'
	d.size, _ = strconv.Atoi(string(s[3]))
	rgb, _ := strconv.Atoi(string(s[4]))
	// Swap the red and blue bytes, because ASS writes colors as BBGGRR.
	d.color = ((rgb >> 16) & 255) | (rgb & (255 << 8)) | ((rgb & 255) << 16)
	d.text = string(s[5])
}

// length returns the length of the text, assuming that every ASCII character
// is 0.5 characters long and every other character is 1 character long.
func length(s string) float64 {
	l := 0.0
	for _, r := range s {
		if r < 127 {
			l += 0.5
		} else {
			l += 1
		}
	}
	return l
}

// timespot formats a time in seconds as an ASS time point such as 0:00:00.00.
func timespot(f float64) string {
	h, f := math.Modf(f / 3600)
	m, f := math.Modf(f * 60)
	return fmt.Sprintf("%d:%02d:%05.2f", int(h), int(m), f*60)
}

// open reads the file and extracts the bullet screens.
func open(name string) ([]Danmu, error) {
	data, err := ioutil.ReadFile(name)
	if err != nil {
		return nil, err
	}
	dan := line.FindAllSubmatch(data, -1)
	ans := make([]Danmu, len(dan))
	for i := len(dan) - 1; i >= 0; i-- {
		fill(&ans[i], dan[i])
	}
	return ans, nil
}

// save arranges the bullet screens and writes them to w.
func save(w io.Writer, dans []Danmu) {
	p1 := make([]float64, 12) // scrolling rows: position at which each row is free again
	p2 := make([]float64, 12) // bottom rows: time at which each row is free again
	p3 := make([]float64, 12) // top rows: time at which each row is free again
	sp := 0                   // row used by the previous scrolling bullet screen
	for _, dan := range dans {
		l := length(dan.text) * 25
		if l == 0 {
			continue
		}
		s := ""
		if dan.size != 25 {
			s += fmt.Sprintf("\\fs%d", dan.size)
		}
		if dan.color != 0x00ffffff {
			s += fmt.Sprintf("\\c&H%06X", dan.color)
		}
		switch dan.kind {
		case 1: // right to left
			pls := dan.time*90 - l/2
			m, k := pls+10000, 0
			for i := 0; i < 12; i++ {
				t := (i + sp + 1) % 12
				if p1[t] <= pls {
					k = t
					break
				}
				if m > p1[t] {
					k = t
					m = p1[t]
				}
			}
			sp = k
			p1[sp] = pls + l
			s += fmt.Sprintf("\\move(%d,%d,%d,%d)", 640+int(l/2), sp*30+27, -int(l/2), sp*30+27)
			fmt.Fprintf(w, "Dialogue: 1,%s,%s,Default,,0000,0000,0000,,{%s}%s\n",
				timespot(dan.time), timespot(dan.time+(l+640)/90), s, dan.text)
		case 4: // fixed at the bottom
			m, k := dan.time+10000, 0
			for i := 0; i < 12; i++ {
				t := (i + 1) % 12 // rows are probed from index 1, index 0 comes last
				if p2[t] <= dan.time {
					k = t
					break
				}
				if m > p2[t] {
					k = t
					m = p2[t]
				}
			}
			p2[k] = dan.time + 3.6
			s += fmt.Sprintf("\\pos(%d,%d)", 320, (11-k)*30+27)
			fmt.Fprintf(w, "Dialogue: 2,%s,%s,Default,,0000,0000,0000,,{%s}%s\n",
				timespot(dan.time+0), timespot(dan.time+3.6), s, dan.text)
		case 5: // fixed at the top
			m, k := dan.time+10000, 0
			for i := 0; i < 12; i++ {
				t := (i + 1) % 12 // rows are probed from index 1, index 0 comes last
				if p3[t] <= dan.time {
					k = t
					break
				}
				if m > p3[t] {
					k = t
					m = p3[t]
				}
			}
			p3[k] = dan.time + 3.6
			s += fmt.Sprintf("\\pos(%d,%d)", 320, k*30+27)
			fmt.Fprintf(w, "Dialogue: 3,%s,%s,Default,,0000,0000,0000,,{%s}%s\n",
				timespot(dan.time+0), timespot(dan.time+3.6), s, dan.text)
		}
	}
}

// main implements the command line: every argument is an XML file to convert.
func main() {
	if len(os.Args) <= 1 {
		os.Exit(0)
	}
	for _, name := range os.Args[1:] {
		dans, err := open(name)
		if err != nil {
			os.Exit(1)
		}
		if n := strings.LastIndex(name, "."); n != -1 {
			name = name[:n]
		}
		name += ".ass"
		file, err := os.Create(name)
		if err != nil {
			os.Exit(2)
		}
		file.WriteString(header)
		sort.Sort(Danmus(dans))
		save(file, dans)
		file.Close()
	}
}
That's all. Comments and corrections are welcome.
Bilibili bullet screen to ASS