This article is hosted at http://www.cnblogs.com/kemaswill/. The author can be reached at kemaswill@163.com.
DTW (Dynamic Time Warping) is a method for measuring the similarity between two time series. It is mainly used in speech recognition to decide whether two speech segments represent the same word.
1. DTW method principle
The two time series being compared may not have equal lengths. In speech recognition, speaking speed varies from person to person, and even within one word different phonemes are pronounced at different speeds: one speaker may drag out the 'a' sound while another pronounces the 'i' very briefly. Two time series may also differ only by a shift along the time axis, coinciding once that shift is removed. In such cases the traditional Euclidean distance cannot effectively measure the distance (or similarity) between the two series.
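To make the limitation concrete, here is a minimal sketch (the series values and the use of point-wise Euclidean distance are illustrative assumptions, not from the article): two series with the same shape, shifted by one step, still get a nonzero point-wise distance.

```python
# Two series with identical shape; Y is X shifted left by one step.
# These values are illustrative, not from the article.
X = [0, 0, 1, 2, 1, 0, 0]
Y = [0, 1, 2, 1, 0, 0, 0]

# Point-wise Euclidean distance compares X[i] only with Y[i],
# so the one-step shift makes the distance nonzero.
euclidean = sum((x - y) ** 2 for x, y in zip(X, Y)) ** 0.5
print(euclidean)  # → 2.0, despite the two shapes being identical
```

DTW avoids this by letting a point in one series align with a nearby point in the other instead of only the point at the same index.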
DTW stretches and compresses the time series in order to compute the similarity between them:
As shown in the figure, the upper and lower solid lines represent two time series, and the dotted lines between them connect similar points of the two series. DTW uses the sum of the distances between all these pairs of similar points, called the Warp Path Distance, to measure the similarity between the two time series.
2. DTW calculation method:
Let the two time series whose similarity is to be computed be X and Y, with lengths |X| and |Y| respectively.
2.1 The Warp Path
The warp path has the form W = w1, w2, ..., wK, where max(|X|, |Y|) <= K <= |X| + |Y|.
Each wk has the form (i, j), where i denotes the i-th coordinate in X and j denotes the j-th coordinate in Y.
The warp path W must start at w1 = (1, 1) and end at wK = (|X|, |Y|), which guarantees that every coordinate of X and Y appears in W.
In addition, the i and j in wk = (i, j) must increase monotonically, which guarantees that the dotted lines in Figure 1 do not cross. Monotonically increasing here means that for consecutive steps wk = (i, j) and wk+1 = (i', j'), we require i <= i' <= i + 1 and j <= j' <= j + 1.
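The boundary and monotonicity constraints above can be checked mechanically. The sketch below (the helper name and the choice of 1-indexed pairs are my own, not from the article) validates a candidate warp path for series of lengths n = |X| and m = |Y|:

```python
def is_valid_warp_path(path, n, m):
    """Check the warp path constraints: the path starts at (1, 1), ends at
    (n, m), and each step advances i and j by at most 1, never backwards."""
    if path[0] != (1, 1) or path[-1] != (n, m):
        return False  # boundary condition violated
    for (i1, j1), (i2, j2) in zip(path, path[1:]):
        if not (0 <= i2 - i1 <= 1 and 0 <= j2 - j1 <= 1):
            return False  # goes backwards or skips a point
        if (i2, j2) == (i1, j1):
            return False  # makes no progress
    return True

print(is_valid_warp_path([(1, 1), (1, 2), (2, 3)], 2, 3))  # → True
print(is_valid_warp_path([(1, 1), (2, 3)], 2, 3))          # → False (skips Y[2])
```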
The path we finally want is the one with the minimum warp path distance:

Dist(W) = Σ(k=1..K) Dist(wki, wkj), minimized over all valid warp paths W.
Here Dist(wki, wkj) is any classical distance measure, such as the Euclidean distance; wki refers to the i-th data point of X, and wkj refers to the j-th data point of Y.
3. DTW implementation
When implementing DTW, we use dynamic programming. Let D(i, j) denote the minimum warp path distance between the first i points of X and the first j points of Y:
The final warp path distance is then D(|X|, |Y|), which dynamic programming computes with the recurrence:

D(i, j) = Dist(i, j) + min( D(i-1, j), D(i, j-1), D(i-1, j-1) )
In the cost matrix D, the entry D(i, j) therefore stores the minimum warp path distance between the first i points of X and the first j points of Y.
3.1 Pseudocode for DTW:
int DTWDistance(s: array [1..n], t: array [1..m]) {
    DTW := array [0..n, 0..m]

    for i := 1 to n
        DTW[i, 0] := infinity
    for i := 1 to m
        DTW[0, i] := infinity
    DTW[0, 0] := 0

    for i := 1 to n
        for j := 1 to m
            cost := d(s[i], t[j])
            DTW[i, j] := cost + minimum(DTW[i-1, j  ],    // insertion
                                        DTW[i  , j-1],    // deletion
                                        DTW[i-1, j-1])    // match

    return DTW[n, m]
}
3.2 Python implementation of DTW:
import sys

def dtw(X, Y):
    l1, l2 = len(X), len(Y)
    # M[j][i] is the point-to-point distance between X[i] and Y[j];
    # absolute difference is used here as the distance function
    M = [[abs(X[i] - Y[j]) for i in range(l1)] for j in range(l2)]
    # D[j][i] is the minimum warp path distance between the first i
    # points of X and the first j points of Y
    D = [[0] * (l1 + 1) for _ in range(l2 + 1)]
    for i in range(1, l1 + 1):
        D[0][i] = sys.maxsize
    for j in range(1, l2 + 1):
        D[j][0] = sys.maxsize
    for j in range(1, l2 + 1):
        for i in range(1, l1 + 1):
            D[j][i] = M[j-1][i-1] + min(D[j-1][i], D[j][i-1], D[j-1][i-1])
    return D[l2][l1]
4. DTW Acceleration
Although DTW can be solved efficiently with dynamic programming, its O(N²) time complexity is still expensive for long time series. Many improved, faster DTW algorithms exist, such as FastDTW, SparseDTW, LB_Keogh, and LB_Improved.
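As an illustration of one common acceleration idea, here is a sketch of DTW restricted to a Sakoe-Chiba window, a classic constraint that limits the warp path to a band of half-width w around the diagonal (note this is a separate technique, not one of the algorithms listed above; absolute difference as the point distance is also my assumption):

```python
import sys

def dtw_window(X, Y, w):
    # DTW restricted to a Sakoe-Chiba band: only cells with |i - j| <= w
    # are filled, giving O(N * w) work instead of O(N^2).
    # Absolute difference is used as the point distance (an assumption).
    n, m = len(X), len(Y)
    w = max(w, abs(n - m))  # the band must be wide enough to reach (n, m)
    D = [[sys.maxsize] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = abs(X[i - 1] - Y[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

print(dtw_window([1, 2, 3, 4], [1, 2, 7, 4, 5], 2))  # band wide enough here → 5
```

As w grows, the result converges to the unconstrained DTW distance; a small w trades some accuracy for speed.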
References:
[1]. FastDTW: Toward Accurate Dynamic Time Warping in Linear Time and Space. Stan Salvador, Philip Chan.
[2]. Wikipedia: Dynamic Time Warping
[3]. Speech Recognition: 11.2 Dynamic Time Warping