More than half a month has passed since the stampede at Shanghai's Waitan (the Bund). In the wake of this painful event, everyone from ordinary citizens to experts and professors has voiced opinions through the media, hoping to find the real cause of the accident and prevent such a tragedy from recurring.
The Big Data Lab (BDL) of the Baidu Research Institute, following its principle of "letting the data speak," has used Baidu data and big-data intelligent analysis technology to describe the situation at the time, in the hope of providing a reference for those concerned.
Figure 1 is a crowd heat map for December 31, 2014. It marks the area near the Nanjing East Road subway station (lower-left blue box), the area near Waitan Source (upper-right blue box), the area near Chen Yi Square where the incident occurred (lower-right black box), and the Waitan area (red box on the right). Redder colors indicate denser crowds; bluer colors indicate sparser ones. The discussion below focuses on three questions.
Fig. 1 Crowd heat map of the Waitan area on 2014.12.31
First, how large was the crowd at the time? Did the incident occur when the crowd that night was at its largest?
Big-data analysis shows the following:
1) As shown in Figure 2, on the night of the incident the Waitan area (including Chen Yi Square) was indeed very crowded: the number of people reached more than 3 times the usual maximum.
Fig. 2 Crowd-volume trend in the Waitan area, 2014.12.29-2015.1.2
2) As shown in Figure 3, on the night of the 31st the Nanjing East Road subway station (purple line) also saw a crowd peak at about 20:30. The time of the incident (black dotted line), however, was not when the crowd at Chen Yi Square (red line) was largest; its two peaks came at about 21:00 and 24:00.
Fig. 3 Crowd-volume trends, 2014.12.31-2015.1.1
Second, how severe was the counterflow within the crowd?
Some experts say that opposing crowd flows may have been a major cause of the stampede. Using big-data technology combined with map positioning information, historical location and trajectory data show that the directions of crowd movement on the night of the incident were indeed more complex than on other holidays. We compare data from three holidays: the Mid-Autumn Festival, National Day, and New Year's Eve.
(1) Mid-Autumn Festival night (2) National Day (3) New Year's Eve
Fig. 4 Heat maps of crowd distribution in the Waitan and Waitan Source areas (2 hours)
According to the 2-hour crowd-distribution heat maps in Fig. 4, crowd volume was quite similar across the three holidays, but the distribution differed. On the Mid-Autumn Festival (Fig. 4 (1)) and National Day (Fig. 4 (2)), people were mainly distributed along Waitan Avenue and around Chen Yi Square. On New Year's Eve after 22:00 (Fig. 4 (3)), the crowd was mainly located near Shandong, Chen Yi Square and Waitan Source.
(1) Mid-Autumn Festival night (2) National Day (3) New Year's Eve
Fig. 5 Schematic of crowd movement in the Waitan and Waitan Source areas (partial sample)
Fig. 5 shows the direction of movement for a sample of people. Each arrow represents one pedestrian, and the arrow's color and direction indicate that person's heading. Fig. 5 (3) shows that on New Year's Eve the crowd moved from Nanjing East Road toward Chen Yi Square, so that the crowd at Chen Yi Square peaked at around 21:00 that night (Figure 3). After that, more people began moving from Chen Yi Square northward toward Waitan Source, the site of that night's light show.
(1) Mid-Autumn Festival night (2) National Day (3) New Year's Eve
Fig. 6 Directional distribution of crowd flow in the Waitan area
We further analyzed the movement of people in the Waitan area of Fig. 5 to obtain the directional distribution of crowd flow shown in Fig. 6. Each sector in Fig. 6 represents a direction of movement, and the sector's radius indicates the number of people moving in that direction. Fig. 6 (1) and (2) show the Mid-Autumn Festival and National Day, when the pattern was relatively simple and clear: most people moved north or south, and few moved in other directions. Fig. 6 (3) shows the Waitan area on New Year's Eve. In addition to the two-way north-south flow, many people were moving in other directions, and the overall distribution of headings was chaotic.
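A directional distribution like the one in Fig. 6 can be approximated by binning each sampled pedestrian's heading into equal angular sectors and counting the people in each. The sketch below is only an illustration under stated assumptions: the input format (east and north displacement pairs), the choice of 8 sectors, and the `dominant_share` summary are all hypothetical and are not described in the article.

```python
import math

def direction_histogram(displacements, n_sectors=8):
    """Count pedestrians per heading sector.

    displacements: iterable of (dx, dy) pairs giving the east and north
    components of each sampled pedestrian's movement (assumed format).
    Sector 0 is centered on due east; sectors proceed counterclockwise,
    so with 8 sectors, sector 2 is north and sector 6 is south.
    """
    width = 2 * math.pi / n_sectors
    counts = [0] * n_sectors
    for dx, dy in displacements:
        if dx == 0 and dy == 0:
            continue  # a stationary sample has no heading
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Shift by half a sector so each sector is centered on its heading.
        sector = int((angle + width / 2) // width) % n_sectors
        counts[sector] += 1
    return counts

def dominant_share(counts, k=2):
    """Fraction of people moving in the k busiest sectors."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(sorted(counts, reverse=True)[:k]) / total

# Illustrative data only: mostly north-south movement.
sample = [(0, 1)] * 5 + [(0, -1)] * 4 + [(1, 0)]
counts = direction_histogram(sample)
print(counts, dominant_share(counts))
```

Under this sketch, a share close to 1 for the two busiest sectors corresponds to the simple north-south pattern, while a lower share corresponds to headings spread over many directions.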
As to why the flow directions were so complex, some experts speculate that visitors on the Mid-Autumn Festival and National Day were simply touring Waitan, whereas on New Year's Eve many visitors came to watch the light show, only to discover on reaching Chen Yi Square that the venue had changed (in previous years it was held at Chen Yi Square; this year it moved to Waitan Source). Baidu search keyword analysis shows the same trend: at around 23:20 that night, searches for "light show canceled" and "light show tickets" rose sharply (Figure 7).
Fig. 7 Search index for the keywords "light show canceled" and "light show tickets"
Judging from how people use mobile maps, visitors usually search for their destination and plan a route in advance. Since the light show was at Waitan Source, users should have searched for "Waitan Source" and planned a route there beforehand. We examined where users were located that night when they searched Baidu Maps for "Waitan Source" and found that most were concentrated near Waitan (the red area in Fig. 8). This suggests that these users did not originally know the light show had moved to Waitan Source; they found out only after arriving at Waitan, and only then took out their phones to search the map.
Fig. 8 Origins of map searches with "Waitan Source" as the destination
Third, crowds gather suddenly; can a warning be given in advance?
China has a large population, and major sporting events, holiday gatherings and similar activities are prone to overcrowding, which creates danger and can even lead to accidents. Is it possible to predict such situations and issue warnings ahead of time? The Big Data Lab mined Baidu's positioning data and search data in depth to explore the feasibility of early warning.
Fig. 9 Trends in Waitan map searches and crowd gathering
Figure 9 shows the historical trends of map search requests for Waitan and of crowd gathering there from December 25 to 31, 2014. With the two curves normalized and aligned, it is easy to see that they fluctuate in broadly the same way. On ordinary days, both the map searches for Waitan and the level of crowd gathering are basically stable, but on the last day of 2014 both reached their highest peak.
Fig. 10 Correlation between Waitan map search requests and number of arrivals
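The article does not say how the two curves were normalized. One common choice, shown here purely as an assumption, is min-max scaling, which maps each series onto the range 0 to 1 so that search counts and head counts can be drawn on the same axis. The function name and the sample numbers are illustrative and do not come from Baidu's data.

```python
def min_max_normalize(series):
    """Rescale a numeric series to the range [0, 1].

    Min-max scaling is an assumed choice; the article only says the
    curves were "normalized".
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series carries no shape; map it to all zeros.
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]

# Made-up daily values on very different scales.
searches = [120, 130, 125, 140, 135, 150, 480]
crowd = [9000, 9500, 9200, 10000, 9800, 11000, 31000]
print(min_max_normalize(searches))
print(min_max_normalize(crowd))
```

After scaling, both series peak at 1.0 on the final day, which makes the shared shape visible even though the raw magnitudes differ.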
By mining Baidu's positioning and search data, we further analyzed the map search requests and the number of arrivals on December 31, 2014. As Fig. 10 shows, the number of Baidu Maps requests for the location and the number of people actually arriving there are very highly correlated, with a correlation coefficient above 0.9 (the closer to 1, the stronger the correlation). This indicates that users generally search for their destination and plan a route on Baidu Maps before setting out. To measure how far in advance users search, the Big Data Lab further analyzed data from a large number of past crowd gatherings, including Waitan and a football match at the Bird's Nest.
Fig. 11 Cross-correlation curve between Waitan map searches and crowd size
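The article does not name the correlation coefficient it used. Assuming the standard Pearson coefficient, the calculation can be sketched as follows; the function name and the sample values are illustrative only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series.

    Assumed to be the kind of coefficient behind the "more than 0.9"
    figure; the article does not specify which one was used.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Made-up hourly values: search requests and arrivals.
requests = [10, 14, 20, 35, 60, 90]
arrivals = [200, 260, 410, 700, 1150, 1800]
print(round(pearson(requests, arrivals), 3))
```

A value near 1 means the two series rise and fall together; a value near 0 means no linear relationship.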
Analysis of a large amount of historical data showed that the peak of map search requests for a location occurs 10 minutes earlier than the peak in crowd density there (see Figure 9). Fig. 11 plots the correlation between the number of searches and the number of people as a function of time delay. The x-axis value is the delay, with negative values indicating a lead; for example, the value of the curve at 10 is the correlation between the search count offset by 10 hours and the number of people. The cross-correlation curve of the two quantities peaks at 1.5 hours, which means that the arrival of a crowd peak could be predicted at least a few tens of minutes in advance from the number of map search requests for the location.
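A curve like the one in Fig. 11 can be sketched by shifting one series against the other and computing the correlation at each shift. The code below is a minimal illustration, not the lab's actual method: it assumes Pearson correlation, equal-length series sampled at a fixed interval, and the sign convention described above (a negative delay means searches lead the crowd). All names and data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 where it is undefined (constant input)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # simplification for this sketch
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(searches, crowd, delay):
    """Correlate searches[t] with crowd[t - delay].

    delay < 0 means the search curve runs ahead of the crowd curve.
    Both series must have the same length.
    """
    shift = -delay
    if shift >= 0:
        a = searches[:len(searches) - shift]
        b = crowd[shift:]
    else:
        a = searches[-shift:]
        b = crowd[:len(crowd) + shift]
    return pearson(a, b)

def best_delay(searches, crowd, max_shift):
    """Delay (in samples) at which the correlation is highest."""
    delays = range(-max_shift, max_shift + 1)
    return max(delays, key=lambda d: lagged_correlation(searches, crowd, d))

# Made-up series: the crowd curve repeats the search curve 2 samples later.
searches = [0, 1, 4, 9, 4, 1, 0, 0, 0, 0]
crowd = [0, 0, 0, 1, 4, 9, 4, 1, 0, 0]
print(best_delay(searches, crowd, 3))
```

The returned delay is in samples, so converting it to minutes or hours depends on the sampling interval of the data, which the article does not state.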
(Responsible editor: Mengyishan)