Ebola virus: epidemic prevention and control in the age of big data

Source: Internet
Author: User
Keywords: cloud computing, big data, Ebola virus

As we pay tribute to the Ebola caregivers, whom Time magazine named its Person of the Year for 2014, let us review last year's global epidemic, which remains a concern and has continued into this year.

The Ebola outbreak that originated in West Africa came into the global public eye through traditional and digital media in 2014, following the Malaysia Airlines disasters of that year. According to the World Health Organization [1], the Ebola virus first appeared in 1976 in two simultaneous outbreaks, one in Sudan and one in the Democratic Republic of the Congo. The latter occurred in a village near the Ebola River, from which the disease takes its name. The Congo outbreak was caused by the Zaire subspecies, with a cumulative 318 cases and 280 deaths, a fatality rate of 88%. The Sudan outbreak was caused by the Sudan subspecies, with a cumulative 284 cases and 151 deaths, a fatality rate of 53%. The other three subspecies, Reston, Côte d'Ivoire, and Bundibugyo, are described as relatively mild in their harm to animals and humans. The virus currently ravaging the world is reported to be the Zaire subspecies, the one with the highest fatality rate.

Ebola outbreaks recurred in the years after the first one, but each previous outbreak was relatively small, concentrated in a single area, and confined to Central Africa. The Democratic Republic of the Congo in particular has seen numerous Ebola outbreaks in its history.

The outbreak that began in March 2014 has aroused international concern because of its scale, and the World Health Organization has declared it a "Public Health Emergency of International Concern" (the third such declaration in history). First, the epidemic involves many countries and regions. Countries affected throughout their territory include Guinea, Liberia, and Sierra Leone. Places affected locally include Kayes in Mali; Madrid in Spain; Dallas, Texas, and New York in the United States; Glasgow, Scotland, in the United Kingdom; Lagos and Port Harcourt in Nigeria; and Dakar in Senegal. Second, this outbreak has produced more cases and deaths than all previous outbreaks combined. As of December 31, 2014, a total of 20,206 people had fallen ill and 7,905 had died [2], and the numbers are still growing. Time magazine named the Ebola caregivers, as a group, its Person of the Year for 2014.

Over the past few decades, mankind has undoubtedly made remarkable progress in information technology, biology, and medicine. Looking at the emergence, transmission, media coverage, and control of the 2014 Ebola outbreak, we cannot help but ask: in this age of big data, what can data, statistics, rational thinking, and critical thinking contribute to the prevention and control of human disease? This article attempts to explain in three parts how big data is closely linked to epidemic prevention and control. The first part discusses how traffic data and non-traditional public health data, such as mobile communication data and social media data, are used to measure and predict epidemic risk. The second part focuses on how different methods of estimating mortality lead to different perceptions of that risk. The third part looks at data on treatment and control expenditure for the Ebola outbreak.

I. Data-driven epidemic forecasting

1. Predicting outbreaks from traffic data [3]

As people move around the globe more and more frequently, an outbreak in one region poses a potential epidemic risk to other countries and regions, so how to predict and evaluate that risk effectively is a topic worth exploring. A major feature of this Ebola outbreak is that it crossed borders by way of transport and spread to many countries beyond its source, Guinea. Air travel is clearly the most important means of international population movement, so statistical analysis of airport passenger flow data is of prime importance.

In fact, airport data has long attracted the attention of researchers in many fields, and models built on this kind of data are not uncommon. For the spread of Ebola, researchers have proposed their own methods, one of which quantifies the possible impact of Ebola on an area by estimating its import risk.

The central question in estimating import risk is how to quantify risk through dynamic or statistical models. This article introduces a fairly intuitive approach that divides import risk into relative import risk and absolute import risk. If X is an airport in the outbreak area and Y is any other location in the world, the relative import risk can be defined as a conditional probability, P(Y|X). The absolute import risk can be defined as a joint probability, P(X,Y) = P(Y|X) P(X). Note that P(X) is often much smaller than P(Y|X), so P(X,Y) will also be much smaller than P(Y|X). In practice, relative risk is more useful than absolute risk, mainly because P(X) is hard to estimate most of the time: it depends on a large number of parameters describing conditions in the region of X, and the absolute risk in turn depends on P(X). By contrast, calculating the relative risk requires only airport passenger flow data, and the region's own characteristics need not be taken into account.
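As a rough illustration of these two definitions, the sketch below estimates P(Y|X) as the share of passengers leaving airport X who fly directly to Y, then scales it by an assumed P(X) to get the absolute risk. This is a simplified, direct-flight-only reading of the idea and not necessarily the model used in [3]; the airport names, passenger counts, and the value of P(X) are all hypothetical.

```python
def relative_import_risk(flows, origin):
    """Estimate P(Y | X=origin) for each destination Y from passenger counts.

    `flows` maps (origin, destination) pairs to passenger counts.
    """
    outbound = {dst: n for (src, dst), n in flows.items() if src == origin}
    total = sum(outbound.values())
    if total == 0:
        raise ValueError(f"no outbound passengers recorded for {origin!r}")
    return {dst: n / total for dst, n in outbound.items()}


def absolute_import_risk(relative, p_origin):
    """Scale relative risks by P(X): P(X, Y) = P(Y | X) * P(X)."""
    return {dst: p * p_origin for dst, p in relative.items()}


# Hypothetical passenger counts (not real data).
flows = {
    ("X", "A"): 600,
    ("X", "B"): 300,
    ("X", "C"): 100,
    ("A", "B"): 50,  # traffic that does not leave X is ignored
}

relative = relative_import_risk(flows, "X")
absolute = absolute_import_risk(relative, p_origin=0.01)  # assumed P(X)

for dst in sorted(relative):
    print(f"{dst}: relative={relative[dst]:.2f} absolute={absolute[dst]:.4f}")
```

Only the flow table is needed for the relative risk, which is the practical advantage the text describes; the absolute risk cannot be computed without first committing to a value for P(X).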

With airport data from around the world integrated, we can estimate the relative risk for each location; the next step is to consider how to present the results to the public. Data visualization is often the most intuitive and effective way to show analytical results, and an interactive network chart built with D3 is available (Figure 1 shows a screenshot). Interested readers can explore the original chart at http://rocs.hu-berlin.de/D3/ebola/

  

Figure 1

2. Predicting outbreaks from mobile communication data

The Ebola outbreak in West Africa has aroused worldwide concern, with attention focused on public places such as airports that drive population movement. As the previous section noted, airport data does have extremely high analytical value for researchers such as epidemiologists. Beyond that, however, data generated by mobile devices also has great potential.

Every phone call produces a call record, which naturally contains the phone number, the time of the call, the approximate location, and other important information. For operators, this data can guide the deployment of base stations and the expansion of the network. For urban planners, it can help determine whether a given area needs more public transport facilities.

Beyond these relatively common uses, the epidemiological applications are more exciting and more promising. The usual methods of modeling the spread of disease are still based on census data and related surveys. With communication data, however, updated information is available in real time, so in practice there is no need to guess whether the population of a given area is on the move. Fortunately, there has been no shortage of success stories in recent years. During the 2009 swine flu outbreak in Mexico, researchers used communication data to monitor the public's response to health alerts issued by the government. After the 2010 cholera outbreak in Haiti, researchers also built models on mobile communication data and produced the best estimates of where aid was needed.

Applying this to Ebola is more complicated in practice, one of the main reasons being that most people in West Africa do not have cell phones or other communication devices. Still, it is in some ways better than statistical analysis based on stale data. If researchers could trace the flow of people out of an area where an infectious disease has broken out, they could estimate and forecast the next most likely outbreak location more effectively, allowing resources to be allocated sensibly ahead of time. Unfortunately, although many of the agencies involved have done a great deal of work, telecom operators still do not allow researchers to use this data, citing privacy.

3. Predicting outbreaks from social media data [4]

Internet and social media data played a major role in the early warning of the Ebola outbreak. HealthMap is a website and application that uses big data to respond to outbreaks. It uses algorithms to crawl data from social media sites, local news, government websites, social networks of infectious disease physicians, and other sources in order to detect and track disease outbreaks. On March 14, 2014, HealthMap's system flagged an outbreak of a "mysterious hemorrhagic fever" in Guinea. On March 19, 2014, HealthMap identified it as Ebola and alerted the World Health Organization to its approximate location and path of spread in the tropical rainforest region of southeastern Guinea. On March 23, 2014, the World Health Organization officially announced the Ebola outbreak and reported the first confirmed case. By that point, HealthMap had tracked 29 confirmed cases and 29 deaths in Guinea, with all data and reports drawn from social media and local government websites.

HealthMap uses complex algorithms to filter out irrelevant data and, with the help of domain experts, classifies the relevant information, determines the type of disease, and locates the outbreak on a map. On the day the World Health Organization announced the Ebola outbreak, HealthMap launched a dedicated page containing a real-time interactive map. The map lets users anywhere learn about the outbreak free of charge, including where it has struck and the running count of new cases and deaths. The system also records public attention. Users can zoom in on specific countries and regions, where markers show the main case reports, and clicking a marker leads to the corresponding news story. A scroll bar at the bottom of the map lets users follow the outbreak's progression by clicking key dates.

This is not HealthMap's first such effort. Founded in 2006 by a team of researchers, epidemiologists, and software developers, the organization uses a wide range of online data sources to monitor and predict disease outbreaks and to provide real-time surveillance of public health threats. It brings together disparate data sources, including online news aggregators, eyewitness reports, expert-curated discussions, and validated official reports. In addition to presenting data interactively in real time, HealthMap also works on predicting disease risk. According to some reports, the group successfully used models such as boosted regression trees to predict the death rate of SARS in China.

HealthMap's official website states that its main data sources are the ProMED mailing list (run by the International Society for Infectious Diseases, a group of front-line physicians and researchers), the WHO official website, GeoSentinel (clinicians from the International Society of Travel Medicine and the United States Centers for Disease Control and Prevention (CDC)), the World Organisation for Animal Health, the FAO, Eurosurveillance (a platform for peer-reviewed information on communicable disease surveillance in Europe), the Wildlife Data Integration Network (a global source of wildlife news), Google News, Baidu News, and other search news services. A published paper showed that most of the data came from ProMED (61.58%), while news from search engines such as Google contributed 25.24%; other important sources included RSS subscriptions (12.11%) and social media, mainly Twitter (8.7%) [5].

There seems to be a gap between this and the news coverage, which strongly touted the Ebola prediction as entirely the work of social media. The social media in question is not the general public social media most people have in mind, but a social network built by front-line health care workers around the world. Google and other companies have in fact tried to monitor and predict disease by capturing web keywords, but this has not worked so well. Google once claimed that its system was a good predictor of seasonal flu outbreaks in the United States, but real data showed that the system often overestimated prevalence. Ordinary people's perception of disease, and what they share on social networks, is not as accurate as the actual situation.

Twitter data, for its part, show that Ebola caused unprecedented panic among American users: discussion volume surged in every state, and extreme rhetoric online was rife. Yet the United States has had only 4 confirmed cases and one death so far.

Dr. Brad Crotty, a clinical informatics expert at Beth Israel Deaconess Medical Center who is not a member of HealthMap, said in an interview that HealthMap still has a lot of work to do to eliminate "background noise".

"You can really get early warning, but they're not always right," Dr. Crotty said.

Dr. Sumiko Mekaru, who is in charge of HealthMap operations, said the intention was to supplement traditional and official health reports rather than replace them.

II. Calculation and estimation of the epidemic mortality rate [6]

Calculating mortality is very important, especially for communicable diseases. The fatality rate is a key indicator in epidemiology because it tells us the probability of dying after contracting a particular disease. If it is estimated accurately during an outbreak, it can even help us determine whether the virus has mutated to become more or less harmful, and help identify the most appropriate treatment options.

The Ebola virus now raging in West Africa and beyond is well known for its high fatality rate. In previous outbreaks, up to 90% of those infected died. The previous average fatality rate of the Zaire subspecies was 80%. That is why the data in the World Health Organization's latest epidemic report looks a bit like good news: although Ebola infections are rising at an alarming rate, the reported overall fatality rate is only 53%, ranging from 39% in Sierra Leone to 64% in Guinea. This is somewhat milder than in previous outbreaks. Is this outbreak less lethal than earlier ones? Or do we already have more effective treatments?

In fact, there is another explanation: the markedly lower fatality rate may be due to the official method of measurement, not to the lethality of the virus or the level of treatment that patients receive. The sharp increase in the number of infections in recent weeks is one of the main reasons the reported fatality rate does not look so high.

The fatality rate of a disease outbreak can be calculated in several ways. The simplest is the current death toll divided by the current total number of infections. The fatality rate recently reported by the World Health Organization is calculated this way.

But this approach ignores the fact that many patients who are still alive, especially those who have just been diagnosed and are very ill, may not survive. It therefore underestimates the actual fatality rate, and the underestimate grows when the disease is spreading rapidly. Andrew Rambaut, an evolutionary biologist who studies infectious diseases at the University of Edinburgh, said the calculation also misses the deaths of diagnosed Ebola patients who left hospital before being declared recovered and formally discharged. Many of these patients died later but were not counted in the official death figures.

Another method considers only patients with a known outcome, that is, those confirmed as recovered and no longer in need of treatment and those who have died, and leaves out those currently under treatment. The results seem more careful and precise. According to the outbreak report issued by the Sierra Leone Ministry of Health and Sanitation on November 5, 841 confirmed patients had been discharged from hospital and 1,103 confirmed patients had died. By this method the Ebola fatality rate there would be 57%, not the 39% in the World Health Organization report. But Marc Lipsitch, an epidemiologist at the Harvard School of Public Health, says this calculation is still not accurate. Patients who recover usually stay in hospital longer than those who die. In other words, the patients still under treatment, who are left out of the count, are actually more likely to recover. So this calculation overestimates the actual fatality rate.
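The two calculations described so far can be compared in a short sketch. The function names are illustrative, and the first example uses synthetic numbers chosen only to show the direction of the two biases; the second applies the resolved-cases formula to the Sierra Leone figures quoted above.

```python
def naive_cfr(deaths, total_cases):
    """Deaths so far divided by all cases so far (tends to underestimate)."""
    return deaths / total_cases


def resolved_cfr(deaths, recoveries):
    """Deaths divided by cases with a known outcome (tends to overestimate)."""
    return deaths / (deaths + recoveries)


# Synthetic snapshot of a growing outbreak: half the patients are unresolved.
total_cases, deaths, recoveries = 100, 30, 20
print(f"naive:    {naive_cfr(deaths, total_cases):.0%}")
print(f"resolved: {resolved_cfr(deaths, recoveries):.0%}")

# Sierra Leone figures quoted in the text: 1,103 deaths, 841 discharged.
sierra_leone = resolved_cfr(deaths=1103, recoveries=841)
print(f"Sierra Leone (resolved cases only): {sierra_leone:.0%}")
```

With half of the synthetic patients still in treatment, the two estimates come out at 30% and 60%, and the true value lies somewhere between them, depending on what happens to the unresolved cases.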

A more precise approach is to follow a group of people infected over the same period until each has either died or recovered and left hospital, and then calculate from their final outcomes. This is naturally closer to the definition of a fatality rate. Rambaut notes that a report in Science on recent genetic variation in the Ebola virus covered a regional outbreak that began in late May this year, with 78 confirmed Ebola patients, of whom 23 survived. In other words, the actual fatality rate of the Ebola virus this year would be about 70%. Notably, the report was completed by more than 50 front-line medical workers from four countries, five of whom did not live to see the article published in Science, because they contracted Ebola while the research was under way and the paper was awaiting publication.
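A minimal sketch of this cohort-style calculation follows. The `Patient` record, its field names, and the onset dates are hypothetical; only the counts of 78 confirmed patients and 23 survivors come from the text. The function refuses to report a rate while any member of the cohort is still unresolved, which is the point of the method.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:
    onset: date
    outcome: Optional[str]  # "died", "recovered", or None if unresolved


def cohort_cfr(patients, start, end):
    """Fatality rate among patients with onset in [start, end].

    Raises if the cohort is empty or any outcome is still unknown.
    """
    cohort = [p for p in patients if start <= p.onset <= end]
    if not cohort:
        raise ValueError("no patients in the requested window")
    if any(p.outcome is None for p in cohort):
        raise ValueError("cohort still has unresolved patients")
    deaths = sum(1 for p in cohort if p.outcome == "died")
    return deaths / len(cohort)


# 78 confirmed patients, 23 survivors (counts from the text; dates invented).
onset = date(2014, 6, 1)
patients = [Patient(onset, "recovered") for _ in range(23)]
patients += [Patient(onset, "died") for _ in range(78 - 23)]

rate = cohort_cfr(patients, date(2014, 5, 25), date(2014, 6, 30))
print(f"cohort fatality rate: {rate:.1%}")
```

For these counts the rate is 55 out of 78, about 70.5%, in line with the roughly 70% quoted above.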

Christopher Dye, head of the World Health Organization's strategy department, said the organization was moving to this method and was working to organize each patient's records into a case file. "We need the best possible estimates," Dye said. "We would like to know whether the fatality rate of Ebola in this outbreak differs from that of previous outbreaks in Central Africa, and whether different treatment options are having an effect in the current outbreak."

But even this method is imperfect. In most outbreaks, some cases go unrecorded because the patient never seeks help from a medical institution, which biases the fatality estimate. Lipsitch says the bias may be large or small. If many cases are relatively mild, with infected people recovering on their own without seeing a doctor, the statistics overestimate the actual fatality rate (this is what happened in the Mexican flu outbreak and, experts suspect, in the Middle East Respiratory Syndrome outbreak as well). However, Lipsitch also said that mild cases of Ebola are unlikely to be as hard to detect as mild cases of flu; it is only because medical facilities are so scarce in these places that a certain number of patients who recovered on their own may have gone uncounted.

On the other hand, the researchers note that many Ebola patients never go to hospital and die at home (often after infecting other family members and caregivers). Their deaths are not counted, which leads to an underestimate of the fatality rate.

We will never know how many Ebola deaths have gone unrecorded. Health officials are tracking suspected and probable cases, many of whom died before being diagnosed with Ebola. Whether to include these cases in the fatality rate calculation is another potential source of error. In addition, diagnostic testing follows different patterns in different regions: in some places, for example, more post-mortem tests have been done. "It's always a big problem to balance these errors," Lipsitch said.

"We are not unaware of the difficulties in estimating mortality," Dye wrote in an e-mail. "Nor do I believe that the Ebola fatality rate in Sierra Leone (39%) is really lower than in Guinea (64%). Although the current data say so on the surface, we would need to rule out all estimation biases before believing this to be true."

In addition, HealthMap, which was the first to flag this year's global Ebola outbreak using big data from social networks, offers another way to measure mortality. In its view, the most accurate fatality rate can only be obtained once the outbreak is fully under control and every infected person has either died or been confirmed to have survived. The currently reported 53% is only the proportion of fatal cases (hereinafter PFC).

Although Ebola is notorious, it does not kill immediately after infection. Without proper adjustment, the current estimate (the simple PFC now used by the World Health Organization) does not take into account the lag between infection and death. For this outbreak, an estimate based on a variance optimization method adopted by HealthMap puts the average lag at about 16 days. This means that the 2,296 deaths reported on September 8 actually correspond to the infection cases reported on August 23. This lag-adjusted PFC gives a better approximation of the real fatality rate. The following figure is the adjusted mortality chart that HealthMap calculated and drew from data provided by the World Health Organization:

  

According to the figure above, the adjusted Ebola mortality rate is much higher than the reported figure. This adjusted rate is, however, consistent with data from Médecins Sans Frontières (MSF). Since March this year, MSF has treated 2,077 suspected cases, of whom 1,038 were confirmed and 241 of those recovered and were discharged, which implies a fatality rate as high as 77%.
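The lag adjustment described above can be sketched as follows: deaths reported on a given date are divided by the cumulative cases reported 16 days earlier rather than on the same day. This is only an illustration of the idea, not HealthMap's actual procedure, and the cumulative counts below are synthetic; the 16-day lag is the figure quoted in the text.

```python
from datetime import date, timedelta


def lag_adjusted_pfc(cum_cases, cum_deaths, report_date, lag_days=16):
    """Deaths at report_date divided by cases reported lag_days earlier."""
    case_date = report_date - timedelta(days=lag_days)
    return cum_deaths[report_date] / cum_cases[case_date]


# Synthetic cumulative counts, chosen only to illustrate the effect.
cum_cases = {date(2014, 8, 23): 1000, date(2014, 9, 8): 2000}
cum_deaths = {date(2014, 9, 8): 700}

report = date(2014, 9, 8)
unadjusted = cum_deaths[report] / cum_cases[report]
adjusted = lag_adjusted_pfc(cum_cases, cum_deaths, report)

print(f"cases matched from: {report - timedelta(days=16)}")
print(f"unadjusted PFC: {unadjusted:.0%}, lag-adjusted PFC: {adjusted:.0%}")
```

When the case count is growing quickly, the denominator 16 days back is much smaller than the current one, which is why the adjusted figure is higher than the simple PFC.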

III. Reading Ebola through data: medical expenditure

For the global cost of fighting the Ebola epidemic, a few sets of data are now available that can give us a ballpark figure.

The first is the United Nations overview report from September this year: https://docs.unocha.org/sites/dms/CAP/Ebola_outbreak_Sep_2014.pdf

It estimated the cost of fighting the Ebola virus over the following six months at about 1 billion US dollars (987.8M). The cost falls mainly into five areas:

Stop the spread (STOP the outbreak) [5M + 23.8M]

Treat patients (TREAT the infected) [331.2M + 14.0M]

Guarantee key services (ENSURE essential services) [107.7M + 97.1M + 2.5M + 64.8M]

Maintain stability (PRESERVE stability) [42.6M + 23.4M + 45.8M + 3.2M]

Prevent infection in unaffected countries (PREVENT outbreaks in countries currently unaffected) [11.9M]

These are, of course, the September estimates and may no longer be accurate, but they at least show roughly where the largest costs lie. The report also provides information on needs for countries and individuals who wish to donate.
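As a quick arithmetic check on the breakdown above, the sketch below adds up the bracketed amounts by area, in millions of US dollars, exactly as they are listed in this article. The listed items come to 773.0M, which is less than the 987.8M headline figure; the remainder presumably falls under items of the UN report that are not reproduced in this list, though that is an inference and not something the source states.

```python
# Amounts in millions of US dollars, copied from the list in this article.
budget = {
    "STOP the outbreak": [5.0, 23.8],
    "TREAT the infected": [331.2, 14.0],
    "ENSURE essential services": [107.7, 97.1, 2.5, 64.8],
    "PRESERVE stability": [42.6, 23.4, 45.8, 3.2],
    "PREVENT outbreaks in unaffected countries": [11.9],
}
headline_total = 987.8

# Round to one decimal place to avoid floating-point noise in the sums.
subtotals = {area: round(sum(items), 1) for area, items in budget.items()}
listed_total = round(sum(subtotals.values()), 1)
unlisted = round(headline_total - listed_total, 1)

for area, amount in subtotals.items():
    print(f"{area}: {amount}M ({amount / headline_total:.0%} of headline)")
print(f"listed total: {listed_total}M, not itemized here: {unlisted}M")
```

Treating the infected is the largest single area at 345.2M, followed by essential services at 272.1M.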

The second set of data concerns the cost of treating patients. The U.S. business news outlet Bloomberg published an article titled "Bill for Ebola Adds Up as Care Costs $1,000 an Hour". That sounds a bit sensational. On reading it, the article turns out to discuss the hospital's cost of treating Thomas Eric Duncan, the Ebola patient who died in Texas. During his treatment the daily cost was about $18,000 to $24,000; the report took the upper bound and arrived at an estimate of $1,000 per hour. I think the cost of treating patients in Africa is certainly much lower than this figure.

The third set of data concerns who pays. A breakdown of donations toward the 987.8M dollars needed over these six months is available at http://data.163.com/14/1020/02/A8VGQE1600014MTN.html. Among those donations, the World Bank, the United States, and the African Development Bank account for the bulk, and most countries in the world have contributed to some degree. Transparent disclosure of where funds come from and how they are used helps countries, organizations, and individuals see in a timely way that the money is being used sensibly, which in turn encourages more parties to join the fight against the disease.

  

Summary

Although Ebola is no longer in the news as much as it was a few months ago, the global campaign against the epidemic is still under way and still under strain. In this information age, as our ability to collect and use data grows explosively, every corner of the world is closely connected to the information network. Data and appropriate analysis have become an important force in humanity's efforts to master nature and adapt to it.

[1] http://www.who.int/mediacentre/factsheets/fs103/en/

[2] http://apps.who.int/ebolaweb/sitreps/20141231/20141231.pdf

[3] http://rocs.hu-berlin.de/publications/ebola/index.html

[4] http://www.dailymail.co.uk/sciencetech/article-2722164/Ebola-flagged-computer-software-nine-days-before-announced-healthmap-used-social-media-spot-disease.html

[5] http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4198292/

[6] http://news.sciencemag.org/africa/2014/09/how-deadly-ebola-statistical-challenges-may-be-inflating-survival-rate
