Big data's injury: small-data thinking

Source: Internet
Author: User
Keywords: big data, small data, intuition

Until 1980, clinicians relied primarily on experience, intuition and intangible cues to judge whether a child's fever was caused by a mild illness (such as a cold) or by a more serious one (such as acute pneumonia or meningitis). In other words, they practiced medicine by intuition. In 1980, a team of researchers examined how experienced pediatricians diagnosed their patients. They found that the best physicians intuitively drew on certain "inputs", while inexperienced physicians were too subjective to assess those same inputs reliably.

In follow-up studies, the researchers improved their system in two respects: accuracy and objectivity. With this system, pediatricians in training gain exposure to many children with serious fever-causing illnesses, just as experienced physicians have. This changed things fundamentally: intuition had been captured in a form that is both qualitative and quantitative, and inexperienced physicians could make use of it. Today, almost every doctor who treats children with fever relies on this elegant finding.

If our goal is to provide the best treatment at every child's visit, we need more than intuition and professional skill, because nobody is perfect. Evidence-based medicine (EBM) helps physicians improve treatment by incorporating clinical research into treatment guidelines. In general, however, EBM rests on "small data" research: unlike "big data", with its hundreds of thousands or millions of cases, a large EBM study covers only thousands of cases. The inputs to a system with so small a sample must be well defined and formalized, and as a result the treatment guidelines built on them cannot adequately account for the differences between one patient and another. For this reason EBM is sometimes ridiculed as "cookbook medicine", in which doctors mechanically follow the "recipe" for each treatment. Chicken and spinach may be delicious for some people, but what happens when we serve it to a vegetarian?

Big data is large enough to support more personalized "treatment recipes". With a dataset of 500 million people, you can tailor a treatment to a 35-year-old man who takes aspirin every day and has good cholesterol levels, or to someone who is exactly the same except that he is underweight.
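The arithmetic behind this claim can be sketched in a few lines of Python. This is only an illustration, not a method from the article: the function name `expected_cohort_size`, the population shares, and the simplifying assumption that the criteria are statistically independent are all hypothetical and chosen purely to show the scale of the effect.

```python
from typing import Iterable


def expected_cohort_size(population: int, shares: Iterable[float]) -> float:
    """Expected number of people who match every criterion.

    Assumes (for illustration only) that the criteria are independent,
    so the matching share is the product of the individual shares.
    """
    size = float(population)
    for share in shares:
        size *= share
    return size


# Hypothetical shares, NOT real epidemiological figures.
criteria = {
    "male": 0.5,
    "aged 35": 0.015,
    "takes aspirin daily": 0.05,
    "underweight": 0.02,
}

# A "small data" study of a few thousand cases versus a big dataset.
small = expected_cohort_size(5_000, criteria.values())
big = expected_cohort_size(500_000_000, criteria.values())

print(f"small dataset: about {small:.2f} matching patients")
print(f"big dataset:   about {big:.0f} matching patients")
```

Under these made-up shares, the small study is expected to contain less than one matching patient, while the big dataset contains thousands, which is the difference between having no basis for a tailored recommendation and having a usable sample.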

Big data also lets us find small but powerful clues by working through rough, unprocessed data. Small datasets usually cannot handle rough raw data, because a small dataset cannot tell that "MI" and "myocardial infarction" refer to the same thing. If only a single term is counted, a small dataset yields too few cases for conclusive generalizations. At the same time, a small dataset cannot support the study needed to establish that "heart infarction" and "myocardial infarction" are the same term. Small datasets also cannot support the use of very fine-grained clues as inputs, because such clues appear too sporadically in the data: no conclusive generalization can be drawn from so small a sample.
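The effect of term variants on counts can be illustrated with a minimal Python sketch. It assumes the equivalence between the terms has already been established and is stored in a hand-written table; the names `SYNONYMS`, `normalize` and `count_concepts`, and the sample notes, are hypothetical. Learning such a table from raw data is the part that, by the argument above, needs a large dataset, and it is not shown here.

```python
from collections import Counter

# Hypothetical synonym table mapping raw term variants to one concept.
SYNONYMS = {
    "mi": "myocardial infarction",
    "heart infarction": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
}


def normalize(term: str) -> str:
    """Lowercase, collapse whitespace, then map known variants to one concept."""
    key = " ".join(term.lower().split())
    return SYNONYMS.get(key, key)


def count_concepts(raw_terms):
    """Count occurrences per concept after normalization."""
    return Counter(normalize(t) for t in raw_terms)


# Made-up raw entries as they might appear in unprocessed notes.
raw_notes = ["MI", "myocardial infarction", "Heart  Infarction", "mi", "fever"]

raw_counts = Counter(raw_notes)      # every spelling counted separately
counts = count_concepts(raw_notes)   # variants merged into one concept

print(raw_counts)
print(counts)
```

Counted raw, the four infarction entries are split across four different spellings of one case each; after normalization they become a single concept with four cases. In a small dataset that split is what makes each term too rare to generalize from.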

There is a growing debate over whether big data is replacing intuition in medicine. In any case, big data remains our greatest hope for getting computers to emulate the intuition of human experts, so that we no longer have to rely on small datasets as EBM does. The real problem is not that big data threatens medical intuition but the opposite: it is not yet able to. Medicine does not rely much on big data today because big data really does require large amounts of data, and medical researchers do not have truly large clinical datasets at hand.

The cost of building, maintaining, labeling and safeguarding a clinical dataset is too high. The penalties for leaking information from a dataset are heavy, while the rewards for building one are almost nonexistent. Even government-backed health information exchange projects usually do not aggregate data. Instead, these systems let a logged-in user query an external system and retrieve one patient's data at a time, and the data returned is usually in summary form. Big data analysis is not possible in such a system.

However, the biggest obstacle to large medical datasets is the set of so-called "best practices" prevalent in medical informatics, a field that lags other industries by ten to twenty years. Health-care information systems continue to reinforce outdated data barriers, which are the foundation of "small dataset" research. In such a system, only reviewed, standardized, edited data is accepted; rough raw data is not. The resulting dataset is small because the barrier process is a bottleneck at the data source, and much of the available data is shut out for lack of consistency. The barrier produces homogeneous data, stripped of the diversity that makes a system truly useful. It is like white bread: an empty, refined product from which the best nutrients of the grain have been filtered out. Had Google and Amazon applied such barriers, they would not have succeeded; raw big data is the reason for their success.

Unless every doctor has unfailing intuition at all times, computers should be used to provide better medical care. If we discard small-data thinking along the way and start building real big data, big data will play a larger role in supporting medicine.

(Responsible editor: The good of the Legacy)
