Application of Data Mining in Medicine

Source: Internet
Author: User

This article discusses the application of data mining in medicine. We hope it will inspire interested readers and serve as a reference for colleagues who apply data mining in other industries.

Data mining, also known as knowledge discovery in databases (KDD), is the process of extracting latent, valuable knowledge from large volumes of data. The patterns it uncovers are objective knowledge that exists in the data but is hidden and has not yet been found. For example, data mining can directly identify patients at high risk of disease, discover unknown relationships between diseases and symptoms, explore how laboratory indicators influence one another and how they relate to disease, predict unknown laboratory values, and explore relationships among complications. It can also automatically detect anomalies in a set of high-dimensional laboratory indicators. As another example, in research design, cluster analysis can be used to group data scientifically, and studying the relative weights of multiple factors can help choose among research designs such as nested analysis. Data mining is widely used in medical science; in clinical practice and research it is a frontier technique that traditional methods cannot match.

Cases of data mining in medicine abroad

Data mining is widely used in many industries abroad, and medicine is no exception. Many data mining techniques have been applied successfully in clinical practice and medical research. A few simple cases follow.

1. The application of cluster analysis in medicine

Diabetes is a common disease worldwide, with more than 180,000 Americans suffering from diabetes and a further 160,000 in the early stages of the disease. Clinical diagnosis of diabetes is often based on abnormal physical symptoms and laboratory values, including indicators such as body mass index (BMI) and blood pressure (BP). With a cluster analysis tool, we can analyze patients' diagnostic data, carry out exploratory data analysis, and investigate the significance of the resulting clusters. For patients with diabetes, the tool attempts to generate clustering patterns based on age, race, gender, BMI and BP, and divides the data into the corresponding natural groups.

The basic indicator data of the diabetes patients were analyzed with the cluster analysis tool, which generated clusters from well-separated cluster means. In this case, clustering three existing datasets produced between 5 and 8 clusters each; the number of patients in each cluster was small, and computing the clusters took from about 5 seconds to 4 minutes.

Through cluster analysis, experts identified four types of patients common to all three datasets:

• The patient is obese (BMI > 56), but blood pressure is normal;
• The patient's basic indicators (BMI, BP) are both normal;
• The patient's blood pressure is within the normal range, but body mass index is abnormal;
• The patient's basic indicators (BMI, BP) are both abnormal.

These four patient types reveal four typical profiles of diabetes patients, which is of great clinical significance.
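The grouping idea described in this case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not name the clustering tool or algorithm that was used, so plain k-means stands in for it here; the twelve (BMI, systolic BP) pairs are made-up values, not data from the study; and the features are left unscaled, which a real analysis would normally avoid.

```python
# Minimal k-means sketch on hypothetical (BMI, systolic BP) pairs.
# All patient values below are invented for illustration only.

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point that is farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Return (centroids, labels) after alternating assignment and update."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, labels


# Hypothetical patients: (BMI, systolic blood pressure in mmHg).
patients = [
    (22, 118), (24, 122), (23, 120),
    (39, 119), (41, 121), (40, 120),
    (23, 160), (22, 158), (24, 162),
    (40, 161), (41, 159), (39, 163),
]

centroids, labels = kmeans(patients, k=4)
for c, (bmi, bp) in enumerate(centroids):
    print(f"cluster {c}: mean BMI {bmi:.1f}, mean systolic BP {bp:.1f}, "
          f"n={labels.count(c)}")
```

On this toy data the four clusters separate cleanly, so a domain expert could read each centroid as a patient profile. Real patient data would also include age, race and gender, as the case describes, and would need feature scaling and a justified choice of the number of clusters.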

2. Application of association rule analysis in medicine

Association rules are a promising technique for discovering hidden associations in medical data. In general, however, association rule mining produces a very large number of rules from medical data, and most of them are medically irrelevant. Finding the useful rules is slow work for medical experts, and the rules are difficult to interpret afterwards. In this work, search constraints are introduced so that only medically meaningful association rules are discovered and the rule search becomes more efficient.

For example, association rule analysis found that cardiac perfusion measurements and patient risk factors are closely related to the degree of stenosis in four specific arteries. The support, confidence and lift of an association rule are commonly used to evaluate its medical significance, as shown in Figure 1.
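To make the three measures concrete, the sketch below computes support, confidence and lift for a single rule using their standard definitions. The patient records and item names (`smoker`, `perfusion_defect`, `stenosis`, `hypertension`) are hypothetical and are not taken from the study; the function name `rule_metrics` is likewise just an illustrative choice.

```python
# Support, confidence and lift for one association rule A -> C,
# computed over hypothetical patient records (each record is a set of items).

def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(records)
    n_a = sum(1 for r in records if antecedent <= r)      # records containing A
    n_c = sum(1 for r in records if consequent <= r)      # records containing C
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)
    if n == 0 or n_a == 0 or n_c == 0:
        raise ValueError("rule is undefined on these records")
    support = n_both / n               # P(A and C)
    confidence = n_both / n_a          # P(C | A)
    lift = confidence / (n_c / n)      # P(C | A) / P(C)
    return support, confidence, lift


# Invented records for illustration only.
records = [
    {"smoker", "perfusion_defect", "stenosis"},
    {"smoker", "perfusion_defect", "stenosis"},
    {"smoker", "perfusion_defect"},
    {"smoker", "stenosis"},
    {"perfusion_defect", "stenosis"},
    {"hypertension"},
    {"hypertension", "smoker"},
    {"hypertension"},
]

support, confidence, lift = rule_metrics(
    records, {"smoker", "perfusion_defect"}, {"stenosis"}
)
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

A lift above 1 means the consequent is more frequent among records that match the antecedent than in the data as a whole. A search constraint of the kind mentioned above could be as simple as only evaluating rules whose consequent is a stenosis item.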


3. Application of predictive analysis in medicine

Prostate cancer screening can detect cancer early, but not all patients benefit from subsequent treatment. Identifying which patients are most likely to have invasive cancer would therefore significantly reduce the number of prostate biopsies. Data were collected from 1,563 patients who underwent prostate biopsy and had a serum PSA of 10 μg/ml or less, and a predictive model was used to analyze invasive prostate cancer. The model was trained on a randomly selected 70% of the data, and the remaining 30% was used to test it. Of the 1,563 patients, 406 (26.1%) had cancer, and 130 of them (8.3% of all patients) had invasive prostate cancer. The predictive model produced the following rules for the invasive prostate cancer risk group:

1. PSAD (PSA density) is greater than 0.165 ng/ml/cc.

2. PSAD is greater than 0.058 ng/ml/cc and less than 0.165 ng/ml/cc, age is greater than 57.5 years, and prostate volume is greater than 22.7 cc.

The predictive model was validated on the test data: its sensitivity for invasive prostate cancer was 91.5% and its specificity was 33.5%. In the test data, when PSAD was 0.058 ng/ml/cc or less, the incidence of invasive prostate cancer was 1.1%. The predictive model can therefore effectively identify the risk group for invasive prostate cancer. Where only a diagnosis of high-grade prostate cancer would lead to subsequent treatment, the predictive model can eliminate 33.5% of unnecessary biopsies.
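The two risk-group rules and the two validation measures can be written out directly. The sketch below encodes the thresholds exactly as stated above and computes sensitivity and specificity on a handful of invented patients; it does not reproduce the study's data or its 91.5% and 33.5% figures, and the function names are illustrative assumptions.

```python
# Risk-group rules from the text, applied to hypothetical patients.

def in_risk_group(psad, age, volume_cc):
    """True if a patient matches either rule for the invasive-cancer risk group.

    psad: PSA density in ng/ml/cc; age in years; volume_cc: prostate volume.
    A PSAD of exactly 0.165 matches neither rule, following the text literally.
    """
    if psad > 0.165:                                            # rule 1
        return True
    if 0.058 < psad < 0.165 and age > 57.5 and volume_cc > 22.7:  # rule 2
        return True
    return False


def sensitivity_specificity(predicted, actual):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / (tp + fn), tn / (tn + fp)


# Invented test patients: (PSAD, age, prostate volume in cc, invasive cancer?).
test_patients = [
    (0.20, 60, 30.0, True),
    (0.10, 62, 25.0, True),
    (0.10, 50, 25.0, True),
    (0.18, 70, 40.0, False),
    (0.05, 65, 35.0, False),
    (0.12, 66, 20.0, False),
    (0.09, 59, 28.0, False),
    (0.03, 55, 45.0, False),
]

predicted = [in_risk_group(psad, age, vol) for psad, age, vol, _ in test_patients]
actual = [invasive for *_, invasive in test_patients]
sensitivity, specificity = sensitivity_specificity(predicted, actual)
print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```

Specificity is the share of patients without invasive cancer whom the rules place outside the risk group, which is why the text relates it to the proportion of biopsies that could be avoided.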

Application of data mining in medicine abroad

Many of the theories and techniques of data mining originated in Europe and North America, where research and application began early, so these countries have accumulated many years of technology and experience. They have invested heavily in the research and development of data mining, committing not only substantial funding but also strong R&D teams. Because the value of data mining is well recognized there, research on it is advanced and there is strong demand to apply the newest techniques in science and commerce; as a result, there are many mature and reliable application cases. Having used cutting-edge intelligent information technology early for health and medical research, these countries now lead the world in both the depth and breadth of data mining research and application. Much of this research has been turned into concrete technologies and products that are widely used and have produced significant social and economic benefits. For example, data mining is used in the following areas of medicine.

1. Prediction of disease and disease risk

By mining large volumes of medical data and applying intelligent decision-making technology, it is possible to predict hereditary diseases and multifactorial diseases, including common conditions such as angina pectoris, myocardial infarction, cerebrovascular disease, diabetes mellitus, hypertension, tumors, asthma and connective tissue disease. This has significant clinical value and broad social benefits. As shown in Figure 2, data mining techniques are used for an exploratory analysis of patients with unstable angina pectoris.

2. Prediction of population health and quality of life

Modern people have to cope with fast-paced study, work and life, and to deal with all kinds of complex social relationships. Facing competition and challenges, people's physical and mental condition steadily weakens, ages and develops pathological changes. The latest epidemiological surveys show that in some urban populations as many as 70% of people are in a state of sub-health, and the numbers of sub-healthy and sick people are increasing. By mining large volumes of medical data and applying intelligent decision technology, we can not only find health risk factors and their correlations but also make individualized predictions. Based on the mining results, a complete, careful and personalized health management system can be set up to help healthy and sub-healthy people establish an orderly, healthy lifestyle, reduce their risk and stay away from disease. It can also help sub-healthy people achieve early detection, early prevention, early diagnosis, early treatment and early surgery, which improves survival rates, reduces morbidity and mortality, and improves quality of life. As shown in Figure 3, a data mining predictive model is used to predict and analyze the hemoglobin index of people who are "overweight without abnormal blood lipids".

3. Predicting the probability of medical defects

By mining large volumes of medical data and applying intelligent decision technology, the causes, trends and related factors of medical defects can be revealed, supporting scientific management that reduces or even eliminates medical defects and disputes. For example, the cancer prevention and control center of the Canadian province of Ontario has developed and implemented a provincial preventive medicine and cancer control system. It mines the province's large body of tumor data for patient safety and accident prevention: data mining methods are used to reveal trends in clinical accidents, to identify the key factors that cause them, and to guide preventive measures.

4. Reducing medical expenses and optimizing medical resources

By mining large volumes of medical data and applying intelligent decision-making technology, medical expenses can be reduced greatly. With scientific health management based on the analysis of large volumes of medical data, medical expenses can be reduced to 10% of the original. As Dr. Dee W. Edington, director of the Center for Health Management Studies at the University of Michigan, puts it, health management holds one secret for any business or individual: 90% and 10%. Specifically, for the 90% of individuals and businesses that practice health management, medical costs fall to 10% of the original; for the 10% that do not, medical expenses rise by 90%. The application of data mining in medicine therefore brings remarkable economic benefits. Mining and applying large volumes of medical data also gives a clear picture of disease incidence and of the priorities in clinical prevention and treatment. This makes it possible to make better use of existing equipment and staff, clarify the direction for recruiting talent and introducing new technology, promote the renewal and development of medical facilities, adjust the medical layout, optimize medical resources, and make sound medical decisions.

Application of data mining in medicine in China

Data mining is receiving more and more attention and recognition in China, and we can predict that its application will spread to all walks of life!

In general, many experiments in medical data mining have been carried out in China, and people are continually exploring and making progress. In applying data mining to health and disease, however, there is still a gap between China and the leading countries, mainly in the following respects:

1. In terms of data mining theory and technology, much of our understanding is rather traditional and outdated. Many people's knowledge of data mining is limited to a few commonly used techniques and algorithms, which is a narrow view of the field. In fact, although data mining is still at an early stage of development, its scope has expanded considerably: it is no longer just the handful of techniques and algorithms commonly associated with it, but covers every technique and method that can be used to find hidden regularities in large data. Insufficient understanding affects the results of data mining research and application, so this is the first thing we need to improve.

2. In terms of the people who research, develop and apply data mining, many practitioners are teachers at universities and colleges, technical staff at medical research institutions, or other IT staff. Most have not worked systematically on the research and application of medical data mining, so it is difficult for them to understand the complete, advanced data mining systems used elsewhere in the world and the methods for applying them. Many are limited to exploring a few traditional algorithms, which keeps the starting point of data mining research and application low. This matters especially at the application level, because data mining draws together knowledge from many fields. It requires not only a deep grasp of data mining algorithms but also a deep understanding of large-scale data technology, including databases, data modeling, data integration and optimization for very large datasets, and of course in-depth medical expertise. The application of data mining in medicine therefore needs interdisciplinary talent: individuals who are at once mathematics, information and medical experts, or a tightly integrated team that combines all three.

3. In terms of application experience, many practitioners in China lack years of accumulated technical experience and mature experience in research and application. The application of data mining is therefore limited to exploratory use in certain areas, and there are few mature, stable real-world cases.

However, we firmly believe that, as long as we learn from one another, absorb what others have achieved, have the courage to explore and persevere, we will be able to advance both data mining applications and the cause of medicine!

Demand for data mining applications in medicine

Medical science is a large and complex body of knowledge, with a great deal of new knowledge and many new laws still to be discovered. As a tool for active discovery, data mining has been widely used in clinical medicine and scientific research. For example:

1. Apply data mining to medical data and patient records to explore latent medical regularities, and study the weight of various body indicators in health and their distribution across different populations.

2. Apply data mining to study the relationships among human physiological indicators and gain a deeper understanding of their combined significance. Exploring the internal relationships among physiological data, and between those data and health, can reveal the combined factors that affect health and so help explain what determines it.

3. Mine health examination data and patient data to find out how to assess health status comprehensively, analyze the factors that lead to disease, establish an evaluation model to predict risk, and go on to build disease prediction models.

Data mining is especially useful in medical research. Through many medical research support and service projects, we have come to understand deeply the difficulties researchers face, their needs, and the help they seek. Many medical researchers, for example, often feel that they have run out of research ideas and fret about the lack of a novel research question, because innovation is both the key and the hardest part of research. Some medical scientists feel powerless when they have to use precise, rigorous statistics for research and analysis, so the application of statistics has become a bottleneck in their work. Some scholars find it difficult to make an academic breakthrough and want to raise the level and standing of their research. In all of these cases, the techniques and methods of data mining can help.

Leaders of medical institutions, for their part, also hope that research in their units will prosper. In practice, however, they often feel helpless in the face of low research enthusiasm and an indifferent academic atmosphere. Lacking effective methods and means to change this, they feel powerless, worry about the lack of year-on-year progress in research, and feel as if on pins and needles about lagging behind in the quality and quantity of research. Changing the situation requires, on the one hand, developing research personnel and, on the other, vigorously improving research techniques and methods. With existing manpower and resources, improving research talent rarely shows significant results in a short time, whereas improving research techniques and methods is comparatively easier. Applying data mining is one way to improve research techniques.

An intelligent medical research tool with data mining at its core

To improve the methods of medical research and increase its quantity and quality, we have drawn on relevant technology and experience from abroad to propose and develop an intelligent medical research system with data mining at its core. We provide a complete solution for medical research and build a full intelligent research platform that distills the necessary research material from large accumulations of clinical data and provides intelligent research tools across the board, helping research work achieve more, faster, better and more economical results.

Specifically, our intelligent medical research system applies new developments in applied mathematics, computational science, intelligent computing and other disciplines to medical research. It draws on intelligent medical research technology and experience from abroad, combines them with the industry-leading technology we built up over years of successful work in North America, and integrates the expertise of Chinese medical experts to give Chinese medical users a tailored, high-end intelligent research platform. The system uses the hospital's existing electronic medical data (HIS, LIS, PACS, electronic medical records, physical examination systems, etc.) and purpose-built medical specialty databases to support multi-subject medical research shared online over the local network. It provides intelligent tools that make medical research novel, scientific, rigorous, efficient and low-cost, and it is expected to raise the overall research level of large hospitals and research units. Figure 4 shows the intelligent statistical analysis interface of the system.

The intelligent medical research system has the following characteristics:

· The intelligent analysis system, which is based on data mining technology, can directly uncover new medical knowledge and help researchers reach research results, and even major scientific discoveries, more quickly.

· It uses a variety of data mining techniques to explore regularities in the data, providing a scientific basis for research design, pointing out research directions and improving the success rate of research.

· Multiple research subjects can directly share the existing accumulated medical data, which greatly reduces the cost of research, so that the funds saved can be used to pursue more research.

· A powerful and easy-to-use sample screening system makes the collection of research data efficient and accurate and meets the stringent requirements placed on research data. The online research platform provides a package of tools for the whole research process, eliminating tedious and complicated manual data processing.

· The intelligent statistical workflow, built on classical research designs, keeps researchers from falling short because of design mistakes or misuse of statistical methods. Statistical algorithms embedded in the system run automatically, which frees researchers from the trouble of using complicated specialist statistical software.

Practice has shown that applying the intelligent medical research system in a hospital brings remarkable benefits and helps the hospital's research and clinical work develop healthily. For example, the hospital's overall research capacity is strengthened and its research level is raised; the number and quality of research projects and papers improve; more papers and results are published at national and international level; the impact of the research rises; and the hospital gains more opportunities to win important national, provincial and municipal projects. In short, better research raises the hospital's academic authority, widens its social influence, strengthens its soft power and its competitiveness in the medical market, and improves its economic performance accordingly.

Of course, improving research techniques and methods is a necessary means of making research more effective, but it is even more important to bring the initiative of researchers into play; an intelligent medical research tool with data mining at its core is only a good tool. If researchers have no incentive to innovate, lack enthusiasm, are preoccupied with urgent work, engage in pseudo-research, or are caught up in complicated internal personnel conflicts, then even the best and most advanced research tools will go unused, and real improvement in research will remain empty talk!

Author:

Hon Songlin is the founder of f&e Data Marvell Corp., the co-founder of the Bureau of Foreign experts, and a Canadian OCP certified expert, with 20 years of research, design, development and training experience in intelligent computing (data warehousing, business intelligence and data mining). He has senior project experience in North America and has been involved in a number of large-scale intelligent computing projects at large institutions such as Ontario Prov. Health Canada (OMH), Bank of Montreal (BMO), Canadian Institute of Technology (TELUS) and Ontario High Education Commission (OCAS). In recent years he has led the overall design and development of a number of intelligent computing products in China. Combining intelligent computing and business experience from North America with China's professional needs and data environment, he and his team developed an intelligent data analysis product whose core technologies are data warehousing, data mining and statistics; it has been applied successfully in Beijing, Tianjin and elsewhere.

In addition, the author has written a book, "Data Mining Technology and Engineering Practice".

(Responsible editor: Mengyishan)
