From December 12 to 14, 2014, the 2014 China Big Data Technology Conference (Big Data Technology Conference 2014, BDTC 2014) will open at the Crowne Plaza Hotel, New Yunnan, Beijing. It is hosted by the China Computer Federation (CCF) and the CCF Big Data Expert Committee, with the Chinese Academy of Sciences and CSDN as co-organizers. The conference lasts three days and aims to promote the development of big data technology in industry applications. It will feature theme forums and industry summits including "Big Data Infrastructure", "Big Data Ecosystem", "Big Data Technology", "Big Data Application", "Big Data Internet Finance Technology", and "Intelligent Information Processing", among others. The "2014 Second CCF Big Data Academic Conference", sponsored by the CCF and the CCF Big Data Expert Committee and co-organized with Nanjing University, will be held at the same time and will share its keynote reports with the technology conference.
The conference will invite nearly 100 top experts and front-line practitioners in big data technology from China and abroad. They will discuss the latest developments in open-source software (OSS) such as YARN, Spark, Tez, HBase, Kafka, and OceanBase; trends in NoSQL/NewSQL, in-memory computing, stream computing, and graph computing; how the OpenStack ecosystem addresses big data computing needs; and the latest industry applications of big data visualization, machine learning/deep learning, business intelligence, and data analysis. Speakers will also share the technical characteristics of real production systems and the practical experience gained from them.
Among them, the "Big Data Application" sub-forum has invited Chen Jidong, member of the CCF Big Data Expert Committee and Director/Senior Data Expert of Security Intelligence at Ant Financial Services Group, to give the talk "Applications of Big Data Analysis in Network Security and Fraud Risk Management."
Ahead of the conference, CSDN spoke briefly with Chen Jidong about trends in big data technology and the content of his talk. Chen has long focused on research and advanced development of large-scale data management and analysis, and has worked with Greenplum, MapReduce, HBase, Hive, Kafka, Storm, Spark, and many other technologies. His current focus is on distributed real-time graph architectures and real-time complex event processing (CEP) applications. In his view, the main challenge for today's financial-grade security and risk control systems is likewise the real-time processing of massive data.
At the Big Data Application sub-forum on December 14, Chen Jidong will focus on two topics: how Ant Financial's big data risk control system uses massive user behavior and relationship network data for predictive analysis and modeling, so that transaction and account risks are identified in advance; and how Ant Financial's newest security cloud service product, Safety Treasure, uses big data to help banks and other financial institutions manage fraud risk. Sign up to meet Chen Jidong in person!
Chen Jidong
Director/Senior Data Expert of Security Intelligence at Ant Financial, member of the CCF Big Data Expert Committee
Dr. Chen Jidong is Director/Senior Data Expert of the Security Intelligence Department at Ant Financial Services Group, where he is responsible for the big-data-based account security and transaction risk management system for Alipay. He was previously chief data scientist of the Renren Big Data Research Center and director of the Big Data Laboratory at the EMC China Research Institute. He has long focused on research and advanced development of large-scale data management and analysis, especially big data analysis for the mobile internet and financial risk management. Chen received his Ph.D. in computer science from Renmin University of China in 2007, completed postdoctoral work at the postdoctoral research station of the School of Computer Science at Fudan University in 2012, and has joined the China Computer Federation (CCF) Big Data Expert Committee. He has applied for 5 U.S. patents and 2 Chinese patents in fields related to big data analysis.
The interview with Chen Jidong follows:
About Big Data Practice
CSDN: First, please describe your company's business, the value of big data to that business, and your department's responsibilities.
Chen Jidong: Ant Financial serves small and micro enterprises and ordinary consumers as its main users. It is building a financial ecosystem centered on three open platforms (data, technology, and services), and it supports and helps partners create value for users. The company's businesses include Alipay, Alipay Wallet, Yu'e Bao, Zhao Cai Bao, Ant Micro Loan, and the Net Business Bank now in preparation. Big data is at the core of Ant Financial: we are moving from "operating with data" to "operating data", and building a credit system with data at its center.
The Security Intelligence Department mainly performs predictive analysis and modeling on massive user behavior and relationship network data, and uses the big data risk control system to monitor transaction and account risk in real time and identify it in advance. Through our security data products we also provide security cloud services on the financial cloud platform for the DT (Data Technology) era, helping merchants, banks, and other third-party financial institutions deal with network risk and fraud.
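As an illustration of what analysis on "relationship network data" can look like, the following is a minimal sketch, not a description of Ant Financial's system. It assumes a hypothetical login log of (account, device) pairs and groups accounts that are linked through shared devices using a union-find structure. The function names and the data schema are assumptions made for this example.

```python
from collections import defaultdict


def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def linked_account_groups(logins):
    """Group accounts connected through shared devices.

    `logins` is an iterable of (account_id, device_id) pairs
    (a hypothetical schema chosen for this sketch).
    """
    parent = {}
    for account, device in logins:
        a, d = ("acct", account), ("dev", device)
        parent.setdefault(a, a)
        parent.setdefault(d, d)
        ra, rd = find(parent, a), find(parent, d)
        if ra != rd:
            parent[ra] = rd  # merge the two components

    groups = defaultdict(set)
    for node in parent:
        if node[0] == "acct":
            groups[find(parent, node)].add(node[1])
    return sorted(sorted(g) for g in groups.values())


logins = [("alice", "d1"), ("bob", "d1"), ("bob", "d2"),
          ("carol", "d3"), ("dave", "d2")]
groups = linked_account_groups(logins)
print(groups)
```

In a setting like the one described, a known-bad account appearing in a group could be one signal for reviewing the other accounts in that group. A production system would need a distributed graph store rather than an in-memory dictionary.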
CSDN: You have worked as a data scientist in several businesses. Which big data technologies have you used in your projects? What are you satisfied with, and where do they fall short?
Chen Jidong: I have used a variety of mainstream big data technologies, including MPP databases such as Greenplum; MapReduce, HBase, and Hive from the Hadoop ecosystem; and Kafka, Storm, and Spark.
My overall impression from using these technologies:
Satisfied: their advantages are clear for large-scale offline data analysis, near-real-time data query and analysis, and streaming data processing.
Not satisfied: 1) there is no system that integrates distributed system architecture with massive data mining; 2) there is no real-time distributed graph framework and system of the kind needed for mining massive graph data.
CSDN: What are the main difficulties in putting big data into practice in your industry?
Chen Jidong: They are the same points I was dissatisfied with above. Financial security and risk control systems place very high demands on real-time processing of massive data:
They require a high-performance, highly reliable, highly available, large-scale real-time computing infrastructure, for example a closed data-processing loop in which data acquisition, transmission, computation, and analysis all complete within milliseconds. They require a flexibly configurable, elastically scalable model and rule platform that supports real-time event processing and variable computation, a distributed rule engine, and both online and offline model development and deployment. They also require a massive distributed graph framework that supports real-time query and real-time analysis and mining on massive graph data.
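To make "real-time event processing and variable computation" feeding a rule engine more concrete, here is a minimal single-process sketch. It is an illustration under stated assumptions, not the platform described in the interview: the class `SlidingWindowCounter`, the rule names, and the thresholds are all hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Per-account transaction count and amount total over the last `window_s` seconds."""

    def __init__(self, window_s):
        self.window_s = window_s
        self.events = defaultdict(deque)  # account -> deque of (timestamp, amount)

    def add(self, account, ts, amount):
        q = self.events[account]
        q.append((ts, amount))
        # Evict events that have fallen out of the time window.
        while q and q[0][0] <= ts - self.window_s:
            q.popleft()
        return len(q), sum(a for _, a in q)


# Rules are (name, predicate) pairs over the computed variables.
# Thresholds are made up for illustration.
RULES = [
    ("too_many_txns", lambda v: v["count"] >= 3),
    ("high_total", lambda v: v["total"] > 5000),
]


def evaluate(counter, account, ts, amount):
    count, total = counter.add(account, ts, amount)
    variables = {"count": count, "total": total}
    return [name for name, pred in RULES if pred(variables)]


counter = SlidingWindowCounter(window_s=60)
stream = [(0, 100), (10, 200), (20, 300), (100, 6000)]
hits = [evaluate(counter, "acct-1", ts, amt) for ts, amt in stream]
print(hits)
```

Keeping rules as data (a list of named predicates) rather than hard-coded branches is what makes such a platform "configurable": rules can be added or changed without touching the variable computation. A distributed version would partition the window state by account across workers.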
CSDN: Based on your experience, what are the mistakes an organization most easily makes that cause big data projects to fail?
Chen Jidong: The following common misconceptions make big data projects pay a price:
blindly pursuing "big" data while ignoring data quality, data timeliness, and the integration of different data sources; pursuing a single technology such as Hadoop in the hope that it will solve every big data processing problem; and pursuing overly large, all-inclusive big data systems and strategies without considering how to migrate from the existing database architecture to the new big data architecture.
About Big Data Technology Trends
CSDN: New technologies in the big data field are developing rapidly. Which technical trends do you think deserve attention across the big data industry?
Chen Jidong: There are many big data processing technologies, including batch processing, real-time stream computing, interactive query and analysis, distributed in-memory computing, and graph computing frameworks. Rather than any single system or tool, I am more optimistic about full-stack big data ecosystems such as the Hadoop and Spark open-source ecosystems. They cover every stage of the data lifecycle, from data acquisition, storage, processing, and access to high-level analysis and visualization, as well as metadata management and workflow tasks.
In addition, the need for in-depth analysis of big data (such as predictive analysis) will give rise to a new generation of real-time big data analysis platforms that truly integrate data storage and management (distributed storage and SQL) with mining and analysis (parallel machine learning) into a unified end-to-end solution.
CSDN: For your industry, which technologies are you mainly watching and researching at present, and why are you optimistic about them?
Chen Jidong: From Ant Financial's perspective, our current areas of focus include distributed real-time graph architecture, real-time complex event processing (CEP), big data security and privacy, big data value assessment, and innovative big data applications.
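For readers unfamiliar with complex event processing, the sketch below shows the basic idea on a single machine: detect a sequence of events that matches a pattern within a time limit. The pattern used here (a password change followed by a large transfer on the same account within a few minutes) and every field name and threshold are assumptions chosen for illustration, not something stated in the interview. Events are assumed to be ordered by timestamp.

```python
def detect_takeover_pattern(events, max_gap_s=300, min_amount=1000):
    """Return (account, change_ts, transfer_ts) for each matched sequence.

    `events` is an iterable of dicts with keys "account", "type", "ts",
    and optionally "amount" (a hypothetical schema), ordered by "ts".
    """
    last_pw_change = {}  # account -> timestamp of most recent password change
    alerts = []
    for e in events:
        if e["type"] == "password_change":
            last_pw_change[e["account"]] = e["ts"]
        elif e["type"] == "transfer":
            t0 = last_pw_change.get(e["account"])
            if (t0 is not None
                    and e["ts"] - t0 <= max_gap_s
                    and e.get("amount", 0) >= min_amount):
                alerts.append((e["account"], t0, e["ts"]))
    return alerts


events = [
    {"account": "a", "type": "password_change", "ts": 0},
    {"account": "a", "type": "transfer", "ts": 120, "amount": 2000},
    {"account": "b", "type": "transfer", "ts": 130, "amount": 5000},
    {"account": "a", "type": "transfer", "ts": 1000, "amount": 3000},
]
alerts = detect_takeover_pattern(events)
print(alerts)
```

Only the first transfer on account "a" matches: account "b" has no preceding password change, and the later transfer on "a" falls outside the 300-second limit. Real CEP engines express such patterns declaratively and keep the per-key state distributed.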
I think the future of big data lies in integrating, analyzing, and using data from a wider range of sources, extending from traditional retail and media to finance and on to more new fields, and in mining more knowledge and insight from that data. Data quality, data security, and an open mindset toward data will be the major challenges for big data analysis in the future.
About Big Data Talent
CSDN: Talent is also important to the successful implementation of big data projects. What experience can you share about building big data teams?
Chen Jidong: Big data professionals need to combine analytical ability with engineering ability, and analytical ability with business ability. We cultivate big data talent through application-driven analysis practice: data analysis and mining require a strong understanding of the business and strong business skills, and people should also develop a certain level of engineering implementation ability.
CSDN: What qualities do you think a good data scientist needs? If a college graduate is determined to become a data scientist, what advice do you have?
Chen Jidong: As I understand it, a data scientist is an all-round professional who combines business understanding, data analysis and mining, and distributed systems. My advice to graduates is to start from applied practice, beginning with the simplest and most tedious data cleaning and business learning, then gradually build up analysis and mining skills and develop a sharper sense for data and for the business, so that they can use data-driven thinking to solve practical problems and create value.
About BDTC
CSDN: Please talk about the topic you are about to share at this conference.
Chen Jidong: My topic is big data security and risk control. Faced with hundreds of millions of accounts and transactions, how do we pick out the very small number of high-risk records among them? By combining business understanding with data analysis, we identify account and transaction risk in advance and judge the risk of a fraudulent transaction before the transaction takes place, so that theft is prevented. This is the most important point where big data and security meet. I will share how to build a data-driven risk control system that shifts account risk identification from traditional account-and-password verification to analysis and prediction based on massive user behavior.
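The shift from password checks to behavior-based prediction can be illustrated with a toy scoring function that compares a pending transaction against the account's own history before it is approved. This is a hand-written heuristic under stated assumptions, not the modeling approach of the system discussed: the field names, the point values, and the three-standard-deviation threshold are all hypothetical, and a real system would learn such signals from data.

```python
from statistics import mean, pstdev


def risk_score(history, txn):
    """Score a pending transaction against the account's past behavior.

    `history` is a non-empty list of dicts with keys "amount", "device",
    and "city" (a hypothetical schema); `txn` has the same keys.
    Returns an integer score from 0 to 100; higher means more unusual.
    """
    if not history:
        raise ValueError("history must not be empty")

    score = 0
    amounts = [h["amount"] for h in history]
    mu, sigma = mean(amounts), pstdev(amounts)
    # Amount far above what this account normally spends.
    if sigma > 0 and (txn["amount"] - mu) / sigma > 3:
        score += 40
    # Device never seen for this account.
    if txn["device"] not in {h["device"] for h in history}:
        score += 30
    # City never seen for this account.
    if txn["city"] not in {h["city"] for h in history}:
        score += 30
    return score


history = [
    {"amount": 100, "device": "phone-1", "city": "Hangzhou"},
    {"amount": 120, "device": "phone-1", "city": "Hangzhou"},
    {"amount": 80, "device": "phone-1", "city": "Hangzhou"},
    {"amount": 110, "device": "laptop-1", "city": "Hangzhou"},
    {"amount": 90, "device": "phone-1", "city": "Hangzhou"},
]
risky = risk_score(history, {"amount": 5000, "device": "phone-9", "city": "Hangzhou"})
normal = risk_score(history, {"amount": 105, "device": "phone-1", "city": "Hangzhou"})
print(risky, normal)
```

The point of the sketch is that a correct password alone would pass both transactions, while a behavioral profile separates them before the payment is executed.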
CSDN: Which attendees will benefit most from this topic? What problems can your talk help the audience solve?
Chen Jidong: It is suited to internet finance practitioners, especially risk control analysts, data analysis and mining engineers, and internet security analysts and engineers. It can help them understand how to use big data to identify and manage fraud risk, how to model and analyze transaction and account risk, and what a big-data-based risk control system requires.
CSDN: What do you expect from BDTC2014?
Chen Jidong: This is a gathering of top experts and front-line practitioners in big data technology from China and abroad, who will discuss in depth the latest developments and practical experience in the field. I am personally looking forward to the talks on advanced real-time big data analysis infrastructure and on innovative big data analysis applications.
The National Big Data Innovation Project selection is now in full swing.
The 2014 China Big Data Technology Conference (Big Data Technology Conference 2014, BDTC 2014) will be held at the Crowne Plaza Hotel, New Yunnan, Beijing, from December 12 to 14, 2014. Running since 2008 and now in its seventh year, the China Big Data Technology Conference is currently the largest and most influential technology event in the big data field. At this year's conference you will hear Apache Hadoop committer Uma Maheswara Rao G (a member of the Project Management Committee), Yi Liu, and Bikas Saha, a member of the Apache Hadoop and Tez Project Management Committees, among others, share the latest achievements and development trends of general-purpose big data open-source projects. You will also hear dozens of practical, hands-on talks from Tencent, Alibaba, Cloudera, LinkedIn, NetEase, and other organizations. A small number of discounted tickets are still available.
Subscribe for free to the "CSDN Big Data" WeChat official account to follow the latest big data developments in real time!
CSDN Big Data focuses on sharing and discussing big data news, technology, and experience. It covers Hadoop, Spark, Impala, Storm, HBase, MongoDB, Solr, machine learning, intelligent algorithms, and related topics, and provides big data opinion, technology, platform, practice, and industry information services.