Design and implementation of user behavior analysis system based on Hadoop
Beijing Jiaotong University Shang
Against the background of big data, this paper addresses the problem that network user behavior is not analyzed comprehensively or accurately. It presents the design and development of a Hadoop-based user behavior analysis system built on two key technologies: the network security development library Libnids and the distributed platform Hadoop. The system implements mass packet capture with distributed storage, TCP reassembly, and analysis of application-layer HTTP behavior. This helps service providers offer better recommendation services based on users' behavior characteristics, and it gives the relevant network authorities technical support for monitoring online public opinion.
The analysis method works as follows. First, the high-speed capture tool PF_RING captures traffic at the network entry point as the data source for user behavior analysis, and the captured data is placed in distributed storage. Next, Libnids reassembles the packets, performing TCP/IP reassembly and restoring the application-layer HTTP content. Finally, the Hadoop cluster is invoked, and distributed MapReduce programs analyze users' application-layer network activity. The result is an analysis that spans all layers from the physical layer to the application layer and profiles each user along four dimensions: search terms, shopping trends, website messages, and general website behavior. A timely understanding of user behavior and needs then makes it possible to apply strategies that manage user behavior and optimize network services, moving toward an intelligent network.
The system is studied and designed using mature behavior analysis techniques and the massive data processing platforms available in existing networks.
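The MapReduce analysis step can be illustrated with a small, self-contained sketch in plain Python. It is not the paper's implementation: the abstract names the four dimensions but not the rules that assign a request to one of them, so the parameter names in `SEARCH_PARAMS`, the path hints in `SHOPPING_HINTS`, the record layout `(user_ip, method, url)`, and all function names here are assumptions made for illustration. A real job would run the map and reduce steps on the Hadoop cluster rather than in one process.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qs

# Hypothetical classification rules; the source does not specify them.
SEARCH_PARAMS = ("q", "wd", "keyword")
SHOPPING_HINTS = ("/cart", "/item", "/order")


def classify(method, url):
    """Assign one HTTP request to one of the four behavior dimensions."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    for name in SEARCH_PARAMS:
        if name in query:
            return "search", query[name][0]
    if any(hint in parts.path for hint in SHOPPING_HINTS):
        return "shopping", parts.netloc + parts.path
    if method == "POST":
        # Assumption: a POST that is neither search nor shopping is a message.
        return "message", parts.netloc
    return "browse", parts.netloc


def map_record(record):
    """Map step: emit ((user, dimension, detail), 1) for one request."""
    user_ip, method, url = record
    dimension, detail = classify(method, url)
    yield (user_ip, dimension, detail), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts that share a key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def run_job(records):
    """Run map then reduce in-process, standing in for a cluster job."""
    pairs = [pair for record in records for pair in map_record(record)]
    return reduce_counts(pairs)
```

Calling `run_job` on a list of `(user_ip, method, url)` tuples returns a dictionary keyed by user, dimension, and detail, which is the kind of per-user profile the four dimensions describe.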
The main research contents are as follows: (1) packet capture technology for big data environments, with capture based on PF_RING; (2) research and development of data storage technology for storing the output files of the high-speed packet capture system; (3) HTTP protocol restoration under the MapReduce framework.
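To make item (3) concrete, the following sketch shows what restoring an HTTP request from an already reassembled TCP byte stream might look like. It assumes reassembly has been done upstream (the paper uses Libnids for that, which is not reproduced here) and that the stream is the client-to-server direction of one connection. The function name `parse_http_request` and the returned dictionary layout are hypothetical. Such a function could be called from a map task, once per stored stream.

```python
def parse_http_request(stream: bytes):
    """Parse the first HTTP request in a reassembled client-to-server stream.

    Returns a dict describing the request, or None if the header block is
    incomplete or the request line is malformed.
    """
    head, sep, rest = stream.partition(b"\r\n\r\n")
    if not sep:
        return None  # headers not fully received yet

    lines = head.decode("latin-1").split("\r\n")
    request_line = lines[0].split(" ")
    if len(request_line) != 3:
        return None
    method, target, version = request_line

    headers = {}
    for line in lines[1:]:
        name, colon, value = line.partition(":")
        if colon:
            headers[name.strip().lower()] = value.strip()

    # Only a numeric Content-Length is honored; anything else means no body.
    raw_length = headers.get("content-length", "0")
    length = int(raw_length) if raw_length.isdigit() else 0

    return {
        "method": method,
        "version": version,
        "url": "http://" + headers.get("host", "") + target,
        "headers": headers,
        "body": rest[:length],
    }
```

The sketch handles only the first request in a stream and ignores chunked transfer encoding; a fuller restoration step would loop over pipelined requests and decode chunked bodies.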