Network Content Filtering Technology

Source: Internet
Author: User

With the rapid spread of the Internet, online "junk" content has begun to intrude into our lives: objectionable information on websites, spam, virus-laden email, email that leaks confidential information, and online chat have gradually worked their way into people's minds. How can we take the best of the Internet and discard the dregs, protecting ourselves and the teenagers who most need protection? A new technology, content filtering, has emerged in response and is attracting attention.

Pandora's box has already been opened

The negative effects of Internet content fall broadly into two categories: first, entertainment content that wastes people's time; and second, harmful information that damages people's minds.

As for the first, countless entertaining distractions on the Internet consume our precious time. These non-work activities include online games, online shopping, stock trading, online radio, streaming media, and MP3 downloads, all of them new temptations for online users. According to a new survey published by Websense, one in four U.S. employees spends at least one workday each week browsing content on the Internet that is unrelated to work. A survey conducted by the Administration Association of the United States likewise found that more than 50% of enterprise employees' Internet activity has nothing to do with work, which means that part of the salary these employees receive each month pays for activity unrelated to their jobs. This costs the United States billions of dollars a year. Experts who specialize in Internet addiction add that 25% to 50% of Internet addicts do their surfing at the office. If enterprises ignore Internet use during working hours and do not block objectionable websites, serious consequences are likely to follow.

If the first problem is not harmful in an absolute sense, the second is a different matter. According to a survey by relevant institutions, 34.6% of teenagers admit that they have browsed pornographic websites, and 4.9% admit that they are "regulars". As a result, many teenagers have neglected their studies and become addicted to this "online heroin".

Two main technologies for content filtering

Appropriate technical measures can filter out harmful information on the Internet, shielding people from it and meeting society's ideological requirements. At the same time, filtering standardizes users' online behavior, improves work efficiency, makes rational use of network resources, and reduces virus attacks on the network. This is the fundamental purpose of content filtering technology.

In general, content filtering techniques include list filtering, keyword filtering, image filtering, template filtering, and intelligent filtering. In terms of deployment, content filtering currently falls into two main types: gateway-based and proxy-based.

First, gateway-based content filtering is usually embedded in a dedicated security gateway, firewall, or other gateway device. Such devices generally filter content both statically and dynamically. Static filtering lets you define lists of trusted sites and prohibited sites. For example, static filtering can block the "dating community" website so that none of its content can be accessed. Dynamic filtering is just as important, because the Internet and the Web are not static: hundreds of millions of new pages are added to the Web every year, and new sites and pages appear every minute.

In addition, a Web page is not a single entity but is composed of many independent components. Each component has its own URL and can be fetched independently by the browser, so each component may itself be subject to filtering. Dynamic content filtering lets you specify keywords and blocks any URL that contains one of them, so the filter can decide whether a user should receive a requested URL even when that URL has not been listed explicitly. For example, dynamic filtering can deny access to all sites with the word "porn" in the URL. An ideal firewall should support not only static content filtering but also a list of broad categories that you can choose to block, such as auction, chat, job search, games, hate/discrimination, history, jokes, news, stocks, and swimsuits. This feature lets office administrators and parents allow or block access to any type of site. Because the Internet is always changing, the category lists should also be updated regularly with newly classified URLs.
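The static, category, and dynamic checks described above can be combined in one decision function. The following Python sketch is illustrative only: the host names, keyword, category assignments, and the function name check_url are all hypothetical, and the order of the checks (trusted list first, keyword match last) is an assumption rather than something the products discussed here are known to do.

```python
from urllib.parse import urlsplit

# Hypothetical example data; none of it comes from a real filtering product.
TRUSTED_HOSTS = {"intranet.example.com"}
BLOCKED_HOSTS = {"dating-community.example"}
BLOCKED_KEYWORDS = {"porn"}
HOST_CATEGORIES = {"auctions.example": "auction", "chat.example": "chat"}
BLOCKED_CATEGORIES = {"auction", "games"}


def check_url(url: str) -> str:
    """Return 'allow' or 'block' for a requested URL."""
    host = (urlsplit(url).hostname or "").lower()

    # Static filtering: explicit trusted and prohibited sites.
    if host in TRUSTED_HOSTS:
        return "allow"
    if host in BLOCKED_HOSTS:
        return "block"

    # Category filtering: block whole site types chosen by the administrator.
    if HOST_CATEGORIES.get(host) in BLOCKED_CATEGORIES:
        return "block"

    # Dynamic filtering: keyword match anywhere in the URL,
    # which also catches URLs that appear on no list.
    lowered = url.lower()
    if any(word in lowered for word in BLOCKED_KEYWORDS):
        return "block"

    return "allow"
```

Because every component of a page has its own URL, a filter of this kind would be called once per requested object, not once per page. Plain substring matching is deliberately naive here and would also block harmless URLs that happen to contain a keyword.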

Second, proxy-based content filtering is implemented mainly by dedicated hardware proxy devices. These devices are generally configured as proxy cache servers and deployed between enterprise users and the Internet, where they can manage users' content requests intelligently. When a user requests a URL, the request first reaches the designated port of the dedicated security device for authentication and authorization. If the objects on the requested page are already in the device's local cache, they are served to the user directly from the cache. If they are not, the security device acts as the user's proxy and communicates with the source server over the Internet. When an object is returned from the source server, it is stored in the local cache to serve subsequent requests, and a copy is sent to the requesting user. The entire process is monitored and logged to produce access reports and statistics, which give the enterprise a basis for planning.
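The request flow above (authorize, serve from cache, otherwise fetch from the source server, store, and log) can be modeled in a few lines of Python. This is a toy sketch under stated assumptions: the class name CachingFilterProxy, the fetch_origin callback, and the in-memory dictionary cache are all hypothetical, and a real device would also handle cache expiry, content inspection, and actual network I/O.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple


class CachingFilterProxy:
    """Toy model of a filtering proxy: authorize, check cache, else fetch and store."""

    def __init__(self, fetch_origin: Callable[[str], bytes], allowed_users: Set[str]):
        # fetch_origin is a hypothetical stand-in for contacting the source server.
        self.fetch_origin = fetch_origin
        self.allowed_users = allowed_users
        self.cache: Dict[str, bytes] = {}
        # Each log entry is (user, url, outcome), kept for access reports.
        self.access_log: List[Tuple[str, str, str]] = []

    def handle(self, user: str, url: str) -> Optional[bytes]:
        # Authentication and authorization happen before anything else.
        if user not in self.allowed_users:
            self.access_log.append((user, url, "denied"))
            return None

        # Serve directly from the local cache when the object is present.
        if url in self.cache:
            self.access_log.append((user, url, "cache-hit"))
            return self.cache[url]

        # Otherwise act as the user's proxy, then store a copy for later requests.
        body = self.fetch_origin(url)
        self.cache[url] = body
        self.access_log.append((user, url, "cache-miss"))
        return body
```

Counting the "cache-hit", "cache-miss", and "denied" entries in access_log gives the kind of access statistics the text mentions as a basis for enterprise planning.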

A long way to go: the Internet still has its dark side

Currently, content filtering products rely on blacklists, keywords, and simple templates to filter objectionable content. Because content on the Internet changes rapidly, these lists and templates must be updated promptly. An important indicator of how advanced a product is, therefore, is the size of the blacklist library supplied by the vendor and the proportion of content it filters effectively.

Experts also believe that most current filtering technologies are implemented at the application layer, with poor adaptability and security. Implementing filtering at the network layer instead raises two major challenges. First, application-layer analysis must be comprehensive: because network packets are analyzed directly for their application-layer content, the implementer must fully understand how every application to be filtered behaves on the network, how many states it has, and whether it has any special implementations. Second, compatibility must be preserved: to integrate with the operating system's low-level network processing, the implementer must fully understand the operating system's network implementation and may even have to replace some of its functions. Doing so without affecting the operating system's original functions is quite difficult, especially on Windows, where low-level information is scarce.

Despite the difficulties and bottlenecks that content filtering technologies and products face, people are calling for a "green network space" as the network develops, and this demand has greatly promoted the growth of the "content security" industry. According to statistics, annual sales of content filtering software in the United States run to billions of dollars.

The two main deployment approaches, proxy-based and gateway-based, together with the list, keyword, image, template, and intelligent filtering techniques they use, have become relatively mature. Products come in standalone (home), Internet bar, enterprise, domestic, hotel, ISP, and telecommunications versions, among others, and cover most fields. It is worth noting, however, that content filtering technology as a whole is still in its infancy, and the techniques in practical use are relatively simple. List filtering and keyword filtering are mature, but image filtering and template filtering are still at an early stage, held back by the negative impact that intelligent image recognition and filtering has on machine and network performance. At present, content filtering mainly handles fixed content such as URLs and webpage text and cannot make intelligent judgments. This is the current state of content filtering technology.

