On Improving Web Server Performance, Thesis 2: Application in a Digital Library

Source: Internet
Author: User
Tags: ibm db2
[Abstract]
The information system of a medium-sized or large library involves many technologies and solutions. This article focuses on those related to Web server performance.
I had the honor of being one of the project leads for the design of a digital information system at a large library and for the development of its Web application software. Most of the information circulating in a digital library information system is multimedia: digital indexes, abstracts, full text, images, audio, and video. This places high demands on the performance of the Web servers.
Based on practical engineering experience, this article describes how to improve Web server performance from two angles: hardware measures (cache servers, server load balancers, dual-server Web mirroring, CPU and NIC upgrades, and network bandwidth expansion) and software measures (three-tier C/S software architecture design and application deployment). Together these let users work with the application system more quickly, efficiently, and securely.
[Body]
With the development of intranet information technology, digitizing a library's information systems has become imperative if the library is to make full use of its book circulation, data retrieval, and academic exchange functions. One library has launched several digital library projects in order to join the ranks of the world's advanced libraries as soon as possible.
The digital library project mainly comprises the external Web information publishing system, an interactive retrieval network, the back-end collection information management system, multimedia data collection and production, and a VOD (video-on-demand) system. I was fortunate to be one of the project leads. I took part in the overall design of the entire digital information system and in the development of some of the Web-based applications (such as the external information publishing system, the image/full-text hybrid retrieval system, and the VOD system).
In terms of network environment, the library's digital information system is divided into several network segments: (1) Internet access, over a 2 MB DDN leased line; (2) the public network segment (demilitarized zone, DMZ), which mainly contains the front-end database server, Web server, e-mail/FTP/DNS server, retrieval server, and SAN storage devices; (3) the internal LAN, which contains the intranet Web servers, back-end database servers, and OA servers; (4) the VOD private network, which contains the audio and video on-demand servers. Strict network-level and application-level access permissions are enforced through a high-performance switch with Layer 3 switching capability, a security authorization and authentication system, and related measures, which effectively controls access rights and ensures data security and integrity. Taking into account cost, staff skills, and future maintenance and management, the operating system is Windows NT, the servers are Dell high-end models, and the database is IBM DB2. The backbone network is Gigabit switched Ethernet, with a LAN of MB to the desktop and a VOD network of 10 MB to the desktop.
In this network environment, the applications fall into three groups: (1) the external Web publishing system and the external book retrieval system; (2) the back-end collection information management system and the image/full-text hybrid retrieval system; (3) the VOD system. Because most applications use a Browser/Server structure, end users only need IE or Netscape installed locally. They request and access the various application services through Web pages, supported by the back-end database server. In addition, most of the information circulating in the library information system is multimedia (indexes, abstracts, full text, audio, and video), which places higher demands on Web server performance and network bandwidth.
Through continuous testing and practice, we found that the following measures effectively improve Web server performance:
(1) Cache servers and server load balancers can relieve access bottlenecks, increase effective network bandwidth, and achieve load balancing.
A cache server (also called a caching server) can store static content such as Web pages, multimedia on-demand resources, and meeting recordings (compressed, with certain format requirements). In addition, CacheFlow cache servers from the United States can cache dynamic content such as database query results and ASP pages. The cache server is usually placed outside the firewall, in front of the Internet Web server, so when Internet users click a Web page they reach the cache server rather than the website's Web server directly.
Because the cache server has multiple CPUs, high-speed, large-capacity I/O channels, and an independent OS, it can greatly relieve Internet access bottlenecks and also helps defend against hacker attacks.
The library uses this approach: large static images, on-demand resources, and virtual 3D applications are placed on the cache server in advance. Even with only 2 MB of Internet access bandwidth, users remain satisfied with the playback speed and quality of these applications.
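The caching behavior described above can be illustrated with a minimal Python sketch. It is not the library's cache server product; the class `FrontCache`, the `origin` function, and the paths are hypothetical names chosen for illustration. It only shows the idea of preloading large static content and answering repeat requests without contacting the Web server.

```python
from typing import Callable, Dict


class FrontCache:
    """Answers requests from stored content before falling back to the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def preload(self, path: str, body: bytes) -> None:
        """Place content on the cache in advance (e.g. large static images)."""
        self._store[path] = body

    def get(self, path: str) -> bytes:
        """Return cached content if present; otherwise fetch once and keep it."""
        if path in self._store:
            self.hits += 1
            return self._store[path]
        self.misses += 1
        body = self._fetch(path)
        self._store[path] = body
        return body


def origin(path: str) -> bytes:
    # Hypothetical stand-in for the website's Web server.
    return f"content of {path}".encode()


cache = FrontCache(origin)
cache.preload("/images/reading-room.jpg", b"<jpeg bytes>")
cache.get("/images/reading-room.jpg")  # preloaded: origin is not contacted
cache.get("/index.html")               # first request: fetched from the origin
cache.get("/index.html")               # repeat request: served from the cache
print(cache.hits, cache.misses)        # prints: 2 1
```

A real cache server would also need expiry and size limits, which this sketch leaves out.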
Another approach is to use a server load balancer or dual-server Web mirroring, so that load balancing delivers the best Web access performance. Dual-server Web mirroring was popular in the past. Although it improves system reliability, the two servers continually poll each other's status, which may affect access performance. A server load balancer is a hardware device independent of the Web servers. It is connected to the same switch as the Web server and the other servers of the website. Through its load scheduler, the load balancer distributes the workload across the servers, so resources are fully used and access performance improves. However, because the library currently offers relatively few external resources and runs only three Web servers, the benefit of the load balancer is not yet significant.
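To make the role of the load scheduler concrete, here is a minimal Python sketch of one common scheduling policy, least connections. The source does not say which policy the library's device uses, so the policy, the class name `LeastConnectionsScheduler`, and the server names `web1` to `web3` are assumptions made for illustration.

```python
from typing import Dict, List


class LeastConnectionsScheduler:
    """Sends each new request to the server with the fewest active requests."""

    def __init__(self, servers: List[str]) -> None:
        self._active: Dict[str, int] = {name: 0 for name in servers}

    def acquire(self) -> str:
        """Pick a server for a new request and count it as active."""
        # On a tie, min() returns the first server in insertion order.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server: str) -> None:
        """Mark one request on the given server as finished."""
        if self._active[server] > 0:
            self._active[server] -= 1

    def load(self) -> Dict[str, int]:
        """Return a copy of the current active-request counts."""
        return dict(self._active)


# Three Web servers, matching the count mentioned in the text.
scheduler = LeastConnectionsScheduler(["web1", "web2", "web3"])
assigned = [scheduler.acquire() for _ in range(4)]
scheduler.release("web2")           # one request on web2 completes
next_server = scheduler.acquire()   # web2 now has the fewest active requests
print(assigned, next_server)
# prints: ['web1', 'web2', 'web3', 'web1'] web2
```

With only three lightly loaded servers, any reasonable policy gives similar results, which is consistent with the observation that the device brings little benefit at this scale.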
(2) The Web server configuration also matters: the number and speed of the Web server's CPUs, the number of NICs, and the position of the Web server relative to the firewall all affect Web server performance.
On the hardware side, adding CPUs, adding NICs, and expanding the I/O channels directly improve Web server performance. In addition, because firewalls with gigabit ports are currently scarce and expensive, placing the Web server behind the firewall would certainly affect Internet access performance. The library adopts a hierarchical security model: IDS (intrusion detection) + Web servers (server-level firewalls, relatively low-end, which do not impede traffic) + application servers + database servers (firewalls, high-end). This ensures system security while improving network access performance.
The library also uses SAN-based storage to speed up server access.
(3) A three-tier C/S software architecture and proper application deployment also improve Web server performance.
The general access interface, the business logic, and the data are separated from one another and placed on the Web server, the application server, and the database server respectively. Deploying program functions and logic sensibly in this way can greatly improve Web server performance.
The general principle is that the Web server only accepts HTTP requests from the Internet, so that it carries as few tasks as possible. The actual processing is handed over to the application servers, and the result is then returned to the browser. The library developed a search engine application server and a hybrid retrieval application server in this way, with good results.
There are many other ways to improve Web server performance, such as the relationship between CPU and storage, Web switches, and so on, which need further practice, analysis, and discussion. (This article mainly references the papers of Shanghai Tong Yin and others)
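The division of work between the tiers can be sketched in a few lines of Python. This is a toy, in-process illustration under stated assumptions, not the library's implementation: `CATALOG`, `search_app_server`, `ROUTES`, and `web_server` are hypothetical names, and in a real deployment each tier would run on its own machine.

```python
from typing import Callable, Dict, List

# Data tier: hypothetical stand-in for the database server.
CATALOG: Dict[int, str] = {
    1: "Introduction to Digital Libraries",
    2: "Web Server Performance Tuning",
    3: "Library Cataloguing Practice",
}


def search_app_server(query: str) -> List[str]:
    """Application tier: holds the business logic for a title search."""
    needle = query.lower()
    return [title for _, title in sorted(CATALOG.items())
            if needle in title.lower()]


# Web tier routing table: maps a request path to an application server.
ROUTES: Dict[str, Callable[[str], List[str]]] = {"/search": search_app_server}


def web_server(path: str, query: str) -> Dict[str, object]:
    """Web tier: accepts the request, forwards it, and returns the result.

    It contains no search logic of its own.
    """
    handler = ROUTES.get(path)
    if handler is None:
        return {"status": 404, "body": []}
    return {"status": 200, "body": handler(query)}


response = web_server("/search", "librar")
print(response["status"], response["body"])
```

Keeping `web_server` this thin is the point of the design: new application servers can be added to `ROUTES` without adding processing work to the Web tier.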

Note: The subject is clear and well organized. However, the technologies discussed should be more organically integrated with project instances.
