This article outlines:
1. Structure of small e-commerce websites
2. Log and monitoring system solutions
3. Building the master-slave architecture of the database
4. Image server architecture based on shared storage
5. Mobile M station construction
6. System capacity estimation
7. Cache system
I. Structure of small e-commerce websites
When I first moved from the traditional software industry into an e-commerce company, I assumed e-commerce websites had no technical depth and no barrier to entry, that they were just existing components stacked up like piles of wood. Only after actually working in the industry did I find that this is not the case. It is often said that a good architecture is evolved, and that holds for e-commerce sites too. Today's successful e-commerce sites look complex and polished, but each of them started from a very small architecture with little technical content. Architecture evolution is therefore the process of a technical team continuously pursuing better solutions.
Today we summarize the evolution of the architecture of small e-commerce websites. The first version of an e-commerce system often uses the typical LAMP architecture, with Apache/PHP on the front and MySQL on the back end. This is the more popular choice. There is, however, also a .NET technical architecture that is rarely mentioned. As it happens, I work at an e-commerce company built on the .NET platform, so this article summarizes the e-commerce architecture on the .NET platform.
1 Technical architecture
An early-stage e-commerce website generally consists of a few business subsystems: the website front end, the merchant front end, the system management back office, the app, the M station (mobile site), and so on. Business volume is not very large, so MVC + cache + database basically covers it.
In terms of development efficiency alone, .NET MVC is no slower than LAMP development. Some companies therefore also adopt the .NET architecture in order to launch their e-commerce platform quickly.
At the infrastructure level, the setup is very simple.
The front-end site and the M station, given their traffic volume and availability requirements, are basically deployed in a distributed fashion, with requests distributed through a proxy server.
Other business subsystems, such as the merchant front end and the management system, are basically single-machine or master-slave deployments.
The databases, the Redis service, the file and image services, the Solr search engine service, and so on all use master-slave deployment.
3 Detailed architecture
One particularly important component of the overall system architecture is the monitoring system. Examples include traffic monitoring, hardware monitoring, system performance monitoring, and page monitoring, where a whole page or a specific block of a page is watched. Monitoring is an important means of improving the availability of the entire platform: monitoring across multiple platforms and multiple dimensions helps ensure system availability. When an exception occurs, especially a hardware or performance anomaly, the monitoring system can raise a warning immediately so that the problem is caught before it escalates.
In summary, a good system architecture should be considered in terms of extensibility, security, performance, and reliability. Rome was not built in a day: an architecture that fits current needs is good enough, and you can get it working first and refine it later. Through gradual evolution, the system becomes more and more complete.
II. Log and monitoring system solutions
A monitoring system is mainly used to monitor server cluster resources and performance, and to provide multi-dimensional monitoring and analysis of applications: exceptions, performance, log management, and so on. The importance of a complete monitoring and logging system needs little explanation. In short, only by knowing the state of the system in real time can you guarantee its stability.
As shown in the diagram, a monitoring platform covers a wide range, from server performance and resources to application monitoring. Each company has its own requirements and solutions for a unified monitoring platform, but the tasks and role of the platform are basically the same everywhere.
Logs are an important way to monitor a running program. They serve two main purposes: 1. finding and locating bugs promptly; 2. showing the running state of the program.
Accurate and detailed logs make it possible to locate a problem quickly. Logs also show what the program is doing and whether it is executing as expected, so the running state of the program should be recorded as well. There are two types of logs: 1. exception logs; 2. runtime logs.
We mainly use log4net to persist each system's logs to a database or to files, which supports later exception monitoring and performance analysis. How to integrate log4net is not covered here.
Several principles of logging:
Log levels must be clearly distinguished: decide what counts as an error, a warning, info, and so on.
Decide where errors are logged. In a layered system, exceptions should be handled at one designated layer. In our MVC setup, for example, exceptions are caught and handled in each action, while the business layer and the database layer do not handle exceptions themselves: whatever they catch is thrown up to the layer above.
Log messages should be clear, accurate, and meaningful, and as detailed as possible. Record the relevant system, module, time, operator, stack trace, and so on, to make later handling easier.
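The principles above can be sketched in a few lines. The example below uses Python's standard logging module rather than log4net, purely for illustration; the module name, the OrderError type, and the three functions are hypothetical stand-ins for a data layer, a business layer, and an MVC action.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    # Time, level and module name are part of every record.
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("shop.orders")  # hypothetical module name


class OrderError(Exception):
    """Hypothetical business-layer exception."""


def load_order(order_id):
    # Data layer: no logging here, failures simply propagate upward.
    if order_id <= 0:
        raise ValueError(f"invalid order id: {order_id}")
    return {"id": order_id, "status": "paid"}


def get_order(order_id):
    # Business layer: adds context and rethrows, still without logging.
    try:
        return load_order(order_id)
    except ValueError as exc:
        raise OrderError(f"cannot load order {order_id}") from exc


def order_action(order_id, operator):
    # Top layer (the "action"): the one place where exceptions are caught and logged.
    try:
        order = get_order(order_id)
        logger.info("order %s viewed by %s", order_id, operator)
        return 200, order
    except OrderError:
        # logger.exception logs at ERROR level and appends the stack trace.
        logger.exception("order lookup failed: order_id=%s operator=%s",
                         order_id, operator)
        return 500, None
```

Because the lower layers only rethrow, each failure is logged once, with the operator and the full exception chain attached.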
A monitoring system is a complex platform, and many open-source products exist today. Our platform is small, however, with few monitoring tasks and modest requirements, so we basically built our own. It covers five main aspects: 1. system resources; 2. servers; 3. services; 4. application exceptions; 5. application performance.
The specific architecture diagram is as follows:
1) System Resource monitoring
Monitor network parameters and server resources (CPU, memory, disk reads and writes, network, access requests, and so on) to keep the servers running safely, and provide an exception notification mechanism so that system administrators can locate and solve problems quickly. Zabbix is probably the most popular tool for this at present.
2) Server monitoring
Server monitoring mainly checks whether servers, network nodes, gateways, and other network equipment respond to requests normally. A timer service periodically pings each network node to confirm that it is working. If any device shows an exception, an alert message is sent.
3) Service Monitoring
Service monitoring checks whether the platform's services, such as the web services, image services, search engine services, and caching services, are running properly. Each service can be requested at regular intervals to confirm that it is functioning.
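As a rough illustration of requesting each service at regular intervals, here is a minimal Python sketch. The function names and the idea of passing the probe in as a parameter are assumptions made for this example, not the author's implementation; a timer or scheduler would call check_services periodically and send an alert for every name it returns.

```python
import urllib.request


def http_probe(url, timeout=3.0):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError, timeouts and connection errors are OSError subclasses.
        return False


def check_services(services, probe=http_probe):
    """Probe every service once and return the names of those that failed.

    `services` maps a service name to its health-check URL.
    """
    return [name for name, url in services.items() if not probe(url)]
```

Keeping the probe injectable makes the check easy to test without a network, and lets non-HTTP services (for example a cache) supply their own probe.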
4) Application exception monitoring
At present, all system exceptions on our platform are recorded in the database. A timed service runs statistical analysis over the exception records of a recent period. If an important module, such as payment or order placement, is found to be throwing exceptions frequently, the relevant people are notified immediately so that normal service can be restored.
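The windowed statistics described above might look like the following Python sketch. The five-minute window, the threshold of three, and the module names are illustrative assumptions; the records list stands in for rows read from the exception table.

```python
from collections import Counter
from datetime import datetime, timedelta


def modules_to_alert(records, now, window=timedelta(minutes=5), threshold=3,
                     critical=("payment", "order")):
    """Return critical modules with at least `threshold` exceptions in the window.

    `records` is an iterable of (timestamp, module) pairs.
    """
    cutoff = now - window
    counts = Counter(module for ts, module in records
                     if ts >= cutoff and module in critical)
    return sorted(m for m, n in counts.items() if n >= threshold)
```

The timed service would call this on each run and notify the owners of every module in the returned list.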
5) Application Performance monitoring
Intercept and record program performance (SQL performance or execution efficiency) in the API interfaces and at the relevant points of each application. Important modules raise performance alerts so that problems are identified ahead of time. The collected monitoring data is also aggregated and shown to developers to support later performance analysis.
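One common way to intercept and record execution time is a wrapper around the monitored function. The Python decorator below is a sketch under assumptions: the slow_calls list stands in for a monitoring table, and the 0.05 second threshold and the two sample functions are invented for the example.

```python
import functools
import time

slow_calls = []  # stand-in for a monitoring table or metrics store


def monitored(threshold_seconds):
    """Record every call to the wrapped function that takes at least the threshold."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs whether the call returned or raised.
                elapsed = time.perf_counter() - start
                if elapsed >= threshold_seconds:
                    slow_calls.append((func.__name__, elapsed))
        return wrapper
    return decorator


@monitored(threshold_seconds=0.05)
def slow_query():
    time.sleep(0.1)  # simulates a slow SQL statement
    return "rows"


@monitored(threshold_seconds=0.05)
def fast_query():
    return "rows"
```

Only calls that cross the threshold are recorded, which keeps the monitoring data small enough to review and alert on.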
III. Building the master-slave architecture of the database
Once a company has grown large and mature, the master-slave architecture may look dated and be replaced by a more complex database cluster. For a small e-commerce company, however, the master-slave architecture should be the baseline. Every large system architecture evolved over time, and master-slave is the most basic database architecture, so understanding it makes more complex architectures easier to understand.
Why separate reads and writes in the first place?
For a small web site, a single database server may be enough. For larger sites or applications, a single database server can hardly sustain heavy access pressure, and upgrading the server is too expensive, so the database has to be scaled out. With only one database, reads and writes all hit the same server. As the data grows, read and write performance suffers badly, and data security and system stability are put at risk as well.
What are the benefits of database read/write separation?
Read and write operations are spread across different databases, which avoids a performance bottleneck on the primary server.
While the primary server is writing, query performance on the query servers is not affected, which reduces blocking and increases concurrency.
The data has multiple disaster recovery replicas, which improves data safety, and when the primary server fails you can switch to another server immediately, which improves availability.
The basic principle of read/write separation is to let the primary database handle transactional insert, update, and delete operations (INSERT, UPDATE, DELETE), and let the slave databases handle SELECT queries. Database replication synchronizes the changes made by transactional operations to the slave databases.
In SQL Server, for example, the primary (write) database handles writes and can also serve reads, while the read databases only serve reads. Every write to the primary is synchronized to the read databases. There is one write database and there can be several read databases; log-based synchronization keeps the primary and the read databases consistent.
1 SQL Server read/write separation configuration
SQL Server provides three technologies that can synchronize data in a master-slave architecture: log shipping, transactional replication, and AlwaysOn, a feature introduced in SQL Server 2012. For their respective strengths and weaknesses, search online; the configuration guides below were written by others and are provided for reference only.
Log shipping: SQL Server R2 master-Slave database synchronization
Transactional replication: SQL Server replication: Transactional Publications
(Image source: the Internet)
2 C# database read and write operations
C# code that talks to a single database differs from code that talks to a master-slave setup. In a master-slave architecture, to keep data consistent, the primary database is generally readable and writable, while the slave databases only serve reads and accept no writes. The C# code therefore has to treat the two kinds of request differently.
The simplest approach is to configure two database connections and, at each database call site, send read requests and write requests to the corresponding database server.
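A minimal sketch of this first approach, written in Python for brevity (the same shape carries over to C#). The connection strings and the connection_for helper are hypothetical; real values would come from configuration.

```python
import random

# Hypothetical connection strings, not taken from the article.
MASTER = "Server=db-master;Database=shop"
SLAVES = ["Server=db-slave1;Database=shop", "Server=db-slave2;Database=shop"]


def connection_for(operation, choose=random.choice):
    """Return the connection string for an explicit 'read' or 'write' call site."""
    if operation == "write":
        return MASTER
    if operation == "read":
        # Spread reads across the slaves; the selection strategy is pluggable.
        return choose(SLAVES)
    raise ValueError(f"unknown operation: {operation!r}")
```

Each call site states its intent explicitly, which is simple and predictable but relies on developers choosing correctly every time.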
The second approach is to inspect each SQL statement, determine whether it is a write statement (INSERT, UPDATE, CREATE, ALTER, and so on) or a read statement (SELECT), and route it accordingly.
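The statement-based approach can be sketched as follows, again in Python and not as the code from the author's demo. This version makes one deliberate design choice that is an assumption of the example: only statements starting with SELECT go to a slave, and everything else goes to the master.

```python
import re


def is_read_statement(sql):
    """Treat a statement as a read only if its first keyword is SELECT."""
    match = re.match(r"\s*(\w+)", sql)
    return match is not None and match.group(1).upper() == "SELECT"


def route(sql, master="master", slave="slave"):
    """Pick the target database for one statement, defaulting to the master."""
    return slave if is_read_statement(sql) else master
```

Defaulting to the master means an unrecognized statement is never sent to a read-only replica. A first-keyword check is still crude: a statement such as SELECT ... INTO writes data but would be routed to a slave, so real code needs a stricter rule or an explicit override.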
Demo Download: Http://files.cnblogs.com/files/zhangweizhong/Weiz.DB.rar
(PS: This demo is my own summary and is not the same as the DLL used in production, but the principle is the same. Treat it as a packaged summary.)
At the same time, add the corresponding database connection configuration.
IV. Image server architecture based on shared storage
In the Internet era, every kind of website needs more and more images. E-commerce sites in particular almost always face the storage and serving of massive image resources, and run into all kinds of problems and requirements as the image server architecture is expanded and upgraded. That does not mean you need an especially fancy image service architecture: simple, efficient, and stable is enough. This section summarizes a particularly simple and efficient design, an image service architecture implemented with shared storage.
Some people have asked me: the image server architecture of large websites looks nothing like this, and their image systems are far more capable than yours, so why not write about those directly?
The answer is: first, I have not built a large-scale system; second, large systems also evolved from small architectures, and nobody gets there in one step. The image server architecture here is simple, but it has itself evolved from the single-machine era and can basically meet the needs of small and medium-sized distributed web sites. Its setup and learning costs are very low, which suits the current quick and lightweight development model.
Shared storage is implemented through a shared directory. A separate domain name is configured on the file server that hosts the shared directory, which separates the image server from the application servers and yields a standalone image server.
Advantages:
1. The image service is detached from the application service, which reduces the I/O load on the application servers.
2. All servers read and write through the shared directory, which avoids synchronization problems between multiple servers.
3. It is relatively flexible and supports capacity expansion and extension. It can be configured as a standalone image server with its own domain name, which leaves room for future expansion and optimization.
4. Compared with a more complex distributed NFS system, this approach is cost-effective and suits the current quick and lightweight development model.
Disadvantages:
1. Configuring the shared directory is somewhat cumbersome.
2. It causes some loss in read/write performance and in security.
3. If the image server has a problem, every application is affected, so the performance requirements for the storage server are particularly high.
4. Image uploads still go through the web servers, which remains a heavy load on them.
The architecture is very simple. The basic setup is as follows:
Set up a shared directory on the storage server (I will not go into the details here; search online, and pay attention to the file security of the shared directory).
Each application uploads images directly to the storage server through the shared directory (\\192.168.1.200).
Set up a web site (i1.abc.com) that publishes the shared directory, so that other applications can access the images.
Each application thus uploads files to the shared directory:
Full address: \\192.168.1.200\lib\2016\03\04\10\IMG\4ugvvt6m9gdu.jpg
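// Path of the image relative to the share root (directory + file name + extension)
var relativePath = relativeDir + fileName + imageExtension;
// Full address on the shared directory, using the share root from configuration
var absolutePath = ConfigHelper.SharePath + relativePath;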
After a successful upload, the image can be accessed directly over the web through the published site.
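The path handling can be illustrated with a short Python sketch. Two things here are assumptions rather than facts from the article: that the 2016\03\04\10 segments are year, month, day, and hour of the upload, and that the web site root maps one-to-one onto the share root. The article does not show the actual public URL.

```python
from datetime import datetime

SHARE_ROOT = "\\\\192.168.1.200\\"   # share from the text: \\192.168.1.200\
WEB_ROOT = "http://i1.abc.com/"      # site from the text; the URL layout is assumed


def relative_parts(uploaded_at, file_name, extension=".jpg"):
    """Return the segments lib/yyyy/MM/dd/HH/IMG/<name><ext> for one upload."""
    return ["lib", f"{uploaded_at:%Y}", f"{uploaded_at:%m}", f"{uploaded_at:%d}",
            f"{uploaded_at:%H}", "IMG", file_name + extension]


def share_path(parts):
    """Address used by the applications when writing to the shared directory."""
    return SHARE_ROOT + "\\".join(parts)


def web_url(parts):
    """Address used by browsers, under the assumed one-to-one mapping."""
    return WEB_ROOT + "/".join(parts)
```

Building both addresses from the same list of segments keeps the upload path and the public URL from drifting apart.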
V. Construction of the mobile M station
I have been working on an M station recently, that is, the mobile web site. Because it was my first time, I ran into many problems, so here I sum up what I have learned: what a mobile M station is, what it does, and what its advantages are.
Some people will ask: what is the difference between an M station and an app?
An app sits directly on the user's mobile device, so its exposure is relatively high. An M station requires opening a browser and entering an address, so its exposure is relatively low.
An M station has more promotion channels than a mobile app, and it is easy to track user sources and traffic, which helps with campaign promotion and data analysis.
M station users install nothing and simply enter a URL, whereas an app must be downloaded and installed.
An M station can get user feedback quickly through data analysis, which makes it easier to adjust the product based on statistics and user needs.
Apps are stickier and offer a better user experience.
An M station is very convenient for marketing campaigns, and forwarding and sharing are quick and easy.
An M station iterates quickly and responds fast to product adjustments: changes can be released at any time, whereas an app update has to go through review, which takes time.
An M station is cross-platform: there is no need to develop separate Android and iOS versions, only a browser is required.
So I think the M station and the app are complementary. The app cannot match the M station's timeliness and speed, and the M station cannot match the app's user experience. For now neither can fully replace the other. With Internet marketing so prevalent today, the M station is becoming more and more important: most marketing campaigns are presented and spread as H5 pages, and marketing through the M station in turn drives use and promotion of the app.
Mobile M stations are also trending toward apps: they look and feel more and more like apps, which makes them increasingly important. Moreover, when a display effect cannot be achieved in an app's native code, embedding a mobile H5 page is a very good option.
Here are several key points in building a mobile M station:
51Degrees claims to be the fastest and most accurate device detection solution available today. It is a free, open source .NET mobile development component that detects mobile devices and browsers. It can even report screen size and input method, plus manufacturer and model information, which lets you selectively direct visitors to content designed for mobile devices. With accurate device data, it supports almost all mobile devices, such as smartphones and tablets.
In practice, the role of 51Degrees is to identify the client's device. Requests from a PC browser go to the PC site, and requests from a mobile browser go to the M station, which gives a better user experience.
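The routing decision itself can be shown with a deliberately naive Python sketch. This is not the 51Degrees API and does not reproduce its detection logic: it only matches a handful of substrings in the User-Agent header, and the pattern and the example.com host names are assumptions for illustration.

```python
import re

# Small illustrative pattern; real device detection needs a maintained device database.
MOBILE_UA = re.compile(r"iphone|ipod|android.*mobile|windows phone|blackberry",
                       re.IGNORECASE)


def is_mobile(user_agent):
    """Guess whether the User-Agent belongs to a phone."""
    return bool(MOBILE_UA.search(user_agent or ""))


def target_site(user_agent, pc_site="https://www.example.com",
                m_site="https://m.example.com"):
    """Return the site a request should be redirected to."""
    return m_site if is_mobile(user_agent) else pc_site
```

A hand-written pattern like this goes stale quickly and misclassifies tablets and unusual browsers, which is the gap a dedicated detection component is meant to fill.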
How do I add 51Degrees to an existing website?
The mobile web and the traditional web do not differ in nature. A mobile site is still a web site, and the technology is still HTML + CSS + JS. The difference is that, with the current HTML5 trend, HTML5 is added to the mobile M station, which makes it feel more like a light app.
Bootstrap needs little introduction, and there is plenty of material about it online. Its biggest advantages are that it is very popular and very easy to get started with. If you lack a professional designer, Bootstrap is a good choice. It is very simple to use, with almost no learning cost, and is a solid tool for rapid development.
Official website: http://getbootstrap.com/
4 Some suggestions
Keep M station URLs the same as the PC site's as far as possible, to avoid the case where a URL displays correctly on the PC site but returns 404 when opened on a phone.
Write a separate TDK (title, description, keywords) for the M station.
VI. System capacity estimation
Anyone who works at an e-commerce company will recognize this scene:
Operations and product staff rush over and ask: we are running a promotion tonight, can the servers handle it? If not, how many machines do we need to add?
And the engineers are left dumbfounded.
These are all questions of system capacity estimation, and capacity estimation is one of the essential skills of an architect. Capacity estimation means determining the maximum traffic the system can withstand before it goes down. It is an important indicator of how well an engineer understands the system's performance. A capacity evaluation commonly covers traffic, concurrency, bandwidth, CPU, memory, disk, and so on. This section discusses how to estimate capacity.
1 Several important parameters
QPS: the number of requests processed per second.
Concurrency: the number of requests the system is processing at the same time.
Response time: the average response time is generally used.
Many people confuse concurrency with QPS. Once the meaning of the three parameters above is clear, the relationship between them follows: QPS = concurrency / average response time.
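As a worked example of this relationship (essentially Little's law rearranged), here is a small Python sketch. The figures of 100 concurrent requests and a 0.2 second average response time are invented for illustration.

```python
def qps(concurrency, avg_response_seconds):
    """QPS = concurrency / average response time."""
    return concurrency / avg_response_seconds


def concurrency_needed(target_qps, avg_response_seconds):
    """The same relationship solved for concurrency."""
    return target_qps * avg_response_seconds


# 100 requests in flight, each taking 0.2 s on average, gives about 500 QPS.
example_qps = qps(100, 0.2)
# Sustaining 5000 QPS at 0.2 s per request needs about 1000 requests in flight.
example_concurrency = concurrency_needed(5000, 0.2)
```

The second function shows why slow responses are expensive: at a fixed QPS, doubling the response time doubles the number of requests the system must hold open at once.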
2 Steps and methods of capacity evaluation
1) Estimate total traffic
How do you find out the total number of visits? How do you estimate the traffic of a campaign, or the PV of a system about to go live?
The simplest way is to ask the business side: ask the operations and product colleagues for their traffic estimate for the campaign.
The business side's estimate, however, usually covers only two indicators: PV and the number of visiting users. Engineers need to derive the other metrics, such as QPS, from these two figures.
2) Estimate the average QPS
Total number of requests = total PV × connections derived per page
Average QPS = total number of requests / total time
For example, if the campaign landing page receives 300,000 PV within 1 hour and each landing page view derives 30 connections, then the average QPS of the landing page = (300,000 × 30) / (60 × 60) = 2500.
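The same arithmetic, written out as a short Python sketch using the numbers from the example above; the function and parameter names are chosen for this illustration.

```python
def average_qps(total_pv, connections_per_page, duration_seconds):
    """Average QPS = (total PV x connections derived per page) / total time."""
    total_requests = total_pv * connections_per_page
    return total_requests / duration_seconds


# 300,000 PV in one hour, 30 derived connections per page view.
landing_page_qps = average_qps(300_000, 30, 60 * 60)  # 9,000,000 / 3600 = 2500
```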
3) Estimate the peak QPS
Capacity planning must consider not only the average QPS but also the peak QPS. How do you assess the peak?
This depends on the actual business and is estimated from the PV and other data of previous marketing campaigns. In general, the peak QPS is about 3 to 5 times the average QPS: if the average daily QPS is 1000, the peak QPS is estimated at 5000, taking the upper end of that range.
Some businesses, such as flash sales ("seckill"), are much harder to assess. Capacity assessment for that kind of business is not discussed here.
4) Estimate the single-machine limit QPS of the system
How do you estimate the single-machine limit QPS of a server for a given business?
This is one of the most basic performance indicators of a server, and there is no way to obtain it other than stress testing. The server's single-machine limit QPS is measured through a stress test.
Before a business goes live, a stress test is generally required (many startups skip this step in order to ship fast). Take an app push campaign as an example (estimated daily QPS 1000, peak QPS 5000). The business scenario might be:
An event message is pushed through the app;
The H5 landing page of the campaign is a web site;
The H5 landing page is assembled from data in the cache and the database.
The stress test shows that the web server can withstand only 1200 QPS, while the cache and the database can handle the concurrent pressure. (Generally speaking, only about 1% of traffic reaches the database, so the database can easily withstand its QPS. Whether the cache can withstand the QPS requires evaluating the cache bandwidth; here we assume the cache is not the bottleneck.) The single-machine limit QPS of the web tier is therefore 1200. Production systems are generally not run at the limit, because that easily affects the lifespan and performance of the server, so a single machine online is allowed to run at 1200 × 0.8 = 960 QPS.
By extension, once the stress test shows that the web tier is the bottleneck, you can optimize the web tier to raise the single-machine QPS of the web servers.
Stress tests are also usually run from the perspective of a specific business, focusing on that business's concurrency and QPS.
5) Answer the two questions from the beginning
Required machines = peak QPS / single-machine limit QPS
We now have a peak QPS of 5000 and a single-machine limit QPS of about 1000 (rounding the 960 from the stress test), with 3 servers deployed online:
Can the servers handle it? Peak 5000, 1000 per machine, 3 machines online: no, they cannot.
How many machines need to be added? At least 2; adding 3 leaves one machine as a buffer and is the safer choice.
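The calculation behind these two answers can be written as a small Python sketch. Rounding up to whole machines and the optional buffer parameter are choices made for this example.

```python
import math


def machines_required(peak_qps, single_machine_qps):
    """Required machines = peak QPS / single-machine QPS, rounded up."""
    return math.ceil(peak_qps / single_machine_qps)


def machines_to_add(peak_qps, single_machine_qps, deployed, buffer=0):
    """Machines missing to reach the requirement plus an optional safety buffer."""
    needed = machines_required(peak_qps, single_machine_qps) + buffer
    return max(0, needed - deployed)


required = machines_required(5000, 1000)                        # 5 machines
to_add = machines_to_add(5000, 1000, deployed=3)                # 2 more
to_add_safe = machines_to_add(5000, 1000, deployed=3, buffer=1) # 3 more
```

With the unrounded figure of 960 QPS per machine, machines_required(5000, 960) gives 6, which is another reason to keep a buffer.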
Note that the calculations above cover the capacity of a single server or a single cluster. A real production environment is a complex cluster of web servers, message queues, caches, databases, and so on. In a distributed system, a bottleneck at any node can trigger an avalanche effect and bring down the entire cluster ("avalanche effect" means that a small problem in the system gradually spreads until the whole cluster goes down).
Therefore, to plan the capacity of the entire platform, you must calculate the capacity of every node and identify every bottleneck that may occur.
VII. Cache system
Caching is an important part of an e-commerce system and one of the main ways to improve system performance. It absorbs most of the load that would otherwise hit the database; without it, the system is likely to crash because the database becomes unavailable.
Caching also raises some tricky issues of its own: data consistency and freshness. For example, the data in the database has changed, but the page still shows the old cached value until the cache entry expires and is refreshed. How can this be solved?
In addition, cached data that never expires stays in memory and burdens the server's memory. Which data can be cached and which cannot is therefore a question that must be considered at the very start of system design.
What data can be cached?
Data that does not need real-time updates but is extremely expensive to fetch from the database. Examples are the sales rankings and hot search items on the home page: these are basically computed once a day, and users do not care whether they are real-time.
Data that needs real-time updates but changes infrequently.
Data that goes through complex processing logic every time it is produced, such as a generated report.
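For the first kind of data, a time-based expiry is usually enough. The Python class below is a minimal in-process sketch of that idea; it is not the Redis-based setup the article uses, and the class name, the injectable clock, and the loader callback are assumptions of the example.

```python
import time


class TTLCache:
    """Keep each value for a fixed number of seconds, then reload it on demand."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        hit = self._items.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # still fresh: skip the expensive load
        value = loader()           # expired or missing: hit the database once
        self._items[key] = (now + self._ttl, value)
        return value
```

A daily sales ranking could use a TTL of several hours, so the expensive query runs a few times a day instead of on every page view.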
What data should not use caching?
In fact, most of the data in an e-commerce system can be cached, and only a little cannot. The data that should not be cached includes money-related data, keys, core business-critical data, and so on. In short, if you find that most of the data in the system cannot be cached, the architecture itself has a problem.
How to solve the problem of consistency and real-time?
The way to ensure consistency and freshness is this: whenever the database is updated, the corresponding cache entry must be updated as well.
Our current caching scheme is Redis (master-slave) + RabbitMQ + a cache cleanup service, as follows:
The cache cleanup job subscribes to the RabbitMQ message queue. As soon as a data update arrives in the queue, the job writes the updated data back to the Redis cache server.
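The flow can be imitated in a few lines of Python. Everything here is a stand-in chosen for illustration: a dictionary plays the database, another dictionary plays Redis, and a standard library queue plays RabbitMQ; no real Redis or RabbitMQ client calls are shown.

```python
import queue

database = {"product:1": {"price": 100}}  # stand-in for the SQL database
cache = {"product:1": {"price": 100}}     # stand-in for Redis
updates = queue.Queue()                   # stand-in for the RabbitMQ queue


def update_product(key, value):
    """Writer side: update the database, then publish the changed key."""
    database[key] = value
    updates.put(key)


def run_cache_cleanup_once():
    """Consumer side: drain pending messages and refresh the cache from the database."""
    refreshed = []
    while True:
        try:
            key = updates.get_nowait()
        except queue.Empty:
            return refreshed
        cache[key] = database[key]
        refreshed.append(key)
```

Between the database write and the consumer run, the cache still holds the old value, so this design trades a short window of staleness for looser coupling between the writers and the cache.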
Of course, some people use a different scheme: update the relevant cache data immediately after the database update completes. This needs neither MQ nor a cache cleanup job, but it increases coupling within the system. Which scheme to choose depends on your business scenario and platform size.
How to create a small and refined e-commerce website architecture?