Several conceptual terms in system architecture

Source: Internet
Author: User
Tags: hosting

QPS

QPS (queries per second)
QPS measures how much query traffic a particular server handles within a specified period of time. On the Internet, the performance of a machine acting as a Domain Name System (DNS) server is often measured in queries per second.
It corresponds to fetches/sec, that is, the number of requests answered per second, and indicates the maximum throughput capacity.
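As a rough illustration of this definition, QPS can be computed from a log of request arrival times. The sketch below is only one way such a measurement might be taken: the function names `average_qps` and `peak_qps` and the sample timestamps are hypothetical and do not come from any particular server.

```python
from collections import Counter

def average_qps(timestamps, window_seconds):
    """Average number of queries handled per second over the window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return len(timestamps) / window_seconds

def peak_qps(timestamps):
    """Largest number of queries that fell into any single one-second bucket."""
    per_second = Counter(int(t) for t in timestamps)
    return max(per_second.values(), default=0)

# Hypothetical arrival times, in seconds since the start of the measurement.
arrivals = [0.1, 0.4, 0.9, 1.2, 1.3, 2.5]

print(average_qps(arrivals, window_seconds=3))  # 2.0
print(peak_qps(arrivals))                       # 3
```

The average smooths traffic over the whole window, while the peak shows the busiest second, which is usually the figure that matters when sizing a server.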
IOPS
IOPS is the number of read/write (I/O) operations performed per second. It is used for databases and similar applications to measure random-access performance.
Storage-side IOPS is different from host-side I/O. Storage-side IOPS is the number of accesses from hosts that the storage can accept per second, while a single host I/O needs several storage accesses to complete. For example, when a host writes one minimum-size block of data, it goes through three steps ("send write request, write data, receive write acknowledgement"), which amounts to 3 storage-side accesses.
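The relationship between host writes and storage-side accesses is simple arithmetic, sketched below. The factor of 3 comes from the three steps named in the text; the figure of 30,000 storage-side accesses per second is a made-up number used only for illustration, and the function names are hypothetical.

```python
# One host write = send write request + write data + receive acknowledgement.
STORAGE_ACCESSES_PER_HOST_WRITE = 3

def host_writes_per_second(storage_side_iops,
                           accesses_per_write=STORAGE_ACCESSES_PER_HOST_WRITE):
    """How many host writes fit into a given storage-side access budget."""
    if accesses_per_write <= 0:
        raise ValueError("accesses_per_write must be positive")
    return storage_side_iops / accesses_per_write

def storage_accesses_needed(host_writes,
                            accesses_per_write=STORAGE_ACCESSES_PER_HOST_WRITE):
    """How many storage-side accesses a number of host writes generates."""
    return host_writes * accesses_per_write

print(host_writes_per_second(30_000))   # 10000.0
print(storage_accesses_needed(10_000))  # 30000
```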


SPOF
A single point of failure (SPoF) is a component of a system that, if it fails or stops running, causes the entire system to stop working. Such a component is undesirable in any system that requires high availability, whether it is a network, a software application, or another industrial system.

Concurrency
In an operating system, concurrency means that during a given period several programs are somewhere between start and completion, all on the same processor, while only one of them is actually running on the processor at any single instant.
In a database, concurrency is the ability of multiple users to access and change shared data at the same time. SQL Server uses locks so that multiple users can access and change shared data simultaneously without conflicting with one another.
Characteristics of concurrent program execution in an operating system:
In a concurrent environment the closed nature of a program is broken, which gives rise to new characteristics:
① Programs and computations no longer correspond one to one; a single copy of a program can have multiple computations.
② Concurrent programs constrain one another. A direct constraint arises when one program needs another program's result; an indirect constraint arises when several programs compete for a resource such as the processor or a buffer.
③ A concurrent program executes in a stop-and-go fashion, advancing intermittently.

Differences and connections between concurrency and parallelism
Concurrency and parallelism are similar but distinct concepts. Parallelism means that two or more events occur at the same instant, while concurrency means that two or more events occur within the same time interval. In a multiprogramming environment, concurrency means that, viewed at the macro level, several programs are running at the same time over a period. On a single-processor system, however, only one program can execute at any instant, so at the micro level the programs can only take turns through time-sharing. If the computer has multiple processors, the programs that can run concurrently can be assigned to different processors and executed in parallel, with each processor handling one of them, so that multiple programs really do execute at the same time.
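The single-processor case can be simulated in a few lines. The sketch below is a toy model, not a real operating system scheduler: `program` and `run_time_shared` are hypothetical names, each generator stands for one program, and the loop plays the role of the one processor, giving each program a single step per turn.

```python
from collections import deque

def program(name, steps):
    """A toy program that yields control after every step."""
    for i in range(1, steps + 1):
        yield f"{name}:{i}"

def run_time_shared(programs):
    """Run programs on one simulated processor, one step per turn."""
    ready = deque(programs)
    trace = []
    while ready:
        current = ready.popleft()
        try:
            trace.append(next(current))  # only this program runs right now
            ready.append(current)        # then it goes to the back of the line
        except StopIteration:
            pass                         # finished programs leave the queue
    return trace

print(run_time_shared([program("A", 3), program("B", 2)]))
# ['A:1', 'B:1', 'A:2', 'B:2', 'A:3']
```

Over the whole run both programs make progress (concurrency), yet every entry in the trace belongs to exactly one program (no parallelism).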


Parallelism
Parallelism means "side by side", or "executed or carried out simultaneously". In an operating system, it means that a set of programs executes at independent, asynchronous speeds, which is not the same thing as overlapping in time (occurring at the same instant), and it should be distinguished from concurrency. Concurrency means that two or more programs execute within the same time period, overlapping in time (simultaneous at the macro level, still sequential at the micro level). Parallel also describes transmitting 8 bits of data at the same time over parallel lines, which greatly increases transfer speed; the length of a parallel link is limited, however, because interference grows with length and data errors become more likely. In biology, parallelism refers to descendants of a common ancestor that, after separating, live under similar environmental conditions and therefore develop similar traits across different groups.


High Availability
What is high availability (HA)
High availability (HA) typically describes a system that has been specially designed to reduce downtime so that its services remain highly available.
For example, we want electricity and water services to be highly available systems.
The reliability of a computer system is measured by mean time to failure (MTTF), that is, how long on average the system runs normally before a failure occurs. The more reliable the system, the longer its MTTF. Maintainability is measured by mean time to repair (MTTR), the average time spent repairing the system and restoring normal operation after a failure. The more maintainable the system, the shorter its MTTR. The availability of a computer system is defined as MTTF / (MTTF + MTTR) * 100%. In other words, availability is the percentage of time the system is up and running normally.
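The availability formula can be checked with a short calculation. The sketch below assumes both values are given in hours; the function name `availability_percent` and the sample figures (999 hours of uptime per 1 hour of repair) are hypothetical.

```python
def availability_percent(mttf_hours, mttr_hours):
    """Availability = MTTF / (MTTF + MTTR) * 100%."""
    if mttf_hours < 0 or mttr_hours < 0:
        raise ValueError("MTTF and MTTR must not be negative")
    total = mttf_hours + mttr_hours
    if total == 0:
        raise ValueError("MTTF + MTTR must be greater than zero")
    return mttf_hours / total * 100

print(f"{availability_percent(999, 1):.2f}%")   # 99.90%
print(f"{availability_percent(999, 10):.2f}%")  # 99.01%
```

The second line shows why maintainability matters: with the same MTTF, a tenfold longer repair time costs almost a full percentage point of availability.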

Scalability
In the field of software engineering, this refers to extensibility:
Well-designed code allows more functionality to be inserted at the appropriate place when necessary. The aim is to accommodate changes that may be needed in the future; the risk is that the code ends up over-engineered.
Extensibility can be achieved through software frameworks: dynamically loaded plug-ins, carefully designed class hierarchies with abstract interfaces at the top, useful callback constructs, and a highly logical, malleable code structure.

OLAP
OLAP is short for On-Line Analytical Processing. With the development and application of database technology, the volume of data stored in databases has grown from megabytes (MB) and gigabytes (GB) in the 1980s to terabytes (TB) and petabytes (PB) today. At the same time, users' query requirements have become increasingly complex: they no longer just query or manipulate one or a few records in a relational table, but analyze and synthesize thousands upon thousands of records across multiple tables, and relational database systems cannot fully meet this requirement. Many software vendors abroad have developed front-end products to make up for the limited support in relational database management systems, trying to unify scattered common application logic and answer the complex queries of non-data-processing professionals in a short time.

OLTP
On-Line Transaction Processing (OLTP)
Also known as a transaction-oriented processing system. Its basic characteristic is that a customer's raw data is sent to the computing center for processing immediately and the result is returned within a short time. Its greatest advantage is that input data is processed on the spot and answered promptly, which is why it is also called a real-time system. An important performance metric for an online transaction processing system is response time, the time the computer takes to reply to a request after the user submits data at a terminal.
An OLTP database is designed so that transactional applications write only the data they need, in order to handle a single transaction as quickly as possible.

ACID
ACID is an abbreviation for the four basic properties needed for database transactions to execute correctly: atomicity, consistency, isolation, and durability. A database system that supports transactions must have these four properties; otherwise the correctness of the data cannot be guaranteed during transaction processing, and a transaction may well fail to meet the requirements of the parties to it.

Atomicity
All operations in a transaction either complete entirely or do not happen at all; the transaction cannot stop partway through. If an error occurs during execution, the transaction is rolled back to the state before it began, as if it had never been executed.
Consistency
The integrity constraints of the database are not violated, either before the transaction begins or after it ends.
Isolation
Two transactions execute without interfering with each other; one transaction cannot see the intermediate data of another transaction while that transaction is running.
Durability
After a transaction completes, the changes it made are persisted in the database and are not rolled back.
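Atomicity and consistency can be observed with Python's built-in `sqlite3` module. This is a minimal sketch under assumed names: the `accounts` table, the `transfer` and `balances` helpers, and the sample balances are all hypothetical. It relies on the documented behaviour that using a connection as a context manager commits on success and rolls back when an exception is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # integrity constraint
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money in one transaction; return False if it was rolled back."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

print(transfer(conn, "alice", "bob", 30))   # True
print(balances(conn))                       # {'alice': 70, 'bob': 80}
print(transfer(conn, "alice", "bob", 500))  # False
print(balances(conn))                       # {'alice': 70, 'bob': 80}
```

In the failed transfer, the credit to `bob` had already been applied when the debit violated the `CHECK` constraint, yet the rollback undoes both updates, so the data looks as if the transaction had never run.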

CAP
C: Consistency
A: Availability
P: Partition tolerance (tolerance of network partitions)

You cannot have everything: the three goals cannot all be met at the same time. If strong consistency is required and partitions must be tolerated, availability has to be sacrificed. For large websites, availability and partition tolerance usually take priority over data consistency, so designs generally lean toward A and P and then use other means to provide the consistency the business needs.

BASE
* Basically Available
* Soft state (flexible transactions)
* Eventual consistency


Sharding
Sharding definition
The English word "shard" means "fragment". As a database-related technical term, it appears to have been used first in massively multiplayer online role-playing games (MMORPGs). Here "sharding" simply means splitting data into shards.
Sharding is not a new technology but a relatively simple software concept. Table partitioning became available only after MySQL 5, so many potential MySQL users had concerns about MySQL's scalability, and whether a database supports partitioning became a key indicator of its scalability (though not the only one). Database scalability is a perennial topic, and MySQL advocates are often asked: how do you handle application data on a single database once it needs to be partitioned? The answer is sharding.
Sharding is not a feature attached to a particular database product but an abstraction above specific technical details. It is a solution for horizontal scaling (scale-out) whose main purpose is to break through the I/O capacity limits of a single-node database server and so resolve database scalability problems.
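One common way to decide which shard holds a record is to hash its key. The sketch below assumes that approach; the text itself does not prescribe a routing rule, and `shard_for` and `ShardedStore` are hypothetical names, with plain dictionaries standing in for separate database servers.

```python
import hashlib

def shard_for(key, shard_count):
    """Map a key to a shard index with a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

class ShardedStore:
    """Toy key-value store that spreads keys over several 'servers'."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)

store = ShardedStore(shard_count=4)
for user_id in range(1000):
    store.put(f"user:{user_id}", {"id": user_id})

print(store.get("user:42"))                    # {'id': 42}
print([len(shard) for shard in store.shards])  # roughly even sizes
```

A simple modulo scheme like this moves most keys when the shard count changes, which is one reason real deployments often use lookup tables or consistent hashing instead.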

Cache
A cache is a temporary data exchange area. The computer takes the most frequently used files out of storage and keeps them in the cache for the time being, just as tools and materials are laid out on a workbench, which is more convenient than fetching them from the warehouse every time. Because caches usually use RAM (volatile storage that loses its contents when power is cut), files are written to a hard disk or other storage for permanent keeping once the work is done. The largest cache in a computer is main memory; the fastest are the L1 and L2 caches built into the CPU; video memory serves as a cache for the GPU; and hard disks also have 16 MB or 32 MB caches. Do not think of a cache as one specific thing: it is a general term for a way of handling data.
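The workbench analogy maps directly onto a small least-recently-used cache. The sketch below uses `functools.lru_cache` from the Python standard library; `fetch_from_warehouse`, `get_item`, and the `stats` counter are hypothetical names standing in for a slow backing store and its access count.

```python
from functools import lru_cache

# Counts how often the slow backing store (the "warehouse") is visited.
stats = {"warehouse_trips": 0}

def fetch_from_warehouse(item):
    """Stand-in for a slow lookup, such as a disk read."""
    stats["warehouse_trips"] += 1
    return f"contents of {item}"

@lru_cache(maxsize=2)  # the "workbench" holds at most two items
def get_item(item):
    return fetch_from_warehouse(item)

get_item("hammer")  # miss: fetched from the warehouse
get_item("hammer")  # hit: served from the cache
print(stats["warehouse_trips"])  # 1
```

Because the cache holds only two entries, asking for a third distinct item evicts the least recently used one, and the next request for the evicted item goes back to the warehouse.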

I have written an article about caching before: "The ubiquitous cache in Web application development", http://iamcaihuafeng.blog.sohu.com/131637030.html

Message Queuing
A message is a unit of data that is transferred between two computers. Messages can be very simple, such as containing only text strings, or they can be more complex and may contain embedded objects.
Messages are sent to a queue. A message queue is the container that holds messages while they are in transit. The message queue manager acts as an intermediary when relaying a message from its source to its destination. The main purpose of a queue is to provide routing and to guarantee delivery: if the recipient is unavailable when the message is sent, the queue retains the message until it can be delivered successfully.
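The "retain until the recipient is available" behaviour can be shown with the in-process `queue.Queue` class from the Python standard library. This is only a sketch: a real message queue runs as a separate service between machines, and the helper names `send` and `receive_all` and the sample messages are hypothetical.

```python
import queue

def send(mailbox, message):
    """Producer side: hand the message to the queue and return immediately."""
    mailbox.put(message)

def receive_all(mailbox):
    """Consumer side: drain every message currently waiting, in arrival order."""
    delivered = []
    while True:
        try:
            delivered.append(mailbox.get_nowait())
        except queue.Empty:
            return delivered

mailbox = queue.Queue()

# The recipient is not running yet; the queue simply holds the messages.
send(mailbox, "order created")
send(mailbox, "order paid")

# Later the recipient comes online and receives everything, oldest first.
print(receive_all(mailbox))  # ['order created', 'order paid']
```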

Distributed Systems
A distributed system is a software system built on top of a network. Because of the nature of this software, distributed systems are highly cohesive and transparent. The difference between a network and a distributed system therefore lies more in the high-level software (especially the operating system) than in the hardware. Cohesion means that each database node is highly autonomous and has its own local database management system. Transparency means that each database node is transparent to the user's application, which cannot tell whether the data is local or remote. In a distributed database system, users are not aware that the data is distributed: they do not need to know whether a relation is partitioned, whether replicas exist, at which site the data is stored, or at which site a transaction executes.

Reverse Proxy
A reverse proxy is a proxy server that accepts connection requests from the Internet, forwards them to a server on the internal network, and returns that server's results to the Internet client that requested the connection. To the outside world, the proxy server appears to be the server itself.

Load Balancing
Load balancing builds on the existing network structure. It provides an inexpensive and effective way to extend the bandwidth of network devices and servers, increase throughput, strengthen network data processing capability, and improve network flexibility and availability.

Load balancer (Load Balance)
As traffic grows, the processing load and computational intensity at the core of an existing network grow with it, until a single server can no longer cope. Discarding the existing equipment for a large hardware upgrade wastes existing resources, and the next rise in business volume would require yet another costly upgrade; even equipment with outstanding performance may not keep up with the growth in business volume.

Load balancing has two meanings. First, a large volume of concurrent access or data traffic is spread across multiple node devices and handled separately, reducing the time users wait for a response. Second, a single heavy computation is split across multiple node devices for parallel processing; when every node has finished, the results are combined and returned to the user, which greatly increases the system's processing capacity.
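The first meaning, spreading incoming requests over several nodes, can be sketched with a round-robin policy. Round-robin is just one possible policy chosen here for simplicity; the class name `RoundRobinBalancer` and the node names are hypothetical.

```python
from collections import Counter
from itertools import cycle

class RoundRobinBalancer:
    """Hand out backend nodes in a fixed rotation."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self._rotation = cycle(list(nodes))

    def pick(self):
        """Return the node that should handle the next request."""
        return next(self._rotation)

balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
assigned = [balancer.pick() for _ in range(9)]

print(assigned[:4])       # ['node-a', 'node-b', 'node-c', 'node-a']
print(Counter(assigned))  # each node handled 3 of the 9 requests
```

Round-robin ignores how busy each node is; policies such as least-connections or weighted rotation address that at the cost of tracking more state.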

RAID
RAID is an abbreviation for "Redundant Array of Independent Disks". The technology was born in 1987 and was proposed by the University of California, Berkeley. Put simply, N hard drives are combined through a RAID controller (either hardware or software) into a single virtual large-capacity drive. Adopting RAID brings great benefits to a storage system (or a server's built-in storage), chiefly a higher transfer rate and fault tolerance.

RAID disk array (Redundant Array of Independent Disks)
Its distinguishing features are faster reads, because N drives are read at the same time, and fault tolerance. RAID is therefore used as the storage for everyday primary data access, not as a backup solution.

Simply put, RAID combines multiple independent (physical) hard disks in different ways to form a disk group (a logical hard disk), providing higher storage performance than a single drive as well as a form of data backup. The different ways of composing a disk array are called RAID levels.

The different techniques a disk array uses for different applications are called RAID levels (RAID originally stood for Redundant Array of Inexpensive Disks). Each level represents one technique; the levels currently recognized by the industry are RAID 0 through RAID 5.

The level number does not indicate how advanced the technology is: level 5 is not superior to level 3, and level 1 is not inferior to level 4. Which RAID level to choose depends purely on the user's operating environment and application, and has no necessary connection with how high or low the level number is.

A basic concept in RAID is EDAP (Extended Data Availability and Protection), which emphasizes extensibility and fault tolerance. It is also what vendors such as Mylex, IBM, HP, Compaq, Adaptec, and Infortrend emphasize, and it covers the following operations, all of which can be handled without downtime:

RAID disk arrays support automatic detection of failed hard drives;
RAID disk arrays support rebuilding data from bad tracks on a hard drive;
RAID disk arrays support hot spare drives without downtime;
RAID disk arrays support hot swap replacement of drives without downtime;
RAID disk arrays support expansion of drive capacity.
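One widely used way for an array to rebuild lost data is XOR parity, the idea behind parity-based levels such as RAID 5. The text above does not describe the mechanism, so the sketch below is an illustrative assumption: `xor_blocks` and `rebuild_missing` are hypothetical names, and short byte strings stand in for disk blocks.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length blocks together, byte by byte."""
    if not blocks:
        raise ValueError("at least one block is required")
    if any(len(block) != len(blocks[0]) for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])

# Three data blocks, each imagined to live on a different drive.
d0, d1, d2 = b"ABCD", b"EFGH", b"IJKL"
parity = xor_blocks([d0, d1, d2])  # stored on a fourth drive

# The drive holding d1 fails; its contents are rebuilt from the rest.
print(rebuild_missing([d0, d2], parity))  # b'EFGH'
```

This works because XOR-ing a value with itself cancels out, so combining the parity with every surviving block leaves exactly the missing one. It tolerates the loss of a single block per stripe, not two.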

SSD
SSD stands for Solid State Disk: a hard disk built from solid-state electronic storage chips. It is widely used in military, automotive, industrial control, video surveillance, network monitoring, network terminal, power, medical, aviation, navigation equipment, and other fields.

The most significant advantage of SSDs over hard drives is speed. For example, a 15,000 rpm hard drive needs 4 milliseconds for one full rotation (60 seconds divided by 15,000), whereas an SSD, because its data is stored in semiconductor memory, can complete an I/O (input/output) operation on a storage cell at any position in less than a millisecond. As a result, on IOPS (the number of I/O operations per second), the most critical I/O performance indicator for many applications, an SSD can reach 50 to 1000 times the figure of a hard disk.

Flash-based SSDs offer high data security and have advantages in noise, portability, and other respects that hard drives cannot match. They are widely used in aerospace, military, finance, telecommunications, e-commerce, and other sectors.

IDC
IDC (Internet Data Center) refers to the various value-added services provided on the Internet. These include domain name registration, rental of virtual hosting space, server hosting, and other services.

IDC, the Internet Data Center, is a set of operation and maintenance facilities and related services, built on the Internet, for the centralized collection, storage, processing, and sending of data. An IDC's main business includes hosting (rental of server slots, racks, and VIP machine rooms), resource leasing (such as virtual hosting and data storage services), system maintenance (system configuration, data backup, and troubleshooting services), management services (such as bandwidth management, traffic analysis, load balancing, intrusion detection, and system vulnerability diagnosis), and other support and operation services.

There is currently no uniform standard for the IDC concept, but it can be understood as a public, commercial Internet "machine room". It is also a form of IT professional service and an important piece of infrastructure for the IT industry. IDC is not only a service concept but also a network concept: it forms part of the network's basic resources and, like the backbone and access networks, provides high-end data delivery services and high-speed access services.

CDN
CDN stands for Content Delivery Network. Its goal is to add a new layer of network architecture to the existing Internet and publish a website's content to the network "edge" closest to users, so that users can fetch the content they need from nearby. This relieves congestion on the Internet and improves the response speed users see when visiting the site. It is a comprehensive technical answer to slow website response caused by limited network bandwidth, heavy user traffic, and unevenly distributed sites. (In other words, the content of one server is distributed across multiple servers, and the system intelligently identifies the server nearest to the user so that the user is served from there, improving speed.)

CDN technology is one of the most effective means of solving Internet performance problems; it emerged and developed rapidly in the United States in recent years. The basic idea is to avoid, as far as possible, the bottlenecks and links on the Internet that can affect the speed and stability of data transmission, so that content is delivered faster and more reliably. By placing node servers throughout the network to form a layer of intelligent virtual network on top of the existing Internet, a CDN system can redirect a user's request to the nearest service node, based on network traffic, each node's connections and load, and the distance to the user and the response time.
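The redirection decision can be pictured as a selection over candidate nodes. The sketch below is a deliberately simple assumption about how such a choice might be scored (skip unreachable or overloaded nodes, then prefer the fastest response, with distance as a tie-breaker); real CDNs use their own, more elaborate rules. `EdgeNode`, `choose_node`, the 0.9 load limit, and all sample figures are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class EdgeNode:
    name: str
    distance_km: float      # distance from the user
    load: float             # 0.0 = idle, 1.0 = saturated
    response_ms: float      # recently measured response time
    reachable: bool = True  # connection status

def choose_node(nodes, max_load=0.9):
    """Pick the reachable, non-overloaded node with the best response time."""
    candidates = [n for n in nodes if n.reachable and n.load < max_load]
    if not candidates:
        raise LookupError("no edge node available")
    return min(candidates, key=lambda n: (n.response_ms, n.distance_km))

nodes = [
    EdgeNode("edge-east", distance_km=120, load=0.95, response_ms=18),
    EdgeNode("edge-north", distance_km=300, load=0.40, response_ms=25),
    EdgeNode("edge-west", distance_km=900, load=0.10, response_ms=60),
]

print(choose_node(nodes).name)  # edge-north
```

The geographically closest node is skipped here because it is overloaded, which shows why distance alone is not enough to pick a node.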

Bandwidth
Bandwidth (band width), also called frequency width, is the amount of data that can be transmitted in a fixed amount of time, that is, the data-carrying capacity of a transmission channel. In digital devices, bandwidth is usually expressed in bps, the number of bits that can be transmitted per second. In analog devices, bandwidth is usually expressed in cycles per second, or hertz (Hz).
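Because bandwidth is quoted in bits per second while file sizes are usually quoted in bytes, a small conversion helps when estimating transfer times. The sketch below assumes 8 bits per byte and an ideal link with no protocol overhead; the function names and the sample figures (a 10 * 10^6 bps link and a 100 * 10^6 byte file) are hypothetical.

```python
BITS_PER_BYTE = 8

def bytes_per_second(bits_per_second):
    """Convert a link speed in bps to bytes per second."""
    return bits_per_second / BITS_PER_BYTE

def transfer_seconds(size_bytes, bits_per_second):
    """Ideal time to move size_bytes over the link, ignoring overhead."""
    if bits_per_second <= 0:
        raise ValueError("bits_per_second must be positive")
    return size_bytes * BITS_PER_BYTE / bits_per_second

link_bps = 10 * 10**6  # a 10 Mb/s link

print(bytes_per_second(link_bps))               # 1250000.0
print(transfer_seconds(100 * 10**6, link_bps))  # 80.0
```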

Bandwidth is the "highest data rate" that can pass from one point in a network to another per unit of time. A vivid analogy for bandwidth is a highway. It is the amount of data a line can carry per unit of time, commonly measured in bps (bits per second). The bandwidth of a computer network is the highest data rate the network can carry, that is, how many bits per second.

Strictly speaking, the bandwidth of a digital network should be expressed as a baud rate (baud), the number of pulses per second, whereas the bit is a unit of information. Because digital devices use binary, each signal level carries 1 bit of information (the base-2 logarithm of 2; with four levels it would be the base-2 logarithm of 4, so each level would carry 2 bits). Numerically, therefore, the baud rate and the bit rate are the same for binary signalling. Because people are often unclear about the difference between the two concepts, the bit rate is commonly used to express the rate, and because so many people use it, bit rate has become the de facto standard term for bandwidth.

"Bits per second" is often omitted when describing bandwidth. For example, a bandwidth of 10M actually means 10 Mb/s, where M is 10^6.
