Problem 1: It is difficult for game services to be stateless
Game service architecture is quite different from typical Internet service architecture. Because games have strict real-time requirements, many game services cannot rely on a centralized distributed cache, which makes it hard to make real game services stateless. As a result, microservice solutions cannot be applied to games directly. Below I compare games with Internet services and describe how game services are split and decoupled.
Problem 2: Manpower shortage
Staff shortage is a common problem at many companies. In the game companies I have worked at, the staffing of a mobile game project is usually 5-6 people on the front end, 3-4 people on the back end, 1-2 testers, and 1 person for operations and maintenance. It is therefore hard to assign a dedicated person to build the automated DevOps process; someone has to find the time, often by working overtime, and push the implementation forward alone.
Problem 3: Cross-departmental collaboration, high cost of early communication training
During the transformation, communication and collaboration between departments had previously been separated by a "wall", and people's knowledge and understanding differed, so team members either did not support the change or cooperated only slowly. We can build an environment that tolerates failure by encouraging shared responsibility, establishing automated processes, breaking down departmental walls, creating a DevOps culture that rewards proactive change, and changing how risk is managed.
Problem 4: The initial investment is large and the effect is small
When a project is short of people at the start and the schedule is tight, it still has to do infrastructure construction, communication training, and so on. The input cost is high and the visible effect is low, so leadership can easily lose confidence. The implementation of DevOps therefore needs to proceed in stages, improving the process gradually, with the basic criterion that each stage meets the current business needs, which is also the principle of Yijing Software. My work is generally divided into three periods: the product prototype period, the product testing period, and the product operation period.
Product prototype period: This is the early stage of development, so we generally only need to set up the Git code repository and Jenkins continuous integration, and use FindBugs or SonarQube for static code analysis.
Product testing period: Building on the previous stage, integrate Gradle with Jenkins to automate building and packaging, unit testing, deployment to the test environment, and related processes.
Product operation period: Finally, complete the pipeline to automate deployment to the pre-production and production environments and to support grayscale updates.
The ideas and concepts behind DevOps are the best solution I have seen so far, but putting DevOps into practice depends largely on its complete set of technologies and open source tools. Next, let's take a look at the technology stack that DevOps involves.
Technology Stack
This section touches on a lot of material, so I will only briefly introduce some of the common open source DevOps tools. You can choose among them according to your needs. Of course, you can also use an integrated team environment such as VSTS (Visual Studio Team Services).
Some of them are covered in detail in my new book, such as code repository management, virtual machines and containerization, the continuous integration and continuous deployment tool Jenkins, and the configuration management tool SaltStack.
Agile management tools
Trello
Teambition
Worktile
Tower
Product & Quality Management
Confluence
ZenTao
Jira
Bugzilla
Among them, Confluence and ZenTao are mainly comprehensive management tools for product requirements, definitions, dependencies, and promotion, while Jira and Bugzilla provide product quality management and monitoring, including test cases, defect tracking, and quality monitoring. Currently we mostly use Jira.
Code repository management
Git
GitLab
GitHub
Git is an open source distributed version control system. GitLab and GitHub are repository management systems that use Git as the code management tool and build a web service on top of it; GitLab offers an open source edition, while GitHub is a hosted service. We mainly use Git and GitLab.
Development process specification
Git Flow
Git Flow is a branching model and best practice for organizing software development on top of Git. It also comes with a set of conventions and tools that simplify some Git operations when Git is used for source code management.
GitHub Flow
GitHub Flow is a simpler alternative to Git Flow. It uses only feature branches and a single master branch, which keeps it simple and clean.
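GitLab Flow
GitHub Flow assumes that merging a feature branch means the code can be deployed online directly. GitLab Flow covers the cases where that does not hold by adding environment branches (such as pre-production and production) or release branches alongside the master branch.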
Selenium
Selenium tests run directly in the browser, just as a real user would operate it. They can run in Internet Explorer, Mozilla, and Firefox on Windows, Linux, and Macintosh.
Mock test
Mock testing is a test method that uses a virtual object in place of objects that are hard to construct or obtain during testing. This virtual object is the mock object, and it stands in for the real object during testing and debugging. Commonly used mock frameworks in Java are EasyMock and Mockito.
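To illustrate the idea, here is a minimal sketch that uses Python's standard unittest.mock module rather than the Java frameworks named above. The function grant_daily_reward and the mail_service collaborator are hypothetical names invented for this example; the point is only that the mock replaces an object that would be hard to construct in a test.

```python
from unittest.mock import Mock


def grant_daily_reward(player_id, mail_service):
    """Send a reward mail and report whether the (hypothetical) mail service accepted it."""
    accepted = mail_service.send(player_id, "daily_reward", gold=100)
    return "granted" if accepted else "retry_later"


# The real mail service is hard to construct in a test, so a Mock stands in for it.
mail_service = Mock()
mail_service.send.return_value = True

result = grant_daily_reward(42, mail_service)

# Verify how the collaborator was called, not only the returned value.
mail_service.send.assert_called_once_with(42, "daily_reward", gold=100)
```

Changing mail_service.send.return_value to False lets the same test exercise the failure path without touching any real service.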
Consumer-driven contract testing
A contract test is a test of an external service's interface that verifies whether the service meets the contract its consumers expect. When a consumer uses behavior that a component provides through an interface, a contract is created between them. The contract contains expectations about input and output data structures, performance, and concurrency. Pact is currently one of the more streamlined consumer-driven contract testing frameworks.
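The data-structure part of such a contract can be sketched in a few lines of plain Python. This is not the Pact API; the contract format, the field names, and the provider_get_player handler are all assumptions made up for illustration.

```python
# Contract written by the consumer: the fields it reads and the types it expects.
# The field names are hypothetical.
PLAYER_PROFILE_CONTRACT = {"id": int, "name": str, "level": int}


def contract_violations(response, contract):
    """Return a list of violations; an empty list means the contract holds."""
    violations = []
    for field, expected_type in contract.items():
        if field not in response:
            violations.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            violations.append(
                f"wrong type for {field}: {type(response[field]).__name__}"
            )
    return violations


def provider_get_player(player_id):
    """Stand-in for the provider's real handler; extra fields are allowed."""
    return {"id": player_id, "name": "alice", "level": 7, "guild": "red"}


# Run on the provider side: does the provider still satisfy its consumer?
violations = contract_violations(provider_get_player(1), PLAYER_PROFILE_CONTRACT)
```

Because the consumer owns the contract, the provider may add fields freely (such as guild here) but learns immediately if it removes or retypes a field that a consumer depends on.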
Automated operation and maintenance tools
Ansible
Puppet
Chef
IT operation and maintenance automation means automating the large volume of repetitive daily work in IT operations, turning tasks that used to be performed manually into automated ones. It is not just a maintenance process but also a process of management improvement. It is the highest level of IT operation and maintenance, and also the direction in which the field is developing.
Monitoring and management tools
Zabbix
Zabbix is an enterprise-level open source solution that provides distributed system monitoring and network monitoring through a web interface.
ELK Stack log analysis system
ELK Stack is an open source log processing platform, and the commercial company behind it is Elastic. It consists of three parts: Elasticsearch, a full-text search engine based on Lucene; Logstash, which collects and processes logs; and Kibana, the analysis and visualization platform.
Cloud monitoring (such as Amazon CloudWatch)
Amazon CloudWatch is a service that monitors AWS cloud resources and the applications running on AWS. You can use Amazon CloudWatch to collect and track metrics, collect and monitor log files, set alarms, and automatically respond to changes in AWS resources.
Game architecture
Comparison of game industry and Internet industry
Project iteration cycle comparison
Internet iteration model
Game project development cycle
From the comparison above, we can see that each iteration of an Internet project's requirements can be more agile and faster, because a large requirement can be split into many small, concrete implementations, which allows continuous delivery and deployment.
Iterating on a game is harder and takes longer than iterating on an Internet product, because before a game can be delivered to users, its most basic features and gameplay must be complete enough to be tested and played.
Request communication mechanism comparison
A simple comparison of games and Internet services
Internet services generally use the request-response mode, and each request is usually synchronous and blocking. Games mostly use the request-push mode, in which the result is pushed not only to the requesting player but also to other players in the game, and every request is asynchronous and non-blocking.
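The difference between the two modes can be sketched with Python's asyncio. The names handle_request_response, Room, and the player IDs are hypothetical, and the per-player queues merely stand in for network connections.

```python
import asyncio


def handle_request_response(request):
    """Request-response: the caller waits until this one reply is returned."""
    return f"reply to {request}"


class Room:
    """Request-push: one player's request is pushed to every player's outbox."""

    def __init__(self, player_ids):
        # One queue per player stands in for that player's connection.
        self.outboxes = {pid: asyncio.Queue() for pid in player_ids}

    async def handle_request(self, sender, action):
        event = f"{sender}:{action}"
        # Push to the sender and to all other players; nothing is returned.
        for outbox in self.outboxes.values():
            outbox.put_nowait(event)


async def demo():
    room = Room(["p1", "p2", "p3"])
    await room.handle_request("p1", "move")
    return {pid: outbox.get_nowait() for pid, outbox in room.outboxes.items()}


pushed = asyncio.run(demo())
```

In the first mode the result belongs to a single caller; in the second, a single request fans out into one event per connected player, which is why broadcast cost dominates game server design.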
Summary: The biggest difference between an Internet server and a game server is "state". The state of a game server changes rapidly in real time, can tolerate loss, and requires a lot of broadcast synchronization, while the state of an Internet server is generally persistent, cannot tolerate loss, and relates only to a specific client. Implementing DevOps for games is therefore much harder than for Internet services, and mature Internet practices cannot simply be copied to games. Next, I will analyze what common game service architectures look like, since the architecture is the root of a DevOps implementation.
Common game service architecture analysis: the roots of DevOps
REST card game
REST game architecture
This type of game generally communicates over HTTP. Its architecture is similar to a common web server architecture: it uses a centralized Redis cache to hold the game state, so requests can be load-balanced through nginx and the game servers can be scaled out horizontally almost without limit.
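Why a centralized cache makes horizontal scaling possible can be shown with a small sketch. Here an in-memory CentralCache class stands in for Redis and itertools.cycle stands in for the load balancer; both, along with GameServer and draw_card, are hypothetical names used only for illustration.

```python
import itertools


class CentralCache:
    """In-memory stand-in for a centralized cache such as Redis."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


class GameServer:
    """Holds no player state of its own, so any instance can serve any request."""

    def __init__(self, name, cache):
        self.name = name
        self.cache = cache

    def draw_card(self, player_id):
        key = f"player:{player_id}:cards"
        cards = self.cache.get(key, 0) + 1
        self.cache.set(key, cards)
        return cards


cache = CentralCache()
servers = [GameServer("gs-1", cache), GameServer("gs-2", cache)]
round_robin = itertools.cycle(servers)  # stand-in for the load balancer

# Three requests from the same player land on alternating servers.
counts = [next(round_robin).draw_card("p1") for _ in range(3)]
```

Since all state lives in the shared cache, adding a third server requires no data migration. Note that the read-then-write in draw_card is not atomic; with several real processes the update would need to be atomic on the cache side.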
Room-based game
Room Mode Game Architecture
On the server side, this type of game is generally divided into two parts: the lobby server and the room server. The lobby server is a large broadcast cluster responsible for data transmission and queries with lower real-time requirements. A room server is a set of small real-time broadcast service processes that can be allocated and released quickly.
In the lobby server, every online player is assigned to one of several processes according to the player's ID. Queries and broadcasts between players run on multiple servers in parallel, and the partial results are then merged into the final answer. The latency of such operations is relatively high, but it allows massive amounts of user data to be stored across different machines. The room server, by contrast, is only responsible for the broadcast functions of a specific match. Once players have formed a group and entered, the lobby server copies their data to the room server, which serves only those few players. After the match is over, the player data is cleared and the room server is ready for a new match.
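The lobby and room lifecycle described above can be sketched as follows. The sketch assumes integer player IDs and a simple modulo rule for assigning players to lobby processes; LobbyCluster, RoomServer, and their methods are hypothetical names, and plain dicts stand in for the memory of separate processes.

```python
class LobbyCluster:
    """Spreads online players across several lobby processes by player ID."""

    def __init__(self, process_count):
        # Each dict stands in for the memory of one lobby process.
        self.processes = [{} for _ in range(process_count)]

    def process_index(self, player_id):
        # Assumed placement rule: player ID modulo the number of processes.
        return player_id % len(self.processes)

    def login(self, player_id, data):
        self.processes[self.process_index(player_id)][player_id] = data

    def count_online(self):
        # A query fans out to every process and the partial results are merged.
        return sum(len(players) for players in self.processes)

    def open_room(self, player_ids):
        # Copy the grouped players' data to a room server.
        snapshot = {
            pid: dict(self.processes[self.process_index(pid)][pid])
            for pid in player_ids
        }
        return RoomServer(snapshot)


class RoomServer:
    """Serves only the few players of one match."""

    def __init__(self, players):
        self.players = players

    def finish(self):
        # Clear player data so the room can host the next match.
        self.players = {}


lobby = LobbyCluster(process_count=3)
for pid in range(1, 8):
    lobby.login(pid, {"name": f"player{pid}"})

room = lobby.open_room([2, 5])
```

The room works on a copy of the data, so clearing it after the match does not affect what the lobby holds for those players.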