Abstract: On June 29, 2016, the Cloud Habitat Chengdu Summit opened, and Aliyun senior expert Jianzhi delivered the talk "Massive game log storage and analysis". Starting from how data and cloud computing are changing the game industry, he walked through the whole story of the log service: the role of logs, the challenges of processing them, and the principles and data model of the log channel. He closed with some of the log service's features and typical application scenarios. Let's take a look.
Games and log analysis
Data and cloud computing are changing the game industry
Let's look at a chart from a report on an overseas app market. It covers the last 4 years and shows how long a game takes, from the day it is listed, to reach 90% of its total downloads. The horizontal axis is the year and the vertical axis is the number of weeks. In 2012 an average game lasted 180 weeks (that is, it kept being downloaded for roughly three and a half years), but the figure has slid every year, and by 2015 it was down to 24 weeks. Games have entered the fast-food era.
Whatever the reasons behind it, the overall trend is that the game industry has moved from a seller's market (20 years ago players lent game cartridges to each other, and a single cartridge was hard to come by) to today's buyer's market.
The second trend is that cloud computing has changed the industry. One notable effect is that the time needed to deploy a game and bring it online has shortened. The formerly heavy operations and maintenance workload keeps shrinking, and traditional operations and maintenance is shifting toward business operations. This is a challenge of the times, but also good luck for full-stack engineers.
The two points above are fairly general problems; the third is the opportunity brought by big data. Let's zoom in on the 24 weeks of 2015 to see where the opportunity lies. As the chart shows, a game usually goes through 4 stages: research and development, growth, maturity, and decline. In the growth stage we meet imitators who try to seize market share. How do we deal with imitators? Make fewer mistakes, get closer to users, and adjust the business strategy in real time.
The game industry changes fiercely, but the user is always the same
Although competition in the market is fierce, user habits have remained unchanged for more than 20 years. Take a comic from ongamesdata. Faced with a new game, users first try the demo; some of them fall in love with the game; and some of those fans spread the word on Facebook, Twitter, and other channels, bringing in more players who pay for the game.
Objectively, the number of game users has not shrunk because of fierce market competition. On the contrary, the mobile Internet lets users enjoy games in their spare time, social media makes good games spread more easily, and users are increasingly willing to spend money. The market opportunity still exists.
What does the team need to focus on to get the user?
To make users fall in love with our game, different people on the team pull together from different angles. Here are some examples of what each role focuses on:
Game director: conversion rate, ARPU, ARPPU, UAC
Operations: DAU, MAU, PCU
Channel: CTR, CVR, CPI
Program/Product: Eed, xed, outbound messages per user, message conversion rate
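As a rough illustration of how a few of these indicators relate to each other, here is a minimal Python sketch. It assumes the common definitions (ARPU as revenue per active user, ARPPU as revenue per paying user, and conversion rate taken here as the share of active users who pay); the `DailyStats` type and the sample numbers are hypothetical and not from the talk.

```python
from dataclasses import dataclass


@dataclass
class DailyStats:
    """Hypothetical one-day summary derived from game logs."""
    active_users: int   # DAU: distinct users active that day
    paying_users: int   # distinct users who paid that day
    revenue: float      # total revenue for the day


def arpu(s: DailyStats) -> float:
    """Average revenue per (active) user."""
    return s.revenue / s.active_users if s.active_users else 0.0


def arppu(s: DailyStats) -> float:
    """Average revenue per paying user."""
    return s.revenue / s.paying_users if s.paying_users else 0.0


def paying_conversion_rate(s: DailyStats) -> float:
    """Share of active users who paid (one possible reading of 'conversion rate')."""
    return s.paying_users / s.active_users if s.active_users else 0.0


stats = DailyStats(active_users=20_000, paying_users=500, revenue=10_000.0)
print(arpu(stats))                    # 0.5
print(arppu(stats))                   # 20.0
print(paying_conversion_rate(stats))  # 0.025
```

All three values come from the same three counters, which is why one well-collected log stream can serve several roles at once.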
Here are some examples of what they look at:
1. The reasons behind the DAU growth of FarmVille (Happy Farm)
2. Promotion routes as operations sees them: how many users the advertising money brings in, the path of a professional player, the path of a browser-game player
3. FPS game statistics, weapon (prop) balance, and level difficulty: the programmer measures weapon balance by counting the different kill events, and adjusts the difficulty and layout of the levels through map visualization
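The kill-event statistics in the FPS example could be sketched as follows. This is only an illustration: the event fields (`killer`, `victim`, `weapon`, `map`) and the sample events are hypothetical, and a real balance analysis would look at far more than raw kill share.

```python
from collections import Counter

# Hypothetical kill events as they might appear in a game log.
kill_events = [
    {"killer": "p1", "victim": "p2", "weapon": "rifle", "map": "dust"},
    {"killer": "p3", "victim": "p1", "weapon": "sniper", "map": "dust"},
    {"killer": "p2", "victim": "p3", "weapon": "rifle", "map": "dust"},
    {"killer": "p1", "victim": "p3", "weapon": "knife", "map": "dust"},
]


def weapon_kill_share(events):
    """Return each weapon's share of all kills, as a fraction between 0 and 1."""
    counts = Counter(e["weapon"] for e in events)
    total = sum(counts.values())
    return {weapon: n / total for weapon, n in counts.items()} if total else {}


shares = weapon_kill_share(kill_events)
print(shares)  # {'rifle': 0.5, 'sniper': 0.25, 'knife': 0.25}
```

A weapon whose share stays far above the others across many matches is a candidate for rebalancing.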
Collect user behavior and optimize the whole process of the game
What does the game team need to do to get these results? The work can be roughly split into three steps:
1. Instrument the game (bury tracking points) during the development phase
2. Collect data through the various channels
3. Analyze the data along multiple dimensions and act on the results
What strings the client and the server together throughout this process is data.
Two kinds of state in a game: snapshot state and incremental log
To better understand how logs relate to games, let's look at what log data is and how it relates to a game:
From the user's side, a game is the alternation of two behaviors: action and drawing. When the mouse moves or a key is pressed, we change the position and state of the protagonist, and then the rendering engine draws. We can sample the state of the game at a point in time. For example, at 10:06 we can record the position, money, and hand-held weapon of every protagonist in the game; this state reflects the whole system at that point in time.
The log is the change between one state and the next, for example what the user did between 10:00 and 10:06. The best thing about a log is that it records every detail.
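The relationship between a snapshot and the incremental log can be made concrete with a small sketch: replaying the log entries recorded between 10:00 and 10:06 on top of the 10:00 snapshot yields the 10:06 snapshot. The player name, event types, and field names below are hypothetical and chosen only for illustration.

```python
from copy import deepcopy

# Snapshot: the full state of the game at 10:00.
snapshot_1000 = {"alice": {"pos": (0, 0), "gold": 100, "weapon": "sword"}}

# Incremental log: what changed between 10:00 and 10:06.
log = [
    {"time": "10:01", "player": "alice", "action": "move", "to": (3, 4)},
    {"time": "10:03", "player": "alice", "action": "earn", "gold": 50},
    {"time": "10:05", "player": "alice", "action": "equip", "weapon": "bow"},
]


def apply_event(state, event):
    """Apply one log entry to the state in place."""
    player = state[event["player"]]
    if event["action"] == "move":
        player["pos"] = event["to"]
    elif event["action"] == "earn":
        player["gold"] += event["gold"]
    elif event["action"] == "equip":
        player["weapon"] = event["weapon"]
    else:
        raise ValueError(f"unknown action: {event['action']}")


def replay(snapshot, events):
    """Rebuild a later state from an earlier snapshot plus the log in between."""
    state = deepcopy(snapshot)  # leave the original snapshot untouched
    for event in sorted(events, key=lambda e: e["time"]):  # zero-padded HH:MM sorts correctly
        apply_event(state, event)
    return state


snapshot_1006 = replay(snapshot_1000, log)
print(snapshot_1006)  # {'alice': {'pos': (3, 4), 'gold': 150, 'weapon': 'bow'}}
```

The same replay idea is what makes it possible to restore lost data from a log after a machine failure.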
Besides helping operations and channels work more efficiently, as mentioned earlier, logs play other roles in a game:
Help users: recover lost equipment
Repair data: when a machine goes down and data is lost, it can be restored from the log
Detect anomalies: find theft, cheating, and similar behavior
Advertising: when a player cannot beat a boss, work out which props are missing
User portrait: age, sex, and what the player likes
Logs also help operations: when a user reports lag, or a login fails at some step, the log shows what the reason behind it is.
Log processing faces a variety of challenges
Logs play so many roles. So what are the challenges in processing them?
The first challenge is that making a game involves cooperation on many fronts, for example the game publisher, the mobile client, the web client, the server side, and so on. Logs therefore have to be collected from multiple dimensions and multiple channels, and each kind of log has its own collection method: to analyze channels we need to instrument the website; to get user behavior we need to record player trajectories from mobile devices, the server side, and so on; to analyze service stability we need to observe the delay of
Here we need a unified data model that supports the data from every channel, so that events are represented in one uniform way.
The second challenge comes from scale, performance, and stability. As a concrete example, suppose we need to collect 1 KB of data per second from each user. With 1 million (100W) players online, that is 1 KB x 1,000,000, or about 1 GB/s to process, which is a big challenge. How to keep performance stable as the data scale grows is what engineers care about.
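A quick back-of-the-envelope check of the scale in this example, as a Python sketch. It assumes decimal units (1 KB = 1,000 bytes), reads "100W" as 1,000,000 players, and assumes every online player produces data at the stated rate; the constant and function names are made up for the illustration.

```python
BYTES_PER_USER_PER_SEC = 1_000      # 1 KB per user per second (decimal units assumed)
CONCURRENT_PLAYERS = 1_000_000      # "100W" = 100 x 10,000 players online
SECONDS_PER_DAY = 86_400


def ingest_rate_bytes(per_user: int, players: int) -> int:
    """Total bytes per second that the log pipeline must accept."""
    return per_user * players


rate = ingest_rate_bytes(BYTES_PER_USER_PER_SEC, CONCURRENT_PLAYERS)
rate_mb_per_s = rate / 1_000_000
volume_tb_per_day = rate * SECONDS_PER_DAY / 1_000_000_000_000

print(rate_mb_per_s)      # 1000.0  (about 1 GB/s)
print(volume_tb_per_day)  # 86.4    (TB per day if the load were sustained)
```

Even if real traffic is well below this peak for most of the day, the pipeline has to be sized for the peak rate, which is why elastic scaling matters.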
The third challenge comes from demand. As mentioned earlier, different people on the game team want different things from the same data. Take the access log as an example: operations wants to count active users, operations and maintenance cares about latency and access status, and developers care about which resources are hot and need optimizing. Therefore we need a single copy of the data that supports many kinds of processing and statistics.
Aliyun Log Service
Let's see how the Aliyun Log Service helps with these problems. Before it was opened to the public on the Aliyun official website, the log service had been hardened inside Alibaba for more than 3 years and had reached an industry-leading level. It can also play a major role in game log analysis.
The log service provides three main functions:
Real-time log collection and consumption (Loghub: the central log channel)
Connecting logs with the data warehouse (Logshipper: log delivery)
Query and analysis of massive logs (Logsearch: log retrieval)
Loghub is the core function of the log service. It connects log sources with log consumers through a unified data model. Its biggest advantage is that it turns the data flow into a standardized pipeline. The pipeline offers large capacity, high reliability, and elastic scaling, so users do not need to worry about data volume or about how to connect upstream and downstream systems; they can simply use it. Loghub has SDKs in 10+ languages, provides its own agent, and supports third-party agents as well as protocols such as syslog and web tracking. On the consumption side it connects to 10+ downstream systems, including Spark and Storm, which are very popular in the open source world.
Logshipper delivers data from the Loghub pipeline to storage. It currently connects to mass storage systems such as OSS, ODPS, and OTS. The data can then be processed and analyzed with MapReduce, Hive, and other tools.
Logsearch is an optional feature that lets you choose which data in Loghub to index and query, such as app logs and error logs, so that you can quickly locate the relevant records when a problem occurs. Inside Alibaba we centrally collect and index key logs from thousands of machines, at a scale of up to hundreds of TB.
Loghub (log channel) and basic log concepts
A log is a series of records that only grows and is strictly ordered by time. It looks as follows:
The order of a log is determined by time. In the figure, the log runs chronologically from right to left: new events are recorded and past events fade away, but the log records what happened at what time, which is the basis of cognition and inference for computers, humans, and the world as a whole.
The basic concepts and data model of the log channel (Loghub) are as follows:
Log: the smallest unit of data processed by the log service. The log service defines a log with a semi-structured data model: a log contains a time field and a set of JSON-style key-value pairs.
Log group (LogGroup): a collection of logs, and the basic unit of writing and reading. A log group is limited to at most 4096 log lines or 10 MB.
Partition (Shard): the basic read and write unit under each log library. Users can specify the number of partitions under each log library, and each partition carries a certain amount of service capacity.
Log library (Logstore): the unit of collection, storage, and querying of log data in the log service. Each log library belongs to one project, and each project can contain multiple log libraries. Users can create multiple log libraries in a project as needed, and the common practice is to create a separate log library for each type of log in an application. For example, suppose a user has a game called "big-game" with three kinds of logs on the server: the operation log (Operation_log), the application log (Application_log), and the access log (Access_log). The user can first create a project named "big-game" and then create three log libraries under it, one per log type, for collection, storage, and querying.
Project: the basic management unit in the log service, used for resource isolation and control. A user can manage all the logs and related log sources of an application through one project.
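To make the Log and LogGroup concepts more tangible, here is a minimal Python sketch that models a log as a time field plus key-value pairs and packs logs into groups under the two limits quoted above (4096 lines or 10 MB). It does not use the Aliyun SDK: the `Log`, `LogGroup`, and `batch_into_groups` names are hypothetical, measuring size by JSON encoding is an assumption, and 10 MB is taken as 10 x 1024 x 1024 bytes.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List

MAX_LOGS_PER_GROUP = 4096              # limit quoted in the text
MAX_GROUP_BYTES = 10 * 1024 * 1024     # "10 MB", binary units assumed


@dataclass
class Log:
    """One log: a time field plus semi-structured key-value contents."""
    time: int
    contents: Dict[str, str]

    def size(self) -> int:
        # Assumption: approximate the log's size by its JSON encoding.
        payload = {"time": self.time, **self.contents}
        return len(json.dumps(payload).encode("utf-8"))


@dataclass
class LogGroup:
    """A collection of logs written or read together."""
    logs: List[Log] = field(default_factory=list)


def batch_into_groups(logs: Iterable[Log]) -> Iterator[LogGroup]:
    """Pack logs into groups without exceeding either limit.

    A single log larger than MAX_GROUP_BYTES still gets a group of its own;
    rejecting or splitting such a log is left out of this sketch.
    """
    group, group_bytes = LogGroup(), 0
    for log in logs:
        n = log.size()
        full = (len(group.logs) >= MAX_LOGS_PER_GROUP
                or group_bytes + n > MAX_GROUP_BYTES)
        if group.logs and full:
            yield group
            group, group_bytes = LogGroup(), 0
        group.logs.append(log)
        group_bytes += n
    if group.logs:
        yield group


sample = [Log(time=i, contents={"event": "login", "user": "u1"}) for i in range(5000)]
sizes = [len(g.logs) for g in batch_into_groups(sample)]
print(sizes)  # [4096, 904]
```

Batching on the client side like this keeps each write within the group limits while still sending many logs per request.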
What are the advantages of Loghub?
Stable, reliable, and high-performance: it has been tempered inside Alibaba Group for many years and has withstood traffic at the PB/day level. In particular, the client's performance and resource consumption are more than 10 times better than those of open source software.
Elastic scaling: when business changes cause the data volume to change, you can respond calmly.
Rich upstream and downstream support: mobile clients, web pages, switches, devices (ARM platform), and so on; native integration with ECS, MNS, OSS, CDN, and ContainerService; easy docking with MaxCompute; support for Unity3D and other game development SDKs; support for Logagent, Logstash, Log4j, syslog, and other systems.
Besides Loghub, here is a brief mention of the Logshipper and Logsearch functions.
Logshipper is an add-on to Loghub that delivers real-time log data to storage services (OSS, ODPS, OTS) for offline analysis and computation. Its biggest benefits are zero cost, convenience, reliability, and high throughput.
Logsearch (formerly SLS) indexes log data in real time, at a scale of up to hundreds of TB per day, and provides convenient query capability over large volumes of data. It mainly fills the gap between Loghub's real-time consumption and Logshipper's delivery to the data warehouse, providing a lightweight, near-real-time query capability.
For example, during game development we have many systems distributed across different machines. We only need to collect and index the logs of these systems; then we can quickly locate a user's behavior by searching on the user's ID, status, and so on.
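The idea of indexing logs from many machines and then searching by user ID and status can be sketched with a tiny in-memory inverted index. This is only an illustration of the principle, not how Logsearch is implemented; the `LogIndex` class, the field names, and the sample logs are all hypothetical.

```python
from collections import defaultdict


class LogIndex:
    """Minimal inverted index: (field, value) -> positions of matching logs."""

    def __init__(self):
        self._logs = []
        self._index = defaultdict(set)

    def add(self, log: dict) -> None:
        """Store a log and index every field by its exact value."""
        pos = len(self._logs)
        self._logs.append(log)
        for key, value in log.items():
            self._index[(key, str(value))].add(pos)

    def search(self, **conditions) -> list:
        """Return logs matching all field=value conditions, in insertion order."""
        if not conditions:
            return list(self._logs)
        hit_sets = [self._index.get((k, str(v)), set())
                    for k, v in conditions.items()]
        hits = set.intersection(*hit_sets)
        return [self._logs[i] for i in sorted(hits)]


index = LogIndex()
index.add({"host": "login-01", "user_id": "u42", "status": "ok", "action": "login"})
index.add({"host": "battle-07", "user_id": "u42", "status": "error", "action": "join_room"})
index.add({"host": "battle-03", "user_id": "u99", "status": "error", "action": "join_room"})

errors_for_u42 = index.search(user_id="u42", status="error")
print(errors_for_u42)
```

Because every field value points directly at the matching logs, a query touches only the relevant entries instead of scanning the logs of every machine.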
Riding the wind and waves in the digital world
Finally, let's look at a typical log solution for games and a summary of the log analysis process in game scenarios: instrument the program with logs, collect the logs, analyze the logs (query, statistics, reports, alarms, and so on), and take action.
For more interesting content, follow the Log Service home page and the log processing circle. Thank you.