"Editor's note" reproduced from Che Yun Net, author army.
This article describes the business models of OBD-based connected-car services, and the ways of mining their data, from a big data perspective. Classified by "what is sold", an OBD connected-car business sells one of three things: equipment, services, or data.
Selling equipment: OBD devices are sold to connected-car service operators. A device must contain an OBD module; its communication module is Bluetooth or GSM; its positioning module can use GSM base-station positioning or GPS; and some manufacturers' devices also include a G-sensor.
Selling services: Usually the connected-car operator provides vehicle owners with a range of services, such as fleet management. This also includes value-added services, such as a 4S dealership group using OBD devices to strengthen its ties with customers. Services are usually charged annually, and the fee may or may not include the price of the device. At present, the service model aimed at individual owners has not gained much traction.
Selling data: This approach is very much in the Internet style. It means analyzing connected-car data in order to provide personalized services; these services are not limited to using the car and focus more on the activities around it. The most common case today is selling insurance, which has evolved from the earlier mileage-based PAYD (pay as you drive), through PHYD (pay how you drive), which takes driving safety into account, to today's UBI (usage-based insurance), which combines multiple factors.
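The progression from PAYD to PHYD to UBI can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `PeriodSummary` fields, the rates, and the way the factors are combined are hypothetical and do not come from any insurer's actual pricing.

```python
from dataclasses import dataclass

# All constants are assumed values for illustration only.
BASE_PREMIUM = 1000.0       # flat base premium, arbitrary currency units
RATE_PER_KM = 0.05          # PAYD: charge per kilometre driven
SURCHARGE_PER_EVENT = 0.10  # PHYD: surcharge per harsh brake per 100 km
MAX_SURCHARGE = 0.50        # cap on the behaviour surcharge
NIGHT_WEIGHT = 0.20         # UBI: weight applied to the share of night driving


@dataclass
class PeriodSummary:
    """Hypothetical aggregates for one policy period, derived from OBD data."""
    km_driven: float
    harsh_brakes: int
    night_km: float


def behaviour_surcharge(s: PeriodSummary) -> float:
    """Surcharge fraction based on harsh-braking events per 100 km."""
    if s.km_driven <= 0:
        return 0.0
    events_per_100km = 100.0 * s.harsh_brakes / s.km_driven
    return min(MAX_SURCHARGE, SURCHARGE_PER_EVENT * events_per_100km)


def payd_premium(s: PeriodSummary) -> float:
    """PAYD: the premium depends on mileage only."""
    return BASE_PREMIUM + RATE_PER_KM * s.km_driven


def phyd_premium(s: PeriodSummary) -> float:
    """PHYD: the premium depends on how safely the car is driven."""
    return BASE_PREMIUM * (1.0 + behaviour_surcharge(s))


def ubi_premium(s: PeriodSummary) -> float:
    """UBI: combines mileage, driving behaviour, and time of use."""
    night_share = s.night_km / s.km_driven if s.km_driven > 0 else 0.0
    factor = 1.0 + behaviour_surcharge(s) + NIGHT_WEIGHT * night_share
    return payd_premium(s) * factor
```

The point of the sketch is only that each generation consumes more of the collected data: PAYD needs an odometer reading, PHYD needs event detection, and UBI needs both plus context such as time of day.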
"Sell XX" mode, not limited to the car network OBD equipment, the car before the same test. The current ubi is mainly through OBD interfaces that are easy to install and disassemble;
Controllable Openness Is King
Take UBI as an example. To extract the maximum value from the data, the data should be open in a defined and controllable way, so that a healthy and diverse ecosystem can form. Specifically, data openness shows up in three areas: equipment, connected-car services, and insurance services.
Equipment. For a UBI operating platform, which manufacturer made the device does not matter; what matters is that the data collected by the devices can enter the platform. The data may be heterogeneous, that is, structured differently, but it should be semantically equivalent: the same field should carry the same meaning and the same value.
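A minimal sketch of what "heterogeneous but semantically equivalent" could mean in practice follows. The vendor names, payload field names, and the common schema (`timestamp_s`, `speed_kmh`) are all hypothetical and chosen only for illustration.

```python
MPH_TO_KMH = 1.609344  # exact conversion factor from miles to kilometres


def normalize(vendor: str, payload: dict) -> dict:
    """Map a vendor-specific payload onto one common schema.

    The two vendor formats below are invented examples: they differ in
    structure and units but describe the same underlying facts.
    """
    if vendor == "vendor_a":
        # Hypothetical format: seconds since epoch, speed in km/h.
        return {
            "timestamp_s": float(payload["ts"]),
            "speed_kmh": float(payload["spd_kmh"]),
        }
    if vendor == "vendor_b":
        # Hypothetical format: milliseconds since epoch, speed in mph.
        return {
            "timestamp_s": payload["time_ms"] / 1000.0,
            "speed_kmh": payload["speed_mph"] * MPH_TO_KMH,
        }
    raise ValueError(f"unknown vendor: {vendor}")
```

Once every device's data passes through a mapping of this kind, the rest of the platform can ignore which manufacturer produced a record.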
Connected-car services. Today's connected-car service providers control the devices, and OBD devices are mostly supplied by the service providers themselves (because they need to sell equipment to make money). With these providers, UBI can take the form of data cooperation and let owners choose for themselves. In addition, the underlying data should be exposed through an open API so that other derivative services can be built on it in the future.
Insurance services. A connected-car service provider can also supply equipment, systems, or services to several insurance companies, and its customers may then choose among those insurers. The UBI operating platform can open up the relevant insurance data, while the insurance companies keep control of cash flow and other financial operations.
Of course, openness is relative and must remain controllable and secure. Making data both open and controllable requires a good combination of operations management and technical measures.
The Big Data Mining Approach
Taking the data-driven sale of insurance as an example, the overall data-processing flow is shown in the following illustration:
For other data-selling schemes, four parts (the driving behavior model, the premium risk model, the auto insurance management system, and the policy and claims data) may be adjusted accordingly; for example, the driving behavior model may become an LBS location model.
In the flowchart above, the italic text marks the computing requirements of each stage. Vehicle monitoring and management has high real-time requirements. The driving behavior model can be computed in batch mode; if some timeliness is required, batch processing can be combined with an event-driven computing model. The insurance risk model belongs to the BI category, and the premium risk model is also a new demand in recent
For a system like this, the past approach was generally: relational database + large memory + bus or message system, possibly with a workflow engine and a rule engine as needed. With Java open-source technology, the database access components, memory components, bus components, and so on are usually parts of one overall framework. The program is packaged and run on a server and, depending on requirements, may also have to deal with table sharding, failover, hot deployment, and so on.
Now that big data technology is widespread, the alternative for this system is: relational database + NoSQL + stream computing + distributed batch computing + BI. Mature technologies now exist for each of these, and they have solved the problems of transparent communication, hot deployment, failover, and system management. (Note: the system composition above does not cover the terminals, such as the user-facing client and the in-car unit. This part of the system has changed as well: in the past it was mainly in-car units and PCs, and now it is mostly mobile phones. The mobile phone adds not just one more display and many new ways of operating, but also many new requirements.)
The relational database is used in the connected-car operations management part, whose business and technology are already relatively mature. It stores vehicle and owner data (including maintenance records).
NoSQL is used to manage the data collected from vehicles and the results of subsequent processing, but each part should use a NoSQL solution suited to its own characteristics:
Driving data is best stored as log files. Vehicle-reported data is usually byte-encoded, for example with ASN.1, and must be decoded before it can be used in computation.
Monitoring and management results are better suited to an in-memory database, which may need to support fast reads and writes of historical data and to persist data to disk (hard disk or SSD) on a timer or after a set volume of writes.
Driving behavior model: the storage must cope with the variety of data types involved (including compound types), and with factor values being recalculated, added, or removed after the calculation parameters change.
Premium risk model: this is either kept compatible with current insurance systems or adapted to the new BI (introduced later).
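The in-memory scheme described for monitoring results in the list above, with persistence triggered by time or by volume, can be sketched as follows. This is a toy illustration, not a real in-memory database: the class name `WriteBehindStore`, its parameters, and the JSON-lines file format are all assumptions.

```python
import json
import time


class WriteBehindStore:
    """Keep the latest status per vehicle in memory; persist in batches.

    Pending records are appended to a JSON-lines file when either
    `max_pending` records have accumulated (quantitative trigger) or
    `max_age_s` seconds have passed since the last flush (timed trigger).
    The timed trigger is only checked when a new record arrives.
    """

    def __init__(self, path, max_pending=100, max_age_s=5.0,
                 clock=time.monotonic):
        self.path = path
        self.max_pending = max_pending
        self.max_age_s = max_age_s
        self.clock = clock
        self.latest = {}      # vehicle_id -> most recent status (fast reads)
        self._pending = []    # records not yet written to disk
        self._last_flush = clock()

    def put(self, vehicle_id, status):
        """Record a new status and flush if a trigger has fired."""
        self.latest[vehicle_id] = status
        self._pending.append({"vehicle_id": vehicle_id, **status})
        too_many = len(self._pending) >= self.max_pending
        too_old = self.clock() - self._last_flush >= self.max_age_s
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Append all pending records to the file and reset the timer."""
        if self._pending:
            with open(self.path, "a", encoding="utf-8") as f:
                for record in self._pending:
                    f.write(json.dumps(record) + "\n")
            self._pending.clear()
        self._last_flush = self.clock()
```

Reads of the current state go to `latest` and never touch the disk; a production system would more likely use an existing in-memory database with its own persistence settings than hand-rolled code like this.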
Stream computing is used to meet the real-time requirements of vehicle monitoring. Different streaming systems focus on different problems. For example, Storm addresses real-time distributed computing: a computation flow can be distributed across one or more machines, servers can be added and removed dynamically, failover is self-managed, the communication mechanism is transparent, and computation flows can be hot-deployed. Esper addresses rules over events and the relationships between events. If the monitoring requirements produce many complex results, an in-memory database becomes necessary.
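To make the idea of a rule over related events concrete, here is a minimal plain-Python sketch of a sliding-window rule. It does not use the Storm or Esper APIs; the class `OverspeedRule`, its thresholds, and the assumption that events arrive in timestamp order per vehicle are all illustrative.

```python
from collections import defaultdict, deque


class OverspeedRule:
    """Flag a vehicle with `threshold` or more overspeed events in a window.

    Assumes events for each vehicle arrive in non-decreasing timestamp
    order; a real streaming engine would also handle late or
    out-of-order events.
    """

    def __init__(self, limit_kmh=120.0, window_s=60.0, threshold=3):
        self.limit_kmh = limit_kmh
        self.window_s = window_s
        self.threshold = threshold
        # vehicle_id -> timestamps of recent overspeed events
        self._events = defaultdict(deque)

    def on_event(self, vehicle_id, timestamp_s, speed_kmh):
        """Process one speed report; return True if the rule fires."""
        window = self._events[vehicle_id]
        if speed_kmh > self.limit_kmh:
            window.append(timestamp_s)
        # Evict overspeed events that have fallen out of the time window.
        while window and timestamp_s - window[0] > self.window_s:
            window.popleft()
        return len(window) >= self.threshold
```

A system such as Storm would distribute many instances of this kind of logic across machines, while an event-processing engine such as Esper would let the rule be declared rather than hand-coded.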
The most popular solution for distributed batch computing is Hadoop. One current hot topic is adapting Hadoop to meet some timeliness requirements rather than doing pure batch processing. The term used is timeliness because the result still does not reach the level of real time.
BI (Business Intelligence). In the current big data environment, the traditional approach based on relational databases shows several deficiencies: 1. traditional solutions focus on the population as a whole (they analyze an overall model and compare individual characteristics against it), so they struggle to serve an individual's "complex/chaotic/divergent" needs at a given moment; 2. when the data volume is very large, traditional solutions may resort to sampling, which makes full-data analysis hard to achieve; 3. much Internet company data and enterprise system data is already stored in NoSQL, which traditional solutions have difficulty matching. BI frameworks that solve these three problems are still immature.
Whichever stage of data processing is involved and whichever technology is used, identifying and controlling data quality is essential. In connected-car systems, data from the vehicle or the OBD device cannot meet a uniform quality standard, because of the variety of vehicle models and the complexity of the devices' working environment. How to handle data of varying usability, and how to treat the value derived from such data, is a key issue to consider.
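One possible way to make varying usability explicit is to attach a quality label to each record before it enters any model. The sketch below is only an assumption about how that might look: the field names, the plausibility range, the three labels, and the weights are hypothetical.

```python
REQUIRED_FIELDS = ("vehicle_id", "timestamp_s", "speed_kmh")
MAX_PLAUSIBLE_SPEED_KMH = 300.0  # assumed sanity bound, not a standard

# Assumed weights a downstream model might apply per quality label.
QUALITY_WEIGHT = {"ok": 1.0, "degraded": 0.5, "rejected": 0.0}


def assess_quality(record: dict) -> str:
    """Classify one decoded record as 'ok', 'degraded', or 'rejected'."""
    # Missing core fields make the record unusable.
    if any(record.get(field) is None for field in REQUIRED_FIELDS):
        return "rejected"
    # Physically implausible values are treated as sensor or decoding errors.
    if not 0.0 <= record["speed_kmh"] <= MAX_PLAUSIBLE_SPEED_KMH:
        return "rejected"
    # Without a position the record is still usable, but for fewer purposes.
    if record.get("lat") is None or record.get("lon") is None:
        return "degraded"
    return "ok"


def quality_weight(record: dict) -> float:
    """Weight with which a record should count in downstream models."""
    return QUALITY_WEIGHT[assess_quality(record)]
```

Keeping the label alongside the record, rather than silently dropping weak data, lets each downstream model decide how much to trust values derived from it.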
(editor: Heritage)