Building efficient R&D and automated operations and maintenance

Source: Internet
Author: User
Tags: solr, radar

Why do IT operations need to be automated?

IT operations automation means converting the large volume of repetitive work in day-to-day IT operations from manual execution to automated execution. That work ranges from small tasks such as routine inspections, configuration changes, and software installation to large ones such as organizing and scheduling an entire change process. The goal is to reduce or eliminate operational delay and approach "zero-delay" IT operations. Put simply, IT operations automation is a process-based framework that ties events to IT processes: when a performance threshold is exceeded or a system goes down, the related event fires and a predefined process automatically starts the failure response and recovery mechanism. An automation platform also takes over daily repetitive tasks (such as backups and antivirus scans) and so improves operational efficiency. IT operations automation should also be able to predict failures and raise an alarm before they occur, so that operations staff can remove the cause in advance and keep losses to a minimum.
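The event-to-process idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Event` and `AutomationEngine` names, the `cpu_percent` metric, and the `restart_service` placeholder are hypothetical and do not come from the original system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Event:
    """A monitoring sample reported for one host (hypothetical shape)."""
    host: str
    metric: str
    value: float


class AutomationEngine:
    """Ties a metric threshold to a predefined recovery process."""

    def __init__(self) -> None:
        self._rules: Dict[str, Tuple[float, Callable[[Event], str]]] = {}
        self.audit_log: List[str] = []

    def register(self, metric: str, threshold: float,
                 process: Callable[[Event], str]) -> None:
        # One rule per metric keeps the sketch small.
        self._rules[metric] = (threshold, process)

    def handle(self, event: Event) -> bool:
        """Run the predefined process if the threshold is exceeded."""
        rule = self._rules.get(event.metric)
        if rule is None:
            return False
        threshold, process = rule
        if event.value <= threshold:
            return False
        # Record what the process did so the action can be audited later.
        self.audit_log.append(process(event))
        return True


def restart_service(event: Event) -> str:
    # Placeholder for a real recovery step (assumption: nothing is restarted here).
    return f"restart requested on {event.host} ({event.metric}={event.value})"
```

A caller would register rules once, for example `engine.register("cpu_percent", 90.0, restart_service)`, and then pass every incoming sample to `engine.handle(...)`.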

Operations should cover the following dimensions:

    • Environment definition: development, test, pre-production (production-like), and production environments.
    • Deployment: the ability to deploy packages efficiently to different environments.
    • Monitoring: the ability to monitor systems and applications after deployment.
    • Alarms: the response and handling mechanism when a problem occurs.
    • Performance optimization: tuning of system services such as Nginx, Java, PHP, the database, and the network.
    • SLA guarantees: usually agreed in discussion with the relevant business departments.

Service governance, task scheduling, cluster collaboration, call chain analysis, interface quality, SQL quality, real-time logging, and more.

Packaging, automated testing, testing, grayscale publishing, partitioned rollout, operations automation, configuration standardization, command standardization, etc.

Infrastructure such as the distributed framework, storage and cache middleware, automated testing, cloud search, the open platform, the marketing platform, etc.

Self-built technology infrastructure (open source + in-house development)

    • Automated publishing system: grayscale publishing, partition publishing
    • Operations configuration automation system: automatic discovery by the operations system, standardized configuration
    • Atomic command system: supports hundreds of servers and hundreds of atomic script operations
    • Search platform: supports hundreds of indexes and hundreds of millions of records
    • Recommendation computing platform: supports computation over hundreds of millions of user records
    • API automated test system and mock simulation test system: support automated interface tests, simulation tests, and automated web tests
    • API governance system and SQL governance system: curb unreasonable calls between systems
    • Real-time log system: supports real-time logs and offline tracking for Nginx, Tomcat, and BI
    • Distributed development framework: unified distributed communication
    • Configuration distribution system: supports configuration items and cluster service discovery
    • MQ distributed message middleware (push mode: IDP; pull mode: Kafka): 15 million messages Monday to Friday, 6 million on Saturday
    • KV distributed cache middleware (Memcached, Redis, Tair): billion-scale data cache, 95% hit ratio
    • LPFS distributed file middleware (MongoDB): MongoDB storage for pictures and files
    • DB sharding middleware (MySQL), splitting databases and tables: unlimited data volume expansion
    • Distributed task scheduling middleware (Schedule): supports 100+ services and 200+ distributed scheduled tasks
    • Push unified message push platform: 1 million+ pushes per day to Android, iOS, email, SMS, and Comet

Open source technology stack we rely on

    • Languages: Java (Tomcat/Spring), Shell (ops), Node.js (front end), Android, iOS
    • Distributed: ActiveMQ, Kafka, ZooKeeper, Router (service discovery), CAT
    • Storage: MySQL, MongoDB, Tair, Memcached, Redis
    • Computation: Solr, Elasticsearch, Hadoop, HBase, Storm, Spark
    • Operations: Linux, Nginx, Puppet, Zabbix, OpenStack
    • Project management: Eclipse, Git, Maven (build), Hudson (continuous integration), Confluence (knowledge sharing), DMS (project management)

Development phase (Code/Build)
    • Development frameworks
        - Web development framework: Swift
        - Node.js front-end development framework
        - iOS mobile development framework
        - Android development framework
        - Shell script automation
    • Distributed middleware
        - Distributed calls: RPC
        - Real-time push: Comet
        - Push message queue: IDP
        - Pull message queue: Kafka
        - Configuration system: ZooKeeper
        - Scheduling system: Scheduler
    • Storage middleware
        - Relational storage: MySQL
        - File storage: MongoDB
        - KV storage: Tair
        - Second-level cache: Redis
        - First-level cache: Memcached
    • Compute platform
        - Cloud search
        - Recommendation
        - Big data computation
        - Page parsing
        - Text parsing
        - Word preview

Test phase (Test/CI)
        - API automated testing
        - API simulation testing: mock
        - Web automated testing: Selenium
        - Testing: wxtest
        - Open testing: katest
        - Test environment release

Online phase (Release/Deploy)
        - Publishing system
        - Operations and maintenance system
        - Code inspection: Builder

Operations phase
    • Operations and maintenance system (Monitor)
        - Automation system
        - Monitoring system: Zabbix
        - Radar log system
        - puppet/mco

    • Service governance (Service)
        - API governance system: Apiwater
        - SQL governance system: Monyogsql
        - Router service center
        - Configuration distribution system
        - Scheduling system: Scheduler
        - Call chain system: CAT

Operation phase
    • Open platform
        - WeChat platform: Weixin
        - Weibo platform: Weibo
        - Telephone platform: Jiya
        - Payment platform: Pay
        - Open platform: API
        - SEO platform: Resource
    • Operating platform (Channel)
        - Push platform: Push
        - SMS platform: Push
        - Email platform: Mail
        - Platform: Open
        - Private message platform: Messagecode

1. Distributed service architecture

Service discovery, communication, and control

Distributed registration center (Router):
    • Synchronous calls: RPC
        - Service protocol: HTTP with heartbeat detection
        - Service discovery: cluster information kept in one unified file, router.conf
        - Load balancing
    • Asynchronous calls: MQ
        - Push mode: fast to develop, stable, low latency
        - Pull mode: traceable, log collection, data synchronization
    • Distributed task scheduling
        - Schedule dispatch system
    • Distributed transaction control
        - Swift development framework: transaction-based consistency
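To make the registration center concrete, here is a minimal sketch of file-based service discovery combined with heartbeat-aware round-robin load balancing. The article does not show the contents of router.conf, so the `service = host:port, host:port` line format is an assumption, as are the `parse_router_conf` and `Router` names.

```python
import itertools
from typing import Dict, List, Set


def parse_router_conf(text: str) -> Dict[str, List[str]]:
    """Parse lines of the assumed form 'service = host:port, host:port'."""
    services: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, _, nodes = line.partition("=")
        services[name.strip()] = [n.strip() for n in nodes.split(",") if n.strip()]
    return services


class Router:
    """Round-robin selection that skips nodes whose heartbeat failed."""

    def __init__(self, services: Dict[str, List[str]]) -> None:
        self._services = services
        self._down: Set[str] = set()
        self._cursors = {name: itertools.cycle(nodes)
                         for name, nodes in services.items()}

    def mark_down(self, node: str) -> None:
        # Would be called by a heartbeat checker when a node stops answering.
        self._down.add(node)

    def mark_up(self, node: str) -> None:
        self._down.discard(node)

    def pick(self, service: str) -> str:
        nodes = self._services[service]
        cursor = self._cursors[service]
        # Try each node at most once per call.
        for _ in range(len(nodes)):
            node = next(cursor)
            if node not in self._down:
                return node
        raise RuntimeError(f"no healthy node for service {service!r}")
```

An RPC client would call `router.pick("order")` before each request and send the HTTP call to the returned address; a separate heartbeat loop would drive `mark_down` and `mark_up`.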

2. Automation systems built by the operations team

Three levels of operations standardization

    • 2.1 Hardware standardization:
        - Machine standardization: machine room, rack position, switch, machine
        - Resource standardization: IP, DNS
        - Configuration standardization: automatic collection of machine configuration, standardized inspection, KVM
    • 2.2 Software standardization:
        - Standardized software installation: Tomcat, JDK, Memcached, Redis ...
        - Nginx standardization: domain names, configuration, release
    • 2.3 Project standardization:
        - Project configuration standardization: s, zone A, zone B, zone C
        - Support for multiple project types: Tomcat, Java, Node.js, Python, iOS/Android

2.1 Hardware standardization: automated collection

2.2 Software standardization: unified software specifications

2.2 Software standardization: automatic, safe installation and removal

2.2 Software standardization: automatic service management

2.2 Nginx standardization: automatic configuration of 300 domain names
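Configuring hundreds of domain names by hand is error-prone, so Nginx standardization usually comes down to rendering every site from one template. The sketch below shows that idea only; the template contents, the `render_site` and `render_all` names, and the example domain are assumptions, not the article's actual tooling.

```python
from typing import Dict, List

# One standard template for every domain; doubled braces are literal braces.
SERVER_TEMPLATE = """\
upstream {name}_backend {{
{servers}
}}

server {{
    listen 80;
    server_name {domain};

    location / {{
        proxy_pass http://{name}_backend;
    }}
}}
"""


def render_site(domain: str, backends: List[str]) -> str:
    """Render the Nginx configuration for one domain."""
    # Upstream names may not contain dots, so derive a safe identifier.
    name = domain.replace(".", "_").replace("-", "_")
    servers = "\n".join(f"    server {backend};" for backend in backends)
    return SERVER_TEMPLATE.format(name=name, servers=servers, domain=domain)


def render_all(sites: Dict[str, List[str]]) -> Dict[str, str]:
    """Return a mapping of config file name to rendered content."""
    return {f"{domain}.conf": render_site(domain, backends)
            for domain, backends in sorted(sites.items())}
```

A release step could write each returned entry into the Nginx configuration directory, validate with `nginx -t`, and reload only when validation passes.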

3. Project release automation system

    • 3.1 Code release system
        - Grayscale publishing
        - Partition release: swimlane publishing
    • 3.2 Configuration publishing system
        - Publishing configuration information
        - Cluster collaboration: Solr, Kafka
    • 3.3 Atomic instructions
        - System-level operations
        - System operation log
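A partition (grayscale) release can be pictured as deploying one zone at a time and stopping as soon as a zone fails its health check, so the remaining zones keep running the old version. The following is a minimal sketch of that control flow; the `rollout` function and the `deploy` and `healthy` callbacks are hypothetical stand-ins for the real publishing system.

```python
from typing import Callable, List, Optional, Tuple


def rollout(zones: List[str],
            deploy: Callable[[str], None],
            healthy: Callable[[str], bool]) -> Tuple[List[str], Optional[str]]:
    """Deploy zone by zone; halt at the first zone that is unhealthy.

    Returns the zones released successfully and the zone that failed
    (None when every zone passed).
    """
    released: List[str] = []
    for zone in zones:
        deploy(zone)
        if not healthy(zone):
            # Stop here: later zones are never touched by the new version.
            return released, zone
        released.append(zone)
    return released, None
```

With zones ordered from least to most traffic, a bad build is caught while it affects only the first partition.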

4. Service management system

    • Service health status detection
    • Distributed task scheduling (Schedule)
    • Call chain analysis (CAT)
    • Real-time log monitoring (radar system)
    • API quality governance (Apiwater)
    • SQL quality governance (MONyog)

4.1 Service health status detection

4.2 Distributed task scheduling (Schedule)

Distributed dispatch center:
    • Distributed coordination based on Mina
    • For each service, a single node is selected to dispatch
    • Failover across a service's multiple nodes
    • Long-running tasks resume from their breakpoint
    • Dependency-aware task scheduling
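Dependency-aware scheduling and breakpoint resumption can both be illustrated with a topological sort. This sketch uses the standard-library `graphlib` module (Python 3.9+); the task names and the `schedule_order` and `run_pending` helpers are hypothetical, and the real Schedule system coordinates nodes over the network rather than in one process.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Callable, Dict, List, Set


def schedule_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every task runs after its prerequisites.

    `dependencies` maps a task to the set of tasks it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


def run_pending(dependencies: Dict[str, Set[str]],
                completed: Set[str],
                run: Callable[[str], None]) -> None:
    """Run tasks in dependency order, skipping those already completed.

    Persisting `completed` between runs is what lets a long job resume
    from its breakpoint instead of starting over.
    """
    for task in schedule_order(dependencies):
        if task in completed:
            continue
        run(task)
        completed.add(task)
```

For example, with `{"report": {"aggregate"}, "aggregate": {"import_orders", "import_users"}}`, both imports run before `aggregate`, and `report` runs last.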

4.3 Call chain analysis (CAT)

4.4 Real-time log monitoring (radar system)

    • Real-time log viewing
    • Historical log analysis
    • Tracking by user or IP
    • Log statistics
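Log statistics and IP tracking of the kind listed above reduce to parsing each access-log line and aggregating the fields. The sketch below assumes Nginx's default "combined" log format; the `summarize` and `track_ip` helpers are hypothetical and say nothing about how the radar system itself is built.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Optional

# Matches the leading fields of Nginx's default "combined" log format.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Return the parsed fields, or None for a line that does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def summarize(lines: Iterable[str]) -> Dict[str, object]:
    """Count requests per client IP and per status code."""
    by_ip: Counter = Counter()
    by_status: Counter = Counter()
    skipped = 0
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            skipped += 1
            continue
        by_ip[entry["ip"]] += 1
        by_status[entry["status"]] += 1
    return {"by_ip": by_ip, "by_status": by_status, "skipped": skipped}


def track_ip(lines: Iterable[str], ip: str) -> List[str]:
    """List, in order, the paths requested by one client IP."""
    entries = (parse_line(line) for line in lines)
    return [e["path"] for e in entries if e is not None and e["ip"] == ip]
```

Feeding the same functions from a tailing reader instead of a finished file would give the real-time view; running them over archived files gives the historical one.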


4.6 SQL quality governance (MONyog)
    • MONyog, a MySQL performance monitoring tool, parses slow SQL
    • The application logs slow SQL statements
    • Optimize indexes and table structure
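Having the application log its own slow SQL can be as simple as timing each statement and recording those that exceed a threshold. In this sketch `sqlite3` stands in for MySQL so the example runs without a server, and the `SlowQueryLogger` name and its threshold parameter are assumptions rather than anything from the original system.

```python
import sqlite3
import time
from typing import Any, List, Sequence, Tuple


class SlowQueryLogger:
    """Wraps a DB-API connection and records statements that run too long."""

    def __init__(self, conn: sqlite3.Connection, threshold_seconds: float) -> None:
        self._conn = conn
        self._threshold = threshold_seconds
        self.slow_queries: List[Tuple[str, float]] = []

    def execute(self, sql: str, params: Sequence[Any] = ()) -> List[tuple]:
        start = time.perf_counter()
        rows = self._conn.execute(sql, params).fetchall()
        elapsed = time.perf_counter() - start
        if elapsed >= self._threshold:
            # Keep the SQL text and its duration for later index tuning.
            self.slow_queries.append((sql, elapsed))
        return rows
```

The recorded statements are the natural input for the last step in the list: checking each one with `EXPLAIN` and adjusting indexes or table structure.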

5. Automated construction of the test environment

6. Automated testing

Automated testing: API automated testing

Automated testing: web automated testing
    • Selenium: automated testing of web pages

Automated testing: mock simulation testing
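Mock simulation testing replaces a dependency that is slow or unavailable with a stand-in that returns canned responses. As a small illustration using the standard-library `unittest.mock`, the function under test, the `/orders/{id}` path, and the response shape below are all hypothetical.

```python
from unittest.mock import Mock


def fetch_order_total(api_client, order_id: int) -> float:
    """Sum an order's line items using a remote API client."""
    response = api_client.get(f"/orders/{order_id}")
    if response["status"] != "ok":
        raise RuntimeError(f"order {order_id} could not be loaded")
    return sum(item["price"] * item["quantity"] for item in response["items"])


def test_fetch_order_total() -> None:
    # The mock simulates the remote API, so no network call is made.
    client = Mock()
    client.get.return_value = {
        "status": "ok",
        "items": [
            {"price": 2.5, "quantity": 2},
            {"price": 1.0, "quantity": 3},
        ],
    }
    assert fetch_order_total(client, 42) == 8.0
    # Verify the code under test called the API exactly as expected.
    client.get.assert_called_once_with("/orders/42")
```

The same pattern lets interface tests run before the downstream service exists, which is the main appeal of a mock system in a continuous integration pipeline.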

Petter Liu
Source: http://www.cnblogs.com/wintersun/
Copyright of this article belongs to the author and cnblogs (Blog Park). Reprinting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original must be placed in a prominent position on the article page; otherwise the author reserves the right to pursue legal liability.
This article was also published on my independent blog, Petter Liu Blog.
