Create a distributed system in 300 lines of code using Mesos, Docker, and Go


Building a distributed system is hard. It needs to be scalable, fault-tolerant, highly available, consistent, and efficient, and achieving all of this requires many complex components working together in intricate ways. For example, when Apache Hadoop processes terabytes of data in parallel across a large cluster, it relies on a highly fault-tolerant file system (HDFS) to achieve high throughput.

Previously, every new distributed system, such as Hadoop or Cassandra, had to build its own underlying infrastructure for message handling, storage, networking, fault tolerance, and scalability. Fortunately, systems like Apache Mesos simplify the task of building and managing distributed systems by providing operating-system-like management services to their key building blocks. Mesos abstracts away CPU, storage, and other compute resources, so developers can treat the entire data center cluster as one giant machine when writing distributed applications.

Applications built on Mesos are called frameworks, and they solve a wide range of problems: Apache Spark, a popular cluster-based data analysis tool, and Chronos, a fault-tolerant, distributed, cron-like scheduler, are two examples of frameworks built on Mesos. Frameworks can be written in several languages, including C++, Go, Python, Java, Haskell, and Scala.

Bitcoin mining is a good example of a distributed-system use case. Bitcoin turns the challenge of producing an acceptable hash into a way of verifying the reliability of a transaction. Mining on a single laptop could take more than 150 years, so many mining pools let miners combine their computing resources to speed things up. Derek, an intern at Mesosphere, wrote a bitcoin mining framework (https://github.com/derekchiang/Mesos-Bitcoin-Miner) that takes advantage of cluster resources to do the same thing. Its code is used as the example below.

A Mesos framework consists of a scheduler and an executor. The scheduler communicates with the Mesos master and decides which tasks to run, while executors run on the slaves to carry out the actual tasks. Most frameworks implement their own scheduler and use one of the standard executors provided by Mesos, although a framework can also provide its own custom executor. In this example, we will write a custom scheduler and use the standard command executor to run Docker images containing our bitcoin services.

The scheduler here needs to run two kinds of tasks: one miner server task and multiple miner worker tasks. The server communicates with a bitcoin mining pool and assigns blocks to each worker, and the workers do the hard work of mining bitcoins.

Tasks are encapsulated in the executor framework, so launching a task tells the Mesos master to start an executor on one of the slaves. Because the standard command executor is used here, a task can be specified as a binary executable, a bash script, or any other command. Since Mesos supports Docker, this example uses executable Docker images. Docker is a technology that lets you package an application together with the dependencies it needs at runtime. To use Docker images in Mesos, they just need to be published to a Docker registry so that Mesos can refer to them by name:

const (
    MinerServerDockerImage = "derekchiang/p2pool"
    MinerDaemonDockerImage = "derekchiang/cpuminer"
)
Next, define constants specifying the resources each task requires:
const (
    MemPerDaemonTask = 128 // mining shouldn't be memory-intensive
    MemPerServerTask = 256
    CPUPerServerTask = 1   // a miner server does not use much CPU
)
Now define the actual scheduler, along with the state it needs to track to run correctly:
type MinerScheduler struct {
    // bitcoind RPC credentials
    bitcoindAddr string
    rpcUser      string
    rpcPass      string

    // mutable state
    minerServerRunning  bool
    minerServerHostname string
    minerServerPort     int // the port that miner daemons connect to

    // unique task ids
    tasksLaunched        int
    currentDaemonTaskIDs []*mesos.TaskID
}
This scheduler must implement the following interface:
type Scheduler interface {
    Registered(SchedulerDriver, *mesos.FrameworkID, *mesos.MasterInfo)
    Reregistered(SchedulerDriver, *mesos.MasterInfo)
    Disconnected(SchedulerDriver)
    ResourceOffers(SchedulerDriver, []*mesos.Offer)
    OfferRescinded(SchedulerDriver, *mesos.OfferID)
    StatusUpdate(SchedulerDriver, *mesos.TaskStatus)
    FrameworkMessage(SchedulerDriver, *mesos.ExecutorID,
        *mesos.SlaveID, string)
    SlaveLost(SchedulerDriver, *mesos.SlaveID)
    ExecutorLost(SchedulerDriver, *mesos.ExecutorID, *mesos.SlaveID,
        int)
    Error(SchedulerDriver, string)
}
Now let's look at the callback functions:
func (s *MinerScheduler) Registered(_ sched.SchedulerDriver,
    frameworkId *mesos.FrameworkID, masterInfo *mesos.MasterInfo) {
    log.Infoln("Framework registered with Master ", masterInfo)
}

func (s *MinerScheduler) Reregistered(_ sched.SchedulerDriver,
    masterInfo *mesos.MasterInfo) {
    log.Infoln("Framework Re-Registered with Master ", masterInfo)
}

func (s *MinerScheduler) Disconnected(sched.SchedulerDriver) {
    log.Infoln("Framework disconnected with Master")
}
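In the same spirit, the remaining callbacks required by the interface can be stubbed out as no-ops for a framework this simple. A minimal sketch (the signatures follow the interface shown above; the empty bodies are ours):

// No-op implementations of the remaining callbacks; a simple
// framework like this one has nothing to do in them.
func (s *MinerScheduler) OfferRescinded(_ sched.SchedulerDriver,
    offerID *mesos.OfferID) {}

func (s *MinerScheduler) FrameworkMessage(_ sched.SchedulerDriver,
    executorID *mesos.ExecutorID, slaveID *mesos.SlaveID, message string) {}

func (s *MinerScheduler) SlaveLost(_ sched.SchedulerDriver,
    slaveID *mesos.SlaveID) {}

func (s *MinerScheduler) ExecutorLost(_ sched.SchedulerDriver,
    executorID *mesos.ExecutorID, slaveID *mesos.SlaveID, status int) {}

func (s *MinerScheduler) Error(_ sched.SchedulerDriver, err string) {
    log.Errorln("Framework error: ", err)
}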
Registered is called after the scheduler successfully registers with the Mesos master. Reregistered is called when the scheduler disconnects from the Mesos master and then registers again, for example when the master restarts. Disconnected is called when the scheduler loses its connection to the Mesos master, which happens when the master goes down.

So far the callbacks only print log messages, because for a simple framework like this one most of them can effectively be left empty. The next callback, however, is the core of every framework and must be written with care.

ResourceOffers is called when the scheduler receives an offer from the master. Each offer contains a list of resources in the cluster that the framework can use. Resources typically include CPU, memory, ports, and disk, and a framework may accept some, all, or none of the resources it is offered.

For each offer, we gather the resources being offered and decide whether to launch a new server task or a new worker task. One could send as many tasks as possible in response to each offer to probe its maximum capacity, but since bitcoin mining is CPU-bound, each offer instead runs a single miner task that uses all of the available CPU resources.
for i, offer := range offers {
    // … Gather resources being offered and do setup
    if !s.minerServerRunning && mems >= MemPerServerTask &&
            cpus >= CPUPerServerTask && ports >= 2 {
        // … Launch a server task since no server is running and we
        // have resources to launch it.
    } else if s.minerServerRunning && mems >= MemPerDaemonTask {
        // … Launch a miner since a server is running and we have mem
        // to launch one.
    }
}
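The elided part at the top of the loop tallies what each offer contains. A minimal sketch of that bookkeeping, assuming the standard mesos-go protobuf getters; the variable names match the conditions above:

// Sketch: tally the scalar and range resources in an offer.
var cpus, mems float64
var ports uint64
for _, resource := range offer.GetResources() {
    switch resource.GetName() {
    case "cpus":
        cpus += resource.GetScalar().GetValue()
    case "mem":
        mems += resource.GetScalar().GetValue()
    case "ports":
        // count how many ports this range resource offers
        for _, r := range resource.GetRanges().GetRange() {
            ports += r.GetEnd() - r.GetBegin() + 1
        }
    }
}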
For each task to launch, create a corresponding TaskInfo message containing the information needed to run it.
s.tasksLaunched++
taskID = &mesos.TaskID{
    Value: proto.String("miner-server-" +
        strconv.Itoa(s.tasksLaunched)),
}
Task IDs are chosen by the framework, and each one must be unique within that framework.
containerType := mesos.ContainerInfo_DOCKER
task = &mesos.TaskInfo{
    Name:    proto.String("task-" + taskID.GetValue()),
    TaskId:  taskID,
    SlaveId: offer.SlaveId,
    Container: &mesos.ContainerInfo{
        Type: &containerType,
        Docker: &mesos.ContainerInfo_DockerInfo{
            Image: proto.String(MinerServerDockerImage),
        },
    },
    Command: &mesos.CommandInfo{
        Shell: proto.Bool(false),
        Arguments: []string{
            // these arguments will be passed to run_p2pool.py
            "--bitcoind-address", s.bitcoindAddr,
            "--p2pool-port", strconv.Itoa(int(p2poolPort)),
            "-w", strconv.Itoa(int(workerPort)),
            s.rpcUser, s.rpcPass,
        },
    },
    Resources: []*mesos.Resource{
        util.NewScalarResource("cpus", CPUPerServerTask),
        util.NewScalarResource("mem", MemPerServerTask),
    },
}
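The worker task's TaskInfo is built the same way, pointing at the daemon image instead. The sketch below is our assumption about how a worker might be wired up: the -o flag and stratum URL format are guesses at what the derekchiang/cpuminer image accepts, and cpus is the value gathered from the offer earlier. The task ID is also recorded so the server-failure handler shown later can kill the workers:

s.tasksLaunched++
taskID = &mesos.TaskID{
    Value: proto.String("miner-daemon-" +
        strconv.Itoa(s.tasksLaunched)),
}
// remember the worker so it can be killed if the server dies
s.currentDaemonTaskIDs = append(s.currentDaemonTaskIDs, taskID)
task = &mesos.TaskInfo{
    Name:    proto.String("task-" + taskID.GetValue()),
    TaskId:  taskID,
    SlaveId: offer.SlaveId,
    Container: &mesos.ContainerInfo{
        Type: &containerType,
        Docker: &mesos.ContainerInfo_DockerInfo{
            Image: proto.String(MinerDaemonDockerImage),
        },
    },
    Command: &mesos.CommandInfo{
        Shell: proto.Bool(false),
        // the exact cpuminer flags are an assumption; the point is
        // that the worker connects to the miner server's port
        Arguments: []string{
            "-o", "stratum+tcp://" + s.minerServerHostname + ":" +
                strconv.Itoa(s.minerServerPort),
        },
    },
    Resources: []*mesos.Resource{
        // a worker takes all the CPUs the offer contains
        util.NewScalarResource("cpus", cpus),
        util.NewScalarResource("mem", MemPerDaemonTask),
    },
}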
The TaskInfo message specifies some important metadata about the task that allows the Mesos node to run the Docker container: in particular, the task name, the task ID, the container information, and the arguments to pass to the container. The resources the task requires are also specified here. Now that the TaskInfo has been built, the task can be launched like this:

driver.LaunchTasks([]*mesos.OfferID{offer.Id}, tasks,
    &mesos.Filters{RefuseSeconds: proto.Float64(1)})

The last thing the framework has to handle is what happens when the miner server goes down, which can be done in the StatusUpdate callback. There are different types of status updates for the different stages of a task's lifecycle. For this framework, we want to make sure that if the miner server fails for any reason, all miner workers are killed so that no resources are wasted. Here is the relevant code:
if strings.Contains(status.GetTaskId().GetValue(), "server") &&
    (status.GetState() == mesos.TaskState_TASK_LOST ||
        status.GetState() == mesos.TaskState_TASK_KILLED ||
        status.GetState() == mesos.TaskState_TASK_FINISHED ||
        status.GetState() == mesos.TaskState_TASK_ERROR ||
        status.GetState() == mesos.TaskState_TASK_FAILED) {

    s.minerServerRunning = false

    // kill all tasks
    for _, taskID := range s.currentDaemonTaskIDs {
        _, err := driver.KillTask(taskID)
        if err != nil {
            log.Errorf("Failed to kill task %s", taskID)
        }
    }
    s.currentDaemonTaskIDs = make([]*mesos.TaskID, 0)
}
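Finally, to tie it all together, the scheduler has to be handed to a scheduler driver and run against the master. A minimal sketch, assuming the older github.com/mesos/mesos-go bindings (the DriverConfig fields follow that API; the framework name and the flag-derived values are placeholders):

// Sketch: wiring up and running the scheduler; master and the
// bitcoind credentials would come from command-line flags.
driver, err := sched.NewMesosSchedulerDriver(sched.DriverConfig{
    Master: master, // e.g. "10.0.0.1:5050"
    Framework: &mesos.FrameworkInfo{
        Name: proto.String("bitcoin-miner"),
        User: proto.String(""), // let Mesos pick the current user
    },
    Scheduler: &MinerScheduler{
        bitcoindAddr: bitcoindAddr,
        rpcUser:      rpcUser,
        rpcPass:      rpcPass,
    },
})
if err != nil {
    log.Fatalf("Failed to create scheduler driver: %v", err)
}
if stat, err := driver.Run(); err != nil {
    log.Errorf("Framework stopped with status %s: %v", stat, err)
}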
And that's it! We have built a working distributed bitcoin mining framework on Apache Mesos in only about 300 lines of Go code, which shows how quick and easy writing distributed systems can be with the Mesos framework API.
