Deep Learning Solutions on Hadoop 2.0

Original link: https://www.paypal-engineering.com/tag/data-science/

Abstract: With the explosive growth of data and clusters of thousands of machines, we need to adapt our algorithms to run in such distributed environments. Running machine learning algorithms in a general-purpose distributed computing environment poses a number of challenges. This article explores how to implement and deploy deep learning on a Hadoop cluster.

The data science team in Boston is leveraging cutting-edge tools and algorithms to optimize business activity, based on deep insights into user data. Data science makes extensive use of machine learning algorithms, which help us identify and exploit patterns in the data. Obtaining insights from large-scale Internet data is a challenging task, so the ability to run algorithms at scale is a critical requirement. With the explosive growth of data and clusters of thousands of machines, we need to adapt our algorithms to run in such distributed environments. Running machine learning algorithms in a general-purpose distributed computing environment poses a number of challenges.

Here, we explore how to implement and deploy deep learning (a cutting-edge machine learning framework) on a Hadoop cluster. We provide specific details on how the algorithm was adapted to run in a distributed environment, and we present results from running it on a standard benchmark dataset.

Deep Belief Networks

Deep Belief Networks (DBNs) [1] are graphical models obtained by stacking and training Restricted Boltzmann Machines (RBMs) [5] in a greedy, unsupervised manner. DBNs are trained to extract a deep representation of the training data by modeling the joint distribution between the observed vector x and the ℓ hidden layers h^k, as shown in Expression 1 below.

Expression 1: the DBN joint distribution

    P(x, h^1, \ldots, h^\ell) = \left( \prod_{k=0}^{\ell-2} P(h^k \mid h^{k+1}) \right) P(h^{\ell-1}, h^\ell)

where x = h^0, P(h^k | h^{k+1}) is the conditional distribution of the visible units given the hidden units of the RBM at level k, and P(h^{\ell-1}, h^\ell) is the visible-hidden joint distribution of the top-level RBM.

In Figure 1, the relationship between the input layer and the hidden layers can be observed. At a high level, the first layer is trained as an RBM that models the raw input x. The input data is a sparse binary vector, for example a binarized image of a digit. Subsequent layers are trained using the data (samples or activations) passed up from the previous layer as training examples. The number of layers can be determined empirically to obtain better model performance, and DBNs support an arbitrary number of layers.

Figure 1: DBN hierarchy

The following code snippet shows the training of a single RBM. The RBM is trained on the input data for a predefined number of epochs. The input is divided into mini-batches, and the weights, activations, and deltas are computed for each layer.
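The original snippet was published as an image and is not reproduced here. The following is a minimal, self-contained sketch of what such a training loop can look like, assuming CD-1 (one-step contrastive divergence) updates and collapsing mini-batch handling to per-example updates for brevity; class and method names are illustrative, not the actual PayPal code.

    import java.util.Random;

    // Sketch of CD-1 training for one RBM layer (biases omitted for brevity).
    public class RbmSketch {
        final int nVisible, nHidden;
        final double[][] w;                 // weight matrix, nVisible x nHidden
        final Random rng = new Random(42);

        RbmSketch(int nVisible, int nHidden) {
            this.nVisible = nVisible;
            this.nHidden = nHidden;
            w = new double[nVisible][nHidden];
            for (double[] row : w)
                for (int j = 0; j < nHidden; j++)
                    row[j] = 0.01 * rng.nextGaussian();  // small random init
        }

        static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

        // P(h_j = 1 | v): hidden activations given a visible vector.
        double[] hiddenProbs(double[] v) {
            double[] h = new double[nHidden];
            for (int j = 0; j < nHidden; j++) {
                double sum = 0;
                for (int i = 0; i < nVisible; i++) sum += v[i] * w[i][j];
                h[j] = sigmoid(sum);
            }
            return h;
        }

        // P(v_i = 1 | h): visible reconstruction given hidden activations.
        double[] visibleProbs(double[] h) {
            double[] v = new double[nVisible];
            for (int i = 0; i < nVisible; i++) {
                double sum = 0;
                for (int j = 0; j < nHidden; j++) sum += h[j] * w[i][j];
                v[i] = sigmoid(sum);
            }
            return v;
        }

        // One CD-1 update on a single example: positive phase, reconstruction,
        // negative phase, then the weight delta.
        void cd1(double[] v0, double lr) {
            double[] h0 = hiddenProbs(v0);
            double[] v1 = visibleProbs(h0);
            double[] h1 = hiddenProbs(v1);
            for (int i = 0; i < nVisible; i++)
                for (int j = 0; j < nHidden; j++)
                    w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j]);
        }

        // Train for a fixed number of epochs; each epoch is one pass over the data.
        void train(double[][] data, int epochs, double lr) {
            for (int e = 0; e < epochs; e++)
                for (double[] example : data) cd1(example, lr);
        }
    }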

After all the layers are trained, the parameters of the deep network are fine-tuned using a supervised training criterion. The supervised criterion can, for example, be framed as a classification problem, which then allows the deep network to be used for classification tasks. More elaborate supervised criteria can be used to produce interesting results such as scene interpretation, for example explaining what is shown in a picture.

Infrastructure

Deep learning has received widespread attention not only because it can deliver better results than some other learning algorithms, but also because it can run on distributed devices, allowing large-scale datasets to be processed. Deep networks can be parallelized at two levels: at the layer level and at the data level [6]. For layer-level parallelism, many implementations use GPU arrays to compute layer activations in parallel and synchronize them frequently. However, this approach is not suitable for clusters where the data resides on multiple machines connected by a network, because of the high network overhead. For data-level parallelism, training is parallelized over subsets of the dataset, which is better suited to distributed systems.

Most of PayPal's data is stored on Hadoop clusters, so being able to run our algorithms on those clusters is a top priority. Maintenance and support of dedicated clusters is also an important factor for us to consider. However, since deep learning is inherently iterative, a paradigm such as classic MapReduce is not well suited to running these algorithms. With the advent of Hadoop 2.0 and YARN-based resource management, though, we can write iterative programs while finely controlling the resources the program uses. We used IterativeReduce [7], an application for writing iterative algorithms on Hadoop YARN, and we were able to deploy it on a PayPal cluster running Hadoop 2.4.1.

Method

We implemented the core algorithm of Hinton et al., cited in [2]. Since our requirement is to distribute the algorithm across clusters of many machines, we adapted their algorithm to such an environment. To distribute the algorithm over multiple machines, we followed the guidelines proposed by Grazia et al. [6]. Here is a detailed summary of our implementation; a sketch of the resulting training loop follows the list:

    1. The master node initializes the weights of the RBM.
    2. The master node pushes the weights and the data splits to the worker nodes.
    3. Each worker node trains an RBM layer for one dataset epoch, i.e., it sends the updated weights to the master once it has made a complete pass over its data split.
    4. The master node averages the weights received from all the worker nodes for the given epoch.
    5. Steps 3–5 are repeated for a predefined number of epochs (in our case, 50).
    6. After step 5 is done, one layer is trained. These steps are repeated for the subsequent RBM layers.
    7. After all the layers are trained, the deep network is fine-tuned using error back-propagation.
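The original post illustrates this loop with a code snippet that is not reproduced here. Below is a minimal sketch, under the assumption that each worker exposes a single train-one-epoch call; the Worker interface and all names are illustrative stand-ins, not the actual implementation.

    import java.util.ArrayList;
    import java.util.List;

    // Illustrative sketch of the epoch-level parameter averaging in steps 3-5.
    public class ParameterAveragingSketch {

        // Hypothetical stand-in for a remote worker node.
        interface Worker {
            // Train one epoch over this worker's data split, starting from the
            // given weights, and return the locally updated weights.
            double[][] trainOneEpoch(double[][] currentWeights);
        }

        // Element-wise average of the weight matrices reported by all workers.
        static double[][] average(List<double[][]> workerWeights) {
            int rows = workerWeights.get(0).length;
            int cols = workerWeights.get(0)[0].length;
            double[][] avg = new double[rows][cols];
            for (double[][] w : workerWeights)
                for (int i = 0; i < rows; i++)
                    for (int j = 0; j < cols; j++)
                        avg[i][j] += w[i][j] / workerWeights.size();
            return avg;
        }

        // Master loop: push weights, let every worker pass over its split,
        // then average; repeat for the predefined number of epochs (50 here).
        static double[][] trainLayer(double[][] weights, List<Worker> workers, int epochs) {
            for (int e = 0; e < epochs; e++) {
                List<double[][]> updates = new ArrayList<>();
                for (Worker w : workers)
                    updates.add(w.trainOneEpoch(weights));
                weights = average(updates);
            }
            return weights;
        }
    }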

Figure 2 depicts a single dataset epoch (steps 3–5) of the deep learning algorithm in operation. We note that this paradigm can be leveraged to implement a host of machine learning algorithms that are iterative in nature.

Figure 2: A single dataset epoch of training

The following code snippet shows the steps involved in training a DBN on a single machine. The dataset is first split into multiple batches. The RBM layers are then initialized and trained in sequence. After the RBMs are trained, they go through a fine-tuning phase that uses error back-propagation.
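That snippet is also missing from this copy. The sketch below illustrates the described flow, reusing the hypothetical RbmSketch class from the earlier example; the fine-tuning phase is only indicated by a comment.

    // Illustrative sketch of single-machine DBN training: greedy layer-wise
    // pre-training followed by supervised fine-tuning. Not the actual code.
    public class DbnTrainingSketch {

        void train(double[][] data, int[] layerSizes, int epochs, double lr) {
            double[][] layerInput = data;
            for (int size : layerSizes) {
                // Each RBM models the activations produced by the layer below it.
                RbmSketch rbm = new RbmSketch(layerInput[0].length, size);
                rbm.train(layerInput, epochs, lr);
                layerInput = propagate(rbm, layerInput);
            }
            // A fine-tuning phase using error back-propagation over the whole
            // stack would follow here (omitted for brevity).
        }

        // Feed every example through the trained RBM to produce the training
        // input for the next layer.
        double[][] propagate(RbmSketch rbm, double[][] input) {
            double[][] out = new double[input.length][];
            for (int n = 0; n < input.length; n++)
                out[n] = rbm.hiddenProbs(input[n]);
            return out;
        }
    }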

We made extensive use of the IterativeReduce [7] implementation for the YARN pipeline, and we modified it significantly so it could be leveraged for our deep learning implementation. The IterativeReduce implementation was written for the Cloudera Hadoop distribution; we re-platformed it to fit the standard Apache Hadoop distribution and rewrote the implementation to use the standard programming model described in [8]. In particular, we use the YarnClient API for communication between the client application and the ResourceManager, and we use AMRMClient and NMClient for communication between the ApplicationMaster, the ResourceManager, and the NodeManager.

We first use the YarnClient API to submit the application to the YARN ResourceManager:
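The submission snippet is not reproduced here. The sketch below uses the standard Hadoop 2.x YarnClient API; the application name, the ApplicationMaster class, and the resource sizes are illustrative assumptions, not the actual values.

    import java.util.Collections;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.ApplicationConstants;
    import org.apache.hadoop.yarn.api.records.ApplicationId;
    import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.YarnClient;
    import org.apache.hadoop.yarn.client.api.YarnClientApplication;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;
    import org.apache.hadoop.yarn.util.Records;

    // Sketch of submitting an application to the YARN ResourceManager.
    public class SubmitAppSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new YarnConfiguration();
            YarnClient yarnClient = YarnClient.createYarnClient();
            yarnClient.init(conf);
            yarnClient.start();

            // Ask the ResourceManager for a new application id.
            YarnClientApplication app = yarnClient.createApplication();
            ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
            appContext.setApplicationName("deep-learning-dbn"); // illustrative name

            // Describe the container that runs the ApplicationMaster.
            ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
            amContainer.setCommands(Collections.singletonList(
                    "$JAVA_HOME/bin/java com.example.DbnApplicationMaster" // illustrative class
                            + " 1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout"
                            + " 2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr"));
            appContext.setAMContainerSpec(amContainer);

            // Resources requested for the ApplicationMaster container.
            Resource capability = Records.newRecord(Resource.class);
            capability.setMemory(1024);
            capability.setVirtualCores(1);
            appContext.setResource(capability);

            ApplicationId appId = appContext.getApplicationId();
            yarnClient.submitApplication(appContext);
            System.out.println("Submitted application " + appId);
        }
    }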

After the application is submitted, the YARN ResourceManager launches the ApplicationMaster. The ApplicationMaster is responsible for allocating and releasing the worker containers as needed, and it uses AMRMClient to communicate with the ResourceManager.
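As a rough illustration of this step, the sketch below uses the synchronous AMRMClient API from Hadoop 2.x to register the ApplicationMaster and request worker containers; the container count and sizes are illustrative assumptions.

    import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;
    import org.apache.hadoop.yarn.util.Records;

    // Sketch of an ApplicationMaster requesting worker containers.
    public class AmAllocationSketch {
        public static void main(String[] args) throws Exception {
            AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
            rmClient.init(new YarnConfiguration());
            rmClient.start();

            // Register this ApplicationMaster with the ResourceManager.
            rmClient.registerApplicationMaster("", 0, "");

            // Ask for one container per worker node.
            Resource capability = Records.newRecord(Resource.class);
            capability.setMemory(2048);
            capability.setVirtualCores(1);
            Priority priority = Records.newRecord(Priority.class);
            priority.setPriority(0);
            int numWorkers = 10; // illustrative; matches the 10-node cluster used later
            for (int i = 0; i < numWorkers; i++)
                rmClient.addContainerRequest(new ContainerRequest(capability, null, null, priority));

            // Poll the ResourceManager until all containers are allocated.
            int allocated = 0;
            while (allocated < numWorkers) {
                AllocateResponse response = rmClient.allocate(0.1f);
                for (Container container : response.getAllocatedContainers()) {
                    allocated++;
                    System.out.println("Allocated container " + container.getId());
                    // A worker would be launched in this container via NMClient (next sketch).
                }
                Thread.sleep(100);
            }

            // ... workers run here; on completion the AM unregisters:
            rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "done", "");
        }
    }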

The ApplicationMaster uses the NMClient API to run commands in the containers it received (passed on from the master node).
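A minimal sketch of that step, using the Hadoop 2.x NMClient API; the worker class name is illustrative.

    import java.util.Collections;
    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.client.api.NMClient;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;
    import org.apache.hadoop.yarn.util.Records;

    // Sketch of launching a worker process in an allocated container.
    public class LaunchWorkerSketch {
        static void launchWorker(NMClient nmClient, Container container) throws Exception {
            ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
            ctx.setCommands(Collections.singletonList(
                    "$JAVA_HOME/bin/java com.example.DbnWorker")); // illustrative class name
            nmClient.startContainer(container, ctx);
        }

        public static void main(String[] args) {
            NMClient nmClient = NMClient.createNMClient();
            nmClient.init(new YarnConfiguration());
            nmClient.start();
            // launchWorker(...) would be called for each container allocated
            // by the AMRMClient loop in the previous sketch.
        }
    }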

Once the ApplicationMaster has launched the worker containers it needs, it sets up a port over which to communicate with the workers. For our deep learning implementation, we added methods for parameter initialization, layer-wise training, and fine-tuning to the original IterativeReduce interface. IterativeReduce uses Apache Avro IPC for communication between the master and the workers.
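The actual interface is not shown in this copy; the sketch below indicates how such an extension might look. The method names are hypothetical and chosen only to mirror the three additions described above.

    // Hypothetical sketch of how the IterativeReduce worker interface might be
    // extended for deep learning; names do not reflect the actual PayPal code.
    public interface DeepLearningWorker<T> {
        void initializeParameters(T initialWeights); // receive initial RBM weights
        T trainLayerEpoch(int layer);                // one epoch over the local split
        void fineTune(T finalWeights);               // back-propagation phase
    }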

The following code snippet shows the series of steps involved in distributed master-worker training: the master sends the initial parameters to the workers, and each worker trains its RBM on its portion of the data. After a worker finishes training, it sends its results back to the master, and the master combines them. When the iterations are complete, the master finishes the process by launching the back-propagation fine-tuning phase.
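With the snippet itself missing, here is a sketch of such a master loop, tying together the hypothetical DeepLearningWorker interface and the averaging helper from the earlier sketches; in the real system these calls would go over Avro IPC rather than being local method calls.

    import java.util.ArrayList;
    import java.util.List;

    // Sketch of the distributed master loop for one RBM layer.
    public class DistributedDbnMaster {

        static double[][] trainLayerDistributed(List<DeepLearningWorker<double[][]>> workers,
                                                double[][] weights, int layer, int epochs) {
            for (int epoch = 0; epoch < epochs; epoch++) {
                List<double[][]> updates = new ArrayList<>();
                for (DeepLearningWorker<double[][]> worker : workers) {
                    worker.initializeParameters(weights);       // master pushes weights
                    updates.add(worker.trainLayerEpoch(layer)); // worker trains on its split
                }
                weights = ParameterAveragingSketch.average(updates); // master combines
            }
            return weights;
        }

        // After all layers are trained this way, the master would invoke
        // fineTune(...) on every worker to run the back-propagation phase.
    }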

Results

We evaluated the performance of our deep learning implementation using the MNIST handwritten digit recognition dataset [3]. The dataset contains hand-labeled images of the digits 0–9. The training set consists of 60,000 images, and the test set contains 10,000 images.

To measure performance, the DBN was first pre-trained and then fine-tuned on the 60,000 training images, and then evaluated on the 10,000 test images using the steps described above. The images were not pre-processed during training or evaluation. The error rate was obtained as the ratio of the number of misclassified images to the total number of images in the test set.

We achieved a best classification error rate of 1.66% using 500-500-2000 hidden units in the respective RBM layers and a distributed setup of 10 nodes. This error rate is comparable to the 1.2% reported by the authors of the original algorithm (with 500-500-2000 hidden units) [2] and to other results under similar settings [3]. We note that the original implementation ran on a single machine, whereas ours runs on a distributed cluster. The parameter-averaging step causes a slight reduction in accuracy, but the benefit of distributing the algorithm over multiple machines outweighs it. The table below summarizes how the error rate varies with the number of hidden units per layer on a 10-node cluster.

Table 1: MNIST performance evaluation

Further thoughts

We have successfully deployed a deep learning system, and we believe it will be useful in solving certain machine learning problems. In addition, the IterativeReduce abstraction can be leveraged to distribute any other suitable machine learning algorithm, and the ability to use a general-purpose Hadoop cluster will prove highly beneficial for running large-scale machine learning algorithms on big datasets. We note that our current framework needs improvement in some areas, mainly around reducing network latency and providing more advanced resource management. We would also like to optimize the DBN framework so that communication between nodes can be reduced. The Hadoop YARN framework gives us the flexibility to do this through fine-grained control of cluster resources.

References

[1] G. E. Hinton, S. Osindero, and Y. Teh. A Fast Learning Algorithm for Deep Belief Nets. Neural Computation, 18(7):1527–1554, 2006.

[2] G. E. Hinton and R. R. Salakhutdinov. Reducing the Dimensionality of Data with Neural Networks. Science, 313(5786):504–507, 2006.

[3] Y. LeCun, C. Cortes, and C. J. C. Burges. The MNIST Database of Handwritten Digits.

[4] Deep Learning Tutorial. LISA Lab, University of Montreal.

[5] G. E. Hinton. A Practical Guide to Training Restricted Boltzmann Machines. Lecture Notes in Computer Science, Volume 7700:599–619, 2012.

[6] M. Grazia, I. Stoianov, and M. Zorzi. Parallelization of Deep Networks. ESANN, 2012.

[7] IterativeReduce, https://github.com/jpatanooga/KnittingBoar/wiki/IterativeReduce

[8] Apache Hadoop YARN - Enabling Next Generation Data Applications, http://www.slideshare.net/hortonworks/apache-hadoop-yarn-enabling-nex
