Apache Ignite Series (V): Distributed Computing

Last Update:2018-08-24 Source: Internet

Author: User

Tags apache ignite

Ignite Distributed Computing

Ignite supports both distributed computation in the traditional MapReduce style and collocated computation built on distributed storage. When data is spread across different nodes, a computation is sent, according to the provided affinity key, to the node where the data resides, and related data is collocated, that is, stored on the same node. This avoids moving large amounts of data during the computation, which helps keep performance high.

The main features of Ignite distributed computing are as follows:

Feature	Description
Automatic deployment	Compute classes are propagated automatically, so the related classes do not have to be deployed on every node. This is enabled through the `peerClassLoadingEnabled` configuration option. Cached entity classes, however, are not propagated automatically.
Load balancing	After data is loaded, the cluster rebalances it so that it is evenly distributed across the nodes. When a computation runs in the cluster, it can be routed by the provided affinity key to the node that holds the data, which is collocated computation.
Failover	When a node fails or another error occurs, the job is automatically transferred to another node in the cluster for execution.

1. Distributed closures:

The Ignite compute grid can broadcast and load-balance any closure within a cluster or cluster group, including plain Java runnables and callables.

Closure type	Function
Broadcast	Sends a job to a specified subset of nodes or to all nodes
Call/run	Executes a single job or a collection of jobs
Apply	Takes a closure and a collection as parameters, creates one job per element of the collection, each of which applies the closure to one element, and returns the collection of results.
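The apply semantics in the table can be modeled without a cluster. The following is a minimal plain-Java sketch, not Ignite API: the class `ApplySketch` and its `apply` method are hypothetical names, and a local thread pool stands in for the cluster nodes. It only illustrates that one job is created per element and that the results come back as a collection.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.function.Function;

class ApplySketch {
    /** Creates one job per element, runs the jobs on a pool, and collects the results in order. */
    static <T, R> List<R> apply(Function<T, R> closure, Collection<T> args) {
        // Assumption: three worker threads play the role of three cluster nodes.
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (T arg : args) {
                Callable<R> job = () -> closure.apply(arg); // one job per element
                futures.add(pool.submit(job));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> f : futures) {
                try {
                    results.add(f.get());
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> lengths = apply(String::length, Arrays.asList("How many characters".split(" ")));
        int total = lengths.stream().mapToInt(Integer::intValue).sum();
        System.out.println("Total number of characters: " + total); // prints 17
    }
}
```

In Ignite itself the jobs are serialized and shipped to remote nodes, which this local sketch does not attempt to show.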

ComputeTestController.java

    /** Broadcast test */
    @RequestMapping("/broadcast")
    String broadcastTest(HttpServletRequest request, HttpServletResponse response) {
        // IgniteCompute compute = ignite.compute(ignite.cluster().forRemotes());  // broadcast to remote nodes only
        IgniteCompute compute = ignite.compute();
        compute.broadcast(() -> System.out.println("Hello Node: " + ignite.cluster().localNode().id()));
        return "all executed.";
    }

    /** Call and run test */
    @RequestMapping("/call")
    public @ResponseBody
    String callTest(HttpServletRequest request, HttpServletResponse response) {
        Collection<IgniteCallable<Integer>> calls = new ArrayList<>();

        /** call */
        System.out.println("-----------call-----------");
        for (String word : "How many characters".split(" ")) {
            calls.add(word::length);
            // calls.add(() -> word.length());
        }
        Collection<Integer> res = ignite.compute().call(calls);
        int total = res.stream().mapToInt(Integer::intValue).sum();
        System.out.println(String.format("The total lengths of all words are [%s].", total));

        /** run */
        System.out.println("-----------run-----------");
        for (String word : "Print words on different cluster nodes".split(" ")) {
            ignite.compute().run(() -> System.out.println(word));
        }

        /** async call */
        System.out.println("-----------async call-----------");
        // withAsync() is deprecated since Ignite 2.0; callAsync()/runAsync()/applyAsync() replace it.
        IgniteCompute asyncCompute = ignite.compute().withAsync();
        asyncCompute.call(calls);
        asyncCompute.future().listen(fut -> {
            Collection<Integer> result = (Collection<Integer>) fut.get();
            int t = result.stream().mapToInt(Integer::intValue).sum();
            System.out.println("Total number of characters: " + t);
        });

        /** async run */
        System.out.println("-----------async run-----------");
        Collection<ComputeTaskFuture<?>> futs = new ArrayList<>();
        asyncCompute = ignite.compute().withAsync();
        for (String word : "Print words on different cluster nodes".split(" ")) {
            asyncCompute.run(() -> System.out.println(word));
            futs.add(asyncCompute.future());
        }
        // Wait for every asynchronous job to finish.
        futs.stream().forEach(ComputeTaskFuture::get);
        return "all executed.";
    }

    /** Apply test */
    @RequestMapping("/apply")
    public @ResponseBody
    String applyTest(HttpServletRequest request, HttpServletResponse response) {
        /** apply */
        System.out.println("-----------apply-----------");
        IgniteCompute compute = ignite.compute();
        Collection<Integer> res = compute.apply(String::length, Arrays.asList("How many characters".split(" ")));
        int total = res.stream().mapToInt(Integer::intValue).sum();
        System.out.println(String.format("The total lengths of all words are [%s].", total));

        /** async apply */
        IgniteCompute asyncCompute = ignite.compute().withAsync();
        // In async mode apply() returns null; the result is delivered through the future.
        res = asyncCompute.apply(String::length, Arrays.asList("How many characters".split(" ")));
        asyncCompute.future().listen(fut -> {
            int t = ((Collection<Integer>) fut.get()).stream().mapToInt(Integer::intValue).sum();
            System.out.println("Total number of characters: " + t);
        });
        return "all executed.";
    }

2. MapReduce:

In Ignite, MapReduce is implemented by ComputeTask, whose main methods are map() and reduce(). map() controls how jobs are mapped to nodes, and reduce() processes the collected job results into the final result. ComputeTask has two main implementations, ComputeTaskAdapter and ComputeTaskSplitAdapter. The main difference is that ComputeTaskAdapter requires you to implement map() manually, whereas ComputeTaskSplitAdapter maps jobs to nodes automatically.
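The division of labor between map() and reduce() can be sketched in plain Java before looking at the Ignite classes. This is an illustrative model only: `MapReduceSketch`, its methods, and the node names are hypothetical, node names are plain strings rather than cluster nodes, and the jobs run locally and sequentially. The node list is assumed to be non-empty.

```java
import java.util.*;

class MapReduceSketch {
    /** map(): assign each word (one job) to a node, round-robin. */
    static Map<String, List<String>> map(List<String> nodes, String phrase) {
        Map<String, List<String>> assignment = new LinkedHashMap<>();
        for (String node : nodes) {
            assignment.put(node, new ArrayList<>());
        }
        Iterator<String> it = nodes.iterator();
        for (String word : phrase.split(" ")) {
            if (!it.hasNext()) {
                it = nodes.iterator(); // all nodes used, start over
            }
            assignment.get(it.next()).add(word);
        }
        return assignment;
    }

    /** reduce(): combine the per-job results into the final result. */
    static int reduce(List<Integer> jobResults) {
        int sum = 0;
        for (int r : jobResults) {
            sum += r;
        }
        return sum;
    }

    /** Runs every mapped job (here: word length) and reduces the results. */
    static int execute(List<String> nodes, String phrase) {
        List<Integer> results = new ArrayList<>();
        for (List<String> words : map(nodes, phrase).values()) {
            for (String word : words) {
                results.add(word.length()); // the job that would run on the assigned node
            }
        }
        return reduce(results);
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node-A", "node-B");
        System.out.println(map(nodes, "Hello Ignite Enable World!"));
        System.out.println(execute(nodes, "Hello Ignite Enable World!")); // prints 23
    }
}
```

With two nodes, the round-robin mapping places "Hello" and "Enable" on the first node and "Ignite" and "World!" on the second, which is why a single node prints only some of the words in a real cluster.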

ComputeTaskAdapter:

    /** ComputeTaskAdapter */
    @RequestMapping("/taskmap")
    public @ResponseBody
    String taskMapTest(HttpServletRequest request, HttpServletResponse response) {
        /** ComputeTaskMap */
        int cnt = ignite.compute().execute(MapExampleCharacterCountTask.class, "Hello Ignite Enable World!");
        System.out.println(String.format(">>> Total number of characters in the phrase is %s.", cnt));
        return "all executed.";
    }

    private static class MapExampleCharacterCountTask extends ComputeTaskAdapter<String, Integer> {
        /** Node mapping: one job per word, assigned to the nodes round-robin. */
        @Override
        public Map<? extends ComputeJob, ClusterNode> map(List<ClusterNode> nodes, String arg) throws IgniteException {
            Map<ComputeJob, ClusterNode> map = new HashMap<>();
            Iterator<ClusterNode> it = nodes.iterator();
            for (final String word : arg.split(" ")) {
                // If we used all nodes, restart the iterator.
                if (!it.hasNext()) {
                    it = nodes.iterator();
                }
                ClusterNode node = it.next();
                map.put(new ComputeJobAdapter() {
                    @Override
                    public Object execute() throws IgniteException {
                        System.out.println("-------------------------------------");
                        System.out.println(String.format(">>> Printing [%s] on this node from ignite job.", word));
                        return word.length();
                    }
                }, node);
            }
            return map;
        }

        /** Result aggregation: sum the word lengths returned by the jobs. */
        @Override
        public Integer reduce(List<ComputeJobResult> results) throws IgniteException {
            int sum = 0;
            for (ComputeJobResult res : results) {
                sum += res.<Integer>getData();
            }
            return sum;
        }
    }

Output:

    -------------------------------------
    >>> Printing [Ignite] on this node from ignite job.
    -------------------------------------
    >>> Printing [World!] on this node from ignite job.
    >>> Total number of characters in the phrase is 23.

ComputeTaskSplitAdapter:

    /** ComputeTaskSplitAdapter */
    @RequestMapping("/tasksplit")
    public @ResponseBody
    String taskSplitTest(HttpServletRequest request, HttpServletResponse response) {
        /** ComputeTaskSplitAdapter (automatic mapping) */
        int result = ignite.compute().execute(SplitExampleDistributedCompute.class, null);
        System.out.println(String.format(">>> result: [%s]", result));
        return "all executed.";
    }

    private static class SplitExampleDistributedCompute extends ComputeTaskSplitAdapter<String, Integer> {
        @Override
        protected Collection<? extends ComputeJob> split(int gridSize, String arg) throws IgniteException {
            Collection<ComputeJob> jobs = new LinkedList<>();
            jobs.add(new ComputeJobAdapter() {
                @Override
                public Object execute() throws IgniteException {
                    // IgniteCache<Long, Student> cache = Ignition.ignite().cache(CacheKeyConstant.STUDENT);
                    IgniteCache<Long, BinaryObject> cache = Ignition.ignite().cache(CacheKeyConstant.STUDENT).withKeepBinary();

                    /** Plain query */
                    String sql_query = "name = ? and email = ?";
                    // SqlQuery<Long, Student> cSqlQuery = new SqlQuery<>(Student.class, sql_query);
                    SqlQuery<Long, BinaryObject> cSqlQuery = new SqlQuery<>(Student.class, sql_query);
                    cSqlQuery.setReplicatedOnly(true).setArgs("student_54", "student_54gmail.com");
                    // List<Cache.Entry<Long, Student>> result = cache.query(cSqlQuery).getAll();
                    List<Cache.Entry<Long, BinaryObject>> result = cache.query(cSqlQuery).getAll();
                    System.out.println("--------------------");
                    result.stream().map(x -> {
                        Integer studId = x.getValue().field("studId");
                        String name = x.getValue().field("name");
                        return String.format("name=[%s], studId=[%s].", name, studId);
                    }).forEach(System.out::println);
                    System.out.println(String.format("the query size is [%s].", result.size()));
                    return result.size();
                }
            });
            return jobs;
        }

        @Override
        public Integer reduce(List<ComputeJobResult> results) throws IgniteException {
            int sum = results.stream().mapToInt(x -> x.<Integer>getData()).sum();
            return sum;
        }
    }

Output:

    --------------------
    name=[student_54], studId=[54].
    the query size is [1].
    >>> result: [1]

Limitations of MapReduce:

MapReduce suits parallel and batch workloads. It does not suit serial, iterative, or recursive workloads, where the work cannot be parallelized or split into independent jobs.

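Pitfalls and considerations in distributed computing

When using Ignite's distributed computing features, if a cache is involved and the cache value is not a platform type (a basic Java type), deserialization has to be taken into account.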

Two solutions are available:

Deploy the cache entity classes to the Ignite nodes

The cache entity class has to implement the Serializable interface and should declare a serialVersionUID.

serialVersionUID identifies the current version of the entity class. Every class that implements Serializable has one; if it is not declared explicitly, the Java serialization mechanism generates a default value from the class definition. It is best to declare serialVersionUID explicitly. Otherwise, when the entity class is modified on one end of the transfer, the JVM computes a new value there, the two ends no longer agree, and deserialization fails with an exception.

    public class Student implements Serializable {
        private static final long serialVersionUID = -5941489737545326242L;
        ....
    }
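To see what the explicit serialVersionUID does, the class can be serialized and read back with standard Java serialization. The sketch below uses only the JDK; the wrapper class `SerialVersionSketch`, its helper methods, and the `studId` and `name` fields are assumptions made for illustration, since the source elides the body of Student.

```java
import java.io.*;

class SerialVersionSketch {
    /** Hypothetical entity with an explicitly declared serialVersionUID. */
    static class Student implements Serializable {
        private static final long serialVersionUID = -5941489737545326242L;
        final long studId;
        final String name;

        Student(long studId, String name) {
            this.studId = studId;
            this.name = name;
        }
    }

    static byte[] serialize(Object o) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            // Throws InvalidClassException if the stream's serialVersionUID differs from the local class.
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Student copy = (Student) deserialize(serialize(new Student(54, "student_54")));
        System.out.println(copy.name);
        // The declared value is what the serialization runtime writes into the stream.
        System.out.println(ObjectStreamClass.lookup(Student.class).getSerialVersionUID());
    }
}
```

Because the value is fixed in the source, adding a field to Student on one node does not change the identifier, so older and newer nodes can still exchange instances.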

Package the entity classes into a plain jar and place it under the $IGNITE_HOME/libs/ directory.

Note: the entity classes must not be packaged as a Spring Boot executable jar. They have to be packaged as a plain jar so that the classes can be loaded normally. If every node in the cluster is an application node, the classes are already on the classpath and this issue does not arise.

Using binary objects to manipulate the cache

By default, Ignite works with deserialized values, since that is the most common use case. To enable BinaryObject processing, obtain an IgniteCache instance and call its withKeepBinary() method. When enabled, this flag ensures that objects returned from the cache are in BinaryObject format whenever possible.

    IgniteCache<Long, BinaryObject> cache = ignite.cache("student").withKeepBinary();
    BinaryObject obj = cache.get(k);            // get the binary object
    String name = obj.<String>field("name");    // read a field of the binary object with field()

3. Collocated computation:

The affinityCall(...) and affinityRun(...) methods collocate a job with the node that caches the data. In other words, given a cache name and an affinity key, these methods try to locate the node where the key resides in the specified cache and then execute the job there.

The two types of collocation and how they differ:

Collocation type	Features
Data collocation	Related cache data is collocated to ensure that all of its keys are cached on the same node, avoiding the network overhead of moving data between nodes.
Compute collocation	Based on the affinity key and the cache name, locates the node where the affinity key resides and executes the job on that node.
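The routing idea behind compute collocation can be modeled in a few lines of plain Java. This is a deliberately simplified sketch: `AffinitySketch`, `Node`, `ownerOf`, and `runOnOwner` are hypothetical names, and the modulo mapping stands in for Ignite's real affinity function, which works differently. It only shows that the job is sent to the key's owner and reads the value locally.

```java
import java.util.*;
import java.util.function.Function;

class AffinitySketch {
    /** Stand-in for a cluster node that owns part of the cache. */
    static class Node {
        final String name;
        final Map<Long, String> localData = new HashMap<>();

        Node(String name) {
            this.name = name;
        }
    }

    final List<Node> nodes = new ArrayList<>();

    AffinitySketch(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new Node("node-" + i));
        }
    }

    /** Simplified affinity mapping (an assumption): key modulo node count. */
    Node ownerOf(long key) {
        return nodes.get((int) Math.floorMod(key, (long) nodes.size()));
    }

    /** Stores the value on the node that owns the key. */
    void put(long key, String value) {
        ownerOf(key).localData.put(key, value);
    }

    /** Runs the job on the owner of the key, so the job only touches local data. */
    <R> R runOnOwner(long key, Function<Node, R> job) {
        return job.apply(ownerOf(key));
    }

    public static void main(String[] args) {
        AffinitySketch cluster = new AffinitySketch(3);
        cluster.put(495L, "student_495");
        cluster.put(496L, "student_496");
        for (long key : new long[] {495L, 496L}) {
            System.out.println(cluster.runOnOwner(key, n -> n.name + " read " + n.localData.get(key)));
        }
    }
}
```

The design point is that the code moves to the data: the job receives the owning node and never asks another node for the value.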

ComputeTestController.java

    /** Collocated computation test */
    @RequestMapping("/affinity")
    public @ResponseBody
    String affinityTest(HttpServletRequest request, HttpServletResponse response) {
        /** affinityRun call */
        System.out.println("-----------affinityRun call-----------");
        IgniteCompute compute = ignite.compute();
        // IgniteCompute compute = ignite.compute(ignite.cluster().forRemotes());

        // NOTE: the loop bound and the modulus were lost in the source text; "..." marks each gap.
        for (int key = 0; key < ...; key++) {
            // final long k = key;
            // Generate a random k value.
            final long k = IntStream.generate(() -> (int) (System.nanoTime() % ...)).limit(1).findFirst().getAsInt();
            // Run the job on the node that owns key k.
            compute.affinityRun(CacheKeyConstant.STUDENT, k, () -> {
                IgniteCache<Long, BinaryObject> cache = ignite.cache(CacheKeyConstant.STUDENT).withKeepBinary();
                BinaryObject obj = cache.get(k);
                if (obj != null) {
                    System.out.println(String.format("Co-located[key=%s, value=%s]", k, obj.<String>field("name")));
                }
            });
        }

        IgniteCache<Long, BinaryObject> cache = ignite.cache(CacheKeyConstant.STUDENT).withKeepBinary();
        // For every cache entry, print its name on the node that owns the entry's key.
        cache.forEach(lo -> compute.affinityRun(CacheKeyConstant.STUDENT, lo.getKey(), () -> {
            System.out.println(lo.getValue().<String>field("name"));
        }));
        return "all executed.";
    }

Output:

    -----------affinityRun call-----------
    student_495
    student_496
    student_498
    ...

This concludes the overview of Ignite distributed computing.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
