Generally, when a MapReduce program calls a third-party jar package, the task fails because the jar cannot be found, even though you have checked that the jar is in the expected path. If you think about it, the reason is clear: the jar is stored locally on the machine where the MapReduce main program is executed, generally the client machine, whereas the map or reduce function runs on the machines in the cluster, where the jar is not available. You can solve this with either of the following methods:
1. Place the jar package on every machine of the cluster in advance.
2. As with a job that calls the MySQL driver, first put the jar package into HDFS and then distribute it to each machine through Hadoop's DistributedCache.
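The second method can be sketched roughly as follows. This is a minimal driver, not a complete job: the class name ThirdPartyJarDriver and the HDFS path /libs/mysql-connector-java.jar are hypothetical, and it assumes the Hadoop 2.x or later API, where Job.addFileToClassPath registers a file already in HDFS with the distributed cache and adds it to the classpath of each task. The jar is assumed to have been uploaded beforehand, for example with hdfs dfs -put.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver class, for illustration only.
public class ThirdPartyJarDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "third-party-jar-example");
        job.setJarByClass(ThirdPartyJarDriver.class);

        // Assumed HDFS location of the third-party jar (hypothetical path).
        // The file is shipped to every task node through the distributed
        // cache and added to the classpath of the map and reduce tasks.
        job.addFileToClassPath(new Path("/libs/mysql-connector-java.jar"));

        // Set your own mapper and reducer here with job.setMapperClass(...)
        // and job.setReducerClass(...); without them Hadoop uses the
        // identity mapper and reducer.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The driver would be submitted in the usual way, passing the input and output directories as the two arguments. On older Hadoop 1.x clusters the equivalent call is likely DistributedCache.addFileToClassPath(path, conf), which is deprecated in later versions.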
How to call a third-party jar package in the map or reduce function