We used distcp on the CDH4 Hadoop cluster to copy data from the CDH5 Hadoop cluster down to CDH4, with the following command:
hadoop distcp -update -skipcrccheck hftp://cdh5:50070/xxxx hdfs://cdh4/xxx
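For reference, here is the same command with the intent of each flag spelled out in comments; the hostname and truncated paths are just the placeholders used above, and the comments describe standard distcp flag behavior.

# -update        copy only files that are missing on the target or differ
#                from it, instead of overwriting everything
# -skipcrccheck  with -update, skip the CRC comparison between source and
#                target; typically used when checksums cannot be compared
#                across clusters (different checksum type or block size)
hadoop distcp -update -skipcrccheck hftp://cdh5:50070/xxxx hdfs://cdh4/xxx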
When a file is very large, the following error occurs:
2017-12-15 10:47:24,506 INFO execute.BulkLoadHBase - Caused by: java.io.IOException: Got EOF but currentPos = 2278825984 < filelength = 3486427523
2017-12-15 10:47:24,506 INFO execute.BulkLoadHBase -     at org.apache.hadoop.hdfs.ByteRangeInputStream.update(ByteRangeInputStream.java:172)
2017-12-15 10:47:24,506 INFO execute.BulkLoadHBase -     at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:187)
2017-12-15 10:47:24,506 INFO execute.BulkLoadHBase -     at java.io.DataInputStream.read(DataInputStream.java:149)
2017-12-15 10:47:24,506 INFO execute.BulkLoadHBase -     at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
2017-12-15 10:47:24,506 INFO execute.BulkLoadHBase -     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
2017-12-15 10:47:24,506 INFO execute.BulkLoadHBase -     at java.io.FilterInputStream.read(FilterInputStream.java:107)
The problem can be avoided by reading the source over webhdfs instead of the legacy hftp protocol; the command is as follows:
hadoop distcp -update -skipcrccheck webhdfs://cdh5:50070/xxxx hdfs://cdh4/xxx
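After the copy, a quick consistency check is worthwhile. A minimal sketch is below; the hostname and truncated paths are the same placeholders as above, and since -skipcrccheck was used, file counts and total sizes are the practical things to compare.

# file/directory counts and total bytes on the CDH5 source, read over webhdfs
hadoop fs -count webhdfs://cdh5:50070/xxxx
# the same on the CDH4 destination
hadoop fs -count hdfs://cdh4/xxx
# total size only, if that is easier to compare
hadoop fs -du -s webhdfs://cdh5:50070/xxxx
hadoop fs -du -s hdfs://cdh4/xxx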