An anti-spam RD reported an HQL job that exits with an error during execution: java.io.IOException: Broken pipe. The HQL feeds data to a Python script via TRANSFORM, and neither the HQL nor the Python script had been changed recently. The job ran normally on Oct 1, but from Oct 4 onward it always failed with the same error, which occurred in the Stage-2 phase. The error message on the gateway:
2014-10-10 15:05:32,724 Stage-2 map = 100%, reduce = 100%
Ended Job = job_201406171104_4019895 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
Job error message on the JobTracker page:
2014-10-10 15:00:29,614 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"1000390355","reducesinkkey1":"14"},"value":{"_col0":"1000390355","_col1":25,"_col2":"Infinity","_col3":"14","_col4":17},"alias":0}
at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:268)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:518)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:419)
at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1061)
at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"1000390355","reducesinkkey1":"14"},"value":{"_col0":"1000390355","_col1":25,"_col2":"Infinity","_col3":"14","_col4":17},"alias":0}
at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:256)
... 7 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Broken pipe
at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:348)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247)
... 7 more
Caused by: java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:260)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
at org.apache.hadoop.hive.ql.exec.TextRecordWriter.write(TextRecordWriter.java:43)
at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:331)
... 15 more
STDERR logs:
Traceback (most recent call last):
File "/data10/hadoop/local/taskTracker/liangjun/jobcache/job_201406171104_4019895/attempt_201406171104_4019895_r_000000_0/work/./pranalysis.py", line 86, in <module>
pranalysis(cols[0],pr,cols[1],cols[4],prnum)
File "/data10/hadoop/local/taskTracker/liangjun/jobcache/job_201406171104_4019895/attempt_201406171104_4019895_r_000000_0/work/./pranalysis.py", line 60, in pranalysis
print '%s\t%d\t%d\t%d'%(uid,v[14]-20,type,rank)
TypeError: %d format: a number is required, not float
From the job error messages above, the preliminary judgment is that a data problem introduced after Oct 1 caused the Python script to exit while processing, which closed the data pipe. The ExecReducer.reduce() method, not knowing that the channel it uses to write data to Python had been closed by the exception, kept writing to it, and the java.io.IOException: Broken pipe exception appeared.
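The mechanism is easy to reproduce outside Hadoop. Below is a minimal sketch (an illustration, not part of the job): the parent keeps writing rows to a child process that dies after reading one line, so the parent's writes eventually fail with errno 32, which is what ExecReducer surfaces as java.io.IOException: Broken pipe.

#!/usr/bin/python
# minimal broken-pipe demo: the reader dies mid-stream, the writer gets EPIPE
import subprocess

# the child stands in for pranalysis.py: read one line, then exit abnormally
child = subprocess.Popen(
    ['python', '-c', 'import sys; sys.stdin.readline(); sys.exit(1)'],
    stdin=subprocess.PIPE)

try:
    for i in range(1000000):
        # like ExecReducer.reduce(), keep forwarding rows without checking
        # whether the reader on the other end of the pipe is still alive
        child.stdin.write('row-%d\n' % i)
except IOError as e:
    print 'writer got: %s' % e    # IOError: [Errno 32] Broken pipe
finally:
    child.wait()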
Here is the analysis process:
1. HQL and Python
The HQL is as follows:
add file /usr/home/wbdata_anti/shell/sass_offline/pranalysis.py;
select transform(BS.*) using 'pranalysis.py' as uid,prvalue,trend,prlevel
from
(
select B1.uid,B1.flws,B1.pr,iter,B2.alivefans from tmp_anti_user_pagerank1 B1
join
mds_anti_user_flwpr B2
on B1.uid=B2.uid
where iter>'00' and iter<='14' and dt='lowrlfans20141001'
distribute by uid sort by uid,iter
)BS;
The Python script reads as follows:
#!/usr/bin/python
#coding=utf-8
import sys,time
import re,math
from optparse import OptionParser
import ConfigParser
reload(sys)
sys.setdefaultencoding('utf-8')
parser = OptionParser(usage="usage:%prog [options] filepath")
parser.add_option("-i", "--iter", action="store", type='string', dest="iter", default='14',
                  help="how many iterators")
(options, args) = parser.parse_args()

def pranalysis(uid,prs,flw,fans,prnum):
    tasc=tdesc=0
    try:
        v=[float(pr)*100000000000 for pr in prs]
        fans=int(fans)
        interval=fans/100
    except:
        #rst=sys.exc_info()
        #sys.excepthook(rst[0],rst[1],rst[2])
        return
    for i in range(1,prnum-1):
        if i==1:
            # note: "v>fans" compares the whole list with an int (always True
            # in Python 2); the intent was likely v[i]>fans
            if v[i+1]-v[i]>interval and v>fans: tasc += 1
            elif v[i]-v[i+1]>interval and v[i+1]<fans: tdesc += 1
            continue
        if v[i+1]-v[i]>interval: tasc += 1
        elif v[i]-v[i+1]>interval: tdesc += 1
    # rank reflects the ratio between pr and fans; a higher rank (bigger
    # number) means a more likely negative user
    rate=v[prnum-1]/fans
    rank=4
    if rate>3.0: rank=0
    elif rate>2.0: rank=1
    elif rate>1.3: rank=2
    elif rate>0.7: rank=3
    elif rate>0.5: rank=4
    elif rate>0.3: rank=5
    elif rate>0.2: rank=6
    else: rank=7
    # trend type: 0 for stable trend, 1 for round trend, 2 for positive user,
    # 3 for negative user
    type=0
    if tasc>0 and tdesc>0:
        type=1
    elif tasc>0:
        type=2
    elif tdesc>0:
        type=3
    else: # tasc==0 and tdesc==0
        type=0
    #if fans<60:
    #    type=0
    print '%s\t%d\t%d\t%d'%(uid,v[14]-20,type,rank)

#format sort by uid, iter
#uid follow pr iter fans
#1642909335 919 0.00070398898 04 68399779
prnum=int(options.iter)+1
pr=[0]*prnum
idx=1
lastiter='00'
lastuid=''
for line in sys.stdin:
    line=line.rstrip('\n')
    cols=line.split('\t')
    if len(cols)<5: continue
    if cols[3]>options.iter or cols[3]=='00': continue
    if cols[3]<=lastiter:
        print '%s\t%d\t%d\t%d'%(lastuid,2,0,7)
        pr=[0]*prnum
        idx=1
    lastiter=cols[3]
    lastuid=cols[0]
    pr[idx]=cols[2]
    idx+=1
    if cols[3]==options.iter:
        pranalysis(cols[0],pr,cols[1],cols[4],prnum)
        pr=[0]*prnum
        lastiter='00'
        idx=1
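A transform script like this can be tested outside Hive by piping tab-separated rows into its stdin. The sketch below (a local test harness, not part of the job, assuming pranalysis.py is in the current directory) drives the script with synthetic rows for one uid over iterations 01 to 14, using the column order documented in the script's comments (uid, follow, pr, iter, fans):

# run pranalysis.py locally against well-formed rows; with identical pr values
# the trend is stable, so on Python 2.6/2.7 this prints one line per uid:
# 1642909335    70398878    0    3
import subprocess

rows = ['1642909335\t919\t0.00070398898\t%02d\t68399779\n' % it
        for it in range(1, 15)]
p = subprocess.Popen(['python', 'pranalysis.py'], stdin=subprocess.PIPE)
p.stdin.writelines(rows)
p.stdin.close()
p.wait()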
2. Execution plan for the Stage-2 reduce phase:
Reduce Operator Tree:
  Extract
    Select Operator
      expressions:
            expr: _col0
            type: string
            expr: _col1
            type: bigint
            expr: _col2
            type: string
            expr: _col3
            type: string
            expr: _col4
            type: bigint
      outputColumnNames: _col0, _col1, _col2, _col3, _col4
      Transform Operator
        command: pranalysis.py
        output info:
            input format: org.apache.hadoop.mapred.TextInputFormat
            output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
        File Output Operator
          compressed: false
          GlobalTableId: 0
          table:
              input format: org.apache.hadoop.mapred.TextInputFormat
              output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
The execution plan shows that the Stage-2 reduce phase is very simple: the rows produced by the map phase are processed with the pranalysis.py script, which converts the 5 input columns into 4 output columns. The script's output is expected to follow this format:
print '%s\t%d\t%d\t%d'%(uid,v[14]-20,type,rank)
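For reference, the reduce row dumped in the Hive error above corresponds to this script input line (uid, flws, pr, iter, alivefans); note that the pr field is the literal string Infinity:
1000390355 25 Infinity 14 17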
Combining the results of the execution plan with the job's stderr logs:
Traceback (most recent call last):
File "/data10/hadoop/local/taskTracker/liangjun/jobcache/job_201406171104_4019895/attempt_201406171104_4019895_r_000000_0/work/./pranalysis.py", line 86, in <module>
pranalysis(cols[0],pr,cols[1],cols[4],prnum)
File "/data10/hadoop/local/taskTracker/liangjun/jobcache/job_201406171104_4019895/attempt_201406171104_4019895_r_000000_0/work/./pranalysis.py", line 60, in pranalysis
print '%s\t%d\t%d\t%d'%(uid,v[14]-20,type,rank)
TypeError: %d format: a number is required, not float
As can be seen, the HQL hit a data anomaly while executing the Python script: after computation, one value was of float type where the %d format expected an integer. The row dumped in the Hive error shows _col2 (the pr column) as "Infinity"; float('Infinity')*100000000000 evaluates to inf, so v[14]-20 is inf and the %d conversion in the print at line 60 raises the TypeError. The Python script therefore exited abnormally, and the data pipe was closed on exit, but the ExecReducer.reduce() method did not know that the channel it writes data to Python through had been closed by the exception and continued to write data to it, at which point the java.io.IOException: Broken pipe exception occurred.
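The root cause is the bad pr value upstream, but the script can also be hardened so that a single bad record is dropped instead of killing the whole pipe. Note that the existing try/except in pranalysis() does not help here: float('Infinity') parses successfully, so the bad value slips through to the print. A possible guard (my sketch, not a fix from the original post; emit() is a hypothetical helper that would replace the print at the end of pranalysis()):

# skip rows whose pr overflowed to inf/nan, and coerce explicitly to int so
# the %d format never sees a float
import math

def emit(uid, v14, type_, rank):
    if math.isinf(v14) or math.isnan(v14):   # e.g. float('Infinity')*100000000000
        return                               # drop the bad row instead of dying
    print '%s\t%d\t%d\t%d' % (uid, int(v14) - 20, type_, rank)

With a guard like this the script keeps consuming its stdin, so ExecReducer never writes into a closed pipe because of one bad record.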
Reference:
http://fgh2011.iteye.com/blog/1684544
http://blog.csdn.net/churylin/article/details/11969925