Hue is an open source graphical management tool released under the Apache license, written in Python on the Django framework. Sqoop is an open source Apache project, written in Java, used mainly to transfer data between HDFS and traditional relational databases. While integrating the two over the past couple of days I ran into a problem, which I record here.
The Hue version is 3.9.0 and the Sqoop version is 1.99.6, the latest Sqoop2 release at the time.
After installing Hue and Sqoop, edit the Hue configuration file, hue-3.9.0/desktop/conf/hue.ini
Locate the Sqoop configuration section and set the Sqoop server URL to the real address of your server.
    # Sqoop server URL
    server_url=http://ip:12000/sqoop
Start Hue and Sqoop. On the Hue management page you can create and modify Sqoop links, but creating a new Sqoop job fails with an error.
Check Sqoop's log: sqoop-1.99.6-bin-hadoop200/server/logs/catalina.out
A null pointer exception was found:
    java.lang.NullPointerException
        at org.apache.sqoop.json.util.ConfigInputSerialization.restoreConfig(ConfigInputSerialization.java:160)
        at org.apache.sqoop.json.util.ConfigInputSerialization.restoreConfigList(ConfigInputSerialization.java:129)
        at org.apache.sqoop.json.JobBean.restoreJob(JobBean.java:179)
        at org.apache.sqoop.json.JobBean.restore(JobBean.java:159)
        at org.apache.sqoop.handler.JobRequestHandler.createUpdateJob(JobRequestHandler.java:169)
        at org.apache.sqoop.handler.JobRequestHandler.handleEvent(JobRequestHandler.java:106)
        at org.apache.sqoop.server.v1.JobServlet.handlePostRequest(JobServlet.java:91)
        at org.apache.sqoop.server.SqoopProtocolServlet.doPost(SqoopProtocolServlet.java:63)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:643)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.hadoop.security.authentication.server.Authentication
Looking at the Sqoop source at that location:
    for (int i = 0; i < inputs.size(); i++) {
      JSONObject input = (JSONObject) inputs.get(i);
      MInputType type = MInputType.valueOf((String) input.get(ConfigInputConstants.CONFIG_INPUT_TYPE));
      // part of the code omitted ...
      switch (type) {
        case STRING: {
          // error location
          long size = (Long) input.get(ConfigInputConstants.CONFIG_INPUT_SIZE);
          mInput = new MStringInput(name, sensitive.booleanValue(), editable, overrides, (short) size);
          break;
        }
        // part of the code omitted ...
      }
    }
ConfigInputConstants.CONFIG_INPUT_SIZE is the string constant "size". The value this code reads for "size" is null, and the exception is thrown when that null is cast and converted to a long. Following the stack trace upward leads to the JobBean class, which loads the JSON data sent by Hue and sets it on the job. The code below reads the "from-config-values" data, that is, the configuration of the source (from) link.
    ....
    static final String FROM_CONFIG_VALUES = "from-config-values";

    private MJob restoreJob(Object obj) {
      // part of the code omitted ...
      JSONArray fromConfigJson = (JSONArray) object.get(FROM_CONFIG_VALUES);
      // part of the code omitted ...
      // error point
      List<MConfig> fromConfig = restoreConfigList(fromConfigJson);
      // part of the code omitted ...
    }
From the analysis above, the cause lies in the from link configuration that Hue sends: an input in the "from-config-values" item either has no "size" field or carries an empty value for it, and that triggers the error.
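The failure mode can be mimicked in Python as a rough analogy, not as Sqoop's actual code: a lookup on a missing key yields None, and the numeric conversion that follows is what fails. The function name restore_string_input and the two sample dictionaries are hypothetical and exist only for illustration.

```python
def restore_string_input(input_json):
    """Hypothetical stand-in for the STRING branch of Sqoop's restoreConfig."""
    # dict.get returns None for a missing key, much like JSONObject.get
    # returns null in the Java code.
    raw_size = input_json.get("size")
    # int(None) raises TypeError. This plays the role of the
    # NullPointerException raised in Java when a null Long is
    # converted to a primitive long.
    size = int(raw_size)
    return {"name": input_json["name"], "max_length": size}


with_size = {"name": "fromJobConfig.inputDirectory", "type": "STRING", "size": 255}
without_size = {"name": "fromJobConfig.inputDirectory", "type": "STRING"}

print(restore_string_input(with_size))

try:
    restore_string_input(without_size)
except TypeError as exc:
    print("conversion failed:", exc)
```

The point is only that the lookup itself succeeds silently; the error surfaces one step later, at the conversion, which matches the line the stack trace points to.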
The Sqoop REST API documentation for this interface, https://sqoop.apache.org/docs/1.99.3/restapi.html#v1-job-post-create-job, gives a JSON format in which the "size" field is present:
    // part omitted ...
    from-config-values: [
      {
        id: 2,
        inputs: [
          {
            id: 2,
            name: "fromJobConfig.inputDirectory",
            value: "hdfs%3A%2F%2Fvbsqoop-1.ent.cloudera.com%3A8020%2Fuser%2Froot%2Fjob1",
            type: "STRING",
            size: 255,
            sensitive: false
          }
        ],
        name: "fromJobConfig",
        type: "JOB"
      }
    ],
    // part omitted ...
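When debugging this kind of problem it can help to inspect the JSON that is actually sent and list the STRING inputs that lack "size". The following is a minimal sketch under assumptions: find_inputs_missing_size is a hypothetical helper, not part of Hue or Sqoop, and the sample payload is a trimmed copy of the documented format with the "size" field deliberately left out.

```python
import json


def find_inputs_missing_size(job, sections=("from-config-values",)):
    """Return the names of STRING inputs that carry no "size" field.

    Hypothetical helper: `job` is a dict shaped like the documented
    job JSON, and `sections` names the config lists to inspect.
    """
    missing = []
    for section in sections:
        for config in job.get(section, []):
            for item in config.get("inputs", []):
                if item.get("type") == "STRING" and item.get("size") is None:
                    missing.append(item.get("name"))
    return missing


# Trimmed sample payload; "size" is intentionally absent.
payload = json.loads("""
{
  "from-config-values": [
    {
      "id": 2,
      "name": "fromJobConfig",
      "type": "JOB",
      "inputs": [
        {"id": 2, "name": "fromJobConfig.inputDirectory",
         "type": "STRING", "sensitive": false}
      ]
    }
  ]
}
""")

print(find_inputs_missing_size(payload))
```

Other config lists in the job JSON can be checked by passing their keys through the sections parameter.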
Next, look at how Hue calls this interface: hue-3.9.0/apps/sqoop/src/sqoop/api/job.py
    @never_cache
    def create_job(request):
      # part of the code omitted
      ......
      d = json.loads(smart_str(request.POST['job']))
      job = client.Job.from_dict(d)

      try:
        c = client.SqoopClient(conf.SERVER_URL.get(), request.user.username, request.LANGUAGE_CODE)
        response['job'] = c.create_job(job).to_dict()
      except RestException, e:
        response.update(handle_rest_exception(e, _('Could not create job.')))
      except SqoopException, e:
        response['status'] = 100
        response['errors'] = e.to_dict()
      return JsonResponse(response)
Finally, I found the place where Hue filters out size: hue-3.9.0/apps/sqoop/src/sqoop/client/config.py
    def to_dict(self):
      d = {
        'id': self.id,
        'type': self.type,
        'name': self.name,
        'sensitive': self.sensitive,
      }
      if self.value:
        d['value'] = self.value
      if self.size != -1:
        d['size'] = self.size
      if self.values:
        d['values'] = ','.join(self.values)
      return d
When size is -1, Hue leaves the "size" field out of the JSON entirely, so Sqoop gets null when it reads "size" and throws the NullPointerException.
Workaround: modify the to_dict method in config.py and remove the filter on size, as follows:
    def to_dict(self):
      d = {
        'id': self.id,
        'type': self.type,
        'name': self.name,
        'sensitive': self.sensitive,
      }
      if self.value:
        d['value'] = self.value
      d['size'] = self.size
      if self.values:
        d['values'] = ','.join(self.values)
      return d
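The effect of the change can be checked outside Hue with a small stand-in. This is only a sketch under assumptions: the Input class below is a hypothetical, simplified substitute for the class in config.py, with the constructor defaults chosen for illustration, and the two methods copy the logic of the original and the modified to_dict.

```python
import json


class Input(object):
    """Hypothetical, simplified stand-in for the input class in config.py."""

    def __init__(self, id, type, name, sensitive=False, value=None,
                 size=-1, values=None):
        self.id = id
        self.type = type
        self.name = name
        self.sensitive = sensitive
        self.value = value
        self.size = size
        self.values = values

    def _base(self):
        # Fields that both variants always emit.
        d = {'id': self.id, 'type': self.type, 'name': self.name,
             'sensitive': self.sensitive}
        if self.value:
            d['value'] = self.value
        return d

    def to_dict_original(self):
        d = self._base()
        if self.size != -1:  # original behaviour: drop size when it is -1
            d['size'] = self.size
        if self.values:
            d['values'] = ','.join(self.values)
        return d

    def to_dict_patched(self):
        d = self._base()
        d['size'] = self.size  # modified behaviour: always send size
        if self.values:
            d['values'] = ','.join(self.values)
        return d


inp = Input(id=2, type='STRING', name='fromJobConfig.inputDirectory')
print('size' in inp.to_dict_original())
print(json.dumps(inp.to_dict_patched(), sort_keys=True))
```

With the default size of -1, the original variant produces a dictionary without a "size" key, while the modified variant always includes it, so the server side no longer reads null for that field.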
Do not hard-code size to a constant instead. Doing so may cause an org.apache.sqoop.server.common.ServerError, because the from link, to link, or driver configuration that Sqoop rebuilds from the parameters passed by Hue would then be inconsistent with the configuration Sqoop itself holds.
A solution to the NullPointerException reported when integrating Hue with Sqoop