I have recently been designing and developing a generic RESTful query service (https://github.com/lalaguozhe/polestar-1). The project is named Polestar (after Polaris, the North Star: a guiding light, in the hope that everyone's queries will converge on it). Until now, Hive statements have mostly been run through Hive Server, but Hive Server 1 has several shortcomings, such as:
1. The compiler has a memory leak
2. The Thrift API does not support multiple connections or client sessions
3. A submitted statement blocks the caller, so execution status cannot be retrieved in real time
4. Authentication (Kerberos) is not supported
These issues are not resolved until HiveServer2 (https://issues.apache.org/jira/browse/HIVE-2935), so we developed a unified RESTful query service. Users submit query statements (Hive, Shark, Phoenix, and so on) through the RESTful API; a back-end worker node runs each statement through the corresponding command-line client and returns the results and intermediate status information to the user.
Polestar architecture
The top tier is the application layer: Hive Web (users write query statements in an edit box), the operations tool (the DW reporting tool, which generates queries from query templates), and ad hoc queries from other applications. All requests pass through HAProxy + Keepalived for load balancing. HAProxy supports several balancing algorithms; its default is roundrobin, but we use source here, which hashes the client's IP address against the server weights, so a given client keeps reaching the same worker node as long as the set of servers does not change. The request is then forwarded to a worker node. Each worker node is deployed independently with the Hive and Shark clients installed. It starts a separate process for the execution engine the user specifies and captures that process's stdout and stderr.
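The worker step described above (pick a command-line client by engine, run it as a child process, capture stdout and stderr) can be sketched roughly as follows. This is only an illustration, not Polestar's actual code: the class name EngineRunner, the RunResult holder, and the engine-to-command mapping are hypothetical, and it assumes both clients accept `-e <statement>` (the Hive CLI does; for Shark this is an assumption).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: runs one query through a command-line client.
class EngineRunner {

    /** Exit code plus everything the child process wrote. */
    static final class RunResult {
        final int exitCode;
        final String stdout;
        final String stderr;

        RunResult(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    /** Maps the user-specified engine to a command line (assumed mapping). */
    static List<String> commandFor(String engine, String sql) {
        switch (engine.toLowerCase(Locale.ROOT)) {
            case "hive":
                return Arrays.asList("hive", "-e", sql);
            case "shark":
                return Arrays.asList("shark", "-e", sql);
            default:
                throw new IllegalArgumentException("Unsupported engine: " + engine);
        }
    }

    /** Starts the command and collects stdout and stderr until it exits. */
    static RunResult run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).start();
        StringBuilder out = new StringBuilder();
        StringBuilder err = new StringBuilder();
        // Drain both streams on their own threads so a full pipe buffer
        // on one stream cannot stall the child process.
        Thread outReader = drain(process.getInputStream(), out);
        Thread errReader = drain(process.getErrorStream(), err);
        int exitCode = process.waitFor();
        outReader.join();
        errReader.join();
        return new RunResult(exitCode, out.toString(), err.toString());
    }

    private static Thread drain(InputStream stream, StringBuilder sink) {
        Thread reader = new Thread(() -> {
            try (BufferedReader lines = new BufferedReader(
                    new InputStreamReader(stream, StandardCharsets.UTF_8))) {
                String line;
                while ((line = lines.readLine()) != null) {
                    sink.append(line).append('\n');
                }
            } catch (IOException e) {
                sink.append("[stream read failed: ").append(e.getMessage()).append("]\n");
            }
        });
        reader.start();
        return reader;
    }
}
```

A real worker would likely read the streams incrementally rather than waiting for the process to exit, since intermediate status (for example job progress lines on stderr) has to be reported while the query is still running.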
RESTful API:
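import javax.ws.rs.Consumes;
import javax.ws.rs.GET;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import javax.ws.rs.core.Response.Status;

// IQueryService, DefaultQueryService, Query, QueryResult and QueryStatus
// are Polestar's own classes.
@Path("/query")
public class PolestarController {
    private IQueryService queryService = DefaultQueryService.getInstance();

    // GET /query: returns a query id as plain text.
    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String getQueryId() {
        return queryService.getQueryId();
    }

    // GET /query/status/{id}: status information for a query.
    @GET
    @Path("/status/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public QueryStatus getQueryStatus(@PathParam("id") String id) {
        return queryService.getStatusInfo(id);
    }

    // GET /query/download/{filename}: downloads a result data file.
    @GET
    @Path("/download/{filename}")
    @Produces(MediaType.APPLICATION_OCTET_STREAM)
    public Response get(@PathParam("filename") String filename) {
        return Response.ok(queryService.getDataFile(filename)).build();
    }

    // GET /query/cancel/{id}: cancels a query.
    @GET
    @Path("/cancel/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Boolean cancelQuery(@PathParam("id") String id) {
        return queryService.cancel(id);
    }

    // POST /query/post: submits a query and responds with 201 Created.
    @POST
    @Path("/post")
    @Consumes(MediaType.APPLICATION_JSON)
    @Produces(MediaType.APPLICATION_JSON)
    public Response postQuery(Query query) {
        QueryResult result = queryService.postQuery(query);
        return Response.status(Status.CREATED).entity(result).build();
    }
}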
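A client of the endpoints above might look like the following minimal sketch, which uses only the JDK's HttpURLConnection. The class name PolestarClientSketch and the base URL are hypothetical; the paths are the ones declared on the controller. It assumes ids and file names are URL-safe, and it deliberately leaves the JSON body of a query to the caller, because the fields of the Query class are not shown here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical client for the /query endpoints; not part of Polestar itself.
class PolestarClientSketch {
    private final String baseUrl;

    PolestarClientSketch(String baseUrl) {
        // Drop a trailing slash so paths can be appended uniformly.
        this.baseUrl = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
    }

    URI queryIdUri() { return URI.create(baseUrl + "/query"); }

    URI postUri() { return URI.create(baseUrl + "/query/post"); }

    URI statusUri(String id) { return URI.create(baseUrl + "/query/status/" + id); }

    URI cancelUri(String id) { return URI.create(baseUrl + "/query/cancel/" + id); }

    URI downloadUri(String filename) {
        return URI.create(baseUrl + "/query/download/" + filename);
    }

    /** Issues a GET and returns the response body as text. */
    String get(URI uri) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestMethod("GET");
        return readBody(conn);
    }

    /** POSTs a JSON document and returns the response body as text. */
    String postJson(URI uri, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        return readBody(conn);
    }

    private static String readBody(HttpURLConnection conn) throws IOException {
        // The "\\A" delimiter makes the scanner return the whole stream at once.
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        } finally {
            conn.disconnect();
        }
    }
}
```

A typical flow would presumably be: get(queryIdUri()) to obtain an id, postJson(postUri(), ...) to submit the statement, repeated get(statusUri(id)) calls to poll progress, and finally a request to downloadUri(...) for the result file. Because the load balancer hashes on the client IP, these calls are expected to reach the same worker node.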