AWS Machine Learning Approach (1): Comprehend


An exploration of AWS Machine Learning (1): Comprehend, the natural language processing service

1. Comprehend Service Introduction

1.1 Features

The Amazon Comprehend service uses natural language processing (NLP) to analyze text. It is very simple to use.

    • Input: any UTF-8 text
    • Output: Comprehend returns a set of entities (Entity), a number of key phrases (Key Phrase), the detected language (Language), the sentiment (Sentiment: Positive, Negative, Neutral, or Mixed), and a syntax analysis of each word (Syntax)
    • Invocation modes: synchronous single-document analysis, asynchronous multi-document processing, and batch processing
    • Supported languages: the language-detection API recognizes over a hundred languages; the remaining APIs support only English and Spanish
    • Whether preprocessing is required: no. AWS itself continues to train the underlying models to improve accuracy; this is transparent to the user
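As a sketch of the synchronous API, the following uses the boto3 SDK's Comprehend client; the region choice and the `pick_dominant_language` helper are this example's own assumptions, not part of the original article:

```python
def pick_dominant_language(detect_response):
    """Pick the highest-scoring language from a detect_dominant_language response."""
    return max(detect_response["Languages"], key=lambda l: l["Score"])["LanguageCode"]

def analyze(text):
    """Run the synchronous Comprehend analyses on one UTF-8 string."""
    import boto3  # AWS SDK; imported here so the helper above works without it
    client = boto3.client("comprehend", region_name="us-east-1")  # assumed region
    lang = pick_dominant_language(client.detect_dominant_language(Text=text))
    return {
        "language": lang,
        "entities": client.detect_entities(Text=text, LanguageCode=lang)["Entities"],
        "key_phrases": client.detect_key_phrases(Text=text, LanguageCode=lang)["KeyPhrases"],
        "sentiment": client.detect_sentiment(Text=text, LanguageCode=lang)["Sentiment"],
        "syntax": client.detect_syntax(Text=text, LanguageCode=lang)["SyntaxTokens"],
    }
```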

A typical asynchronous batch workflow:

    1. Save the documents in AWS S3
    2. Start one or more Comprehend jobs to process these documents
    3. Monitor the status of these jobs
    4. Retrieve the analysis results from another S3 bucket
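The four steps above can be sketched with the boto3 SDK, using a sentiment-detection job as the example; the bucket names and role ARN below are placeholders:

```python
def build_sentiment_job_request(input_s3_uri, output_s3_uri, role_arn):
    """Request body for an asynchronous Comprehend sentiment-detection job."""
    return {
        "InputDataConfig": {"S3Uri": input_s3_uri,         # step 1: documents in S3
                            "InputFormat": "ONE_DOC_PER_LINE"},
        "OutputDataConfig": {"S3Uri": output_s3_uri},      # step 4: results bucket
        "DataAccessRoleArn": role_arn,                     # role allowed to read/write S3
        "LanguageCode": "en",
    }

def run_batch_job():
    import boto3  # AWS SDK; imported here so the builder stays usable without it
    comprehend = boto3.client("comprehend", region_name="us-east-1")
    request = build_sentiment_job_request(
        "s3://my-input-bucket/reviews/",                    # placeholder bucket
        "s3://my-output-bucket/results/",                   # placeholder bucket
        "arn:aws:iam::123456789012:role/ComprehendS3Role",  # placeholder ARN
    )
    job = comprehend.start_sentiment_detection_job(**request)   # step 2
    props = comprehend.describe_sentiment_detection_job(        # step 3
        JobId=job["JobId"])["SentimentDetectionJobProperties"]
    return props["JobStatus"]  # e.g. SUBMITTED, IN_PROGRESS, COMPLETED
```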
1.2 Example

In the diagram, the left side is a paragraph of input text, and the right side is the output of the Comprehend API: entities, key phrases, sentiment, and language.

An example of testing in the console:

Entering US President Trump's latest tweet into the Comprehend console, the service judges it to be negative:

An example of testing with the CLI:

    aws comprehend detect-dominant-language --region us-east-1 --text "hello World"
    {
        "Languages": [
            {
                "LanguageCode": "en",
                "Score": ...
            }
        ]
    }
2. An Example Scenario

2.1 Scenario Description

Deployment architecture diagram:

Architecture description:

    • The scenario runs in a single AWS region and uses the Comprehend API of that region
    • The region contains a VPC with two public subnets; one of them hosts an EC2 instance with phpMyAdmin installed, used to connect to and manage the Aurora instance in the private subnet
    • The VPC also has a private subnet, containing an Aurora instance that can be accessed only from within the VPC
    • A Lambda function runs inside the VPC; because the function requires direct access to the Aurora instance, it must be placed in the VPC
    • The Lambda function also needs to reach the Comprehend API. AWS does not currently provide a VPC endpoint for this API, so a NAT gateway is required; the Lambda function reaches the Comprehend API through that gateway

Operation Process:

    1. Users work with the Aurora database through phpMyAdmin. The database has a table named ReviewInfo; each row represents a text message, and its three columns, ReviewId, Message, and Sentiment, hold the record ID, the message content, and the sentiment, respectively.
    2. Whenever a user inserts a message (1 and 2 in the figure), the Lambda function is triggered automatically (3 in the figure). It calls the Comprehend API (4 in the figure) to get the sentiment of the message, then writes it back to the Sentiment field of the record in Aurora (5 in the figure).
    3. The user then queries the sentiment of the record from phpMyAdmin.
2.2 Implementation

(1) Following the deployment diagram, create the AWS resources you need, including the EC2 instance, the NAT gateway, the VPC, the phpMyAdmin installation, and so on (procedure omitted). Create an Aurora instance in the VPC and configure phpMyAdmin to point to that instance. Then create a Python 2.7 Lambda function in the VPC with the following contents:

    import pymysql
    import json
    import boto3
    import os

    def lambda_handler(event, context):
        comprehend = boto3.client(service_name='comprehend')
        jsonresponse = json.dumps(
            comprehend.detect_sentiment(Text=event['ReviewText'],
                                        LanguageCode='en'),
            sort_keys=True, indent=4)
        json_object = json.loads(jsonresponse)
        sentiment = json_object['Sentiment']
        db = pymysql.connect(host=os.environ['host'],
                             user=os.environ['user'],
                             passwd=os.environ['password'],
                             db=os.environ['db'],
                             autocommit=True)
        add_order = "UPDATE ReviewInfo SET Sentiment=%s WHERE ReviewId=%s;"
        db.cursor().execute(add_order, (sentiment, event['ReviewId']))
        db.commit()
        db.close()

The function is very simple. A brief description:

    1. The database connection information (host, user, password, db) is passed in through environment variables.
    2. A Comprehend client is first created from the boto3 library.
    3. The message content is taken from the incoming event.
    4. The detect_sentiment function of the Comprehend service is called to get the sentiment of the message.
    5. The pymysql library is used to connect to the database.
    6. The Sentiment column of the record corresponding to the message is updated.
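For illustration, the event payload the function receives and the shape of the detect_sentiment response look roughly like this; the concrete values and scores below are invented for the example:

```python
import json

# The JSON payload that the Aurora stored procedure sends to the Lambda
# function (illustrative values):
event = json.loads('{ "ReviewId" : "1", "ReviewText" : "I love this product" }')

# detect_sentiment returns a dict shaped roughly like this
# (the score values here are made up):
response = {
    "Sentiment": "POSITIVE",
    "SentimentScore": {"Positive": 0.97, "Negative": 0.01,
                       "Neutral": 0.02, "Mixed": 0.0},
}

# The function stores response["Sentiment"] in the Sentiment column of the
# row identified by event["ReviewId"].
sentiment = response["Sentiment"]
```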

(2) Using phpMyAdmin, create a database named Comprehend_demo in the Aurora instance.

(3) In phpMyAdmin, execute the following SQL statement to create a table named ReviewInfo with three fields in the database.

    CREATE TABLE Comprehend_demo.ReviewInfo (
        ReviewId NUMERIC PRIMARY KEY,
        Message TEXT NOT NULL,
        Sentiment VARCHAR(30) NOT NULL
    );

(4) In phpMyAdmin, execute the following SQL statement to create a stored procedure named Aurora_To_Lambda in the Comprehend_demo database. Note that you need to replace <Comprehend_Lambda_ARN> with the ARN of the Lambda function created in step (1). The stored procedure invokes the Lambda function specified by that ARN, passing in the ReviewId and ReviewText parameter values.

    DROP PROCEDURE IF EXISTS Comprehend_demo.Aurora_To_Lambda;
    DELIMITER ;;
    CREATE PROCEDURE Comprehend_demo.Aurora_To_Lambda (IN ReviewId NUMERIC, IN ReviewText TEXT)
        LANGUAGE SQL
    BEGIN
        CALL mysql.lambda_async('<Comprehend_Lambda_ARN>',
             CONCAT('{ "ReviewId" : "', ReviewId, '", "ReviewText" : "', ReviewText, '" }'));
    END;;
    DELIMITER ;

Results:

(5) Execute the following SQL statement in phpMyAdmin to create a trigger in the database. The trigger fires whenever a new row is inserted into the ReviewInfo table: it takes the row's ReviewId and message text, then calls the stored procedure created in step (4), which in turn invokes the Lambda function.

    DELIMITER ;;
    CREATE TRIGGER TR_ReviewInfo_Insert
        AFTER INSERT ON Comprehend_demo.ReviewInfo
        FOR EACH ROW
    BEGIN
        SELECT NEW.ReviewId, NEW.Message INTO @ReviewId, @ReviewText;
        CALL Comprehend_demo.Aurora_To_Lambda(@ReviewId, @ReviewText);
    END;;
    DELIMITER ;

(6) Because Aurora needs to invoke a Lambda function, you must configure an IAM role for Aurora that has permission to invoke it.

First, in the IAM console, create an IAM policy that grants the lambda:InvokeFunction permission.
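Such a policy might look like the following JSON; the resource scope is left wide open here for simplicity, and in practice you should restrict it to your function's ARN:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": "*"
        }
    ]
}
```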

Then create an IAM Role that contains the policy.

In the Aurora console, create a new DB cluster parameter group:

Set its aws_default_lambda_role parameter to the ARN of the IAM role created above, for example:

Then assign the new parameter group (Group2 in this example) to the Aurora instance; the instance must be restarted for the modification to take effect.

Finally, set Aurora's IAM role to the role above in the following interface:

(7) Do a simple test by inserting a row of data. If the following error occurs, Aurora successfully called the Lambda function, but the Lambda function cannot connect to the Comprehend service. Check the network path from the Lambda function through the NAT gateway to the Comprehend API, primarily the VPC's route tables.

(8) After the network path is confirmed, the following error indicates that the Lambda function is not authorized to invoke the Comprehend API.

(9) Configure the Lambda function's permission to invoke the Comprehend API.

First create an IAM policy that grants full permissions on the Comprehend API. Alternatively, you can grant only the sentiment API permissions.
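A minimal policy of the narrower kind might look like this, assuming comprehend:DetectSentiment is the only action the function needs (Comprehend actions do not support resource-level scoping, so the resource is "*"):

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "comprehend:DetectSentiment",
            "Resource": "*"
        }
    ]
}
```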

Then create an IAM role that attaches this policy, and set it on the Lambda function as its execution role.

(10) At this point, the entire path is open. When you insert a row into the ReviewInfo table with an SQL statement in phpMyAdmin, the Lambda function automatically updates the Sentiment field of that row.
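A quick end-to-end check from phpMyAdmin might look like the following; the message text is arbitrary, and the Sentiment value appears once the Lambda function has written it back:

```sql
INSERT INTO Comprehend_demo.ReviewInfo (ReviewId, Message)
VALUES (1, 'I love this product!');

-- after a moment, the Lambda function has filled in the sentiment:
SELECT ReviewId, Message, Sentiment FROM Comprehend_demo.ReviewInfo;
```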

Reference Documentation:

    • aws.amazon.com/cn/blogs/machine-learning/building-text-analytics-solutions-with-amazon-comprehend-and-amazon-relational-database-service/

