Simply put, I first used AWS about two years ago, a period that coincided with the rapid growth of cloud computing and big data. My free-tier instance has been running for nearly a year and is about to enter the billing cycle. I have also used some Aliyun products along the way (A Lot Of Money), and I now pay DigitalOcean $5 a month, but AWS is the only platform whose training I have attended, and several projects were completed on AWS, so it is the one I understand best. This post records that experience, both as a summary and as a way of organizing what I have learned.
For the compute-intensive tasks I commonly run, the combination of EC2 spot instances + SQS + S3 is a good fit. When auto scaling is not required, the main workflow is:
- Request EC2 spot instances
- Send task messages into SQS
- Start the Python script on each EC2 instance
- Upload the Python script to S3 and download it on the EC2 instances
- Get messages from SQS and save results to S3
SQS is a message queue offered as a managed service, which makes it very useful for decoupling the modules of the architecture. S3 stores the input data, the processing results, and the startup scripts. EC2 instances are the nodes that process the data.
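The glue between these pieces is just a small JSON message body carrying a task id. A minimal sketch of the encode/decode helpers (the field name `key` follows the SQS snippet later in the post; `make_task`/`parse_task` are illustrative names, and the stdlib `json` is used here in place of `simplejson`):

```python
import json

def make_task(task_id):
    # serialize a task id into the JSON body that goes onto SQS
    return json.dumps({"key": str(task_id)})

def parse_task(body):
    # recover the task id from a received message body
    return json.loads(body)["key"]
```

Keeping the body a one-field JSON object means producers and consumers can evolve independently as long as they agree on this tiny schema.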
1. S3
```python
import boto

s3 = boto.connect_s3(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
bucket = s3.get_bucket(DATA_BUCKET)
key = bucket.new_key(KEY_BOOT)
# key.set_contents_from_string(startup)
key.set_contents_from_filename(SCRIPT_BOOT)
```
2. SQS
```python
import boto
import simplejson

def send_message_sqs(q, id):
    message = q.new_message(body=simplejson.dumps({"key": id}))
    print q.write(message)

sqs = boto.connect_sqs(KEY, SECRET)
q = sqs.create_queue(REQUEST_QUEUE)
for id in ids:
    send_message_sqs(q, str(id))
```
3. EC2 spot instances
```python
request = conn.request_spot_instances(
    price=AWS_MAX_PRICE,
    image_id=AWS_IMAGE_ID,
    count=AWS_INSTANCE_COUNT,
    type=AWS_REQUEST_TYPE,
    key_name=AWS_KEY_NAME,
    security_groups=AWS_SECURITY_GROUPS,
    instance_type=AWS_INSTANCE_TYPE,
    placement=AWS_PLACEMENT,
    user_data=BOOTSCRIPT % {
        "KEY": AWS_ACCESS_KEY_ID,
        "SECRET": AWS_SECRET_ACCESS_KEY,
        "DATA_BUCKET": DATA_BUCKET,
        "KEY_BOOT": KEY_BOOT,
        "BOOT_SCRIPT_PATH": SCRIPT_BOOT_PATH_SPOT,
    })
```
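The `BOOTSCRIPT` template passed as `user_data` is not shown in the article, so the sketch below is entirely an assumption: a cloud-init shell script that exports the injected credentials, pulls the worker script from S3 with boto, and runs it. All concrete values are placeholders.

```python
# Hypothetical user_data template in the spirit of the BOOTSCRIPT above;
# every detail here is assumed, not taken from the article.
BOOTSCRIPT = """#!/bin/bash
# runs via cloud-init when the spot instance first boots
export AWS_ACCESS_KEY_ID=%(KEY)s
export AWS_SECRET_ACCESS_KEY=%(SECRET)s
# pull the worker script from S3, then run it
python -c "import boto; boto.connect_s3().get_bucket('%(DATA_BUCKET)s').get_key('%(KEY_BOOT)s').get_contents_to_filename('%(BOOT_SCRIPT_PATH)s')"
python %(BOOT_SCRIPT_PATH)s
"""

user_data = BOOTSCRIPT % {
    "KEY": "AKIAEXAMPLE",            # placeholder credentials
    "SECRET": "secret-placeholder",
    "DATA_BUCKET": "my-data-bucket", # placeholder bucket name
    "KEY_BOOT": "boot.py",
    "BOOT_SCRIPT_PATH": "/tmp/boot.py",
}
```

The `%(...)s` keys match the dict passed to `request_spot_instances` above, which is what ties the template to the request.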
- http://aws.amazon.com/ec2/purchasing-options/spot-instances/spot-and-science/
- http://www.slideshare.net/AmazonWebServices/massive-message-processing-with-amazon-sqs-and-amazon-dynamodb-arc301-aws-reinvent-2013-28431182