TPU usage instructions

Source: Internet
Author: User

1 TPU Classification and charging standard

1.1 Classification and billing instructions
Area                  Preemptible TPU   Cloud TPU
United States         $1.35/hour        $4.50/hour
Europe                $1.485/hour       $4.95/hour
Asia Pacific Region   $1.566/hour       $5.22/hour
  • A preemptible TPU is a Cloud TPU that the service can terminate (preempt) at any time when it needs to assign the resources to another task. Preemptible TPUs cost much less than ordinary TPUs.
  • TPUs are billed in increments of 1 second.
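
The hourly rates and the per-second billing rule above can be combined into a rough cost estimate. The following is only a sketch: the rates are copied from the table in this article and may be out of date, and the names HOURLY_RATES_USD and tpu_cost are made up for this illustration and are not part of any Google Cloud API.

```python
from decimal import Decimal

# Hourly rates (USD) copied from the table above; these are assumptions
# for illustration, so check the current price list before relying on them.
HOURLY_RATES_USD = {
    ("us", "preemptible"): Decimal("1.35"),
    ("us", "regular"): Decimal("4.50"),
    ("europe", "preemptible"): Decimal("1.485"),
    ("europe", "regular"): Decimal("4.95"),
    ("asia-pacific", "preemptible"): Decimal("1.566"),
    ("asia-pacific", "regular"): Decimal("5.22"),
}


def tpu_cost(region, tpu_type, seconds):
    """Estimate the TPU charge for a run billed in 1-second increments."""
    hourly = HOURLY_RATES_USD[(region, tpu_type)]
    return hourly * Decimal(seconds) / Decimal(3600)


# Example: 90 minutes on a regular TPU in the US region.
print(tpu_cost("us", "regular", 90 * 60))
```

Decimal is used instead of float so that prices such as 1.485 are not affected by binary rounding.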

To connect to a TPU, you must configure a virtual machine. Note that the virtual machine and the TPU are billed separately.

In other words, Cloud TPU billing starts only once the TPU is started and ends when the TPU is stopped or deleted. To stop the TPU, run ctpu pause or gcloud compute tpus stop. Likewise, the virtual machine is billed only while it is running.

If the virtual machine is stopped but the Cloud TPU is not, you continue to pay for the Cloud TPU. If the Cloud TPU is stopped or deleted but the virtual machine is not stopped, you continue to pay for the virtual machine.
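
The separate-billing rule can be written as a tiny decision function. This is only an illustrative model of the rule as described here; billed_resources is a hypothetical helper, not a Google Cloud API.

```python
def billed_resources(vm_running, tpu_running):
    """Return the resources that keep accruing charges.

    Models the rule described above: the VM and the TPU are billed
    independently, each only while it is running.
    """
    billed = []
    if vm_running:
        billed.append("Compute Engine VM")
    if tpu_running:
        billed.append("Cloud TPU")
    return billed


# VM stopped, TPU still running: the TPU is still billed.
print(billed_resources(vm_running=False, tpu_running=True))
```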

1.2 Useful links
    • Compute Engine Price List
    • Compute Engine Price Calculator
1.3 Example of price calculation

The following example explains how to calculate the total cost of a training job that uses TPU resources and Compute Engine instances in the US region.

A machine learning research institute provisioned a virtual machine by creating a Compute Engine instance with the n1-standard-2 machine type. They also created a TPU resource. The cumulative usage time for both the Compute Engine instance and the TPU resource was 10 hours. To calculate the total cost of the training job, the institute must add up the following:

    • Total cost for all Compute Engine instances
    • Total cost of all Cloud TPU resources

Resource                                Price per machine per hour (USD)   Number of machines   Billed hours   Total cost of resource
Compute Engine n1-standard-2 instance   $0.095                             1                    10             $0.95
Cloud TPU resource                      $4.50                              1                    10             $45.00
Total cost of training job                                                                                     $45.95
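
The arithmetic behind this table is price × number of machines × hours for each resource, summed over all resources. A minimal sketch, using the prices quoted in this article (total_cost and job are hypothetical names chosen for the illustration):

```python
from decimal import Decimal


def total_cost(resources):
    """Sum price * machine count * hours over all resources."""
    return sum(
        (price * count * hours for price, count, hours in resources),
        Decimal("0"),
    )


# (hourly price in USD, number of machines, billed hours)
job = [
    (Decimal("0.095"), 1, 10),  # Compute Engine n1-standard-2 instance
    (Decimal("4.50"), 1, 10),   # Cloud TPU
]

print(total_cost(job))
```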

Price example using a preemptible TPU

In the following example, the resources and duration are the same as in the previous example, but this time the research institute decided to use a preemptible TPU to save costs. A preemptible TPU costs $1.35 per hour instead of the $4.50 per hour of an ordinary TPU.

Resource                                Price per machine per hour (USD)   Number of machines   Billed hours   Total cost of resource
Compute Engine n1-standard-2 instance   $0.095                             1                    10             $0.95
Preemptible TPU                         $1.35                              1                    10             $13.50
Total cost of training job                                                                                     $14.45

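
To see how much the preemptible option saves for this 10-hour job, the two totals can be compared directly. This sketch uses only the prices quoted in this article; job_cost and the constant names are hypothetical.

```python
from decimal import Decimal

# Hourly prices (USD) quoted in this article.
VM_RATE = Decimal("0.095")
REGULAR_TPU_RATE = Decimal("4.50")
PREEMPTIBLE_TPU_RATE = Decimal("1.35")


def job_cost(tpu_rate, hours, vm_rate=VM_RATE):
    """Cost of one VM plus one TPU running for the same number of hours."""
    return (vm_rate + tpu_rate) * hours


regular = job_cost(REGULAR_TPU_RATE, 10)
preemptible = job_cost(PREEMPTIBLE_TPU_RATE, 10)
savings = regular - preemptible

print(regular, preemptible, savings)
```

The saving comes entirely from the TPU line; the VM cost is the same in both cases.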
2 Use steps

2.1 Creating a GCP project

Following the link to Google Cloud Platform takes you to the console.

Click Create Project and enter a project name to create the project. You may need to refresh the web page before the new project appears.

2.2 Creating a Cloud Storage bucket

Put simply, Cloud Storage is used here to store the model training data and training results. The official description is that it is a powerful and cost-effective storage solution for unstructured objects, ideal for hosting real-time web content, storing data for analytics, archiving, and backup.

Note: To use Cloud Storage, you need to enable billing.

2.2.1 Creating a bucket

Buckets hold the objects (files of any type) that you want to store in Cloud Storage.

    • First, select "Storage" on the left side of the console (the menu is available in both Chinese and English) to go to the Cloud Storage page.
    • Then click "Create bucket".
    • Enter a bucket name to finish creating it. The name must be unique; otherwise the bucket cannot be created.

2.2.2 Uploading and sharing objects

To start using your bucket, simply upload an object and grant access to it.

2.2.3 Cleanup

In the final step, you delete the buckets and objects that you created earlier for this tutorial.

2.3 Opening Cloud Shell and using the ctpu tool

The Cloud Shell button is in the upper-right corner of the console.



Enter ctpu print-config to view the configuration information. My output was:

    ctpu configuration:
            name: hkbuautoml
            project: test01-219602
            zone: us-central1-b
    If you would like to change the configuration for a single command invocation, please use the command line flags.

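
If you want to use these values in a script, the text printed by ctpu print-config can be parsed into a dictionary. The following is a sketch that assumes the output keeps the "key: value" layout shown above; parse_ctpu_config is a hypothetical helper, not part of the ctpu tool.

```python
def parse_ctpu_config(text):
    """Extract name, project and zone from `ctpu print-config` output.

    Assumes the "key: value" layout shown in this article; other
    lines (header, trailing hint) are ignored.
    """
    config = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in {"name", "project", "zone"}:
            config[key] = value.strip()
    return config


sample_output = """ctpu configuration:
        name: hkbuautoml
        project: test01-219602
        zone: us-central1-b
If you would like to change the configuration for a single command invocation, please use the command line flags."""

print(parse_ctpu_config(sample_output))
```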
2.3.1 Creating the Compute Engine VM and TPU

The command is: ctpu up [optional: --name --zone]

Note: The name may contain only lowercase letters and numbers; uppercase letters or other characters cause an error.
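
A name can be checked before calling ctpu up so that the command does not fail halfway. The sketch below encodes the rule exactly as stated in this article (lowercase letters and digits only); the real naming rules may differ, and is_valid_name and build_ctpu_up are hypothetical helpers. The --name and --zone flags are the ones mentioned above.

```python
import re

# Rule as stated in this article: lowercase letters and digits only.
# This is an assumption; the actual ctpu naming rules may differ.
NAME_PATTERN = re.compile(r"[a-z0-9]+")


def is_valid_name(name):
    """Return True if the whole name matches the stated rule."""
    return NAME_PATTERN.fullmatch(name) is not None


def build_ctpu_up(name=None, zone=None):
    """Build the argument list for `ctpu up` with the optional flags."""
    command = ["ctpu", "up"]
    if name is not None:
        if not is_valid_name(name):
            raise ValueError(f"invalid name: {name!r}")
        command += ["--name", name]
    if zone is not None:
        command += ["--zone", zone]
    return command


print(build_ctpu_up(name="tputest", zone="us-central1-b"))
```

The resulting list can be passed to subprocess.run if you prefer to launch the tool from a script rather than typing the command.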

Here I created a TPU named tputest. Enter Y to confirm the creation.




The ctpu up command above mainly does the following:

    • Enables the Compute Engine and Cloud TPU services.
    • Creates a Compute Engine VM preloaded with the latest stable version of TensorFlow. The default zone is us-central1-b.
    • Creates a Cloud TPU with the matching version of TensorFlow and passes the name of the Cloud TPU to the Compute Engine VM as an environment variable (TPU_NAME).
    • Grants specific IAM roles to the Cloud TPU service account so that your Cloud TPU can access the resources it needs from the GCP project.
    • Performs other checks.
    • Logs you in to the new Compute Engine VM.
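
Because the TPU name reaches the VM through the TPU_NAME environment variable, a program can read it up front and fail with a clear message when it is missing. A minimal sketch (get_tpu_name is a hypothetical helper):

```python
import os


def get_tpu_name(environ=None):
    """Read the TPU name that `ctpu up` exports as TPU_NAME."""
    if environ is None:
        environ = os.environ
    name = environ.get("TPU_NAME")
    if not name:
        raise RuntimeError(
            "TPU_NAME is not set; run this on the VM created by ctpu up"
        )
    return name


# Demonstration with an explicit mapping instead of the real environment.
print(get_tpu_name({"TPU_NAME": "tputest"}))
```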
2.3.2 Checking that login succeeded

Once you are logged in to the VM, the shell prompt changes from the username@project-name form to the username@VM-name form.

2.3.3 Running a TensorFlow program
    • Create the code file:
      pico cloud-tpu.py

The sample code is as follows:

    # Uses the TensorFlow 1.x API (tf.contrib and tf.Session).
    import os
    import tensorflow as tf
    from tensorflow.contrib import tpu
    from tensorflow.contrib.cluster_resolver import TPUClusterResolver

    def axy_computation(a, x, y):
      # Element-wise a * x + y
      return a * x + y

    inputs = [
        3.0,
        tf.ones([3, 3], tf.float32),
        tf.ones([3, 3], tf.float32),
    ]

    # Rewrite the computation so that it runs on the TPU.
    tpu_computation = tpu.rewrite(axy_computation, inputs)

    # Resolve the TPU address from the TPU_NAME environment variable.
    tpu_grpc_url = TPUClusterResolver(
        tpu=[os.environ['TPU_NAME']]).get_master()

    with tf.Session(tpu_grpc_url) as sess:
      sess.run(tpu.initialize_system())
      sess.run(tf.global_variables_initializer())
      output = sess.run(tpu_computation)
      print(output)
      sess.run(tpu.shutdown_system())

    print('Done!')

Running the code produces the following output:

    [array([[4., 4., 4.],
           [4., 4., 4.],
           [4., 4., 4.]], dtype=float32)]
    Done!

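
The printed values can be sanity-checked without a TPU: with a = 3 and x and y both 3×3 matrices of ones, every element of a * x + y is 3 · 1 + 1 = 4. The sketch below repeats the computation in plain Python with nested lists; it is a check of the arithmetic only and does not use TensorFlow.

```python
def axy(a, x, y):
    """Element-wise a * x + y for two equally sized nested lists."""
    return [
        [a * x_val + y_val for x_val, y_val in zip(x_row, y_row)]
        for x_row, y_row in zip(x, y)
    ]


ones = [[1.0] * 3 for _ in range(3)]
result = axy(3.0, ones, ones)
print(result)
```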
2.3.4 Releasing resources

Remember to release the resources after you run the code; otherwise you will continue to be billed. To release the resources:

1. Disconnect from the Compute Engine VM:

(vm)$ exit

After disconnecting, the shell prompt shows the project name instead of the VM name.

2. Delete the Compute Engine VM and Cloud TPU:

$ ctpu delete

Special note: If you specified a name when creating the VM, you must also specify it when deleting. I did not pass the name when deleting, and although the command output reported success, I later checked resource usage in the console and found that the VM instance still existed. The safest approach is to check the console after the command finishes and confirm that the instance is gone.
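
One way to avoid this pitfall is to always build the delete command from the same name that was used at creation, and then compare the names you expected to delete against what is still listed. This is only a sketch: build_ctpu_delete and leftover_resources are hypothetical helpers, the assumption is that ctpu delete accepts the same --name flag as ctpu up, and the list of remaining instances has to be obtained separately (for example from the console).

```python
def build_ctpu_delete(name=None):
    """Build the argument list for `ctpu delete`.

    Assumption: `ctpu delete` takes the same --name flag as `ctpu up`.
    """
    command = ["ctpu", "delete"]
    if name is not None:
        command += ["--name", name]
    return command


def leftover_resources(expected_deleted, still_listed):
    """Return names that should be gone but still appear in a listing."""
    return sorted(set(expected_deleted) & set(still_listed))


print(build_ctpu_delete("tputest"))

# Hypothetical listing copied from the console after the delete command.
print(leftover_resources(["tputest"], ["tputest", "other-vm"]))
```

An empty result from leftover_resources means nothing that was supposed to be deleted is still listed.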



3. Delete the storage bucket

The command is: gsutil rm -r gs://Your-storage-name

For more details, refer to the official documentation.



Marsggbo, original





2018-10-16


