1 TPU classification and pricing
1.1 Classification and billing notes
| Region | Preemptible TPU | Cloud TPU |
| --- | --- | --- |
| United States | $1.35/hour | $4.50/hour |
| Europe | $1.485/hour | $4.95/hour |
| Asia Pacific | $1.566/hour | $5.22/hour |
- A preemptible TPU is a Cloud TPU that Google can terminate (preempt) at any time when it needs to assign the resources to another task. Preemptible TPUs cost much less than ordinary TPUs.
- TPU usage is billed in increments of 1 second.
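Because billing is per second, the cost of a run scales with its exact duration. A minimal sketch of the arithmetic, using the US-region rates quoted above and an illustrative 90-minute run (the duration is an assumption for the example):

```python
# TPU usage is billed in 1-second increments, so cost = seconds * (hourly rate / 3600).
def tpu_cost(seconds, hourly_rate):
    """Cost in USD of a TPU billed per second at the given hourly rate."""
    return seconds * hourly_rate / 3600

ninety_minutes = 90 * 60  # 5400 seconds

print(f"${tpu_cost(ninety_minutes, 4.50):.2f}")  # ordinary Cloud TPU: $6.75
print(f"${tpu_cost(ninety_minutes, 1.35):.2f}")  # preemptible TPU:    $2.02
```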
To connect to a TPU, you must also configure a virtual machine (billed separately). It is important to note that the virtual machine and the TPU are billed independently of each other.
In other words, Cloud TPU billing starts only once the TPU starts, and stops once the TPU is stopped or deleted; run `ctpu pause` or `gcloud compute tpus stop` to stop a TPU. Likewise, the virtual machine is billed only while it is running.
If the virtual machine is stopped but the Cloud TPU is not, you will continue to pay for the Cloud TPU. If the Cloud TPU is stopped or deleted but the virtual machine is not, you will continue to pay for the virtual machine.
1.2 Useful links
- Compute Engine Price List
- Compute Engine Price Calculator
1.3 Example of price calculation
The following example shows how to calculate the total cost of a training job that uses TPU resources and Compute Engine instances in the US region.
A machine learning research institute provisioned a virtual machine by creating a Compute Engine instance of the n1-standard-2 machine type, and also created a TPU resource; the cumulative usage time for both the Compute Engine instance and the TPU resource was 10 hours. To calculate the total cost of the training job, the institute must add up:
- Total cost for all Compute Engine instances
- Total cost of all Cloud TPU resources
| Resource | Price per machine per hour (USD) | Number of machines | Billed hours | Resource cost | Total training cost |
| --- | --- | --- | --- | --- | --- |
| Compute Engine n1-standard-2 instance | $0.095 | 1 | 10 | $0.95 | |
| Cloud TPU | $4.50 | 1 | 10 | $45.00 | |
| | | | | | $45.95 |
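The arithmetic in the table above can be checked with a short sketch: each resource's cost is its hourly rate times the number of machines times the billed hours, and the training total is the sum.

```python
# Check of the example: VM ($0.095/h) + Cloud TPU ($4.50/h), 1 machine each, 10 hours.
def resource_cost(hourly_rate, machines, hours):
    return hourly_rate * machines * hours

vm_cost = resource_cost(0.095, 1, 10)   # $0.95
tpu_cost = resource_cost(4.50, 1, 10)   # $45.00
total = vm_cost + tpu_cost

print(f"${total:.2f}")  # $45.95
```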
Example of pricing with a preemptible TPU
In the following example, the resources and duration are the same as in the previous example, but this time the research institute decided to use a preemptible TPU to save costs: a preemptible TPU costs $1.35 per hour instead of the ordinary TPU's $4.50 per hour.
| Resource | Price per machine per hour (USD) | Number of machines | Billed hours | Resource cost | Total training cost |
| --- | --- | --- | --- | --- | --- |
| Compute Engine n1-standard-2 instance | $0.095 | 1 | 10 | $0.95 | |
| Preemptible TPU | $1.35 | 1 | 10 | $13.50 | |
| | | | | | $14.45 |
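A quick sketch comparing the two scenarios makes the savings explicit: the VM cost is unchanged, and only the TPU line item differs.

```python
# Same resources and duration as above, with the TPU rate swapped.
vm_cost = 0.095 * 1 * 10          # n1-standard-2, 1 machine, 10 hours
preemptible_tpu = 1.35 * 1 * 10   # preemptible TPU, 10 hours
ordinary_tpu = 4.50 * 1 * 10      # ordinary Cloud TPU, 10 hours

print(f"preemptible total: ${vm_cost + preemptible_tpu:.2f}")       # $14.45
print(f"savings vs ordinary TPU: ${ordinary_tpu - preemptible_tpu:.2f}")  # $31.50
```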
2 Usage steps
2.1 Creating a GCP project
After clicking the link to Google Cloud Platform, you will see this interface:
Click Create Project and enter a project name; the project will then be created. You may sometimes need to refresh the web page for the project to appear.
2.2 Creating a Cloud Storage bucket
Cloud Storage is used here simply to store the model's training data and training results. The official description is that it is a powerful and cost-effective storage solution for unstructured objects, ideal for hosting live web content, storing data for analytics, and archiving and backup.
Note: To use Cloud Storage, you need to enable billing.
2.2.1 Creating a bucket
Buckets hold the objects (files of any type) that you want to store in Cloud Storage.
- First select "Storage" on the left side of the console to go to the Cloud Storage page, then click "Create bucket".
- Enter a name for the bucket to finish creating it. Note that the name must be globally unique, otherwise creation will fail.
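Since bucket names must be globally unique, a common trick (a sketch, not an official recommendation; the base name here is an assumption) is to append a random suffix to a base name:

```python
import uuid

def unique_bucket_name(base):
    # GCS bucket names must be lowercase; uuid4().hex is already lowercase.
    return f"{base}-{uuid.uuid4().hex[:8]}"

name = unique_bucket_name("tpu-tutorial")
print(name)  # e.g. tpu-tutorial-3f9a1c2b
```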
2.2.2 Uploading and sharing objects
To start using your bucket, simply upload objects and open up their access permissions.
2.2.3 Cleanup
In the final step, you will delete the bucket and objects that you created earlier for this tutorial.
2.3 Open Cloud Shell and use the ctpu tool
The Cloud Shell button is in the upper-right corner of the console, as shown:
Enter `ctpu print-config` to view the configuration information. For me the output was:

```
ctpu configuration:
        name: hkbuautoml
        project: test01-219602
        zone: us-central1-b
If you would like to change the configuration for a single command invocation, please use the command line flags.
```
2.3.1 Creating a Compute Engine VM and a TPU
The command is `ctpu up [optional: --name --zone]`.
Note: the name may only contain lowercase letters and numbers; uppercase letters or other characters will cause an error.
Here I created a TPU named tputest. Enter y to confirm the creation.
The `ctpu up` command above mainly does the following:
- Enables the Compute Engine and Cloud TPU services.
- Creates a Compute Engine VM preloaded with the latest stable version of TensorFlow. The default zone is us-central1-b.
- Creates a Cloud TPU with the corresponding version of TensorFlow, and passes the Cloud TPU's name to the Compute Engine VM as an environment variable (TPU_NAME).
- Grants a specific IAM role to your Cloud TPU service account, to ensure that your Cloud TPU can access the resources it needs in your GCP project.
- Performs other checks.
- Logs you in to the new Compute Engine VM.
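Code running on the VM can pick up the TPU_NAME environment variable that `ctpu up` sets. A minimal sketch (on a machine not provisioned by ctpu, the variable will simply be unset):

```python
import os

# ctpu up exports the Cloud TPU's name as the TPU_NAME environment variable.
tpu_name = os.environ.get('TPU_NAME')
if tpu_name is None:
    print("TPU_NAME is not set; this is not a ctpu-provisioned VM")
else:
    print(f"Cloud TPU name: {tpu_name}")
```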
2.3.2 Check whether the login succeeded
When the VM has been logged in to successfully, the shell prompt changes to show the VM name (tputest in this example) instead of your previous host name.
2.3.3 Running a TensorFlow program
- Create the code file:

```
pico cloud-tpu.py
```

The sample code is as follows:

```python
import os
import tensorflow as tf
from tensorflow.contrib import tpu
from tensorflow.contrib.cluster_resolver import TPUClusterResolver

def axy_computation(a, x, y):
    return a * x + y

inputs = [
    3.0,
    tf.ones([3, 3], tf.float32),
    tf.ones([3, 3], tf.float32),
]

tpu_computation = tpu.rewrite(axy_computation, inputs)

tpu_grpc_url = TPUClusterResolver(
    tpu=[os.environ['TPU_NAME']]).get_master()

with tf.Session(tpu_grpc_url) as sess:
    sess.run(tpu.initialize_system())
    sess.run(tf.global_variables_initializer())
    output = sess.run(tpu_computation)
    print(output)
    sess.run(tpu.shutdown_system())

print('Done!')
```
Running the code gives the following results:

```
[array([[4., 4., 4.],
       [4., 4., 4.],
       [4., 4., 4.]], dtype=float32)]
Done!
```
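The expected result can be sanity-checked without a TPU: `axy_computation` computes a*x + y, so with a = 3 and x, y both 3x3 matrices of ones, every entry should be 3*1 + 1 = 4. A pure-Python sketch of the same arithmetic:

```python
def axy_computation(a, x, y):
    # Element-wise a*x + y over nested-list "matrices".
    return [[a * xi + yi for xi, yi in zip(xrow, yrow)]
            for xrow, yrow in zip(x, y)]

ones = [[1.0] * 3 for _ in range(3)]
result = axy_computation(3.0, ones, ones)
print(result)  # [[4.0, 4.0, 4.0], [4.0, 4.0, 4.0], [4.0, 4.0, 4.0]]
```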
2.3.4 Releasing Resources
Remember to release the resources after you run the code, otherwise the system will keep billing you. Release them as follows:
1. Disconnect from the Compute Engine VM:

```
(vm)$ exit
```

After successfully disconnecting, the shell prompt shows the project name instead of the VM name.
2. Delete the Compute Engine VM and the Cloud TPU:

```
$ ctpu delete
```

!!! Special note: if you specified a name when creating the VM, you must also specify that name when deleting it. I did not add the name when deleting; although the command line reported that the deletion succeeded, when I later checked resource usage in the console I found that the VM instance still existed. So the safest approach is to go to the console after the command finishes and check whether the instance still exists.
3. Delete the storage bucket:

```
$ gsutil rm -r gs://Your-storage-name
```

For more detailed information, refer to the official documentation.
Marsggbo · Original · 2018-10-16 · TPU Use Instructions