AI Technology Camp News: On Monday night, February 12 (Beijing time), Google announced that Cloud TPUs are now available in beta on the Google Cloud Platform (GCP), helping researchers train and run machine learning models faster. The current price is US$6.50 per Cloud TPU per hour; supply is limited and access must be requested in advance.
TPU stands for Tensor Processing Unit, a custom chip that Google designed specifically for machine learning. At the Google I/O conference on May 18 last year, Google officially unveiled the second-generation TPU, the Cloud TPU, which is optimized for both inference and training compared to the first generation.
Until now, Google's TPUs have been used only internally. This is the first time Google has fully opened up its own TPUs to outside users, which may signal that Google is officially taking aim at Nvidia's GPUs, accelerating the commercialization of the TPU, and staking out territory in AI infrastructure.
The following is the content of Google Cloud's official blog post:
Cloud TPUs are a family of hardware accelerators that Google designed and optimized to speed up and scale machine learning workloads programmed with TensorFlow. Each Cloud TPU is built from four custom ASICs, and a single Cloud TPU delivers up to 180 teraflops (trillions of floating-point operations per second) of computing power and 64 GB of high-bandwidth memory.
These boards can be used individually or connected together over an ultra-fast dedicated network to form multi-petaflop (quadrillions of floating-point operations per second) machine learning supercomputers that we call "TPU pods." Later this year, we will offer these larger "supercomputers" on GCP.
We designed Cloud TPUs to deliver differentiated performance for TensorFlow workloads and to let machine learning engineers and researchers iterate on their models more quickly. For example:
Instead of waiting for a job to be scheduled on a shared compute cluster, you get exclusive access to a networked Cloud TPU through a Google Compute Engine virtual machine that you control and can customize. Rather than waiting days or weeks to train a business-critical machine learning model, you can train several variants of the same model overnight on a fleet of Cloud TPUs and deploy the most accurate one to production the next day. Using a single Cloud TPU and following the tutorial (https://cloud.google.com/tpu/docs/tutorials/resnet), you can train ResNet-50 to the expected accuracy on the ImageNet benchmark challenge in less than a day, for well under $200.
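As a concrete, if simplified, illustration of that workflow, here is a minimal sketch (ours, not Google's) of checking from the Compute Engine VM that the dedicated Cloud TPU is reachable, using TensorFlow 1.x contrib APIs; the TPU name 'demo-tpu' is a placeholder, and the exact resolver arguments and method names vary slightly between TensorFlow releases.

```python
# Minimal sketch, assuming a TF 1.x environment on a Compute Engine VM that
# shares a network with a Cloud TPU named 'demo-tpu' (illustrative name).
import tensorflow as tf

# Resolve the gRPC endpoint of the Cloud TPU by its name.
resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu='demo-tpu')

with tf.Session(resolver.get_master()) as sess:
    # Initialize the TPU system, then list the devices visible to this VM.
    sess.run(tf.contrib.tpu.initialize_system())
    for device in sess.list_devices():
        print(device.name)
    sess.run(tf.contrib.tpu.shutdown_system())
```

The resolver simply returns the TPU worker's gRPC address, so an ordinary TensorFlow session can target the networked Cloud TPU much like a local accelerator.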
▌ Making machine learning model training easier
Traditionally, writing programs for custom ASICs and supercomputers has required deep specialized expertise, but you can program Cloud TPUs with high-level TensorFlow APIs (a brief sketch follows the list below). To help you get started quickly, we have open-sourced a set of high-performance Cloud TPU model implementations:
ResNet-50 and other commonly used image classification models
Transformer, for machine translation and language modeling
RetinaNet, for object detection
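To give a rough sense of what programming against those high-level APIs looks like, the sketch below assumes TensorFlow 1.x with the contrib TPUEstimator interface; the toy model, the TPU name 'demo-tpu', and the GCS path are hypothetical placeholders, and this is not one of the published model implementations.

```python
# Illustrative sketch only (TF 1.x contrib TPUEstimator); 'demo-tpu' and the
# GCS path are hypothetical placeholders.
import tensorflow as tf

def model_fn(features, labels, mode, params):
    # Toy linear classifier; a real Cloud TPU model would build ResNet-50,
    # Transformer, etc. here.
    logits = tf.layers.dense(features, 10)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    optimizer = tf.train.GradientDescentOptimizer(0.01)
    # CrossShardOptimizer aggregates gradients across the TPU's cores.
    optimizer = tf.contrib.tpu.CrossShardOptimizer(optimizer)
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.contrib.tpu.TPUEstimatorSpec(mode=mode, loss=loss, train_op=train_op)

def input_fn(params):
    # TPUEstimator passes the per-replica batch size via params.
    batch_size = params['batch_size']
    data = tf.data.Dataset.from_tensor_slices(
        (tf.random_normal([1024, 128]),
         tf.random_uniform([1024], maxval=10, dtype=tf.int32)))
    # TPUs need statically shaped batches, so drop any partial final batch.
    return data.repeat().apply(tf.contrib.data.batch_and_drop_remainder(batch_size))

resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu='demo-tpu')
config = tf.contrib.tpu.RunConfig(
    cluster=resolver,
    model_dir='gs://my-bucket/model',  # hypothetical GCS path
    tpu_config=tf.contrib.tpu.TPUConfig(iterations_per_loop=100))

estimator = tf.contrib.tpu.TPUEstimator(
    model_fn=model_fn, config=config, use_tpu=True, train_batch_size=1024)
estimator.train(input_fn=input_fn, max_steps=1000)
```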
To save you time and effort, we will continuously test these model implementations for both performance and convergence to the expected accuracy on standard datasets.
Over time, we will open-source additional model implementations. Adventurous machine learning experts can use the documentation (https://cloud.google.com/tpu/docs/) and tools (https://cloud.google.com/tpu/docs/cloud-tpu-tools) we provide to optimize other TensorFlow models to run on Cloud TPUs on their own.
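One recurring adjustment when porting an existing model is that Cloud TPUs require statically known tensor shapes. The sketch below is our illustration, not anything from the linked documentation: a TFRecord input pipeline adapted for TPU training, assuming a TF 1.x contrib environment; the GCS path and feature names are hypothetical.

```python
# Minimal sketch of TPU-friendly input-pipeline changes: decode to a fixed
# image size and drop partial batches so every tensor shape is static.
# 'gs://my-bucket/train-*' and the feature keys are hypothetical.
import tensorflow as tf

def parse_example(serialized):
    features = tf.parse_single_example(serialized, {
        'image': tf.FixedLenFeature([], tf.string),
        'label': tf.FixedLenFeature([], tf.int64),
    })
    image = tf.image.decode_jpeg(features['image'], channels=3)
    # Resize to a fixed resolution so the shape is statically known.
    image = tf.image.resize_images(image, [224, 224])
    image.set_shape([224, 224, 3])
    return image, tf.cast(features['label'], tf.int32)

def input_fn(params):
    # TPUEstimator supplies the per-replica batch size via params.
    batch_size = params['batch_size']
    dataset = tf.data.TFRecordDataset(tf.gfile.Glob('gs://my-bucket/train-*'))
    dataset = dataset.map(parse_example).repeat().shuffle(1024)
    # Drop the final partial batch so every batch has the same static shape.
    return dataset.apply(tf.contrib.data.batch_and_drop_remainder(batch_size))
```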
If you start using Cloud TPUs now, you will benefit from dramatic improvements in time-to-accuracy when we introduce TPU pods later this year. As we announced at NIPS 2017, training time for both ResNet-50 and Transformer drops from the better part of a day to under 30 minutes on a full TPU pod, with no code changes required.
▌ Scalable Machine Learning Platform
Cloud TPUs also simplify planning and managing machine learning computing resources:
You can provide your team with state-of-the-art machine learning acceleration and dynamically adjust capacity as your needs change;
Instead of committing the capital, time, and expertise required to design, install, and maintain an on-site machine learning computing cluster with specialized power, cooling, networking, and storage requirements, you can benefit from the large-scale, tightly integrated machine learning infrastructure that Google has been optimizing for years. There is no need to struggle to keep drivers up to date across a fleet of workstations and servers, because Cloud TPUs require no driver installation at all. And Google Cloud provides the same sophisticated security mechanisms and practices to protect you.
In addition to Cloud TPUs, Google Cloud also offers a range of high-performance CPUs (including Intel Skylake) and GPUs (including the Nvidia Tesla V100).
At present, the supply of Cloud TPUs is still limited; you can apply for access at https://services.google.com/fb/forms/cloud-tpu-beta-request/. Usage is billed by the second, at a rate of about US$6.50 per Cloud TPU per hour.