Google releases the second generation of TPU and offers a free trial program

Source: Internet
Author: User

Unlike the first-generation TPU, which could only run inference for machine learning models, Google's latest TPU handles both training and inference (serving). InfoQ covered the first-generation TPU white paper in detail earlier this year.

The second-generation TPU was announced roughly a week after Nvidia announced Volta, a general-purpose GPU whose Tensor Core feature is optimized for workloads such as TensorFlow. Unlike the first-generation launch, Google did not publish a matching white paper for the second-generation TPU. The first-generation white paper appeared only months after that TPU was released, so it is reasonable to speculate that a white paper detailing second-generation TPU (TPU-2) benchmark data is forthcoming. Ideally, it would include mixed tests of TPU and competing chipset configurations, their peak performance, and the machine learning workload types run on them, providing the same level of detail as the first-generation white paper.

Google has provided some high-level performance metrics, presumably based on the physical TPU infrastructure it uses to offer TPUs as a service through the GCP Compute Engine. A select group of researchers and scientists will get free access to clusters of 1,000 TPUs. For everyone else, the TPU infrastructure behind the GCP service may remain largely abstracted away, leaving hardware researchers and the press little to examine in depth without a white paper. On performance gains, Google notes:

...... Our new large-scale translation model takes a full day to train on 32 of the world's best commercially available GPUs, while one eighth of a TPU pod can do the job in an afternoon ...

A TPU-2 pod contains TPU-2 boards, each consisting of multiple TPU-2 processors. From the scattered technical details in Google's announcement and a handful of photos, one can speculate that the flash memory on each board may be interconnected, and that individual TPU-2 processors may share flash state.

The second-generation TPU infrastructure offers configurations of up to 256 chips connected together, delivering 11.5 petaflops of machine learning compute. Google is accepting applications for an alpha trial, though the application form is the same as the researchers' free-trial form. It is not yet clear whether the next generation of TPUs will back services like Cloud ML, which currently performs model training on GPUs. The service is not limited to TPUs, however: GCP lets users build their models on rival chips (such as Intel's Skylake) or GPUs (such as Nvidia's Volta) and then migrate the project to Google's TPU cloud for final processing.

It is difficult to compare the performance of TPU-2 with the first-generation TPU because their feature sets differ down to the underlying mathematical primitives. The first-generation TPU does not use floating-point arithmetic; instead, it approximates floating-point numbers with 8-bit integers. It is not yet known whether Google offers a method for converting 8-bit integer throughput into estimated floating-point performance for quantitative comparison.
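To make the 8-bit approximation concrete, here is a minimal sketch of linear quantization, a common scale/zero-point scheme assumed for illustration; Google's exact first-generation method is not described in this article.

```python
# Minimal sketch of 8-bit integer quantization, the kind of technique the
# first-generation TPU used in place of floating-point arithmetic.
# The linear scale/zero-point scheme below is an assumption for
# illustration, not Google's documented implementation.

def quantize(values, num_bits=8):
    """Map a list of floats onto signed 8-bit integers linearly."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard against a constant input
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from the 8-bit representation."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zp = quantize(weights)
approx = dequantize(q, scale, zp)
# Each recovered value lies within one quantization step of the original.
assert all(abs(a - w) <= scale for a, w in zip(approx, weights))
```

The point of the sketch is that an 8-bit representation trades a small, bounded rounding error for much cheaper integer arithmetic, which is why comparing integer throughput with true floating-point throughput requires a conversion method of the kind the paragraph above describes.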

Google's latest large-scale translation model takes a full day to train on 32 of the "best commercially available GPUs," while one eighth of a TPU pod finishes the work in an afternoon ... Peak throughput is 45 teraflops per chip; as described above, each board totals 180 teraflops, and a full pod peaks at 11.5 petaflops (11,500 teraflops).

The ability to access flash memory directly and to run both training and serving on the same hardware could affect Google's competitive position against other chipset makers, since AMD's Vega-based Radeon Instinct GPU accelerators can also access flash memory directly and handle both ML training and serving.

