Go 1.8 Performance improvements, one month in


Sunday, September the 18th, marks a month since the Go 1.8 cycle opened officially. I'm passionate about the performance of Go programs, and of the compiler itself. This post is a brief look at the state of play, roughly a month into the development cycle for Go 1.8 [1].

Note: these results are of course preliminary and represent only a point in time, not the performance of the final Go 1.8 release.

Compile Times

Nothing much to report here. Using the methodology from my previous Go 1.7 benchmarks, there is a 3.22%–5.11% improvement in full compile time compared to Go 1.7.

Performance improvements

Intel AMD64

Better code generation and small improvements to the runtime and standard library show some small improvements for amd64 [2], but really nothing to write home about yet.

    name                     old time/op    new time/op    delta
    BinaryTree17-4           3.07s ± 2%     3.06s ± 2%     ~        (p=0.661 n=10+9)
    Fannkuch11-4             3.23s ± 1%     3.22s ± 0%     -0.43%   (p=0.008 n=9+10)
    FmtFprintfEmpty-4        64.4ns ± 0%    61.8ns ± 4%    -4.17%   (p=0.005 n=9+10)
    FmtFprintfString-4       162ns ± 0%     162ns ± 0%     ~        (p=0.065 n=10+9)
    FmtFprintfInt-4          142ns ± 0%     142ns ± 0%     ~        (p=0.137 n=8+10)
    FmtFprintfIntInt-4       220ns ± 0%     217ns ± 0%     -1.18%   (p=0.000 n=9+10)
    FmtFprintfPrefixedInt-4  224ns ± 0%     224ns ± 1%     ~        (p=0.206 n=9+9)
    FmtFprintfFloat-4        313ns ± 0%     312ns ± 0%     -0.26%   (p=0.001 n=10+9)
    FmtManyArgs-4            906ns ± 0%     894ns ± 0%     -1.32%   (p=0.000 n=7+6)
    GobDecode-4              8.88ms ± 1%    8.81ms ± 0%    -0.81%   (p=0.003 n=10+10)
    GobEncode-4              7.93ms ± 1%    7.88ms ± 0%    -0.66%   (p=0.008 n=9+10)
    Gzip-4                   272ms ± 1%     277ms ± 0%     +1.95%   (p=0.000 n=10+9)
    Gunzip-4                 47.4ms ± 0%    47.4ms ± 0%    ~        (p=0.720 n=9+10)
    HTTPClientServer-4       201µs ± 4%     202µs ± 2%     ~        (p=0.631 n=10+10)
    JSONEncode-4             19.3ms ± 0%    19.3ms ± 0%    ~        (p=0.063 n=10+10)
    JSONDecode-4             61.0ms ± 0%    61.2ms ± 0%    +0.33%   (p=0.000 n=10+8)
    Mandelbrot200-4          5.20ms ± 0%    5.20ms ± 0%    ~        (p=0.475 n=10+7)
    GoParse-4                3.95ms ± 1%    3.97ms ± 1%    +0.65%   (p=0.003 n=9+9)
    RegexpMatchEasy0_32-4    88.4ns ± 0%    88.7ns ± 0%    +0.34%   (p=0.001 n=10+9)
    RegexpMatchEasy0_1K-4    1.14µs ± 0%    1.14µs ± 0%    ~        (p=0.369 n=9+6)
    RegexpMatchEasy1_32-4    82.6ns ± 0%    82.0ns ± 0%    -0.70%   (p=0.000 n=9+10)
    RegexpMatchEasy1_1K-4    469ns ± 0%     463ns ± 0%     -1.23%   (p=0.000 n=6+9)
    RegexpMatchMedium_32-4   138ns ± 1%     136ns ± 0%     -1.38%   (p=0.000 n=10+9)
    RegexpMatchMedium_1K-4   43.6µs ± 1%    42.0µs ± 0%    -3.74%   (p=0.000 n=9+9)
    RegexpMatchHard_32-4     2.25µs ± 1%    2.23µs ± 0%    -0.57%   (p=0.000 n=8+8)
    RegexpMatchHard_1K-4     68.8µs ± 0%    68.6µs ± 0%    -0.37%   (p=0.000 n=8+8)
    Revcomp-4                477ms ± 1%     472ms ± 0%     -1.03%   (p=0.000 n=8+8)
    Template-4               76.1ms ± 0%    76.4ms ± 0%    +0.35%   (p=0.000 n=9+9)
    TimeParse-4              367ns ± 0%     366ns ± 0%     -0.16%   (p=0.003 n=10+8)
    TimeFormat-4             386ns ± 0%     384ns ± 0%     -0.58%   (p=0.000 n=9+9)

    name                     old speed      new speed      delta
    GobDecode-4              86.4MB/s ± 1%  87.1MB/s ± 0%  +0.81%   (p=0.003 n=10+10)
    GobEncode-4              96.7MB/s ± 1%  97.4MB/s ± 0%  +0.66%   (p=0.007 n=9+10)
    Gzip-4                   71.4MB/s ± 1%  70.0MB/s ± 0%  -1.91%   (p=0.000 n=10+9)
    Gunzip-4                 409MB/s ± 0%   410MB/s ± 0%   ~        (p=0.703 n=9+10)
    JSONEncode-4             101MB/s ± 0%   100MB/s ± 0%   ~        (p=0.084 n=10+10)
    JSONDecode-4             31.8MB/s ± 0%  31.7MB/s ± 0%  -0.33%   (p=0.000 n=10+8)
    GoParse-4                14.7MB/s ± 1%  14.6MB/s ± 1%  -0.67%   (p=0.002 n=9+9)
    RegexpMatchEasy0_32-4    362MB/s ± 0%   361MB/s ± 0%   -0.36%   (p=0.000 n=10+9)
    RegexpMatchEasy0_1K-4    898MB/s ± 0%   898MB/s ± 0%   ~        (p=0.762 n=9+8)
    RegexpMatchEasy1_32-4    387MB/s ± 0%   390MB/s ± 0%   +0.70%   (p=0.000 n=9+10)
    RegexpMatchEasy1_1K-4    2.18GB/s ± 0%  2.21GB/s ± 0%  +1.20%   (p=0.000 n=9+9)
    RegexpMatchMedium_32-4   7.23MB/s ± 1%  7.32MB/s ± 0%  +1.19%   (p=0.000 n=10+9)
    RegexpMatchMedium_1K-4   23.5MB/s ± 1%  24.4MB/s ± 0%  +3.88%   (p=0.000 n=9+9)
    RegexpMatchHard_32-4     14.2MB/s ± 1%  14.3MB/s ± 0%  +0.58%   (p=0.000 n=8+8)
    RegexpMatchHard_1K-4     14.9MB/s ± 0%  14.9MB/s ± 0%  +0.34%   (p=0.000 n=8+7)
    Revcomp-4                533MB/s ± 1%   539MB/s ± 0%   +1.04%   (p=0.000 n=8+8)
    Template-4               25.5MB/s ± 0%  25.4MB/s ± 0%  -0.36%   (p=0.000 n=9+9)

ARM

The major improvement that landed recently in the development branch is the conversion of the remaining architecture backends to use the compiler's SSA form. This has brought a substantial improvement in generated code for non-Intel architectures, like ARM [3].

    name                     old time/op    new time/op    delta
    BinaryTree17-4           33.8s ± 1%     27.7s ± 0%     -18.06%  (p=0.000 n=10+10)
    Fannkuch11-4             42.0s ± 0%     19.3s ± 0%     -54.10%  (p=0.000 n=10+10)
    FmtFprintfEmpty-4        670ns ± 1%     581ns ± 1%     -13.30%  (p=0.000 n=10+10)
    FmtFprintfString-4       2.04µs ± 1%    1.65µs ± 0%    -19.09%  (p=0.000 n=10+10)
    FmtFprintfInt-4          1.71µs ± 0%    1.21µs ± 0%    -29.39%  (p=0.000 n=10+9)
    FmtFprintfIntInt-4       2.69µs ± 1%    1.94µs ± 0%    -27.77%  (p=0.000 n=10+10)
    FmtFprintfPrefixedInt-4  2.70µs ± 0%    1.85µs ± 0%    -31.41%  (p=0.000 n=10+9)
    FmtFprintfFloat-4        5.15µs ± 0%    3.65µs ± 0%    -29.01%  (p=0.000 n=9+10)
    FmtManyArgs-4            11.3µs ± 0%    8.5µs ± 0%     -24.79%  (p=0.000 n=10+9)
    GobDecode-4              112ms ± 0%     77ms ± 1%      -31.04%  (p=0.000 n=9+9)
    GobEncode-4              88.5ms ± 1%    77.2ms ± 1%    -12.78%  (p=0.000 n=10+10)
    Gzip-4                   4.79s ± 0%     3.34s ± 0%     -30.18%  (p=0.000 n=9+9)
    Gunzip-4                 702ms ± 0%     463ms ± 0%     -34.05%  (p=0.000 n=10+10)
    HTTPClientServer-4       645µs ± 3%     571µs ± 3%     -11.45%  (p=0.000 n=10+10)
    JSONEncode-4             227ms ± 0%     186ms ± 0%     -18.16%  (p=0.000 n=10+10)
    JSONDecode-4             845ms ± 0%     618ms ± 0%     -26.81%  (p=0.000 n=10+10)
    Mandelbrot200-4          59.3ms ± 0%    40.0ms ± 0%    -32.47%  (p=0.000 n=10+10)
    GoParse-4                45.0ms ± 0%    37.0ms ± 0%    -17.68%  (p=0.000 n=9+9)
    RegexpMatchEasy0_32-4    974ns ± 0%     878ns ± 0%     -9.81%   (p=0.000 n=10+9)
    RegexpMatchEasy0_1K-4    4.60µs ± 0%    4.48µs ± 0%    -2.57%   (p=0.000 n=10+10)
    RegexpMatchEasy1_32-4    1.02µs ± 0%    0.94µs ± 0%    -8.08%   (p=0.000 n=8+10)
    RegexpMatchEasy1_1K-4    6.92µs ± 0%    6.08µs ± 0%    -12.10%  (p=0.000 n=10+10)
    RegexpMatchMedium_32-4   1.61µs ± 0%    1.27µs ± 0%    -20.98%  (p=0.000 n=9+6)
    RegexpMatchMedium_1K-4   447µs ± 0%     317µs ± 0%     -29.05%  (p=0.000 n=10+9)
    RegexpMatchHard_32-4     24.9µs ± 0%    18.4µs ± 0%    -25.89%  (p=0.000 n=10+10)
    RegexpMatchHard_1K-4     740µs ± 0%     552µs ± 0%     -25.36%  (p=0.000 n=10+10)
    Revcomp-4                81.0ms ± 1%    65.2ms ± 0%    -19.53%  (p=0.000 n=9+9)
    Template-4               1.17s ± 0%     0.81s ± 0%     -31.28%  (p=0.000 n=9+9)
    TimeParse-4              5.52µs ± 0%    3.79µs ± 0%    -31.42%  (p=0.000 n=10+9)
    TimeFormat-4             10.6µs ± 0%    8.5µs ± 0%     -19.14%  (p=0.000 n=10+10)

    name                     old speed      new speed      delta
    GobDecode-4              6.86MB/s ± 0%  9.95MB/s ± 1%  +45.00%  (p=0.000 n=9+9)
    GobEncode-4              8.67MB/s ± 1%  9.94MB/s ± 1%  +14.69%  (p=0.000 n=10+10)
    Gzip-4                   4.05MB/s ± 0%  5.81MB/s ± 0%  +43.32%  (p=0.000 n=10+9)
    Gunzip-4                 27.6MB/s ± 0%  41.9MB/s ± 0%  +51.63%  (p=0.000 n=10+10)
    JSONEncode-4             8.53MB/s ± 0%  10.43MB/s ± 0% +22.20%  (p=0.000 n=10+10)
    JSONDecode-4             2.30MB/s ± 0%  3.14MB/s ± 0%  +36.39%  (p=0.000 n=9+10)
    GoParse-4                1.29MB/s ± 0%  1.56MB/s ± 0%  +20.93%  (p=0.000 n=9+10)
    RegexpMatchEasy0_32-4    32.8MB/s ± 0%  36.4MB/s ± 0%  +10.87%  (p=0.000 n=10+10)
    RegexpMatchEasy0_1K-4    222MB/s ± 0%   228MB/s ± 0%   +2.64%   (p=0.000 n=10+10)
    RegexpMatchEasy1_32-4    31.3MB/s ± 0%  34.0MB/s ± 0%  +8.75%   (p=0.000 n=9+10)
    RegexpMatchEasy1_1K-4    148MB/s ± 0%   168MB/s ± 0%   +13.76%  (p=0.000 n=10+10)
    RegexpMatchMedium_32-4   620kB/s ± 0%   790kB/s ± 0%   +27.42%  (p=0.000 n=10+8)
    RegexpMatchMedium_1K-4   2.29MB/s ± 0%  3.23MB/s ± 0%  +41.05%  (p=0.000 n=10+10)
    RegexpMatchHard_32-4     1.29MB/s ± 0%  1.74MB/s ± 0%  +34.88%  (p=0.000 n=9+10)
    RegexpMatchHard_1K-4     1.38MB/s ± 0%  1.85MB/s ± 0%  +34.06%  (p=0.000 n=10+10)
    Revcomp-4                31.4MB/s ± 1%  39.0MB/s ± 0%  +24.26%  (p=0.000 n=9+9)
    Template-4               1.65MB/s ± 0%  2.41MB/s ± 0%  +45.71%  (p=0.000 n=10+9)
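
Each row in these tables is named after a Go benchmark function, with the -4 suffix recording the GOMAXPROCS value used for the run. For readers unfamiliar with how a time/op figure is produced, here is a minimal sketch. Note that benchFprintfInt is a hypothetical stand-in written for this example, not the code behind the FmtFprintfInt row, and its numbers will differ from those above.

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

// benchFprintfInt is a hypothetical benchmark body: it formats one integer
// into a reusable buffer b.N times, where b.N is chosen by the testing package.
func benchFprintfInt(b *testing.B) {
	var buf bytes.Buffer
	for i := 0; i < b.N; i++ {
		fmt.Fprintf(&buf, "%d", 5)
		buf.Reset()
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test",
	// growing b.N until the run lasts long enough to time reliably.
	r := testing.Benchmark(benchFprintfInt)

	// NsPerOp is total elapsed time divided by the number of iterations.
	fmt.Printf("iterations: %d, time/op: %d ns\n", r.N, r.NsPerOp())
}
```

A single run like this yields one sample. The ± and p= columns in the tables only become meaningful once each benchmark has been repeated several times under both toolchains, which is what the n= counts record.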

Notes:

    1. Despite the Go 1.8 development cycle opening late, in order to keep to the 6 month cadence, the feature freeze for this cycle will still occur on the 1st of November.
    2. Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz, 3.13.0-95-generic #142-Ubuntu
    3. Freescale i.MX6, 3.14.77-1-ARCH