108 Clever Tricks for Linux Programming - 11 (Out-of-Order) [Continued]


Continued from the previous post: 108 Clever Tricks for Linux Programming - 11 (Out-of-Order)

 

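This time I used an Intel Core i3, a CPU that supports SSE4. Strictly speaking, palignr is an SSSE3 instruction, but every SSE4-capable CPU implements SSSE3 as well, so the instruction is available. I rewrote the code from the previous post to use palignr; the full listing appears below.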
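Since the whole rewrite hinges on what palignr computes, a scalar model may help when reading the listing. The sketch below is not part of the original post: palignr_model is a hypothetical name, and it only mirrors the byte-level behaviour of the instruction in AT&T operand order (palignr $imm, %src, %dst).

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of `palignr $imm, %src, %dst` (AT&T operand order).
 * The 32-byte value dst:src (src in the low half) is shifted right by
 * imm bytes, and the low 16 bytes of the result replace dst.
 * palignr_model is a hypothetical helper name used for illustration. */
void palignr_model(uint8_t dst[16], const uint8_t src[16], unsigned imm)
{
    uint8_t wide[32];
    uint8_t out[16];
    unsigned i;

    memcpy(wide, src, 16);       /* low half: source operand */
    memcpy(wide + 16, dst, 16);  /* high half: destination operand */

    for (i = 0; i < 16; ++i) {
        unsigned j = i + imm;
        out[i] = (j < 32) ? wide[j] : 0;  /* zeros shift in from the top */
    }
    memcpy(dst, out, 16);
}
```

With imm = 7, dst ends up holding src[7..15] followed by the old dst[0..6]. That is how the listing stitches two neighbouring aligned 16-byte blocks into the 16 bytes that start 7 bytes into the lower block.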

Probably because the test ran inside a virtual machine, the performance improvement is not significant.
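One way to tell a real difference from virtual-machine noise is to time each variant over several trials and keep the fastest one. The following is a minimal sketch under that assumption, not the measurement method used in the post. best_seconds and count_call are hypothetical names, and clock() reports CPU time rather than wall-clock time.

```c
#include <time.h>

/* Hypothetical helper: run fn(arg) `calls` times per trial and return the
 * shortest trial, in seconds of CPU time. Returns -1.0 if trials <= 0. */
double best_seconds(void (*fn)(void *), void *arg, long calls, int trials)
{
    double best = -1.0;
    int t;

    for (t = 0; t < trials; ++t) {
        clock_t start = clock();
        double elapsed;
        long c;

        for (c = 0; c < calls; ++c)
            fn(arg);

        elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (best < 0.0 || elapsed < best)
            best = elapsed;   /* keep the least disturbed trial */
    }
    return best;
}

/* Stand-in workload for illustration: counts how often it was called. */
void count_call(void *arg)
{
    ++*(long *)arg;
}
```

To time the routines from this post, the callback would be a small wrapper that unpacks the two buffer pointers from arg and calls shl_7 or shl_7_f.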

 

/* Build with -D_SLOW or -D_FAST to choose which routine the timing loop calls. */
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <stdint.h>

/* Both routines copy 128 bytes from src to dst. src must sit 7 bytes past a
 * 16-byte boundary and dst must be 16-byte aligned. Each call loads the nine
 * aligned 16-byte blocks covering src[-7 .. 136] and splices neighbouring
 * blocks with palignr. */
void shl_7(uint8_t *dst, const uint8_t *src);
void shl_7_f(uint8_t *dst, const uint8_t *src);

/* Each routine is written as a single asm statement so that the compiler
 * cannot separate or reorder its instructions. */

/* shl_7: all palignr instructions first, then all stores. */
asm(
    "    .text\n"
    "    .globl shl_7\n"
    "    .type  shl_7, @function\n"
    "shl_7:\n"
    "    push    %rbp\n"
    "    mov     %rsp, %rbp\n"
    "loop:\n"
    /* rdx is not set by the caller and nothing branches on the result */
    "    sub     $0x80, %rdx\n"
    "    movaps  -0x07(%rsi), %xmm1\n"
    "    movaps  0x09(%rsi), %xmm2\n"
    "    movaps  0x19(%rsi), %xmm3\n"
    "    movaps  0x29(%rsi), %xmm4\n"
    "    movaps  0x39(%rsi), %xmm5\n"
    "    movaps  0x49(%rsi), %xmm6\n"
    "    movaps  0x59(%rsi), %xmm7\n"
    "    movaps  0x69(%rsi), %xmm8\n"
    "    movaps  0x79(%rsi), %xmm9\n"
    "    lea     0x80(%rsi), %rsi\n"
    "    palignr $7, %xmm8, %xmm9\n"
    "    palignr $7, %xmm7, %xmm8\n"
    "    palignr $7, %xmm6, %xmm7\n"
    "    palignr $7, %xmm5, %xmm6\n"
    "    palignr $7, %xmm4, %xmm5\n"
    "    palignr $7, %xmm3, %xmm4\n"
    "    palignr $7, %xmm2, %xmm3\n"
    "    palignr $7, %xmm1, %xmm2\n"
    "    movaps  %xmm9, 0x70(%rdi)\n"
    "    movaps  %xmm8, 0x60(%rdi)\n"
    "    movaps  %xmm7, 0x50(%rdi)\n"
    "    movaps  %xmm6, 0x40(%rdi)\n"
    "    movaps  %xmm5, 0x30(%rdi)\n"
    "    movaps  %xmm4, 0x20(%rdi)\n"
    "    movaps  %xmm3, 0x10(%rdi)\n"
    "    movaps  %xmm2, (%rdi)\n"
    "    leaveq\n"
    "    retq\n"
);

/* shl_7_f: each palignr is followed immediately by its store. */
asm(
    "    .text\n"
    "    .globl shl_7_f\n"
    "    .type  shl_7_f, @function\n"
    "shl_7_f:\n"
    "    push    %rbp\n"
    "    mov     %rsp, %rbp\n"
    "loop_f:\n"
    /* rdx is not set by the caller and nothing branches on the result */
    "    sub     $0x80, %rdx\n"
    "    movaps  -0x07(%rsi), %xmm1\n"
    "    movaps  0x09(%rsi), %xmm2\n"
    "    movaps  0x19(%rsi), %xmm3\n"
    "    movaps  0x29(%rsi), %xmm4\n"
    "    movaps  0x39(%rsi), %xmm5\n"
    "    movaps  0x49(%rsi), %xmm6\n"
    "    movaps  0x59(%rsi), %xmm7\n"
    "    movaps  0x69(%rsi), %xmm8\n"
    "    movaps  0x79(%rsi), %xmm9\n"
    "    lea     0x80(%rsi), %rsi\n"
    "    palignr $7, %xmm8, %xmm9\n"
    "    movaps  %xmm9, 0x70(%rdi)\n"
    "    palignr $7, %xmm7, %xmm8\n"
    "    movaps  %xmm8, 0x60(%rdi)\n"
    "    palignr $7, %xmm6, %xmm7\n"
    "    movaps  %xmm7, 0x50(%rdi)\n"
    "    palignr $7, %xmm5, %xmm6\n"
    "    movaps  %xmm6, 0x40(%rdi)\n"
    "    palignr $7, %xmm4, %xmm5\n"
    "    movaps  %xmm5, 0x30(%rdi)\n"
    "    palignr $7, %xmm3, %xmm4\n"
    "    movaps  %xmm4, 0x20(%rdi)\n"
    "    palignr $7, %xmm2, %xmm3\n"
    "    movaps  %xmm3, 0x10(%rdi)\n"
    "    palignr $7, %xmm1, %xmm2\n"
    "    movaps  %xmm2, (%rdi)\n"
    "    leaveq\n"
    "    retq\n"
);

int main(void)
{
    /* movaps needs 16-byte alignment, which aligned_alloc guarantees. The
     * source buffer is 128 + 16 bytes because the routines load nine whole
     * 16-byte blocks, the last of which ends 9 bytes past src[127]. */
    uint8_t *src = (uint8_t *)aligned_alloc(16, sizeof(uint8_t) * (128 + 16));
    uint8_t *des = (uint8_t *)aligned_alloc(16, sizeof(uint8_t) * 128);
    int i = 0;

    if (src == NULL || des == NULL)
        return 1;

    /* Fill every byte the routines will read. */
    for (; i < 128 + 16; ++i) {
        src[i] = 0xFF;
    }
    src[0] = 0x0;
    src[1] = 0x1;
    src[2] = 0x2;
    src[7] = 0x7;
    src[8] = 0x8;
    src[9] = 0x9;
    src[10] = 0xA;
    src[126] = 0xB;
    src[127] = 0xA;
    src += 7;   /* src now sits 7 bytes past a 16-byte boundary */
    shl_7(des, src);
    i = 0;
    for (; i < 128; ++i) {
        //printf("%d,%d\n", i, des[i]);
    }
    i = 0;
    for (; i < 100000000; ++i) {
#ifdef _SLOW
        shl_7(des, src);
#endif
#ifdef _FAST
        shl_7_f(des, src);
#endif
    }

    free(src - 7);   /* undo the offset before freeing */
    free(des);
    return 0;
}
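The commented-out printf loop in main dumps all 128 output bytes for inspection by eye. As far as the code shows, shl_7 is meant to leave des equal to the 128 bytes starting at the offset src pointer, so the check could instead be automated. A small sketch; first_mismatch is a hypothetical helper, not something from the original post.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the index of the first byte at which dst
 * differs from src over len bytes, or -1 if the two ranges are identical. */
long first_mismatch(const uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t i;

    for (i = 0; i < len; ++i) {
        if (dst[i] != src[i])
            return (long)i;
    }
    return -1;
}
```

Calling first_mismatch(des, src, 128) right after shl_7(des, src) should then return -1 if the copy is correct, and otherwise point at the first wrong byte.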
