Matrix operations probably don't matter much to most programmers, so here is a more down-to-earth AMP example:
1  #include <iostream>
2  #include <amp.h>
3
4  int main()
5  {
6      int nickName[6]{'a', 96, 'd', 'r', 'j', 'x'};
7      concurrency::array_view<int> myView(6, nickName);
8      concurrency::parallel_for_each(myView.extent,
9          [=] (concurrency::index<1> idx) restrict(amp)
10         {
11             myView[idx] += 1;
12         }
13     );
14     for(int i = 0; i<6; ++i)
15         std::cout<<(char)myView[i];
16     return 0;
17 }
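Before running it, can you tell what it outputs? Let's not rush to the answer; first, let's look at what is going on inside.
First of all, it will not compile as written, because Line 6 uses the C++11 brace-initialization syntax. To get it to compile, change that line to:
int nickName[6] = {'a', 96, 'd', 'r', 'j', 'x'};
That's right: VS11 does not currently support initializer lists.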
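Next, you may ask why nickName is an int array rather than a char array. The blunt answer is that array_view does not support char; the element type has to be at least the size of an int.
The declaration of array_view shows where this comes from: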
template <typename _Value_type, int _Rank = 1>
class array_view : public _Array_view_base<_Rank, sizeof(_Value_type)/sizeof(int)>
{
    // body omitted here for readability
};
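To make the arithmetic in that base-class argument concrete, here is a minimal sketch in plain C++ (no amp.h required). The helper size_in_ints is a hypothetical name introduced only for illustration; it mirrors the expression sizeof(_Value_type)/sizeof(int) and is not part of C++ AMP. How the real header rejects small types is not shown here.

```cpp
// Hypothetical helper (not part of amp.h): mirrors the expression
// sizeof(_Value_type)/sizeof(int) from the array_view declaration.
template <typename T>
struct size_in_ints
{
    static const int value = sizeof(T) / sizeof(int);
};

// An int occupies exactly one int-sized slot.
static_assert(size_in_ints<int>::value == 1, "int should map to one slot");

// A char is smaller than an int, so the integer division truncates to 0,
// which leaves no whole int-sized slot to describe the element.
static_assert(size_in_ints<char>::value == 0, "char should map to zero slots");
```

Because the size is expressed as a whole number of ints, a one-byte char cannot be represented, which is consistent with nickName being declared as int.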
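array_view was covered in the previous article. To restate it in plain terms, it is loosely similar to an iterator: it provides the index-based interface that parallel_for_each needs.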
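restrict is not a reserved word; it only has meaning in this particular context. This specifier tells the compiler what kind of code the programmer intends to generate: cpu or amp, where amp means the code targets an accelerator.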
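What is an accelerator? It is another device capable of parallel computation, such as your GPU, some other vector processor with SIMD support, or a processor emulated through an OS driver. Also, in earlier previews the specifier was the string "direct3d", which has since been replaced by amp. If you come across "direct3d" in older material, don't be too surprised.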
The kernel is the code that runs on the accelerator.
The kernel in this example is a single statement:
myView[idx] += 1;
Clearly, it adds 1 to every element of nickName. Now you should be able to tell what the output is, right?
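For readers without an AMP-capable compiler at hand, the effect of the kernel can be approximated on the CPU. The sketch below is only a sequential stand-in, assuming a C++11 compiler: add_one_to_each and as_text are hypothetical helper names, not C++ AMP functions, and the real parallel_for_each gives no guarantee about the order in which elements are processed.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical CPU-only stand-in for the kernel: adds 1 to every element.
// It runs sequentially, whereas the AMP kernel may run in any order.
std::array<int, 6> add_one_to_each(std::array<int, 6> values)
{
    for (std::size_t i = 0; i < values.size(); ++i)
        values[i] += 1;  // same body as the kernel: myView[idx] += 1;
    return values;
}

// Converts each int back to a character, like the (char) cast
// in the printing loop of the original program.
std::string as_text(const std::array<int, 6>& values)
{
    std::string text;
    for (int v : values)
        text.push_back(static_cast<char>(v));
    return text;
}
```

Passing the six values from nickName through add_one_to_each and then as_text produces the same string that the AMP program prints.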
Below is a summary of VS11's support for C++11 (Visual Studio only $_$):
| C++11 Core Language Features | VC10 | VC11 |
|---|---|---|
| Rvalue references v0.1, v1.0, v2.0, v2.1, v3.0 | v2.0 | v2.1* |
| ref-qualifiers | No | No |
| Non-static data member initializers | No | No |
| Variadic templates v0.9, v1.0 | No | No |
| Initializer lists | No | No |
| static_assert | Yes | Yes |
| auto v0.9, v1.0 | v1.0 | v1.0 |
| Trailing return types | Yes | Yes |
| Lambdas v0.9, v1.0, v1.1 | v1.0 | v1.1 |
| decltype v1.0, v1.1 | v1.0 | v1.1** |
| Right angle brackets | Yes | Yes |
| Default template arguments for function templates | No | No |
| Expression SFINAE | No | No |
| Alias templates | No | No |
| Extern templates | Yes | Yes |
| nullptr | Yes | Yes |
| Strongly typed enums | Partial | Yes |
| Forward declared enums | No | Yes |
| Attributes | No | No |
| constexpr | No | No |
| Alignment | TR1 | Partial |
| Delegating constructors | No | No |
| Inheriting constructors | No | No |
| Explicit conversion operators | No | No |
| char16_t and char32_t | No | No |
| Unicode string literals | No | No |
| Raw string literals | No | No |
| Universal character names in literals | No | No |
| User-defined literals | No | No |
| Standard-layout and trivial types | No | Yes |
| Defaulted and deleted functions | No | No |
| Extended friend declarations | Yes | Yes |
| Extended sizeof | No | No |
| Inline namespaces | No | No |
| Unrestricted unions | No | No |
| Local and unnamed types as template arguments | Yes | Yes |
| Range-based for-loop | No | Yes |
| override and final v0.8, v0.9, v1.0 | Partial | Yes |
| Minimal GC support | Yes | Yes |
| noexcept | No | No |

| C++11 Core Language Features: Concurrency | VC10 | VC11 |
|---|---|---|
| Reworded sequence points | N/A | N/A |
| Atomics | No | Yes |
| Strong compare and exchange | No | Yes |
| Bidirectional fences | No | Yes |
| Memory model | N/A | N/A |
| Data-dependency ordering | No | Yes |
| Data-dependency ordering: function annotation | No | No |
| exception_ptr | Yes | Yes |
| quick_exit and at_quick_exit | No | No |
| Atomics in signal handlers | No | No |
| Thread-local storage | Partial | Partial |
| Magic statics | No | No |

| C++11 Core Language Features: C99 | VC10 | VC11 |
|---|---|---|
| __func__ | Partial | Partial |
| C99 preprocessor | Partial | Partial |
| long long | Yes | Yes |
| Extended integer types | N/A | N/A |