HDU 4920 Matrix multiplication (5th Multi-University Training Contest)
Matrix multiplication
Time Limit: 4000/2000 MS (Java/Others) Memory Limit: 131072/131072 K (Java/Others)
Problem Description
Given two matrices A and B of size n × n, find the product of them.
Bobo hates big integers. So you are only asked to find the result modulo 3.
Input
The input consists of several tests. For each test:
The first line contains n (1 ≤ n ≤ 800). Each of the following n lines contains n integers, the description of the matrix A. The j-th integer in the i-th line equals A_ij. The next n lines describe the matrix B in similar format (0 ≤ A_ij, B_ij ≤ 10^9).
Output
For each test:
Print n lines. Each of them contains n integers, the matrix A × B in similar format.
Sample Input
1
0
1
2
0 1
2 3
4 5
6 7
Sample Output
0
0 1
2 1
Problem: Given two n × n matrices, compute their product, with every entry of the result taken modulo 3.
Analysis: I first solved this with the classic matrix multiplication method, but the submission timed out. I then searched online for matrix multiplication optimizations and found an optimization method. Unfortunately, I still don't understand how the optimization works.
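For reference, the classic method mentioned in the analysis can be sketched as follows. This is only an illustration of the straightforward i-j-k triple loop, not the submitted code: the helper name `multiplyNaive` and the `Matrix` alias are assumptions made for this sketch, and it works on `std::vector` instead of global arrays.

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<int>>;  // hypothetical alias for this sketch

// Classic i-j-k multiplication of two n x n matrices, reduced modulo 3.
// multiplyNaive is a hypothetical helper name, not taken from the original post.
Matrix multiplyNaive(const Matrix& a, const Matrix& b) {
    const std::size_t n = a.size();
    assert(b.size() == n);  // both matrices are expected to be n x n
    Matrix c(n, std::vector<int>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            int sum = 0;
            for (std::size_t k = 0; k < n; ++k)
                sum += (a[i][k] % 3) * (b[k][j] % 3);  // each term is at most 4
            c[i][j] = sum % 3;
        }
    return c;
}
```

Applied to the second sample test, A = {{0, 1}, {2, 3}} and B = {{4, 5}, {6, 7}}, it produces the rows 0 1 and 2 1.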
    #include <cstdio>
    #include <cstring>
    #include <algorithm>
    using namespace std;

    const int N = 805;
    int a[N][N], b[N][N], ans[N][N];

    void Multi(int n)
    {
        int i, j, k, L, *p2;
        int tmp[N], con;
        for (i = 0; i < n; ++i) {
            memset(tmp, 0, sizeof(tmp));
            // Main part: k runs up to the largest multiple of 16 not exceeding n.
            for (k = 0, L = (n & ~15); k < L; ++k) {
                con = a[i][k];
                for (j = 0, p2 = b[k]; j < n; ++j, ++p2)
                    tmp[j] += con * (*p2);
                // Take the remainder only once every 16 values of k.
                if ((k & 15) == 15) {
                    for (j = 0; j < n; ++j)
                        tmp[j] %= 3;
                }
            }
            // Remaining values of k (fewer than 16).
            for ( ; k < n; ++k) {
                con = a[i][k];
                for (j = 0, p2 = b[k]; j < n; ++j, ++p2)
                    tmp[j] += con * (*p2);
            }
            for (j = 0; j < n; ++j)
                ans[i][j] = tmp[j] % 3;
        }
    }

    int main()
    {
        int n, i, j;
        while (~scanf("%d", &n)) {
            // Read A and B, reducing every entry modulo 3 on input.
            for (i = 0; i < n; i++)
                for (j = 0; j < n; j++) {
                    scanf("%d", &a[i][j]);
                    a[i][j] %= 3;
                }
            for (i = 0; i < n; i++)
                for (j = 0; j < n; j++) {
                    scanf("%d", &b[i][j]);
                    b[i][j] %= 3;
                }
            Multi(n);
            for (i = 0; i < n; i++) {
                for (j = 0; j < n - 1; j++)
                    printf("%d ", ans[i][j]);
                printf("%d\n", ans[i][n - 1]);
            }
        }
        return 0;
    }
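Why the remainder can be postponed in the program above follows from a simple bound, sketched here under the assumption that `int` has at least 32 bits. Because every entry is reduced on input, each factor lies in {0, 1, 2}, and `tmp[j]` is at most 2 right after a reduction:

```latex
0 \le a_{ik}\, b_{kj} \le 2 \cdot 2 = 4,
\qquad
\mathrm{tmp}[j] \le 2 + 16 \cdot 4 = 66 \quad \text{between two reductions},
\qquad
\sum_{k=1}^{n} a_{ik}\, b_{kj} \le 4n \le 4 \cdot 800 = 3200 < 2^{31} - 1 .
```

So reducing once every 16 values of k cannot overflow, and by the last inequality even a single reduction at the very end would still give the correct result. The saving comes from executing far fewer `%` operations inside the triple loop.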
The following method can also be used:
    #include <cstdio>
    #include <cstring>
    #include <algorithm>
    #include <cmath>
    using namespace std;

    const int N = 805;
    int a[N][N], b[N][N], ans[N][N];

    int main()
    {
        int n, i, j, k;
        while (~scanf("%d", &n)) {
            for (i = 1; i <= n; i++)
                for (j = 1; j <= n; j++) {
                    scanf("%d", &a[i][j]);
                    a[i][j] %= 3;
                }
            for (i = 1; i <= n; i++)
                for (j = 1; j <= n; j++) {
                    scanf("%d", &b[i][j]);
                    b[i][j] %= 3;
                }
            memset(ans, 0, sizeof(ans));
            // In the classic algorithm this loop over k is the innermost one, which times out;
            // with k as the outermost or the middle loop it does not. I do not know why.
            for (k = 1; k <= n; k++)
                for (i = 1; i <= n; i++)
                    for (j = 1; j <= n; j++) {
                        ans[i][j] += a[i][k] * b[k][j];
                        // ans[i][j] %= 3;  // taking the remainder here causes a timeout
                    }
            for (i = 1; i <= n; i++) {
                for (j = 1; j < n; j++)
                    printf("%d ", ans[i][j] % 3);
                printf("%d\n", ans[i][n] % 3);
            }
        }
        return 0;
    }
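A likely explanation for the timing difference noted in the comment is memory access order. C++ stores a two-dimensional array row by row, so when k is the innermost loop the access to the second matrix at [k][j] jumps a whole row of N ints on every step, which tends to be unfriendly to the cache. When j is the innermost loop, both the result row and the row of the second matrix are read left to right. Taking the remainder inside the loop also adds a division to each of the n³ iterations. The sketch below illustrates the idea with the i-k-j order (the program above uses k-i-j; both keep j innermost). The name `multiplyRowMajor` and the `Matrix` alias are assumptions made for this illustration.

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<int>>;  // hypothetical alias for this sketch

// Multiplication modulo 3 with j as the innermost loop (i-k-j order).
// multiplyRowMajor is a hypothetical helper name used only for illustration.
Matrix multiplyRowMajor(const Matrix& a, const Matrix& b) {
    const std::size_t n = a.size();
    assert(b.size() == n);  // both matrices are expected to be n x n
    Matrix c(n, std::vector<int>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k) {
            const int aik = a[i][k] % 3;
            // Row k of b and row i of c are both traversed sequentially.
            for (std::size_t j = 0; j < n; ++j)
                c[i][j] += aik * (b[k][j] % 3);
        }
    // One reduction per entry at the end; sums stay at most 4 * n.
    for (auto& row : c)
        for (int& x : row)
            x %= 3;
    return c;
}
```

The result is the same as with the classic loop order; only the order in which memory is visited changes.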