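The multicore era has arrived, and software developers can no longer ignore parallel programming. The Fork/Join framework that will be added in JDK 7 is a classic approach to parallel programming; Python has long had the Parallel Python (pp) module for multicore parallel computing; and languages designed for concurrency, such as Erlang, have become very popular. Below we take one concrete example, computing the sum of all primes below each number in a given array, and use it to get a feel for the characteristics and performance of parallel computing in Java and Python:
The Java Fork/Join implementation. It requires jsr166y, which can be downloaded from http://g.oswego.edu/dl/concurrency-interest/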
import java.util.Date;
import java.util.concurrent.TimeUnit;
import jsr166y.ForkJoinPool;
import jsr166y.ForkJoinTask;
import jsr166y.RecursiveAction;
import org.junit.Test;

class Prime extends RecursiveAction {
    final long[] array;
    int len; // index of the array element this task handles

    public Prime(long[] array, int len) {
        this.array = array;
        this.len = len;
    }

    // Trial division up to ceil(sqrt(num)).
    private boolean isPrime(long num) {
        if (num < 2) return false;
        if (num == 2) return true;
        long max = (long) Math.ceil(Math.sqrt(num));
        long i = 2;
        while (i <= max) {
            if (num % i == 0)
                return false;
            ++i;
        }
        return true;
    }

    // Sum of all primes strictly below num.
    private long sumPrime(long num) {
        long sum = 0;
        for (int i = 2; i < num; i++) {
            if (isPrime(i)) {
                sum += i;
            }
        }
        return sum;
    }

    // Linear search helper (not used by compute()).
    private boolean hasElement(long[] a, long b) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == b) return true;
        }
        return false;
    }

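    @Override
    protected void compute() {
        if (len >= 0) {
            // Fork one subtask for the remaining elements (indices 0..len-1),
            // handle element len in this task, then wait for the subtask,
            // so that every element is summed exactly once.
            Prime rest = new Prime(array, len - 1);
            rest.fork();
            System.out.println("Sum of primes below " + array[len] + " is " + sumPrime(array[len]));
            rest.join();
        }
    }
}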

public class TestForkJoinSimple {
    long[] array = {1000000, 1000100, 1000200, 1000300, 1000400};

    @Test
    public void testSort() throws Exception {
        Date date1 = new Date();
        // Start from the last index; the task works its way down to index 0.
        ForkJoinTask<Void> sort = new Prime(array, 4);
        ForkJoinPool fjpool = new ForkJoinPool();
        fjpool.submit(sort);
        fjpool.shutdown();
        System.out.println("Starting prime with " + fjpool.getParallelism() + " workers");
        // Wait at most 4 seconds for the submitted tasks to finish.
        fjpool.awaitTermination(4, TimeUnit.SECONDS);
        // Elapsed wall-clock time in milliseconds.
        System.out.println(new Date().getTime() - date1.getTime());
    }
}
Output:
Starting prime with 2 workers
Sum of primes below 1000400 is 37582408783
Sum of primes below 1000300 is 37574405939
Sum of primes below 1000100 is 37556402315
Sum of primes below 1000200 is 37566403929
Sum of primes below 1000000 is 37550402023
4043
The Python implementation using pp; see http://www.parallelpython.com/
# Python 2 syntax (print statement, xrange).
import math, sys, time
import pp

def isprime(n):
    """Return True if n is prime, using trial division up to ceil(sqrt(n))."""
    if not isinstance(n, int):
        raise TypeError("argument passed to is_prime is not of 'int' type")
    if n < 2:
        return False
    if n == 2:
        return True
    max = int(math.ceil(math.sqrt(n)))
    i = 2
    while i <= max:
        if n % i == 0:
            return False
        i += 1
    return True

def sum_primes(n):
    """Return the sum of all primes strictly below n."""
    return sum([x for x in xrange(2, n) if isprime(x)])

print """Usage: python sum_primes.py [ncpus]
[ncpus] - the number of workers to run in parallel,
if omitted it will be set to the number of processors in the system
"""

# No remote servers: all jobs run on the local machine.
ppservers = ()

if len(sys.argv) > 1:
    ncpus = int(sys.argv[1])
    job_server = pp.Server(ncpus, ppservers=ppservers)
else:
    # Let pp detect the number of processors.
    job_server = pp.Server(ppservers=ppservers)

print "Starting pp with", job_server.get_ncpus(), "workers"

# Submit a single job: function, arguments, functions it depends on, modules it imports.
job1 = job_server.submit(sum_primes, (100,), (isprime,), ("math",))
# Calling the job object blocks until the result is available.
result = job1()
print "Sum of primes below 100 is", result

start_time = time.time()
inputs = (100000, 100100, 100200, 100300, 100400)
# Submit all jobs first so they can run in parallel, then collect the results.
jobs = [(input, job_server.submit(sum_primes, (input,), (isprime,), ("math",)))
        for input in inputs]
for input, job in jobs:
    print "Sum of primes below", input, "is", job()

print "Time elapsed: ", time.time() - start_time, "s"
job_server.print_stats()
Output:
Usage: python sum_primes.py [ncpus]
[ncpus] - the number of workers to run in parallel,
if omitted it will be set to the number of processors in the system
Starting pp with 2 workers
Sum of primes below 100 is 1060
Sum of primes below 100000 is 454396537
Sum of primes below 100100 is 454996777
Sum of primes below 100200 is 455898156
Sum of primes below 100300 is 456700218
Sum of primes below 100400 is 457603451
Time elapsed: 1.36199998856 s
Job execution statistics:
job count | % of all jobs | job time sum | time per job | job server
6 | 100.00 | 2.2980 | 0.383000 | local
Time elapsed since server creation 1.36199998856
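The pp statistics above can be turned into a rough speedup estimate. The sketch below only reuses the numbers printed in that output; it assumes that the summed job time (2.2980 s) is a fair stand-in for how long a single worker would have needed, which the output itself does not confirm.

```python
# Figures copied from the pp statistics printed above.
job_time_sum = 2.2980      # seconds of work summed over all 6 jobs
elapsed = 1.36199998856    # wall-clock seconds since server creation
workers = 2                # "Starting pp with 2 workers"

# Assumption: the summed job time approximates a one-worker run.
speedup = job_time_sum / elapsed
efficiency = speedup / workers

print("estimated speedup: %.2f" % speedup)
print("estimated efficiency: %.2f" % efficiency)
```

Under that assumption, the two workers finish the work in noticeably less wall-clock time than the jobs add up to, but somewhat short of a perfect twofold gain.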
What can we conclude from these results?
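For readers who want to rerun the Python experiment without installing pp, here is a minimal sketch of the same computation using `concurrent.futures.ProcessPoolExecutor` from the Python 3 standard library. This is a swapped-in technique, not the pp API used above; the names `parallel_sum_primes` and `limit` are hypothetical, and timings from it are not directly comparable with the figures above.

```python
import math
import time
from concurrent.futures import ProcessPoolExecutor


def isprime(n):
    """Trial division up to ceil(sqrt(n)), mirroring isprime in the pp listing."""
    if n < 2:
        return False
    if n == 2:
        return True
    limit = int(math.ceil(math.sqrt(n)))  # hypothetical name; the listing calls it max
    return all(n % i != 0 for i in range(2, limit + 1))


def sum_primes(n):
    """Sum of all primes strictly below n."""
    return sum(x for x in range(2, n) if isprime(x))


def parallel_sum_primes(inputs, workers=None):
    """Hypothetical helper: run sum_primes on each input in a process pool.

    workers=None lets the executor pick a default based on the processor count.
    """
    with ProcessPoolExecutor(max_workers=workers) as executor:
        return dict(zip(inputs, executor.map(sum_primes, inputs)))


if __name__ == "__main__":
    inputs = (100000, 100100, 100200, 100300, 100400)
    start = time.time()
    for n, total in parallel_sum_primes(inputs).items():
        print("Sum of primes below", n, "is", total)
    print("Time elapsed:", time.time() - start, "s")
```

Separate processes are used rather than threads because the work is CPU-bound pure Python. The `if __name__ == "__main__":` guard keeps worker processes from re-running the demo when they import the module.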