A small program that generates 10^n integers, and the wisdom it reveals


Wei renyan 2010.8.28

I have recently been studying Hadoop. Last time, I used the MapReduce model to write a program that sums a large number of integers. To test it, I wrote a Ruby script that generates random integers, writes them to a text file in a specified format, and feeds the generated data to the MapReduce program (numersum) for summation. Anyone who has learned programming could write such a small function, because the problem is very simple. The task is this: a value n defines the volume of data to generate. For example, when n = 3, 1000 random numbers are generated and written to the file, which a for loop repeated 1000 times can do. What I actually want is a generated data file of at least 10 MB, which requires at least n >= 5, and the more the better. With the problem and a rough solution in hand, I will now walk through the whole implementation step by step.

 

Problem: generate a text file containing 10^n integers. n can be arbitrarily large, and the running time of the program must remain tolerable. In other words, the program has a high performance requirement.

Solution:

1. Define an array m. Generate a random number x with rand and store it in m. The random numbers must include negative values, and each one is limited to -99 <= x <= 99.

2. Read the generated values from the array and write them to the num.txt file.
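
The two steps can be sketched compactly as follows. This is only a minimal illustration, not the author's implementation, which appears further down. The helper names generate_integers and write_rows are hypothetical, and the sketch assumes a Ruby version in which rand accepts a Range.

```ruby
require "stringio"

# Hypothetical helpers, for illustration only.

# Step 1: build an array of `count` random integers in -99..99.
def generate_integers(count)
  Array.new(count) { rand(-99..99) }
end

# Step 2: write the array to `io`, ten space-separated integers per row.
def write_rows(io, values, per_row = 10)
  values.each_slice(per_row) { |row| io.puts(row.join(" ")) }
end

# Example: 1000 integers written to an in-memory buffer instead of num.txt.
buffer = StringIO.new
write_rows(buffer, generate_integers(10**3))
puts buffer.string.lines.length  # 100 rows of 10 integers
```

Passing an opened File object instead of the StringIO writes the same rows to disk.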

 

I implemented this in two ways. The first uses a simple loop. The second uses multiple threads: the integers to generate are divided into several small tasks, which are allocated to t threads. When each thread finishes, it writes its results to the specified file.

 

First implementation:

    #!/usr/bin/ruby
    puts "Hello, world...\n"

    # Append a string to the given file.
    def write(filename, context)
      out = File.open(filename, "a")
      out.write(context)
      out.close
    end

    start = Time.now
    puts "start...#{start}"
    n = 10 ** 5
    a = []

    # Generate the array values; some indices get a negative number.
    for i in 0..n-1
      if i % 3 == 0 && i % 2 == 0
        a[i] = -rand(100)
      elsif i % 4 == 0 && i % 5 == 0
        a[i] = -rand(100)
        # a[i-1] = -a[i-1]
      else
        a[i] = rand(100)
      end
    end
    # puts a.join(" ")

    # Read the values from the array and write them to the file.
    i = 0
    while i < n
      u = i + 9
      if u >= n-1
        u = n-1
      end
      # 10 integers in each row; the file is opened and closed for every row
      str = a[i..u].join(" ") + "\n"
      write("num.txt", str)
      # puts str
      i += 10
    end

    ed = Time.now
    cs = ed - start
    puts "end...#{ed}"
    puts "consnum..#{cs}s\n"
    # puts a.join(" ")

    # Print the sum of all integers.
    s = 0
    a.each do |v|
      s += v
    end
    puts s

 

Second implementation:

    #!/usr/bin/ruby
    require 'thread'
    puts "Hello, world...\n"

    # Append a string to the given file.
    def write(filename, context)
      out = File.open(filename, "a")
      out.write(context)
      out.close
    end

    # Write the generated data to the file.
    def writearr(arr)
      i = 0
      n = arr.length
      str = ""
      while i < n
        u = i + 9
        if u >= n-1
          u = n-1
        end
        # 10 integers in each row
        str = str + arr[i..u].join(" ") + "\n"
        # Cache the rows and write them to the file in batches
        if ((u + 1) % 200000 == 0) || (u == n-1)
          write("num.txt", str)
          str = ""
        end
        i += 10
      end
    end

    threads = []
    n = 6
    # Number of integers to generate
    n = 10 ** n
    # Number of threads
    t = 4 * 10 ** 2  # n: 5, t: 20; n: 6, t: 600; n: 7, t: 7000
    mutex = Mutex.new
    starts = Time.now
    puts "Start threads...#{starts}"
    t.times { |i|
      # Define a thread. It starts running as soon as it is created,
      # so no explicit run call is needed.
      threads << Thread.new(i) {
        puts i
        arr = []
        count = 0
        u = n / t  # number of integers to be generated by each thread
        start = Time.now
        puts "start thread-#{i}...#{start}"
        loop do
          # mutex.synchronize do
          # puts count
          if count > u-1
            # puts arr.join(" ")
            ed = Time.now
            cs = ed - start
            puts "End thread-#{i}...#{ed}"
            puts "consnum..#{cs}s\n"
            puts count-1
            # mutex.synchronize do
            writearr(arr)
            # end
            Thread.current.exit
          end
          # Randomly generated integer
          index = count
          if index % 3 == 0 && index % 2 == 0
            arr[index] = -rand(10000)
          elsif index % 4 == 0 && index % 5 == 0
            arr[index] = -rand(10000)
            # arr[index-1] = -arr[index-1]
          else
            arr[index] = rand(10000)
          end
          count += 1
          # end
        end
      }
    }
    puts "Init complete"
    threads.each { |th| th.join }
    eds = Time.now
    css = eds - starts
    puts "End threads...#{eds}"
    puts "consnum..#{css}s\n"
    puts "done complete"

 

Program running result:

Generating 100000 integers (n = 5):

Implementation 1: 9.39 s

Implementation 2: 1.453 s

 

We all know that multi-threaded parallel computing can improve a program's running performance. When I used Implementation 1 to generate 10^10 integers, the running time was too long, so I wanted to use threads to shorten it. I quickly came up with the general idea of Implementation 2 and wrote the code. The first result was far worse than I expected: the running time was several times longer than that of Implementation 1. After a lot of debugging, I finally understood the problem and the wisdom embodied in it.

 

Key factors in running performance:

1. The number of threads

2. The number of integers each thread has to generate

3. Frequent parallel file writes, and the time spent waiting on them

 

To address these points, my solution is as follows: based on the number of integers to generate, I choose the number of threads and the number of integers each thread processes.

1. Task breakdown:

Number of integers generated by each thread = number of integers to generate / number of running threads

For a given number of integers, the number of threads must be matched to it, neither too many nor too few. With too few threads, the task of each thread is too heavy. With too many threads, each task is so small that creating and destroying the thread objects themselves takes too much time.

For example:

n = 5, t = 20: running time 1.453 s

n = 5, t = 50: running time 1.563 s

n = 6, t = 50: running time 39.141 s

n = 6, t = 200: running time 21.36 s

n = 6, t = 500: running time 15.922 s

n = 6, t = 600: running time 16.593 s
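
When the division is done with integer arithmetic, as in the second implementation, any remainder is dropped if the total is not a multiple of the thread count. Below is a small sketch of one way to split the work so that every integer is assigned to some thread. The function name chunk_sizes is hypothetical and does not appear in the original program.

```ruby
# Hypothetical helper: split `total` items across `threads` workers so that
# the sizes differ by at most one and always add up to `total`.
def chunk_sizes(total, threads)
  base, extra = total.divmod(threads)
  # The first `extra` workers take one additional item each.
  Array.new(threads) { |i| i < extra ? base + 1 : base }
end

p chunk_sizes(10**5, 20).first(3)  # => [5000, 5000, 5000]
p chunk_sizes(10, 3)               # => [4, 3, 3]
```

Each thread would then generate the number of integers given by its own entry instead of a shared n / t value.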

2. Use cache:

When there are many threads, they all write to the file, and waiting for file writes takes more than 70% of the total generation time. So when writing the file, buffer the output and write 20000 integers at a time until all the data has been read, instead of writing each small batch of integers as soon as it is read.

For example:

n = 5, t = 50, without the cache mechanism: running time 15.656 s

n = 5, t = 50, with the cache mechanism: running time 1.563 s
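
The caching idea can be sketched in isolation as follows. This is an illustrative assumption about how such a buffer might look, not the author's code: the class name BufferedRowWriter and its methods are hypothetical, and a StringIO stands in for num.txt so the example has no side effects.

```ruby
require "stringio"

# Hypothetical sketch of the caching idea: collect rows in memory and hand
# them to the underlying IO only once `flush_every` integers have piled up.
class BufferedRowWriter
  attr_reader :flushes

  def initialize(io, flush_every = 20_000)
    @io = io
    @flush_every = flush_every
    @buffer = "".dup
    @pending = 0   # integers currently held in the buffer
    @flushes = 0   # number of writes issued to the underlying IO
  end

  # Append one row of integers; flush when the threshold is reached.
  def add_row(row)
    @buffer << row.join(" ") << "\n"
    @pending += row.length
    flush if @pending >= @flush_every
  end

  # Write whatever is buffered in a single call.
  def flush
    return if @buffer.empty?
    @io.write(@buffer)
    @buffer = "".dup
    @pending = 0
    @flushes += 1
  end
end

out = StringIO.new
writer = BufferedRowWriter.new(out, 20_000)
(1..50_000).each_slice(10) { |row| writer.add_row(row) }
writer.flush          # write the final partial batch
puts writer.flushes   # 3 writes instead of one per row (5000)
```

The fewer write calls there are, the less time threads spend waiting on the file, which is the effect the timings above show.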

 

From this exercise I learned that when Hadoop is used to process data, it is very important how the map step breaks down the input data and how many map tasks are used. This number must be chosen according to the size of the input data. On the output side, the data can be merged several times to reduce the volume of output, which improves the overall running performance of the program.
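
The same idea can be illustrated for the summation task in plain Ruby. This is not Hadoop API code, only a sketch of splitting the input, merging locally, and reducing. All names here (partial_sums, total_sum, split_size) are hypothetical.

```ruby
# Plain-Ruby illustration of split / merge / reduce for summation.
# Not Hadoop API code; all names here are hypothetical.

# "Map" side: each split of the input is summed locally, so only one
# number per split is passed on (similar in spirit to a Hadoop combiner).
def partial_sums(values, split_size)
  values.each_slice(split_size).map { |split| split.inject(0, :+) }
end

# "Reduce" side: add up the per-split results.
def total_sum(values, split_size)
  partial_sums(values, split_size).inject(0, :+)
end

data = (-99..99).to_a * 3
p partial_sums(data, 200).length  # => 3 (number of splits for 597 values)
p total_sum(data, 200)            # => 0
```

Choosing split_size plays the role of choosing the number of map tasks: it should follow from the size of the input rather than be fixed in advance.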

 

Note: If you reprint this article, please indicate the source. Thank you.

http://blog.csdn.net/savechina
