Algorithms Part 1-Question 4-SCC Problems

Source: Internet
Author: User
Algorithms: Design and Analysis, Part 1

This is the most difficult assignment in the course. Beyond the algorithm itself, the implementation also matters, because many programming languages cannot handle a very large number of nested recursive calls.

Question description

Download the text file here. Zipped version here.
(Right click and save link)

The file contains the edges of a directed graph. Vertices are labeled as positive integers from 1 to 875714. Every row indicates an edge: the vertex label in the first column is the tail and the vertex label in the second column is the head (recall the graph is directed, and the edges are directed from the first column vertex to the second column vertex). So for example, the 11th row looks like: "2 47646". This just means that the vertex with label 2 has an outgoing edge to the vertex with label 47646.
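A minimal sketch of reading this row format. The helper name `parse_edges` is hypothetical and not part of the assignment; it assumes each row holds exactly two whitespace-separated integers and skips anything else.

```python
def parse_edges(lines):
    """Yield (tail, head) integer pairs from rows such as '2 47646'."""
    for line in lines:
        fields = line.split()
        if len(fields) != 2:  # skip blank or malformed rows (an assumption)
            continue
        yield int(fields[0]), int(fields[1])


if __name__ == "__main__":
    sample = ["1 2\n", "2 47646\n", "\n"]
    for tail, head in parse_edges(sample):
        print(tail, "->", head)
```

Because it is a generator, an open file object can be passed in directly and the rows are processed one at a time instead of being loaded into memory all at once.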

Your task is to code up the algorithm from the video lectures for computing strongly connected components (SCCs), and to run this algorithm on the given graph.

Output Format: You should output the sizes of the 5 largest SCCs in the given graph, in decreasing order of sizes, separated by commas (avoid any spaces). So if your algorithm computes the sizes of the five largest SCCs to be 500, 400, 300, 200 and 100, then your answer should be "500,400,300,200,100". If your algorithm finds fewer than 5 SCCs, then write 0 for the remaining terms. Thus, if your algorithm computes only 3 SCCs whose sizes are 400, 300, and 100, then your answer should be "400,300,100,0,0".
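The output format can be sketched as a small helper. The name `format_answer` and its parameter `k` are hypothetical; it assumes the SCC sizes are already available as a list of integers.

```python
def format_answer(sizes, k=5):
    """Return the k largest sizes, largest first, comma-separated and padded with zeros."""
    top = sorted(sizes, reverse=True)[:k]
    top += [0] * (k - len(top))  # pad when fewer than k SCCs were found
    return ",".join(str(size) for size in top)


if __name__ == "__main__":
    print(format_answer([100, 300, 500, 200, 400]))
    print(format_answer([400, 300, 100]))
```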

Warning: This is the most challenging programming assignment of the course. Because of the size of the graph you may have to manage memory carefully. The best way to do this depends on your programming language and environment, and we strongly suggest that you exchange tips for doing this on the discussion forums.

Algorithm Implementation

The algorithm itself is relatively simple and has three steps. First, build the transpose graph (every edge reversed). Second, run DFS over the transpose graph and record the order in which the vertices finish; this finishing order plays the role of a topological order. Third, run DFS on the original graph, starting from the vertices in decreasing order of finishing time. The set of vertices reached by each DFS in the third step is one strongly connected component (SCC).

The Python implementation is as follows:

def firstdfs(vertexind):
    # First pass: DFS on the transpose graph, recording the finishing order.
    global fs, isexplored, visitordered, mapDictT
    for ind in mapDictT[vertexind]:
        if not isexplored[ind-1]:
            isexplored[ind-1] = True
            firstdfs(ind)
    # fs counts down, so vertices that finish later end up nearer the front.
    visitordered[fs-1] = vertexind
    #print(str(vertexind)+' fs: '+str(fs))
    fs = fs - 1

def seconddfs(vertexind):
    # Second pass: DFS on the original graph; header[s-1] is the size of the SCC led by s.
    global s, secisexplored, header, mapDict
    for ind in mapDict[vertexind]:
        if not secisexplored[ind-1]:
            secisexplored[ind-1] = True
            seconddfs(ind)
    # Count every visited vertex, including ones with no outgoing edges.
    header[s-1] += 1

maplength = 875714
#maplength = 8
mapDict = {x: [] for x in range(1, maplength+1)}   # original graph
mapDictT = {x: [] for x in range(1, maplength+1)}  # transpose graph
with open('SCC.txt', 'r') as f:
    for line in f:
        tmp = [int(x) for x in line.split()]
        mapDict[tmp[0]].append(tmp[1])
        mapDictT[tmp[1]].append(tmp[0])

fs = maplength
isexplored = [False] * maplength
secisexplored = [False] * maplength
visitordered = [0] * maplength
header = [0] * maplength

for ind in range(1, maplength+1):
    if not isexplored[ind-1]:
        #print('Begin from: '+str(ind))
        isexplored[ind-1] = True
        firstdfs(ind)

print('Second DFS')
for ind in visitordered:
    if not secisexplored[ind-1]:
        s = ind
        secisexplored[ind-1] = True
        seconddfs(ind)

header.sort(reverse=True)
print(header[0:20])

The graph used for testing is stored in a text file. The content of the file used for testing is as follows:

1 2
2 6
2 3
2 4
3 1
3 4
4 5
5 4
6 5
6 7
7 6
7 8
8 5
8 7

Note: Change maplength to 8 when testing. The first five values of the output should be:

3,3,2,0,0

Python recursion depth

Python has a default limit on recursion depth, which is usually 1000. If a program recurses deeper than this limit, Python raises a "maximum recursion depth exceeded" error. Use the following code to change the limit and display the new value:

import sys
sys.setrecursionlimit(80000000)
print(sys.getrecursionlimit())

Use the following code to test how deep the recursion can actually go:

def f(d):
    if d % 500 == 0:
        print(d)
    f(d + 1)

f(1)

In my tests with the code above, the maximum recursion depth was about 4000 with Python 3.3 on Windows 8 (8 GB of memory) and about 26000 with Python 3.2 on Debian 6 (16 GB of memory). Memory was not exhausted in either case. So even though the recursion limit has been raised, the depth is still restricted by something else. When this happens, no recursion-limit error is reported; the program simply crashes.

Where is the problem? I went to the forum and read other people's discussions, and found that the cause is an insufficient stack size.

Improved version of the program

Setting the thread stack size to 64 MB and running the computation in a new thread solves the problem.
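The idea can be isolated in a small helper, sketched here under some assumptions. The names `run_with_big_stack` and `depth` are hypothetical, the 64 MB default mirrors the value used in this post, and whether the operating system accepts a given stack size is platform-dependent (`threading.stack_size` can raise an error if it does not).

```python
import sys
import threading


def run_with_big_stack(func, *args, stack_bytes=64 * 1024 * 1024, limit=1_000_000):
    """Run func(*args) in a new thread with a larger stack and return its result."""
    result = {}

    def worker():
        try:
            result["value"] = func(*args)
        except BaseException as exc:  # hand any error back to the caller
            result["error"] = exc

    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    # stack_size() only affects threads created after the call; it returns the old size.
    old_stack = threading.stack_size(stack_bytes)
    try:
        thread = threading.Thread(target=worker)
        thread.start()
        thread.join()
    finally:
        threading.stack_size(old_stack)
        sys.setrecursionlimit(old_limit)
    if "error" in result:
        raise result["error"]
    return result["value"]


def depth(n):
    """Recurse n levels deep and return n."""
    return 0 if n == 0 else 1 + depth(n - 1)


if __name__ == "__main__":
    print(run_with_big_stack(depth, 50000))
```

The stack size has to be set before the thread is created, which is why the recursive work runs in a new thread rather than in the main thread.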

The complete code is as follows:

import sys, threading

sys.setrecursionlimit(3000000)
threading.stack_size(67108864)  # 64 MB stack for threads created after this call

def firstdfs(vertexind):
    # First pass: DFS on the transpose graph, recording the finishing order.
    global fs, isexplored, visitordered, mapDictT
    for ind in mapDictT[vertexind]:
        if not isexplored[ind-1]:
            isexplored[ind-1] = True
            firstdfs(ind)
    # fs counts down, so vertices that finish later end up nearer the front.
    visitordered[fs-1] = vertexind
    #print(str(vertexind)+' fs: '+str(fs))
    fs = fs - 1

def seconddfs(vertexind):
    # Second pass: DFS on the original graph; header[s-1] is the size of the SCC led by s.
    global s, secisexplored, header, mapDict
    for ind in mapDict[vertexind]:
        if not secisexplored[ind-1]:
            secisexplored[ind-1] = True
            seconddfs(ind)
    # Count every visited vertex, including ones with no outgoing edges.
    header[s-1] += 1

def sccmain():
    global mapDict, mapDictT, fs, isexplored, visitordered, s, secisexplored, header
    maplength = 875714
    #maplength = 11
    mapDict = {x: [] for x in range(1, maplength+1)}   # original graph
    mapDictT = {x: [] for x in range(1, maplength+1)}  # transpose graph
    with open('SCC.txt', 'r') as f:
        for line in f:
            tmp = [int(x) for x in line.split()]
            mapDict[tmp[0]].append(tmp[1])
            mapDictT[tmp[1]].append(tmp[0])

    fs = maplength
    isexplored = [False] * maplength
    secisexplored = [False] * maplength
    visitordered = [0] * maplength
    header = [0] * maplength

    for ind in range(1, maplength+1):
        if not isexplored[ind-1]:
            #print('Begin from: '+str(ind))
            isexplored[ind-1] = True
            firstdfs(ind)

    print('Second DFS')
    for ind in visitordered:
        if not secisexplored[ind-1]:
            s = ind
            secisexplored[ind-1] = True
            seconddfs(ind)

    header.sort(reverse=True)
    print(header[0:20])

if __name__ == '__main__':
    # Run in a new thread so that the enlarged stack size takes effect.
    thread = threading.Thread(target=sccmain)
    thread.start()
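As an alternative that this post does not use, the recursion can be replaced by an explicit stack, which avoids both the recursion limit and the thread stack size. The following is only a sketch under that assumption: the function name `scc_sizes` and its `(n, edges)` signature are hypothetical, and it follows the same two passes as the code above (transpose graph first, then the original graph in decreasing finishing order).

```python
def scc_sizes(n, edges):
    """Return the sizes of all SCCs of a graph on vertices 1..n, largest first."""
    graph = [[] for _ in range(n + 1)]
    transpose = [[] for _ in range(n + 1)]
    for tail, head in edges:
        graph[tail].append(head)
        transpose[head].append(tail)

    # Pass 1: iterative DFS on the transpose graph, recording vertices as they finish.
    seen = [False] * (n + 1)
    finish_order = []
    for start in range(1, n + 1):
        if seen[start]:
            continue
        seen[start] = True
        stack = [(start, iter(transpose[start]))]
        while stack:
            vertex, neighbours = stack[-1]
            for nxt in neighbours:  # the iterator resumes where it stopped
                if not seen[nxt]:
                    seen[nxt] = True
                    stack.append((nxt, iter(transpose[nxt])))
                    break
            else:
                finish_order.append(vertex)  # all neighbours handled
                stack.pop()

    # Pass 2: DFS on the original graph in decreasing finishing order.
    seen = [False] * (n + 1)
    sizes = []
    for start in reversed(finish_order):
        if seen[start]:
            continue
        seen[start] = True
        size = 0
        stack = [start]
        while stack:
            vertex = stack.pop()
            size += 1
            for nxt in graph[vertex]:
                if not seen[nxt]:
                    seen[nxt] = True
                    stack.append(nxt)
        sizes.append(size)
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    # The 8-vertex test graph from this post.
    test_edges = [(1, 2), (2, 6), (2, 3), (2, 4), (3, 1), (3, 4), (4, 5),
                  (5, 4), (6, 5), (6, 7), (7, 6), (7, 8), (8, 5), (8, 7)]
    print(scc_sizes(8, test_edges))
```

The first pass needs the finishing order, so its stack keeps an iterator per vertex; the second pass only needs to count reachable vertices, so a plain stack of vertices is enough.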
