Cluster Analysis of a LinkedIn User Network in Python

Source: Internet
Author: User

CODE:

#!/usr/bin/python
# -*- coding: utf-8 -*-
'''
Created on 2014-8-26
@author: guaguastd
@name: linkedin_network_clusters.py
'''

import os
import sys
import json
from urllib.error import HTTPError
from cluster import KMeansClustering, centroid

# A helper function to munge data and build up an XML tree
sys.path.append(os.path.join(os.getcwd(), "E:", "eclipse", "LinkedIn", "dfile"))
from mykml import createKML

K = 3

# Get the geocoder
from geo import geo_from_bing
g = geo_from_bing()

# Load the data
connections_data = r'E:\eclipse\LinkedIn\dfile\linkedin_connections.json'
out_file = r'E:\eclipse\LinkedIn\dfile\linkedin_clusters_kmeans.kml'

# Open up your saved connections with extended profile information
# or fetch them again from LinkedIn if you prefer
with open(connections_data) as fin:
    connections = json.load(fin)['values']

locations = [c['location']['name'] for c in connections if 'location' in c]

# Some basic transforms
transforms = [('Greater ', ''), (' Area', '')]

# Step 1 - Tally the frequency of each location
coords_freqs = {}
for location in locations:

    # Avoid unnecessary I/O and geo requests by building up a cache
    if location in coords_freqs:
        coords_freqs[location][1] += 1
        continue

    transformed_location = location

    # Geocoding runs inside the transform loop, so each new location is
    # looked up (and printed) once per transform
    for transform in transforms:
        transformed_location = transformed_location.replace(*transform)

        # Handle potential I/O errors with a retry pattern...
        num_errors = 0  # counted per lookup, outside the retry loop
        while True:
            try:
                results = g.geocode(transformed_location, exactly_one=False)
                print(results)
                break
            except HTTPError as e:
                num_errors += 1
                if num_errors >= 3:
                    sys.exit()
                print(e, file=sys.stderr)
                print('Encountered a urllib error. Trying again...',
                      file=sys.stderr)

        if results is None:
            continue

        for result in results:
            # Each result is of the form ("Description", (X, Y))
            coords_freqs[location] = [result[1], 1]
            break  # Disambiguation strategy is "pick first"

# Step 2 - Build up data structure for converting locations to KML
expanded_coords = []
for label in coords_freqs:
    # Flip lat/lon for Google Earth
    ((lat, lon), f) = coords_freqs[label]
    expanded_coords.append((label, [(lon, lat)] * f))

# No need to clutter the map with unnecessary placemarks...
kml_items = [{'label': label, 'coords': '%s,%s' % coords[0]}
             for (label, coords) in expanded_coords]

# It would also be helpful to include names of your contacts on the map
for item in kml_items:
    item['contacts'] = '\n'.join(
        ['%s %s.' % (c['firstName'], c['lastName'])
         for c in connections
         if 'location' in c and c['location']['name'] == item['label']])

# Step 3 - Cluster locations and extend the KML data structure with centroids
cl = KMeansClustering([coords for (label, coords_list) in expanded_coords
                       for coords in coords_list])

centroids = [{'label': 'Centroid', 'coords': '%s,%s' % centroid(c)}
             for c in cl.getclusters(K)]

kml_items.extend(centroids)

# Step 4 - Create the final KML output and write it to a file
kml = createKML(kml_items)

with open(out_file, 'w') as f:
    f.write(kml)

print('Data written to ' + out_file)
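Step 3 of the script hands the expanded (lon, lat) list to KMeansClustering and then takes the centroid of each cluster. As a rough illustration of what that step computes, here is a minimal standard-library sketch of Lloyd's k-means. It is not the `cluster` package's implementation: the function names `kmeans` and `squared_distance`, the fixed seed, and the sample points are all assumptions made for this example, and longitude/latitude are treated as flat Euclidean coordinates, which is only a crude approximation.

```python
import random


def centroid(points):
    """Return the component-wise mean of a list of (x, y) tuples."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def squared_distance(a, b):
    """Squared Euclidean distance between two (x, y) tuples."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points, k, iterations=20, seed=0):
    """Illustrative Lloyd's algorithm; returns k clusters (lists of points)."""
    rng = random.Random(seed)
    # Start from k distinct input points as the initial centers
    centers = rng.sample(sorted(set(points)), k)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster,
        # keeping the old center if a cluster ended up empty
        new_centers = [centroid(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return clusters


# Hypothetical (lon, lat) points, repeated once per contact as in Step 2
points = ([(116.4, 39.9)] * 3 + [(104.1, 30.7)] * 2 +
          [(-1.9, 52.5)] * 2 + [(-122.3, 37.7)])
clusters = kmeans(points, k=3)
centroids = [centroid(c) for c in clusters if c]
```

Repeating a coordinate once per contact, as Step 2 does, pulls each centroid toward locations where more contacts live.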

RESULT:

[Location(Beijing, Beijing, China 54m 0.0s N, 23m 0.0s E)]
[Location(Beijing, Beijing, China 54m 0.0s N, 23m 0.0s E)]
None
[Location(CA, United States 43m 0.0s N, 122 15m 0.0s W)]
[Location(Birmingham, England, United Kingdom 29m 0.0s N, 1 55m 0.0s W), Location(Birmingham, England, United Kingdom 27m 0.0s N, 1 43m 0.0s W), Location(Birmingham Airport, England, United Kingdom 27m 0.0s N, 1 44m 0.0s W), Location(Birmingham Business Park, England, United Kingdom 28m 0.0s N, 1 43m 0.0s W)]
[Location(Birmingham, England, United Kingdom 29m 0.0s N, 1 55m 0.0s W), Location(Birmingham, England, United Kingdom 27m 0.0s N, 1 43m 0.0s W), Location(Birmingham Airport, England, United Kingdom 27m 0.0s N, 1 44m 0.0s W), Location(Birmingham Business Park, England, United Kingdom 28m 0.0s N, 1 43m 0.0s W)]
[Location(China 33m 0.0s N, 103 59m 0.0s E)]
[Location(China 33m 0.0s N, 103 59m 0.0s E)]
[Location(Chengdu, Sichuan, China 40m 0.0s N, 104 5m 0.0s E)]
[Location(Chengdu, Sichuan, China 40m 0.0s N, 104 5m 0.0s E)]
[Location(Xingtai, Hebei, China 4m 0.0s N, 29m 0.0s E)]
[Location(Xingtai, Hebei, China 4m 0.0s N, 98 29m 0.0s E)]
[Location(United States 27m 0.0s N, 57m 0.0s W)]
[Location(United States 27m 0.0s N, 98 57m 0.0s W)]
[Location(Foshan, Guangdong, China 2m 0.0s N, 113 6m 0.0s E)]
[Location(Foshan, Guangdong, China 2m 0.0s N, 113 6m 0.0s E)]
Data written to E:\eclipse\LinkedIn\dfile\linkedin_clusters_kmeans.kml
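The KML file named in the last output line is built by createKML, which comes from the author's own mykml module and is not shown on this page. The sketch below is only a guess at what such a helper might look like, using the standard library's ElementTree. The function name `create_kml`, the `tag` helper, and the sample items are hypothetical; the only things taken from the script are the item keys ('label', 'coords', 'contacts') and the "lon,lat" coordinate order that KML expects.

```python
from xml.etree import ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def tag(name):
    """Qualify an element name with the KML namespace."""
    return "{%s}%s" % (KML_NS, name)


def create_kml(items):
    """Hypothetical stand-in for mykml.createKML.

    Each item is a dict with 'label', 'coords' ("lon,lat") and,
    optionally, 'contacts'. Returns the KML document as a string.
    """
    ET.register_namespace("", KML_NS)  # serialize KML as the default namespace
    kml = ET.Element(tag("kml"))
    doc = ET.SubElement(kml, tag("Document"))
    for item in items:
        placemark = ET.SubElement(doc, tag("Placemark"))
        ET.SubElement(placemark, tag("name")).text = item["label"]
        if item.get("contacts"):
            ET.SubElement(placemark, tag("description")).text = item["contacts"]
        point = ET.SubElement(placemark, tag("Point"))
        ET.SubElement(point, tag("coordinates")).text = item["coords"]
    return ET.tostring(kml, encoding="unicode")


# Hypothetical items in the same shape as kml_items in the script
sample_items = [
    {"label": "Beijing", "coords": "116.4,39.9", "contacts": "First Last."},
    {"label": "Centroid", "coords": "10.0,20.0"},
]
sample_kml = create_kml(sample_items)
```

Centroid entries carry no 'contacts' key, so the description element is written only when that key is present.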


