Get the GitHub repository list with Python

Source: Internet
Author: User

1. Background

Project requirement: use the GitHub repository API to extract repository data for analysis. After a day of study I finally got it working, although the approach is still fairly inefficient.

GitHub's repository listing API returns the details of each repository in JSON format. I could not find a way to parse a response that contains multiple JSON objects, so I fell back on a crude split + re approach. If you know a better method, leave a comment to discuss it!
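One possible alternative to splitting the text is Python's standard json module, which turns the whole response into a list of dictionaries. The following is a minimal sketch rather than the method used in this article; the helper name extract_repo_urls and the sample payload are made up for illustration, and the sample keeps only a few of the fields a real response contains.

```python
import json


def extract_repo_urls(body):
    """Return the top-level "url" field of every repository in a JSON array string."""
    repos = json.loads(body)  # the listing is a JSON array of objects
    return [repo["url"] for repo in repos]


# Hypothetical, heavily trimmed sample of a repository listing response.
sample = """
[
  {"id": 1, "name": "merb-core",
   "owner": {"login": "wycats", "url": "https://api.github.com/users/wycats"},
   "url": "https://api.github.com/repos/wycats/merb-core"},
  {"id": 2, "name": "rubinius",
   "owner": {"login": "rubinius", "url": "https://api.github.com/users/rubinius"},
   "url": "https://api.github.com/repos/rubinius/rubinius"}
]
"""

for url in extract_repo_urls(sample):
    print(url)
```

Because only the top-level "url" key of each object is read, the nested owner URL is skipped without any extra filtering.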

2. Code

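import os
import re

def GetUrl(num):
    # Fetch one batch of the repository listing with curl.
    # The URL is quoted so the shell does not interpret "?".
    output = os.popen('curl -G "https://api.github.com/repositories?since=%d"' % num).read()
    pattern = '"url"'
    pattern1 = 'repos'
    # Each JSON field sits on its own line, so split on ",\n".
    urls = output.split(',\n')
    for i in urls:
        # Keep only the lines holding a repository API address.
        if pattern in i and pattern1 in i:
            # The second quoted string on the line is the URL itself.
            text = re.compile('"(.*?)"').findall(i)[1]
            print(text)

if __name__ == '__main__':
    GetUrl(1000)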

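Here, num is passed as the since parameter: the API returns repositories whose ID is greater than this value, so it is a starting repository ID rather than a page number. We can wrap the call in a loop and keep increasing num to extract more and more repositories. Because the GitHub API limits both the size of each response and the rate of requests, fetching the list in batches like this is a workable approach.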
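The loop described above could look roughly like the sketch below. It assumes each batch has already been parsed into a list of dictionaries with "id" and "url" keys, and that the since value is the repository ID to start after. fetch_page is a hypothetical callable standing in for the actual HTTP request, and the names collect_repo_urls, max_pages and pause are invented for this example. The sketch stops after a fixed number of batches instead of running forever.

```python
import time


def collect_repo_urls(fetch_page, start=0, max_pages=3, pause=0.0):
    """Walk the listing batch by batch, feeding the largest seen ID back in as `since`."""
    urls = []
    since = start
    for _ in range(max_pages):
        repos = fetch_page(since)  # hypothetical: returns a list of dicts
        if not repos:
            break  # an empty batch is treated as the end of the listing
        urls.extend(repo["url"] for repo in repos)
        since = max(repo["id"] for repo in repos)
        time.sleep(pause)  # raise this to stay under the API's rate limit
    return urls


# Stand-in for the real request, so the sketch runs without network access.
def fake_fetch(since):
    data = [{"id": i, "url": "https://api.github.com/repos/example/repo%d" % i}
            for i in range(1, 6)]
    return [r for r in data if r["id"] > since][:2]


print(collect_repo_urls(fake_fetch))
```

Passing the fetch function in as an argument keeps the pacing logic separate from the network call, so the loop can be tried out against fake data before it is pointed at the real API.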

The result is as follows (the extracted repository API addresses):

https://api.github.com/repos/wycats/merb-core
https://api.github.com/repos/rubinius/rubinius
https://api.github.com/repos/mojombo/god
https://api.github.com/repos/vanpelt/jsawesome
https://api.github.com/repos/wycats/jspec
https://api.github.com/repos/defunkt/exception_logger
https://api.github.com/repos/defunkt/ambition
https://api.github.com/repos/technoweenie/restful-authentication
https://api.github.com/repos/technoweenie/attachment_fu
https://api.github.com/repos/topfunky/bong
https://api.github.com/repos/Caged/microsis
https://api.github.com/repos/anotherjesse/s3
https://api.github.com/repos/anotherjesse/taboo
https://api.github.com/repos/anotherjesse/foxtracs
https://api.github.com/repos/anotherjesse/fotomatic
https://api.github.com/repos/mojombo/glowstick
https://api.github.com/repos/defunkt/starling
https://api.github.com/repos/wycats/merb-more
https://api.github.com/repos/macournoyer/thin
https://api.github.com/repos/jamesgolick/resource_controller
https://api.github.com/repos/jamesgolick/markaby
https://api.github.com/repos/jamesgolick/enum_field
https://api.github.com/repos/defunkt/subtlety
https://api.github.com/repos/defunkt/zippy
https://api.github.com/repos/defunkt/cache_fu
https://api.github.com/repos/KirinDave/phosphor
