A Simple Web Crawler: Scraping Pictures of Beautiful Women from the Web


Tags: crawler, java

The CrawlerPicture.java file

package com.lym.crawlerDemo;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

import com.lym.mode.Picture;

/**
 * Crawls pictures from http://m.qqba.com/
 * @author Administrator
 */
public class CrawlerPicture {

    public final static int STARTPAGE = 301; // first list page to crawl
    public final static int ENDPAGE = 500;   // crawling stops before this page

    /**
     * Collects the src and alt attribute values of the pictures.
     * @return one Picture per img tag found on the crawled pages
     * @throws IOException if a page cannot be fetched
     */
    public static List<Picture> getPictureUrl() throws IOException {
        int number = 1;
        List<Picture> pics = new ArrayList<Picture>(); // every picture found so far
        for (int i = STARTPAGE; i < ENDPAGE; i++) {
            String url = "http://m.qqba.com/people/list/" + i + ".htm";
            Document doc = Jsoup.connect(url).get(); // fetch and parse the page
            Elements divList = doc.body().select("div.image-cell");
            for (int j = 0; j < divList.size(); j++) {
                // all img tags inside this image cell
                Elements imgList = divList.get(j).select("img");
                for (int k = 0; k < imgList.size(); k++) {
                    Picture pic = new Picture();
                    pic.setId(number++);
                    pic.setSrc(imgList.get(k).attr("src"));
                    pic.setAlt(imgList.get(k).attr("alt"));
                    pics.add(pic);
                }
            }
        }
        return pics;
    }

    /**
     * Opens an input stream for a picture.
     * @param picUrl the URL of the picture
     * @return a stream over the picture bytes; the caller must close it
     * @throws IOException if the URL cannot be opened
     */
    public static InputStream getPictureInputStream(String picUrl) throws IOException {
        URL url = new URL(picUrl);
        return url.openStream();
    }

    /**
     * Saves a picture to the local disk.
     * @param in  stream over the picture bytes
     * @param pic the picture whose alt text and id form the file name
     * @throws IOException if the file cannot be written
     */
    public static void savePicture(InputStream in, Picture pic) throws IOException {
        // Target path on disk; the directory D:/picture/ must already exist
        String newImgUrl = "D:/picture/" + pic.getAlt() + "--" + pic.getId() + ".jpg";
        // try-with-resources closes the file even if a read or write fails
        try (FileOutputStream fos = new FileOutputStream(new File(newImgUrl))) {
            byte[] buf = new byte[1024];
            int len;
            while ((len = in.read(buf)) != -1) { // read() returns -1 at end of stream
                fos.write(buf, 0, len);
            }
        }
    }

    /**
     * Test
     * @param args
     */
    public static void main(String[] args) {
        try {
            List<Picture> pics = getPictureUrl();
            System.out.println("Downloading pictures...");
            for (int i = 0; i < pics.size(); i++) {
                Picture pic = pics.get(i);
                String picUrl = pic.getSrc();
                try (InputStream in = getPictureInputStream(picUrl)) {
                    savePicture(in, pic);
                }
            }
            System.out.println("Download complete!");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
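
One thing to watch in savePicture: the file name is built directly from the picture's alt text, and alt text taken from a web page may contain characters that Windows does not allow in file names (such as / : ? or "), or may be empty. The following is only a sketch of one way to guard against that. The class PictureFiles and its methods safeName and save are hypothetical names that are not part of the original code, and the target directory is passed in rather than fixed to D:/picture/.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class PictureFiles {

    // Hypothetical helper (not in the original): replaces characters that
    // Windows forbids in file names, plus control characters, with '_'.
    // Falls back to "picture" when the alt text is missing or blank.
    static String safeName(String alt, int id) {
        String base = (alt == null || alt.trim().isEmpty()) ? "picture" : alt.trim();
        String cleaned = base.replaceAll("[\\\\/:*?\"<>|\\p{Cntrl}]", "_");
        return cleaned + "--" + id + ".jpg";
    }

    // Hypothetical helper: creates the directory if needed and copies the
    // whole stream into it under the sanitized name.
    static Path save(InputStream in, Path dir, String alt, int id) throws IOException {
        Files.createDirectories(dir);
        Path target = dir.resolve(safeName(alt, id));
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in bytes instead of a real download, so the sketch runs offline
        Path dir = Files.createTempDirectory("pictures");
        byte[] fakeImage = {1, 2, 3};
        Path saved = save(new ByteArrayInputStream(fakeImage), dir, "a/b:c?", 7);
        System.out.println(saved.getFileName()); // prints a_b_c_--7.jpg
    }
}
```

Files.copy does the buffered read/write loop internally, so a helper like this could stand in for the manual byte-array loop if that is preferred.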


The Picture.java file

package com.lym.mode;

public class Picture {

    /**
     * Picture number
     */
    private int id;

    /**
     * Picture URL
     */
    private String src;

    /**
     * Picture description
     */
    private String alt;

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getSrc() {
        return src;
    }

    public void setSrc(String src) {
        this.src = src;
    }

    public String getAlt() {
        return alt;
    }

    public void setAlt(String alt) {
        this.alt = alt;
    }

    @Override
    public String toString() {
        return "Picture [id=" + id + ", src=" + src + ", alt=" + alt + "]";
    }
}



Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
