Data mining data sources

Source: Internet
Author: User

1. Climate Monitoring Data Set http://cdiac.ornl.gov/ftp/ndp026b

2. Some useful websites for downloading test datasets

http://www.fs.fed.us/fire/fuelman/

http://www.cs.toronto.edu/~roweis/data.html
http://kdd.ics.uci.edu/summary.task.type.html
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/
http://www.phys.uni.torun.pl/~duch/software.html
The Reuters dataset is available at: http://www.research.att.com/~lewis/reuters21578.html
A variety of other datasets are listed at: http://kdd.ics.uci.edu/summary.data.type.html
For text classification, the Rainbow dataset can also be used:
http://www-2.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html

3. Machine learning datasets collected by UCI
ftp://pami.sjtu.edu.cn/
http://www.ics.uci.edu/~mlearn//mlrepository.htm

4. StatLib
http://liama.ia.ac.cn/SCILAB/scilabindexgb.htm
http://lib.stat.cmu.edu/

5. Websites for fund data mining
http://www.gotofund.com/index.asp

http://lans.ece.utexas.edu/~strehl/

6. Text classification and Web data
http://www-2.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html

http://www.w3.org/TR/WD-logfile-960221.html
http://www.w3.org/Daemon/User/Config/Logging.html#AccessLog
http://www.w3.org/1998/11/05/WC-workshop/Papers/bala2.html
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/
http://www.web-caching.com/traces-logs.html
http://www-2.cs.cmu.edu/webkb
http://www.cs.auc.dk/research/DP/tdb/TimeCenter/TimeCenterPublications/TR-75.pdf
http://www.cs.cornell.edu/projects/kddcup/index.html

7. Time series data
http://www.stat.wisc.edu/~reinsel/bjr-data/

8. Test data for the Apriori algorithm
http://www.almaden.ibm.com/cs/quest/syndata.html
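
To show what such test data is used for, here is a minimal sketch of Apriori-style frequent-itemset mining. It assumes the transactions have already been loaded as lists of item names; the file format of the generator linked above is not described here, so the `DEMO_TRANSACTIONS` list and the `apriori` function name are purely illustrative, not part of any linked tool.

```python
from itertools import combinations

# Hypothetical toy transactions, used only so the sketch runs on its own.
DEMO_TRANSACTIONS = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset seen in at least
    min_support transactions (min_support is an absolute count)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count step: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


if __name__ == "__main__":
    for itemset, count in sorted(
        apriori(DEMO_TRANSACTIONS, 3).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(itemset), count)
```

This version favors readability over speed; on larger generated datasets the candidate join and the per-level counting pass would need a more efficient implementation.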

9. Data generator links
http://www.cse.cuhk.edu.hk/~kdd/data_collection.html
http://www.almaden.ibm.com/cs/quest/syndata.html

10. Association:
http://flow.dl.sourceforge.net/sourceforge/weka/regression-datasets.jar
http://www.almaden.ibm.com/software/quest/Resources/datasets/syndata.html#assocSynData

11. WEKA:
http://flow.dl.sourceforge.net/sourceforge/weka/regression-datasets.jar
1. A jar file containing 37 classification problems, originally obtained from the UCI repository
http://prdownloads.sourceforge.net/weka/datasets-UCI.jar
2. A jar file containing 37 regression problems, obtained from various sources
http://prdownloads.sourceforge.net/weka/datasets-numeric.jar
3. A jar file containing 30 regression datasets collected by Luis Torgo
http://prdownloads.sourceforge.net/weka/regression-datasets.jar
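
Since a jar file is an ordinary zip archive, its contents can be inspected without Java. The sketch below lists the dataset files in such an archive using Python's standard `zipfile` module. It assumes the datasets are stored as `.arff` files (WEKA's usual format); the actual layout of the jars above is not confirmed here. The helper names `list_arff_members` and `build_demo_jar` are illustrative, and the demo archive is built in memory so the example runs without downloading anything.

```python
import io
import zipfile


def list_arff_members(jar_source):
    """Return the sorted names of .arff entries inside a jar (zip) archive.

    jar_source may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(jar_source) as jar:
        return sorted(
            name for name in jar.namelist() if name.lower().endswith(".arff")
        )


def build_demo_jar():
    """Build a tiny in-memory archive (hypothetical contents) for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        jar.writestr(
            "demo/toy.arff",
            "@relation toy\n@attribute x numeric\n@data\n1\n",
        )
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # With a real download, pass the local path of the jar instead.
    print(list_arff_members(build_demo_jar()))
```

To extract the files from a downloaded jar, `zipfile.ZipFile(path).extractall(target_dir)` should work the same way as for any other zip archive.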

12. Cancer genes:
http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi

13. Financial data:
http://lisp.vse.cz/pkdd99/Challenge/chall.htm

14. Another good resource is http://kdd.ics.uci.edu/. It hosts the following datasets, grouped by application field:

Direct Marketing
    KDD Cup 1998 data
GIS
    Forest CoverType
Indexing
    Corel image features
    Pseudo periodic synthetic time series
Intrusion Detection
    KDD Cup 1999 data
Process Control
    Synthetic control chart time series
Recommendation Systems
    Entree Chicago recommendation data
Robots
    Pioneer-1 mobile robot data
    Robot execution failures
Sign Language Recognition
    Australian sign language data
    High-quality Australian sign language data
Text Categorization
    20 Newsgroups data
    Reuters-21578 text categorization collection
    NSF research awards abstracts 1990-2003
World Wide Web
    Microsoft anonymous Web data
    MSNBC anonymous Web data
    Syskill Webert Web data
