1. Climate Monitoring Data Set http://cdiac.ornl.gov/ftp/ndp026b
2. Some useful websites for downloading test datasets:
http://www.cs.toronto.edu/~roweis/data.html
http://kdd.ics.uci.edu/summary.task.type.html
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/
http://www.phys.uni.torun.pl/~duch/software.html
You can find the Reuters dataset at the URL below:
http://www.research.att.com/~lewis/reuters21578.html
Various datasets are available on the following website:
http://kdd.ics.uci.edu/summary.data.type.html
Text classification: another usable dataset is the Rainbow dataset.
http://www-2.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html
3. I have collected a lot of test datasets here; anyone writing a paper needs at least a few of them to test how effective an algorithm is.
Some of the links may no longer be accessible, but some always are:
Machine learning datasets collected by UCI:
ftp://pami.sjtu.edu.cn/
http://www.ics.uci.edu/~mlearn/mlrepository.htm
StatLib:
http://liama.ia.ac.cn/SCILAB/scilabindexgb.htm
http://lib.stat.cmu.edu/
Sample databases:
http://kdd.ics.uci.edu/
http://www.ics.uci.edu/~mlearn/mlrepository.html
Websites for fund data mining:
http://www.gotofund.com/index.asp
http://lans.ece.utexas.edu/~strehl/
Reuters dataset:
http://www.research.att.com/~lewis/reuters21578.html
Various datasets:
http://kdd.ics.uci.edu/summary.data.type.html
http://www.mlnet.org/cgi-bin/mlnetois.pl?File1_datasets.html
http://lib.stat.cmu.edu/datasets/
http://dctc.sjtu.edu.cn/adaptive/datasets/
http://fimi.cs.helsinki.fi/data/
http://www.almaden.ibm.com/software/quest/Resources/index.shtml
http://miles.cnuce.cnr.it/~palmeri/datam/DCI/
Text classification and Web:
http://www-2.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html
http://www.w3.org/TR/WD-logfile-960221.html
http://www.w3.org/Daemon/User/Config/Logging.html#AccessLog
http://www.w3.org/1998/11/05/WC-workshop/Papers/bala2.html
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/
http://www.web-caching.com/traces-logs.html
http://www-2.cs.cmu.edu/webkb
http://www.cs.auc.dk/research/DP/tdb/TimeCenter/TimeCenterPublications/TR-75.pdf
http://www.cs.cornell.edu/projects/kddcup/index.html
URLs for time series data:
http://www.stat.wisc.edu/~reinsel/bjr-data/
Test data for the Apriori algorithm:
http://www.almaden.ibm.com/cs/quest/syndata.html
Data generator links:
http://www.cse.cuhk.edu.hk/~kdd/data_collection.html
http://www.almaden.ibm.com/cs/quest/syndata.html
Association:
http://flow.dl.sourceforge.net/sourceforge/weka/regression-datasets.jar
http://www.almaden.ibm.com/software/quest/Resources/datasets/syndata.html#assocSynData
WEKA:
http://flow.dl.sourceforge.net/sourceforge/weka/regression-datasets.jar
1. A jar file containing 37 classification problems, originally obtained from the UCI repository:
http://prdownloads.sourceforge.net/weka/datasets-UCI.jar
2. A jar file containing 37 regression problems, obtained from various sources:
http://prdownloads.sourceforge.net/weka/datasets-numeric.jar
3. A jar file containing 30 regression datasets collected by Luis Torgo:
http://prdownloads.sourceforge.net/weka/regression-datasets.jar
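A jar file is an ordinary ZIP archive, so the WEKA bundles above can be inspected without Java. The following is only a minimal sketch, assuming a jar has already been downloaded and that the datasets inside are stored as .arff files; the function names and the internal layout of the archive are assumptions made for illustration, not something the download pages document.

```python
import io
import zipfile


def list_arff_members(jar_source):
    """Return the sorted names of .arff entries inside a jar (ZIP) archive.

    jar_source may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(jar_source) as jar:
        return sorted(
            name for name in jar.namelist()
            if name.lower().endswith(".arff")  # assumed dataset extension
        )


def read_member_text(jar_source, member, encoding="utf-8"):
    """Read one archive entry and decode it as text (encoding is an assumption)."""
    with zipfile.ZipFile(jar_source) as jar:
        return jar.read(member).decode(encoding)
```

For example, with a hypothetical local copy saved as datasets-UCI.jar, list_arff_members("datasets-UCI.jar") would return the dataset file names, and read_member_text could then pull out a single file for a quick look at its header.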
Cancer genes:
http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi
Financial data:
http://lisp.vse.cz/pkdd99/Challenge/chall.htm
Download the financial data (~17.5 MB zipped file, ~67 MB unzipped data)
Download the medical data (~2 MB zipped file, ~6 MB unzipped data)
http://lisp.vse.cz/pkdd99/Challenge/chall.htm
Datasets linked from KDnuggets:
http://www.kdnuggets.com/datasets/index.html
Another good resource website is http://kdd.ics.uci.edu/. It contains the following data resources:
Direct marketing:
KDD Cup 1998 Data
GIS:
Forest CoverType
Indexing:
Corel Image Features
Pseudo Periodic Synthetic Time Series
Intrusion detection:
KDD Cup 1999 Data
Process control:
Synthetic Control Chart Time Series
Recommendation systems:
Entree Chicago Recommendation Data
Robots:
Pioneer-1 Mobile Robot Data
Robot Execution Failures
Sign language recognition:
Australian Sign Language Data
High-quality Australian Sign Language Data
Text categorization:
20 Newsgroups Data
Reuters-21578 Text Categorization Collection
NSF Research Awards Abstracts 1990-2003
World Wide Web:
Microsoft Anonymous Web Data
MSNBC Anonymous Web Data
Syskill Webert Web Data
Here is one more, which I found on a foreign blog (the day before Children's Day):
http://www.fs.fed.us/fire/fuelman/
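Since some of the links in this list may no longer be accessible, it can help to check them in bulk before relying on any one of them. The sketch below is one possible approach using only the Python standard library; the function names are made up for illustration, it only handles http and https links (the ftp link would need separate treatment), and treating status codes 200 to 399 as "reachable" is an assumption. Some servers also reject HEAD requests, so a failed check does not prove that a page is gone.

```python
import http.client
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for an http(s) url, or None if the request fails."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # DNS failure, refused connection, timeout, malformed URL, and so on.
        return None


def split_by_reachability(urls, status_of=fetch_status):
    """Split urls into (reachable, unreachable) lists, preserving input order.

    status_of is injectable so the logic can be tried without network access.
    """
    reachable, unreachable = [], []
    for url in urls:
        status = status_of(url)
        if status is not None and 200 <= status < 400:
            reachable.append(url)
        else:
            unreachable.append(url)
    return reachable, unreachable
```

Calling split_by_reachability with the URLs from this post returns two lists, so the dead links can be skipped and the live ones downloaded.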
(Sina Weibo: @ quanliang _ machine learning)