About the 2016 plan

Source: Internet
Author: User

Eight months of 2016 have already passed. Setting aside January and part of February, that leaves only about six months. Of those, I spent three months on JavaFX development (a project that is still unfinished), one month away at training, one month on leave at home looking after the kids, and one month busy with nothing in particular. That count includes weekends, though, and people are not machines. My aim is not to be overly utilitarian, but to do some useful work by building on what I have accumulated.

For JavaFX, I will keep using my spare time to finish the related project.

In the near term I want to study the R language and do some data visualization, analysis, and data mining work. That is probably the whole plan for this year: my energy is limited, and the plan needs to be carried out in a down-to-earth way.

R language study plan

(1) Read one R language book from cover to cover. It is a long, detailed read, so there is no need to rush; set aside time and study it section by section;

(2) Master R's plotting commands, so that the relevant data can be turned into graphics through code.

Part I Getting Started

Chapter 1 Introduction to R
1.1 Why use R?
1.2 Obtaining and installing R
1.3 Working with R
1.3.1 Getting started
1.3.2 Getting help
1.3.3 The workspace
1.3.4 Input and output
1.4 Packages
1.4.1 What is a package?
1.4.2 Installing a package
1.4.3 Loading a package
1.4.4 How to use a package
1.5 Batch processing
1.6 Using output as input: reusing results
1.7 Working with large datasets
1.8 Working through an example
1.9 Summary
Chapter 2 Creating Datasets
2.1 The concept of a dataset
2.2 Data structures
2.2.1 Vectors
2.2.2 Matrices
2.2.3 Arrays
2.2.4 Data frames
2.2.5 Factors
2.2.6 Lists
2.3 Data input
2.3.1 Entering data from the keyboard
2.3.2 Importing data from a delimited text file
2.3.3 Importing Excel data
2.3.4 Importing XML data
2.3.5 Fetching data from a web page
2.3.6 Importing SPSS data
2.3.7 Importing SAS data
2.3.8 Importing Stata data
2.3.9 Importing netCDF data
2.3.10 Importing HDF5 data
2.3.11 Accessing database management systems
2.3.12 Importing data via Stat/Transfer
2.4 Annotating datasets
2.4.1 Variable labels
2.4.2 Value labels
2.5 Useful functions for working with data objects
2.6 Summary
Chapter 3 Getting Started with Graphs
3.1 Working with graphs
3.2 A simple example
3.3 Graphical parameters
3.3.1 Symbols and lines
3.3.2 Colors
3.3.3 Text properties
3.3.4 Graph and margin dimensions
3.4 Adding text, custom axes, and legends
3.4.1 Titles
3.4.2 Axes
3.4.3 Reference lines
3.4.4 Legends
3.4.5 Text annotations
3.5 Combining graphs
3.6 Summary
Chapter 4 Basic Data Management
4.1 An example
4.2 Creating new variables
4.3 Recoding variables
4.4 Renaming variables
4.5 Missing values
4.5.1 Recoding certain values as missing
4.5.2 Excluding missing values from analyses
4.6 Date values
4.6.1 Converting dates to character variables
4.6.2 Going further
4.7 Type conversions
4.8 Sorting data
4.9 Merging datasets
4.9.1 Adding columns
4.9.2 Adding rows
4.10 Subsetting datasets
4.10.1 Selecting (keeping) variables
4.10.2 Excluding (dropping) variables
4.10.3 Selecting observations
4.10.4 The subset() function
4.10.5 Random sampling
4.11 Using SQL statements to manipulate data frames
4.12 Summary
Chapter 5 Advanced Data Management
5.1 A data management challenge
5.2 Numeric and character handling functions
5.2.1 Mathematical functions
5.2.2 Statistical functions
5.2.3 Probability functions
5.2.4 Character handling functions
5.2.5 Other useful functions
5.2.6 Applying functions to matrices and data frames
5.3 A solution to the data management challenge
5.4 Control flow
5.4.1 Repetition and looping
5.4.2 Conditional execution
5.5 User-written functions
5.6 Aggregation and restructuring
5.6.1
5.6.2 Aggregating data
5.6.3 The reshape package
5.7 Summary
Part II Basic Methods

Chapter 6 Basic Graphs
6.1 Bar plots
6.1.1 Simple bar plots
6.1.2 Stacked and grouped bar plots
6.1.3 Mean bar plots
6.1.4 Fine-tuning bar plots
6.1.5 Spinograms
6.2 Pie charts
6.3 Histograms
6.4 Kernel density plots
6.5 Box plots
6.5.1 Using parallel box plots to compare groups
6.5.2 Violin plots
6.6 Dot plots
6.7 Summary
Chapter 7 Basic Statistical Analysis
7.1 Descriptive statistics
7.1.1 A collection of methods
7.1.2 Descriptive statistics by group
7.1.3 Visualizing results
7.2 Frequency and contingency tables
7.2.1 Generating frequency tables
7.2.2 Tests of independence
7.2.3 Measures of association
7.2.4 Visualizing results
7.2.5 Converting a table to a flat format
7.3 Correlations
7.3.1 Types of correlation
7.3.2 Testing correlations for significance
7.3.3 Visualizing correlations
7.4 t-tests
7.4.1 Independent-samples t-test
7.4.2 Dependent-samples t-test
7.4.3 When there are more than two groups
7.5 Nonparametric tests of group differences
7.5.1 Comparing two groups
7.5.2 Comparing more than two groups
7.6 Visualizing group differences
7.7 Summary
Part III Intermediate Methods

Chapter 8 Regression
8.1 The many faces of regression
8.1.1 Scenarios for applying OLS regression
8.1.2 A review of the basics
8.2 OLS regression
8.2.1 Fitting regression models with lm()
8.2.2 Simple linear regression
8.2.3 Polynomial regression
8.2.4 Multiple linear regression
8.2.5 Multiple linear regression with interaction terms
8.3 Regression diagnostics
8.3.1 The standard approach
8.3.2 An improved approach
8.3.3 Global validation of linear model assumptions
8.3.4 Multicollinearity
8.4 Unusual observations
8.4.1 Outliers
8.4.2 High-leverage points
8.4.3 Influential points
8.5 Corrective measures
8.5.1 Deleting observations
8.5.2 Transforming variables
8.5.3 Adding or deleting variables
8.5.4 Trying other approaches
8.6 Selecting the "best" regression model
8.6.1 Comparing models
8.6.2 Variable selection
8.7 Taking the analysis further
8.7.1 Cross-validation
8.7.2 Relative importance
8.8 Summary
Chapter 9 Analysis of Variance
9.1 A crash course on terminology
9.2 Fitting ANOVA models
9.2.1 The aov() function
9.2.2 The order of terms in the formula
9.3 One-way ANOVA
9.3.1 Multiple comparisons
9.3.2 Assessing test assumptions
9.4 One-way ANCOVA
9.4.1 Assessing test assumptions
9.4.2 Visualizing the results
9.5 Two-way ANOVA
9.6 Repeated measures ANOVA
9.7 Multivariate analysis of variance
9.7.1 Assessing test assumptions
9.7.2 Robust multivariate analysis of variance
9.8 Doing ANOVA with regression
9.9 Summary
Chapter 10 Power Analysis
10.1 A quick review of hypothesis testing
10.2 Power analysis with the pwr package
10.2.1 t-tests
10.2.2 Analysis of variance
10.2.3 Correlations
10.2.4 Linear models
10.2.5 Tests of proportions
10.2.6 Chi-square tests
10.2.7 Choosing an appropriate effect size in novel situations
10.3 Plotting power analysis graphs
10.4 Other packages
10.5 Summary
Chapter 11 Intermediate Graphs
11.1 Scatter plots
11.1.1 Scatter plot matrices
11.1.2 High-density scatter plots
11.1.3 3D scatter plots
11.1.4 Bubble plots
11.2 Line charts
11.3 Correlograms
11.4 Mosaic plots
11.5 Summary
Chapter 12 Resampling and the Bootstrap
12.1 Permutation tests
12.2 Permutation tests with the coin package
12.2.1 Independent two-sample and k-sample tests
12.2.2 Independence in contingency tables
12.2.3 Independence between numeric variables
12.2.4 Dependent two-sample and k-sample tests
12.2.5 Going further
12.3 Permutation tests with the lmPerm package
12.3.1 Simple and polynomial regression
12.3.2 Multiple regression
12.3.3 One-way ANOVA and ANCOVA
12.3.4 Two-way ANOVA
12.4 Comments on permutation tests
12.5 The bootstrap
12.6 Bootstrapping with the boot package
12.6.1 Bootstrapping a single statistic
12.6.2 Bootstrapping several statistics
12.7 Summary
Part IV Advanced Methods

Chapter 13 Generalized Linear Models
13.1 Generalized linear models and the glm() function
13.1.1 The glm() function
13.1.2 Supporting functions
13.1.3 Model fit and regression diagnostics
13.2 Logistic regression
13.2.1 Interpreting the model parameters
13.2.2 Assessing the impact of predictors on the probability of an outcome
13.2.3 Overdispersion
13.2.4 Extensions
13.3 Poisson regression
13.3.1 Interpreting the model parameters
13.3.2 Overdispersion
13.3.3 Extensions
13.4 Summary
Chapter 14 Principal Components and Factor Analysis
14.1 Principal components and factor analysis in R
14.2 Principal components analysis
14.2.1 Determining the number of principal components
14.2.2 Extracting principal components
14.2.3 Rotating principal components
14.2.4 Obtaining principal component scores
14.3 Exploratory factor analysis
14.3.1 Determining the number of common factors to extract
14.3.2 Extracting common factors
14.3.3 Rotating factors
14.3.4 Factor scores
14.3.5 Other EFA-related packages
14.4 Other latent variable models
14.5 Summary
Chapter 15 Advanced Methods for Dealing with Missing Data
15.1 Steps in dealing with missing values
15.2 Identifying missing values
15.3 Exploring missing-value patterns
15.3.1 Tabulating missing values
15.3.2 Exploring missing data visually
15.3.3 Using correlations to explore missing values
15.4 Understanding the sources and impact of missing data
15.5 Rational approaches to incomplete data
15.6 Complete-case analysis (listwise deletion)
15.7 Multiple imputation
15.8 Other approaches to missing values
15.8.1 Pairwise deletion
15.8.2 Simple (nonstochastic) imputation
15.9 Summary
Chapter 16 Advanced Graphics
16.1 The four graphics systems in R
16.2 The lattice package
16.2.1 Conditioning variables
16.2.2 Panel functions
16.2.3 Grouping variables
16.2.4 Graphical parameters
16.2.5 Page arrangement
16.3 The ggplot2 package
16.4 Interactive graphs
16.4.1 Interacting with graphs: identifying points
16.4.2 playwith
16.4.3 latticist
16.4.4 Interactive graphics with the iplots package
16.4.5 rggobi
16.5 Summary
Afterword: Exploring the World of R
Appendix A Graphical user interfaces
Appendix B Customizing the startup environment
Appendix C Exporting data from R
Appendix D Producing publication-quality output
Appendix E Matrix operations in R
Appendix F Extension packages used in this book
Appendix G Working with big data
Appendix H Updating R
References
