R in Action (Chinese edition)

Basic Information

Original Title: R in Action: Data Analysis and Graphics with R

Author: Robert I. Kabacoff (US)

Translators: Gao Tao, Xiao Nan, Chen Gang

Series: Turing Programming Series

Publisher: Posts & Telecom Press

ISBN: 9787115299901

Published: January 2013

Format: 16mo

Page number: 1

Edition: 1-1

Category: Computers > Software and Programming > General > Advanced Programming Language Design

Introduction

R in Action is a practical book. It is a comprehensive, detailed guide to R that surveys the software and its capabilities and presents practical statistical examples. It also offers elegant methods for handling messy, incomplete, and non-normal data that are difficult to analyze with traditional methods. Beyond statistical analysis, the author covers a wide range of graphics functions for exploring and presenting data.

R in Action is suitable for data analysts and R users.

Table of Contents

R in Action

Part 1 Getting Started

Chapter 1 Introduction to R

1.1 Why use R?

1.2 Obtaining and installing R

1.3 Working with R

1.3.1 Getting started

1.3.2 Getting help

1.3.3 The workspace

1.3.4 Input and output

1.4 Packages

1.4.1 What are packages?

1.4.2 Installing a package

1.4.3 Loading a package

1.4.4 Learning about a package

1.5 Batch processing

1.6 Using output as input: reusing results

1.7 Working with large datasets

1.8 Working through an example

1.9 Summary

Chapter 2 Creating a dataset

2.1 Understanding datasets

2.2 Data structures

2.2.1 Vectors

2.2.2 Matrices

2.2.3 Arrays

2.2.4 Data frames

2.2.5 Factors

2.2.6 Lists

2.3 Data input

2.3.1 Entering data from the keyboard

2.3.2 Importing data from a delimited text file

2.3.3 Importing data from Excel

2.3.4 Importing data from XML

2.3.5 Scraping data from web pages

2.3.6 Importing data from SPSS

2.3.7 Importing data from SAS

2.3.8 Importing data from Stata

2.3.9 Importing data from netCDF

2.3.10 Importing data from HDF5

2.3.11 Accessing database management systems

2.3.12 Importing data via Stat/Transfer

2.4 Annotating datasets

2.4.1 Variable labels

2.4.2 Value labels

2.5 Useful functions for working with data objects

2.6 Summary

Chapter 3 Getting started with graphs

3.1 Working with graphs

3.2 A simple example

3.3 Graphical parameters

3.3.1 Symbols and lines

3.3.2 Colors

3.3.3 Text characteristics

3.3.4 Graph and margin dimensions

3.4 Adding text, customized axes, and legends

3.4.1 Titles

3.4.2 Axes

3.4.3 Reference lines

3.4.4 Legend

3.4.5 Text annotations

3.5 Combining graphs

3.6 Summary

Chapter 4 Basic data management

4.1 A working example

4.2 Creating new variables

4.3 Recoding variables

4.4 Renaming variables

4.5 Missing values

4.5.1 Recoding values to missing

4.5.2 Excluding missing values from analyses

4.6 Date values

4.6.1 Converting dates to character variables

4.6.2 Going further

4.7 Type conversions

4.8 Sorting data

4.9 Merging datasets

4.9.1 Adding columns

4.9.2 Adding rows

4.10 Subsetting datasets

4.10.1 Selecting (keeping) variables

4.10.2 Excluding (dropping) variables

4.10.3 Selecting observations

4.10.4 The subset() function

4.10.5 Random samples

4.11 Using SQL statements to manipulate data frames

4.12 Summary

Chapter 5 Advanced data management

5.1 A data management challenge

5.2 Numerical and character functions

5.2.1 Mathematical functions

5.2.2 Statistical functions

5.2.3 Probability functions

5.2.4 Character functions

5.2.5 Other useful functions

5.2.6 Applying functions to matrices and data frames

5.3 A solution for the data management challenge

5.4 Control flow

5.4.1 Repetition and looping

5.4.2 Conditional execution

5.5 User-written functions

5.6 Aggregation and restructuring

5.6.1 Transpose

5.6.2 Aggregating data

5.6.3 The reshape package

5.7 Summary

Part 2 Basic Methods

Chapter 6 Basic graphs

6.1 Bar plots

6.1.1 Simple bar plots

6.1.2 Stacked and grouped bar plots

6.1.3 Mean bar plots

6.1.4 Tweaking bar plots

6.1.5 Spinograms

6.2 Pie charts

6.3 Histograms

6.4 Kernel density plots

6.5 Box plots

6.5.1 Using parallel box plots to compare groups

6.5.2 Violin plots

6.6 Dot plots

6.7 Summary

Chapter 7 Basic statistics

7.1 Descriptive statistics

7.1.1 A menagerie of methods

7.1.2 Descriptive statistics by group

7.1.3 Visualizing results

7.2 Frequency and contingency tables

7.2.1 Generating frequency tables

7.2.2 Tests of independence

7.2.3 Measures of association

7.2.4 Visualizing results

7.2.5 Converting tables to flat files

7.3 Correlations

7.3.1 Types of correlations

7.3.2 Testing correlations for significance

7.3.3 Visualizing correlations

7.4 t-tests

7.4.1 Independent t-test

7.4.2 Dependent t-test

7.4.3 When there are more than two groups

7.5 Nonparametric tests of group differences

7.5.1 Comparing two groups

7.5.2 Comparing more than two groups

7.6 Visualizing group differences

7.7 Summary

Part 3 Intermediate Methods

Chapter 8 Regression

8.1 The many faces of regression

8.1.1 Scenarios for using OLS regression

8.1.2 What you need to know

8.2 OLS regression

8.2.1 Fitting regression models with lm()

8.2.2 Simple linear regression

8.2.3 Polynomial regression

8.2.4 Multiple linear regression

8.2.5 Multiple linear regression with interactions

8.3 Regression diagnostics

8.3.1 A typical approach

8.3.2 An enhanced approach

8.3.3 Global validation of linear model assumptions

8.3.4 Multicollinearity

8.4 Unusual observations

8.4.1 Outliers

8.4.2 High leverage points

8.4.3 Influential observations

8.5 Corrective measures

8.5.1 Deleting observations

8.5.2 Transforming variables

8.5.3 Adding or deleting variables

8.5.4 Trying a different approach

8.6 Selecting the "best" regression model

8.6.1 Comparing models

8.6.2 Variable selection

8.7 Taking the analysis further

8.7.1 Cross-validation

8.7.2 Relative importance

8.8 Summary

Chapter 9 Analysis of variance

9.1 A crash course on terminology

9.2 Fitting ANOVA models

9.2.1 The aov() function

9.2.2 The order of formula terms

9.3 One-way ANOVA

9.3.1 Multiple comparisons

9.3.2 Assessing test assumptions

9.4 One-way ANCOVA

9.4.1 Assessing test assumptions

9.4.2 Visualizing the results

9.5 Two-way factorial ANOVA

9.6 Repeated measures ANOVA

9.7 Multivariate analysis of variance (MANOVA)

9.7.1 Assessing test assumptions

9.7.2 Robust MANOVA

9.8 ANOVA as regression

9.9 Summary

Chapter 10 Power analysis

10.1 A quick review of hypothesis testing

10.2 Implementing power analysis with the pwr package

10.2.1 t-tests

10.2.2 ANOVA

10.2.3 Correlations

10.2.4 Linear models

10.2.5 Tests of proportions

10.2.6 Chi-square tests

10.2.7 Choosing an appropriate effect size in novel situations

10.3 Creating power analysis plots

10.4 Other packages

10.5 Summary

Chapter 11 Intermediate graphs

11.1 Scatter plots

11.1.1 Scatter plot matrices

11.1.2 High-density scatter plots

11.1.3 3D scatter plots

11.1.4 Bubble plots

11.2 Line charts

11.3 Correlograms

11.4 Mosaic plots

11.5 Summary

Chapter 12 Resampling statistics and bootstrapping

12.1 Permutation tests

12.2 Permutation tests with the coin package

12.2.1 Independent two-sample and k-sample tests

12.2.2 Independence in contingency tables

12.2.3 Independence between numeric variables

12.2.4 Dependent two-sample and k-sample tests

12.2.5 Going further

12.3 Permutation tests with the lmPerm package

12.3.1 Simple and polynomial regression

12.3.2 Multiple regression

12.3.3 One-way ANOVA and ANCOVA

12.3.4 Two-way ANOVA

12.4 Additional comments on permutation tests

12.5 Bootstrapping

12.6 Bootstrapping with the boot package

12.6.1 Bootstrapping a single statistic

12.6.2 Bootstrapping several statistics

12.7 Summary

Part 4 Advanced Methods

Chapter 13 Generalized linear models

13.1 Generalized linear models and the glm() function

13.1.1 The glm() function

13.1.2 Supporting functions

13.1.3 Model fit and regression diagnostics

13.2 Logistic regression

13.2.1 Interpreting the model parameters

13.2.2 Assessing the impact of predictors on the probability of an outcome

13.2.3 Overdispersion

13.2.4 Extensions

13.3 Poisson regression

13.3.1 Interpreting the model parameters

13.3.2 Overdispersion

13.3.3 Extensions

13.4 Summary

Chapter 14 Principal components and factor analysis

14.1 Principal components and factor analysis in R

14.2 Principal components

14.2.1 Selecting the number of components to extract

14.2.2 Extracting principal components

14.2.3 Rotating principal components

14.2.4 Obtaining principal component scores

14.3 Exploratory factor analysis

14.3.1 Deciding how many common factors to extract

14.3.2 Extracting common factors

14.3.3 Rotating factors

14.3.4 Factor scores

14.3.5 Other EFA-related packages

14.4 Other latent variable models

14.5 Summary

Chapter 15 Advanced methods for missing data

15.1 Steps in dealing with missing data

15.2 Identifying missing values

15.3 Exploring missing-value patterns

15.3.1 Tabulating missing values

15.3.2 Exploring missing data visually

15.3.3 Using correlations to explore missing values

15.4 Understanding the sources and impact of missing data

15.5 Rational approaches for dealing with incomplete data

15.6 Complete-case analysis (listwise deletion)

15.7 Multiple imputation

15.8 Other approaches to missing data

15.8.1 Pairwise deletion

15.8.2 Simple (nonstochastic) imputation

15.9 Summary

Chapter 16 Advanced graphics

16.1 The four graphics systems in R

16.2 The lattice package

16.2.1 Conditioning variables

16.2.2 Panel functions

16.2.3 Grouping variables

16.2.4 Graphic parameters

16.2.5 Page arrangement

16.3 The ggplot2 package

16.4 Interactive graphs

16.4.1 Interacting with graphs: identifying points

16.4.2 playwith

16.4.3 latticist

16.4.4 Interactive graphics with the iplots package

16.4.5 rggobi

16.5 Summary

Afterword: Exploring the world of R

Appendix A Graphical user interfaces

Appendix B Customizing the startup environment

Appendix C Exporting data from R

Appendix D Creating publication-quality output

Appendix E Matrix algebra in R

Appendix F Packages used in this book

Appendix G Working with large datasets

Appendix H Updating an R installation

References

Source of this listing: China-Pub (China Interactive Publishing Network)
