randNorm <- rnorm(3000)  # rnorm(3000) produces 3,000 normally distributed numbers
randDensity <- dnorm(randNorm)  # dnorm(randNorm) finds their density function values
ggplot(data.frame(x = randNorm, y = randDensity)) + aes(x = x, y = y) + geom_point() + labs(x = "Random Normal Variables", y = "Randdensity")
# use the normally distributed numbers and their density values as the x- and y-axis values, and draw a point plot
p  # the variable p serves as a reference to this code
neg1Seq <- seq(from = min(randNorm), to = -1, by = 0.1)  # generate a sequence starting at min(randNorm), ending at -1, in steps of 0.1
lessThanNeg1  # the sequence
Many clustering algorithms have been proposed to suit the characteristics of different industries. They fall into several categories: hierarchical, partitioning, density-based, graph-theoretic, grid-based, and model-based. Among the density-based algorithms, DBSCAN is the most representative. Suppose we have a set of data; the R code that generates it is as follows: X1 0, Pi,length. out= -) Y10.1*rnorm ( -) X21.5+ SEQ (0, Pi,length. out= -) Y20.1*rnorm ( -) DataData.frame (C (X1,X2), C (y1
With the lm function, you can fit a linear relationship that captures the trend between two variables (linear fitting), and then use the predict function to apply that trend to new data for prediction.
fr = lm(Height ~ Weight, data = hw)
coef(fr)
This establishes a linear fit that predicts height from weight. The result is a straight line defined by an intercept and a slope. Visualization:
library("ggplot2")
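For readers following the Python snippets on this page, the same idea can be sketched without any libraries. This is a minimal illustration of ordinary least squares, not what lm does internally; the function names `fit_line` and `predict` and the weight/height values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(coef, x):
    """Apply a fitted (intercept, slope) pair to a new x value."""
    intercept, slope = coef
    return intercept + slope * x


# Hypothetical weights (kg) and heights (cm), for illustration only
weights = [50.0, 60.0, 70.0, 80.0]
heights = [155.0, 165.0, 172.0, 181.0]
coef = fit_line(weights, heights)
print(coef)                 # (intercept, slope), the analogue of coef(fr)
print(predict(coef, 65.0))  # predicted height for a weight of 65
```

The returned pair plays the role of the two coefficients reported by coef(fr): the first is the intercept, the second the slope.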
ggplot2 Scale Settings
Scale settings are mainly used to adjust the settings of each layer after drawing with ggplot.
1. Attribute-related scale settings
These include scale_size(), scale_alpha(), and scale_shape().
As the names suggest, these three settings relate to ggplot layer properties: size, transparency, and shape.
The main parameters for these settings are listed below: SCAL
Maps are often used in data analysis, and R can draw them to very good effect. This section gives examples of how to use R tools to draw the map you want.
The examples in this section run smoothly under R 2.15.3; other versions have not been verified.
The code is as follows. The first small example:
# load the appropriate packages, read the data, and then draw
library(maptools)
library(ggplot2)
China_map"d://map//bou2_ 4p.shp", proj4string=CRS("+p
that the average rice yield is not 15000. Applying this test to all variables, with an assumed mean of 15000, we have:
print(ss.ttest_1samp(a=df, popmean=15000))
# OUTPUT
(array([ -1.12817385, 1.07053437, -65.81425599, -4.564575 , 6.17156198]), array([ 2.62704721e-01, 2.87680340e-01, 4.15643528e-70, 1.83764399e-05, 2.82461897e-08]))
The first array is the t statistic, and the second array is the corresponding p value.
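To make the first array concrete, here is a minimal sketch of how a one-sample t statistic is computed for a single variable, using only the standard library. The function name `one_sample_t` and the sample values are hypothetical; the p-value is not computed, since that needs the t distribution, which SciPy supplies.

```python
import math
import statistics


def one_sample_t(sample, popmean):
    """t = (sample mean - hypothesised mean) / (sample sd / sqrt(n))."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("zero variance; the t statistic is undefined")
    return (mean - popmean) / (sd / math.sqrt(n))


# Hypothetical yields for one variable, for illustration only
yields = [14200.0, 15800.0, 14900.0, 16100.0, 15300.0]
print(one_sample_t(yields, 15000))
```

A statistic far from zero, in either direction, means the sample mean lies many standard errors away from the hypothesised mean.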
For more information, see: R, ggplot2, shiny summary
Initial pattern:
library(ggplot2)
dt = data.frame(A = c(2, 7, 4, 10, 1), B = c('B', 'A', 'C', 'D', 'E'))
windowsFonts(myFont = windowsFont("in italics"))  ## bind the font
p = ggplot(dt, aes(x = B, y = A, fill = B)) + geom_bar(stat = "identity", alpha = 0.7) + coord_polar()
p
The rose chart after repair:
library(ggplot2)
dt = data.frame(A = c(2, 7, 4, 10, 1), B = c('B', 'A', 'C', 'D', 'E'))
windowsFonts
a true data.frame type. The ggplot2 package provides a special version of the fortify function for geographic data to do this work. Use this function to convert x.
geom_polygon draws filled polygon paths, and a map is essentially a combination of many polygons, so this function is well suited to drawing maps.
mymap = ggplot(data = fortify(x)) + geom_polygon(aes(x = long, y = lat, group = id), colour = "black", fill = NA) + th
, and making their scatter plot
plt.scatter(forest_fires["wind"], forest_fires["area"])
plt.show()
plt.scatter(forest_fires['wind'], forest_fires['area'])
plt.title('wind speed vs fire area')
plt.xlabel('wind speed when fire started')
plt.ylabel('area consumed by fire')
plt.show()
# Use the list data as the axes
age = [5, 10, 15, 20, 25, 30]
height = [25, 45, 65, 75, 75, 75]
plt.plot(age, height)
plt.title('Age vs height')
plt.xlabel('age')
plt.ylabel('height')
plt.show()
, bootstrap sampling, k-fold cross-validation, and so on. Next, you can use the function evaluate() to assess the performance of multiple algorithms under the evaluation scheme.
2. Example analysis
library(recommenderlab)
library(ggplot2)
## data processing and exploratory data analysis
data(MovieLense)
image(MovieLense)
# get ratings
ratings.movie
summary(ratings.movie$ratings)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 3.00 4.00 3.53 4.00 5.00
ggplot(ratings.movie, aes(x = ratings)) + geom_histogram(fill = "beige", co
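The k-fold cross-validation scheme mentioned at the start of this snippet can be sketched in a few lines of Python. This is only an illustration of how the folds are formed, not how recommenderlab implements its evaluation schemes; the function name `k_fold_splits` and the `seed` parameter are hypothetical.

```python
import random


def k_fold_splits(n_items, k, seed=0):
    """Return k (train_indices, test_indices) pairs covering range(n_items)."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and the number of items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Deal the shuffled indices into k folds of near-equal size
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        # Every fold except the i-th one is used for training
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


for train, test in k_fold_splits(10, 5):
    print(sorted(test), len(train))
```

Each item lands in exactly one test fold, so every rating is used for testing once and for training k - 1 times.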
Visualization
Python has many visualization modules, the most popular being the matplotlib library. We can also choose the bokeh and seaborn modules. In a previous blog post, I explained the box-and-whisker plot module in the matplotlib library.
# Import the module for plotting
3.1 Basic bar chart
library(ggplot2)
library(gcookbook)
pg_mean  # the data used here
  group weight
1 ctrl  5.032
2 trt1  4.661
3 trt2  5.526
ggplot(pg_mean, aes(x = group, y = weight)) + geom_bar(stat = "identity")
The graph differs depending on whether the x-axis variable is continuous or a factor; here group is a factor.
str(pg_mean)
'data.frame': 3 obs. of 2 variables:
$ group : Factor w/ 3 levels "ctrl","trt1",..: 1 2 3  # group is a factor
$ weight: num 5.03 4.66 5.53
Set the fill color with fill, set the border color with co
Adding p-values and significance markers in R: visualization learning notes
http://www.jianshu.com/p/b7274afff14f?from=timeline
In the previous article, I mentioned how to use the ggpubr package to add p-values and significance markers to a ggplot figure; this article describes it in detail. The demo uses the ToothGrowth data set.
# load the package first
library(ggpubr)
# load the data set ToothGrowth
data("ToothGrowth")
head(ToothGrowth)
## len supp dose
## 1 4.2 VC
1. Color
1.1 r: Red
1.2 b: Blue
1.3 g: Green
1.4 y: Yellow
2. Data marker (marker)
2.1 o: Circle
2.2 .: Dot
2.3 D: Diamond
3. Line style (linestyle)
3.1 With a marker but no line style parameter, only the points are drawn
3.2 --: Dashed line
3.3 -: Solid line
4. Transparency
alpha
5. Size
6. Grid lines
plt.grid(True, color='g', linestyle='-', linewidth=2)
# Region fill
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 5 * np.pi, 1000)
y1 = np.sin(x)
y2 = np.sin(2 * x)
plt.plot(x, y1)
plt.plot(x, y2)
# f
= pd.value_counts(pred_churn)
# Calculate true probabilities
true_prob = {}
for prob in counts.index:
    true_prob[prob] = np.mean(is_churn[pred_churn == prob])
true_prob = pd.Series(true_prob)
# pandas-fu
counts = pd.concat([counts, true_prob], axis=1).reset_index()
counts.columns = ['pred_prob', 'count', 'true_prob']
counts
Output: we can see that the random forest predicts that 75 people have a 0.9 probability of churning, whereas in reality that group has about 0.97
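The comparison of predicted and observed churn rates can also be sketched without pandas. The following is a minimal standard-library version of the same grouping idea, not the original author's code; the function name `calibration_table` and the toy data are hypothetical.

```python
from collections import defaultdict


def calibration_table(pred_probs, outcomes):
    """Group outcomes by predicted probability.

    Returns (pred_prob, count, true_prob) rows sorted by pred_prob, where
    true_prob is the observed fraction of positive outcomes in that group.
    """
    groups = defaultdict(list)
    for p, y in zip(pred_probs, outcomes):
        groups[p].append(y)
    return [(p, len(ys), sum(ys) / len(ys)) for p, ys in sorted(groups.items())]


# Toy predictions and 0/1 churn outcomes, for illustration only
pred = [0.1, 0.1, 0.9, 0.9, 0.9, 0.5]
actual = [0, 0, 1, 1, 0, 1]
for row in calibration_table(pred, actual):
    print(row)
```

A well-calibrated model has true_prob close to pred_prob in every row; large gaps show where the predicted probabilities are too high or too low.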