Many clustering algorithms have been proposed to suit the characteristics of different industries. They fall into several categories: hierarchical, partitioning, density-based, graph-theoretic, grid-based, and model-based. Among the density-based algorithms, DBSCAN is the most representative. Assume a set of data; the R code that generates it is as follows: X1 0, Pi,length. out= -) Y10.1*rnorm ( -) X21.5+ SEQ (0, Pi,length. out= -) Y20.1*rnorm ( -) DataData.frame (C (X1,X2), C (y1
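To make the density-based idea concrete, here is a minimal sketch of DBSCAN in plain Python. It is not the R code from the excerpt above: the sample points, the eps and min_pts values, and the helper name region_query are all assumptions chosen for illustration.

```python
from math import dist

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1                   # points[i] is a core point: new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # noise reachable from a core point is a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels

# Assumed toy data: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

With these assumed values, each square forms its own cluster and the isolated point is labeled as noise (-1).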
With the lm function, you can fit a linear relationship that captures the trend between two variables (that is, linear fitting), and then use the predict function to apply the fitted trend to predict data.
fr = lm(Height ~ Weight, data = hw)
coef(fr)
This fits a linear model that predicts height from weight: a straight line defined by an intercept and a slope. Visualization:
library("ggplot2")
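The linear fit described above (a line defined by an intercept and a slope, then used for prediction) can be sketched in plain Python as follows. This is only an illustration of ordinary least squares, not R's lm implementation; the function names fit_line and predict and the weight/height values are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(coef, x):
    """Apply the fitted line to a new x value."""
    intercept, slope = coef
    return intercept + slope * x

# Assumed sample data: weight in kg, height in cm (made up, exactly linear).
weights = [50.0, 60.0, 70.0, 80.0]
heights = [150.0, 160.0, 170.0, 180.0]

coef = fit_line(weights, heights)   # plays the role of coef(fr)
predicted = predict(coef, 65.0)     # plays the role of predict(...)
print(coef, predicted)
```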
ggplot2 scale-related settings
Scale settings: mainly used to adjust the settings of each layer after a ggplot is drawn.
1. Attribute-related scale settings
These include scale_size(), scale_alpha(), and scale_shape().
As the names suggest, these three settings relate to ggplot layer properties: size, transparency, and shape.
The main parameters for these settings are listed below:
SCAL
Maps drawn in R are often used in data analysis and can be very effective. This section gives examples of how to use R tools to draw a good map.
The examples in this section run under R version 2.15.3; other versions have not been tested.
The code is as follows. The first small example:
# load the appropriate packages, read the data, and then draw
library(maptools)
library(ggplot2)
China_map"d://map//bou2_ 4p.shp", Proj4string=crs ("+p
array is the corresponding p value.
Visualization
Python has many visualization modules; the most popular is the matplotlib library. We can also choose the bokeh and seaborn modules. In my previous blog post, I explained the box plot functionality of the matplotlib library.
# Import the module for plotting
import matplotlib.pyplot as plt
df.plot(kind='box')
plt.show()
Now we can use the ggplot theme from R, integrated in the
For more information, see: R, ggplot2, shiny summary
Initial version:
library(ggplot2)
dt = data.frame(A = c(2, 7, 4, 10, 1), B = c('B', 'A', 'C', 'D', 'E'))
windowsFonts(myFont = windowsFont("in italics"))  ## bind the font
p = ggplot(dt, aes(x = B, y = A, fill = B)) + geom_bar(stat = "identity", alpha = 0.7) + coord_polar()
p
The rose chart after the fix:
library(ggplot2)
dt = data.frame(A = c(2, 7, 4, 10, 1), B = c('B', 'A', 'C', 'D', 'E'))
windowsFonts
draw.
library(maptools)
x = readShapePoly('bou2_4p.shp')
library(ggplot2)
library(mapproj)
# you can see the outline of the map of China
To draw further with the ggplot2 package, you need to convert the SpatialPolygonsDataFrame data type into a true data.frame. The ggplot2 package provides a special version of the fortify function for geographic data to do this work. Use this function to process x. geom_polygon is the function that draws filled polygon paths, and a map is actually a combination of various polygo
.show()
# Using Series as the coordinate axes
# Use wind as the x-axis and burned area as the y-axis, and draw their scatter plot
plt.scatter(forest_fires["wind"], forest_fires["area"])
plt.show()
plt.scatter(forest_fires['wind'], forest_fires['area'])
plt.title('wind speed vs fire area')
plt.xlabel('wind speed when fire started')
plt.ylabel('area consumed by fire')
plt.show()
# Use list data as the axes
age = [5, 10, 15, 20, 25, 30]
height = [25, 45, 65, 75, 75]
plt.
Now we can use the ggplot theme from R, integrated in the pandas module, to beautify the chart. To use ggplot, we only need to add one line to the code above:
import matplotlib.pyplot as plt
pd.options.display.mpl_style = 'default'  # Sets the plotting di
3.1 Basic bar chart
library(ggplot2)
library(gcookbook)
pg_mean  # this is the data used
  group weight
1  ctrl  5.032
2  trt1  4.661
3  trt2  5.526
ggplot(pg_mean, aes(x=group, y=weight)) + geom_bar(stat="identity")
The graph differs depending on whether the x-axis is a continuous variable or a factor; here, group is a factor.
str(pg_mean)
'data.frame': 3 obs. of 2 variables:
 $ group : Factor w/ 3 levels "ctrl","trt1",..: 1 2 3  # this shows that group is a factor
 $ weight: num 5.03 4.66 5.53
Set the fill color with fill, set the border color with co
R visualization study notes: adding p-values and significance markers
http://www.jianshu.com/p/b7274afff14f?from=timeline
In the previous article, I mentioned how to use the ggpubr package to add p-values and significance markers to a ggplot figure; this article describes it in detail. The demo uses the ToothGrowth data set.
# load the package first
library(ggpubr)
# load the data set ToothGrowth
data("ToothGrowth")
head(ToothGrowth)
## len supp dose
## 1 4.2 VC
)
ax.fill_between(x, y1, y2, where=y2>y1, facecolor='green', interpolate=True)
plt.show()
# style
plt.style.use('ggplot')
# Print the styles supported by plt
print(plt.style.available)
# histogram and two-dimensional histogram
# using the plt method
import numpy as np
import matplotlib.pyplot as plt
# normed: normalization; specifies whether to show counts or frequencies
mu =
sigma =
x = mu + sigma * np.random.randn(Watts)
plt.hist(x, bins=10
)
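The difference between showing counts and showing normalized frequencies in a histogram can be sketched without any plotting library. The values of mu, sigma, and the sample size below are assumptions (the excerpt above does not preserve them), and the helper names histogram and normalize are hypothetical. In current matplotlib releases the corresponding hist argument is density rather than normed.

```python
import random

def histogram(values, bins, lo, hi):
    """Count how many values fall into each of `bins` equal-width bins on [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v <= hi:
            # The maximum value belongs to the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
    return counts, width

def normalize(counts, width):
    """Turn raw counts into a density whose bars have a total area of 1."""
    total = sum(counts)
    return [c / (total * width) for c in counts]

random.seed(0)
mu, sigma, n = 100.0, 15.0, 10_000   # assumed values, for illustration only
x = [random.gauss(mu, sigma) for _ in range(n)]

counts, width = histogram(x, bins=10, lo=min(x), hi=max(x))
density = normalize(counts, width)

print(counts)    # raw counts per bin, summing to n
print(density)   # normalized heights; sum(height * width) is 1
```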
# pandas-fu
counts = pd.concat([counts, true_prob], axis=1).reset_index()
counts.columns = ['pred_prob', 'count', 'true_prob']
counts
Output:
We can see that the random forest predicted that 75 people would have a 0.9 probability of churn, whereas in reality that group churned at a rate of about 0.97.
Calibration and discrimination
Using the DataFrame above, I can draw a very simple plot to help visualize the probability measurements. The x-axis represents the churn probability of a group
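The table built above (predicted probability, number of cases, and observed rate per group) can be sketched in plain Python to show what each column means. This is an illustration only, not the pandas code from the excerpt: the function name calibration_table and the sample predictions and outcomes are assumptions.

```python
from collections import defaultdict

def calibration_table(pred_probs, outcomes):
    """Group cases by predicted probability (rounded to one decimal place).

    Returns rows of (pred_prob, count, true_prob), where true_prob is the
    share of cases in the group whose outcome was actually 1.
    """
    groups = defaultdict(list)
    for p, y in zip(pred_probs, outcomes):
        groups[round(p, 1)].append(y)
    return [(p, len(ys), sum(ys) / len(ys)) for p, ys in sorted(groups.items())]

# Assumed toy data: predicted probability of loss (churn) and the actual outcome.
pred_probs = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9]
outcomes   = [0,   0,   0,   1,   1,   1,   1,   1]

table = calibration_table(pred_probs, outcomes)
for pred_prob, count, true_prob in table:
    print(pred_prob, count, true_prob)
```

A well-calibrated model would show true_prob close to pred_prob in every row; in this made-up data the 0.1 group is observed at 0.25 and the 0.9 group at 1.0.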