According to the post "New R Interface to Oracle Data Mining Available for Download" on Oracle's official blog, Oracle has officially begun to support using the R language with Oracle databases. Put simply (and unofficially): Oracle has contributed an add-on package that provides an interface between Oracle and R.
Quoting the blog's introduction to R-ODM (R-Oracle Data Mining):
R-ODM is especially useful for:
Quick prototyping of vertical or domain-based applications where the Oracle Database supports the application
Scripting of "production" data mining methodologies
Customizing graphics of ODM data mining results (examples: classification, regression, anomaly detection)

We all know that R has an irreplaceable advantage for prototyping algorithms. General data mining algorithms implemented in R could already be embedded into databases, but the interface Oracle now provides makes deploying mining algorithms far more efficient.
Today (2010.06.08), CRAN updated the RODM package to version 1.0-2, which supports Windows, Linux, and Mac OS X.
The following example comes from the RODM package help documentation and gives a first impression of how efficiently an algorithm can be deployed:
    ### GLM Regression
    ## Not run:
    # DB is assumed to be an open connection created earlier with
    # RODM_open_dbms_connection(); this snippet does not create it.
    x1 <- 2 * runif(200)
    noise <- 3 * runif(200) - 1.5
    y1 <- 2 + 2*x1 + x1*x1 + noise
    dataset <- data.frame(x1, y1)
    names(dataset) <- c("X1", "Y1")
    RODM_create_dbms_table(DB, "dataset")    # Push the training table to the database

    glm <- RODM_create_glm_model(database = DB,    # Create ODM GLM model
                                 data_table_name = "dataset",
                                 target_column_name = "Y1",
                                 mining_function = "regression")

    glm2 <- RODM_apply_model(database = DB,    # Predict training data
                             data_table_name = "dataset",
                             model_name = "GLM_MODEL",
                             supplemental_cols = "X1")
    windows(height=8, width=12)    # Windows-only device; dev.new() is the portable equivalent
    plot(x1, y1, pch=20, col="blue")
    points(x=glm2$model.apply.results[, "X1"],
           y=glm2$model.apply.results[, "PREDICTION"], pch=20, col="red")
    legend(0.5, 9, legend = c("actual", "GLM regression"), pch = c(20, 20),
           col = c("blue", "red"),
           pt.bg = c("blue", "red"), cex = 1.20, pt.cex=1.5, bty="n")

    RODM_drop_model(DB, "GLM_MODEL")       # Drop the model
    RODM_drop_dbms_table(DB, "dataset")    # Drop the database table
    RODM_close_dbms_connection(DB)         # Close the connection once, at the end
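For readers without an Oracle instance at hand, the same prototype can be tried entirely in local R first. The following is only a rough sketch: it reuses the simulated data from the example and fits an ordinary least-squares line with base R's lm(), which stands in for the in-database model and is not claimed to match ODM's GLM settings. The names fit, scored, and rmse are illustrative and do not come from RODM.

```r
# Local, database-free analogue of the GLM regression example (base R only).
set.seed(42)                          # fixed seed so the run is reproducible
x1    <- 2 * runif(200)
noise <- 3 * runif(200) - 1.5
y1    <- 2 + 2 * x1 + x1 * x1 + noise
dataset <- data.frame(X1 = x1, Y1 = y1)

# Ordinary least-squares fit of Y1 on X1. This is an assumption made for
# illustration; it is not a reproduction of what ODM does inside the database.
fit <- lm(Y1 ~ X1, data = dataset)

# Score the training data and keep X1 next to the prediction, similar in
# spirit to supplemental_cols = "X1" in the RODM_apply_model() call.
scored <- data.frame(X1 = dataset$X1,
                     PREDICTION = predict(fit, newdata = dataset))

# Root mean squared error of the fit on the training data.
rmse <- sqrt(mean((dataset$Y1 - scored$PREDICTION)^2))
print(coef(fit))
print(rmse)
```

Once a local prototype like this behaves as expected, the RODM calls in the example move the same steps (create table, build model, apply model) into the database.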
A brief digression:
Beyond the statistical analysis packages that already offer R interfaces (SAS, SPSS, and Statistica), R's influence has now extended into the field of commercial databases.
Additional reading
R is a language and environment for statistical analysis and graphics. It was originally developed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand (which is also where the name R comes from), and is now developed by the R Development Core Team. R is a GNU project based on the S language, so it can be regarded as an implementation of S. Much code written in S runs in the R environment without modification. R's scoping semantics are derived from Scheme.
The source code of R can be freely downloaded and used, and precompiled binaries are available for multiple platforms, including UNIX (as well as FreeBSD and Linux), Windows, and Mac OS. R is operated mainly from the command line, and several graphical user interfaces have also been developed.