Vernacular Spatial Statistics 24: Geographically Weighted Regression (III)

Source: Internet
Author: User

This chapter contains mathematical formulas ... readers who are allergic to maths, beware ...

Picking up where the previous article left off ... Last time we saw that GWR finally emerged as an improvement on global regression, and with it the field of spatial analysis finally had a regression algorithm of its own. Spatial statistics differs from classical statistics in two major characteristics: spatial correlation and spatial heterogeneity. If the Moran index can be used to quantify spatial correlation, then geographically weighted regression can be used to quantify spatial heterogeneity.

Among the improvements to global regression, local regression is arguably the simplest method. GWR carries on the idea of local regression, but within the local window it follows the so-called "First Law of Geography": during the regression, spatial relationships are added to the computation as weights. Here is an example of the basic idea of GWR.

First look at global regression and local regression:

In local regression, a window is set, and then, according to the window size, a separate regression is computed in each part. In effect, each one is a smaller version of the global regression.
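The window idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic, the locations are one-dimensional for simplicity, and the names `s`, `window`, and `local_slopes` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.uniform(0, 10, size=400)   # hypothetical 1-D location of each sample
x = rng.normal(size=400)           # explanatory variable
# The true slope grows with location, so one global slope cannot describe it.
y = (1.0 + 0.3 * s) * x + rng.normal(scale=0.1, size=400)

# Global regression: a single slope for the whole study area.
global_slope = np.polyfit(x, y, deg=1)[0]

# Local regression: the same fit repeated inside each window.
window = 2.5
edges = np.arange(0, 10 + window, window)   # window boundaries 0, 2.5, 5, 7.5, 10
local_slopes = []
for lo, hi in zip(edges[:-1], edges[1:]):
    inside = (s >= lo) & (s < hi)
    local_slopes.append(np.polyfit(x[inside], y[inside], deg=1)[0])

print("global slope:", global_slope)
print("slopes per window:", local_slopes)
```

Each window gets its own slope, but every sample inside a window counts equally, and that is the part GWR changes.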

Now look at geographically weighted regression:

Like other regression analyses, geographically weighted regression first delimits a study area. Of course, this area can also cover the entire extent of the research data (so that spatial relationships, such as K nearest neighbors, can be used for the local weighted computation) ... Next, and most importantly, the spatial location of each feature is used to evaluate a decay function. This is a continuous function: when you put each feature's spatial position (usually its coordinates (x, y)) and the value of the feature into the function, you get a weight value, and this value can be brought into the regression equation.


So the most important thing is the distance decay function. Because the decay function produces different weights, the method is called "geographically weighted regression analysis". The theoretical basis of the decay function is the so-called "First Law of Geography" proposed by Tobler (Tobler's First Law of Geography): the closer a piece of data is, the greater its influence on the result compared with data farther away. Expressed mathematically, this influence becomes the weight.


With these formulas, all sample points can be calculated one by one. When each sample point is calculated, the other participating samples are given different weights according to their spatial relationship to that sample point, so a separate set of regression coefficients is obtained for each sample. Finally, interpreting these coefficients completes the whole geographically weighted regression analysis.
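The point-by-point procedure can be sketched as follows. This is a minimal sketch, not a reference implementation: it assumes a Gaussian distance decay (one of the weight functions listed later in this article), Euclidean distances, and synthetic data; the function names `gaussian_weights` and `gwr_coefficients` are made up for this example.

```python
import numpy as np

def gaussian_weights(coords, i, bandwidth):
    """Weights of all samples relative to sample i (Gaussian distance decay)."""
    d = np.linalg.norm(coords - coords[i], axis=1)   # Euclidean distance to point i
    return np.exp(-(d / bandwidth) ** 2)

def gwr_coefficients(coords, X, y, bandwidth):
    """Fit one weighted least squares regression per sample point."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])            # add an intercept column
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        w = gaussian_weights(coords, i, bandwidth)
        XtW = Xd.T * w                               # same as Xd.T @ diag(w)
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas

# Synthetic data: the true slope increases from west to east.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(200, 2))
x = rng.normal(size=200)
slope = 1.0 + 0.3 * coords[:, 0]
y = 2.0 + slope * x + rng.normal(scale=0.1, size=200)

betas = gwr_coefficients(coords, x, y, bandwidth=2.0)
print(betas[:5])   # one (intercept, slope) pair per sample point
```

The result is a table of coefficients, one row per sample point; mapping the slope column is the usual way to read the spatial heterogeneity.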


We have been emphasizing this decay function, so consider what happens if there is no decay. Without decay, all weights are the same (effectively 1, and any number multiplied by 1 is itself) ... The equation then becomes the global regression equation. That abandons the First Law of Geography and takes us straight back to classical statistical theory.
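This is easy to check numerically. The sketch below uses synthetic data and a hypothetical helper `weighted_ls`; it shows that weighted least squares with identical weights, whatever their common value, gives the same coefficients as ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + one variable
y = 1.5 + 0.8 * X[:, 1] + rng.normal(scale=0.2, size=n)

def weighted_ls(X, y, w):
    """Solve (X^T W X) beta = X^T W y for a diagonal weight matrix W = diag(w)."""
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)

beta_equal = weighted_ls(X, y, np.ones(n))         # every weight equal to 1
beta_scaled = weighted_ls(X, y, np.full(n, 7.0))   # every weight equal to 7
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)   # ordinary (global) regression

print(beta_equal, beta_scaled, beta_ols)
```

The constant cancels between the two factors of the estimate, so only differences between weights make the regression local.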


Now look at how this decay function is calculated.

The formulas come first; readers with math phobia, please skip ahead:
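For reference, the GWR model and its locally weighted least squares estimate are usually written as below. The notation is the standard textbook one and is assumed here: (u_i, v_i) are the coordinates of sample i, p is the number of explanatory variables, X is the design matrix, and y is the vector of observations.

```latex
y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i

\hat{\beta}(u_i, v_i) = \left( X^{\mathsf{T}} W(u_i, v_i)\, X \right)^{-1} X^{\mathsf{T}} W(u_i, v_i)\, y
```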

where W(u_i, v_i) is a spatial weights matrix. For this concept, please look back at Vernacular Spatial Statistics 17 ... However, since flipping back is a hassle, I will simply repeat the earlier content here:

The weight matrix: let's look at what this spatial weights matrix really is:

The thing on the left is called an undirected graph, and beside it is the so-called distance matrix. As we have said before, spatial analysis needs to conceptualize spatial relationships, so this matrix is also commonly called the spatial weights matrix.

Of course, this weight matrix is simple and clear: the shortest distance is used directly as the matrix element. For example, the distance between B and C can be read straight from the matrix as w_BC = 2.
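A distance matrix of this kind can be built from an undirected graph with a shortest-path computation. The sketch below is an illustration only: the graph (nodes A to D and their edge lengths) is hypothetical and chosen so that the shortest B to C distance is 2, and the Floyd-Warshall algorithm is my choice of method.

```python
import math

# Hypothetical undirected graph: edge lengths between named nodes.
nodes = ["A", "B", "C", "D"]
edges = {("A", "B"): 1, ("A", "C"): 1, ("C", "D"): 3, ("B", "D"): 5}

def shortest_distance_matrix(nodes, edges):
    """All-pairs shortest distances (Floyd-Warshall) as a nested list."""
    index = {name: k for k, name in enumerate(nodes)}
    n = len(nodes)
    dist = [[0 if r == c else math.inf for c in range(n)] for r in range(n)]
    for (a, b), length in edges.items():
        i, j = index[a], index[b]
        dist[i][j] = dist[j][i] = length   # undirected: same length both ways
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

W = shortest_distance_matrix(nodes, edges)
w_bc = W[nodes.index("B")][nodes.index("C")]
print(w_bc)   # shortest route B -> A -> C
```

Because the graph is undirected, the matrix is symmetric, and its diagonal is zero.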

With the weight matrix in hand, it is substituted into the regression equation, which gives the following:

In practice, the commonly used spatial weight functions are mainly the following:
1. Gaussian function:

where b is the bandwidth (window size) and d_ij is the distance between sample points i and j (which distance to use, Euclidean, Manhattan, Minkowski, spherical, cosine, etc., depends on the application).

2. Bi-square function:
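Written out in one commonly cited form, the two kernels are as follows, Gaussian first and bi-square second. Conventions differ between texts (some put a factor of 1/2 inside the Gaussian exponent), so treat the exact constants as an assumption rather than as the form used in the source book.

```latex
w_{ij} = \exp\!\left[ -\left( \frac{d_{ij}}{b} \right)^{2} \right]

w_{ij} =
\begin{cases}
\left[ 1 - \left( d_{ij} / b \right)^{2} \right]^{2}, & d_{ij} \le b \\
0, & d_{ij} > b
\end{cases}
```

The Gaussian weight never reaches zero, so every sample contributes a little; the bi-square weight cuts off completely beyond the bandwidth b.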

Both of these distance functions depend heavily on the bandwidth b, so how should the bandwidth be determined? The most common approach internationally is the cross-validation (CV) method proposed by Cleveland (1979) and Bowman (1984):

This method works with fitted values: ŷ_≠i(b) is the fitted value at i, computed with bandwidth b and with point i itself left out of the calibration. (Why not use the observed value? Answer: the observed value also carries a residual term ...) Using the fitted value directly is easier to compute. The b at which the CV value reaches its minimum is the desired bandwidth.

Since different spatial weight functions lead to different bandwidths, Fotheringham et al. (2002) proposed a criterion for choosing the optimal one: the bandwidth at which the GWR model has the smallest AIC is the best bandwidth.
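A leave-one-out version of this search can be sketched as below, using the score CV(b) = sum over i of (y_i - ŷ_≠i(b))^2. The assumptions are mine: a Gaussian kernel, Euclidean distances, synthetic data, and a small hand-picked grid of candidate bandwidths; the function name `loo_cv_score` is made up for this example.

```python
import numpy as np

def loo_cv_score(coords, X, y, bandwidth):
    """Sum of squared leave-one-out prediction errors for one bandwidth."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])   # add an intercept column
    score = 0.0
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-(d / bandwidth) ** 2)   # Gaussian distance decay
        w[i] = 0.0                          # leave point i out of its own fit
        XtW = Xd.T * w
        beta = np.linalg.solve(XtW @ Xd, XtW @ y)
        score += (y[i] - Xd[i] @ beta) ** 2
    return score

# Synthetic data with a slope that varies across space.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(150, 2))
x = rng.normal(size=150)
y = 2.0 + (1.0 + 0.3 * coords[:, 0]) * x + rng.normal(scale=0.1, size=150)

candidates = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
scores = np.array([loo_cv_score(coords, x, y, b) for b in candidates])
best_bandwidth = candidates[np.argmin(scores)]
print(dict(zip(candidates, scores)), best_bandwidth)
```

Leaving point i out matters: if it were kept, a very small bandwidth would let every point fit itself almost perfectly and the score would always favor the smallest b. In practice a one-dimensional optimizer is often used instead of a fixed grid.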

Well, here is yet another term: AIC ... So let's close this article by explaining what it is:

The Akaike information criterion, AIC for short, is a standard for measuring the goodness of fit of a statistical model. It was founded and developed by the Japanese statistician Hirotugu Akaike. Based on the concept of entropy, it weighs the complexity of the estimated model against how well the model fits the data. (This sentence comes from Baidu)
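In symbols, the general definition is the first line below, where k is the number of estimated parameters and L̂ is the maximized likelihood; a smaller AIC is better. The second line is the small-sample corrected form as it is usually quoted in the GWR literature, where n is the sample size, σ̂ the estimated standard deviation of the error term, and tr(S) the trace of the hat matrix. Whether the source book uses exactly this form is an assumption.

```latex
\mathrm{AIC} = 2k - 2 \ln \hat{L}

\mathrm{AIC_c} = 2n \ln(\hat{\sigma}) + n \ln(2\pi) + n \left[ \frac{n + \operatorname{tr}(S)}{n - 2 - \operatorname{tr}(S)} \right]
```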
After hearing that, Shrimp God's (my) feeling, anyway, is this:

Anyone interested can study it further. Finally, here is some history-of-science material:

The following is about Hirotugu Akaike, in the original Japanese:

Interested readers can visit his memorial website.

The formulas in this article come from "Spatial Econometrics" (Peking University Press), pp. ... The book is among those Shrimp God has shared, so interested readers can take a look.

Finally, if you need the shared books, the procedure is the same as usual: get the mailbox address through the public account, then send an email saying what you need.
