Analytics Toolbox for BigQuery

Applying GWR to understand Airbnb listings prices

Geographically Weighted Regression (GWR) is a statistical regression method that models the local (e.g. regional or sub-regional) relationships between a set of predictor variables and an outcome of interest. It should therefore be used instead of a global model whenever these relationships vary across space.

In this example we are going to analyze the local relationships between the prices of Airbnb listings in Berlin and the number of bedrooms and bathrooms available at these listings, using the GWR_GRID procedure. Our input dataset, publicly available at cartobq.docs.airbnb_berlin_h3_qk, contains the locations of the Airbnb listings as H3 and quadkey cells at different resolutions, together with their prices and their number of bedrooms and bathrooms.

We can run our GWR analysis with the following query:

CALL `carto-un`.carto.GWR_GRID(
    'cartobq.docs.airbnb_berlin_h3_qk',
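    ['bedrooms', 'bathrooms'], -- [ bedrooms feature, bathrooms feature ]
    'price', -- price (target variable)
    'h3_z7', 'h3', 3, 'gaussian', TRUE, -- cell column, grid type, k-ring size, kernel, fit intercept
    NULL -- output table (NULL: results are not saved to a table)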
);

This particular configuration runs a local regression for each H3 grid cell at resolution 7. Each regression takes into account all listings in the cell itself and in its neighborhood, defined as its k-ring of size 3. Data points within the neighborhood are weighted according to the chosen kernel function, in this case a Gaussian, so that their weight decreases with the distance to the central cell.
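To make the weighting step concrete, the following is a minimal Python sketch of a single local regression, not the toolbox's actual implementation. It assumes a Gaussian kernel of the form exp(-0.5 * (d / bandwidth)^2), where d is the grid distance to the central cell; the bandwidth value, the function names and the toy listings are all hypothetical and chosen only for illustration.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian kernel (assumed form): 1 at the central cell, decaying with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def local_fit(rows, bandwidth):
    """Weighted least squares for one central cell.

    rows: (grid_distance, bedrooms, bathrooms, price) tuples for the listings
          in the cell and its neighborhood.
    Returns [intercept, bedrooms_coef, bathrooms_coef].
    """
    xtwx = [[0.0] * 3 for _ in range(3)]
    xtwy = [0.0] * 3
    for dist, bedrooms, bathrooms, price in rows:
        w = gaussian_weight(dist, bandwidth)
        x = (1.0, bedrooms, bathrooms)  # leading 1.0 fits the intercept
        for i in range(3):
            xtwy[i] += w * x[i] * price
            for j in range(3):
                xtwx[i][j] += w * x[i] * x[j]
    return solve(xtwx, xtwy)

# Hypothetical toy listings: (grid distance, bedrooms, bathrooms, price).
# Prices follow price = 20 + 30 * bedrooms + 10 * bathrooms exactly.
rows = [
    (0, 1, 1, 60.0),
    (1, 2, 1, 90.0),
    (2, 3, 2, 130.0),
    (3, 1, 2, 70.0),
    (1, 2, 2, 100.0),
]
coefs = local_fit(rows, bandwidth=3.0)
print(coefs)
```

Because the toy prices are generated from a known linear rule, the fit recovers that rule's intercept and coefficients up to floating-point rounding. With real listings, repeating this fit for every grid cell, each time measuring distances from that cell, yields one set of coefficients per cell.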

The output of our GWR analysis is a table that contains the result of each of these regressions: the coefficients for each of the predictor variables and the intercept. The following maps show the coefficients associated with the number of bedrooms (top) and bathrooms (bottom), where darker areas correspond to lower values and brighter areas to higher values:

Positive values indicate a positive association between the Airbnb listing prices and the number of bedrooms or bathrooms (each conditional on the other), with larger absolute values indicating a stronger association.

Overall, listings with more bedrooms and bathrooms also have higher prices. However, the strength of this association varies across the city: for instance, the number of bedrooms clearly drives prices up in the city center, but much less so in the outskirts.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 960401.