sshicm package
The sshicm package can be used to address the following issues:
- Information consistency-based measures of spatial stratified heterogeneity intensity for continuous and nominal variables.
- Strength of spatial pattern associations based on information consistency measures.
Example data in the sshicm package
“baltim” consists of Baltimore home sale prices and hedonic attributes, with 221 observations in total. The explanatory variables are whether the home is a detached unit (DWELL), whether it has a patio (PATIO), whether it has a fireplace (FIREPL), whether it has air conditioning (AC), and whether the dwelling is in Baltimore County (CITCOU), while the target variable is the sale price of the home (PRICE).
“cinc” is derived from the 2008 Cincinnati Crime + Socio-Demographics dataset. It includes spatial data on 457 objects located on an irregular lattice. The explanatory variables are male population (MALE), female population (FEMALE), median age (MEDIAN_AGE), average family size (AVG_FAMSIZ), and population density (DENSITY), while the target variable is the existence of theft (THEFT_D).
Functions in the sshicm package
- sshic() for a continuous dependent variable
- sshin() for a nominal dependent variable
- sshicm(), which yields all results in a single line of code, with the type parameter set to "IC" (continuous) or "IN" (nominal) to specify whether the dependent variable is continuous or nominal.
Note: All explanatory variables must be discretized in advance or inherently be discrete nominal variables.
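As an illustration of the note above, a continuous explanatory variable can be turned into discrete strata before it is passed to the package. The following is a minimal sketch in base R, assuming quartile-based binning is acceptable for the data at hand; the vector `x` holds made-up values and is not part of the package data.

```r
# Hypothetical continuous explanatory variable (made-up values for illustration)
x <- c(2.1, 3.5, 4.8, 5.0, 6.2, 7.7, 8.1, 9.4, 10.3, 12.6)

# Quartile break points of x
breaks <- quantile(x, probs = seq(0, 1, by = 0.25))

# Assign each value to a stratum (integer codes 1 to 4);
# include.lowest = TRUE keeps the minimum value in the first stratum
x_strata <- cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Number of observations per stratum
table(x_strata)
```

Quantile breaks are only one option; equal-width intervals or domain-specific thresholds may suit a given variable better, and the choice of discretization can affect the resulting measures.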
$$ I_{C}\left(d,s\right) = \sum_{s_{i} \in S}p\left(s_{i}\right)\frac{ \arctan \left(\textbf{RelE} \left( f_{d_{i}} \mid \mid f \right) \right)}{\pi / 2} $$
where $d_i$ is the random variable corresponding to the target variable in stratum $s_i$, and $f_{d_i}$ and $f$ are the density functions of $d_i$ and $d$, respectively. Additionally, $\textbf{RelE}\left(f_{d_i} \mid\mid f\right)$ is the relative entropy of $f_{d_i}$ and $f$.
$$ \textbf{RelE} \left( f_{d_{i}} \mid \mid f \right) = H \left(f_{d_{i}} , f\right) - H \left(f_{d_{i}}\right) = \sum_{i = 1}^{n} f_{d_{i}} \log \frac{1}{f} - \sum_{i = 1}^{n} f_{d_{i}} \log \frac{1}{f_{d_{i}}} = \sum_{i = 1}^{n} f_{d_{i}} \log \frac{f_{d_{i}}}{f} $$
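The two formulas above can be traced step by step in base R. This is a rough sketch only: it assumes the densities are approximated by histogram proportions on shared bins chosen by the Sturges rule, which may differ from how sshic() estimates them. The helper names `rel_entropy` and `ic_sketch` are hypothetical and not part of the package.

```r
# Relative entropy between two discrete distributions on the same bins;
# bins where p is zero contribute nothing (0 * log 0 is taken as 0)
rel_entropy <- function(p, q) {
  keep <- p > 0
  sum(p[keep] * log(p[keep] / q[keep]))
}

# Sketch of I_C: d is the continuous target, s holds the stratum labels
ic_sketch <- function(d, s) {
  # Shared histogram bins so that f_di and f are comparable (assumption)
  breaks <- hist(d, breaks = "Sturges", plot = FALSE)$breaks
  bin <- cut(d, breaks = breaks, include.lowest = TRUE)
  f <- as.numeric(table(bin)) / length(d)       # pooled distribution f
  strata <- split(bin, s, drop = TRUE)
  sum(vapply(strata, function(b) {
    f_di <- as.numeric(table(b)) / length(b)    # within-stratum distribution
    p_si <- length(b) / length(d)               # stratum weight p(s_i)
    p_si * atan(rel_entropy(f_di, f)) / (pi / 2)
  }, numeric(1)))
}

# Simulated data: strata that differ in the target versus strata that do not
set.seed(1)
s <- rep(c("a", "b"), each = 50)
d_sep <- c(rnorm(50, mean = 0), rnorm(50, mean = 5))
d_mix <- rnorm(100)
ic_sketch(d_sep, s)
ic_sketch(d_mix, s)
```

The first call should return the larger value, since the target distribution differs strongly between the two strata. The arctangent scaling maps the unbounded relative entropy into the interval from 0 to 1.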
$$ I_{N}\left(d,s\right) = \frac{I \left(d,s\right)}{I \left(d\right)} = \frac{I \left(d\right) - I \left(d \mid s\right)}{I \left(d\right)} = 1 - \frac{\sum_{s_i \in S}\sum_{x \in V_d} p\left(s_i,x\right) \log p\left(x \mid s_i\right)}{\sum_{x \in V_d} p\left(x\right) \log p\left(x\right)} $$
where $p(x)$ is the probability of observing $x$ in $U$, $p(s_i, x)$ is the probability of observing $s_i$ and $x$ in $U$, and $p(x \mid s_i)$ is the probability of observing $x$ given that the stratum is $s_i$.
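The nominal measure can be computed directly from a contingency table of strata and target values. The sketch below follows the formula above using empirical frequencies as probabilities; the function name `in_sketch` is hypothetical, and this is not necessarily how sshin() is implemented. It assumes the target takes at least two distinct values, otherwise the denominator is zero.

```r
# Sketch of I_N: d is the nominal target, s holds the stratum labels
in_sketch <- function(d, s) {
  joint <- table(s, d) / length(d)    # p(s_i, x)
  p_s <- rowSums(joint)               # p(s_i)
  p_x <- colSums(joint)               # p(x)
  cond <- joint / p_s                 # p(x | s_i), each row divided by p(s_i)
  pos <- joint > 0                    # skip empty cells (0 * log 0 = 0)
  h_cond <- -sum(joint[pos] * log(cond[pos]))       # I(d | s)
  h_d <- -sum(p_x[p_x > 0] * log(p_x[p_x > 0]))     # I(d)
  1 - h_cond / h_d
}

# Made-up labels: two strata of ten observations each
s <- rep(c("a", "b"), each = 10)
d_same <- s                           # target fully determined by the stratum
d_indep <- rep(c("x", "y"), times = 10)  # same target mix in every stratum
in_sketch(d_same, s)
in_sketch(d_indep, s)
```

When the stratum fully determines the target, the conditional term vanishes and the measure reaches 1; when every stratum has the same mix of target values, it drops to 0.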
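Examples of the sshicm package
```r
library(sshicm)

# Load the Baltimore home sales data shipped with the package
baltim = sf::read_sf(system.file("extdata/baltim.gpkg", package = "sshicm"))

# Continuous dependent variable (PRICE): information consistency measure Ic
sshicm(PRICE ~ ., baltim, type = "IC")
## # A tibble: 5 × 3
##   Variable     Ic    Pv
##   <chr>     <dbl> <dbl>
## 1 AC       0.223  0
## 2 PATIO    0.162  0.643
## 3 FIREPL   0.135  0.657
## 4 DWELL    0.124  0.716
## 5 CITCOU   0.0898 0.988
```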
```r
# Load the Cincinnati crime data shipped with the package
cinc = sf::read_sf(system.file("extdata/cinc.gpkg", package = "sshicm"))

# Nominal dependent variable (THEFT_D): information consistency measure In
sshicm(THEFT_D ~ ., cinc, type = "IN")
## # A tibble: 5 × 3
##   Variable        In      Pv
##   <chr>        <dbl>   <dbl>
## 1 DENSITY    0.776   0.0681
## 2 MEDIAN_AGE 0.228   0.0230
## 3 MALE       0.0367  0
## 4 AVG_FAMSIZ 0.0205  0.00300
## 5 FEMALE     0.00584 0.0200
```
Wang, J., Haining, R., Zhang, T., Xu, C., Hu, M., Yin, Q., … Chen, H. (2024). Statistical Modeling of Spatially Stratified Heterogeneous Data. Annals of the American Association of Geographers, 114(3), 499–519. https://doi.org/10.1080/24694452.2023.2289982.
Bai, H., Wang, H., Li, D., & Ge, Y. (2023). Information Consistency-Based Measures for Spatial Stratified Heterogeneity. Annals of the American Association of Geographers, 113(10), 2512–2524. https://doi.org/10.1080/24694452.2023.2223700.
Wang, J., Li, X., Christakos, G., Liao, Y., Zhang, T., Gu, X., & Zheng, X. (2010). Geographical Detectors‐Based Health Risk Assessment and its Application in the Neural Tube Defects Study of the Heshun Region, China. International Journal of Geographical Information Science, 24(1), 107–127. https://doi.org/10.1080/13658810802443457.
Wang, J. F., Zhang, T. L., & Fu, B. J. (2016). A Measure of Spatial Stratified Heterogeneity. Ecological Indicators, 67, 250–256. https://doi.org/10.1016/j.ecolind.2016.02.052.