--- title: "Consistency With the Results of Existing GDM R Packages" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{consistency} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- This vignette discusses the consistency of results between the basic geographic detector in the `gdverse` package and existing R packages of GDModels (i.e., `geodetector` and `GD` packages). Use the `CollectData` data from `geodetector` package as the demo; ``` r collectdata = geodetector::CollectData res1 = geodetector::factor_detector("incidence", c("soiltype","watershed","elevation"), collectdata) res1 ## [[1]] ## q-statistic p-value ## soiltype 0.3857168 0.3632363 ## ## [[2]] ## q-statistic p-value ## watershed 0.6377737 0.0001169914 ## ## [[3]] ## q-statistic p-value ## elevation 0.6067087 0.04080407 res2 = GD::gd(incidence ~ ., data = collectdata) res2 ## variable qv sig ## 1 watershed 0.6377737 0.000128803 ## 2 soiltype 0.3857168 0.372145486 ## 3 elevation 0.6067087 0.043382244 res3 = gdverse::geodetector(incidence ~ ., data = collectdata) res3 ## Factor Detector ## ## | variable | Q-statistic | P-value | ## |:---------:|:-----------:|:-----------:| ## | watershed | 0.6377737 | 0.000128803 | ## | elevation | 0.6067087 | 0.043382244 | ## | soiltype | 0.3857168 | 0.372145486 | ``` The q-statistic calculations for all variables in the three packages are consistent, but there are slight differences in the results of the q-values. Among them, `gdverse` is consistent with the `GD` package, and there are differences with the `geodetector` package. This is caused by the inconsistent choice of the non-central F-distribution parameters of the p-value for the q-statistic; when there is only one sample in a certain stratification, it cannot calculate the variance and therefore does not contribute to the q-statistic calculation. The `gdverse` and `GD` packages use the same strategy, which is to directly remove these single-sample layers, but the `geodetector` package calculates the total sample size and stratification number directly before data processing, so it causes a slight difference in the estimation of the p-value for the q-statistic. In actual problems, this situation occurs less frequently. We believe that using the actual number of samples and stratifications participating in the calculation is more prudent, so we chose the same processing strategy as the `GD` package.