I’m happy to announce that the ipfr package is available on CRAN! The goal of this package is to make survey expansion, matrix balancing, and population synthesis easier.
A basic use case is the task of balancing a matrix to row and column targets:
library(ipfr)
library(dplyr)
mtx <- matrix(data = runif(9), nrow = 3, ncol = 3)
row_targets <- c(3, 4, 5)
column_targets <- c(5, 4, 3)
result <- ipu_matrix(mtx, row_targets, column_targets)
rowSums(result)
#> [1] 3.000001 4.000015 4.999985
colSums(result)
#> [1] 5 4 3
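To give a feel for what balancing involves, here is a minimal base-R sketch of iterative proportional fitting. It is not the package's implementation: `balance_matrix()` and its `max_iterations` and `tolerance` arguments are hypothetical names used only for illustration. The sketch alternately scales rows and columns until the row sums are close to their targets.

```r
# Minimal iterative proportional fitting in base R.
# balance_matrix() is a hypothetical helper, not part of ipfr.
balance_matrix <- function(mtx, row_targets, column_targets,
                           max_iterations = 100, tolerance = 1e-6) {
  # The two sets of targets must describe the same grand total.
  stopifnot(isTRUE(all.equal(sum(row_targets), sum(column_targets))))
  for (i in seq_len(max_iterations)) {
    # Scale each row so its sum matches the row target.
    mtx <- mtx * (row_targets / rowSums(mtx))
    # Scale each column so its sum matches the column target.
    mtx <- sweep(mtx, 2, column_targets / colSums(mtx), `*`)
    # Columns now match exactly; stop once the rows are close enough.
    if (max(abs(rowSums(mtx) - row_targets)) < tolerance) break
  }
  mtx
}

set.seed(42)  # arbitrary seed so the sketch is reproducible
mtx <- matrix(runif(9), nrow = 3, ncol = 3)
balanced <- balance_matrix(mtx, c(3, 4, 5), c(5, 4, 3))
rowSums(balanced)
colSums(balanced)
```

Because the column step runs last in this sketch, the column sums come out exact while the row sums are only approximate, which is the same pattern visible in the output above.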
The example below creates a simple survey and expands it to meet known
population targets. Each row in the survey data frame represents a household
and contains information on the number of household members (size) and number
of autos. The targets list contains population targets that the survey
expansion should match. For example, there should be a total of 75 households
with 1 person.
survey <- tibble(
size = c(1, 2, 1, 1),
autos = c(0, 2, 2, 1),
weight = 1
)
targets <- list()
targets$size <- tibble(
`1` = 75,
`2` = 25
)
targets$autos <- tibble(
`0` = 25,
`1` = 50,
`2` = 25
)
result <- ipu(survey, targets)
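To make the targets concrete, the sketch below checks a set of weights against them using only base R. The weights are worked out by hand and are an assumption for illustration, not the output of ipu(). In this small example the targets determine them uniquely: the only 2-person household must carry 25, the only 0-auto household 25, the only 1-auto household 50, which leaves 0 for the remaining household.

```r
# The same toy survey as a plain data frame (no packages needed).
survey <- data.frame(
  size  = c(1, 2, 1, 1),
  autos = c(0, 2, 2, 1)
)

# Hand-derived weights (an assumption for illustration, not ipu() output).
survey$weight <- c(25, 25, 0, 50)

# Weighted household totals for each category of each variable.
size_totals  <- tapply(survey$weight, survey$size, sum)
autos_totals <- tapply(survey$weight, survey$autos, sum)

size_totals   # compare with targets$size:  75 and 25
autos_totals  # compare with targets$autos: 25, 50 and 25
```

An iterative procedure would only approach these values, so small differences from a hand calculation like this one are to be expected.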
The package also supports a number of advanced features:
- Match to household- and person-level targets simultaneously
- View and restrict the distribution of resulting weights
- Control by geography
- Handle target agreement and importance
Finally, the resulting weight table can be used to easily create a synthetic population:
synthesize(result$weight_tbl)
#> # A tibble: 100 x 4
#>    new_id    id  size autos
#>     <int> <int> <dbl> <dbl>
#>  1      1     1     1     0
#>  2      2     4     1     1
#>  3      3     1     1     0
#>  4      4     2     2     2
#>  5      5     4     1     1
#>  6      6     4     1     1
#>  7      7     2     2     2
#>  8      8     2     2     2
#>  9      9     4     1     1
#> 10     10     4     1     1
#> # ... with 90 more rows
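One way to picture this step is as weighted sampling with replacement. The base-R sketch below is only an illustration of that idea and is not necessarily how synthesize() works internally. The `weight_tbl` data frame and its weights are assumed values, chosen to satisfy the targets from the earlier example.

```r
# Toy weight table; the weights are assumed for illustration.
weight_tbl <- data.frame(
  id     = 1:4,
  size   = c(1, 2, 1, 1),
  autos  = c(0, 2, 2, 1),
  weight = c(25, 25, 0, 50)
)

set.seed(1)  # arbitrary seed so the draw is reproducible

# One synthetic household per unit of total weight.
n <- round(sum(weight_tbl$weight))

# Draw row indices with probability proportional to weight.
picks <- sample(seq_len(nrow(weight_tbl)), size = n, replace = TRUE,
                prob = weight_tbl$weight)

# Copy the sampled households and give each one a new identifier.
synthetic <- weight_tbl[picks, c("id", "size", "autos")]
synthetic <- cbind(new_id = seq_len(n), synthetic)
rownames(synthetic) <- NULL
head(synthetic)
```

A household with zero weight is never drawn, and heavily weighted households appear many times, so the marginal totals of the synthetic population approximate the targets rather than matching them exactly.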