I’m happy to announce that the ipfr package is available on CRAN! The goal of this package is to make survey expansion, matrix balancing, and population synthesis easier.
A basic use case is the task of balancing a matrix to row and column targets:
library(ipfr)
library(dplyr)

mtx <- matrix(data = runif(9), nrow = 3, ncol = 3)
row_targets <- c(3, 4, 5)
column_targets <- c(5, 4, 3)
result <- ipu_matrix(mtx, row_targets, column_targets)
rowSums(result)
#> [1] 3.000001 4.000015 4.999985
colSums(result)
#> [1] 5 4 3

The example below creates a simple survey and expands it to meet known
population targets. Each row in the survey data frame represents a household
and contains information on the number of household members (size) and number
of autos. The targets list contains population targets that the survey
expansion should match. For example, there should be a total of 75 households
with 1 person.
survey <- tibble(
  size = c(1, 2, 1, 1),
  autos = c(0, 2, 2, 1),
  weight = 1
)
targets <- list()
targets$size <- tibble(
  `1` = 75,
  `2` = 25
)
targets$autos <- tibble(
  `0` = 25,
  `1` = 50,
  `2` = 25
)
result <- ipu(survey, targets)

The package also supports a number of advanced features:
- Match to household- and person-level targets simultaneously
- View and restrict the distribution of resulting weights
- Control by geography
- Handle target agreement and importance
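
Returning to the simple survey example above, it can help to see what "meeting the targets" means in terms of the weights. The sketch below uses only base R and hand-derived weights, which are an assumption for illustration and not output from `ipu()`. Size 2, 0 autos, and 1 auto each appear in exactly one survey row, so those rows must carry weights of 25, 25, and 50; the remaining household is then left with a weight of 0. An iterative routine would be expected to approach these values rather than hit them exactly.

```r
# Survey from the example above: one row per household.
survey <- data.frame(
  size  = c(1, 2, 1, 1),
  autos = c(0, 2, 2, 1)
)

# Hand-derived weights (an assumption for illustration, not ipu() output).
survey$weight <- c(25, 25, 0, 50)

# Weighted household totals by category, for comparison with the targets.
size_totals  <- tapply(survey$weight, survey$size, sum)
autos_totals <- tapply(survey$weight, survey$autos, sum)

size_totals   # compare with targets$size:  75, 25
autos_totals  # compare with targets$autos: 25, 50, 25
```

The same `tapply()` comparison can be run on any weight column to see how closely an expansion matches its targets.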
Finally, the resulting weight table can be used to easily create a synthetic population:
synthesize(result$weight_tbl)
#> # A tibble: 100 x 4
#>    new_id    id  size autos
#>     <int> <int> <dbl> <dbl>
#>  1      1     1     1     0
#>  2      2     4     1     1
#>  3      3     1     1     0
#>  4      4     2     2     2
#>  5      5     4     1     1
#>  6      6     4     1     1
#>  7      7     2     2     2
#>  8      8     2     2     2
#>  9      9     4     1     1
#> 10     10     4     1     1
#> # ... with 90 more rows
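
For readers curious about what balancing a matrix to row and column targets involves, here is a minimal base R sketch of classic iterative proportional fitting. It is not the package's implementation: `balance_matrix()`, its arguments, and the fixed seed matrix are hypothetical names and values chosen for illustration. The loop alternately rescales rows and columns until the row sums are within a tolerance of their targets.

```r
# Classic iterative proportional fitting (IPF) in base R.
# balance_matrix() is a hypothetical helper, not part of ipfr.
balance_matrix <- function(mtx, row_targets, column_targets,
                           max_iterations = 100, tolerance = 1e-6) {
  # Row and column targets must describe the same grand total.
  stopifnot(isTRUE(all.equal(sum(row_targets), sum(column_targets))))
  for (i in seq_len(max_iterations)) {
    # Scale each row so the row sums match the row targets.
    mtx <- mtx * (row_targets / rowSums(mtx))
    # Scale each column so the column sums match the column targets.
    mtx <- sweep(mtx, 2, column_targets / colSums(mtx), `*`)
    # Column sums are now exact; stop once row sums are close enough.
    if (max(abs(rowSums(mtx) - row_targets)) < tolerance) break
  }
  mtx
}

# A fixed, strictly positive seed keeps the example reproducible.
seed <- matrix(c(1, 2, 3,
                 4, 5, 6,
                 7, 8, 9), nrow = 3, byrow = TRUE)
balanced <- balance_matrix(seed, c(3, 4, 5), c(5, 4, 3))
rowSums(balanced)
colSums(balanced)
```

Because the column step runs last in this sketch, the column sums match their targets to floating-point precision while the row sums match only to within the tolerance.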