Save Simulation Results with Names as Hashes from the Parameters that Generated Them
Source:R/save_objects.R
save_objects.RdSaves RDS files to a specified folder with a name that is a hash generated from a list of parameters used for the simulation. There are a number of options that control the behavior, however, the default functionality likely covers 99% of use cases.
Usage
save_objects(
folder,
results,
parameters_list = NULL,
yaml = TRUE,
ignore_na = TRUE,
alphabetical_order = TRUE,
overwrite = FALSE,
include_timestamp = TRUE,
hash_includes_timestamp = FALSE,
algo = "xxhash64",
get_script_name = TRUE,
ignore_script_name = FALSE,
incremental = FALSE,
silent = FALSE
)Arguments
- folder
Character string specifying the path to the directory where the objects will be saved.
- results
The R object or list of objects to be saved.
- parameters_list
A named list of arguments used to generate a unique hash for the file.
- yaml
Whether results should be saved in indexr.yaml if
TRUE(default) or individual parameter files ifFALSEwhich is the egacy behavior prior to version 0.3.0.- ignore_na
Logical. If
TRUE,NAvalues inparameters_listare ignored during hash generation.- alphabetical_order
Logical. If
TRUE, the names inparameters_listare sorted alphabetically before hash generation.- overwrite
Logical. If
TRUE, existing files with the same hash will be overwritten. IfFALSEand a conflict occurs, the results will be saved under a temporary hash.- include_timestamp
Logical. If
TRUE, a timestamp is added toparameters_list.- hash_includes_timestamp
Logical. If
TRUE, the timestamp is included in the hash generation.- algo
Character string specifying the hashing algorithm to use. Default is
"xxhash64". See?digest- get_script_name
Logical. If
TRUE, attempts to get the script name and add it toparameters_list. Only works if script is run from command line, in an interactive session, this will always beNULL.- ignore_script_name
Logical. If
TRUE, the script name is ignored during hash generation.- incremental
Logical. If
TRUE, results are saved in a subfolder named after the hash and can be combined withcompress_incremental. Note, ifTRUE, no checks will be done for results that already exist, the user should check this in their script withcheck_hash_existence.- silent
Logical. If
TRUE, no check is done that pairs of results files (parameters and associated results) is done. This check is not necessary, but done by default to keep the user aware of a scenario that usually results from manual file manipulation.
Details
This function saves R objects to disk with a file name based on a generated hash of the provided arguments. It supports incremental saving, where multiple results can be saved under the same hash in a subdirectory and later collected. This can be helpful for a simulation that runs and saves results in parallel for the SAME set of simulation parameters.
Examples
## Setup
tmp_dir <- file.path(tempdir(), "example")
dir.create(tmp_dir)
## Example using parameter list to run simulation and save results
parameters_list <- list(
iterations = 1000,
x_dist = "rnorm",
x_dist_options = list(n = 10, mean = 1, sd = 2),
error_dist = "rnorm",
error_dist_options = list(n = 10, mean = 0, sd = 1),
beta0 = 1,
beta1 = 1
)
betas <- numeric(parameters_list$iterations)
for (i in 1:parameters_list$iterations) {
x <- do.call(parameters_list$x_dist, parameters_list$x_dist_options)
err <- do.call(parameters_list$error_dist, parameters_list$error_dist_options)
y <- parameters_list$beta0 + parameters_list$beta1*x + err
betas[i] <- coef(lm(y ~ x))["x"]
}
save_objects(folder = tmp_dir, results = betas, parameters_list = parameters_list)
## Read back in (consider clearing environment before running)
## Re-setup
tmp_dir <- file.path(tempdir(), "example")
parameters_list <- list(
iterations = 1000,
x_dist = "rnorm",
x_dist_options = list(n = 10, mean = 1, sd = 2),
error_dist = "rnorm",
error_dist_options = list(n = 10, mean = 0, sd = 1),
beta0 = 1,
beta1 = 1
)
betas <- read_objects(folder = tmp_dir, parameters_list = parameters_list)
#> Warning: File not found for hash: 8fe7c7435185d35c
## Cleanup
unlink(tmp_dir, recursive = TRUE)