Using the IntakeBalance Package

IntakeBalance is an R package to help with implementing intake-balance methods for assessing energy intake (EI). In this vignette, we will demonstrate how to do this using an accelerometer-based approach.

We realize this might seem like an intimidating tool, since most potential users are not R programmers. Although it’s true that we can currently only provide support for using this tool in R, we want to make it as accessible as possible. On this page, our goal is to provide thorough instructions that will demystify the process and limit how intimidating it feels. But we also recognize there may be areas where things need to be clearer. If you find anything confusing, please feel free to let us know by filing an issue, and we will do our best to revise in a way that makes these instructions easier to follow.

Prerequisites

Before you can use IntakeBalance, you need to get your machine set up. The first step is to install R if you haven’t already. We also recommend installing RStudio. The default settings should do the trick, so you can just run the installers. Then open RStudio and get ready to run some code in the console (the bottom-left pane by default). You can paste code there just as you would paste text into a document, then press [Enter] to execute it.

The next task is to install the IntakeBalance package. This could be done much more concisely than what’s shown in the code below, but we’re going to go the long way in hopes of avoiding any breakdowns that might occur. (Sometimes things work differently depending on your specific computer setup or prior installation of R, so this longer approach helps us to [hopefully] account for that.) Just copy, paste, and execute all of the following code:

## List the required packages to install from mainstream R archive
cran_packages <- c(
  "digest", "dplyr", "knitr", "lubridate", "magrittr",
  "nnet", "PAutilities", "PhysActBedRest", "PhysicalActivity",
  "purrr", "randomForest", "read.gt3x", "devtools", "rmarkdown",
  "rlang", "Sojourn", "stats", "testthat", "tidyr", "tree",
  "TwoRegression", "units", "utils", "xml2"
)


## Install any of the previous packages that aren't already installed
invisible(lapply(
  cran_packages,
  function(x) if (!isTRUE(requireNamespace(x, quietly = TRUE)))
    install.packages(x)
))


## List the required packages to install from GitHub
remote_packages <- c(
  "paulhibbing/anthropometry", "paulhibbing/accelEE",
  "paulhibbing/EE.Data", "paulhibbing/IntakeBalance"
)


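## Install any of the GitHub packages that aren't already installed
## (the check uses the package name, i.e., the part after "username/",
## assuming each package is named after its repository)
invisible(lapply(
  remote_packages,
  function(x) if (!isTRUE(requireNamespace(basename(x), quietly = TRUE)))
    devtools::install_github(x, dependencies = FALSE)
))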


## Install one other package using a different method,
## if it hasn't already been installed

if (!requireNamespace("AGread", quietly = TRUE))
  devtools::install_version("AGread", "1.3.2")

Once you’ve finished the above, you’re ready to look at an example of using the IntakeBalance package.
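If you’d like to confirm that the installation worked before moving on, a quick check along the following lines may help. This is only a sketch: the package names listed are the ones used later in this vignette, and we are assuming each GitHub package is named after its repository.

```r
## Packages this vignette relies on later (names assumed to match the
## repository names used during installation)
needed <- c("IntakeBalance", "accelEE", "PAutilities", "dplyr", "knitr")

## TRUE means the package can be loaded; FALSE means it still needs installing
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

print(installed)

## List anything that is still missing (an empty result means you're all set)
names(installed)[!installed]
```

If anything shows up as missing, re-running the installation code above for that package is a reasonable first step.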

Basic Usage

The core of IntakeBalance is an eponymous function that simply calculates EI based on information about energy expenditure and changes in body composition. As shown below, the function runs in two ways depending on what information you supply. You can read more about this in the function documentation; just run ?IntakeBalance::IntakeBalance in the console.

Let’s look at some imaginary data to show both ways this can work.

Option 1. The vector method

This is designed for making calculations when you’re directly passing in data values for the requested pieces of information (i.e., “parameters” or “arguments”). It looks like this:

## Generate the result
vector_result <- IntakeBalance::IntakeBalance(
  fm_start = 50.5,   ## Fat mass at the start of the measurement period
  fm_end = 50.7,     ## Fat mass at the end of the measurement period
  ffm_start = 75.1,  ## Fat-free mass at the start of the measurement period
  ffm_end = 74.9,    ## Fat-free mass at the end of the measurement period
  fm_units = "kg",   ## Units of measure for the fat mass value
  ffm_units = "kg",  ## Units of measure for the fat-free mass value
  ee_per_day = 1950, ## Daily energy expenditure
  ee_units = "kcal", ## Units of measure for the daily energy expenditure value
  n_days = 14        ## Number of days between the start and end measurements
)

## Show the result
knitr::kable(vector_result)

x
2071.143

This tells us that calculated EI was 2071 kcal/day.
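If you’re curious where that number comes from, the arithmetic can be sketched by hand. Please treat the energy densities below (9,500 kcal per kg of fat mass and 1,020 kcal per kg of fat-free mass) as assumptions: we chose them because they reproduce the output shown above, and the function documentation is the place to confirm what the package actually uses.

```r
## Assumed energy densities (kcal per kg); chosen to match the output above,
## not taken from the package documentation
fm_density  <- 9500
ffm_density <- 1020

## Inputs copied from the vector-method example
fm_start  <- 50.5
fm_end    <- 50.7
ffm_start <- 75.1
ffm_end   <- 74.9
ee_per_day <- 1950
n_days     <- 14

## Daily change in energy storage (kcal/day)
delta_es <- (
  fm_density * (fm_end - fm_start) +
  ffm_density * (ffm_end - ffm_start)
) / n_days

## Energy intake = energy expenditure + change in energy storage
ei <- ee_per_day + delta_es

round(c(delta_ES = delta_es, EI = ei), 3)
```

Under these assumptions the hand calculation gives about 121.143 kcal/day of stored energy and an EI of about 2071.143 kcal/day, which agrees with the function’s result.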

Option 2. The data frame method

This is designed for making calculations when you have data for multiple individuals stored in a data frame. To show how it works, we first need to create some imaginary data:

## Create the data frame
info <- data.frame(
  PID = c("001", "002"),
  fm_start = c(50.5, 70.2), fm_end = c(50.7, 70.0),
  ffm_start = c(75.1, 90.3), ffm_end = c(74.9, 90.3),
  ee = c(1950, 2473), n_days = c(14, 14)
)

## Show the data frame
knitr::kable(info)

PID	fm_start	fm_end	ffm_start	ffm_end	ee	n_days
001	50.5	50.7	75.1	74.9	1950	14
002	70.2	70.0	90.3	90.3	2473	14

Now let’s calculate EI:

## Generate the result
df_result <- IntakeBalance::IntakeBalance(
  
  ## These arguments still refer to the same information outlined previously,
  ## but now we have added a layer of abstraction to reference the names of
  ## columns where that information is stored, rather than providing the
  ## values themselves
  fm_start = "fm_start",
  fm_end = "fm_end",
  ffm_start = "ffm_start",
  ffm_end = "ffm_end",
  ee_per_day = "ee",
  n_days = "n_days",
  
  ## The trick is to pass in a data frame via this extra argument. That's how R
  ## knows to interpret the other variables as column names rather than raw values
  df = info
  
)

## Show the result
knitr::kable(df_result)

PID	fm_start	fm_end	ffm_start	ffm_end	ee	n_days	delta_ES	EI
001	50.5	50.7	75.1	74.9	1950	14	121.1429	2071.143
002	70.2	70.0	90.3	90.3	2473	14	-135.7143	2337.286

You can see that this returns the original data frame, updated with new columns that indicate the change in energy storage (delta_ES) and the EI values. Furthermore, you can see that the values in the first row of info match with what was provided in the example for Option 1. Cool!
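To help with interpreting the new columns, here is a small sketch that works only with the numbers printed in the table above (retyped by hand, so expect tiny rounding differences). It checks that EI is the sum of energy expenditure and delta_ES, and it converts delta_ES back into a total for the whole measurement period.

```r
## Values copied from the output table above
result <- data.frame(
  PID = c("001", "002"),
  ee = c(1950, 2473),
  n_days = c(14, 14),
  delta_ES = c(121.1429, -135.7143),
  EI = c(2071.143, 2337.286)
)

## EI should equal daily energy expenditure plus the daily change in storage
result$EI_check <- result$ee + result$delta_ES

## Total energy stored (+) or lost (-) across the measurement period (kcal)
result$total_ES <- result$delta_ES * result$n_days

result
```

A positive delta_ES (participant 001) means energy was stored, so EI is higher than energy expenditure. A negative delta_ES (participant 002) means energy was lost, so EI is lower than energy expenditure.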

Making this Work with Data from Wearables

The above examples demonstrated the core functionality of the IntakeBalance package. But things get more powerful when we combine this with other packages to help with wrangling wearable data and incorporating it into the intake-balance framework. Here, we will show how you could use the accelEE package to process an ActiGraph file using the same methods from our doubly labeled water study. Unfortunately, the code is designed for combining gt3x and agd files (raw acceleration and activity count formats, respectively) that each contain at least one full day of data. This is an issue because sample files tend to be small and only available in one format. So, throughout the code below, you will see some extra steps we need to take, just to create a large, properly formatted, illustration-ready dataset that plays nicely with the code. Don’t let this throw you off. We’ll tell you whenever these extraneous steps are happening, and you won’t need to worry about them when dealing with your own data from the real world.

The accelerometer-based intake-balance process unfolds in three phases outlined below.

Phase 1. Reading and pre-formatting the wearable data

The first step is to read a “raw data” file (gt3x extension) into R. We’ll use one that is bundled with one of the packages we installed at the beginning (read.gt3x). It only contains 40 min of data, so we’ll copy it 80 times (3,200 min in total) and pretend it spans multiple days instead.


## Find the file
filename <- system.file(
  "extdata/TAS1H30182785_2019-09-17.gt3x",
  package = "read.gt3x"
)

## Read the file using the `accelEE` package
ex_data <- accelEE::ee_file(filename = filename, scheme = "Hibbing 2023")

## Extend the data file for the sake of illustration
## (ignore this part when processing real-world data)
ex_data <- data.frame(
  Timestamp = ex_data$Timestamp[1] + (60*0:3199),
  do.call(rbind, replicate(80, ex_data[ ,-1], simplify = FALSE))
)

## Show some of the contents
knitr::kable(head(dplyr::select(ex_data, !matches("features"))))

Timestamp	Accelerometer_X	Accelerometer_Y	Accelerometer_Z	vm_raw_g	ENMO_mg	hildebrand_linear	hildebrand_nonlinear	hibbing_left	hibbing_right	montoye_left	montoye_right	staudenmayer_lm	staudenmayer_rf
2019-09-17 18:40:00	-0.8564652	0.1800915	0.3399457	1.645593	688.42025	0.0776331	0.0792006	0.0838148	0.0836703	0.0800848	0.1050642	0.1633019	0.0703252
2019-09-17 18:41:00	-0.6346995	1.0086462	0.5164685	1.685206	708.16089	0.1070825	0.1117054	0.1208968	0.1302841	0.0655763	0.0932124	0.1160207	0.0861830
2019-09-17 18:42:00	-0.3916835	0.8530120	0.5038665	1.141492	183.34091	0.0639203	0.0700443	0.0738617	0.0836024	0.0505405	0.0667929	0.0646504	0.0761848
2019-09-17 18:43:00	-0.3844428	0.8275600	0.5240558	1.100549	150.39621	0.0587946	0.0631705	0.0619429	0.0746901	0.0472104	0.0565958	0.0631195	0.0716907
2019-09-17 18:44:00	-0.8493325	-0.1674640	0.0766567	1.017237	27.76878	0.0205919	0.0229740	0.0270425	0.0266269	0.0311238	0.0374416	0.0284692	0.0248229
2019-09-17 18:45:00	-0.9880000	-0.2270000	-0.0200000	1.013939	13.93935	0.0145860	0.0178881	0.0212713	0.0212713	0.0201839	0.0277541	0.0170170	0.0170170
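As a side note on the extension step above, the timestamp arithmetic can be sketched on its own. Adding a number to a date-time in R adds that many seconds, so multiples of 60 step forward one minute at a time. The start time and the UTC time zone below are stand-ins we typed in for illustration; the real file may carry a different time zone.

```r
## Hypothetical stand-in for the first timestamp in the example file
start <- as.POSIXct("2019-09-17 18:40:00", tz = "UTC")

## 3200 one-minute steps (80 copies of a 40-minute file)
timestamps <- start + 60 * (0:3199)

## Count how many minutes fall on each calendar date
minutes_per_day <- table(format(timestamps, "%Y-%m-%d", tz = "UTC"))

minutes_per_day
```

With this start time, the first date gets only a partial day (320 minutes), followed by two complete days of 1440 minutes each. That is presumably why the daily summary later in this vignette reports two dates.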

So far, this gives us minute-by-minute summaries of accelerometer variables, plus estimates of energy expenditure (kcal/kg) for the different methods. Next, we need to account for non-wear and sleep. If you have a file with agd extension, you can use syntax just like the above to acquire that information. For example, you might run agd_data <- accelEE::ee_file(filename = "myfolder/myfile.agd", scheme = "Hibbing 2023"). R will recognize that it is an agd file and not a gt3x file, and it will then run the desired non-wear and sleep procedures instead of generating estimates of energy expenditure.

For this example, though, we don’t have a built-in agd file. Instead, we’ll just make up some activity count data. (You’re welcome to run the below code if you’re following along with this example, but in an actual application, as described previously, you would use an agd file rather than completing this step.)


## Set a starting point for the random number generator so that sampling is
## repeatable when generating the `Axis1` and `valid_status` columns
set.seed(610)

## Now generate the sleep/non-wear dataframe
agd_data <- data.frame(
  Timestamp = ex_data$Timestamp,
  Axis1 = sample(0:200, 3200, TRUE, exp(rnorm(201, log(30), log(5)))),
  Axis2 = 0,
  Axis3 = 0,
  valid_status = sample(
    c("Awake-Wear", "Non-Wear", "Sleep"),
    3200,
    TRUE,
    c(2, 8, 14)
  )
)
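In case the numbers passed to `sample()` look mysterious, here is a short sketch of what the last argument does for the `valid_status` column. The weights do not need to add up to 1, because R rescales them into probabilities. The object names below are ours, introduced only for illustration.

```r
## Sampling weights from the made-up data above
weights <- c("Awake-Wear" = 2, "Non-Wear" = 8, "Sleep" = 14)

## R rescales weights like these so that they sum to 1
probs <- weights / sum(weights)

## Expected number of minutes per 1440-minute day in each category
expected_minutes <- probs * 1440

expected_minutes
```

So, on average, we should expect roughly 120 min of awake wear, 480 min of non-wear, and 840 min of sleep per simulated day, with some random variation around those values.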

Once we have data related to both energy expenditure (.gt3x origin) and sleep/non-wear (.agd origin), we can merge the two together and get a daily summary:


full_data <- merge(ex_data, agd_data)

ee_data <- accelEE::ee_summary(full_data, scheme = "Hibbing 2023")

knitr::kable(ee_data)

Date	total_minutes	total_hildebrand_linear	total_hildebrand_nonlinear	total_hibbing_left	total_hibbing_right	total_montoye_left	total_montoye_right	total_staudenmayer_lm	total_staudenmayer_rf	total_Axis1	total_Axis2	total_Axis3	total_is_Sleep	total_is_NonWear
2019-09-18	1440	2.536927	2.892719	3.382906	3.486795	3.440164	4.036298	3.208864	3.011168	11901	0	0	828	490
2019-09-19	1440	2.512302	2.876073	3.361450	3.471130	3.412406	4.015132	3.158153	2.979310	11976	0	0	816	500

Phase 2. Getting total daily energy expenditure from the wearable data

The previous step gave us daily estimates of total energy expenditure, expressed relative to body mass (kcal/kg) and exclusively reflecting measurement during awake wear time. To calculate total (absolute) energy expenditure for the full day (including sleep/non-wear), we need to bring in our body composition data and make another calculation or two.


## First, let's get our body composition data. We'll pretend it's the first row
## of the `info` data frame we created earlier (which already carries the ID
## label `001`). We'll remove the `ee` column to avoid confusion, and we'll
## also add a body mass of 87 kg

  body_comp <- info[1, ]
  body_comp$ee <- NULL
  body_comp$wt_kg <- 87

  
## Next, let's determine a Schofield prediction of basal metabolism to impute
## during periods of sleep/non-wear

  body_comp$schofield <- PAutilities::get_ree(
    method = "schofield_wt_ht", sex = "M", age_yr = 55,
    wt_kg = 87, ht_m = 1.85, output = "kcal_day"
  )


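## It's time to merge the body composition and energy expenditure data
## (the two have no columns in common, so every day in `ee_data` gets paired
## with the single row of `body_comp`)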

  ei_data <- merge(body_comp, ee_data)
  

## Next, we want to take our EE estimates (kcal/kg for awake wear periods) and
## convert them to total EE (i.e., no longer relative to body weight, and
## imputing basal metabolic expenditure for periods of sleep/non-wear).
  
## To do this, we'll use some functions from the `dplyr` package. Note that you
## can interpret the `%>%` operator as meaning 'next do ____' or 'then do ___'.
  
## The `%<>%` operator (from the `magrittr` package) is for assignment
## (i.e., saves output to the `ei_data` object).
  
  suppressPackageStartupMessages(library(dplyr))
  library(magrittr)
  
  ei_data %<>%
    mutate(across(
      total_hildebrand_linear:total_staudenmayer_rf,
      ~ .x * wt_kg + schofield * ((total_is_NonWear + total_is_Sleep) / 1440),
      .names = "{gsub(\"^total\", \"EE\", .col)}"
    )) %>%
    select(
      any_of(names(body_comp)),
      !c(Date, total_hildebrand_linear:total_staudenmayer_rf)
    ) %>%
    group_by(across(any_of(names(body_comp)))) %>%
    summarise(across(
      everything(), mean, .names = "{gsub(\"^total\", \"mean\", .col)}"
    ), .groups = "drop")

  
  knitr::kable(ei_data)

PID	fm_start	fm_end	ffm_start	ffm_end	n_days	wt_kg	schofield	mean_minutes	mean_Axis1	mean_Axis2	mean_Axis3	mean_is_Sleep	mean_is_NonWear	EE_hildebrand_linear	EE_hildebrand_nonlinear	EE_hibbing_left	EE_hibbing_right	EE_montoye_left	EE_montoye_right	EE_staudenmayer_lm	EE_staudenmayer_rf
001	50.5	50.7	75.1	74.9	14	87	1870.377	1440	11938.5	0	0	822	495	1930.257	1961.558	2003.995	2013.286	2008.703	2060.853	1987.581	1971.202

  ## Note that the `n_days` column refers to the number of days between
  ## body composition measurements, not necessarily the number of days of
  ## accelerometer data included in the analysis (n = 2 in this case)
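To make the conversion above more concrete, here is the same arithmetic done by hand for one method (hildebrand_linear), using values retyped from the tables in this vignette. Because the inputs are rounded as printed, the result should agree with the table only to about the third decimal place.

```r
## Values copied from the tables above (hildebrand_linear method)
rel_ee   <- c(2.536927, 2.512302)  # kcal/kg during awake wear time, per day
sleep    <- c(828, 816)            # minutes classified as sleep, per day
non_wear <- c(490, 500)            # minutes classified as non-wear, per day
wt_kg     <- 87                    # body mass (kg)
schofield <- 1870.377              # predicted basal metabolism (kcal/day)

## Absolute energy expenditure during awake wear time (kcal)
awake_ee <- rel_ee * wt_kg

## Basal expenditure imputed for the share of the day spent asleep or
## not wearing the monitor (kcal)
imputed_ee <- schofield * (sleep + non_wear) / 1440

## Total daily energy expenditure, then the mean across days
daily_ee <- awake_ee + imputed_ee
mean_ee <- mean(daily_ee)

round(mean_ee, 3)
```

The mean comes out to roughly 1930.257 kcal/day, matching the EE_hildebrand_linear column.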

Phase 3. Calculate EI

At this point, we are ready for the rest of our calculations.

## Generate the result
ei_result <-
  ei_data %>%
  mutate(across(
    EE_hildebrand_linear:EE_staudenmayer_rf,
    ## `n_days` is passed so that the change in energy storage is spread
    ## across the days between body composition measurements
    ~ IntakeBalance::IntakeBalance(
      fm_start, fm_end, ffm_start, ffm_end,
      ee_per_day = .x, n_days = n_days
    ),
    .names = "{gsub(\"^EE\", \"EI\", .col)}"
  ))

## Show the result
knitr::kable(ei_result)

PID	fm_start	fm_end	ffm_start	ffm_end	n_days	wt_kg	schofield	mean_minutes	mean_Axis1	mean_Axis2	mean_Axis3	mean_is_Sleep	mean_is_NonWear	EE_hildebrand_linear	EE_hildebrand_nonlinear	EE_hibbing_left	EE_hibbing_right	EE_montoye_left	EE_montoye_right	EE_staudenmayer_lm	EE_staudenmayer_rf	EI_hildebrand_linear	EI_hildebrand_nonlinear	EI_hibbing_left	EI_hibbing_right	EI_montoye_left	EI_montoye_right	EI_staudenmayer_lm	EI_staudenmayer_rf
001	50.5	50.7	75.1	74.9	14	87	1870.377	1440	11938.5	0	0	822	495	1930.257	1961.558	2003.995	2013.286	2008.703	2060.853	1987.581	1971.202	2051.400	2082.701	2125.138	2134.429	2129.846	2181.996	2108.724	2092.345

Voila! The data frame now has new columns that specify EI estimates for each of the accelerometer-based methods.
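If the `across()` syntax used in Phases 2 and 3 feels opaque, a toy example may help. The data frame and column names below are made up purely for illustration. The idea is that `across()` applies the same formula to a range of columns, and the `.names` argument builds each new column name by swapping one prefix for another.

```r
suppressPackageStartupMessages(library(dplyr))

## Made-up data: two "EE" columns and an offset to add to each of them
toy <- data.frame(
  EE_a = c(1900, 2100),
  EE_b = c(2000, 2200),
  offset = c(100, -50)
)

toy_result <- toy %>%
  mutate(across(
    EE_a:EE_b,                            # the columns to operate on
    ~ .x + offset,                        # `.x` stands for the current column
    .names = "{gsub('^EE', 'EI', .col)}"  # e.g., EE_a becomes EI_a
  ))

toy_result
```

The original `EE_` columns are kept, and two new `EI_` columns are added alongside them, which is the same pattern that produced the `EI_` columns in the result above.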
