IntakeBalance
PackageIntakeBalance
is an R package to help with implementing
intake-balance methods for assessing energy intake (EI). In this
vignette, we will demonstrate how to do this using an
accelerometer-based approach.
We realize this might seem like an intimidating tool, since most potential users are not R programmers. Although it’s true that we can currently only provide support for using this tool in R, we want to make it as accessible as possible. On this page, our goal is to provide thorough instructions that will demystify the process and limit how intimidating it feels. But we also recognize there may be areas where things need to be clearer. If you find anything confusing, please feel free to let us know by filing an issue, and we will do our best to revise in a way that makes these instructions easier to follow.
Before you can use IntakeBalance
, you need to get your
machine set up. The first step is to install R if you haven’t already.
We also recommend installing
RStudio. The default settings should do the trick, so you can just
run the installers. Then open up RStudio and prepare to run some code in
the console. (The bottom left pane. Code can be pasted in like it would
in a document. Then press [enter]
to execute the code.)
The next task is to install the IntakeBalance
package.
This could be done much more concisely than what’s shown in the code
below, but we’re going to go the long way in hopes of avoiding any
breakdowns that might occur. (Sometimes things work differently
depending on your specific computer setup or prior installation of R, so
this longer approach helps us to [hopefully] account for that.) Just
copy, paste, and execute all of the following code:
## List the required packages to install from mainstream R archive
<- c(
cran_packages "digest", "dplyr", "knitr", "lubridate", "magrittr",
"nnet", "PAutilities", "PhysActBedRest", "PhysicalActivity",
"purrr", "randomForest", "read.gt3x", "devtools", "rmarkdown",
"rlang", "Sojourn", "stats", "testthat", "tidyr", "tree",
"TwoRegression", "units", "utils", "xml2"
)
## Install any of the previous packages that aren't already installed
invisible(lapply(
cran_packages,function(x) if (!isTRUE(requireNamespace(x, quietly = TRUE)))
install.packages(x)
))
## List the required packages to install from GitHub
<- c(
remote_packages "paulhibbing/anthropometry", "paulhibbing/accelEE",
"paulhibbing/EE.Data", "paulhibbing/IntakeBalance"
)
## Install any of the GitHub packages that aren't already installed
invisible(lapply(
remote_packages,function(x) if (!isTRUE(requireNamespace(x, quietly = TRUE)))
::install_github(x, dependencies = FALSE)
devtools
))
## Install one other package using a different method,
## if it hasn't already been installed
if (!requireNamespace("AGread", quietly = TRUE))
::install_version("AGread", "1.3.2") devtools
Once you’ve finished the above, you’re ready to look at an example of
using the IntakeBalance
package.
The core of IntakeBalance
is an eponymous function that
simply calculates EI based on information about energy expenditure and
changes in body composition. As shown below, the function runs in two
ways depending on what information you supply. You can read more about
this in the function documentation – Just run
?IntakeBalance::IntakeBalance
.
Let’s look at some imaginary data to show both ways this can work.
This is designed for making calculations when you’re directly passing in data values for the requested pieces of information (i.e., “parameters” or “arguments”). It looks like this:
## Generate the result
<- IntakeBalance::IntakeBalance(
vector_result fm_start = 50.5, ## Fat mass at the start of the measurement period
fm_end = 50.7, ## Fat mass at the end of the measurement period
ffm_start = 75.1, ## Fat-free mass at the start of the measurement period
ffm_end = 74.9, ## Fat-free mass at the end of the measurement period
fm_units = "kg", ## Units of measure for the fat mass value
ffm_units = "kg", ## Units of measure for the fat-free mass value
ee_per_day = 1950, ## Daily energy expenditure
ee_units = "kcal", ## Units of measure for the daily energy expenditure value
n_days = 14 ## Number of days between the start and end measurements
)
## Show the result
::kable(vector_result) knitr
x |
---|
2071.143 |
This tells us that calculated EI was 2071 kcal/day.
This is designed for making calculations when you have data for multiple individuals stored in a data frame. To show how it works, we first need to create some imaginary data:
## Create the data frame
<- data.frame(
info PID = c("001", "002"),
fm_start = c(50.5, 70.2), fm_end = c(50.7, 70.0),
ffm_start = c(75.1, 90.3), ffm_end = c(74.9, 90.3),
ee = c(1950, 2473), n_days = c(14, 14)
)
## Show the data frame
::kable(info) knitr
PID | fm_start | fm_end | ffm_start | ffm_end | ee | n_days |
---|---|---|---|---|---|---|
001 | 50.5 | 50.7 | 75.1 | 74.9 | 1950 | 14 |
002 | 70.2 | 70.0 | 90.3 | 90.3 | 2473 | 14 |
Now let’s calculate EI:
## Generate the result
<- IntakeBalance::IntakeBalance(
df_result
## These arguments still refer to the same information outlined previously,
## but now we have added a layer of abstraction to reference the names of
## columns where that information is stored, rather than providing the
## values themselves
fm_start = "fm_start",
fm_end = "fm_end",
ffm_start = "ffm_start",
ffm_end = "ffm_end",
ee_per_day = "ee",
n_days = "n_days",
## The trick is to pass in a data frame via this extra argument. That's how R
## knows to interpret the other variables as column names rather than raw values
df = info
)
## Show the result
::kable(df_result) knitr
PID | fm_start | fm_end | ffm_start | ffm_end | ee | n_days | delta_ES | EI |
---|---|---|---|---|---|---|---|---|
001 | 50.5 | 50.7 | 75.1 | 74.9 | 1950 | 14 | 121.1429 | 2071.143 |
002 | 70.2 | 70.0 | 90.3 | 90.3 | 2473 | 14 | -135.7143 | 2337.286 |
You can see that this returns the original data frame, updated with
new columns that indicate the change in energy storage
(delta_ES
) and the EI values. Furthermore, you can see that
the values in the first row of info
match with what was
provided in the example for Option 1. Cool!
The above examples demonstrated the core functionality of the
IntakeBalance
package. But things get more powerful when we
combine this with other packages to help with wrangling wearable data
and incorporating it into the intake-balance framework. Here, we will
show how you could use the accelEE
package to process an
ActiGraph file using the same methods from our doubly labeled water
study. Unfortunately, the code is designed for combining
gt3x
and agd
files (raw acceleration and
activity count formats, respectively) that each contain at least one
full day of data. This is an issue because sample files tend to be small
and only available in one format. So, throughout the code below, you
will see some extra steps we need to take, just to create a large,
properly-formatted, illustration-ready dataset that plays nicely with
the code. Don’t let this throw you off. We’ll tell you whenever this
these extraneous steps are happening, and you won’t need to worry about
them when dealing with your own data from the real world.
The accelerometer-based intake-balance process unfolds in three phases outlined below.
The first step is to read a “raw data” file (gt3x
extension) into R. We’ll use one that was built in with one of the
packages we installed at the beginning. It only contains 40 min of data,
so we’ll copy it several times and pretend it spans several days
instead.
## Find the file
<- system.file(
filename "extdata/TAS1H30182785_2019-09-17.gt3x",
package = "read.gt3x"
)
## Read the file using the `accelEE` package
<- accelEE::ee_file(filename = filename, scheme = "Hibbing 2023")
ex_data
## Extend the data file for the sake of illustration
## (ignore this part when processing real-world data)
<- data.frame(
ex_data Timestamp = ex_data$Timestamp[1] + (60*0:3199),
do.call(rbind, replicate(80, ex_data[ ,-1], simplify = FALSE))
)
## Show some of the contents
::kable(head(dplyr::select(ex_data, !matches("features")))) knitr
Timestamp | Accelerometer_X | Accelerometer_Y | Accelerometer_Z | vm_raw_g | ENMO_mg | hildebrand_linear | hildebrand_nonlinear | hibbing_left | hibbing_right | montoye_left | montoye_right | staudenmayer_lm | staudenmayer_rf |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2019-09-17 18:40:00 | -0.8564652 | 0.1800915 | 0.3399457 | 1.645593 | 688.42025 | 0.0776331 | 0.0792006 | 0.0838148 | 0.0836703 | 0.0800848 | 0.1050642 | 0.1633019 | 0.0703252 |
2019-09-17 18:41:00 | -0.6346995 | 1.0086462 | 0.5164685 | 1.685206 | 708.16089 | 0.1070825 | 0.1117054 | 0.1208968 | 0.1302841 | 0.0655763 | 0.0932124 | 0.1160207 | 0.0861830 |
2019-09-17 18:42:00 | -0.3916835 | 0.8530120 | 0.5038665 | 1.141492 | 183.34091 | 0.0639203 | 0.0700443 | 0.0738617 | 0.0836024 | 0.0505405 | 0.0667929 | 0.0646504 | 0.0761848 |
2019-09-17 18:43:00 | -0.3844428 | 0.8275600 | 0.5240558 | 1.100549 | 150.39621 | 0.0587946 | 0.0631705 | 0.0619429 | 0.0746901 | 0.0472104 | 0.0565958 | 0.0631195 | 0.0716907 |
2019-09-17 18:44:00 | -0.8493325 | -0.1674640 | 0.0766567 | 1.017237 | 27.76878 | 0.0205919 | 0.0229740 | 0.0270425 | 0.0266269 | 0.0311238 | 0.0374416 | 0.0284692 | 0.0248229 |
2019-09-17 18:45:00 | -0.9880000 | -0.2270000 | -0.0200000 | 1.013939 | 13.93935 | 0.0145860 | 0.0178881 | 0.0212713 | 0.0212713 | 0.0201839 | 0.0277541 | 0.0170170 | 0.0170170 |
So far, this gives us minute-by-minute summaries of accelerometer
variables, plus estimates of energy expenditure (kcal/kg) for the
different methods. Next, we need to account for non-wear and sleep. If
you have a file with agd
extension, you can use syntax just
like the above to acquire that information. For example, you might run
agd_data <- accelEE::ee_file(filename = "myfolder/myfile.agd", scheme = "Hibbing 2023")
.
R will recognize that it is an agd
file and not a
gt3x
file, and it will then run the desired non-wear and
sleep procedures instead of generating estimates of energy
expenditure.
For this example, though, we don’t have a built in agd
file. Instead, we’ll just make up some activity count data. (You’re
welcome to run the below code if you’re following along with this
example, but in an actual application, as described previously, you
would use an agd
file rather than completing this
step.)
## Set a starting point for the random number generator so that sampling is
## repeatable when generating the `Axis1` and `valid_status` columns
set.seed(610)
## Now generate the sleep/non-wear dataframe
<- data.frame(
agd_data Timestamp = ex_data$Timestamp,
Axis1 = sample(0:200, 3200, TRUE, exp(rnorm(201, log(30), log(5)))),
Axis2 = 0,
Axis3 = 0,
valid_status = sample(
c("Awake-Wear", "Non-Wear", "Sleep"),
3200,
TRUE,
c(2, 8, 14)
) )
Once we have data related to both energy expenditure
(.gt3x
origin) and sleep/non-wear (.agd
origin), we can merge the two together and get a daily summary:
<- merge(ex_data, agd_data)
full_data
<- accelEE::ee_summary(full_data, scheme = "Hibbing 2023")
ee_data
::kable(ee_data) knitr
Date | total_minutes | total_hildebrand_linear | total_hildebrand_nonlinear | total_hibbing_left | total_hibbing_right | total_montoye_left | total_montoye_right | total_staudenmayer_lm | total_staudenmayer_rf | total_Axis1 | total_Axis2 | total_Axis3 | total_is_Sleep | total_is_NonWear |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2019-09-18 | 1440 | 2.536927 | 2.892719 | 3.382906 | 3.486795 | 3.440164 | 4.036298 | 3.208864 | 3.011168 | 11901 | 0 | 0 | 828 | 490 |
2019-09-19 | 1440 | 2.512302 | 2.876073 | 3.361450 | 3.471130 | 3.412406 | 4.015132 | 3.158153 | 2.979310 | 11976 | 0 | 0 | 816 | 500 |
The previous step gave us daily estimates of total energy expenditure, expressed relative to body mass (kcal/kg) and exclusively reflecting measurement during awake wear time. To calculate total (absolute) energy expenditure for the full day (including sleep/non-wear), we need to bring in our body composition data and make another calculation or two.
## First, let's get our body composition data. We'll pretend it's the first row
## of the `info` data frame we created earlier. We'll remove the `ee` column to
## avoid confusion, and we'll also assign an ID label of `001` and a body mass
## of 87 kg
<- info[1, ]
body_comp $ee <- NULL
body_comp$wt_kg <- 87
body_comp
## Next, let's determine a Schofield prediction of basal metabolism to impute
## during periods of sleep/non-wear
$schofield <- PAutilities::get_ree(
body_compmethod = "schofield_wt_ht", sex = "M", age_yr = 55,
wt_kg = 87, ht_m = 1.85, output = "kcal_day"
)
## It's time to merge the body composition and energy expenditure data
<- merge(body_comp, ee_data)
ei_data
## Next, we want to take our EE estimates (kcal/kg for awake wear periods) and
## convert them to total EE (i.e., no longer relative to body weight, and
## imputing basal metabolic expenditure for periods of sleep/non-wear).
## To do this, we'll use some functions from the `dplyr` package. Note that you
## can interpret the `%>%` operator as meaning 'next do ____' or 'then do ___'.
## The `%<>%` operator is for assignment (i.e., saves output to the `ei_data` object).
suppressPackageStartupMessages(library(dplyr))
%<>%
ei_data mutate(across(
:total_staudenmayer_rf,
total_hildebrand_linear~ .x * wt_kg + schofield * ((total_is_NonWear + total_is_Sleep) / 1440),
.names = "{gsub(\"^total\", \"EE\", .col)}"
%>%
)) select(
any_of(names(body_comp)),
!c(Date, total_hildebrand_linear:total_staudenmayer_rf)
%>%
) group_by(across(any_of(names(body_comp)))) %>%
summarise(across(
everything(), mean, .names = "{gsub(\"^total\", \"mean\", .col)}"
.groups = "drop")
),
::kable(ei_data) knitr
PID | fm_start | fm_end | ffm_start | ffm_end | n_days | wt_kg | schofield | mean_minutes | mean_Axis1 | mean_Axis2 | mean_Axis3 | mean_is_Sleep | mean_is_NonWear | EE_hildebrand_linear | EE_hildebrand_nonlinear | EE_hibbing_left | EE_hibbing_right | EE_montoye_left | EE_montoye_right | EE_staudenmayer_lm | EE_staudenmayer_rf |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
001 | 50.5 | 50.7 | 75.1 | 74.9 | 14 | 87 | 1870.377 | 1440 | 11938.5 | 0 | 0 | 822 | 495 | 1930.257 | 1961.558 | 2003.995 | 2013.286 | 2008.703 | 2060.853 | 1987.581 | 1971.202 |
## Note that the `n_days` column refers to the number of days between
## body composition measurements, not necessarily the number of days of
## accelerometer data included in the analysis (n = 2 in this case)
At this point, we are ready for the rest of our calculations.
## Generate the result
<-
ei_result %>%
ei_data mutate(across(
:EE_staudenmayer_rf,
EE_hildebrand_linear~ IntakeBalance::IntakeBalance(
ee_per_day = .x
fm_start, fm_end, ffm_start, ffm_end,
),.names = "{gsub(\"^EE\", \"EI\", .col)}"
))
## Show the result
::kable(ei_result) knitr
PID | fm_start | fm_end | ffm_start | ffm_end | n_days | wt_kg | schofield | mean_minutes | mean_Axis1 | mean_Axis2 | mean_Axis3 | mean_is_Sleep | mean_is_NonWear | EE_hildebrand_linear | EE_hildebrand_nonlinear | EE_hibbing_left | EE_hibbing_right | EE_montoye_left | EE_montoye_right | EE_staudenmayer_lm | EE_staudenmayer_rf | EI_hildebrand_linear | EI_hildebrand_nonlinear | EI_hibbing_left | EI_hibbing_right | EI_montoye_left | EI_montoye_right | EI_staudenmayer_lm | EI_staudenmayer_rf |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
001 | 50.5 | 50.7 | 75.1 | 74.9 | 14 | 87 | 1870.377 | 1440 | 11938.5 | 0 | 0 | 822 | 495 | 1930.257 | 1961.558 | 2003.995 | 2013.286 | 2008.703 | 2060.853 | 1987.581 | 1971.202 | 3626.257 | 3657.558 | 3699.995 | 3709.286 | 3704.703 | 3756.853 | 3683.581 | 3667.202 |
Voila! The data frame now has new columns that specify EI estimates for each of the accelerometer-based methods.
Hopefully this introduction to the IntakeBalance
package
makes it less intimidating and gets you on your way with implementing it
in your own research. Although there is a lot of code in the above
examples (reflecting several steps you need to implement on your own),
it is substantially less than the number of operations you would have to
perform by hand in order to achieve the same outcome. With a little
tweaking for your own projects, our hope is that this could result in
stronger analyses, better quality control, and eventually time saved as
you grow more familiar with using this tool. As noted previously, please
let
us know if we can improve this vignette or anything else about
IntakeBalance
to make it more inviting and accessible. We
believe that tools like these have a strong potential to advance science
in the coming years, so we want to provide as much support as we can for
increasing their uptake. Thanks for your interest, and we wish you the
best with using IntakeBalance
!