
ADNIMERGE2-Derived-Data
Last Updated: July 25, 2025
Source:vignettes/ADNIMERGE2-Derived-Data.Rmd
Introduction
This article describes how derived datasets are created from the raw data
(eCRFs) of the ADNI study. These standardized datasets serve as input to the
pharmaverse workflow, which requires standardized datasets in order to create
analysis-ready datasets. In the ADNIMERGE2 R data package, the following
derived datasets are created for illustration purposes; more information about
each dataset is presented in the corresponding subsection.
- Demographic (DM)
- Subject Characteristics (SC)
- Adverse Event (AE)
- Questionnaires (QS)
- Clinical Classification (RS)
- Nervous System Finding (NV)
- Laboratory Test Result (LB)
- Genomics Findings (GF)
- Vital Sign (VS)
NOTE:
- The ORIGPROT variable is included in the DM dataset to identify the study protocol/phase in which a subject first enrolled in the ADNI study.
- The COLPROT or --GRPID variables are also included in some of the derived datasets to identify the study protocol/phase in which the data were collected.
- The VISITNUM variable is included only to create the study epoch across study phases and is mainly used for data merging and sorting. Using VISITNUM in analyses of these derived datasets is not recommended. The Data Preparation section below describes how the study visit number VISITNUM is created.
- These derived datasets may not fully comply with the CDISC SDTM standard.
Load Required R Packages
library(tidyverse)
library(assertr)
library(labelled)
library(DT)
library(measurements)
library(sdtm.oak)
# ADNI study R data package
library(ADNIMERGE2)
Data Preparation
In the study, subjects are categorized into two groups:
- New Subject: Subjects who did not participate/enroll in any previous ADNI study phase prior to a given study phase.
- Rollover Subject: Subjects who participated/enrolled in at least one previous study phase prior to a given study phase.
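One way to read the two definitions above is to compare the phase in which a record was collected (COLPROT) with the phase of first enrollment (ORIGPROT). The sketch below illustrates that reading on toy data. `classify_study_track()` is a hypothetical stand-in, not the package's `adni_study_track()` function, whose actual logic may differ.

```r
# Hypothetical helper (an assumption for illustration): label a record as
# "New" or "Rollover" by comparing the data-collection phase with the phase
# of first enrollment.
classify_study_track <- function(colprot, origprot) {
  ifelse(
    is.na(colprot) | is.na(origprot),
    NA_character_,
    ifelse(colprot == origprot, "New", "Rollover")
  )
}

# Toy records (not real ADNI data)
toy_registry <- data.frame(
  RID = c(1, 1, 2),
  ORIGPROT = c("ADNI1", "ADNI1", "ADNI3"),
  COLPROT = c("ADNI1", "ADNI2", "ADNI3"),
  stringsAsFactors = FALSE
)
toy_registry$PTTYPE <- classify_study_track(
  toy_registry$COLPROT, toy_registry$ORIGPROT
)
toy_registry
```

In this toy example, subject 1 is a new subject in ADNI1 and a rollover subject in ADNI2.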
Study Visit and Visit Number
Some data wrangling was performed prior to creating the specified derived
datasets. First, the study epoch (EPOCH) with its corresponding study visit
number (VISITNUM) was created and stored as EPOCH_LIST_LONG. Next,
subject-specific study visits with their corresponding study epoch were created
from the REGISTRY and ROSTER eCRFs together with the EPOCH_LIST_LONG data. The
result is stored as ADNI_VISIT_RECORD. Rollover subjects may not have
sequential VISITNUM values because the study design allows rollovers to enroll
in the next study phase before completing the current one. Thus, VISITNUM can
be used for data merging and for sorting together with the corresponding study
visit date or form completion date.
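The point about non-sequential visit numbers can be illustrated with a small toy example. The VISITNUM values and dates below are made up and do not reflect the numbering scheme in EPOCH_LIST_LONG. The sketch only shows why the visit date, rather than VISITNUM, should drive chronological ordering.

```r
# Toy visit record for one rollover subject (illustrative values only)
toy_visits <- data.frame(
  RID = c(10, 10, 10),
  COLPROT = c("ADNI2", "ADNI3", "ADNI2"),
  VISITNUM = c(201, 301, 202),
  VISDATE = as.Date(c("2016-03-01", "2016-09-15", "2017-01-10")),
  stringsAsFactors = FALSE
)

# Order chronologically within subject; VISITNUM only breaks ties
row_order <- order(toy_visits$RID, toy_visits$VISDATE, toy_visits$VISITNUM)
toy_sorted <- toy_visits[row_order, ]

# VISITNUM is not monotone in time for this subject
toy_sorted$VISITNUM
is.unsorted(toy_sorted$VISITNUM)
```

Here the subject attended an ADNI3 visit before the last ADNI2 visit, so sorting by VISITNUM alone would misplace that record in time.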
For more information, please refer to the source script of the package vignette, vignette(topic = 'ADNIMERGE2-Derived-Data', package = 'ADNIMERGE2').
Assessment Completion Date and Status
The study visit date from the REGISTRY table (eCRF) is used when the
assessment-specific completion date is missing but the assessment result is
known. Furthermore, any assessment with a missing/unknown result is considered
not completed/done.
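The two rules above can be sketched on toy data as follows. The column names SCORE, ASSESS_DATE and REGISTRY_DATE are hypothetical and chosen only for this illustration; the package applies these rules inside its own helper functions.

```r
# Toy assessment records (illustrative values, not ADNI data)
toy_assess <- data.frame(
  RID = c(1, 2, 3),
  SCORE = c(28, 25, NA),
  ASSESS_DATE = as.Date(c("2020-01-10", NA, NA)),
  REGISTRY_DATE = as.Date(c("2020-01-09", "2020-02-03", "2020-03-01"))
)

# Rule 1: fall back to the registry visit date only when a result exists
# but its assessment-specific date is missing
use_registry <- is.na(toy_assess$ASSESS_DATE) & !is.na(toy_assess$SCORE)
toy_assess$DTC <- toy_assess$ASSESS_DATE
toy_assess$DTC[use_registry] <- toy_assess$REGISTRY_DATE[use_registry]

# Rule 2: a missing or unknown result is treated as not done
toy_assess$STAT <- ifelse(is.na(toy_assess$SCORE), "NOT DONE", NA_character_)
toy_assess
```

Subject 2 receives the registry date, while subject 3 keeps a missing date and is marked as not done.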
Baseline Flag
A baseline flag is included in some of the derived datasets to identify records that are closest to the subject's enrollment date or that were collected at the baseline visit. A record is flagged as baseline if one of the following criteria is met:
- The record was collected at the baseline visit, within 30 days of the enrollment date.
- The record was collected prior to the baseline visit and is the closest to the enrollment date, if the baseline record value is missing or was not collected.
The assessment baseline flag is created using the derive_blfl_adni function,
which is based on the sdtm.oak::derive_blfl function with minor study-specific
modifications.
NOTE: There might be more than one baseline flag per assessment per subject in
a given derived dataset. In future work, the derive_blfl_adni function will be
updated to account for such cases.
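The two baseline criteria can be sketched for a single subject and a single assessment as shown below. `flag_baseline()` is a hypothetical helper written for this illustration and is not `derive_blfl_adni()`. The "bl" visit code, the "Y" flag value and the symmetric 30-day window are assumptions.

```r
# Hypothetical helper for ONE subject and ONE assessment.
# Assumptions: baseline visit code is "bl", the flag value is "Y", and
# "within 30 days" means at most 30 days before or after enrollment.
flag_baseline <- function(visit, date, value, enroll_date, window = 30) {
  days <- as.numeric(date - enroll_date)
  usable <- !is.na(value) & !is.na(days)
  flag <- rep(NA_character_, length(visit))

  # Criterion 1: a non-missing baseline-visit record inside the window
  at_bl <- which(visit %in% "bl" & usable & abs(days) <= window)
  if (length(at_bl) > 0) {
    flag[at_bl[1]] <- "Y"
    return(flag)
  }

  # Criterion 2: otherwise, the usable pre-baseline record closest to enrollment
  prior <- which(!(visit %in% "bl") & usable & days <= 0)
  if (length(prior) > 0) {
    flag[prior[which.max(days[prior])]] <- "Y"
  }
  flag
}

visit <- c("sc", "bl", "m06")
date <- as.Date(c("2020-01-02", "2020-01-20", "2020-07-20"))
enroll <- as.Date("2020-01-20")

flag_complete_bl <- flag_baseline(visit, date, c(27, 28, 26), enroll)
flag_missing_bl <- flag_baseline(visit, date, c(27, NA, 26), enroll)
```

Because the helper flags only the first qualifying record, it returns at most one flag per call. This is one possible way to avoid multiple baseline flags, not necessarily the approach the package will adopt.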
# Load utils function from package system file
utils_file_list <- c(
"derived-dataset-sdtmoak-utils.R",
"derived-dataset-utils.R",
"derived-labdata-utils.R"
)
utils_file_path <- system.file(
utils_file_list,
package = "ADNIMERGE2",
mustWork = TRUE
)
load_utils_funs <- lapply(utils_file_path, source)
# Add study track (i.e., New or Rollover)
REGISTRY <- REGISTRY %>%
mutate(PTTYPE = adni_study_track(COLPROT, ORIGPROT))
ROSTER <- ROSTER %>%
mutate(PTTYPE = adni_study_track(COLPROT, ORIGPROT))
Building Derived Dataset
Demographic (DM)
The DM dataset contains one record per subject, taken from the time they were
first enrolled or screened in the ADNI study (i.e., as a new enrollee). The DM
dataset contains the following characteristics:
# Join columns
dm_join_var <- c("RID", "ORIGPROT")
# Demographic data columns
dm_cols <- c("PTGENDER", "PTRACCAT", "PTETHCAT", "PTDOB")
names(dm_cols) <- c("SEX", "RACE", "ETHNIC", "BRTHDTC")
DM <- tibble(RID = unique_rid) %>%
mutate(ORIGPROT = original_study_protocol(RID = RID)) %>%
# Add PTDEMOG record
left_join(
PTDEMOG %>%
assert_non_missing(ORIGPROT, COLPROT) %>%
filter(ORIGPROT == COLPROT) %>%
group_by(RID) %>%
filter(
(any(!is.na(VISDATE)) & VISDATE == min(VISDATE, na.rm = TRUE)) |
(all(is.na(VISDATE)) & row_number() == 1)
) %>%
ungroup() %>%
assert_uniq(RID) %>%
select(RID, ORIGPROT, VISDATE, all_of(as.character(dm_cols))),
by = dm_join_var
) %>%
rename(all_of(dm_cols)) %>%
generate_oak_id_vars_adni(raw_src = "PTDEMOG")
DM <- DM %>%
# Add screening visit date
left_join(
get_adni_screen_date(
.registry = REGISTRY,
phase = "Overall",
both = FALSE,
multiple_screen_visit = FALSE
) %>%
select(RID, ORIGPROT, SESTDTC = SCREENDATE) %>%
assert(is_uniq, RID),
by = dm_join_var
) %>%
# Add enrollment/RFSTDTC date
create_rfstdtc(.registry = REGISTRY) %>%
# Compute age based on first screening date
mutate(
AGE = round(as.numeric(SESTDTC - my(BRTHDTC)) / 365.25, 1),
AGEU = "Years",
SUBJID = as.character(RID),
ARMCD = NA_character_,
ACTARM = NA_character_,
ARM = NA_character_,
ACTARMCD = NA_character_,
ARMNRS = "Non-Interventional Study",
COUNTRY = NA_character_,
DMDTC = create_iso8601(as.character(VISDATE), .format = "y-m-d")
)
# Add death flag
DM <- DM %>%
left_join(
get_death_flag(
.studysum = STUDYSUM,
.adverse = ADVERSE,
.recadv = RECADV
) %>%
verify(all(DTHFL == "Yes")) %>%
select(RID, ORIGPROT, DTHFL, DTHDTC),
by = dm_join_var
)
# Add last known disposition date
DM <- DM %>%
left_join(
get_disposition_flag(
.registry = REGISTRY,
.studysum = STUDYSUM
) %>%
mutate(COLPROT = factor(COLPROT, levels = adni_phase())) %>%
group_by(RID) %>%
arrange(COLPROT) %>%
# Last known discontinuation/disposition date
filter(row_number() == n()) %>%
ungroup() %>%
select(RID, ORIGPROT, SDSTATUS, RFPENDTC = SDDATE),
by = dm_join_var
) %>%
mutate(RFPENDTC = case_when(
!is.na(DTHDTC) & is.na(RFPENDTC) ~ as.character(DTHDTC),
TRUE ~ as.character(RFPENDTC)
))
DM <- DM %>%
# Derive USUBJID and SITEID
derive_usubjid(
.data = .,
.registry = REGISTRY,
.roster = ROSTER,
.ptdemog = PTDEMOG,
varList = c("USUBJID", "SITEID")
) %>%
derive_study_day_adni(
sdtm_in = .,
domain = "DM",
dm_domain = .,
refdt = "RFSTDTC"
) %>%
assign_studyid_domain(
studyid = "ADNI",
domain = "DM"
) %>%
assign_vars_label(
.data = .,
data_dict = dm_data_dic,
.strict = TRUE
) %>%
assert_non_missing(SITEID, USUBJID, SUBJID)
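The age derivation in the DM code divides the number of days between the first screening date and the birth date by 365.25. A minimal base R sketch of that arithmetic follows. It assumes the birth date is stored as month/year text such as "06/1950" and takes the first day of the month as the birth day; both are assumptions made for illustration.

```r
# Toy demographic records (illustrative values, not ADNI data).
# Assumption: BRTHDTC holds month/year text, and the day is set to the 1st.
toy_dm <- data.frame(
  RID = c(1, 2),
  BRTHDTC = c("06/1950", "11/1942"),
  SESTDTC = as.Date(c("2020-06-01", "2017-05-15")),
  stringsAsFactors = FALSE
)

birth_date <- as.Date(paste0("01/", toy_dm$BRTHDTC), format = "%d/%m/%Y")

# Age in years at first screening, rounded to one decimal place
toy_dm$AGE <- round(as.numeric(toy_dm$SESTDTC - birth_date) / 365.25, 1)
toy_dm$AGEU <- "Years"
toy_dm
```

Dividing by 365.25 averages over leap years, so the result is an approximation of the age in years rather than an exact calendar age.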
Subject Characteristics (SC)
The SC dataset contains subject-related data that are not collected in the DM
domain and/or characteristics that are collected over time (i.e., across the
ADNI phases: ADNI1, ADNIGO, ADNI2, ADNI3, and ADNI4). The SC dataset contains
the following characteristics collected from three source datasets: PTDEMOG,
ADI and RURALITY.
# Full demographic records ----
sc_common_cols <- c("ORIGPROT", "COLPROT", "RID", "VISCODE", "VISDATE")
SC_PTDEMOG <- PTDEMOG %>%
select(-all_of(
c(
"PTID", "VISCODE2", "ID", "SITEID", "USERDATE", "USERDATE2",
"DD_CRF_VERSION_LABEL", "LANGUAGE_CODE", "HAS_QC_ERROR", "update_stamp"
)
)) %>%
mutate(COLPROT = factor(COLPROT, levels = adni_phase())) %>%
assert_non_missing(COLPROT) %>%
group_by(RID, ORIGPROT, COLPROT) %>%
# Since this is an observational study, carry values down and up within subject and phase
arrange(COLPROT, VISDATE) %>%
fill(-all_of(sc_common_cols), .direction = "down") %>%
fill(-all_of(sc_common_cols), .direction = "up") %>%
ungroup() %>%
mutate(across(-all_of(sc_common_cols), as.character)) %>%
mutate(VISCODE = factor(VISCODE, levels = c(unique(EPOCH_LIST$VISCODE)))) %>%
assert_non_missing(VISCODE)
SC_DEMOG <- tibble(RID = unique_rid) %>%
mutate(ORIGPROT = original_study_protocol(RID = RID)) %>%
left_join(
SC_PTDEMOG,
by = dm_join_var
) %>%
pivot_longer(
cols = !all_of(sc_common_cols),
names_to = "SCTESTCD",
values_to = "SCORRES"
) %>%
# Remove missing values
drop_na(SCORRES) %>%
mutate(
SCCAT = "Demographic Records",
SCDTC = as.character(VISDATE)
) %>%
# Required a long format dataset
generate_oak_id_vars_adni(raw_src = "PTDEMOG")
# Area Deprivation Index (ADI) for ADNI4 phase only ----
SC_ADI <- ADI %>%
verify(all(COLPROT == adni_phase()[5])) %>%
select(
RID, ORIGPROT, COLPROT, VISCODE, ADISTATE, ADINATIONAL, ADIREV,
SCDTC = ADIDATE
) %>%
generate_oak_id_vars_adni(raw_src = "ADI") %>%
mutate(across(c(ADISTATE, ADINATIONAL, ADIREV, SCDTC), as.character)) %>%
pivot_longer(
cols = c(ADISTATE, ADINATIONAL, ADIREV),
names_to = "SCTESTCD",
values_to = "SCORRES"
) %>%
mutate(SCCAT = "Area Deprivation Index")
# RUCA and RUCC from ADNI4 phase only ----
rurality_cols <- c("RUCA", "RUCC", "RUCA_2010", "RUCC_2023")
SC_RURALITY <- RURALITY %>%
verify(all(COLPROT == adni_phase()[5])) %>%
select(
RID, ORIGPROT, COLPROT, VISCODE, all_of(rurality_cols),
SCDTC = RURDATE
) %>%
generate_oak_id_vars_adni(raw_src = "RURALITY") %>%
mutate(across(all_of(c(rurality_cols, "SCDTC")), as.character)) %>%
pivot_longer(
cols = all_of(rurality_cols),
names_to = "SCTESTCD",
values_to = "SCORRES"
) %>%
mutate(SCCAT = "Rurality")
SC <- bind_rows(SC_DEMOG, SC_ADI, SC_RURALITY) %>%
assert_uniq(RID, ORIGPROT, COLPROT, VISCODE, SCTESTCD) %>%
mutate(
COLPROT = factor(COLPROT, levels = adni_phase()),
SCGRPID = COLPROT,
SCSTAT = case_when(is.na(SCDTC) ~ "NOT DONE"),
SCSTRESN = as.numeric(SCORRES),
SCDTC = create_iso8601(SCDTC, .format = "y-m-d")
) %>%
assert_uniq(RID, ORIGPROT, COLPROT, VISCODE, SCTESTCD) %>%
derive_usubjid(varList = "USUBJID") %>%
assign_studyid_domain(
.data = .,
studyid = "ADNI",
domain = "SC"
) %>%
assign_visit_attr(
.data = .,
visit_record_data = ADNI_VISIT_RECORD,
domain = "SC",
check_missing = TRUE
) %>%
assign_epoch(
.data = .,
.epoch = EPOCH_LIST_LONG
) %>%
derive_blfl_adni(
sdtm_in = .,
dm_domain = DM,
tgt_var = "SCBLFL"
) %>%
derive_study_day_adni(
dm_domain = DM,
domain = "SC",
refdt = "RFSTDTC"
) %>%
derive_seq(
tgt_var = "SCSEQ",
rec_vars = c("USUBJID", "SCTESTCD", "SCGRPID")
)
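The fill-then-pivot pattern used for the demographic records can be sketched on toy data as follows. The values are made up, and the two characteristic columns are included only as examples. The sketch uses `fill(.direction = "downup")` as a compact equivalent of filling down and then up.

```r
library(dplyr)
library(tidyr)

# Toy demographic records for one subject and phase (illustrative only)
toy_demog <- tibble(
  RID = c(1, 1, 1),
  COLPROT = c("ADNI3", "ADNI3", "ADNI3"),
  VISDATE = as.Date(c("2018-01-05", "2019-01-07", "2020-01-06")),
  PTEDUCAT = c(NA, "16", NA),
  PTMARRY = c("Married", NA, "Widowed")
)

toy_sc <- toy_demog %>%
  group_by(RID, COLPROT) %>%
  arrange(VISDATE, .by_group = TRUE) %>%
  # Carry values forward, then backward, within subject and phase
  fill(PTEDUCAT, PTMARRY, .direction = "downup") %>%
  ungroup() %>%
  # One row per subject, visit and characteristic
  pivot_longer(
    cols = c(PTEDUCAT, PTMARRY),
    names_to = "SCTESTCD",
    values_to = "SCORRES"
  ) %>%
  drop_na(SCORRES)
toy_sc
```

Filling down before filling up means a later value never overwrites an earlier known one; it only fills gaps that precede the first recorded value.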
Adverse Events (AE)
The AE dataset contains one record per adverse event per subject. The AE
dataset includes the following characteristics for subjects who experienced at
least one adverse event during the study.
NOTE: Currently, the AE dataset only includes records collected in the ADNI3 and ADNI4 study phases.
# Adverse Event for ADNI3 and ADNI4
AE_ADNI34 <- ADVERSE %>%
select(
ID, RID, ORIGPROT, COLPROT, VISCODE, SITEID, AENUMBER, AEOUTCOME,
AEHONSDT, AEHCSDT, AEHDTHDT, AERELAD, AERELCM, AERELFLRBTBN, AERELFLRBPR,
AEHIMG, AERELTAU, AERELNAV, AERELMK, AERELPI, AEHLUMB, AERELCOVID, AERELPAN,
AERELATESP, AESERIOUS, AESERDATE, SAELIFE, SAEHOSPIT, SAEPROLONG, SAEDEATH,
SAECONGEN, SAEDISAB, SAEOTHER, AEHCMEDS,
AESEV0 = AEHSEVR, all_of(paste0("AESEV", 1:10))
) %>%
generate_oak_id_vars_adni(raw_src = "ADVERSE") %>%
assert_non_missing(AENUMBER) %>%
# Filter the worst severity level per RID, AENUMBER, COLPROT
mutate(across(all_of(paste0("AESEV", 0:10)), as.character)) %>%
pivot_longer(
cols = all_of(paste0("AESEV", 0:10)),
names_to = "SEVERITY_COL",
values_to = "AESEV"
) %>%
mutate(AESEV_NUM = case_when(
AESEV == "Mild" ~ 1,
AESEV == "Moderate" ~ 2,
AESEV == "Severe" ~ 3
)) %>%
group_by(RID, ORIGPROT, COLPROT, SITEID, AENUMBER, VISCODE) %>%
filter(
(all(is.na(AESEV)) & row_number() == 1) |
(any(!is.na(AESEV)) & AESEV_NUM == max(AESEV_NUM, na.rm = TRUE))
) %>%
filter(
(n() > 1 & row_number() == n()) |
(n() == 1 & row_number() == 1)
) %>%
ungroup() %>%
verify(nrow(.) == nrow(ADVERSE)) %>%
assert_uniq(RID, ORIGPROT, COLPROT, AENUMBER, VISCODE) %>%
rename(
"AESER" = AESERIOUS, "AEOUT" = AEOUTCOME,
"AESCONG" = SAECONGEN, "AESDISAB" = SAEDISAB, "AESDTH" = SAEDEATH,
"AESLIFE" = SAELIFE, "AESMIE" = SAEOTHER, "AECONTRT" = AEHCMEDS,
"AESTDTC" = AEHONSDT, "AEENDTC" = AEHCSDT
) %>%
# Adverse events for `required or prolongs hospitalization`
mutate(
AESHOSP = case_when(
SAEHOSPIT == "Yes" | SAEPROLONG == "Yes" ~ "Yes",
SAEHOSPIT == "No" & SAEPROLONG == "No" ~ "No",
SAEHOSPIT == "No" & is.na(SAEPROLONG) ~ "No",
is.na(SAEHOSPIT) & SAEPROLONG == "No" ~ "No"
)
) %>%
select(-c(SAEHOSPIT, SAEPROLONG, AESEV_NUM, SEVERITY_COL))
# Empty placeholder for ADNI1/GO/2 adverse events; RECADV still requires checking for missing AENUMBER
AE_ADNI12GO <- tibble(PHASE = NA_character_) %>%
na.omit()
AE <- AE_ADNI34 %>%
bind_rows(AE_ADNI12GO) %>%
assert_non_missing(RID) %>%
mutate(
AESTDTC = as.character(AESTDTC),
AEENDTC = as.character(AEENDTC)
) %>%
group_by(RID) %>%
arrange(AESTDTC) %>%
mutate(AESEQ = row_number()) %>%
ungroup()
AE <- AE %>%
derive_usubjid() %>%
assign_studyid_domain(domain = "AE") %>%
assign_visit_attr() %>%
assign_epoch() %>%
mutate(
AEGRPID = COLPROT,
AETERM = NA_character_,
AELLT = NA_character_,
AELLTCD = NA_character_,
AEDECOD = NA_character_,
AEPTCD = NA_character_,
AEHLT = NA_character_,
AEHLTCD = NA_character_,
AEHLGT = NA_character_,
AEHLGTCD = NA_character_,
AESOC = NA_character_,
AESOCCD = NA_character_
) %>%
assign_vars_label(data_dict = ae_data_dic)
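The step that keeps the worst severity per adverse event can be sketched more compactly on toy data. This is an alternative formulation for illustration, not the code used above: it ranks the severity levels, sorts within each event, and keeps the first row.

```r
library(dplyr)

# Toy long-format severity records: one row per event and severity entry
toy_sev <- tibble(
  RID = c(1, 1, 1, 2, 2),
  AENUMBER = c(1, 1, 1, 1, 1),
  AESEV = c("Mild", "Severe", "Moderate", NA, NA)
)

# Numeric rank for each severity level
sev_rank <- c(Mild = 1, Moderate = 2, Severe = 3)

toy_worst <- toy_sev %>%
  mutate(AESEV_NUM = unname(sev_rank[AESEV])) %>%
  group_by(RID, AENUMBER) %>%
  # Worst known severity first; missing ranks are sorted last by arrange()
  arrange(desc(AESEV_NUM), .by_group = TRUE) %>%
  slice(1) %>%
  ungroup()
toy_worst
```

An event with no recorded severity still keeps exactly one row, so the number of events is preserved.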
Questionnaires (QS)
The QS dataset contains one record per parameter finding (i.e., total score)
per visit per subject. The QS dataset contains the following characteristics:
The following cognitive/functional assessment scores are included in the QS
dataset.
qs_com_cols <- c(
"RID", "ORIGPROT", "COLPROT", "VISCODE",
"VISCODE2", "VISDATE", "SITEID"
)
# ADAS Cognitive Behavior Total Score ----
## Completed variable ??
ADAS_SCORE_DATA <- ADAS %>%
select(all_of(qs_com_cols), ADASTT11 = TOTSCORE, ADASTT13 = TOTAL13) %>%
generate_oak_id_vars_adni(raw_src = "ADAS") %>%
pivot_longer(
cols = c(ADASTT11, ADASTT13),
names_to = "QSTESTCD",
values_to = "QSSTRESC"
) %>%
mutate(
QSSTRESC = as.character(QSSTRESC),
QSDRVFL = "Yes",
QSSTAT = ifelse(is.na(QSSTRESC), "NOT DONE", NA_character_)
)
# Clinical Dementia Rating Score ----
## Completion columns??
CDR_SCORE_DATA <- CDR %>%
select(all_of(qs_com_cols), CDGLOBAL, CDRSB) %>%
generate_oak_id_vars_adni(raw_src = "CDR") %>%
pivot_longer(
cols = c(CDGLOBAL, CDRSB),
names_to = "QSTESTCD",
values_to = "QSSTRESC"
) %>%
mutate(
QSSTRESC = as.character(QSSTRESC),
QSSTAT = ifelse(is.na(QSSTRESC), "NOT DONE", NA_character_)
)
# Everyday Cognition Total Score ----
## Completion variable
ECOG_SCORE_DATA <- ECOGPT %>%
mutate(QSTESTCD = "ECOGPTTT") %>%
select(all_of(qs_com_cols), QSTESTCD, QSSTRESC = EcogPtTotal) %>%
generate_oak_id_vars_adni(raw_src = "ECOGPT") %>%
bind_rows(
ECOGSP %>%
mutate(QSTESTCD = "ECOGSPTT") %>%
select(all_of(qs_com_cols), QSTESTCD, QSSTRESC = EcogSPTotal) %>%
generate_oak_id_vars_adni(raw_src = "ECOGSP")
) %>%
mutate(
QSSTRESC = as.character(QSSTRESC),
QSDRVFL = "Yes",
QSSTAT = ifelse(is.na(QSSTRESC), "NOT DONE", NA_character_)
)
# Financial Capacity Instrument Short Form - Score ----
FCI_SCORE_DATA <- FCI %>%
select(all_of(qs_com_cols),
QSSTRESC = FCISCORE,
QSSTAT = DONE, QSREASND = NDREASON
) %>%
mutate(
QSSTRESC = as.character(QSSTRESC),
QSTESTCD = "FCISCORE"
) %>%
generate_oak_id_vars_adni(raw_src = "FCI")
# Functional Assessments Questionnaires - Score ----
# Completion status ??
FAQ_SCORE_DATA <- FAQ %>%
select(all_of(qs_com_cols), QSSTRESC = FAQTOTAL) %>%
mutate(
QSSTRESC = as.character(QSSTRESC),
QSTESTCD = "FAQTOTAL",
) %>%
generate_oak_id_vars_adni(raw_src = "FAQ")
# Geriatric Depression Scale ----
GDS_SCORE_DATA <- GDSCALE %>%
select(all_of(qs_com_cols), QSSTRESC = GDTOTAL, QSREASND = GDUNABL) %>%
mutate(
QSSTRESC = as.character(QSSTRESC),
QSTESTCD = "GDTOTAL",
QSSTAT = ifelse(!is.na(QSREASND), "NOT DONE", NA_character_)
) %>%
generate_oak_id_vars_adni(raw_src = "GDSCALE")
# Mini Mental State Exam Score ----
MMSE_SCORE_DATA <- MMSE %>%
select(all_of(qs_com_cols),
QSSTRESC = MMSCORE,
QSSTAT = DONE, QSREASND = NDREASON
) %>%
mutate(
QSSTRESC = as.character(QSSTRESC),
QSTESTCD = "MMSCORE",
QSDRVFL = "Yes"
) %>%
generate_oak_id_vars_adni(raw_src = "MMSE")
# Montreal Cognitive Assessments ----
# Completion status ??
MOCA_SCORE_DATA <- MOCA %>%
select(all_of(qs_com_cols), QSSTRESC = MOCA) %>%
mutate(
QSSTRESC = as.character(QSSTRESC),
QSTESTCD = "MOCA",
QSDRVFL = "Yes"
) %>%
generate_oak_id_vars_adni(raw_src = "MOCA")
# Neuropsychiatric Inventory ----
# Completion status??
NPI_SCORE_DATA <- NPI %>%
rename("VISDATE" = EXAMDATE) %>%
select(all_of(qs_com_cols), QSSTRESC = NPITOTAL) %>%
mutate(
QSSTRESC = as.character(QSSTRESC),
QSTESTCD = "NPITOTAL",
QSDRVFL = "Yes",
QSSTAT = ifelse(is.na(QSSTRESC), "NOT DONE", NA_character_)
) %>%
generate_oak_id_vars_adni(raw_src = "NPI")
# Neuropsychiatric Inventory Q ----
# Completion status??
NPIQ_SCORE_DATA <- NPIQ %>%
select(all_of(qs_com_cols), QSSTRESC = NPISCORE) %>%
mutate(
QSSTRESC = as.character(QSSTRESC),
QSTESTCD = "NPIQTOTL",
QSDRVFL = "Yes",
QSSTAT = ifelse(is.na(QSSTRESC), "NOT DONE", NA_character_)
) %>%
generate_oak_id_vars_adni(raw_src = "NPIQ")
# Logical Memory - Immediate/Delayed Recall ----
neurobat_cols <- c(
"LIMMTOTL", "LDELTOTL", "DIGITSCR", "TRABSCOR",
"RAVLTIMM", "RAVLTLRN", "RAVLTFG", "RAVLTFGP"
)
NEUROBAT_SCORE_DATA <- NEUROBAT %>%
compute_neurobat_subscore(.neurobat = .) %>%
select(all_of(c(qs_com_cols, neurobat_cols))) %>%
generate_oak_id_vars_adni(raw_src = "NEUROBAT") %>%
pivot_longer(
cols = all_of(neurobat_cols),
names_to = "QSTESTCD",
values_to = "QSSTRESC"
) %>%
mutate(
QSSTRESC = as.character(QSSTRESC),
QSDRVFL = case_when(
QSTESTCD %in% c(
"RAVLTIMM", "RAVLTLRN", "RAVLTFG", "RAVLTFGP"
) ~ "Yes"
),
QSSTAT = ifelse(is.na(QSSTRESC), "NOT DONE", NA_character_)
)
# Score data names
score_data_names <- ls()[str_detect(ls(), "SCORE\\_DATA")]
QS <- mget(score_data_names) %>%
bind_rows() %>%
assert_non_missing(RID, COLPROT, VISCODE, QSTESTCD) %>%
mutate(
QSGRPID = COLPROT,
QSDTC = create_iso8601(as.character(VISDATE), .format = "y-m-d"),
QSSTRESN = as.numeric(QSSTRESC),
QSORRES = as.character(QSSTRESN),
QSSTAT = case_when(
QSSTAT %in% c("No", "NOT DONE") ~ "NOT DONE",
TRUE ~ NA_character_
)
) %>%
set_dom_test(
.data_list = QS_TESTCD_LIST %>%
select(QSTESTCD, QSTEST, QSCAT),
merge_by = "QSTESTCD"
) %>%
assert_uniq(RID, COLPROT, VISCODE, QSTESTCD) %>%
derive_usubjid() %>%
assign_studyid_domain(domain = "QS") %>%
assign_visit_attr() %>%
assign_epoch() %>%
derive_blfl_adni(
dm_domain = DM,
tgt_var = "QSBLFL"
) %>%
derive_study_day_adni(
dm_domain = DM,
domain = "QS"
) %>%
derive_seq(
tgt_var = "QSSEQ",
rec_vars = c("USUBJID", "QSTESTCD", "COLPROT")
) %>%
# # Update QSORRES for derived scores
# mutate(QSORRES = case_when(is.na(QSDRVFL) ~ QSSTRESC)) %>%
assign_vars_label(data_dict = qs_data_dic)
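The QS code gathers every workspace object whose name contains SCORE_DATA, which also picks up any unrelated object that happens to match. A possible alternative, sketched below on toy data, is to collect the score tables in an explicit named list before stacking them. The table contents are made up for illustration.

```r
# Toy score tables collected explicitly in a named list (illustrative values)
toy_scores <- list(
  MMSE = data.frame(
    RID = c(1, 2), QSTESTCD = "MMSCORE", QSSTRESC = c("29", NA),
    stringsAsFactors = FALSE
  ),
  FAQ = data.frame(
    RID = c(1, 2), QSTESTCD = "FAQTOTAL", QSSTRESC = c("0", "3"),
    stringsAsFactors = FALSE
  )
)

# Stack the tables, then derive the numeric result and completion status
toy_qs <- do.call(rbind, unname(toy_scores))
toy_qs$QSSTRESN <- as.numeric(toy_qs$QSSTRESC)
toy_qs$QSSTAT <- ifelse(is.na(toy_qs$QSSTRESC), "NOT DONE", NA_character_)

# Each subject should have at most one record per test code
stopifnot(!anyDuplicated(toy_qs[c("RID", "QSTESTCD")]))
toy_qs
```

An explicit list makes the set of source tables visible in the code and independent of whatever else exists in the session.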
Clinical Classification (RS)
The RS dataset contains one record per instrument status per visit per
subject. The RS dataset contains the following variables:
The RS dataset contains a clinical diagnostic summary of each subject per
study visit. The clinical diagnostic status (i.e., Cognitively Normal (CN),
Mild Cognitive Impairment (MCI) or Dementia/Alzheimer’s (DEM/AD)) of a subject
was determined by clinicians’ judgment.
# Clinical diagnostics status
RS <- DXSUM %>%
generate_oak_id_vars_adni(raw_src = "DXSUM") %>%
mutate(
DIAGNOSIS = case_when(
DIAGNOSIS %in% "Dementia" ~ "DEM",
TRUE ~ as.character(DIAGNOSIS)
),
RSTESTCD = "DX",
RSORRES = as.character(DIAGNOSIS),
RSSTRESC = as.character(DIAGNOSIS),
RSEVAL = SITEID,
RSDTC = create_iso8601(as.character(EXAMDATE), .format = "y-m-d"),
RSGRPID = COLPROT,
RSSTAT = NA_character_
) %>%
assert_non_missing(COLPROT)
RS <- RS %>%
set_dom_test(
.data_list = RS_TESTCD_LIST %>%
select(RSCAT, RSTEST, RSTESTCD),
merge_by = "RSTESTCD"
) %>%
derive_usubjid() %>%
assign_studyid_domain(domain = "RS") %>%
assign_visit_attr() %>%
assign_epoch() %>%
derive_blfl_adni(
dm_domain = DM,
tgt_var = "RSBLFL"
) %>%
derive_study_day_adni(
dm_domain = DM,
domain = "RS"
) %>%
derive_seq(
tgt_var = "RSSEQ",
rec_vars = c("USUBJID", "RSTESTCD", "COLPROT")
) %>%
assign_vars_label(data_dict = rs_data_dic)
Nervous System Finding (NV)
The NV dataset contains one record per physiological or morphological finding
related to the nervous system (including the brain) per visit per subject. The
NV dataset contains the following variables:
# FDG-PET Analysis Results Data ----
FDG_PET_DATA <- UCBERKELEYFDG_8mm %>%
# Based on method manual document (on the data-shared platform)
select(ORIGPROT, RID, VISCODE, VISCODE2, EXAMDATE, ROINAME, MEAN) %>%
# Remove missing visit code
filter(!is.na(VISCODE)) %>%
assert_uniq(RID, VISCODE, VISCODE2, ROINAME) %>%
verify(all(ORIGPROT %in% adni_phase()[-5])) %>%
pivot_wider(
id_cols = everything(),
names_from = "ROINAME",
values_from = "MEAN"
) %>%
mutate(FDGMROI = MetaROI / Top50PonsVermis) %>%
select(-MetaROI, -Top50PonsVermis) %>%
check_duplicate_records(col_names = c("RID", "EXAMDATE", "VISCODE")) %>%
mutate(
NVMETHOD = "FDG PET",
NVTESTCD = "FDGMROI",
NVSTRESC = as.character(FDGMROI),
NVSTRESN = FDGMROI,
NVDRVFL = "Yes",
NVDTC = as.character(EXAMDATE)
) %>%
generate_oak_id_vars_adni(raw_src = "UCBERKELEYFDG_8mm")
# Mapping phase-specific visit code based on the registry
FDG_PET_DATA <- FDG_PET_DATA %>%
# Trying to map visit code from registry
use_dtplyr() %>%
left_join(
REGISTRY %>%
mutate(EXAMDATE = as.character(EXAMDATE)) %>%
select(RID, ORIGPROT, COLPROT, VISCODE, VISCODE2,
REGISTRY.EXAMDATE = EXAMDATE
) %>%
filter(!is.na(REGISTRY.EXAMDATE)) %>%
filter(RID %in% unique(FDG_PET_DATA$RID)) %>%
distinct() %>%
check_duplicate_records(
col_names = c("RID", "ORIGPROT", "VISCODE", "REGISTRY.EXAMDATE")
),
by = c("RID", "ORIGPROT", "VISCODE", "VISCODE2")
) %>%
as_tibble() %>%
mutate(COLPROT = case_when(
is.na(COLPROT) & VISCODE == VISCODE2 ~ ORIGPROT,
is.na(COLPROT) & VISCODE != VISCODE2 & VISCODE2 %in% "bl" ~ ORIGPROT,
TRUE ~ COLPROT
)) %>%
verify(nrow(.) == nrow(FDG_PET_DATA)) %>%
assert_non_missing(COLPROT) %>%
assert_uniq(RID, COLPROT, VISCODE, NVTESTCD)
# PIB PET Analysis Results Data -----
# Only for ADNI1 phase
PIB_PET_DATA <- PIBPETSUVR %>%
verify(all(ORIGPROT == adni_phase()[1])) %>%
# Based on previously generated dataset
mutate(PIB = rowMeans(across(c("ACG", "FRC", "PAR", "PRC")), na.rm = FALSE)) %>%
select(RID, ORIGPROT, VISCODE, EXAMDATE, NVSTRESN = PIB, LONIUID) %>%
check_duplicate_records(col_names = c("RID", "EXAMDATE")) %>%
mutate(
NVMETHOD = "PET SCAN",
NVTESTCD = "PIB",
NVSTRESU = NA_character_,
NVSTRESC = as.character(NVSTRESN),
NVDRVFL = "Yes",
NVDTC = as.character(EXAMDATE),
NVLNKID = as.character(LONIUID),
COLPROT = adni_phase()[1]
) %>%
assert_non_missing(VISCODE) %>%
assert_uniq(RID, VISCODE, NVTESTCD) %>%
generate_oak_id_vars_adni(raw_src = "PIBPETSUVR")
# Amyloid status ----
amystatus_lvls <- c("Non Elevated", "Elevated")
AMYREAD_DATA <- AMYREAD %>%
verify(all(COLPROT == adni_phase()[5])) %>%
# Based on a clinician decision
mutate(AMYSTAT = case_when(
str_detect(CONSENS, "No, visual read and quantification") ~ as.character(CONGRU),
str_detect(CONSENS, "Yes, this scan should be reviewed") ~ as.character(CONSENSRES)
)) %>%
mutate(
AMYSTAT = str_remove(AMYSTAT, " scan"),
TRACERTYPE = as.character(TRACERTYPE)
) %>%
verify(all(AMYSTAT %in% amystatus_lvls)) %>%
generate_oak_id_vars_adni(raw_src = "AMYREAD") %>%
pivot_longer(
cols = AMYSTAT,
names_to = "NVTESTCD",
values_to = "NVORRES"
) %>%
mutate(
NVDTC = as.character(SCANDATE),
NVSTRESC = as.character(NVORRES),
NVMETHOD = "PET",
NVDRVFL = "Yes"
)
# Common cols in PET data
pet_data_common_cols <- c(
"ORIGPROT", "LONIUID", "RID", "VISCODE", "SCANDATE", "PROCESSDATE",
"IMAGE_RESOLUTION", "TRACER", "qc_flag"
)
# Amyloid PET Data ----
amypet_cols <- list(
AMYSTAT = "AMYLOID_STATUS",
AMYSTATC = "AMYLOID_STATUS_COMPOSITE_REF",
SUVRSM = "SUMMARY_SUVR",
SUVRCR = "COMPOSITE_REF_SUVR",
CENTILOIDS = "CENTILOIDS"
)
AMYPET_DATA <- UCBERKELEY_AMY_6MM %>%
mutate(across(
all_of(as.character(amypet_cols[1:2])),
~ case_when(
.x == 1 ~ amystatus_lvls[2],
.x == 0 ~ amystatus_lvls[1]
)
)) %>%
rename_with_list(., name_char = amypet_cols, by_name = TRUE) %>%
select(all_of(c(pet_data_common_cols, names(amypet_cols)))) %>%
mutate(across(all_of(names(amypet_cols)), as.character)) %>%
generate_oak_id_vars_adni(raw_src = "UCBERKELEY_AMY_6MM") %>%
pivot_longer(
cols = all_of(names(amypet_cols)),
names_to = "NVTESTCD",
values_to = "NVSTRESC"
)
# Tau PET Data -----
taupet_cols <- list(
SUVRINFE = "INFERIORCEREBELLUM_SUVR",
SUVRSC = "ERODED_SUBCORTICALWM_SUVR",
SUVRMETA = "META_TEMPORAL_SUVR"
)
TAUPET_DATA <- UCBERKELEY_TAU_6MM %>%
# filter(!is.na(VISCODE)) %>%
rename_with_list(., name_char = taupet_cols, by_name = TRUE) %>%
select(all_of(c(pet_data_common_cols, names(taupet_cols)))) %>%
mutate(across(all_of(names(taupet_cols)), as.character)) %>%
generate_oak_id_vars_adni(raw_src = "UCBERKELEY_TAU_6MM") %>%
pivot_longer(
cols = all_of(names(taupet_cols)),
names_to = "NVTESTCD",
values_to = "NVSTRESC"
)
# Tau PET - PVC Data ----
taupet_pvc_cols <- list(
SUVRINFE = "INFERIORCEREBELLUM_SUVR",
SUVRCWM = "CEREBRAL_WHITE_MATTER_SUVR",
SUVRMETA = "META_TEMPORAL_SUVR"
)
TAUPET_PVC_DATA <- UCBERKELEY_TAUPVC_6MM %>%
mutate(
IMAGE_RESOLUTION = "None",
qc_flag = NA_real_
) %>%
rename_with_list(., name_char = taupet_pvc_cols, by_name = TRUE) %>%
select(all_of(c(pet_data_common_cols, names(taupet_pvc_cols)))) %>%
mutate(across(all_of(names(taupet_pvc_cols)), as.character)) %>%
generate_oak_id_vars_adni(raw_src = "UCBERKELEY_TAUPVC_6MM") %>%
pivot_longer(
cols = all_of(names(taupet_pvc_cols)),
names_to = "NVTESTCD",
values_to = "NVSTRESC"
)
# PET dataset
PET_join_var <- paste0(c("ORIGPROT", "RID", "SCANDATE"), "_MPL")
names(PET_join_var) <- str_remove_all(PET_join_var, "\\_MPL")
PET_DATA <- bind_rows(AMYPET_DATA, TAUPET_DATA, TAUPET_PVC_DATA) %>%
select(-VISCODE) %>%
# Fuzzy join for actual study phase and visits
left_fuzzy_join(
data1 = .,
data2 = IMAGING_MAPPING_LIST %>%
select(ORIGPROT, COLPROT, RID, VISCODE, SCANDATE, SOURCE) %>%
rename_with_list(., name_char = PET_join_var, by_name = FALSE),
join_by = PET_join_var,
check_cols = "COLPROT",
main_cols = "SCANDATE",
date_col = "SCANDATE"
) %>%
mutate(
LONIUID = as.character(LONIUID),
NVMETHOD = "PET",
NVDTC = as.character(SCANDATE),
NVLNKID = as.character(LONIUID),
NVSTRESN = as.numeric(NVSTRESC),
NVSTAT = case_when(qc_flag %in% -2:0 ~ "NOT DONE"),
NVREASND = case_when(
qc_flag == -2 ~ "CANNOT BE PROCESSED",
qc_flag == -1 ~ "NOT ASSESSED",
qc_flag == 0 ~ "FAIL"
)
) %>%
assert_non_missing(COLPROT)
NV <- bind_rows(FDG_PET_DATA, PIB_PET_DATA, AMYREAD_DATA, PET_DATA) %>%
mutate(
NVGRPID = COLPROT,
NVDTC = create_iso8601(NVDTC, .format = "y-m-d")
) %>%
left_join(
NV_TESTCD_LIST %>%
select(NVCAT, NVSCAT, NVTESTCD, NVTEST, SOURCE),
by = c("NVTESTCD" = "NVTESTCD", "raw_source" = "SOURCE"),
relationship = "many-to-many"
) %>%
assert_non_missing(NVTEST, NVSCAT, NVCAT) %>%
derive_usubjid() %>%
assign_studyid_domain(domain = "NV") %>%
assign_visit_attr(check_missing = TRUE) %>%
assign_epoch() %>%
derive_blfl_adni(
dm_domain = DM,
tgt_var = "NVBLFL"
) %>%
derive_study_day_adni(
dm_domain = DM,
domain = "NV"
) %>%
derive_seq(
tgt_var = "NVSEQ",
rec_vars = c("USUBJID", "NVCAT", "NVSCAT", "NVTESTCD", "NVGRPID")
) %>%
assign_vars_label(nv_data_dic)
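The FDG-PET summary value in the NV code is a ratio of two region-level means obtained after reshaping the data from long to wide format. The sketch below reproduces that step on toy data. The region names follow the code above, and the numeric values are made up.

```r
library(dplyr)
library(tidyr)

# Toy region-level means in long format (values are made up)
toy_fdg <- tibble(
  RID = c(1, 1, 2, 2),
  VISCODE = c("bl", "bl", "bl", "bl"),
  ROINAME = c("MetaROI", "Top50PonsVermis", "MetaROI", "Top50PonsVermis"),
  MEAN = c(1.2, 1.0, 0.9, 0.75)
)

toy_fdg_wide <- toy_fdg %>%
  # One row per subject and visit, one column per region
  pivot_wider(
    id_cols = c(RID, VISCODE),
    names_from = ROINAME,
    values_from = MEAN
  ) %>%
  # Summary value: target region mean divided by reference region mean
  mutate(FDGMROI = MetaROI / Top50PonsVermis)
toy_fdg_wide
```

Naming the identifier columns explicitly in `id_cols` keeps the reshape predictable if further columns are later added to the input.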
Laboratory Test Results (LB)
The LB dataset contains laboratory test data such as hematology, clinical
chemistry and urinalysis per visit per subject. Additionally, the LB dataset
contains blood plasma biomarker results, with the following variables:
The LB dataset contains the following blood plasma biomarker results in
addition to the safety lab data.
common_cols <- c(
"ORIGPROT", "COLPROT", "ID", "PTID", "RID", "SITEID", "VISCODE", "VISCODE2",
"USERDATE", "USERDATE2", "RECNO", "ACCNO", "COVVIS", "EXAMDATE", "update_stamp"
)
lab_cols <- c(
"ORIGPROT", "COLPROT", "RID", "SITEID",
"VISCODE", "VISCODE2", "EXAMDATE"
)
# Lab data for ADNI1-GO-2 phases
ADNI1GO2_CLINICAL_LABDATA <- LABDATA %>%
generate_oak_id_vars_adni(raw_src = "LABDATA") %>%
mutate(across(-all_of(c(common_cols, oak_id_vars())), as.character)) %>%
pivot_longer(
cols = everything() & -all_of(c(common_cols, oak_id_vars())),
names_to = "LBTESTCD",
values_to = "LBORRES"
) %>%
select(-all_of(common_cols[!common_cols %in% lab_cols])) %>%
filter(LBORRES != -1) %>%
mutate(LBDTC = create_iso8601(as.character(EXAMDATE), .format = "y-m-d")) %>%
adjust_lab_visitcode()
# Lab data for ADNI3-4 phases
ADNI34_CLINICAL_LABDATA <- URMC_LABDATA %>%
generate_oak_id_vars_adni(raw_src = "URMC_LABDATA") %>%
mutate(
LBNAM = "URMC",
TestID = ifelse(TestName %in% "Sodium", "NA", TestID)
) %>%
mutate(across(ends_with(c("Date", "Time")), as.character))
# Create LBDTC using `sdtm.oak::assign_datetime`
ADNI34_CLINICAL_LABDATA <- assign_datetime(
tgt_dat = ADNI34_CLINICAL_LABDATA %>%
select(-SampleDate, -SampleTime),
raw_dat = ADNI34_CLINICAL_LABDATA,
tgt_var = "LBDTC",
raw_var = c("SampleDate", "SampleTime"),
raw_fmt = c("y-m-d", "H:M:S")
)
ADNI34_CLINICAL_LABDATA <- ADNI34_CLINICAL_LABDATA %>%
# Adjust for lab test that were considered as 'not completed/done'
adjust_lab_status() %>%
adjust_lab_visitcode() %>%
mutate(
LBGRPID = COLPROT,
LBTESTCD = TestID,
LBTEST = TestName,
LBORRES = ResultValueConv_translated,
LBORRESU = UnitsConv,
LBORNRLO = LowerRangeConv,
LBORNRHI = UpperRangeConv,
LBSTRESC = ResultValueSI_translated,
LBSTRESN = as.numeric(ResultValueSI_translated),
LBSTRESU = UnitsSI,
LBSTNRLO = LowerRangeSI,
LBSTNRHI = UpperRangeSI,
LBFAST = Fasting,
LBSPCCND = Comments
)
# C2N Blood Plasma Result ----
c2n_cols <- c(
"pT217_C2N", "npT217_C2N", "AB42_C2N", "AB40_C2N", "AB42_AB40_C2N",
"pT217_npT217_C2N", "APS2_C2N"
)
names(c2n_cols) <- c(
"PT217", "NPT217", "AB42", "AB40",
"AB42AB40", "PTNPT217", "APS2"
)
C2N_PLASMA_DATA <- C2N_PRECIVITYAD2_PLASMA %>%
generate_oak_id_vars_adni(raw_src = "C2N_PRECIVITYAD2_PLASMA") %>%
rename(all_of(c2n_cols)) %>%
select(
all_of(c(oak_id_vars(), names(c2n_cols))),
ORIGPROT, COLPROT, RID, VISCODE,
EXAMDATE,
LBANTREG = Primary, LBSPCCND = Comments
) %>%
mutate(across(all_of(names(c2n_cols)), as.character)) %>%
pivot_longer(
cols = all_of(names(c2n_cols)),
names_to = "LBTESTCD",
values_to = "LBORRES"
) %>%
mutate(
LBSPEC = "PLASMA",
LBDTC = create_iso8601(as.character(EXAMDATE), .format = "y-m-d")
) %>%
left_join(
get_biomarker_details(assay = "C2N"),
by = "LBTESTCD"
) %>%
assert_non_missing(LBTEST)
LB <- bind_rows(
C2N_PLASMA_DATA, ADNI1GO2_CLINICAL_LABDATA,
UPENNBIOMK_ROCHE_DATA, UPENNBIOMK_ALZBIO3_DATA,
UPENN_PLASMA_FQ_DATA
) %>%
mutate(
LBGRPID = COLPROT,
LBSTRESC = as.character(LBORRES),
LBSTRESN = as.numeric(LBORRES)
) %>%
bind_rows(ADNI34_CLINICAL_LABDATA) %>%
assert_non_missing(LBGRPID) %>%
# Required adjustment ???
# set_dom_test(
# .data_list = LB_TESTCD_LIST %>%
# select(LBTESTCD, LBTEST, LBORRESU, LBSTRESU),
# merge_by = "LBTESTCD"
# ) %>%
derive_usubjid() %>%
assign_studyid_domain(domain = "LB") %>%
assign_visit_attr(check_missing = FALSE) %>%
assign_epoch() %>%
derive_blfl_adni(
dm_domain = DM,
tgt_var = "LBBLFL"
) %>%
derive_study_day_adni(
dm_domain = DM,
domain = "LB"
) %>%
derive_seq(
tgt_var = "LBSEQ",
rec_vars = c("USUBJID", "LBTESTCD", "LBGRPID")
) %>%
assign_vars_label(data_dict = lb_data_dic, .strict = FALSE)
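The LB code combines a sample date and a sample time into a single collection date/time value. A minimal base R sketch of that idea follows. It is not the `sdtm.oak::assign_datetime` implementation; it simply joins the two text parts with "T" and keeps the date alone when the time is missing. The values are illustrative.

```r
# Toy sample date and time columns stored as text (illustrative values)
toy_lab <- data.frame(
  SampleDate = c("2021-03-04", "2021-03-05", NA),
  SampleTime = c("08:15:00", NA, NA),
  stringsAsFactors = FALSE
)

# Date and time joined with "T"; date only when the time is missing;
# missing when the date itself is missing
toy_lab$LBDTC <- ifelse(
  is.na(toy_lab$SampleDate),
  NA_character_,
  ifelse(
    is.na(toy_lab$SampleTime),
    toy_lab$SampleDate,
    paste0(toy_lab$SampleDate, "T", toy_lab$SampleTime)
  )
)
toy_lab
```

Keeping a date-only value when the time is unknown avoids inventing a time of day for the sample.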
Genomics Findings (GF)
The GF dataset contains data related to genomic material of interest. The GF
dataset contains subjects’ APOE genotype, which is collected once during the
study period.
NOTE: There might be some instances of duplicated APOE genotype records per subject.
GF <- APOERES %>%
generate_oak_id_vars_adni(raw_src = "APOERES") %>%
mutate(
GFTESTCD = "APOE",
GFTEST = "Apolipoprotein E",
GFGRPID = COLPROT,
GFSTDTL = "GENOTYPE",
GFORRES = GENOTYPE,
GFDTC = create_iso8601(as.character(APTESTDT), .format = "y-m-d"),
) %>%
separate(GENOTYPE, into = c("ALLEL1", "ALLEL2"), sep = "/") %>%
mutate(across(contains("ALLEL"), ~ paste0("ε", .x))) %>%
unite("GFSTRESC", ALLEL1, ALLEL2, sep = "/") %>%
# Required to adjust for collection date ??
derive_usubjid() %>%
assign_studyid_domain(domain = "GF") %>%
assign_visit_attr(check_missing = FALSE) %>%
derive_study_day_adni(
dm_domain = DM,
domain = "GF"
) %>%
derive_seq(
tgt_var = "GFSEQ",
rec_vars = c("USUBJID", "GFTESTCD", "GFGRPID")
) %>%
assign_vars_label(data_dict = gf_data_dic)
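Because duplicated APOE records per subject are possible, it can be useful to check for them before analysis. The sketch below uses hypothetical toy data (`toy_apoe`) rather than `APOERES`, and it also shows one possible way to keep a missing genotype as `NA` when building the standardized result; this missing-value handling is an assumption, not part of the code above.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: subject 1 has a duplicated record, subject 3 is missing
toy_apoe <- tibble(
  RID = c(1, 1, 2, 3),
  GENOTYPE = c("3/4", "3/4", "3/3", NA)
)

toy_gf <- toy_apoe %>%
  mutate(GFORRES = GENOTYPE) %>%
  # Split the genotype into its two alleles
  separate(GENOTYPE, into = c("ALLEL1", "ALLEL2"), sep = "/") %>%
  # Prefix each allele and recombine; keep missing genotypes as NA
  mutate(
    GFSTRESC = if_else(
      is.na(ALLEL1) | is.na(ALLEL2),
      NA_character_,
      paste0("ε", ALLEL1, "/ε", ALLEL2)
    )
  ) %>%
  select(-ALLEL1, -ALLEL2)

# Subjects with more than one APOE record
dup_subjects <- toy_gf %>%
  count(RID, name = "N_RECORDS") %>%
  filter(N_RECORDS > 1)

dup_subjects
```

How to resolve duplicates (for example, keeping the first record or checking that the genotypes agree) depends on the analysis and is left open here.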
Vital Sign (VS)
The VS dataset contains measurements of blood pressure, pulse, respiratory rate, temperature, weight, and height per visit per subject. The VS dataset will contain the following variables:
# Wide format
VS <- VITALS %>%
mutate(VSDTC = create_iso8601(as.character(VISDATE), .format = "y-m-d")) %>%
select(
ORIGPROT, COLPROT, RID, VISCODE, VSDTC,
VSWEIGHT, VSWTUNIT, VSHEIGHT, VSHTUNIT, VSBPSYS, VSBPDIA, VSPULSE,
VSRESP, VSTEMP, VSTMPSRC, VSTMPUNT, VSHGTSC
) %>%
generate_oak_id_vars_adni(raw_src = "VITALS") %>%
mutate(across(c(VSWEIGHT, VSHEIGHT, VSTEMP), ~ ifelse(.x == -1, NA, .x))) %>%
verify(all(VSWTUNIT %in% c("kilograms", "pounds") | is.na(VSWTUNIT))) %>%
verify(all(VSHTUNIT %in% c("centimeters", "inches") | is.na(VSHTUNIT))) %>%
verify(all(VSTMPUNT %in% c("Fahrenheit", "Celsius") | is.na(VSTMPUNT))) %>%
mutate(
VSWTUNIT_TRANSLATED = case_when(
VSWTUNIT %in% "kilograms" ~ "kg",
VSWTUNIT %in% "pounds" ~ "LB"
),
VSHTUNIT_TRANSLATED = case_when(
VSHTUNIT %in% "centimeters" ~ "cm",
VSHTUNIT %in% "inches" ~ "inch"
),
VSTMPUNT_TRANSLATED = case_when(
VSTMPUNT %in% "Fahrenheit" ~ "F",
VSTMPUNT %in% "Celsius" ~ "C"
),
) %>%
unite("WEIGHT", VSWEIGHT, VSWTUNIT_TRANSLATED, sep = "-") %>%
unite("HEIGHT", VSHEIGHT, VSHTUNIT_TRANSLATED, sep = "-") %>%
unite("TEMP", VSTEMP, VSTMPUNT_TRANSLATED, VSTMPSRC, sep = "-") %>%
mutate(
DIABP = ifelse(!is.na(VSBPDIA), paste0(VSBPDIA, "-", "mmHg"), NA),
SYSBP = ifelse(!is.na(VSBPSYS), paste0(VSBPSYS, "-", "mmHg"), NA),
PLUSE = ifelse(!is.na(VSPULSE), paste0(VSPULSE, "-", "beats/min"), NA),
RESP = ifelse(!is.na(VSRESP), paste0(VSRESP, "-", "breaths/min"), NA)
)
# Long format
VS <- VS %>%
pivot_longer(
cols = c(WEIGHT, HEIGHT, TEMP, DIABP, SYSBP, PLUSE, RESP),
names_to = "VSTESTCD",
values_to = "VALUE"
) %>%
set_dom_test(
.data_list = VS_TESTCD_LIST %>%
select(VSTESTCD, VSTEST),
merge_by = "VSTESTCD"
) %>%
separate(VALUE, into = c("VSORRES", "VSORRESU", "VSLOC"), sep = "-") %>%
mutate(VSLOC = str_to_upper(VSLOC)) %>%
mutate(VSSTRESU = case_when(
VSORRESU %in% c("F", "C") ~ "C",
VSORRESU %in% c("cm", "inch") ~ "cm",
VSORRESU %in% c("kg", "LB") ~ "kg",
VSORRESU %in% c("mmHg", "beats/min", "breaths/min") ~ VSORRESU
)) %>%
# Unit conversion: conv_unit
mutate(
FROM_UNIT = case_when(
VSORRESU %in% "LB" ~ "lbs",
VSORRESU %in% c("C", "F", "cm", "inch", "kg") ~ VSORRESU,
TRUE ~ NA_character_
),
TO_UNIT = case_when(
VSSTRESU %in% c("C", "kg", "cm") ~ VSSTRESU,
TRUE ~ NA_character_
)
) %>%
rowwise() %>%
mutate(
VSSTRESN = ifelse(
VSTESTCD %in% c("WEIGHT", "HEIGHT", "TEMP") & !is.na(FROM_UNIT),
conv_unit(as.numeric(VSORRES), from = FROM_UNIT, to = TO_UNIT),
ifelse(VSTESTCD %in% c("DIABP", "SYSBP", "PLUSE", "RESP"),
as.numeric(VSORRES), NA_real_
)
)
) %>%
mutate(
VSGRPID = COLPROT,
VSSTRESN = round(VSSTRESN, digits = 1),
VSSTRESC = as.character(VSSTRESN),
VSDRVFL = case_when(!is.na(FROM_UNIT) & FROM_UNIT != TO_UNIT ~ "Yes")
) %>%
as_tibble() %>%
assert_non_missing(COLPROT, VISCODE, VSTEST)
VS <- VS %>%
derive_usubjid() %>%
assign_studyid_domain(domain = "VS") %>%
assign_visit_attr() %>%
assign_epoch() %>%
derive_blfl_adni(
dm_domain = DM,
tgt_var = "VSBLFL"
) %>%
derive_study_day_adni(
dm_domain = DM,
domain = "VS"
) %>%
derive_seq(
tgt_var = "VSSEQ",
rec_vars = c("USUBJID", "VSTESTCD", "VSGRPID")
) %>%
assign_vars_label(data_dict = vs_data_dic)
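The unit conversion step for weight, height, and temperature can be sketched in isolation. This is a minimal example on hypothetical toy data (`toy_vs`), assuming the `measurements` unit names `lbs`, `inch`, `F`, `kg`, `cm`, and `C` as in the code above; it is not the full VS derivation.

```r
library(dplyr)
library(measurements)

# Hypothetical toy data: original results with source and target units
toy_vs <- tibble(
  VSTESTCD = c("WEIGHT", "HEIGHT", "TEMP", "WEIGHT"),
  VSORRES = c("150", "70", "98.6", "68"),
  FROM_UNIT = c("lbs", "inch", "F", "kg"),
  TO_UNIT = c("kg", "cm", "C", "kg")
)

toy_vs <- toy_vs %>%
  # conv_unit() expects a single `from` and `to` unit, so convert row by row
  rowwise() %>%
  mutate(
    VSSTRESN = conv_unit(as.numeric(VSORRES), from = FROM_UNIT, to = TO_UNIT)
  ) %>%
  ungroup() %>%
  mutate(
    VSSTRESN = round(VSSTRESN, digits = 1),
    VSSTRESC = as.character(VSSTRESN),
    # Flag records whose standardized result was derived by conversion
    VSDRVFL = if_else(FROM_UNIT != TO_UNIT, "Yes", NA_character_)
  )

toy_vs
```

Records already in the standard unit pass through unchanged and are not flagged as derived.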