Package: ukbflow 0.3.4

ukbflow: Streamlined Workflow for UK Biobank Data Extraction, Analysis, and Visualization

Provides an R-native, RAP-aware workflow system for UK Biobank controlled-data analysis on the Research Analysis Platform (RAP). Includes tools for phenotype extraction and decoding, variable derivation, survival and association analysis, genetic risk score computation, audit records, and publication-quality visualization. For details on the UK Biobank resource, see Bycroft et al. (2018) <doi:10.1038/s41586-018-0579-z>.

Authors: Yibin Zhou [aut, cre]

# Install 'ukbflow' in R:
install.packages('ukbflow', repos = c('https://evanbio.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/evanbio/ukbflow/issues

Pkgdown/docs site: https://evanbio.github.io

6.52 score | 2 stars | 19 scripts | 476 downloads | 76 exports | 74 dependencies

Last updated from: 4e0ac9fe0f. Checks: 9 OK. Indexed: yes.

Target | Result | Time
linux-devel-x86_64 | OK | 221
source / vignettes | OK | 237
linux-release-x86_64 | OK | 191
macos-release-arm64 | OK | 189
macos-oldrel-arm64 | OK | 165
windows-devel | OK | 179
windows-release | OK | 191
windows-oldrel | OK | 168
wasm-release | OK | 161

Exports: assoc_competing, assoc_cox, assoc_coxph, assoc_coxph_zph, assoc_fg, assoc_lag, assoc_linear, assoc_lm, assoc_logistic, assoc_logit, assoc_sub, assoc_subgroup, assoc_tr, assoc_trend, assoc_zph, audit_cols, audit_fields, audit_job, audit_model, audit_pheno, audit_snapshot, audit_start, audit_write, auth_list_projects, auth_login, auth_logout, auth_select_project, auth_status, decode_names, decode_values, derive_age, derive_cancer_registry, derive_case, derive_covariate, derive_cut, derive_death_registry, derive_first_occurrence, derive_followup, derive_hes, derive_icd10, derive_missing, derive_selfreport, derive_timing, extract_batch, extract_ls, extract_pheno, fetch_field, fetch_file, fetch_ls, fetch_metadata, fetch_tree, fetch_url, grs_bgen2pgen, grs_check, grs_score, grs_standardize, grs_validate, grs_zscore, job_ls, job_path, job_result, job_status, job_wait, ops_fields, ops_fields_common, ops_na, ops_set_safe_cols, ops_setup, ops_snapshot, ops_snapshot_cols, ops_snapshot_diff, ops_snapshot_remove, ops_toy, ops_withdraw, plot_forest, plot_tableone

Dependencies: backports, base64enc, bigD, bitops, broom, bslib, cachem, cards, cardx, cli, commonmark, cpp11, curl, data.table, digest, dplyr, evaluate, farver, fastmap, fontawesome, forestploter, fs, generics, glue, gridExtra, gt, gtable, gtsummary, highr, htmltools, htmlwidgets, jquerylib, jsonlite, juicyjuice, knitr, labeling, lattice, lifecycle, litedown, magrittr, markdown, Matrix, memoise, mime, pillar, pkgconfig, processx, ps, purrr, R6, rappdirs, RColorBrewer, Rcpp, reactable, reactR, rlang, rmarkdown, sass, scales, stringi, stringr, survival, tibble, tidyr, tidyselect, tinytex, utf8, V8, vctrs, viridisLite, withr, xfun, xml2, yaml

Exploring and Fetching RAP Files
Overview | Prerequisites | Exploring Remote Files | List files and folders | Browse the directory tree | Generating Download URLs | Downloading Files | Single file or folder | Metadata shortcuts | Common Options | Getting Help

Last update: 2026-06-02
Started: 2026-03-06

Supported Phenotype Sources and Current Limitations
Purpose | Supported Sources | Default Reconciliation | Not Currently Supported | Design Principle | Related Articles

Last update: 2026-05-15
Started: 2026-05-15

Get Started with ukbflow
Welcome to ukbflow | Installation | A Quick Taste | Load data | Derive a disease phenotype | Run an association model | Plot the results | Full Function Overview | End-to-End Case Study | Additional Resources

Last update: 2026-05-14
Started: 2026-03-06

Genetic Risk Score (GRS) Analysis on UKB RAP
Overview | Step 1: Validate the Weights File -- grs_check() | Step 2: Convert BGEN to PGEN -- grs_bgen2pgen() | Step 3: Calculate GRS Scores -- grs_score() | Step 4: Standardise GRS Columns -- grs_standardize() | Step 5: Validate Predictive Performance -- grs_validate() | Logistic (cross-sectional) | Cox (survival) | Complete Pipeline Example | Getting Help

Last update: 2026-05-14
Started: 2026-03-06

Smoking and Lung Cancer Risk: A Synthetic Workflow Demonstration
1 Introduction | 2 Data Loading | 3 Decode Column Names | 4 Derive Phenotypes | 5 Exposure Definition | 6 Cohort Assembly | 7 Association Analysis | 8 Visualisation | Getting Help | 9 Session Info | 10 References

Last update: 2026-05-14
Started: 2026-03-10

Association Analysis for UKB Outcomes
Overview | The Three-Model Framework | Step 1: Cox Proportional Hazards | Step 2: Logistic Regression | Step 3: Linear Regression | Step 4: Proportional Hazards Assumption Test | Step 5: Subgroup Analysis | Step 6: Dose-Response Trend Analysis | Step 7: Fine-Gray Competing Risks | Working with Results | Step 8: Lag Sensitivity Analysis | Getting Help

Last update: 2026-05-14
Started: 2026-03-06

Analysis Audit and Reproducibility
Overview | Start an Audit | Record Field IDs | Snapshot Data States | Record Phenotype Summaries | Record Cohort Assembly | Record Model Results | Review and Write the Manifest | Suggested Audit Points

Last update: 2026-05-13
Started: 2026-05-13

Deriving Disease Phenotypes from UKB Data
Overview | Setup | Step 1: Handle Informative Missing Labels | Step 2: Prepare Covariates | Step 3: Bin Continuous Variables | Step 4: Self-Reported Disease | Step 5: HES Inpatient Records | Step 6: First Occurrence Fields | Step 7: Cancer Registry | Step 8: Death Registry | Step 9: Combine Sources with derive_icd10() | Step 10: Final Case Definition | Getting Help

Last update: 2026-05-12
Started: 2026-03-06

Installation Guide
Overview | Quick Install | From CRAN (recommended) | From GitHub (latest development version) | System Requirements | Dependencies | Core Dependencies | Analysis Dependencies | Visualization Dependencies | Optional Dependencies | Install dxpy (Local Mode Only) | Authentication Setup | Local → RAP | RAP → RAP | Verify Installation | Update ukbflow | From GitHub | From CRAN | Troubleshooting | dx not found | Token expired or session lost | Installation fails on Windows | Network / Firewall issues | Uninstall | Getting Help | Next Steps

Last update: 2026-04-07
Started: 2026-03-06

Authentication and Project Setup
Overview | Obtaining an API Token | Storing Your Token Securely | Logging In | Local → RAP | RAP → RAP | Checking Authentication Status | Listing Available Projects | Selecting a Project | Logging Out | Token Expiry | Full Local → RAP Workflow | Getting Help

Last update: 2026-04-01
Started: 2026-03-06

Decoding UKB Column Names and Values
Overview | Recommended Workflow | Step 1: Decode Values | What gets decoded | Step 2: Decode Names | Name conversion examples | Long names | Getting Help

Last update: 2026-04-01
Started: 2026-03-06

Operational Utilities: Setup, Diagnostics, and Pipeline Tracking
Overview | ops_setup() — Environment Health Check | ops_toy() — Synthetic UKB Data | Cohort scenario | Forest scenario | Reproducibility | ops_na() — Missing Value Diagnostics | Controlling CLI output with threshold | Programmatic use | ops_snapshot() — Pipeline Checkpoints | Recording snapshots | Viewing the full history | Silent recording | Resetting history | Snapshot Helpers | ops_snapshot_cols() — column names at a checkpoint | ops_snapshot_diff() — compare two checkpoints | ops_snapshot_remove() — drop raw columns after deriving | ops_set_safe_cols() — register study-specific protected columns | ops_withdraw() — Exclude Withdrawn Participants | Typical Workflow | Getting Help

Last update: 2026-04-01
Started: 2026-03-09

Publication-Ready Visualisation
Overview | plot_forest() — Forest Plot | Minimal example | Building the input data frame from assoc_*() results | Key parameters | plot_tableone() — Baseline Characteristics Table | With SMD, custom labels, and export | Getting Help

Last update: 2026-04-01
Started: 2026-03-06

Survival Analysis Setup for UKB Outcomes
Overview | Step 1: Classify Timing — Prevalent vs. Incident | Step 2: Age at Event | Step 3: Follow-Up Time | Prevalent cases receive NA follow-up time | Auto-detection of death and lost-to-follow-up columns | Full Survival-Ready Pipeline | Getting Help

Last update: 2026-04-01
Started: 2026-03-06

Monitoring and Retrieving Extraction Jobs
Overview | Typical Workflow | Monitoring a Job | Check status | Wait for completion | Retrieving Results | Get the file path | Load into R | Browsing Job History | Getting Help

Last update: 2026-03-23
Started: 2026-03-06

Extracting Phenotype Data
Overview | Prerequisites | Step 1: Browse Available Fields | Step 2: Extract Data | Recommended: extract_batch() | Instance type | Quick inspection: extract_pheno() | A Note on Column Names | Getting Help

Last update: 2026-03-22
Started: 2026-03-06

Readme and manuals

Help Manual

Help page | Topics
Fine-Gray competing risks association analysis | assoc_competing, assoc_fg
Cox proportional hazards association analysis | assoc_cox, assoc_coxph
Proportional hazards assumption test for Cox regression | assoc_coxph_zph, assoc_zph
Cox regression lag sensitivity analysis | assoc_lag
Linear regression association analysis | assoc_linear, assoc_lm
Logistic regression association analysis | assoc_logistic, assoc_logit
Subgroup association analysis with optional interaction test | assoc_sub, assoc_subgroup
Dose-response trend analysis | assoc_tr, assoc_trend
Retrieve column names from an audit snapshot | audit_cols
Record UKB field IDs used for extraction | audit_fields
Record a DNAnexus job in an audit manifest | audit_job
Record an association model result | audit_model
Record a derived phenotype audit summary | audit_pheno
Record a data snapshot in a ukbflow audit object | audit_snapshot
Start a ukbflow audit record | audit_start
Write a ukbflow audit manifest | audit_write
List available DNAnexus projects | auth_list_projects
Login to DNAnexus with a token | auth_login
Logout from DNAnexus | auth_logout
Select a DNAnexus project | auth_select_project
Check current DNAnexus authentication status | auth_status
Rename UKB field ID columns to human-readable snake_case names | decode_names
Decode UKB categorical column values using Showcase metadata | decode_values
Compute age at event for one or more UKB outcomes | derive_age
Derive a binary disease flag from UKB cancer registry | derive_cancer_registry
Combine self-report and ICD-10 sources into a unified case definition | derive_case
Prepare UKB covariates for analysis | derive_covariate
Cut a continuous UKB variable into quantile-based or custom groups | derive_cut
Derive a binary disease flag from UKB death registry | derive_death_registry
Derive a binary disease flag from UKB First Occurrence fields | derive_first_occurrence
Compute follow-up end date and follow-up time for survival analysis | derive_followup
Derive a binary disease flag from UKB HES inpatient diagnoses | derive_hes
Derive a unified ICD-10 disease flag across multiple UKB data sources | derive_icd10
Handle informative missing labels in UKB decoded data | derive_missing
Define a self-reported phenotype from UKB touchscreen data | derive_selfreport
Classify disease timing relative to UKB baseline assessment | derive_timing
Submit a large-scale phenotype extraction job via table-exporter | extract_batch
List all approved fields in the UKB dataset | extract_ls
Extract phenotype data from a UKB dataset | extract_pheno
Download the UKB field dictionary file | fetch_field
Download a file from RAP project storage | fetch_file
List files and folders at a remote RAP path | fetch_ls
Download the Showcase metadata folder | fetch_metadata
Print a remote RAP directory tree | fetch_tree
Get pre-authenticated download URL(s) for a remote RAP file or folder | fetch_url
Convert UKB imputed BGEN files to PGEN on RAP | grs_bgen2pgen
Check and export a GRS weights file | grs_check
Calculate genetic risk scores from PGEN files on RAP | grs_score
Standardise GRS columns by Z-score transformation | grs_standardize, grs_zscore
Validate GRS predictive performance | grs_validate
List recent DNAnexus jobs in the current project | job_ls
Get the RAP file path of a completed DNAnexus job output | job_path
Load the result of a completed DNAnexus job into R | job_result
Check the current state of a DNAnexus job | job_status
Wait for a DNAnexus job to finish | job_wait
Search approved UKB fields in the current project | ops_fields
Common UK Biobank fields for quick reference | ops_fields_common
Summarise missing values by column | ops_na
Register additional safe columns protected from snapshot-based drops | ops_set_safe_cols
Check the ukbflow operating environment | ops_setup
Record and review dataset pipeline snapshots | ops_snapshot
Retrieve column names recorded at a snapshot | ops_snapshot_cols
Compare column names between two snapshots | ops_snapshot_diff
Remove raw source columns recorded at a snapshot | ops_snapshot_remove
Generate toy UKB-like data for testing and development | ops_toy
Exclude withdrawn participants from a dataset | ops_withdraw
Publication-ready forest plot | plot_forest
Publication-ready Table 1 (Baseline Characteristics) | plot_tableone