
Computational Biology & Data Science Intern

Institute for Systems Biology · Seattle, WA · Jun–Aug 2021

Stack: Python · NumPy · Pandas · Microbiome data
Location: Seattle, WA
Dates: Jun–Aug 2021

I spent the summer at the Institute for Systems Biology doing computational analysis of microbiome data from the American Gut Project, a large, open dataset of stool-sample 16S rRNA sequencing paired with extensive lifestyle and dietary survey data.

The core question was whether and how lifestyle factors (diet, exercise, sleep, medication history) correlate with microbial diversity and community composition. I built pipelines in Python with NumPy and Pandas to clean the survey and sequencing data, compute diversity metrics across cohorts, and run group comparisons to identify which lifestyle factors are most strongly associated with differences in microbial community structure.
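A minimal sketch of what the diversity and group-comparison steps could look like; it is illustrative, not the original pipeline code. It assumes the Shannon index as the diversity metric and a permutation test on group means as the comparison, neither of which is specified above. The sample IDs, taxon counts, and the `exercises_regularly` label are made-up toy values.

```python
import numpy as np
import pandas as pd


def shannon_diversity(counts: pd.DataFrame) -> pd.Series:
    """Shannon index (natural log) per sample; rows are samples, columns are taxa."""
    totals = counts.sum(axis=1)
    p = counts.div(totals, axis=0).to_numpy(dtype=float)
    # Treat 0 * log(0) as 0 by only taking logs of positive proportions.
    logp = np.zeros_like(p)
    np.log(p, out=logp, where=p > 0)
    return pd.Series(-(p * logp).sum(axis=1), index=counts.index, name="shannon")


def permutation_test(values, labels, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    rng = np.random.default_rng(seed)
    observed = values[labels].mean() - values[~labels].mean()
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(labels)  # shuffle group membership
        diff = values[shuffled].mean() - values[~shuffled].mean()
        hits += int(abs(diff) >= abs(observed))
    # Add-one correction keeps the p-value strictly above zero.
    return float(observed), (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): four samples, four taxa.
toy_counts = pd.DataFrame(
    [[10, 10, 10, 10], [40, 0, 0, 0], [20, 15, 5, 0], [30, 5, 3, 2]],
    index=["s1", "s2", "s3", "s4"],
    columns=["taxon_a", "taxon_b", "taxon_c", "taxon_d"],
)
exercises_regularly = np.array([True, False, True, False])  # hypothetical survey answer

shannon = shannon_diversity(toy_counts)
observed_diff, p_value = permutation_test(shannon, exercises_regularly)
```

With only four toy samples the p-value is not meaningful; the point is the shape of the computation. A real analysis would also need to account for confounders and multiple comparisons across survey variables.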

This was my first real exposure to systems-biology research and to the messy reality of biological survey datasets. Much of the work was upstream data cleaning and working out which signals were real and which were confounded by sampling artifacts.
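As a rough illustration of that kind of upstream cleaning, the sketch below normalizes free-text survey answers and drops samples with too few reads before any comparison is made. It is an assumption-laden example, not the code used during the internship: the placeholder strings in `MISSING_TOKENS`, the `diet_type` column, and the `min_reads` threshold are all hypothetical, and read depth is only one of several possible sampling artifacts.

```python
import pandas as pd

# Hypothetical placeholder answers; the real survey's tokens may differ.
MISSING_TOKENS = ["Unspecified", "Not provided", "unknown", ""]


def clean_survey(survey: pd.DataFrame) -> pd.DataFrame:
    """Strip whitespace from string answers and turn placeholder answers into NaN."""
    cleaned = survey.copy()
    for col in cleaned.columns:
        cleaned[col] = cleaned[col].map(
            lambda v: v.strip() if isinstance(v, str) else v
        )
    # mask() replaces every cell that matches a placeholder with a missing value.
    return cleaned.mask(cleaned.isin(MISSING_TOKENS))


def drop_shallow_samples(counts: pd.DataFrame, min_reads: int) -> pd.DataFrame:
    """Keep only samples (rows) whose total read count is at least min_reads."""
    depth = counts.sum(axis=1)
    return counts.loc[depth >= min_reads]


# Toy data (hypothetical): three samples, one survey column, two taxa.
toy_survey = pd.DataFrame(
    {"diet_type": [" Omnivore", "Unspecified", "Vegan "]},
    index=["s1", "s2", "s3"],
)
toy_reads = pd.DataFrame(
    {"taxon_a": [600, 30, 1500], "taxon_b": [400, 20, 500]},
    index=["s1", "s2", "s3"],
)

cleaned_survey = clean_survey(toy_survey)
deep_enough = drop_shallow_samples(toy_reads, min_reads=500)  # threshold is arbitrary here
# Align survey rows to the samples that survived the depth filter.
survey_kept = cleaned_survey.loc[deep_enough.index]
```

Filtering on depth before computing diversity matters because very shallow samples tend to look less diverse simply because fewer reads were observed.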

Highlights

  • 01 Computational analysis of American Gut Project microbiome data
  • 02 Pipelines in Python / NumPy / Pandas for diversity metrics and group comparisons
  • 03 Surfaced links between lifestyle factors and microbial community composition