PhD Year 2 In Review

The second year of this PhD was strange. I don’t feel like I was particularly productive and I’m not sure if I can disentangle how much was pandemic and how much was regular un-productivity.

Things I accomplished by the end of August 2021:

  1. Published manuscript on opioid overdose
  2. Built
  3. Conducted a usability study and submitted a manuscript on
  4. Got into the Biodesign program at UCLA (after getting rejected from tons of other things)
  5. Did initial analysis and helped with data related planning of Depression Grand Challenge + Apple study
  6. Wrote a draft of my specific aims which I am now basically fully rewriting because I feel like I didn’t know enough about the field I was in theory focused on researching
  7. Took and passed 6 courses, maintaining a 4.0 GPA (which means much, much less than in undergrad, but I never got a 4.0 in any quarter of undergrad, so it was a goal of mine to have one in grad school)
  8. Reached a state where I more or less know what I want to study
  9. Helped organize and presented at a series covering bias in medicine over last summer in my program
  10. Built a web-scraper for NEMSIS opioid overdose data

Things I thought I’d have done by the end of August:

  1. Gained a deep understanding of the literature in my specific field
  2. Written a full draft of my prospectus
  3. Assembled a committee
  4. Applied to a conference
  5. Been in the process of publishing a NEMSIS/CDC overdose-related manuscript
  6. Made progress towards a publication in my desired specific area

It has truly taken me by surprise just how nebulous the task of writing a dissertation proposal is. I went from broad ideals (doing good in the world) to specific details (e.g., describing a particular implementation of an auto-encoder as part of an experiment). And in the middle I was just kind of floundering. That being said, part of me really loved just sitting there with my only task being to “figure out what I want to do”. Another part of me was anxious I was going down a hole with no end, and a third part was resisting the temptation to stay in a state of perpetual distraction. But I feel good about where I am now.

Below are the notes I took to capture my experience of each quarter as I was experiencing it. My primary task from last summer into Fall 2020 was a literature search of the mobile health field, since I had a vague interest there. Other than that I was doing classes and side projects. In the Summer of 2021 I was going deep on getting my dissertation proposal done.

Also at some point we cared for an abandoned husky named Maple that we re-homed after a couple of months of looking for the original owners.


Fall 2020

Class: Biostats 201A

I thought this class would be ultra advanced statistics, but it seemed more like an intro course that assumes you know fundamentals like t-tests and z-tests and such. I’m by no means an expert, but having the ability to run the aov command in R made this course much easier. The professor recorded lectures, discussions, and labs, so I ended up not attending as often and watching playbacks at 2x.

Issues of social justice and systemic inequalities really stuck as the problem I wanted to work on after the Summer. Started focusing more on social determinants of health, especially after reading “A call for social informatics”.


Been trying to find internships. Got rejected even with referrals from Verily and Google.

Got involved in a project to create a website visualization for two other students who had just submitted a manuscript related to comparing global COVID-19 mortality forecasting groups. A member of the WHO reached out to us on twitter to collaborate and we got a good overall reception.

In December 2020 starting to put together a protocol to have public health people use our site for a usability study.

CDC Overdose Data

Joe reached out about seeing if we can do some math to get monthly overdose data from the CDC’s aggregated data. It worked and within the span of a week we are going from idea → implementation → submission to Science.
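These notes don’t spell out what the math was. Purely as an illustration, and assuming the aggregated data are 12-month rolling totals with one earlier year of known monthly counts, monthly values can be recovered by differencing consecutive totals. The function name and the numbers below are made up for the sketch and are not the published method.

```python
def monthly_from_rolling(rolling_totals, baseline_months):
    """Recover monthly counts from 12-month rolling totals.

    Assumptions (hypothetical, for illustration only):
    - rolling_totals[i] is the sum of the 12 months ending at month i,
      and rolling_totals[0] covers exactly the baseline year.
    - baseline_months holds the 12 known monthly counts of that year.
    """
    if len(baseline_months) != 12:
        raise ValueError("need exactly 12 baseline months")
    if sum(baseline_months) != rolling_totals[0]:
        raise ValueError("baseline months must sum to the first rolling total")

    months = list(baseline_months)
    for prev, cur in zip(rolling_totals, rolling_totals[1:]):
        # Sliding the window by one month adds the new month and drops
        # the month from 12 months earlier, so:
        #   new_month = cur - prev + month_12_months_earlier
        months.append(cur - prev + months[-12])
    return months[12:]  # only the newly recovered months


# Toy data: a flat baseline year, then three new rolling totals.
recovered = monthly_from_rolling([120, 122, 127, 126], [10] * 12)
```

The obvious weakness of this approach is that any error in the baseline months carries forward into every recovered month.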

Winter 2021

Update Times:

Jan 15 2021

March 2 2021

Class: HLT POL 214

This class covers Health Related Quality of Life (HRQoL) measures. The professor, Ron D. Hays, worked on many of the key milestones in the field, which is wild. Our readings all have his last name somewhere in the list. This is probably the first class where I felt like the professor is the best possible person to teach it.

It is interesting seeing what goes into a questionnaire and its scoring. There is a lot of statistics involved in creating things like “preference based” scores, and it is cool to see how things like Item Response Theory have evolved the field. Lots of practical advice on how to score questionnaires, administer them, research them, etc.

Class: BIOSTAT 201B

This class is a continuation of 201A but covers things in more depth. We are learning about Generalized Linear Models (GLMs) and it is crazy how useful it is to have this knowledge taught from a stats and not a machine learning perspective. The idea of maximum likelihood estimation made 0 sense before. I also now know what a bootstrap really is.
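For anyone else who was fuzzy on it, here is a minimal sketch of a percentile bootstrap in plain Python (not from the course materials; the function name, defaults, and sample values are my own assumptions). The idea is to resample the data with replacement many times, recompute the statistic each time, and read a confidence interval off the spread of those estimates.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000,
                 alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` of `sample`."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    n = len(sample)
    # Each resample draws n values from the sample *with replacement*.
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the estimates.
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up measurements, just to show the call.
data = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 3.9]
low, high = bootstrap_ci(data)
```

The percentile interval is the simplest variant; other bootstrap intervals exist and can behave better with small or skewed samples.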

The professor, Dr. Catherine Sugar, teaches well and puts lots of time and effort into her notes. It is very useful for me to just look at the notes and transcribe them as studying. Frequent homeworks and 3 tests instead of a midterm/final structure. Definitely geared towards public health students, so no support for R-based analysis, only Stata and SAS.

Internship Hunt

  • Rejected from those that responded

Biodesign fellowship full day interview I think went well. Waiting on results but was a time.

Got into Biodesign, cancelling internship hunt, huzzah 🎉

Still pushing the IRB through, got a web IRB account, likely exempt study

IRB still not approved. I did present to a classroom as part of Skype-a-Scientist which went really well.

DGC Depression Analysis

Had meetings to think about how to do initial analyses of data and how to store and structure databases. Beginning to look at REDCap data. Right now looking at EMA vs clinical assessments for depression. Sensor data will take some time to get.

Sensor data almost ready. Looked at predicting CAT-MH from EMA, results look promising. However, not many subjects enter or exit depression, making predictive utility hard to assess. Beginning to use findings to help guide planning for the larger Phase 2 3k participant study. I want to push for EMAs to be based on some instruments, for more thought to be given to event-triggered EMAs, and for scheduled EMAs to be replaced with weekly-ish surveys. I also REALLY think we need more lower income participants and will push for that.

Note: I think this EMA basis push was my attempt to apply the class I was taking to my research


I’m apparently leading the only student-involved part of recruitment on Feb 9th. Asked for help from other students and now 7 of us will lead a panel and answer questions.

The applicant-student meeting went really well. I’ve talked to 2 of the accepted students for follow ups and they seem really cool.

CDC Overdose work

  • Submitted to Nature and Science and got rejected
  • Submitted to Addiction and rejected
  • Accepted for publication to AJPH!

Now trying to automate data scraping from the NEMSIS data cube using Selenium + Python. Hopefully that will help the states that are hesitant to allow for an API realize the utility of being able to do daily scraping.

Spring 2021

Update Times:

May 3 2021

May 30 2021

Life Context

  • COVID still happening
  • Been between SF, Santa Clara, and LA
  • Vaccinated!

  • been in LA for 4 weeks


HPM 253 - Health Systems Transformation

This class is taught by Neil Halfon and Peter Long. Peter is the Chief Transformation Officer at Blue Shield of California (a non-profit health insurer). They have brought in several guest speakers, but most classes have been led by the two of them. This course has been really eye opening. The premise is to learn about current health systems, what’s wrong, frameworks to make change, and how to implement them. The class project is to think of and flesh out a plan for health system transformation in teams of 2-3. Our project is related to digital health monitoring for real-time assessment of community (ZIP code level) mental health for resource allocation and intervention result tracking.

One of the biggest things I’ve taken so far from the course is the three horizons framework, which in brief looks at 3 timelines at once to tackle large problems. The first is the status quo, the second is short-term innovation like startups and such, and the third is the long-term vision and future dominant system. The key point I like is that in the second horizon (startups, innovation, etc.) there is a distinction between efforts that simply prop up the status quo and those that actively push towards the third-horizon goal.

We’ve been working on our class project a lot recently and just submitted the final paper. Our project is really interesting and has me really considering what the applications of my research could be. Our initiative is called the California Adolescent Mental Health Initiative (CAMI). Adolescents (high schoolers) will be taught in class how to interpret the data their wearable device (initially just a smartphone) collects about their mental health. The data, aggregated at the school level, will then be used to assess whether the program has helped mental health and to monitor the effect of other interventions on mental health. We interviewed 3 people: Dr. Elizabeth Ozer from UCSF, Dr. Mitchell Wong from UCLA, and Sonya Harris from the California Department of Public Health. Fascinating interviews, and we followed up with Dr. Ozer and Harris for some future work.

STATS 256 - Causality

This class, taught by Chad Hazlett, brings together the Rubin & Neyman potential outcomes framework and the Pearl causal model. We started by learning about potential outcomes, then were introduced to structural causal models and DAGs (directed acyclic graphs). The class has definitely begun to clarify what one means by “Causal”. The more advanced methods and talks I tend to get lost on → hard for me to see utility in knowing methods that feel esoteric and make a lot of pretty strict assumptions that are unverifiable. It’s interesting how the prediction space of ML pays no attention to the types of constraints and statistical guarantees that the causal literature looks into.

I’ve legitimately retained nothing from the last 4 weeks of class. I’ve come to the conclusion that the material is too math-y and the assumptions too unverifiable for me to feel engaged. Not that I shouldn’t pay attention, it’s just hard for me when things aren’t either simple or tangible. We’ve entered the realm of proofs and statistical theory. Our project is to give an hour-long lecture on an advanced causal literature topic; ours is dynamic treatment regimes. I’m lost and confused.

MIMGC 134 - Ethics

This course is a program requirement and is structured interestingly. There is about a 30-45 minute lecture at the beginning by either the lead professor or a guest, and then the remaining time of the 2 hours is spent in breakout rooms of ~15 students with 1 guest faculty (rotating). We discuss 5 cases and hypothetical scenarios with 1 student assigned to 1 perspective of the case each time. I have learned things and the discussions are interesting. Some areas just don’t apply to me. Hard to stay motivated in a required course that is 2 hours. Although to be fair it’s low work and only 2 hours a week.

I actually liked this course overall. I’m glad I took it, learned about some interesting ethical scenarios even if not all were applicable. Broadly happy it exists and is required.



I’ve spent so long trying to figure out what my actual research direction will be. I finally wrote a couple drafts of a specific aims page, and am now working on experiment planning. I thought it would take a week after the SA page but no, this is taking many weeks and is surprisingly hard.

Gotten some solid experimental ideas down, need more, going a bit too slow though, I am supposed to have a draft of my prospectus in like a month 😫 but I’m so behind. I think 1 or 2 more experiments planned out should be enough. Excited about the heart rate variability stuff, need to assess if we have enough data for it.

We’ve gotten through 7 of 8 (minimum) participants in the usability study. So far results looking promising. We got approved through exemption from the IRB. Pretty happy with how things are proceeding. Excited to wrap up, write a manuscript, and hopefully have my first first-author publication.

I built a very simple ensemble of model predictions and toyed around with writing a paper. After spending a good amount of time on it and realizing how much more it would take to have enough results/validation to write a useful paper, I have decided against publishing on the ensemble.
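The notes don’t record how the ensemble was put together. As a rough sketch of what a “very simple” one can look like, assuming each model supplies a point forecast per target date, you can take the median across models for each date. The function name, model names, and numbers here are all invented for illustration.

```python
from statistics import median


def ensemble_forecast(forecasts):
    """Combine point forecasts by taking the per-date median.

    forecasts: dict of model name -> dict of target date -> value.
    Dates that only some models cover use whichever models report them.
    """
    by_date = {}
    for predictions in forecasts.values():
        for date, value in predictions.items():
            by_date.setdefault(date, []).append(value)
    # ISO date strings sort chronologically, so sorting keys is enough.
    return {date: median(values) for date, values in sorted(by_date.items())}


# Hypothetical forecasts from three made-up models.
combined = ensemble_forecast({
    "model_a": {"2021-01-01": 100, "2021-01-08": 120},
    "model_b": {"2021-01-01": 110, "2021-01-08": 140},
    "model_c": {"2021-01-01": 130},
})
```

A median is less sensitive to one wildly off model than a mean, but either choice still needs validation against held-out outcomes before it is worth writing up, which is the part that takes the time.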

Finished all 8 interviews. Will start writing paper next week I think.

DGC Depression Work

This quarter we’ve had a series of meetings with a lot of key faculty and students to help guide the final phase. There are several key questions with different groups interested in different pieces. I finished initial analysis of the EMA related data and presented that to Dr. Loes Olde-Loohuis’ lab in April. No project updates for a couple of weeks as of May 4th.

Still no project updates from me as of May 30th.

CDC Overdose Work

Joe and I published our work in AJPH on April 15th

The American Journal of Public Health (AJPH) from the American Public Health Association (APHA) publications

I built most of the web scraper to get data from NEMSIS that will be the basis of a web tool to nowcast overdose.

Has been on pause.

Summer 2021

update 1: 06/30/2021

Looking for housing hardcore this week, definitely delaying research by a couple of days, but has to be done. Excited to move in with Sharvari by end of July.



I’ve written a rough rough draft of the research strategy page that Alex read and gave me feedback on. Now starting to write/put together a presentation on the background of my research to get feedback from lab meeting in 2 weeks.

DGC Study

Haven’t done much actual research. As my research strategy solidifies I’m starting to use the data for investigation of heart rate variability related things. The DIG1 (phase 1) is finishing up and data is becoming available slowly. Alex mentioned I should write and disclose my research plan ASAP, otherwise if Apple just hears some ideas they may take and license them.

Wrote draft 1 and 2 of manuscript, abstract, and cover letter. Writing cover letters is weird. Really have to sell yourself to the editors, it’s a skill I will need to develop. Main other part of covidcompare is migrating to the UCLA health servers. It’s going to be a lot of work making things function with Puppet and all that. Glad it’s on Linux machines with MariaDB, so limited tech differences. Just infrastructure adjusting and stable system building.

FORE Grant

Through Joe, Alex and I have also now been added as co-investigators on the FORE grant being submitted by a new faculty member, Dr. Chelsea Shover. I’ve acted as like a website technical consultant and have now been on meetings with non-profits and Google to help out where I can. It’s really interesting work, and the center that Dr. Shover makes will use the data Joe and I are scraping from NEMSIS as part of their display.

Overdose mortality

On a similar note, nothing has happened recently in terms of work on overdose mortality. Kinda just waiting on this one.