… Personal projects and class projects …
Tidy Tuesdays
TidyTuesday is a weekly community activity run by the R4DS Online Learning Community. The goal is to help R learners practice in real-world contexts. My contributions and visualizations are also available on my GitHub.
Graduate capstone project:
The relationship between depression and sociodemographic and economic characteristics in New York City adults 2016-2017
- Completed as a graduation requirement at Columbia University in 2019, under the guidance of Lauren Martini at the New York City Department of Health and Mental Hygiene.
An analysis of the NYC Community Health Survey (2016-2017), investigating depression, socioeconomic factors, and discrimination in NYC. Data management and analysis were performed in SAS and SAS-callable SUDAAN, and I produced a poster and a manuscript report.
Dashboard tutorial and shell
- Completed during my internship at the New York City Department of Health and Mental Hygiene.
A recreation of a Tableau dashboard using fake data, aimed at beginner R users, with a tutorial on how to create a Shiny flexdashboard in R.
There was an existing Tableau dashboard about maternal prenatal and postpartum depression screening. Because sharing Tableau dashboards requires the user to download a reader, I was tasked with building a Shiny dashboard that could be shared easily and to which the real data could easily be added. One of the main requests for this project was a flexible, user-friendly way to compare sites, which is why the dashboard offers three options for doing so.
A Geolocation and Sentiment Analysis of Tweets
- Final project for the Data Science 1 course at Columbia University Mailman School of Public Health, in collaboration with Kathryn Addabbo, Peter Batten, Morgan de Ferrante, and Nadiya Pavlishyn.
I collaborated with peers to choose a data set appropriate for the assignment and to discuss ideas for potential analyses. I mapped sentiments and attitudes in US tweets using ggplot in R, and converted this visualization into a Shiny app. I helped collaborators debug their code. I contributed to the project's R Markdown report, including the discussion of US tweets and the overall findings from all collaborators. I also assisted in creating a webpage in R Markdown and a screencast describing the project.
Classification Methods for Predicting Tumor Malignancy
- Final project for the Data Science 2 course at Columbia University Mailman School of Public Health, in collaboration with Peter Batten and Morgan de Ferrante.
I collaborated with peers to find a publicly available data set appropriate for classification. We used the BreastCancer data from the mlbench R package. I cleaned the data by removing NAs and converting factors to numeric values, performed some exploratory analysis, and performed logistic regression analysis in R.
Application and analysis of linear model for predicting hospital length of stay
- Final project for the Biostatistics Methods 1 course at Columbia University Mailman School of Public Health, in collaboration with Manqi Cai, Peng Lin-Lin, and Yixi Xu.
The data set was given to us by the professor. Using R, I ran an exploratory analysis of the cleaned data set to look for possible correlations, outliers, or a need to transform the data. I recoded some variables for interpretability, tidied the data set, and exported it for use in SAS. I performed variable selection in SAS, which required me to independently learn SAS Studio and the relevant code. I then fit a linear model in R. I did some of the literature review for this project, helped collaborators create summary tables, and proofread the majority of the report.
Undergraduate capstone project:
Overview of General Relativity
- Completed at Fairfield University in 2015, under the guidance of Professor Shawn Rafalski.
I worked through the Schwarzschild solution to Einstein’s equations under simplifying assumptions, such as a 2-D space. The resulting equations were then solved with the MATLAB ODE solver to visualize the effects of different initial conditions on the behavior of a test particle.