For much of the last century, power companies and coal-fired plants across the U.S. have dumped coal ash into landfills and ponds without regard for the toxic contaminants that leak into groundwater, posing devastating health risks to the public, including cancer, neurological impairment in children, and reproductive defects.

Tony Ni, Jose Lopez, and I set out, under the mentorship of Dr. Rachel Nethery and graduate student Luli Zou, to investigate and address the prevalence of coal ash contamination among upgradient groundwater wells in Illinois through exploratory statistical analysis and classification- and clustering-based machine learning approaches.

By running a K-means clustering algorithm on averages computed from a large volume of sampled groundwater data, we found that high levels of coal ash contamination can be identified in this way and ought to be addressed. My team and I had the opportunity to give the following (virtual) presentation on our work on this environmental research project at Harvard’s annual “Pipelines into Biostatistics Symposium”.