Data Mining [CS 470]

Taught by https://www.cs.emory.edu/~kshu5/

Introduction

This course covered the foundations and techniques of data mining: how to find patterns, trends, and insights in large datasets. We worked with real-world data to learn how to clean it, explore it, and build models that make predictions or uncover hidden structure.

We learned how algorithms such as decision trees, clustering, and association rule mining work under the hood, and implemented many of them from scratch in Python. We also practiced evaluating models with metrics such as accuracy, precision, and recall, and compared the tradeoffs between different methods.
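To make the evaluation metrics concrete, here is a minimal from-scratch sketch of accuracy, precision, and recall for a binary classifier. It is not taken from the course materials: the function names (`confusion_counts`, `classification_metrics`), the choice of `1` as the positive label, and the toy labels are all assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task.

    `positive` is the label treated as the positive class (assumed to be 1).
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall as a dict.

    A metric whose denominator is zero is reported as 0.0 (a convention
    chosen for this sketch, not the only possible one).
    """
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
    print(classification_metrics(y_true, y_pred))
```

Precision asks how many of the predicted positives were correct, while recall asks how many of the actual positives were found. A model can trade one for the other, which is one reason accuracy alone can be misleading on imbalanced data.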

Assignments

Heart Data Analysis

Apriori Algorithm

Decision Tree Induction

Heart Disease Prediction Project

Resources

https://colab.research.google.com/drive/1XIvxhkd4mvGLCi4O0AsGyyo84GbjQGLK?usp=sharing 1 [PPT 2a]

https://colab.research.google.com/drive/1EfPMAktHNatkaKoj8HdXR5nTyHliJJHu?usp=sharing#scrollTo=w79fgYCus3T_ 2 [PPT 3a]