We are pleased to announce a new course for Spring 2022: Data Mining and Analysis, taught by Dr. Yang Ni.

This course is an introduction to concepts, methods, and practices in statistical data mining. It provides a broad overview of topics in supervised and unsupervised learning.

Students will learn how and when to apply statistical learning techniques, their comparative strengths and weaknesses, and how to critically evaluate the performance of learning algorithms. Students who successfully complete this course should be able to apply basic statistical learning methods to build predictive models or perform exploratory analysis, and make sense of their findings.

Tentative Topics:

  • Regression and classification
  • Bootstrap and cross-validation
  • Regularization
  • Decision trees, bagging, random forests, boosting
  • Neural networks
  • Clustering
  • Principal component analysis
  • Community detection

Prerequisites:

Familiarity with the R programming language and knowledge of basic multivariate calculus, statistical inference, and linear algebra are expected. Students should be comfortable with the following concepts: probability distribution functions, expectations, conditional distributions, likelihood functions, random samples, estimators, and linear regression models.

Credits: 3

Check out our Courses page for the full list of course offerings.