Programming for Data Science

Data science has evolved significantly in the past decade as computer science and statistics have converged. As statisticians, we are trained to understand data through statistical and mathematical models. However, by an often-cited estimate, data scientists can spend up to 80% of their time pulling, organizing, and cleaning data, work that requires a basic understanding of databases. Furthermore, machine learning positions increasingly emphasize software engineering over math and theory. Many data science interviews include questions about algorithms, data structures, and databases, topics that a statistics curriculum typically does not cover.

In this workshop, we will explore how data science has evolved over the past decade and how working in industry differs from working in academia. We will also cover the engineering side of the field and teach the programming fundamentals needed for a career in data science, fundamentals that are often tested in data science and machine learning interviews. The workshop will use SQL and Python.