QTM 151 - Introduction to Statistical Computing II

Lecture 01 - Introduction and Course Overview

Danilo Freire

Department of Quantitative Theory and Methods
Emory University

28 August, 2024

Welcome to QTM 151 - Introduction to Statistical Computing II! 🥳 🎉

Lecture Overview


  • Introduction
  • Motivation
  • Class Logistics
  • Computing Set Up

Course Materials

Course repository: https://github.com/danilofreire/qtm151

Course website: https://danilofreire.github.io/qtm151

This course is hosted on GitHub, where you will find lecture materials, code samples, our discussion space, assignments, and final project instructions. Canvas will be used for course management, including assignment submissions, grades, and announcements. Please familiarise yourself with both platforms, and reach out if you have any questions.

Note

Please remember to check the course website regularly for updates and announcements.

Nice to meet you!

Instructor

A bit about me

Visiting Assistant Professor in the QTM Department

MA from the Graduate Institute Geneva, PhD from King’s College London, Postdoc at Brown University, Senior Lecturer at the University of Lincoln, UK.

Research interests: computational social science, experimental methods, policy evaluation, political violence, organised crime.

My teaching philosophy

What you can expect from me


  • I love teaching and aim to make learning fun
  • Classes where students participate are the best
  • Hands-on activities help you learn better
  • I am always available to help and answer questions. And I mean it
  • Your feedback helps me improve my teaching. Please let me know what is working and what is not

Teaching assistants


  • The teaching assistants for this course will be confirmed soon

  • They will be answering questions during our lectures and holding office hours (see Canvas or the course website for office hours information)

  • They will also be grading your assignments and quizzes (with my oversight)

  • We are all here to help you! So feel free to ask questions during class, office hours, or via email 😊

Office hours: What for and what not for


  • What office hours are meant for:
    • Applying tools in practice
    • Discussion of issues related to the assignments
    • Boosting your knowledge of data science
  • What these sessions are not meant for:
    • Solving the assignments for you
    • Doing the work of developing your coding skills for you

Class etiquette

  • Coding can be tough and push you out of your comfort zone. If the course pace is too fast, let us know. I expect your commitment, but I do not want anyone to fail
  • You are all keen on data science, but your backgrounds vary. That is great! Some sessions might be more engaging than others. If you are bored, help others or explore new data science areas
  • Always be respectful to each other
  • Ask questions whenever you need to!

Why Python and SQL?

Why Python

Great community and easy to learn

There are thousands of Python user groups worldwide

The Python community is very active and welcoming!

  • Java:
public class Welcome {
    public static void main(String[] args) {
        System.out.println("Welcome to QTM151!");
    }
}
  • Python:
print("Welcome to QTM151!")

Why Python

Salaries are good!

Why SQL

  • SQL is the standard language for relational database management systems
  • Easy to learn
  • Standardised
  • Manage huge amounts of data
  • Widely used in industry
  • Great for data analysis

Why SQL

Salaries are good too!

Course Logistics

Course Objectives

  1. Perform basic operations and write functions in Python
  2. Conduct data wrangling and manipulate data using Python libraries such as Pandas
  3. Merge and manage databases using SQL
  4. Create visualisations to effectively communicate data insights
  5. Implement linear models and understand the principles of time series analysis
  6. Use Jupyter Notebooks for reproducible research
  7. Develop problem-solving skills relevant to data analysis and statistical computing

Grades and Late Policy

  • Assignments (x10): 50%
    • Practice class concepts
  • Quizzes (x5): 30%
    • Questions are given in advance
    • Data is provided in the class
  • Final Project: 20%
    • Will provide guidelines on Canvas and GitHub
    • Due at the end of the semester
  • All materials will be available on the course website and GitHub repository
  • Late assignments will automatically be graded for half-credit
  • To account for unforeseen circumstances, we will drop the worst assignment and the worst quiz
  • Watch out for the assignments that ask you to install software. You will need this software to be able to use the lecture notes
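
To see how the weights above combine, here is a minimal sketch in Python. It rests on several assumptions that are not stated on the slide: scores are on a 0-100 scale, items within a category count equally, and "half-credit" means a late score is halved. The function names and the example scores are hypothetical, and the actual calculation on Canvas may differ.

```python
def category_average(scores, drop_lowest=True):
    """Average a list of 0-100 scores, optionally dropping the single lowest."""
    if drop_lowest and len(scores) > 1:
        kept = sorted(scores)[1:]  # discard the worst score
    else:
        kept = list(scores)
    return sum(kept) / len(kept)


def late_score(raw_score):
    """Assumed reading of the late policy: a late submission earns half its score."""
    return raw_score / 2


def final_grade(assignments, quizzes, project):
    """Weighted total: 50% assignments, 30% quizzes, 20% final project."""
    return (
        0.50 * category_average(assignments)
        + 0.30 * category_average(quizzes)
        + 0.20 * project
    )


# Hypothetical scores for illustration only; the last assignment was late
assignments = [90, 80, 100, 70, 85, 95, 60, 88, 92, late_score(80)]
quizzes = [75, 85, 90, 65, 80]
project = 88

print(round(final_grade(assignments, quizzes, project), 2))
```

In this example the late assignment becomes the lowest score and is the one dropped, which shows how the two policies interact under these assumptions.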

Computing Set Up

Our Class in a Nutshell

Installing Python using Anaconda

Anaconda has (almost) all the libraries we need


  • Follow the instructions on our GitHub repository
  • We are using Anaconda virtual environments for this class (I will cover this in more detail next class)
  • For now: Anaconda comes with a Python installation
  • Questions?

Installing VSCode and Connecting Anaconda


  • Follow the instructions in “Installing Visual Studio Code and Connecting it with Anaconda” on GitHub
  • For now: know that “base” is the Anaconda virtual environment that comes by default with the installation
  • Next, check whether the connection between VSCode and Anaconda worked: the “Python: Select Interpreter” option should include Python (“base”)
  • After that: we will create a new folder for the QTM151 course, download our GitHub folder, and open it in VSCode
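
Once an interpreter is selected, one quick way to confirm which Python is actually running is to ask Python itself. The sketch below uses only the standard `sys` module; the helper name `describe_interpreter` is hypothetical. It assumes a default Anaconda installation, in which case the printed paths usually contain a folder such as `anaconda3`, but the exact location depends on your operating system and how you installed it.

```python
import sys


def describe_interpreter():
    """Return the path, version, and install prefix of the running Python."""
    return {
        "executable": sys.executable,       # full path to the Python binary
        "version": sys.version.split()[0],  # version number, e.g. "3.x.y"
        "prefix": sys.prefix,               # root folder of this environment
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the paths point somewhere other than your Anaconda folder, VSCode is probably using a different interpreter, and it is worth revisiting the interpreter selection step.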

Jupyter Notebooks

  • We will use Jupyter Notebooks for our classes. We have a tutorial on how to use them too
  • Jupyter Notebooks are a great way to combine code, text, and visualisations
  • You are encouraged to bring your laptop to class
  • Lecture notes are designed to be follow-along. There will be many “try it yourself” exercises throughout the lectures!

Next Class

  • Don’t worry if you are not able to follow along with the installation instructions. We will have time to do this next class
  • We will start with the basics of Jupyter Notebooks
  • We will also cover the basics of Anaconda and VSCode
  • Please remember to check the course repository, mainly the tutorials: https://github.com/danilofreire/qtm151
  • And please do not forget:
    • Coding ability can be developed
    • Academic skills and abilities are acquired through hard work, mistakes, and perseverance. Coding is no different
    • My only goal here is that you learn the material. Please ask me questions! 😊

Questions?

Thank you very much for your attention! 😃 🥳