Lectures
The course covers the following topics, with corresponding lecture materials available in the lectures folder. Please refer to the syllabus for additional suggested readings on each topic. Links will be added as the materials are posted.
Module 01: Introduction
August 27, 2025:
- Syllabus and course repository: https://github.com/danilofreire/qtm350.
- Lecture 01: Welcome to QTM 350 - Introduction.
- Course Tutorials: How to Install Anaconda, Jupyter, PostgreSQL, VSCode, and Open a Free Educational Account on GitHub.
Suggested references:
- Cleveland, W. S. (2001). Data science: An action plan for expanding the technical areas of the field of statistics. International Statistical Review, 69(1), 21-26.
- Donoho, D. (2017). 50 Years of Data Science. Journal of Computational and Graphical Statistics, 26(4), 745-766.
- Breiman, L. (2001). Statistical Modeling: The Two Cultures (with Comments and a Rejoinder by the Author). Statistical Science, 16(3), 199-231.
- Brady, H. E. (2019). The Challenge of Big Data and Data Science. Annual Review of Political Science, 22(1), 297-323.
- Zitnik, M., Nguyen, F., Wang, B., Leskovec, J., Goldenberg, A., & Hoffman, M. M. (2019). Machine Learning for Integrating Data in Biology and Medicine: Principles, Practice, and Opportunities. Information Fusion, 50, 71-91.
September 3, 2025:
- Lecture 02: Computational Literacy.
- Assignment 01: Problem Set 01.
Suggested references:
- Campbell-Kelly, M., Aspray, W. F., Yost, J. R., Tinn, H., & Díaz, G. C. (2023). Computer: A History of the Information Machine. Routledge.
- Shalf, J. (2020). The Future of Computing beyond Moore’s Law. Philosophical Transactions of the Royal Society A, 378(2166), 20190061.
- Al-Hashimi, H. M. (2023). Turing, von Neumann, and The Computational Architecture of Biological Machines. Proceedings of the National Academy of Sciences, 120(25), e2220022120.
- Wing, J. M. (2006). Computational Thinking. Communications of the ACM, 49(3), 33-35.
- Videos: David J. Malan - Abstraction, Khan Academy - Hexadecimal Number System, Matthias Wandel - Marble Adding Machine, Crash Course - Early Computing and Electronic Computing (the last two are quite entertaining!).
Module 02: Introduction to the Command-Line Interface and Version Control
September 8, 2025:
Suggested references:
- Janssens, J. (2021). Data Science at the Command Line: Obtain, Scrub, Explore, and Model Data with Unix Power Tools (2nd ed.). O’Reilly Media.
- Levy, J. (2024). The Art of Command Line. GitHub.
- Shotts, W. (2019). The Linux Command Line: A Complete Introduction. No Starch Press.
- Healy, K. (2019). The Plain Person’s Guide to Plain Text Social Science. Chapters 1-5.
September 10, 2025:
- Lecture 04: More Command Line Tools, Text Files and Scripting.
- Assignment 01 due (5%).
- Assignment 02: Problem Set 02.
- Kahoot Quiz.
Suggested references:
- Kerr, D. (2024). Effective Shell.
- Irianto, I. (2021). Learn Vim (the Smart Way).
- Neil, D. (2015). Practical Vim: Edit Text at the Speed of Thought. Pragmatic Bookshelf.
- Dennis, J. Your problem with Vim is that you don’t grok vi. (Stack Overflow).
- Vim Adventures. (Instructor’s note: this is a fun, albeit cringy, way to learn Vim).
- Videos: freeCodeCamp - Command line crash course, Percy Grunwald - Absolute beginner guide to the macOS terminal, NetworkChuck - 50 macOS tips and tricks using terminal.
September 15, 2025:
- Lecture 05: Version control with Git and GitHub.
- Kahoot Quiz.
Suggested references:
- Chacon, S. and Straub, B. (2014). Pro Git. Apress. (Instructor’s note: this is the book on Git).
- GitHub tutorials: GitHub skills (recommended), Git guides, GitHub learning lab, Best practices for repositories.
September 17, 2025:
- Lecture 06: More Git and GitHub: pull requests, issues, pages, and collaboration features.
- Kahoot Quiz.
- Assignment 02 due (5%).
- Assignment 03: Problem Set 03.
Suggested references:
- Perez-Riverol, Y., Gatto, L., Wang, R., Sachsenberg, T., Uszkoreit, J., Leprevost, F. da V., Fufezan, C., Ternent, T., Eglen, S. J., Katz, D. S., Pollard, T. J., Konovalov, A., Flight, R. M., Blin, K., & Vizcaíno, J. A. (2016). Ten Simple Rules for Taking Advantage of Git and GitHub. PLOS Computational Biology, 12(7), e1004947.
- Beckman, M. D., Çetinkaya-Rundel, M., Horton, N. J., Rundel, C. W., Sullivan, A. J., & Tackett, M. (2021). Implementing version control with git and GitHub as a learning objective in statistics and data science courses. Journal of Statistics and Data Science Education, 29(sup1), S132-S144.
- Escamilla, E., Klein, M., Cooper, T., Rampin, V., Weigle, M. C., & Nelson, M. L. (2022). The Rise of GitHub in Scholarly Publications. arXiv preprint arXiv:2208.04895.
September 22, 2025:
- Lecture 07: Git and GitHub Continued.
- Kahoot Quiz.
Module 03: Literate Programming with Markdown, Quarto, and Jupyter
September 24, 2025:
- Lecture 08: Quiz 01: Git and GitHub (6%).
- Assignment 03 due (5%).
- Assignment 04: Problem Set 04.
Suggested references:
- Quarto official website.
- Awesome Quarto: https://github.com/mcanouil/awesome-quarto. Note: this repository contains dozens of tutorials, examples, and resources.
- Çetinkaya-Rundel, M. & Lowndes, J. S. (2022). Keynote talk: Hello Quarto: Share • Collaborate • Teach • Reimagine. Slides and source code. This is one of the nicest Quarto presentations I have seen.
- Getting Started with Quarto (YouTube). Note: Posit (formerly RStudio) has a series of tutorials on Quarto on their YouTube channel. You can find their playlist here.
- Markdown Guide.
- Jupyter Notebooks Documentation.
- Codecademy - How to use Jupyter Notebooks.
- Course tutorial: Jupyter and Markdown.
September 29, 2025:
- Lecture 09: Introduction to Quarto.
Suggested references:
- Quarto official website.
- Awesome Quarto: https://github.com/mcanouil/awesome-quarto. Note: this repository contains dozens of tutorials, examples, and resources.
- Çetinkaya-Rundel, M. & Lowndes, J. S. (2022). Keynote talk: Hello Quarto: Share • Collaborate • Teach • Reimagine. Slides and source code. This is one of the nicest Quarto presentations I have seen.
- Getting Started with Quarto (YouTube). Note: Posit (formerly RStudio) has a series of tutorials on Quarto on their YouTube channel. You can find their playlist here.
October 1, 2025:
- Lecture 10: Writing Documents, Presentations, and Websites with Quarto.
- Assignment 04 due (5%).
- Assignment 05: Problem Set 05.
- Kahoot Quiz.
Suggested references:
- Quarto Documentation - Presentations and Websites.
- GitHub Pages Documentation.
- French, J. (2023). Creating Websites with Quarto and GitHub Pages (YouTube Playlist).
- Taylor, I. (2022). Publishing a Quarto Site to GitHub Pages.
Module 04: AI-Assisted Programming
October 6, 2025:
Suggested references:
- Matarazzo, A. & Torlone, R. (2025). A Survey on Large Language Models with some Insights on their Capabilities and Limitations. arXiv preprint arXiv:2501.04040.
- Cihon, P. & Demirer, M. (2023). How AI-powered software development may affect labor markets. Brookings Institution.
- Poldrack, R. A., Lu, T., & Beguš, G. (2023). AI-assisted Coding: Experiments with GPT-4. arXiv preprint arXiv:2304.13187.
- Lau, S. & Guo, P. (2023). From “Ban It Till We Understand It” to “Resistance is Futile”: How University Programming Instructors Plan to Adapt as More Students Use AI Code Generation and Explanation Tools such as ChatGPT and GitHub Copilot. In Proceedings of the 2023 ACM Conference on International Computing Education Research V.1 (ICER ’23 V1), August 07–11, 2023, Chicago, IL, USA. ACM, New York, NY, USA, 16 pages.
- Linus Torvalds Discusses the Impact of AI on Programming (YouTube).
- Using GitHub Copilot in the Command Line.
- GitHub Copilot YouTube Playlist.
October 8, 2025:
- Lecture 12: Quiz 02: Literate Programming (6%).
- Assignment 04 due (5%).
- Assignment 05: Problem Set 05.
October 13, 2025:
- Lecture 13: AI-Assisted Programming, APIs, and Agents.
Module 05: Introduction to Cloud Computing
October 15, 2025:
- Lecture 14: Introduction to Cloud Computing.
- Kahoot Quiz.
- Assignment 05 due (5%).
Suggested references:
- Amazon Web Services (AWS) Documentation.
- AWS Educate.
- AWS Training and Certification.
- AWS Cloud Practitioner Tutorial.
- Ollama Documentation.
- LM Studio Documentation.
- Browser Use Documentation.
- Hugging Face Documentation. Note: great resource for NLP models and tools.
- Newman, S. (2021). Building microservices: designing fine-grained systems. O’Reilly Media, Inc.
- Erl, T., Puttini, R., & Mahmood, Z. (2013). Cloud computing: concepts, technology & architecture. Pearson Education.
October 20, 2025:
October 22, 2025:
- Lecture 16: Quiz 03: AI-Assisted Programming and Cloud Computing (6%).
- Assignment 07: Problem Set 07.
Module 06: Introduction to SQL Databases
October 27, 2025:
- Lecture 17: Introduction to SQL: Data Types, Tables, and Queries.
- Instructions for the Final Project.
Suggested references:
- Mode Analytics: SQL Tutorial.
- Real Python: SQL Databases and SQLite.
- Khan Academy: SQL Basics. (Note: Khan Academy is a great resource for learning SQL and other programming languages).
- SQLite Cheat Sheet.
- SQLite Documentation.
- SQL for Data Science.
October 29, 2025:
- Lecture 18: SQL in Python: Connecting to Databases with Pandas.
- Kahoot Quiz.
- Assignment 07 due (5%).
- Assignment 08: Problem Set 08.
November 3, 2025:
- Lecture 19: Merging Tables in SQL.
- Kahoot Quiz.
November 5, 2025:
- Lecture 20: Quiz 04: SQL Databases (6%).
- Assignment 08 due (5%).
- Assignment 09: Problem Set 09.
Module 07: Parallel Computing
November 10, 2025:
- Lecture 21: Parallel Computing with Dask.
Suggested references:
- Dask Documentation.
- Dask Tutorial.
- Coiled - Intro to Dask Tutorial (YouTube).
- Rocklin, M. (2017). Dask: Flexible Library for Parallel Computing in Python. In Proceedings of the 16th Python in Science Conference (Vol. 126, p. 130).
November 12, 2025:
- Lecture 22: Application: Parallelising Data Analysis with Dask and AutoML.
- Kahoot Quiz.
- Assignment 09 due (5%).
- Assignment 10: Problem Set 10.
Suggested references:
- Dask Documentation: Machine Learning.
- He, X., Zhao, K., & Chu, X. (2021). AutoML: A Survey of the State-of-the-Art. Knowledge-Based Systems, 212, 106622.
- TPOT Documentation.
Module 08: Containers and Reproducibility
November 17, 2025:
Suggested references:
- Docker Documentation.
- ComposeCraft. A GUI tool for managing, editing, and sharing Docker Compose files.
November 19, 2025:
- Lecture 24: Docker for Data Science.
- Assignment 10 due (5%).
November 24, 2025:
- Lecture 25: Course Review.
- Kahoot Quiz.
December 1, 2025:
- Lecture 26: Quiz 05: Dask, Docker and Containers (6%).
December 3, 2025:
- Lecture 27: No specific agenda; students can drop in for help with their projects.
December 8, 2025:
- No lecture.
- Final Project due (20%).