Python code for Quantitative Social Science: An Introduction

March 15, 2025

Welcome! This repository contains Python companion guides for Kosuke Imai's Quantitative Social Science (QSS).

The qsspy code is available for all book chapters (in .py, .ipynb, and .pdf file formats):

  1. Introduction
  2. Causality
  3. Measurement
  4. Prediction
  5. Discovery
  6. Probability
  7. Uncertainty

Setup

The companion guides focus on replicating the QSS analysis in Python. They do not currently contain detailed instructions on setting up a Python environment. Fortunately, many excellent free resources online cover this topic. Below are a few recommendations.

Installation and Package Management

A good option is to follow the Installation and Setup instructions in Wes McKinney's Python for Data Analysis, 3E. The instructions walk readers through installing miniconda, creating a conda environment, and installing packages from conda-forge.

The approach outlined above is one among many options. Other popular ways to install Python and manage packages include:

  • Installing Python from the Python Software Foundation, using the pip package manager to install packages from the Python Package Index (PyPI), and managing packages with Python virtual environments. This approach is similar to installing miniconda and using conda, conda-forge, and conda environments. Note that these approaches are not mutually exclusive. For example, you can use pip to install packages from PyPI into a conda environment.
  • Installing an Anaconda distribution of Python. Anaconda comes pre-loaded with many data analysis packages. Anaconda offers a free individual version.
  • Using container-based approaches, such as Docker. This is a more advanced workflow that is widely used for deploying applications.
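Whichever option you choose, it helps to confirm which interpreter and environment are active before installing packages. The following is a minimal sketch using only the standard library. The function name describe_environment is hypothetical and not part of the qsspy code. It assumes that a venv-style environment can be detected by comparing sys.prefix with sys.base_prefix, and that conda exposes the active environment name through the CONDA_DEFAULT_ENV variable.

```python
import os
import sys


def describe_environment():
    """Summarize which Python interpreter and environment are active.

    Hypothetical helper for illustration; not part of the qsspy code.
    """
    # In a venv-style virtual environment, sys.prefix points at the
    # environment while sys.base_prefix points at the base installation.
    in_virtualenv = sys.prefix != sys.base_prefix
    # conda sets CONDA_DEFAULT_ENV to the name of the active environment;
    # the variable is absent when no conda environment is active.
    conda_env = os.environ.get("CONDA_DEFAULT_ENV")
    return {
        "executable": sys.executable,
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "in_virtualenv": in_virtualenv,
        "conda_env": conda_env,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If both in_virtualenv is False and conda_env is None, packages installed with pip will most likely land in the base Python installation rather than in an isolated environment.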

Required Packages

After setting up a Python environment, you will need to install the packages that are necessary for running the qsspy code. The following packages are used extensively:

Chapter 5, Discovery, also makes targeted use of nltk, wordcloud, python-igraph, and geopandas.
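Before running the Chapter 5 code, you may want to check that these packages can be found in the active environment. Below is a minimal sketch, not part of the qsspy code, that uses importlib.util.find_spec from the standard library. The names CHAPTER_5_IMPORTS and find_missing are hypothetical. Note that python-igraph is imported under the module name igraph.

```python
from importlib.util import find_spec

# Maps each Chapter 5 package (as installed) to the name used to import it.
# The python-igraph distribution is imported as "igraph".
CHAPTER_5_IMPORTS = {
    "nltk": "nltk",
    "wordcloud": "wordcloud",
    "python-igraph": "igraph",
    "geopandas": "geopandas",
}


def find_missing(import_names):
    """Return the package names whose import name cannot be located.

    Hypothetical helper for illustration. It only checks that a top-level
    module can be found; it does not import it or check its version.
    """
    return [
        package
        for package, module in import_names.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing(CHAPTER_5_IMPORTS)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All Chapter 5 packages were found.")
```

The same function can be reused for any other packages by passing a different dictionary of package and import names.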

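If you are working with conda, you can use the environment.yml file contained in this repository to create a conda environment that includes the required packages, for example by running conda env create -f environment.yml from the directory that contains the file.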

Integrated Development Environments

There are a variety of popular integrated development environments (IDEs) for conducting data analysis in Python, including:

The companion guides were built using the VS Code IDE and VS Code's Jupyter notebooks integration. If you elect to use VS Code, helpful documentation pages include:

For those coming from an R background, Spyder is similar to RStudio.

References

Imai, Kosuke (2017). Quantitative Social Science: An Introduction. Princeton University Press.

McKinney, Wes (2022). Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter, 3rd Edition. O'Reilly Media.