I defended my BSc thesis, Causal Discovery from Interventional Data, and earned the degree of Bachelor of Science in Mathematics on September 8th, 2021. You can download the thesis and the slides, find the simulation scripts on GitHub, or read the abstract below.
Abstract of the thesis
We consider the task of learning the causes of a response in three closely related problems, all set in the scenario of two separate sets of interventional experiments with hidden variables and unknown intervention targets. In the first problem (which we haven’t seen studied before), the covariates are observed in the first set of experiments and the response is observed in the second set of experiments, yielding unpaired data. We present a novel method, POLS, for its solution. In the second problem, both the response and the covariates are observed in both sets of experiments; in the third problem, we only conduct one set of experiments and then, in order to be able to use the methods from the first two problems (which require a second data set), we permute the rows to emulate data from a second set of experiments. We present another novel method, DPOLS, for the last two problems, and give a proof that it will select the correct parents asymptotically in a specific case. We give a strengthened version of Reichenbach’s Common Cause Principle as motivation for the methods and investigate their performance through large-scale simulation experiments. Our results show that both POLS and DPOLS beat the baseline methods. The results also indicate that DPOLS finds the correct parents asymptotically in many cases, including on permuted data, and that it is even useful for finding extra ancestors after having selected all parents.