Tidal Motions of Causality
Waves: An Introduction
Just as waves ebb and flow, the principles of causality, cause and effect, push and pull the world around us into the way that it is. Although causality is astoundingly intuitive and simple to understand in principle, defining it precisely and mathematically turns out to be quite a challenge. The question has bothered philosophers and logicians alike for millennia, going all the way back to Aristotle; attempts at a rigid definition can be traced to Hume in the 1700s and, more recently, to Joseph Halpern and Judea Pearl, two computer scientists who really picked up the pace in an area few others had pioneered. Even today, in thousands of courts across the country, lawyers struggle and fight over who is to blame, over what the actual cause of the events on trial was. Perhaps creating such concrete definitions of causality will be the building blocks for creating an AI suitable as a lawyer.
Before I start this review, I want to give a bit of an overview. Halpern and Pearl are primarily concerned with creating a definition of an actual cause, which I will define more carefully later on. Intuitively, we think of an actual cause as the event that immediately caused some effect. I could claim that everything in your life so far has led up to you reading these words, and while that is true in a sense, we don't think of your entire life as a cause of this specific action. We think of you deciding to go onto Medium and actively choosing to click on this article as the cause of you reading these words.
While it is easy enough to understand intuitively, coming up with a precise, general, mathematical definition for actual cause turns out to be pretty difficult.
To set things up, Halpern primarily deals with causal models, which are composed of exogenous and endogenous variables. You can think of a model as a sort of graph, where variables are represented as nodes connected by directed edges, and each edge points from cause to effect. In the case of Halpern's paper, we only consider recursive (or acyclic) models, meaning there is no feedback loop in the causal model. In other words, we are only looking at DAGs (Directed Acyclic Graphs).
Exogenous variables are those whose causes we do not care about for the purposes of the model: for example, that the sky is blue, or that we are currently alive. Endogenous variables are those that are the effect of some exogenous and/or endogenous variables.
The causal model also contains structural equations, which assign values to the variables in the model; we will not be discussing them in detail here today.
The other half of a causal model is the context. The context assigns values to the exogenous variables, and everything filters down from there as the structural equations determine the values of everything else. Different contexts essentially access different "possible worlds," and the most useful context is the one for the actual world, the one that we are observing right now.
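To make this concrete, here is a minimal sketch of a recursive causal model evaluated under a context. The variable names and the tiny match-and-fire model are my own illustration, not taken from Halpern's paper.

```python
def evaluate(context, equations, order):
    """Given a context (values for the exogenous variables), compute the
    endogenous variables by applying each structural equation in an
    order consistent with the DAG (parents before children)."""
    values = dict(context)            # the context fixes the exogenous variables
    for var in order:                 # topological order of the endogenous variables
        values[var] = equations[var](values)
    return values

# Context: one exogenous variable saying a match was dropped.
context = {"match_dropped": True}

# Structural equations for the endogenous variables.
equations = {
    "fire": lambda v: v["match_dropped"],     # fire iff the match was dropped
    "forest_destroyed": lambda v: v["fire"],  # destruction follows the fire
}

world = evaluate(context, equations, ["fire", "forest_destroyed"])
print(world)
# {'match_dropped': True, 'fire': True, 'forest_destroyed': True}
```

A different context, say `{"match_dropped": False}`, would filter down to a different possible world in exactly the same way.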
To address the lack of a formal, rigid definition, Halpern gives actual cause a three-pronged definition. It is helpful here to clarify that actual causation focuses on specific events; it is NOT looking at the general case. The definition, paraphrased from Halpern's paper, is the following:
Condition 1: Both the cause and the effect have to actually happen (be true) in the world we are looking at.
This is fairly obvious: there cannot be a cause of some effect where the cause does not actually bring about the effect in the world we are looking at, or where the cause does not happen but the effect does. That just would not make sense.
Condition 2:
a) Had cause A not happened, with all other variables kept the same, effect B would not have happened.
b) If cause A happened and ALL possible subsets of the other variables maintained the values they had in the actual world, then effect B would still happen.
This condition is really the heart of the definition, as Halpern stresses over and over in his talk about this paper. A candidate cause only needs to satisfy one of these two requirements; I will talk about why this is important later on. Before it was modified, part (a) captured the notion of a but-for cause, so called because effect B would NOT have happened BUT FOR cause A. As modified, it also addresses the concern raised by the Throwing Rocks example (see below): the idea is that we are allowed to freeze the value or status of all variables other than the cause(s), and then check whether effect B still happens.
Part (b) captures sufficiency. It states that if cause A happens for sure, then for every possible subset of the other variables held at their actual values, effect B is still sure to happen. This will make more sense in the Wildfire example below.
Condition 3: Minimality. No proper subset of the set of cause variables satisfies conditions 1 and 2.
This one is pretty straightforward. We just don’t want irrelevant things.
Suppose we have five dominoes, perfectly lined up, and the first is pushed. I want to say that the actual cause of the fifth domino falling was the fourth domino. To do this, I use part (a) of condition 2: I intervene so that domino 4 does not fall while the rest behave as usual, and since domino 5 then stays up, domino 4 is an actual cause. Now, this is strange to me. I have five perfectly lined-up dominoes, and yet I am allowed the intervention of not letting domino 4 fall.
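The domino chain is simple enough to simulate. This toy version (names and structure are mine) assumes each domino falls exactly when its predecessor falls, and models the intervention of condition 2(a) as an override of a variable's structural equation.

```python
def run(push_first, intervene=None):
    """Simulate 5 dominoes. `intervene` maps a domino index to a forced
    value, modeling the surgical intervention that condition 2(a) allows."""
    intervene = intervene or {}
    falls = {}
    for i in range(1, 6):
        if i in intervene:
            falls[i] = intervene[i]      # override the structural equation
        elif i == 1:
            falls[i] = push_first        # domino 1 falls iff it is pushed
        else:
            falls[i] = falls[i - 1]      # falls iff the previous domino falls
    return falls

print(run(True)[5])               # True: all five fall in the actual world
print(run(True, {4: False})[5])   # False: holding up domino 4 saves domino 5
```

The second call is exactly the "strange" intervention from the text: physically the dominoes are lined up, but the definition lets us pin domino 4 upright anyway.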
As a newcomer to this field, I thought it was rather odd and found it difficult to accept that Halpern could simply create a definition so out of touch with reality; in many interesting cases, his definition bends the model into something that could not have happened in real life. I had a lot of difficulty with the fact that he was able to change whatever he wanted. However, I have learned that this is actually pretty common: it is not rare for researchers to collect many examples and edge cases and then find some definition that fits all of them. In some cases it works well, and in others it doesn't.
Halpern himself addresses these concerns in his talk, and mentions that it is very possible this definition will change again, although he does not anticipate that it will, and hopes it will not.
In my opinion, the best way to learn about these conditions is through examples. I have provided some below.
The classic Throwing Rocks example is also very interesting. Consider a world where Suzy and Billy are throwing rocks at a bottle. Suzy throws a little harder, and her rock hits the bottle first, shattering it. Billy is also incredibly precise, but because Suzy has already shattered the bottle, his rock hits nothing: there is nothing left to shatter.
How do we justify that Suzy is the actual cause?
This becomes difficult and I will leave it as an exercise for you to see why modified part a of condition 2 is important here. Please leave a comment if you need the answer.
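As a hint, here is a sketch of the model as this example is usually set up: Suzy's rock hits (SH) if she throws; Billy's rock hits (BH) only if he throws AND Suzy's rock did not hit; the bottle shatters (BS) if either rock hits. The freezing mechanism is the part to pay attention to.

```python
def bottle(suzy_throws, billy_throws, freeze=None):
    """Return BS (bottle shatters). `freeze` pins variables at chosen
    values, as the modified part (a) of condition 2 allows."""
    freeze = freeze or {}
    SH = freeze.get("SH", suzy_throws)               # Suzy's rock hits iff she throws
    BH = freeze.get("BH", billy_throws and not SH)   # Billy's hits only if Suzy's missed
    return SH or BH                                  # shatters if either rock hits

actual = bottle(True, True)                          # actual world: shatters
naive = bottle(False, True)                          # naive but-for: still shatters!
frozen = bottle(False, True, freeze={"BH": False})   # freeze BH at its actual value
print(actual, naive, frozen)                         # True True False
```

Without freezing, removing Suzy's throw lets Billy's rock hit, so the naive but-for test fails to name Suzy as a cause; freezing BH at the value it actually had (False) recovers the verdict we want.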
Now for the Wildfire example: consider a fire in the woods of California, which we know was caused by two careless hikers who dropped matches. There are two possibilities below.
Disjunction — This is the case where either match alone could have caused the fire. This means that in order for the fire not to happen, both matches must NOT have been dropped, so neither match on its own is a but-for cause. This is where part (b) of condition 2 comes in: we can verify that each match is part of a cause, even though neither one by itself is the cause.
Conjunction — This is the case where BOTH matches combined are needed to cause the fire. This means the fire would NOT have happened if either match had not been dropped. Therefore, it is easy to see that each match is a cause by using part (a) of condition 2.
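The two models differ by a single connective, which is what flips the but-for verdict. The variable names here are mine.

```python
def fire_disjunctive(match1, match2):
    return match1 or match2      # either match alone starts the fire

def fire_conjunctive(match1, match2):
    return match1 and match2     # the fire needs both matches

# Actual world: both matches were dropped, and the fire happens either way.
print(fire_disjunctive(True, True), fire_conjunctive(True, True))   # True True

# But-for test on match 1 alone (remove it, keep match 2):
print(fire_disjunctive(False, True))   # True:  match 1 is NOT a but-for cause
print(fire_conjunctive(False, True))   # False: match 1 IS a but-for cause
```

In the disjunctive model the simple but-for test comes up empty for each match individually, which is exactly why the sufficiency clause, part (b), is needed there.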
Halpern discusses two more factors that play a role in causality: normality and responsibility.
Normality is the idea that the more something deviates from the norm, the stranger it is. For example, it is not odd if I eat ice cream in the summer, but if I do so in the winter, it stands out more. In the case of causes, let's say there are two pens on a receptionist's desk. The pens may be taken by administrators but not by professors. A professor takes one pen, and then an administrator takes the other. The receptionist, in dire need of a pen, now has none. Who is to blame? Most people would say the professor, since he is not supposed to take pens. Normality is defined by convention and tradition; it is what we are used to, so the more someone deviates from it, the more likely we are to assign blame to them.
Responsibility is a measure of how much each party is to blame for some event. For example, if an election is lost 6–5 versus 11–0, there is a big difference in how we view those events and in who is to blame for the loss (other than the candidate, of course). Halpern discusses mathematical ways to assign this precisely, but I will not go into them right now.
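One such measure, due to Chockler and Halpern, scores a voter's responsibility as 1/(k+1), where k is the minimal number of OTHER voters who would have to switch before that voter's own vote becomes pivotal. The vote-counting model below is my own simplification for illustration.

```python
def responsibility(yes, no):
    """Degree of responsibility of one yes-voter for a yes-majority
    outcome, using the 1/(k+1) measure: find the smallest k switches
    of other yes-voters that leave our voter pivotal."""
    for k in range(yes):
        still_yes = (yes - k) > (no + k)              # outcome unchanged after k switches
        pivotal = (yes - k - 1) <= (no + k + 1)       # our voter's switch now flips it
        if still_yes and pivotal:
            return 1 / (k + 1)
    return 0.0

print(responsibility(6, 5))    # 1.0: any single voter could have flipped 6-5
print(responsibility(11, 0))   # ~0.167: in 11-0, each voter shares the blame
```

This matches the intuition from the text: every voter in the 6–5 outcome bears full responsibility, while blame in the 11–0 landslide is diluted across the voters.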
Taking models, variables, actual causes, normality, and responsibility into consideration, it seems that we have the building blocks for an AI that is able to find the causes of many events. Perhaps in the future, it will be these machines instead of lawyers that we find in court. But of course, progress will rely on the discovery of exotic edge cases and examples, those that stretch the limits of logic and philosophy. It is up to us to find them.