Legal Interpretation by Big Data

Larry Solan

Brooklyn Law School

Friday, October 12, 2018
2 p.m.

513 Lattimore Hall


Over the past eight years, legal analysts have begun using linguistic corpora in statutory and constitutional interpretation. Judges often presume that the legislature intends the words in laws to be understood in their ordinary sense. Similarly, the originalist movement in constitutional interpretation searches for what it calls the “original public meaning” of terms in the Constitution. By this, originalists mean the meaning that an educated person living at the time of enactment would typically assign to the term. A typical illustration is that “domestic violence” probably meant “armed insurrection” rather than “spousal abuse” at the time, and that the Constitution should therefore be understood accordingly.

Using the BYU corpora COCA (Corpus of Contemporary American English), COHA (Corpus of Historical American English), and a new corpus of 18th-century American English, analysts have been drawing inferences about legislative intent based on the relative frequency of one sense over another in the relevant corpus.

This practice has generated concern among some writers. I am among them. Linguist Tammy Gales and I have been attempting to establish criteria for when the use of corpus analysis is most efficacious. Among them are appropriate reliance on the legal decision that ordinary meaning should prevail; an understanding of what makes ordinary meaning ordinary; knowing what search to conduct; avoiding excessive inferences from the absence of a sense in the relevant corpus; and choosing the right corpus. We have also begun to expand corpus analysis beyond reliance on relative frequency to include the analysis of collocates, among other corpus linguistic tools.

Others have argued that, for contemporary laws, experimental survey studies may be more efficacious than corpus analysis. The presentation discusses this and other potential limitations of corpus methodology in legal interpretation.