Carnival Booth: An Algorithm for Defeating the Computer-Assisted Passenger Screening System
Samidh Chakrabarti
Aaron Strauss
6.806: Law and Ethics on the Electronic Frontier
To
improve the efficiency of airport security screening, the FAA deployed the
Computer Assisted Passenger Screening system (CAPS) in 1999. CAPS attempts to
identify potential terrorists through the use of profiles so that security
personnel can focus the bulk of their attention on high-risk individuals. In
this paper, we show that since CAPS uses profiles to select passengers for
increased scrutiny, it is actually less secure than systems that employ random
searches. In particular, we present an algorithm called Carnival Booth
that demonstrates how a terrorist cell can defeat the CAPS system. Using a
combination of statistical analysis and computer simulation, we evaluate the
efficacy of Carnival Booth and illustrate that CAPS is an ineffective
security measure. Based on these findings, we argue that CAPS should not be
legally permissible since it does not satisfy court-interpreted exemptions to
the Fourth Amendment. Finally, based both on our analysis of CAPS and
historical case studies, we provide policy recommendations on how to improve
air security.
Table of Contents
5.1 Ressam’s 1999 Terrorist Attempt
6.1 Administrative Search Exception
On the
morning of
Almost
immediately, the finger pointing began between federal agencies over who was
responsible for not thwarting this attack—the most destructive ever carried out
on the
With the
vulnerabilities of the nation’s air transportation infrastructure made
painfully clear, the FAA had a new sense of urgency in plugging the holes in a
security system that was widely recognized as being as porous as a sieve. But
the challenges in doing so were, and still are, daunting. Over 639 million
passengers pass through airports annually in the
To address
these vexing security problems, the FAA has been trying in recent years to
employ information technology to boost the overall efficiency of security
screening. The kernel idea behind their approach is to be more intelligent
about which passengers are selected for rigorous inspections. Intuitively, the
FAA argues, if you only have the ability to scrutinize a small percentage of
passengers, it seems best to spend the bulk of time carefully searching those
who are likely to be terrorists and not waste much time searching those who
have a small chance of posing harm. Why frisk Eleanor, the 80-year-old
grandmother from
Drawing
from these intuitive underpinnings, the crown jewel of the FAA’s information
technology efforts is a system called the Computer Assisted Passenger Screening
system (CAPS). The FAA contends that since CAPS uses profiles to pinpoint
potential terrorists for closer inspection, it will not only result in the
apprehension of more criminals, but will also make security screening more
expedient for well-meaning citizens. Though in place since 1999, CAPS has
gained much more attention as a promising counter-terrorism tool in the wake of
September 11. The FAA already augmented the system in January, and plans for
further expansion are underway.[3]
In our
paper, we show that although these intuitive foundations might be compelling,
their implementation in CAPS is flawed. That is to say that any CAPS-like
airport security system that uses profiles to select passengers for increased
scrutiny is bound to be less secure than systems that randomly select
passengers for thorough inspection. Using mathematical models and computer
simulation, we show how a terrorist cell can increase their chances of mounting
a successful attack under the CAPS system as opposed to a security system that
uses only random searches. Instinct may suggest that CAPS strengthens security,
but it in fact introduces a gaping security hole easily exploitable by
terrorist cells. It should be noted that CAPS has also received immense
criticism from privacy advocates and civil libertarians[4],
but in this paper we restrict our discussion to a purely technical perspective
and the legal and policy implications of such an analysis.
In Section
2, we better define how the CAPS system operates. In Section 3, we present our
algorithm for defeating CAPS. To evaluate the algorithm’s efficacy, in Section
4 we present the results of a probabilistic analysis and computer simulation of
airport security. In Section 5, we discuss a few case studies to understand
what security techniques have been effective historically. In Section 6, we
discuss the legal implications of our finding that CAPS performs worse than
random search. And finally, we conclude in Section 7 with policy
recommendations for how to improve air security.
During the
second term of his administration, President Clinton convened a panel to
develop a set of recommendations to improve air transportation security. The
resulting White House Commission on Aviation Safety and Security (chaired by
Vice-President Gore) published its final report[5]
in February of 1997, giving the federal seal of approval to automated passenger
profiling. “Based on information that is already in computer databases,” the
Gore Commission wrote, “passengers could be separated into a very large
majority who present little or no risk, and a small minority who merit
additional attention.” This statement also serves to articulate the federally
supported two-stage architecture behind passenger profiling systems. First, the
system should develop a secret profile describing characteristics of high-risk
individuals (profile development). And then security manpower should be focused
on those individuals who, based on available data, match the profile (profile
evaluation). Such passengers are referred to as selectees.
The Gore
Commission was cognizant of the constitutional fragility of such a system, so
they invited testimony from civil liberty groups who were concerned that the
derived profile might violate Fourteenth Amendment equal protection. To pacify
these fears, the Commission dictated that “No profile should contain or be
based on material of a constitutionally suspect nature.” The Commission also
insisted that the FAA should periodically consult the Department of Justice “to
ensure that selection is not impermissibly based on national origin, racial,
ethnic, religious or gender characteristics.” Finally, to ensure that no one
group is singled out, the Commission recommended that passenger profiling
systems also choose random people who do not fit the profile as selectees.[6]
Using the
guidelines set forth by the Gore Commission, the FAA and Northwest Airlines
jointly completed development of the first version of CAPS in 1998. The federal
government closely guards the details of how the system operates, citing a
compelling national security interest— if the specifications were to be
released, it would be trivial for potential criminals to defeat the system. But
drawing from an assortment of news articles[7],
interviews with airport personnel, the Gore Commission report[8],
leaks captured on the congressional record,[9]
and prototypes written by software companies bidding to develop future versions[10],
a cohesive if not comprehensive understanding of CAPS can be painted.
CAPS
operates according to the same two-stage model described in the Gore Commission
report: profile development followed by profile evaluation. First, based on a
historical record of data pertaining to known terrorist activities, the
software attempts to detect subtle patterns in the data that correlate with
prior terrorist plots and anti-correlate with the activities of non-criminals.
For instance, the software might find that those people who bought one-way
tickets in cash and traveled abroad frequently had an elevated chance of being
terrorists. CAPS then assembles these patterns into a secret profile suitable
for inspection by the Department of Justice. Since the DOJ certification is
only held periodically, the derived master profile is presumably static for
long periods of time.
On a more
technical level, CAPS likely accomplishes pattern detection through the use of
a three-layer neural network. The first layer of the network contains hundreds
if not thousands of nodes, the third layer contains a single output node, and
the second layer contains an intermediate number of nodes. Each field of data
available for profile development is fed into a separate node of the first
layer. Using a standard training procedure, such as back propagation, the
weights of each connection are set such that when data from a terrorist is fed
into the network, the output layer returns a value close to one. But when data
from a non-criminal is fed into the network, the output layer returns a value
close to zero. If training of the network is successful, the matrix of
connection weights serves as the profile.
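As a rough illustration of the three-layer arrangement described above, the following Python sketch computes the output of such a network for a single passenger record. The layer sizes, the sigmoid activation, the random weights, and the example field values are all assumptions made for illustration. Nothing here reflects the actual CAPS implementation, and training by back propagation is omitted.

```python
import math
import random


def sigmoid(x):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, w_hidden, w_output):
    """Forward pass: input layer -> hidden layer -> single output node."""
    hidden = [
        sigmoid(sum(w * f for w, f in zip(row, features)))
        for row in w_hidden
    ]
    return sigmoid(sum(w * h for w, h in zip(w_output, hidden)))


# Hypothetical sizes: 4 input fields, 3 hidden nodes, 1 output node.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
w_output = [rng.uniform(-1, 1) for _ in range(3)]

# Hypothetical record: one-way ticket, paid in cash, travel frequency, unused.
score = forward([1.0, 1.0, 0.3, 0.0], w_hidden, w_output)
print(score)  # a value strictly between 0 and 1; untrained, so not meaningful
```

In a trained network of this shape, the weight lists would play the role of the secret profile, and the output would be read as a value near one for terrorist-like records and near zero otherwise.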
The CAPS
system installed in 1999 and currently in use is only capable of doing profile
development over data pertaining to the history of ticket purchases. Future
versions of CAPS, however, will be able to incorporate a richer set of data,
including driving history, credit card purchases, telephone call logs, and
criminal records, among other information. Though allowing CAPS to access some
of this data would require changes in privacy legislation, Congress, following
on the heels of the PATRIOT Act, is poised to facilitate them.[11]
Once CAPS
crafts a profile, it is incorporated into software that is accessible from
every airline check-in counter nationwide. When a passenger checks in, the
ticket agent enters the passenger’s name into the CAPS console. Data mining
software linked to government databases then scours for information about the
passenger, retrieving data relevant to the profile. The software compares the
similarity of the acquired data to the profile and computes a “threat index”
assessing how much potential risk that passenger may pose.[12]
In the technical scenario outlined above, the computed threat index would
simply be the product of the profile matrix with the passenger’s mined data
vector.
If the
passenger has one of the top 3-8% of threat indices relative to the other
people on his flight, then CAPS flags him for “special treatment.” To protect
the integrity of the CAPS system, the precise percentage of people flagged by
CAPS is unpublished, but it is known to be in that range.[13]
To comply with the Gore Commission guidelines, a small percentage of people on
each flight are randomly flagged as well. In all, the total number of people
flagged by CAPS is limited by the security personnel resources available at
each particular airport.
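The selection rule just described can be sketched as follows. This is a minimal illustration, assuming a fixed fraction of profile-based flags and a fixed fraction of random flags per flight. The function name, its parameters, and the 6%/2% split in the usage lines are hypothetical, since the real percentages are unpublished.

```python
import random


def select_for_screening(threat_index, profile_frac, random_frac, rng):
    """Return the set of passengers flagged on one flight.

    threat_index: dict mapping passenger -> threat index (higher = riskier).
    profile_frac: fraction flagged for having the highest threat indices.
    random_frac:  fraction flagged at random from the remaining passengers.
    """
    n = len(threat_index)
    ranked = sorted(threat_index, key=threat_index.get, reverse=True)
    n_profile = round(profile_frac * n)
    flagged = set(ranked[:n_profile])          # profile-based flags
    rest = ranked[n_profile:]
    n_random = min(len(rest), round(random_frac * n))
    flagged.update(rng.sample(rest, n_random))  # random flags
    return flagged


# Hypothetical flight of 100 passengers with made-up threat indices.
flight = {f"passenger_{i}": i / 100 for i in range(100)}
flagged = select_for_screening(flight, 0.06, 0.02, random.Random(1))
print(len(flagged))  # 8 of 100 passengers
```

Ranking passengers relative to others on the same flight, rather than against an absolute cutoff, keeps the number of selectees within whatever the airport's personnel can handle.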
What
exactly do these “special treatment” flags entail? Before September 11, CAPS
flags were tied only to the passenger’s checked baggage, which was scanned
using costly explosives-detection equipment. This represented the conventional
wisdom of the time that a terrorist would try to smuggle explosives only in his
checked luggage[14].
But now, the FAA is tying CAPS flags to individuals. Someone flagged by CAPS
may have her carry-on bags specially inspected, she may be subject to
questioning, she may be asked to stand in a separate line, she may be asked to
comply with a search of her body, or a guard may even escort her directly to
the gate.[15]
In an
article in Slate magazine, Microsoft Chief Architect Charles Simonyi
related his experience of being flagged by CAPS.[16]
During a routine business trip, security personnel insisted on completely
unpacking and repacking all of his carry-on bags. This happened time after time.
“Then it hit me,” Simonyi writes. “It was not that security was especially
tight: It was only me they wanted. The label my friendly hometown airline had
affixed to my bags had unexpectedly made me a marked man, someone selected for
some unknown special treatment.” The bottom line is that if CAPS flags you,
you’ll be treated differently, and you’ll know it.
This
transparency is the Achilles’ Heel of CAPS; the fact that individuals know
their CAPS status enables the system to be reverse engineered. You, like
Simonyi, know if your carry-ons have been manually inspected. You know if
you’ve been questioned. You know if you’re asked to stand in a special line.
You know if you’ve been frisked. All of this open scrutiny makes it possible to
learn an anti-profile to defeat CAPS, even if the profile itself is always kept
secret. We call this the “Carnival Booth Effect” since, like a carnie, it
entices terrorists to “Step Right Up! See if you’re a winner!” In this case,
the terrorist can step right up and see if he’s been flagged.
We will now
present an algorithm that a terrorist cell can employ to increase their
probability of mounting a successful attack under the CAPS system as opposed to
an airport security system that employs only random searches. The key idea is
that a terrorist cell can probe the security system to ascertain which of their
members have low CAPS scores. Then they can send these members on destructive
missions. Since security manpower is disproportionately spent on people with
high CAPS scores, and the operative has a low score, he will most likely face
reduced scrutiny.
The
algorithm, which we call Carnival Booth, then is as follows: (1) Probe
the system by sending an operative on a flight. The operative has no intent of
causing harm. He has no explosives. He has no weapons. He has nothing. He
simply takes the flight and notes whether or not CAPS flags him. (2) If he is
flagged, then send another operative in the same manner. (3) Repeat this
process until a member who consistently eludes CAPS flags is found. (4) Now
send this operative on a mission with intent to harm, complete with weapons or
explosives. Since CAPS didn’t flag him last time, he likely won’t be flagged
this time, so he incurs much less risk of special scrutiny.
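The four steps above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the profile is static, a passenger is profile-flagged whenever a hidden score meets a fixed cutoff, and random flags occur independently on each flight. The names `carnival_booth`, `threshold`, and `probes` are hypothetical and appear nowhere in CAPS itself.

```python
import random


def carnival_booth(cell_scores, threshold, random_frac, probes, rng):
    """Return the first operative who passes `probes` unarmed flights unflagged.

    cell_scores: dict operative -> threat index (hidden from the cell).
    threshold:   assumed cutoff at or above which the profile flags a passenger.
    random_frac: chance of being flagged at random on any one flight.
    """
    def flagged(name):
        # The cell observes only this yes/no outcome, never the score itself.
        return cell_scores[name] >= threshold or rng.random() < random_frac

    for name in cell_scores:                   # steps 1-3: probe member by member
        if not any(flagged(name) for _ in range(probes)):
            return name                        # step 4: the operative to send
    return None                                # no member eluded the flags


# Hypothetical three-member cell; only "c" scores below the cutoff.
cell = {"a": 0.9, "b": 0.8, "c": 0.1}
chosen = carnival_booth(cell, threshold=0.5, random_frac=0.0, probes=3,
                        rng=random.Random(0))
print(chosen)  # c
```

Requiring several clean probe flights before selecting an operative reflects the "consistently eludes" condition in step 3: a single unflagged flight could be luck, whereas repeated ones suggest a low score under a static profile.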
To better
understand how a terrorist cell using this algorithm stands a better chance of
success under CAPS than under a random system, let’s consider the numerical
example illustrated in Figure 1. Suppose an airport only has the personnel
resources to give 8% of people special scrutiny; the other 92% undergo standard
screening through a metal detector. Under a system where people are selected at
random, this airport can afford to flag 8% randomly. This means that every time
a terrorist attempts to go through security, he stands an 8% chance of
increased scrutiny. This will be true no matter what tactic or algorithm the
terrorist uses.
[Figure omitted. Recoverable labels: Random System; (Random); (Metal Detectors); Terrorist Activity; No Terrorist Activity; Increased Scrutiny; Standard Scrutiny.]
Figure 1: Regions of Terrorist Activity under CAPS and a Random System
Now compare this to the same airport using a CAPS system,
which may for example flag the 6% of passengers with the highest threat indices
and 2% randomly in order to equal their personnel-constrained 8% limit. By
employing the algorithm described above, the terrorist cell knows that since
their operative has previously probed the system without a flag, CAPS likely
will not flag him again. In essence, the terrorist cell is able to relegate its
harmful activities outside of the 6% CAPS flag zone. Now, their operative only
has a 2% chance of calling up a thorough inspection. Compare this to the 8%
chance the terrorist would incur under the random system. It’s clear that
terrorist cells would therefore prefer airports fortified by CAPS.
Upon
reading this analysis, it’s natural to feel uneasy. How is it that such an
intuitive system can actually weaken security? For clarity, it is useful to
draw a distinction between individual terrorists and the coordinated activities
of an entire terrorist cell. It is entirely probable that even a rudimentary
CAPS profile can flag many individual terrorists, shrinking the viable pool of
recruits that a terrorist cell can send on a mission. But so long as the cell
as a unit can identify those members who slip through CAPS, even if they are
few in number, it has an enhanced probability of mounting a successful attack.
It only takes one person to do harm. Carnival Booth makes the process of
identifying such a person simple.
Evolutionary
biology validates this perspective. The most famous example is the peppered
moths of London. In the 1950s, Dr. H.B.D. Kettlewell, a physician, noticed that
the peppered moth population living near industrialized areas of London started
to change in color from light gray to dark gray. Through a series of
experiments, Kettlewell showed that dark colored moths were more apt to survive
near industrialized areas because, by matching the color of the smokestack soot
that coated the ground, they could evade predators.[17]
Individual light gray moths were eliminated from the gene pool, but since the
population had a few dark gray moths (maybe only one), the species survived and
proliferated. Even in nature, a non-conscious species can subvert static
environmental conditions that expose them to danger. Conscious terrorists, one
would expect, can likewise circumvent a profile.
Terrorists,
of course, are not moths. In fact, the evolutionary biology example points to
several assumptions we must make in order to claim the effectiveness of our
CAPS circumvention algorithm. Unlike a natural species, terrorists do not have,
for example, infinite variation and boundless evolutionary time. This leads to
three key questions: Do terrorist cells have a diverse enough membership to
successfully use this algorithm? Do they have the money, patience, and planning
skill to execute it? And above all, are they smart enough to know to use it?
Using findings from our nation’s “War on Terror” as a case study, we claim that
it is safe to assume that the answers to all of these questions are
affirmative.
The
terrorists revealed in recent months are strikingly diverse. Most famously,
John Walker Lindh— the “American Taliban”[18]—
is a nineteen-year-old Caucasian boy from Marin County, the yuppie capital of
California. He was fighting on the front lines for the Taliban in Afghanistan
against American soldiers. If he had never been discovered, would it have been
difficult for Al Qaeda to send him back to the United States on a terrorist mission?
Then there is “shoe bomber” Richard Reid, accused of trying to detonate
explosives in his sneakers during a transatlantic flight.[19]
He is a British citizen with an English mother and Jamaican father. And just
last week the FBI apprehended Lucas Helder, a 21-year-old art major at the
University of Wisconsin-Stout.[20]
Helder allegedly planted 18 pipe bombs in mailboxes in five different states to
form a “smiley face” pattern. And who can forget Ted Kaczynski and Timothy
McVeigh? Terrorists clearly have no shortage of diversity.
As the
hijackers of September 11 showed, they also have no shortage of money,
patience, and planning acumen. Mohammed Atta planned the attack years in
advance, studying diagrams of the World Trade Center, going to flight school,
and analyzing airplane specifications on the Internet. The money trail funding
Atta’s operation weaves through anonymous bank accounts in multiple countries.
It is so intricate that the FBI still does not completely understand it.
Terrorist cells like Al Qaeda have demonstrated that they have the resources
required in terms of money, patience, and planning to use the algorithm.
What may be more alarming is that evidence from the September 11 investigation shows that Atta already knew the kernel idea behind this algorithm. Newsweek reported[21] that in the weeks before September 11, Atta and his conspirators practiced their attack by boarding the exact same target flights they intended to later hijack (same planes, same times, same origins and destinations). They wanted to ensure that they didn’t raise any suspicions or red flags. This is a clear demonstration of Atta’s cleverness. Like Atta, terrorists are smart. They already know this algorithm. And they are already using it.
A combination
of probabilistic analysis and computer simulation shows in more concrete
terms whether a terrorist cell can use Carnival Booth to defeat the CAPS
system. In particular, we wanted to find out under
what conditions CAPS outperforms or underperforms random search, and we wanted
to quantitatively determine how much a terrorist cell could benefit from using
the algorithm. To accomplish this, we compared three systems: (1) the CAPS
system, (2) a system where passengers are randomly selected, and (3) a random
system with more advanced administrative searching. The results are clear. The
less a system relies on profiling and the more advanced its administrative
searching, the more terrorists it will catch.
The
analysis makes two reasonable assumptions. First, a terrorist must bring a
weapon or bomb onboard the airplane to cause damage. Modifications to cockpit
doors, sky marshals, and heightened passenger awareness after September 11th
have forced potential terrorists to use more than box cutters to gain
control of a plane. Second, a future terrorist who is flying with no weapons
and no criminal intent will not be apprehended by law enforcement. Even if this
person has the highest threat index possible, there would be no reason to
apprehend him or her (barring an outstanding warrant or alleged connection to a
previous terrorist act).
We model
all three flavors of airport security with a two-stage architecture. The first
stage is administrative screening, which includes the usage of metal detectors
and basic questioning about luggage contents. All passengers are subjected to
this level of screening. The second stage is increased screening for those who
are either flagged by CAPS or randomly selected.
Using
probability theory, we developed a generalized function for these two-stage
systems that determines the probability that a terrorist will be caught. First,
the total probability of the terrorist being arrested is the sum of the
probabilities that he is arrested during a 2nd level search prompted randomly,
a 2nd level search prompted by a CAPS flag, or a 1st level administrative search:
$$
\begin{aligned}
\Pr(\text{Terrorist Arrested}) ={}& \Pr(\text{Terrorist Arrested During Randomly-flagged 2nd Level Search}) \\
&+ \Pr(\text{Terrorist Arrested During CAPS-flagged 2nd Level Search}) \\
&+ \Pr(\text{Terrorist Arrested During Administrative 1st Level Search})
\end{aligned}
$$
Introducing some notation, let A be the event that the terrorist is
arrested, R the event that he is randomly chosen for 2nd level
screening, and C the event that he is selected by CAPS for 2nd level
screening. Since the terrorist faces only the administrative search exactly
when he is not flagged for 2nd level search by either a CAPS flag or a
random flag, the full equation becomes:
$$
\Pr(A) = \Pr(A \mid R)\,\Pr(R) + \Pr(A \mid C)\,\Pr(C) + \Pr(A \mid \neg(R \cup C))\,\Pr(\neg(R \cup C))
$$
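The equation translates directly into a small function. The sketch below assumes that a passenger is never flagged both randomly and by CAPS, so that Pr(¬(R U C)) = 1 - Pr(R) - Pr(C). The function and parameter names are hypothetical, and the numbers in the usage lines are placeholders chosen only to exercise the formula.

```python
def pr_arrest(pr_a_given_r, pr_a_given_c, pr_a_given_neither, pr_r, pr_c):
    """Pr(A) for the two-stage screening model.

    pr_a_given_r:       Pr(A | R), arrest chance in a randomly prompted search.
    pr_a_given_c:       Pr(A | C), arrest chance in a CAPS-prompted search.
    pr_a_given_neither: Pr(A | not (R or C)), arrest chance with only the
                        1st level administrative search.
    pr_r, pr_c:         Pr(R) and Pr(C), assumed mutually exclusive.
    """
    pr_neither = 1.0 - pr_r - pr_c
    if pr_neither < 0.0:
        raise ValueError("Pr(R) + Pr(C) must not exceed 1")
    return (pr_a_given_r * pr_r
            + pr_a_given_c * pr_c
            + pr_a_given_neither * pr_neither)


# Placeholder values, not taken from any real screening system.
print(pr_arrest(0.5, 0.5, 0.1, pr_r=0.1, pr_c=0.2))
```

Keeping the three conditional probabilities as separate arguments makes it easy to vary one stage of the model, such as the quality of the administrative search, while holding the others fixed.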
To analyze
the dynamics of this equation, there are several constraints we can impose on
these probabilities. First observe that the probability that 2nd level
interrogation results in an arrest should be the same whether the terrorist is
flagged by the CAPS profile or randomly. Hence, Pr(A | R) must be the same as
Pr(A | C). Also, since the 2nd level searches are more thorough than 1st level
searches, logic dictates that Pr(A | ¬(R U C)) be lower than either Pr(A | R)
or Pr(A | C). Finally, in all the simulations, the number of passengers that
will be subjected to heightened security will be the same.
Although
information about the CAPS system is limited, the total percentage of people
flagged by CAPS, either through a profile or randomly, is reported to be
between 2-8%. Accordingly, we choose Pr(R) to be 2% under CAPS, and 8% under a
random system where all personnel resources can be devoted to random
inspections. Since second level searches are presumably more effective than
first level searches, our model arbitrarily sets Pr(A | R) at 75% and
Pr(A | ¬(R U C)) at 25% for the first two systems. For the third system employing
better administrative searches, Pr(A | ¬(R U C)) is set to 40%. With every value
except Pr(C) now fixed, we can begin to compare the systems:
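$$
\begin{aligned}
\text{(CAPS system)} \quad \Pr(A) &= 0.75 \times 0.02 + 0.75 \times \Pr(C) + 0.25 \times (1 - 0.02 - \Pr(C)) \\
\text{(Random system)} \quad \Pr(A) &= 0.75 \times 0.08 + 0.75 \times 0 + 0.25 \times 0.92 = 29\% \\
\text{(Random w/ admin+)} \quad \Pr(A) &= 0.75 \times 0.08 + 0.75 \times 0 + 0.40 \times 0.92 = 42.8\%
\end{aligned}
$$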
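Collecting terms in the CAPS expression makes the comparison easier to read. The simplification below is plain arithmetic on the values already chosen, and like the CAPS equation itself it assumes that random flags and CAPS flags do not overlap. The break-even value of Pr(C) is a derived illustration, not a result of the simulation.

$$
\begin{aligned}
\Pr(A)_{\text{CAPS}} &= 0.015 + 0.75\,\Pr(C) + 0.245 - 0.25\,\Pr(C) \\
&= 0.26 + 0.5\,\Pr(C)
\end{aligned}
$$

Setting this equal to the 29% of the random system gives Pr(C) = 0.06. Under these assumed values, CAPS matches random search only if the profile flags the terrorist 6% of the time, and it does better only above that rate. If probing drives Pr(C) toward zero, Pr(A) falls toward 26%.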