Machine + Human: How artificial intelligence will transform airport security screening

Over the years, the industry has come to recognise that an X-ray machine is only as effective as its operator. But humans are emotional creatures and far from perfect, so when the stakes are high it makes sense to invest in advanced technology to mitigate human error and enhance operations. Yves Duguay discusses the role artificial intelligence might play in aviation security processes in the future.

Yves Duguay

Artificial Intelligence, or AI, is everywhere. It is one of the emerging technologies that will profoundly transform how we live and work. At its core is machine learning, an approach in which algorithms learn from data rather than being explicitly programmed. Its most prominent branch, ‘deep learning’, uses layered artificial neural networks loosely modelled on some of the human brain’s functions and structures.

These deep learning machines and their advanced algorithms may eventually be able to sift quickly through massive amounts of data to arrive at a conclusion or decision, very much like the thousands of decisions screeners make every day in airports: to clear or not to clear…a bag! It stands to reason that we should explore how AI can contribute to the effectiveness of screening operations and to the improvement of the passenger experience.

Could AI significantly reduce the risks inherent in human error and help us refocus our efforts on detection? Although a great deal of attention has been paid in the past five years to the balance between security imperatives and passenger facilitation, we should remind ourselves that any gain in expediency is rendered worthless if we cannot detect threat items. Passengers expect to be screened effectively and efficiently – in that order.

In this article, we will explore how human factors and cognitive limitations can cause errors that affect the detection of threat items at checkpoints, and how AI could eventually prevent those errors while speeding up the process and reducing security costs.

Recently we came upon an article published in The Economist¹ describing how a group of scientists, headed by Dr. Lewis Griffin of University College London (UCL)², was experimenting with AI to screen cargo containers in the Port of Rotterdam. The laboratory-based trial conducted by Dr. Griffin and his team focussed on detecting metallic objects, such as weapons, in X-ray images, and on increasing screening capacity by reducing the screening time from 10 minutes or so for a screener to a few seconds with AI. Considering that the Port of Rotterdam processes 8,000 containers a day, this would increase the number of containers the authorities can scan each day. We reached out to Professor Griffin, who agreed to contribute to this article and explain the benefits and challenges of AI in the context of airport security screening. But before looking at this potential solution, let’s take a look at our current performance.

Screening authorities have developed systematic approaches to assess their operational performance by reviewing the results of their quality assurance programs. Leveraging techniques borrowed from Six Sigma and lean manufacturing, operational results are subjected to a succession of “whys”, in an attempt to identify the root causes behind performance issues.

While working at CATSA, we noticed that two causes surfaced more frequently than others: the inattention of screeners and the failure to comply with standard operating practices (SOPs). Why would screeners not pay attention or comply with procedures that are meant to protect them and to assist them in making difficult decisions under risk? To answer these questions, in 2013 we began to review the literature on human error while, in parallel, observing screeners in action during their daily interactions with screening equipment, SOPs and passengers.

The first step in our research was to assess the risk of failure to detect threat items. Based on limited unclassified data found in reports published by the Government Accountability Office (GAO)³ and by the Department of Homeland Security’s Office of Inspector General⁴, we assessed this risk as significant. Our next step was to determine the contribution of human error to detection failures and how cognitive biases can impair our capacity to detect ultra-rare items. It is generally acknowledged that human factors account for 80% of errors⁵, and the share could actually be higher for screening operations. These statistics were validated by our field observations and the information gathered through interviews with screeners in airports.

Every day, screeners at security checkpoints search passengers and luggage, aided by technologies and guided by SOPs, to detect weapons, explosive devices or other items that would constitute a risk to civil aviation. With the exception of proxies such as simulated infiltration testing (SIT), ‘Red Team’ exercises, and pseudo-targets created by threat image projection (TIP), most screeners will not be exposed to such threats for long periods of time. As part of our research, we estimated the probability of a screener being confronted with a real or simulated threat to be extremely low – in the neighbourhood of 0.001% over a one-year period.

This low prevalence of targets has a direct impact on screening officers’ performance and their capacity to detect a real or pseudo threat when it does present itself during the screening process. We believe that this ultra-rare item effect, combined with the repetitive nature of the screening tasks, can lead to a loss of situational awareness (SA) and attention, which in turn generates human errors. In fact, “Ultra-rare items are highly vulnerable to being missed in visual search; in a task in which relatively frequent items were detected more than 90% of the time, ultra-rare items (those with frequency rates below 0.15%) were largely undetected.”⁶

To reduce and mitigate the impact of this ultra-rare item effect, we worked with clients to develop a training programme to raise the level of situational awareness (SA) of screeners, inspired by an initiative implemented by the U.S. Coast Guard. Once screeners understand how their performance can be affected by a loss of SA, they are better prepared to recognise the error precursors that can lead to a failure to detect. But this can only go so far.

What if we could teach “tireless, attentive and compliant” machines (X-ray or tomography) to learn continuously from every image they analyse and to differentiate ordinary items from prohibited items in less than a second? Think about how the rate of detection would improve if the high percentage of human errors were removed, and how this effectiveness would be accompanied by increased speed and lower costs per passenger. Is this attainable and realistic? This is why we spoke with Dr. Griffin: to better understand the challenges associated with AI and the next steps for our industry.

An Interview with Dr. Lewis Griffin

Q: Dr. Griffin, what are, in your opinion, the initial challenges to overcome before we could be using AI in the context of airport security screening operations?
I think there are three technical challenges that need to be confronted.

The first challenge is to assemble a large amount of high-quality labelled data. To better understand what I mean by labelled data, let’s consider how we would go about detecting firearms. The current AI approach to problems like this is to let the machine learn what a firearm looks like from data or images, rather than hand-engineering an algorithm; so we need a large dataset of images with and without firearms for this learning stage. This dataset must be of high quality and free from unintended detection cues, such as those caused by different X-ray equipment, trays or re-use of benign contents in staged bags. If false cues like these can be eliminated and if we can obtain large, diverse datasets, then we will be able to use the latest deep learning methods to detect threat items like firearms.

The second challenge lies in the breadth of the detection scope. It’s quite a job to assemble a large enough, high-quality dataset for any single threat class like firearms. It gets even harder when we have to repeat this for knives, bottles, batteries, cash, and potential IED components. Achieving this breadth of detection capability is technically feasible but it will require smart engineering to make it practically attainable.

The third challenge is to automate the anomaly detection skills of screeners, which is something different from recognising threats. Anomalies are oddities that catch the eye and give a hint that something is suspicious. For example, an IED might be too well concealed within a laptop to be recognisable, but a wonkiness of some of the components might prompt an alert screener to have a closer look at the computer. This is a technically challenging problem, for which we do not yet have algorithms that can perform as well as humans, but research is active, and I think this will be achieved.
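
As an editorial aside on the first challenge above, the idea of an “unintended detection cue” can be made concrete with a small sketch. It assumes each labelled image carries some metadata (here, hypothetical `scanner` and `tray` fields) and flags any metadata value whose share of threat images differs sharply from the overall share; such a value could let a learning algorithm “detect” threats from the equipment rather than from the bag contents. The records, field names and the `margin` value are illustrative assumptions, not data or methods from Dr. Griffin’s group.

```python
from collections import defaultdict

def audit_cues(records, fields, margin=0.3):
    """Flag metadata values whose threat rate is far from the overall rate.

    records: list of dicts with a boolean "threat" label plus metadata fields.
    fields:  metadata field names to audit (hypothetical names in this sketch).
    margin:  illustrative tolerance on the difference in threat rate.
    """
    overall = sum(r["threat"] for r in records) / len(records)
    flagged = []
    for field in fields:
        counts = defaultdict(lambda: [0, 0])  # value -> [threat images, all images]
        for r in records:
            counts[r[field]][0] += int(r["threat"])
            counts[r[field]][1] += 1
        for value, (threats, total) in counts.items():
            rate = threats / total
            if abs(rate - overall) > margin:
                flagged.append((field, value, rate))
    return flagged

# Made-up example: every staged threat bag was imaged on scanner "B".
records = [
    {"scanner": "A", "tray": "grey", "threat": False},
    {"scanner": "A", "tray": "black", "threat": False},
    {"scanner": "A", "tray": "grey", "threat": False},
    {"scanner": "A", "tray": "black", "threat": False},
    {"scanner": "B", "tray": "grey", "threat": True},
    {"scanner": "B", "tray": "black", "threat": True},
    {"scanner": "B", "tray": "grey", "threat": True},
    {"scanner": "B", "tray": "black", "threat": True},
]

flagged = audit_cues(records, ["scanner", "tray"])
for field, value, rate in flagged:
    print(f"{field}={value}: threat rate {rate:.2f}")
```

In this toy dataset the scanner is flagged because it separates threat from benign images perfectly, while the tray colour is not, since each colour appears equally often in both classes. A real audit would also have to consider cues that are not recorded in metadata at all.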

Q: Where do you foresee the capacity for these deep learning machines five years down the road?
The vision that my group is currently working towards is a system that would report on the recognisable presence of danger, illegality and value within a bag or cargo. I’m very confident that a system with super-human performance, and processing times of less than a second, is achievable within five years. Of course, these systems, while being able to perform better than humans, would also be tireless, attentive, compliant and beyond corruption. But, as I mentioned earlier, automated anomaly detection is much more difficult and I’m less confident, though still optimistic, that we can achieve human levels of performance in that timescale.

Q: What are, in your view, the next steps in developing this approach and possibly in testing it, in an actual airport setting?
First the technology needs to be brought to a readiness level in a lab setting. We’re currently working on scanning hundreds of different categories of objects so that we can get general-purpose recognition of bag contents working. That will give us a base from which to extend to specific threat targets. We’re also working on anomaly detection and we’re getting better at appearance and semantic anomalies; however, appearance-given-semantics (e.g. it looks like a laptop, but somehow wrong) is much tougher. Initially we’ll test the system on our own staged benign and threat bags.

Next, we’ll be working with a UK-based company called Iconal, which is establishing testing methodologies and datasets that are suitable for bridging the gap between lab testing and testing on site. If we can get good enough performance on their tests, then we’d be ready for something on site. It’s not easy for academics in universities to understand the processes of getting test systems into airports, but the Home Office in the UK has been helpful and proactive in facilitating this. Testing in situ should be relatively painless since we’re not changing the configuration of the screening pipeline; we just need to grab the images from the scanner and have our software report on the images. Then our software results (detections and false alarms) can be compared to what the screeners actually report on real bags. Since real threats are so rare, we’d need to send some staged threats through and see how often humans and the software can pick them up. It’s a process that requires great care. Frankly, the developers of AI systems can’t be trusted to do the testing, which is why companies like Iconal, who are really thinking hard about the problem but without skin in the game, are so important to our research.
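
As an editorial illustration of the comparison described above, the sketch below scores a screener and the software on the same set of bags, some of which contain staged threats, and reports a detection rate and a false-alarm rate for each. The decisions are made-up values chosen only to show the calculation; the function name `rates` and the data layout are assumptions, not part of any actual trial protocol.

```python
def rates(decisions, truth):
    """Return (detection rate, false-alarm rate) for a list of alarm decisions.

    decisions: True where the screener or the software raised an alarm.
    truth:     True where the bag actually contained a staged threat.
    """
    if len(decisions) != len(truth):
        raise ValueError("decisions and truth must have the same length")
    threats = sum(truth)
    benign = len(truth) - threats
    if threats == 0 or benign == 0:
        raise ValueError("need at least one threat bag and one benign bag")
    hits = sum(1 for d, t in zip(decisions, truth) if d and t)
    false_alarms = sum(1 for d, t in zip(decisions, truth) if d and not t)
    return hits / threats, false_alarms / benign

# Hypothetical trial: 4 staged threat bags followed by 6 benign bags.
truth    = [True, True, True, True, False, False, False, False, False, False]
screener = [True, True, False, False, True, False, False, False, False, False]
software = [True, True, True, False, True, True, False, False, False, False]

for name, decisions in (("screener", screener), ("software", software)):
    detection, false_alarm = rates(decisions, truth)
    print(f"{name}: detection {detection:.2f}, false alarms {false_alarm:.2f}")
```

Reporting the two rates side by side matters because a system can always raise its detection rate by alarming on more bags; only the combination shows whether it is genuinely better than the screener.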

Our thanks to Dr. Griffin for sharing his insights into a leading-edge technology that will, we hope, improve screening operations in the near future. We’re looking forward to reading about his work in future publications.

In closing, we believe that AI will extend the human capability at checkpoints as it has already done for a number of other business processes, through a collaborative process involving Machine + Human: “In essence, machines are doing what they do best: performing repetitive tasks, analysing huge data sets, and handling routine cases. And humans are doing what they do best: resolving ambiguous information, exercising judgment in difficult cases, and dealing with dissatisfied customers.”⁷

Yves Duguay is a security professional who has held a number of executive positions in public safety and civil aviation. While at the Royal Canadian Mounted Police, he directed the national programme to combat money laundering. He joined Air Canada in 2000, to lead its security operations and to develop the first IOSA certified SeMS in North America. While at Air Canada he also chaired the IATA security committee. He joined the Canadian Air Transport Security Authority in 2007 as Senior Vice President of operations and customer experience, where he was responsible for the screening of passengers in 89 airports. He left CATSA in 2013 to launch HCiWorld, where he continues serving clients involved in the transportation industry. Yves completed his MBA from McGill University and HEC Montreal in 2012 and has received the ICD.D certification from the Institute of Corporate Directors. In March 2018, he was named to the Transportation Appeal Tribunal of Canada.

  1. https://www.economist.com/news/science-and-technology/21711016-artificial-intelligence-moves-security-scanning-machines-are-learning-find
  2. http://compass.cs.ucl.ac.uk/
  3. Obtained from: http://www.gao.gov/new.items/d08958.pdf
  4. Obtained from: https://www.oig.dhs.gov/assets/Mgmt/2015/OIG-15-150-Sep15.pdf
  5. US Department Of Energy (DOE) Standard, (2009), « Human performance improvement handbook, Vol. 1 », p. 1-10; retrieved from http://energy.gov/sites/prod/files/2013/06/f1/doe-hdbk-1028-2009_volume1.pdf
  6. Mitroff, Stephen R. and Biggs, Adam T. (2014), « The ultra-rare-item effect: visual search for exceedingly rare items is highly susceptible to error », Psychological Science, Vol. 25(1), 284-289
  7. Daugherty, Paul R. and Wilson, H. James (2018), Human + Machine: Reimagining Work in the Age of AI, Harvard Business Review Press, Boston, Massachusetts (e-book version).