The Realities of AI in Cybersecurity: Catastrophic Forgetting

BY Soko Directory Team · March 25, 2021 02:03 pm

There is a lot of hype about the use of artificial intelligence (AI) in cybersecurity. The truth is that the role and potential of AI in security are still evolving and often require experimentation and evaluation.

SophosAI is committed to openly sharing its data science research with the security community in order to make the use of AI more transparent and influence how AI is positioned and discussed in cybersecurity. Details of other initiatives shared as part of this objective are available in the SophosAI blog.

Catastrophic forgetting: What is it?

Malware detection is the cornerstone of IT security and AI is the only approach capable of learning patterns from millions of new malware samples within a matter of days.

But there is a catch. Should the model be retrained on every malware sample ever collected, which gives the best detection but makes learning and updates slower? Or should it be selectively fine-tuned on new samples, which lets it keep pace with the rate of change of malware but runs the risk of forgetting older patterns (a problem known as catastrophic forgetting)?

Retraining the whole model takes about one week. A good fine-tuning model should take about one hour to update.

SophosAI wanted to see if it was possible to build a fine-tuning model that could keep up with the evolving threat landscape, learning new patterns while still remembering older ones, with minimal impact on performance. Researcher Hillary Sanders evaluated a number of update options and has detailed her findings in the SophosAI blog.

The detection dilemma

Keeping detection capabilities up to date is a constant battle. With every step we take towards defending against a malicious attack, adversaries are already developing new ways to get around it, releasing updates with different code or techniques. The result is that hundreds of thousands of new malware samples appear every day.

Detection is made even harder by the fact that the latest-and-greatest malware is rarely completely “new.” Instead, it is more likely to be a combination of new, old, shared, borrowed, or stolen code and adopted and adapted behaviors. Further, old malware can re-emerge after years in the wilderness, co-opted into an adversary’s latest arsenal to take defenses by surprise.

Detection models need to ensure they can continue to detect older malware samples and not just the most recent ones.

Updating AI detection models

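When it comes to updating AI detection models with new malware samples, vendors have a choice between two options.

The first is to retrain the detection model on all samples, old and new. This keeps the older patterns in front of the model so they are not forgotten, but training on more data takes longer, so the model updates more slowly and is released less often.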

The second is to update the detection model only on new samples. This is known as fine-tuning. At each step of the fine-tuning process, the model adjusts what it has learned to fit the new samples, and those adjustments also change how it responds to the patterns it learned earlier. As a result, the model can “forget” old patterns (“catastrophic forgetting”). However, training on less data means the model updates faster and can be released more frequently, keeping better pace with the rapid rate of change of malware.
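
To make the trade-off between retraining and fine-tuning concrete, here is a minimal Python sketch. It is not SophosAI's model or method. It assumes a toy perceptron and three hypothetical binary features (uses_old_packer, uses_new_exploit, signed_by_trusted_vendor), and the four samples are invented purely for illustration.

```python
# Toy illustration only. Each sample is a tuple of three hypothetical features:
# (uses_old_packer, uses_new_exploit, signed_by_trusted_vendor), label 1 = malware.
OLD = [((1, 0, 0), 1),   # older malware
       ((0, 0, 0), 0)]   # older benign file
NEW = [((0, 1, 0), 1),   # newer malware
       ((1, 0, 1), 0)]   # newer benign file that shares a trait with old malware


def predict(model, x):
    """Return 1 (malware) if the weighted score is positive, else 0."""
    weights, bias = model
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def train(samples, model=None, max_epochs=100):
    """Perceptron training. Starts from `model` if given (fine-tuning),
    otherwise from all-zero weights (training from scratch)."""
    weights, bias = model if model is not None else ([0, 0, 0], 0)
    weights = list(weights)
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in samples:
            error = label - predict((weights, bias), x)
            if error != 0:
                mistakes += 1
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
        if mistakes == 0:  # every sample classified correctly
            break
    return weights, bias


def accuracy(model, samples):
    """Fraction of samples the model classifies correctly."""
    return sum(predict(model, x) == y for x, y in samples) / len(samples)


base = train(OLD)                    # original model, trained on old samples
fine_tuned = train(NEW, model=base)  # option two: update on new samples only
retrained = train(OLD + NEW)         # option one: retrain on everything

print(accuracy(fine_tuned, NEW))  # 1.0
print(accuracy(fine_tuned, OLD))  # 0.5 (the older malware sample is now missed)
print(accuracy(retrained, OLD + NEW))  # 1.0
```

In this toy setup, the fine-tuned model classifies both new samples correctly but no longer flags the older malware sample, while the model retrained on all four samples gets every one right, at the cost of more passes over more data. Real detection models are far larger, but the mechanism is the same in spirit: updates driven only by new samples can overwrite weights that older detections relied on.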

Regardless of the option chosen, it is critical to keep training AI detection models on new samples.

The patterns that AI learns from malware samples enable it to generalize and detect not only what it was trained on, but also never-before-seen samples that bear at least some resemblance to the training data. Over time, however, new samples will begin to deviate enough that an old model’s effectiveness will decay, and it will need to be updated.

The following figure visualizes how detection performance declines over time if models are not updated when new samples appear. On the left are the older samples the model has been trained on. The detection rate is consistently strong. To the right are the new samples the model has not yet learned, so detection is weaker.
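
A decay curve of this kind can be measured by grouping known-malicious samples by when they first appeared and computing the share that a frozen model detects in each period. The sketch below is one simple way to do that in Python. The function name, the record layout, and the sample records are all hypothetical placeholders, not SophosAI data or results.

```python
from collections import defaultdict


def detection_rate_by_period(records):
    """Compute the detection rate per time period.

    `records` is an iterable of (period, was_detected) pairs, one per
    known-malicious sample, where `period` is when the sample first appeared.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for period, was_detected in records:
        totals[period] += 1
        hits[period] += int(was_detected)
    # Sort periods so the result reads left to right in time order.
    return {period: hits[period] / totals[period] for period in sorted(totals)}


# Invented placeholder records: the hypothetical model was last trained
# on samples seen through 2020-12 and has not been updated since.
records = [
    ("2020-11", True), ("2020-11", True),
    ("2020-12", True), ("2020-12", True),
    ("2021-01", True), ("2021-01", False),
    ("2021-02", False), ("2021-02", False),
]

rates = detection_rate_by_period(records)
for period, rate in rates.items():
    print(period, rate)
```

With these placeholder records, the rate stays at 1.0 for the periods the model was trained on and falls to 0.5 and then 0.0 for the later ones, which is the left-to-right decline the figure describes.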