Big Data and Machine Learning Solutions

By Ben Zilberman December 03, 2019

When was the last time someone tried to sell you technology that has no machine learning algorithms in it? Are computing systems that are not based on machine learning still any good? How can you tell the difference?

Why Machine Learning?

Machine learning tries to solve a critical problem: how to identify malicious behavior. But attackers can also use it to overcome information security technologies. Most machine learning engines analyze log files for anomalous patterns to learn from, so they can better protect the network or the application in the future. What we are after is the traffic with malicious intent that can trick our known, static security heuristics.

In addition, it is impossible for humans to sift through the massive quantity of data generated by network logs and other sources of data to identify potential attacks. The number of false alarms that waste the limited resources of the security team is enormous.

Machine learning algorithms, on the other hand, excel at analyzing log data to accurately identify and classify anomalous behavior or the subtle differences that indicate a compromise attempt.

Supervised vs. Unsupervised Machine Learning

For detection, two approaches apply: supervised and unsupervised machine learning.

Supervised learning entails providing the algorithm a “training set” of examples that include pairs of input data and the desired or predetermined output or classification.

In the case of attack detection, the training set includes input data for both benign and malicious behaviors paired with the correct classification or identification. When applied to attack detection, supervised machine learning leverages a rich training set through rigorous analysis of communication attributes such as day/time stamp, duration, path, periodicity (and dozens of others) AND the inter-relationships between these attributes.

When a new and unknown data set is introduced, the algorithm determines whether it contains a record of benign or malicious communication. The learning algorithm also provides a confidence level for its identification. Security policies can then define the appropriate course of action, built around the identification and the confidence level. Supervised machine learning algorithms are not constrained to recognizing only those patterns found in the training set or even the updated knowledge set, and can identify brand new malicious attacks based on the underlying algorithms.
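As a rough illustration of how a label plus a confidence level can drive a security policy, here is a minimal sketch in Python. It uses a k-nearest-neighbours vote as a simple stand-in for whatever supervised algorithm a real product uses. The two attributes (session duration and requests per minute), the tiny training set, and the 0.9 blocking threshold are all assumptions made up for this example.

```python
import math
from collections import Counter

# Hypothetical labeled training set:
# (duration_seconds, requests_per_minute) paired with the known classification.
TRAINING_SET = [
    ((30.0, 4.0), "benign"),
    ((45.0, 6.0), "benign"),
    ((60.0, 5.0), "benign"),
    ((2.0, 90.0), "malicious"),
    ((1.0, 120.0), "malicious"),
    ((3.0, 80.0), "malicious"),
]


def classify(sample, k=3):
    """Return (label, confidence) from a vote among the k nearest examples."""
    nearest = sorted(TRAINING_SET, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / k


def decide(label, confidence, block_threshold=0.9):
    """Illustrative policy built around the label and its confidence."""
    if label == "malicious" and confidence >= block_threshold:
        return "block"
    if label == "malicious":
        return "challenge"  # not confident enough to block outright
    return "allow"


label, confidence = classify((15.0, 50.0))
print(label, round(confidence, 2), decide(label, confidence))
```

In this sketch a borderline record is challenged rather than blocked, which is one way a policy can use the confidence level instead of the label alone.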

The risk? False negatives! When the algorithm doesn’t recognize a new pattern, a bad actor will be let through.

In unsupervised learning, there is no training set and no predetermined identification. As it consumes data, the algorithm typically looks for common patterns so it can create clusters. Put simply, it separates normal from anomalous behavior: for example, communication with an unusual site at a non-standard time of day. The algorithm provides indicators for detected anomalies. These anomalies might relate to malicious or benign communication and may well require additional effort to thoroughly investigate and characterize.
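The idea of flagging unusual patterns without any labels can be sketched in a few lines of Python. This example uses a plain frequency baseline, which is a much simpler stand-in for real clustering. The log records, the host names, and the 2% rarity threshold are hypothetical.

```python
from collections import Counter

# Hypothetical unlabeled log records: (destination, hour_of_day).
LOG = (
    [("intranet.example", 9)] * 40
    + [("mail.example", 10)] * 35
    + [("intranet.example", 14)] * 24
    + [("unknown-site.example", 3)]
)


def find_anomalies(records, min_share=0.02):
    """Flag (destination, hour) patterns that are rare in the observed data.

    No labels are involved: a pattern is anomalous only because it is
    uncommon, not because it is known to be malicious.
    """
    counts = Counter(records)
    total = len(records)
    return sorted(pattern for pattern, n in counts.items() if n / total < min_share)


print(find_anomalies(LOG))
```

The flagged pattern is only an indicator. Whether a 3 a.m. connection to an unfamiliar site is an attack or a night-shift employee still has to be investigated.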

The risk? False positives! Since unsupervised learning lacks guidance, there is a chance it will block real users from accessing the resources they need.

If a team has enough security experts and data scientists, a security tool focused on unsupervised machine learning may be the right choice. Unfortunately, most teams aren’t so privileged.

The Best of Both Worlds

Semi-supervised deep learning algorithms cluster behaviors by commonalities (unsupervised), while receiving some guidance (supervised). The key to success is exposure to as much data as possible, so the engine can improve its self-tuning capabilities (feedback loops) based on as many variations of behavior as possible. In addition, a big data lake makes it possible to account for more and more of the indicators and parameters required to eventually make classification as accurate as possible.
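To make the combination concrete, the following Python sketch clusters unlabeled behavior around a handful of labeled seeds and re-estimates the cluster centers on each pass, a crude version of a feedback loop. It is a seeded nearest-centroid scheme, not deep learning, and the feature pair (mouse-path curvature, transactions per second) and all values are assumptions chosen for illustration.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equally sized tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def seeded_clusters(labeled, unlabeled, rounds=3):
    """Group unlabeled points around labeled seeds.

    labeled:   dict mapping a label to a list of known example points
    unlabeled: list of points with no label
    Returns a dict mapping each label to its seeds plus assigned points.
    """
    groups = {label: list(points) for label, points in labeled.items()}
    for _ in range(rounds):
        # Feedback step: centers are re-estimated from the previous grouping.
        centers = {label: centroid(points) for label, points in groups.items()}
        groups = {label: list(points) for label, points in labeled.items()}
        for point in unlabeled:
            nearest = min(centers, key=lambda label: math.dist(point, centers[label]))
            groups[nearest].append(point)
    return groups


# Hypothetical, unscaled features: (mouse_path_curvature, transactions_per_second).
seeds = {"human": [(0.9, 1.0)], "bot": [(0.1, 20.0)]}
observed = [(0.8, 2.0), (0.85, 1.5), (0.2, 18.0), (0.15, 25.0)]
print(seeded_clusters(seeds, observed))
```

The more unlabeled observations the loop sees, the better the centers reflect real behavior, which is the reason the amount of data matters so much.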

The best example is the detection of sophisticated bots. When a bot can mimic human behavior and bypass security controls, it is classified as a legitimate user. Such a bot can generate non-linear mouse movements and keystrokes. The supervised algorithm knows that such behavior is legitimate, and will let the bot in. The unsupervised algorithm will add this behavior to the cluster of common behaviors without knowing if it’s good or bad. Next, it might block a real user. If this happens time and again, the source will be blocked at a certain point. This is called shallow learning.

Some attackers know that, and don’t launch the attack until the algorithm clears their source. Because some bots distribute their activities over multiple sessions, it is key to store the correlation of activities and violations over time in the data lake and leverage it for decision making. Had there been enough data, the bot might have been classified otherwise. For example, source IP, device ID, TPS (transactions per second), and response to challenges are some of the indicators that help determine whether this is a bot and block it on the first transaction.
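A minimal Python sketch of correlating violations across sessions might look like the following. The in-memory dictionary stands in for the data lake, and the `Session` fields, the TPS limit of 5, and the rule of blocking after three violations are hypothetical choices, not values taken from any product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Session:
    source_ip: str
    device_id: str
    tps: float              # transactions per second seen in this session
    passed_challenge: bool  # whether the client answered a challenge correctly


class ViolationHistory:
    """Toy stand-in for the data lake: violation counts kept per source."""

    def __init__(self, tps_limit=5.0, block_after=3):
        self.tps_limit = tps_limit
        self.block_after = block_after
        self.violations = defaultdict(int)

    def observe(self, session):
        """Record this session's violations and decide using the full history."""
        key = (session.source_ip, session.device_id)
        if session.tps > self.tps_limit:
            self.violations[key] += 1
        if not session.passed_challenge:
            self.violations[key] += 1
        return "block" if self.violations[key] >= self.block_after else "allow"


history = ViolationHistory()
for session in [
    Session("203.0.113.7", "dev-1", 6.0, True),
    Session("203.0.113.7", "dev-1", 2.0, False),
    Session("203.0.113.7", "dev-1", 7.0, True),
]:
    print(history.observe(session))
```

Each session on its own shows a single mild violation. Only the accumulated history for the same source and device crosses the blocking threshold.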

5 Questions to Ask

When evaluating machine learning solutions, be sure to ask the following questions:

What does it learn?
What does it avoid learning?
How quickly will it adapt to changes?
Can it self-tune automatically?
How rich is the data it collects?

Don’t be naïve. Learn more.

Read “The Ultimate Guide to Bot Management” to learn more.

Ben Zilberman

Ben Zilberman is a director of product marketing, covering application security at Radware. In this role, Ben specializes in web application and API protection, as well as bot management solutions. In parallel, Ben drives some of Radware’s thought leadership and research programs. Ben has over 10 years of diverse experience in the industry, leading marketing programs for network and application security solutions, including firewalls, threat prevention, web security and DDoS protection technologies. Prior to joining Radware, Ben served as a trusted advisor at Check Point Software Technologies, where he led channel partnerships and sales operations. Ben holds a BA in Economics and an MBA from Tel Aviv University.
