home@gauravsuryawanshi: ~$
Back to blog

The Double-Edged Sword: What I Learned About ML in Cybersecurity and the Rise of AI Defense

6 min readGaurav Suryawanshi
#Machine Learning#AI Security#Cybersecurity#Adversarial ML#HiddenLayer#Data Poisoning#Model Extraction#MLOps#AI Defense#Secure AI#LLM Security

Over the past few weeks, I've been diving deep into one of the most fascinating intersections in technology today: the fusion of Machine Learning (ML) and Cybersecurity.

Through a series of lectures, discussions, and an eye-opening guest session led by expert Kasimir Schulz from HiddenLayer, I've had the chance to explore not just how ML is used to strengthen cybersecurity, but also how AI systems themselves introduce new, critical security risks.

I went into this experience curious about how ML could automate detection. I came out with a fundamental shift in perspective: AI is both a powerful defensive tool and an emerging attack surface that requires its own category of protection.

Here is a breakdown of what I learned—from the fundamentals of ML-driven defense to the cutting-edge world of adversarial attacks on AI.


Why Machine Learning is No Longer Optional

I used to view cybersecurity through a traditional lens: firewalls, static rules, signatures, and human analysts combating alerts. However, early in the lecture series, it became obvious that the sheer scale and complexity of modern threats have outgrown those approaches.

Attackers are automating. They are mutating malware and generating phishing domains at scale. Defenders can no longer rely on manual processes or static rules. This is where Machine Learning steps in, offering capabilities that traditional security desperately needs:

  • Learning from Data, Not Rules: ML models analyze terabytes of logs and network flows to spot patterns humans would never explicitly codify.
  • Generalization: Instead of depending on known signatures, ML systems can detect anomalies based on historical baselines—crucial for catching zero-days.
  • Automation at Scale: ML can triage alerts and surface the most urgent findings, relieving overwhelmed security teams.
  • Real-Time Inference: Detections often happen in milliseconds, far faster than any human investigation.

We covered applications ranging from Intrusion Detection Systems (IDS) and Malware Classification to User and Entity Behavior Analytics (UEBA). But as we dug deeper, I realized that building these tools is far from simple.


The Pipeline Problem: It's Messier Than It Looks

One of the biggest lessons I learned is that the model is only a small fraction of the system. A real ML-driven cybersecurity system involves a massive pipeline—data ingestion, cleaning, feature engineering, training, deployment, and monitoring.

And this pipeline is fragile.

Security datasets are often noisy and heavily imbalanced (far more benign data than malicious). Labels can be incorrect, and external libraries used for feature extraction introduce their own assumptions. But the most significant challenge is concept drift. The threat landscape evolves constantly; if a model isn't continuously retrained, even small shifts in attacker behavior can silently break detection performance.

This fragility sets the stage for something I did not fully appreciate until our guest lecture: The models themselves are high-value targets.


The HiddenLayer Session: When AI Becomes the Attack Surface

Kasimir Schulz from HiddenLayer joined us to discuss a topic often overlooked in academic courses: Securing AI systems themselves.

This was the most impactful session of the semester. HiddenLayer specializes in identifying vulnerabilities in ML systems, and their insights made me realize that ML introduces vulnerabilities that simply do not exist in traditional software.

They broke these risks down into four primary categories:

1. Data Poisoning Attacks

Attackers manipulate training data so the model learns the "wrong" thing. This creates a backdoor. What struck me is how easy this is in systems that rely on user-submitted data or open-source datasets. Even a small percentage of poisoned samples can have devastating downstream effects.

2. Evasion Attacks

These happen at inference time. Attackers craft inputs—like malware with subtly altered bytes or adversarial images with invisible perturbations—specifically designed to fool the model. Kasimir demonstrated how small these changes can be; if an attacker bypasses the classifier, the security system becomes blind.

3. Model Extraction

This is effectively "stealing the model." By querying an API enough times, an attacker can reconstruct decision boundaries and clone the model's weights. For security companies, this means their intellectual property and detection logic can be leaked and reverse-engineered.

4. Inference Manipulation & Prompt Injection

Specific to Generative AI and LLMs, attackers can use "jailbreak" prompts or hidden adversarial strings to bypass safety filters or hijack agentic workflows.

A key insight from Kasimir:
"LLMs are not 'code' in the traditional sense. They're stochastic, unpredictable systems with undocumented behaviors—and attackers love that."


Why Traditional Security Fails AI

One of the most important takeaways was understanding why we can't just use standard security tools to protect AI.

  • Traditional Vulnerabilities stem from code injection, bad access control, or memory corruption.
  • AI Vulnerabilities stem from mathematical properties, data sensitivity, and non-deterministic behavior.

You cannot simply "patch" a model that generalizes in unsafe ways; you often have to retrain it. AI security is continuous, probabilistic, and adversarial by design.

The demos provided by the HiddenLayer team showing models falling apart due to mathematical tricks—made it clear: If organizations deploy AI without securing it, attackers will weaponize the quirks of ML faster than defenders can respond.


The Path Forward: Responsible and Secure AI

Connecting these lectures, the duality of AI in cybersecurity is clear. We depend on ML for defense, but attackers will target ML to disable us. It is a loop.

Most organizations are currently unprepared. They deploy models without threat modeling, lack visibility into their AI supply chain, and don't monitor for adversarial attacks. To stay ahead, we need to treat ML models as assets, dependencies, and attack surfaces all at once.

We need a new set of controls, including:

  1. AI-Specific Logging: Tracking model drift and adversarial patterns.
  2. Data Provenance: Ensuring training data is authentic.
  3. Red Teaming: Systematically probing models for failure modes.
  4. Secure MLOps: Protecting the training and deployment infrastructure.

Final Thoughts

Before these sessions, I thought of machine learning in cybersecurity as a linear advancement. I now understand it is a new ecosystem with its own risks and adversaries.

The future of cybersecurity isn't just "more AI." It is Secure AI. The defenders who understand how to use ML to enhance security—while simultaneously defending the models themselveswill be the ones who survive the coming wave of AI-native threats.