Evil Models and Exploits: When AI Becomes the Attacker


Artificial intelligence is redefining industries at a staggering pace, and the field of cybersecurity is no exception. From coding assistants to penetration testing tools, we are witnessing the emergence of AI-driven mechanisms that amplify productivity and problem-solving.

However, the same tools that can enhance development workflows can also empower malicious actors. Here are four ways that AI is reshaping hacking and malware development, and how we can stay vigilant in response.

1. Agent-Augmented Hacking

The concept of agent-augmented hacking, in which AI agents are used to plan and carry out attack steps in novel and potentially destructive ways, is quickly moving from hypothetical to inevitable.

Anyone who has used AI-powered coding assistants like GitHub Copilot, Cursor or similar tools is familiar with the input-feedback loop. This process involves AI systems accessing a shell, executing commands, capturing output and using that feedback to refine subsequent instructions. In development, this iterative process enables coding assistants to create more effective and accurate code by directly understanding execution environments.
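As a rough sketch of that loop, consider the following Python, where query_model() is a hypothetical placeholder for whatever LLM API the assistant wraps and pytest stands in for whatever toolchain produces the feedback:

```python
import subprocess

def query_model(prompt: str) -> str:
    """Hypothetical helper that sends a prompt to an LLM and returns its reply."""
    raise NotImplementedError("wire this to the model API of your choice")

def refine_until_tests_pass(source_file: str, max_rounds: int = 5) -> bool:
    """Run the tests, feed failures back to the model, apply its patch, repeat."""
    for _ in range(max_rounds):
        # Execute the tests and capture everything the toolchain prints.
        result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
        if result.returncode == 0:
            return True  # tests pass; the loop has converged

        # Feed the captured output back to the model and ask for a revised file.
        feedback = result.stdout + result.stderr
        new_code = query_model(
            f"The tests failed with:\n{feedback}\n"
            f"Rewrite {source_file} so they pass. Return only the file contents."
        )
        with open(source_file, "w") as f:
            f.write(new_code)
    return False
```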

The same concept can be weaponized. Attackers can integrate the input-feedback loop into their toolkits, using AI to orchestrate popular penetration testing tools like OWASP ZAP, Nmap or Nikto2. The outputs of these tools could be piped directly into an AI model, allowing the system to craft tailored exploit code based on the findings.

Should the initial exploit fail, the system’s feedback loop enables it to iterate — adjusting, refining and retrying until successful. This process drastically reduces the time and expertise required to identify and exploit vulnerabilities, marking a fundamental shift in the landscape of cybersecurity threats.
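To see why this lowers the bar, here is the same loop wrapped around a scanner, again with a hypothetical query_model() stand-in and the assumption that the target is a lab system the operator is authorized to test; the point is the plumbing, not any particular exploit:

```python
import subprocess

def query_model(prompt: str) -> str:
    """Hypothetical LLM wrapper; substitute a real client."""
    raise NotImplementedError

def scan_and_triage(target: str) -> str:
    """Run an Nmap service scan against an authorized lab target and let a
    model propose what to investigate next, closing the feedback loop."""
    scan = subprocess.run(
        ["nmap", "-sV", "--top-ports", "100", target],
        capture_output=True,
        text=True,
    )
    # The scanner's raw output becomes model context, the same
    # input-feedback pattern used by coding assistants.
    return query_model(
        "Here is Nmap output from a system I am authorized to test:\n"
        f"{scan.stdout}\n"
        "List the services that look outdated and what to check next."
    )
```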

2. Model Context Protocol

A more structured threat emerges with technologies like the Model Context Protocol (MCP). Originally introduced by Anthropic, MCP gives large language models (LLMs) a standardized, JSON-RPC-based way to discover and call tools exposed by servers running on a host machine. This enables LLMs to perform sophisticated operations by controlling local resources and services.
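To make that concrete, the sketch below shows a minimal MCP server exposing a local capability to whichever model a connected client is driving. It assumes the official MCP Python SDK and its FastMCP helper; a tool like this that reads files (or shells out) gives the model direct reach into the host, which is exactly the double-edged capability at issue.

```python
# Minimal MCP server sketch, assuming the official MCP Python SDK
# (pip install mcp) and its FastMCP helper.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("local-ops")

@mcp.tool()
def read_log(path: str) -> str:
    """Return the last 50 lines of a local log file to the connected model."""
    with open(path) as f:
        return "".join(f.readlines()[-50:])

if __name__ == "__main__":
    # Serves the tool over stdio; any MCP-capable client, and the model
    # behind it, can now call read_log() on this machine.
    mcp.run()
```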

While MCP is being embraced by developers for legitimate use cases, such as automation and integration, its darker implications are clear. An MCP-enabled system could orchestrate a range of malicious activities with ease. Think of it as an AI-powered operator capable of executing everything from reconnaissance to exploitation.

For now, these capabilities are likely to surface in white-hat research, offering a preview of how attackers might use such tools. But it’s only a matter of time before malicious actors follow suit, introducing a new level of sophistication and autonomy in cyberattacks.

3. Evil Models

The proliferation of AI models is both a blessing and a curse. Platforms like Hugging Face host over a million models, ranging from state-of-the-art neural networks to poorly designed or maliciously altered versions. Amid this abundance lies a growing concern: model provenance.

Imagine a widely used model, fine-tuned by a seemingly reputable maintainer, turning out to be a tool of a state actor. Subtle modifications in the training data set or architecture could embed biases, vulnerabilities or backdoors. These “evil models” could then be distributed as trusted resources, only to be weaponized later.

This risk underscores the need for robust mechanisms to verify the origins and integrity of AI models. Initiatives like Sigstore and frameworks such as SLSA (Supply-chain Levels for Software Artifacts), which already underpin software provenance verification, must extend their efforts to encompass AI models and datasets. Without such safeguards, the community remains vulnerable to manipulation at scale.
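Signing and attestation for models are still maturing, but even a simple integrity pin catches a silently swapped artifact. Below is a minimal sketch using nothing more than a SHA-256 digest published by the maintainer; the file name and digest are placeholders, and hash pinning is a complement to, not a substitute for, signed SLSA provenance:

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_artifact(path: str, expected_digest: str) -> None:
    """Refuse to load a model whose weights do not match the pinned digest."""
    actual = sha256_of(path)
    if actual != expected_digest:
        raise RuntimeError(
            f"{path} failed integrity check: expected {expected_digest}, got {actual}"
        )

# Example usage, with a digest the maintainer publishes alongside the release:
# verify_artifact("model.safetensors", "<published sha256 hex digest>")
```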

4. Privacy Risks and PII Regurgitation

AI models are trained on vast amounts of data, much of it scraped from the internet or uploaded by users. This data often includes sensitive personally identifiable information (PII), secrets and tokens. The result? Models can inadvertently regurgitate fragments of this sensitive information in their outputs.

Consider a scenario where users turn to AI for therapy or personal guidance. The PII embedded in these interactions, if included in subsequent training cycles, could resurface as part of a model’s output. As adoption grows, so too does the risk of sensitive data exposure.
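One mitigation is to scrub obvious PII and secrets before a prompt ever leaves the user's machine or is retained for training. The sketch below is deliberately minimal; the regexes are illustrative placeholders, and a real deployment would use a proper detection engine rather than a handful of patterns:

```python
import re

# Illustrative patterns only; real PII detection needs far more than a few
# regexes (names, addresses, free-text identifiers, contextual clues).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|ghp)_[A-Za-z0-9_]{20,}\b"),
}

def scrub(text: str) -> str:
    """Replace obvious PII and secrets with typed placeholders before the
    text is sent to a model or stored for training."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label.upper()}>", text)
    return text

print(scrub("Reach me at jane@example.com, key sk_live_abcdefghijklmnopqrstu"))
# -> Reach me at <EMAIL>, key <API_KEY>
```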

This issue could spark a much-needed privacy movement, where users demand greater transparency about how their data is used. The age-old adage that “users are the product” may gain new relevance in the AI era, leading to tighter regulations and technological safeguards.

Mitigating the Risks: A Call to Action

As the cybersecurity landscape evolves, developers, enterprises and open source communities must adapt. The threats posed by AI, including enhanced hacking capabilities and privacy violations, are daunting but not insurmountable. Here are three key areas to focus on:

  1. Standardizing model provenance: The open source community must prioritize transparency and verification in the AI supply chain. Tools like Sigstore and SLSA should become standard practice for validating models and their training datasets.
  2. Building defensive AI systems: Just as attackers use AI to amplify their capabilities, defenders must do the same. This includes leveraging AI for real-time threat detection, vulnerability analysis and anomaly detection to stay ahead of evolving threats (a toy sketch of such anomaly detection follows this list).
  3. Privacy-first AI practices: Protecting user data should be a cornerstone of AI development. Local agents that filter sensitive data before it leaves the user's machine can provide privacy protections for coding assistants and represent a step in the right direction. Broader adoption of privacy-focused technologies will be critical.
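As a toy illustration of the anomaly detection mentioned in point 2, the following sketch flags an account whose behavior departs sharply from a learned baseline using scikit-learn's IsolationForest; the features and numbers are made up for the example:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Toy feature matrix: one row per account, columns are (requests per hour,
# distinct source IPs, failed logins). Real pipelines use far richer features.
rng = np.random.default_rng(0)
baseline = rng.normal(loc=[50, 2, 1], scale=[10, 1, 1], size=(500, 3))
suspicious = np.array([[900, 40, 25]])  # traffic burst from many IPs

detector = IsolationForest(contamination=0.01, random_state=0).fit(baseline)
print(detector.predict(suspicious))  # [-1] marks the row as anomalous
```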

Conclusion

AI’s potential to transform cybersecurity is immense, but so are the risks. From agent-augmented hacking to privacy violations, the industry is facing challenges that demand proactive solutions. The need for verifiable AI models, privacy safeguards and AI-enhanced defenses has never been more urgent.

At Stacklok, we’re committed to addressing these challenges. We recently made CodeGate, a local privacy protection system for coding assistants and agents, open source as part of our mission to make AI both secure and trustworthy. The road ahead is uncertain, but with vigilance and collaboration, we can shape a future where AI amplifies security rather than undermines it.

