Itay Nakash

I’m a Research Scientist at IBM Research, working on NLP and LLM-based agents.

My research focuses on improving how language models and agents are trained, evaluated, and deployed in practical settings. I have worked on AI safety and security, red-teaming, agent evaluation, alignment, and LLM efficiency, with publications at conferences including NAACL, ACL, EMNLP, and COLM.

More broadly, I’m interested in applied research that connects new technical ideas with real-world AI systems: building methods that are useful, measurable, and valuable beyond paper settings.

News

Apr 29, 2026	Excited to share that our paper Efficient Agent Evaluation via Diversity-Guided User Simulation was accepted to ACL 2026! TL;DR: DIVERT offers an efficient, coverage-based user simulator for evaluating LLM agents.
Jan 15, 2026	Happy to share that ideas and findings from our CRAFT paper (EMNLP 2025) made their way into IBM watsonx Orchestrate’s LLM agent vulnerability testing. We advised and co-designed the red-teaming components of the Agent Evaluation system together with the software teams.
Jan 01, 2025	I’m starting my full-time role as an NLP Researcher at IBM Research, focusing on Gen-AI safety and agentic AI security. Looking forward to tackling new challenges in the field!

Selected Publications

ACL 2026
Efficient Agent Evaluation via Diversity-Guided User Simulation

Itay Nakash, George Kour, and Ateret Anaby-Tavor

2026


DIVERT is a snapshot-based, coverage-guided user simulation framework for evaluating interactive LLM agents. It branches from critical mid-trajectory states with diverse user responses, reusing shared conversation prefixes to reduce evaluation cost while uncovering more failures.

@misc{nakash2026efficientagentevaluation,
  title = {Efficient Agent Evaluation via Diversity-Guided User Simulation},
  author = {Nakash, Itay and Kour, George and Anaby-Tavor, Ateret},
  year = {2026},
  date = {2026-04-23},
  eprint = {2604.21480},
  archiveprefix = {arXiv},
  primaryclass = {cs.AI},
  url = {https://arxiv.org/abs/2604.21480},
}

NAACL 2025

Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In

Itay Nakash, George Kour, Guy Uziel, and Ateret Anaby Tavor

In Findings of the Association for Computational Linguistics: NAACL 2025, Apr 2025


Foot-in-the-door attacks exploit ReAct agents by first planting a harmless-looking action in the agent’s reasoning, which makes later malicious tool use more likely. The paper shows that indirect prompt injections can leverage this effect across models and that reflection-style checks during execution can reduce, but not eliminate, the risk.

@inproceedings{nakash-etal-2025-breaking,
  title = {Breaking {R}e{A}ct Agents: Foot-in-the-Door Attack Will Get You In},
  author = {Nakash, Itay and Kour, George and Uziel, Guy and Anaby Tavor, Ateret},
  editor = {Chiruzzo, Luis and Ritter, Alan and Wang, Lu},
  booktitle = {Findings of the Association for Computational Linguistics: NAACL 2025},
  month = apr,
  year = {2025},
  address = {Albuquerque, New Mexico},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2025.findings-naacl.363/},
  doi = {10.18653/v1/2025.findings-naacl.363},
  pages = {6499--6524},
  isbn = {979-8-89176-195-7},
  eprint = {2410.16950},
  archiveprefix = {arXiv},
  primaryclass = {cs.CR},
}

COLM 2025
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation

Itay Nakash, Nitay Calderon, Eyal Ben David, Elad Hoffer, and Roi Reichart

In Second Conference on Language Modeling, Apr 2025


AdaptiVocab improves domain-specific LLM efficiency by replacing underused vocabulary items with domain n-gram tokens and initializing their embeddings from existing tokens. Across niche domains and two 7B models, it cuts input and output token usage by more than 25 percent while preserving generation quality and downstream performance.

@inproceedings{nakash2025adaptivocab,
  title = {AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation},
  author = {Nakash, Itay and Calderon, Nitay and Ben David, Eyal and Hoffer, Elad and Reichart, Roi},
  booktitle = {Second Conference on Language Modeling},
  year = {2025},
  url = {https://openreview.net/forum?id=TyXf9dwpZP},
  eprint = {2503.19693},
  archiveprefix = {arXiv},
  primaryclass = {cs.CL}
}

EMNLP 2025

Effective Red-Teaming of Policy-Adherent Agents

Itay Nakash, George Kour, Koren Lazar, Matan Vetzler, Guy Uziel, and Ateret Anaby Tavor

In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Nov 2025


Policy-adherent customer-service agents can be manipulated by adversarial users who try to extract benefits that violate business rules. This paper introduces CRAFT, a policy-aware multi-agent red-teaming system, and tau-break, a tau-bench-based benchmark showing that current defenses help but do not fully protect agents from persuasive policy-breaking attacks.

@inproceedings{nakash-etal-2025-effective,
  title = {Effective Red-Teaming of Policy-Adherent Agents},
  author = {Nakash, Itay and Kour, George and Lazar, Koren and Vetzler, Matan and Uziel, Guy and Anaby Tavor, Ateret},
  editor = {Christodoulopoulos, Christos and Chakraborty, Tanmoy and Rose, Carolyn and Peng, Violet},
  booktitle = {Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
  month = nov,
  year = {2025},
  address = {Suzhou, China},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2025.emnlp-main.114/},
  doi = {10.18653/v1/2025.emnlp-main.114},
  pages = {2250--2268},
  isbn = {979-8-89176-332-6},
  eprint = {2506.09600},
  archiveprefix = {arXiv},
  primaryclass = {cs.MA}
}

Latest Posts

Dec 02, 2025	EMNLP 2025 Highlights