2026
arXiv
Efficient Agent Evaluation via Diversity-Guided User Simulation
Itay Nakash, George Kour, and Ateret Anaby-Tavor
arXiv preprint, Apr 2026
DIVERT is a snapshot-based, coverage-guided user simulation framework for evaluating interactive LLM agents. It branches from critical mid-trajectory states with diverse user responses, reusing shared conversation prefixes to reduce evaluation cost while uncovering more failures.
@misc{nakash2026efficientagentevaluation,title={Efficient Agent Evaluation via Diversity-Guided User Simulation},author={Nakash, Itay and Kour, George and Anaby-Tavor, Ateret},year={2026},date={2026-04-23},eprint={2604.21480},archiveprefix={arXiv},primaryclass={cs.AI},url={https://arxiv.org/abs/2604.21480},}
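The snapshot-and-branch idea described above can be illustrated with a minimal sketch. This is not the paper's implementation; the `Snapshot` structure and `branch` function are hypothetical names chosen for illustration, showing only how a shared conversation prefix can be stored once and extended per branch:

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """A saved mid-trajectory conversation state (hypothetical structure)."""
    messages: list


def branch(snapshot, user_responses):
    """Fork one snapshot into several branches, one per diverse user
    response, reusing the shared prefix instead of replaying it."""
    return [Snapshot(messages=snapshot.messages + [r]) for r in user_responses]


# Toy example: two-message prefix, three divergent user turns.
prefix = Snapshot(messages=["user: book a flight", "agent: where to?"])
branches = branch(prefix, ["user: Paris",
                           "user: cancel everything",
                           "user: I already paid"])
# All three branches share the two-message prefix; only the last turn differs.
```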
2025
NAACL 2025
Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In
Itay Nakash, George Kour, Guy Uziel, and 1 more author
In Findings of the Association for Computational Linguistics: NAACL 2025, Apr 2025
Foot-in-the-door attacks exploit ReAct agents by first planting a harmless-looking action in the agent’s reasoning, which makes later malicious tool use more likely. The paper shows that indirect prompt injections can leverage this effect across models and that reflection-style checks during execution can reduce, but not eliminate, the risk.
@inproceedings{nakash-etal-2025-breaking,title={Breaking {R}e{A}ct Agents: Foot-in-the-Door Attack Will Get You In},author={Nakash, Itay and Kour, George and Uziel, Guy and Anaby Tavor, Ateret},editor={Chiruzzo, Luis and Ritter, Alan and Wang, Lu},booktitle={Findings of the Association for Computational Linguistics: NAACL 2025},month=apr,year={2025},address={Albuquerque, New Mexico},publisher={Association for Computational Linguistics},url={https://aclanthology.org/2025.findings-naacl.363/},doi={10.18653/v1/2025.findings-naacl.363},pages={6499--6524},isbn={979-8-89176-195-7},eprint={2410.16950},archiveprefix={arXiv},primaryclass={cs.CR},}
COLM 2025
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation
Itay Nakash, Nitay Calderon, Eyal Ben David, and 2 more authors
In Second Conference on Language Modeling, 2025
AdaptiVocab improves domain-specific LLM efficiency by replacing underused vocabulary items with domain n-gram tokens and initializing their embeddings from existing tokens. Across niche domains and two 7B models, it cuts input and output token usage by more than 25 percent while preserving generation quality and downstream performance.
@inproceedings{nakash2025adaptivocab,title={AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation},author={Nakash, Itay and Calderon, Nitay and Ben David, Eyal and Hoffer, Elad and Reichart, Roi},booktitle={Second Conference on Language Modeling},year={2025},url={https://openreview.net/forum?id=TyXf9dwpZP},eprint={2503.19693},archiveprefix={arXiv},primaryclass={cs.CL}}
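The embedding-initialization idea above can be sketched in a few lines. This is a hypothetical mean-of-constituents scheme for illustration (the paper's exact initialization may differ), with made-up token names and a toy embedding table:

```python
import numpy as np


def init_ngram_embedding(ngram_tokens, embedding_table, token_to_id):
    """Initialize a new domain n-gram token's embedding as the mean of the
    embeddings of the existing tokens it merges (illustrative scheme only)."""
    ids = [token_to_id[t] for t in ngram_tokens]
    return embedding_table[ids].mean(axis=0)


# Toy example: a 4-token vocabulary with 3-dimensional embeddings.
token_to_id = {"quantum": 0, "error": 1, "correction": 2, "the": 3}
emb = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0],
                [0.5, 0.5, 0.5]])

# The new token "quantum error correction" starts at the mean of its parts,
# so the model sees a familiar point in embedding space rather than noise.
new_vec = init_ngram_embedding(["quantum", "error", "correction"], emb, token_to_id)
```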
ACL 2025
Think Again! The Effect of Test-Time Compute on Preferences, Opinions, and Beliefs of Large Language Models
George Kour, Itay Nakash, Michal Shmueli-Scheuer, and 1 more author
In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), Jul 2025
POBs is a survey-style benchmark for measuring LLM preferences, opinions, and beliefs across societal, cultural, ethical, and personal topics. The paper finds that reasoning and self-reflection at test time provide only limited improvements in reliability, neutrality, and consistency, while newer model versions can become less consistent and more biased toward particular viewpoints.
@inproceedings{kour-etal-2025-think,title={Think Again! The Effect of Test-Time Compute on Preferences, Opinions, and Beliefs of Large Language Models},author={Kour, George and Nakash, Itay and Shmueli-Scheuer, Michal and Anaby Tavor, Ateret},editor={Rehm, Georg and Li, Yunyao},booktitle={Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)},month=jul,year={2025},address={Vienna, Austria},publisher={Association for Computational Linguistics},url={https://aclanthology.org/2025.acl-industry.45/},doi={10.18653/v1/2025.acl-industry.45},pages={639--660},isbn={979-8-89176-288-6},eprint={2505.19621},archiveprefix={arXiv},primaryclass={cs.AI},}
EMNLP 2025
Effective Red-Teaming of Policy-Adherent Agents
Itay Nakash, George Kour, Koren Lazar, and 3 more authors
In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Nov 2025
Policy-adherent customer-service agents can be manipulated by adversarial users who try to extract benefits that violate business rules. This paper introduces CRAFT, a policy-aware multi-agent red-teaming system, and tau-break, a tau-bench-based benchmark showing that current defenses help but do not fully protect agents from persuasive policy-breaking attacks.
@inproceedings{nakash-etal-2025-effective,title={Effective Red-Teaming of Policy-Adherent Agents},author={Nakash, Itay and Kour, George and Lazar, Koren and Vetzler, Matan and Uziel, Guy and Anaby Tavor, Ateret},editor={Christodoulopoulos, Christos and Chakraborty, Tanmoy and Rose, Carolyn and Peng, Violet},booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},month=nov,year={2025},address={Suzhou, China},publisher={Association for Computational Linguistics},url={https://aclanthology.org/2025.emnlp-main.114/},doi={10.18653/v1/2025.emnlp-main.114},pages={2250--2268},isbn={979-8-89176-332-6},eprint={2506.09600},archiveprefix={arXiv},primaryclass={cs.MA}}