Applied LLMs & Agents

February 27, 2026
20 papers

A Framework for Assessing AI Agent Decisions and Outcomes in AutoML Pipelines

Gaoyuan Du, Amit Ahlawat, Xiaoyang Liu, Jing Wu

Source: arxiv Related

VeRO: An Evaluation Harness for Agents to Optimize Agents

Varun Ursekar, Apaar Shanker, Veronica Chatrath, Yuan Xue, Sam Denton

Source: arxiv Related

General Agent Evaluation

Elron Bandel, Asaf Yehudai, Lilach Eden, Yehoshua Sagron, Yotam Perlitz, Elad Venezian, Natalia Razinkov, Natan Ergas, Shlomit Shachor Ifergan, Segev Shlomov, Michal Jacovi, Leshem Choshen, Liat Ein-Dor, Yoav Katz, Michal Shmueli-Scheuer

Source: arxiv Related

RAGdb: A Zero-Dependency, Embeddable Architecture for Multimodal Retrieval-Augmented Generation on the Edge

Ahmed Bin Khalid

Source: arxiv Related

Comparative Analysis of Neural Retriever-Reranker Pipelines for Retrieval-Augmented Generation over Knowledge Graphs in E-commerce Applications

Teri Rumble, Zbyněk Gazdík, Javad Zarrin, Jagdeep Ahluwalia

Source: arxiv Related

DS SERVE: A Framework for Efficient and Scalable Neural Retrieval

Jinjian Liu, Yichuan Wang, Xinxi Lyu, Rulin Shao, Joseph E. Gonzalez, Matei Zaharia, Sewon Min

Source: arxiv Related

Cognitive Models and AI Algorithms Provide Templates for Designing Language Agents

Ryan Liu, Dilip Arumugam, Cedegao E. Zhang, Sean Escola, Xaq Pitkow, Thomas L. Griffiths

Source: arxiv Related

Toward Personalized LLM-Powered Agents: Foundations, Evaluation, and Future Directions

Yue Xu, Qian Chen, Zizhan Ma, Dongrui Liu, Wenxuan Wang, Xiting Wang, Li Xiong, Wenjie Wang

Source: arxiv Related

SmartChunk Retrieval: Query-Aware Chunk Compression with Planning for Efficient Document RAG

Xuechen Zhang, Koustava Goswami, Samet Oymak, Jiasi Chen, Nedim Lipka

Source: arxiv Related

Requesting Expert Reasoning: Augmenting LLM Agents with Learned Collaborative Intervention

Zhiming Wang, Jinwei He, Feng Lu

Source: arxiv Related

Agent Behavioral Contracts: Formal Specification and Runtime Enforcement for Reliable Autonomous AI Agents

Varun Pratap Bhardwaj

Source: arxiv Related

AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications

Yujie Zhao, Boqin Yuan, Junbo Huang, Haocheng Yuan, Zhongming Yu, Haozhou Xu, Lanxiang Hu, Abhilash Shankarampeta, Zimeng Huang, Wentao Ni, Yuandong Tian, Jishen Zhao

Source: arxiv Related

Silent Egress: When Implicit Prompt Injection Makes LLM Agents Leak Without a Trace

Qianlong Lan, Anuj Kaul, Shaun Jones, Stephanie Westrum

Source: arxiv Related

MiroFlow: Towards High-Performance and Robust Open-Source Agent Framework for General Deep Research Tasks

Shiqian Su, Sen Xing, Xuan Dong, Muyan Zhong, Bin Wang, Xizhou Zhu, Yuntao Chen, Wenhai Wang, Yue Deng, Pengxiang Zhu, Ziyuan Liu, Tiantong Li, Jiaheng Yu, Zhe Chen, Lidong Bing, Jifeng Dai

Source: arxiv Related

HubScan: Detecting Hubness Poisoning in Retrieval-Augmented Generation Systems

Idan Habler, Vineeth Sai Narajala, Stav Koren, Amy Chang, Tiffany Saade

Source: arxiv Related

Graph Your Way to Inspiration: Integrating Co-Author Graphs with Retrieval-Augmented Generation for Large Language Model Based Scientific Idea Generation

Pengzhen Xie, Huizhi Liang

Source: arxiv Must Read

Agentic AI for Intent-driven Optimization in Cell-free O-RAN

Mohammad Hossein Shokouhi, Vincent W. S. Wong

Source: arxiv Must Read

TCM-DiffRAG: Personalized Syndrome Differentiation Reasoning Method for Traditional Chinese Medicine based on Knowledge Graph and Chain of Thought

Jianmin Li, Ying Chang, Su-Kit Tang, Yujia Liu, Yanwen Wang, Shuyuan Lin, Binkai Ou

Source: arxiv Must Read

Automating the Detection of Requirement Dependencies Using Large Language Models

Ikram Darif, Feifei Niu, Manel Abdellatif, Lionel C. Briand, Ramesh S., Arun Adiththan

Source: arxiv Must Read

Utilizing LLMs for Industrial Process Automation

Salim Fares

Source: arxiv Must Read