To Nha Notes | Feb. 18, 2025, 11:58 a.m.
The term agent has been used in many different engineering contexts, including but not limited to a software agent, intelligent agent, user agent, conversational agent, and reinforcement learning agent. So, what exactly is an agent?
An agent is anything that can perceive its environment and act upon that environment. This means that an agent is characterized by the environment it operates in and the set of actions it can perform.
An AI agent is meant to accomplish tasks typically provided by the users in the inputs. In an AI agent, AI is the brain that processes the information it receives, including the task and feedback from the environment, plans a sequence of actions to achieve this task, and determines whether the task has been accomplished.
The set of actions an AI agent can perform is augmented by the tools it has access to. Many generative AI-powered applications you interact with daily are agents with access to tools, albeit simple ones. ChatGPT is an agent. It can search the web, execute Python code, and generate images. RAG systems are agents, and text retrievers, image retrievers, and SQL executors are their tools.
For example, the SWE-agent (Yang et al., 2024) in below figure is a coding agent whose environment is the computer and whose actions include navigation, search, and editing. Adapted from an original image licensed under CC BY 4.0.




Referenced in the AI engineering ebook by the author Huyen Chip
What Are AI Agents? A Short Intro And A Step-by-Step Guide to Build Your Own.