Advances in language agents that can follow instructions and use tools have renewed interest in autonomous agents and multi-agent systems. Like previous generations of agents, language agents are designed for specific tasks, highlighting the need for open networks of agents that complement each other's abilities to tackle more complex problems. New protocols are rapidly emerging to allow agents to discover and use tools, or to discover and interact with other agents. Some of these protocols build on Web standards to promote interoperability, but their alignments, misalignments, and overlaps are unclear. This report synthesizes the large body of research on autonomous agents and multi-agent systems (MAS) to define a conceptual model for understanding Web-based MAS. We use this conceptual model to classify existing technologies and frameworks, to identify relevant standards within the W3C, and to discover standardization gaps (if any).
The vision of intelligent agents on the Web is almost as old as the Web itself: in a keynote at WWW'94, Sir Tim Berners-Lee noted that documents on the Web describe real objects and the relationships among them, and that if the semantics of these objects are represented explicitly, then machines can browse through and manipulate reality. This vision was published in 2001 as the Semantic Web [Berners-Lee et al., 2001], and it is now closer to realization through the standardization of the Web of Things (WoT) at the W3C and the IETF.
In the AI community, the vision of a world-wide open network of intelligent agents can be traced back to the late '90s. In 2002, the AgentCities initiative reported a network of 41 agent platforms deployed in 21 countries [Willmott et al., 2002], with up to 60 registered platforms reported in 2003 [Dale et al., 2003] and 160 platforms in 2005 [Bellifemine et al., 2005]. The network was based on the standards produced by the Foundation for Intelligent Physical Agents (FIPA), but it quickly faded after the mid-2000s as industry attention shifted to Web services. Another prominent initiative was the DARPA Control of Agent-Based Systems (CoABS) research program [TODO], which investigated the control, coordination, and management of large systems of autonomous software agents in military applications. Central to this program was CoABS Grid, the middleware that integrated heterogeneous agent-based systems, object-based applications, and legacy systems, using remote method invocation as a client-server style for network-based interaction.
The DARPA CoABS program demonstrated the use of agent technology in large-scale practical applications, but also raised a number of challenges, such as enabling software agents to dynamically identify and understand information sources [TODO]. To address these, DARPA launched the Agent Markup Language (DAML) research program, which built on top of existing Web standards and paved the way for the Web Ontology Language (OWL), Semantic Markup for Web Services (OWL-S), and other cornerstones of the Semantic Web. The DAML program thus advanced the original vision of the Web as an information space not only for people but also for intelligent agents, and promoted a shift from custom-built middleware for MAS — such as CoABS Grid or FIPA implementations — to offloading many of those responsibilities to the existing Web infrastructure. Web-based MAS received significant attention over the years, especially with the advent of service-oriented computing in the early 2000s [Singh and Huhns, 2006].
Recent years have brought renewed interest in Web-based MAS, as evidenced by the Dagstuhl Seminar 21072 (Feb. 2021) and Dagstuhl Seminar 23081 (Feb. 2023) on "Agents on the Web", which led to the creation of the W3C Autonomous Agents on the Web (WebAgents) Community Group. One key development is the Web of Things (WoT) [TODO], which unlocks new practical use cases for agents on the Web and implements several visionary ideas expressed in the motivating scenarios from the original Semantic Web paper [Berners-Lee et al., 2001]. Another key development is the recent progress in language agents that can follow instructions and use tools: just like previous generations of agents, language agents are designed for specific tasks, highlighting the need for open networks of agents that complement each other's abilities to tackle more complex problems. New protocols and frameworks are rapidly emerging to allow agents to discover and use tools, or to discover and interact with other agents, and many of these initiatives build on Web standards to promote interoperability (e.g., see the Model Context Protocol, Agent2Agent Protocol, Agent Network Protocol, Eclipse LMOS).
Technology | Relevant Concepts | Agent Interaction | Tool Use | Identifiers | Descriptions | Discovery Mechanisms | Arch. Style |
---|---|---|---|---|---|---|---|
MCP | Tool, Resource, Prompt | N/A | Function calling | Strings (Tools and Prompts), URIs (Resources) | Tool definitions, Resource descriptions, Prompt definitions (JSON) | Directories (via */list) | Client-Server with streaming RPC connectors (JSON-RPC 2.0, HTTP+SSE) |
A2A | Agent Card, Task | Task invocation | N/A | Strings? | Agent Card, Task description (JSON) | Well-known URIs, Directories | Async. Client-Server with streaming RPC connectors and webhooks (JSON-RPC 2.0, HTTP+SSE) |
ANP | Agent, Agent Description, Communication Protocol | Communication protocols with protocol negotiation | N/A | W3C DID with custom Web-based Agent DID Method | Agent Description (RDF/JSON-LD) | Directories | Peer-to-Peer? (WebSocket subprotocol) |
LMOS | Agent, Agent Group, Tool, Agent Description, Tool Description | Message passing? (in principle: TD interaction affordances) | Property Affordances, Event Affordances, Action Affordances (W3C WoT TD) | Uniform identifiers (IRIs, W3C DIDs) | Agent Description, Tool Description (W3C WoT TD; JSON, RDF/JSON-LD) | DNS-SD/mDNS, Well-known URIs, Directories (W3C WoT Discovery) | W3C WoT Arch.? with protocol bindings for HTTP and WebSocket subprotocol |
FIPA | Agent, Agent Directory, Service Directory, Agent Communication Language, Interaction Protocol | FIPA Agent Communication Language, FIPA Agent Interaction Protocols | N/A | FIPA Agent Name | FIPA Agent Identifier Description | Directories | Service-oriented architecture |
OWL-S | Service, Service Profile, Process, Service Grounding | N/A | Service grounding | URIs | Service Profile, Process, Service Grounding | Service Registry | Service-oriented architecture |
hMAS | Agent, Artifact, Agent Body, Workspace, Signifier, Role, Group, Organization, Resource Profile | Message passing, Signifiers for agent body affordances | Signifiers (W3C WoT TD, hMAS ontology) | Uniform identifiers (IRIs, W3C DIDs) | Resource Profile (W3C WoT TD or hMAS ontology; RDF/Turtle) | Hypermedia crawling, Search engines, Directories | Async. Client-Server with REST connectors (HTTP) and brokered pub/sub (W3C WebSub) |
Multi-Agent Microservices (MAMS) | Agent, Agent Body, Resource, Microservices | FIPA ACL (over HTTP), REST, HTTP API, JMS | REST, HTTP API, JMS, W3C WoT TD | URIs (Agents, Agent Bodies, Resources) | Agent Bodies (JSON, JSON-LD (incl. W3C WoT Hypermedia Controls Ontology), HAL) | Service Registries (Netflix Eureka), Link Crawling, Link Sharing | Microservices Architecture, Event-Driven Architecture, REST |
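
To make the "Directories (via */list)" entry for MCP in the table above more concrete, the following is a minimal sketch of how a client might build a JSON-RPC 2.0 request for a `<kind>/list` method (such as `tools/list`) and read tool names from the result. The helper names `list_request` and `tool_names` and the sample response are hypothetical and chosen for illustration only; the transport (HTTP+SSE or otherwise) is deliberately left out.

```python
import itertools
import json
from typing import List, Optional

# Hypothetical helper: a process-wide counter for JSON-RPC request ids.
_request_ids = itertools.count(1)


def list_request(kind: str, cursor: Optional[str] = None) -> dict:
    """Build a JSON-RPC 2.0 request for an MCP-style `<kind>/list` method.

    `kind` is assumed to be one of "tools", "resources", or "prompts".
    """
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": f"{kind}/list",
    }
    if cursor is not None:
        # Pagination cursor returned by a previous list call, if any.
        request["params"] = {"cursor": cursor}
    return request


def tool_names(response: dict) -> List[str]:
    """Extract tool names from a `tools/list` response, or raise on error."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown error"))
    return [tool["name"] for tool in response["result"]["tools"]]


if __name__ == "__main__":
    # The response below is made up for illustration; it is not real server output.
    sample_response = {
        "jsonrpc": "2.0",
        "id": 1,
        "result": {"tools": [{"name": "get_weather", "description": "Example tool"}]},
    }
    print(json.dumps(list_request("tools")))
    print(tool_names(sample_response))
```

The sketch only covers the directory-style listing step; how a listed tool is then invoked depends on the tool definition returned by the server.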
The integration of agents with Web services is about as old as Web services themselves. In 1996, the Foundation for Intelligent Physical Agents (FIPA) was created to develop standards for MAS. These standards include FIPA-ACL, an agent communication language, as well as a specification of the abstract architecture of a MAS and its components, such as directory facilitators used to discover services. Although these standards were not all specific to Web-based MAS and could be used to develop local MAS applications, for example using JADE (Java Agent DEvelopment Framework), many were applied to the development of Web-based MAS. In particular, the AgentCities project relied on FIPA standards to develop an international Web-based MAS [[AGENTCITIES]]. A related effort was the development of the Semantic Web, which provides machine-readable information describing Web resources for Web-based agents [[SEMWEB]]. The Semantic Web led to the creation of ontologies for describing Web services so that they can be used by machines. While other standards have been developed to describe Web services, such as the Web Services Description Language (WSDL), Semantic Web ontologies provide richer semantics that enable software agents to use these services. OWL-S is one of the ontologies developed for that purpose [[OWLS]]: it can be used to describe what a service does, the processes associated with the service, and how to use the service.
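As an illustration of the FIPA-ACL messages mentioned above, the following sketch models a message with a few of its standard parameters (performative, sender, receiver, content, language, ontology) and renders it in the parenthesised string form used by FIPA's string encoding. The class name `AclMessage`, the method `to_string`, and the agent names are hypothetical; the rendering is simplified and omits many optional parameters (for example, protocol and conversation identifiers).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AclMessage:
    """Simplified, illustrative model of a FIPA-ACL message."""

    performative: str          # e.g. "inform", "request"
    sender: str                # agent name, e.g. "a@example"
    receiver: str              # single receiver, for simplicity
    content: str
    language: Optional[str] = None
    ontology: Optional[str] = None

    def to_string(self) -> str:
        """Render the message in a simplified FIPA-ACL string form."""
        # Escape double quotes so the content stays a single quoted string.
        escaped = self.content.replace('"', '\\"')
        parts = [
            f"({self.performative}",
            f":sender (agent-identifier :name {self.sender})",
            f":receiver (set (agent-identifier :name {self.receiver}))",
            f':content "{escaped}"',
        ]
        if self.language:
            parts.append(f":language {self.language}")
        if self.ontology:
            parts.append(f":ontology {self.ontology}")
        return " ".join(parts) + ")"


if __name__ == "__main__":
    # Agent names and content are made up for illustration.
    message = AclMessage(
        performative="inform",
        sender="a@example",
        receiver="b@example",
        content="weather(today, raining)",
        language="Prolog",
    )
    print(message.to_string())
```

The performative carries the intent of the message, which is what distinguishes an agent communication language from a plain request/response payload.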
Early approaches to Web-based MAS treated the Web mostly as a transport layer for messages among agents rather than as an application architecture for a MAS across its different dimensions, including the environment dimension [[DECADE]]. Newer approaches instead build on the REST architectural style. One of them is Multi-Agent Microservices (MAMS), which relies on the REST and microservice architectural styles to design Web-based MAS [[MAMS]]. In a MAMS system, agents and other entities present in a MAS, such as artifacts, are exposed as microservices on the Web and can therefore be interacted with as such. The Web of Things (WoT) provides a way to integrate Internet of Things (IoT) devices into the Web. A core standard of the WoT is the Thing Description (TD), which provides a hypermedia semantic description of a Thing using JSON-LD. Given the similarities between the TD model and the concept of artifacts, which represent environmental entities in the Agents & Artifacts (A&A) meta-model, TDs can be used to integrate A&A MAS environments into the Web of Things [[YGGDRASIL2018]]. This insight leads to a practical instantiation of hypermedia MAS: MAS that are integrated with the Web architecture and in which hypermedia principles are used to enable the discovery of entities in the MAS and of how to interact with them [[DECADE]]. Yggdrasil is a platform that enables the creation of hypermedia MAS by exposing artifacts as Thing Descriptions [[YGGDRASIL2018]]. More recently, the hMAS ontology can be used instead of the Thing Description Ontology to describe the components of a hypermedia MAS.
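To illustrate how a Thing Description can expose an artifact to agents, the sketch below defines a minimal TD as a Python dictionary and shows how an agent might enumerate its interaction affordances and resolve the target URL of one of them. The lamp, its identifier, its URLs, and the helper names `affordances` and `form_target` are hypothetical; security is set to `nosec` purely to keep the example short.

```python
from typing import List, Tuple
from urllib.parse import urljoin

# A minimal, made-up Thing Description for an example lamp artifact.
LAMP_TD = {
    "@context": "https://www.w3.org/2022/wot/td/v1.1",
    "id": "urn:example:lamp-1",
    "title": "ExampleLamp",
    "base": "http://lamp.example/",
    "securityDefinitions": {"nosec_sc": {"scheme": "nosec"}},
    "security": ["nosec_sc"],
    "properties": {
        "status": {"type": "string", "forms": [{"href": "properties/status"}]}
    },
    "actions": {
        "toggle": {"forms": [{"href": "actions/toggle"}]}
    },
}


def affordances(td: dict) -> List[Tuple[str, str]]:
    """List (kind, name) pairs for all interaction affordances in a TD."""
    found = []
    for kind in ("properties", "actions", "events"):
        for name in td.get(kind, {}):
            found.append((kind, name))
    return found


def form_target(td: dict, kind: str, name: str) -> str:
    """Resolve the URL of the first form of an affordance against the TD base."""
    form = td[kind][name]["forms"][0]
    return urljoin(td.get("base", ""), form["href"])


if __name__ == "__main__":
    for kind, name in affordances(LAMP_TD):
        print(kind, name, form_target(LAMP_TD, kind, name))
```

Because the agent reads the available interactions and their targets from the description at run time, it does not need hard-coded knowledge of the artifact's API, which is the hypermedia-driven behaviour discussed above.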
Modelling Dimensions for Engineering Multi-Agent Systems [Demazeau, 1995]