WebAgents Community Group Report on Interoperability for Agents on the Web

Advances in language agents that can follow instructions and use tools have renewed interest in autonomous agents and multi-agent systems. Like previous generations of agents, language agents are designed for specific tasks, highlighting the need for open networks of agents that complement each other's abilities to tackle more complex problems. New protocols are rapidly emerging to allow agents to discover and use tools, or to discover and interact with other agents. Some of these protocols build on Web standards to promote interoperability, but their alignments, misalignments, and overlaps are unclear. This report synthesizes the large body of research on autonomous agents and multi-agent systems (MAS) to define a conceptual model for understanding Web-based MAS. We use this conceptual model to classify existing technologies and frameworks, to identify relevant standards within the W3C, and to discover standardization gaps (if any).

Agents on the Web

Visions of Agents on the Web

The vision of intelligent agents on the Web is almost as old as the Web itself: in a keynote at WWW'94, Sir Tim Berners-Lee noted that documents on the Web describe real objects and the relationships among them, and that if the semantics of these objects are represented explicitly, machines can browse through and manipulate reality. This vision was published in 2001 as the Semantic Web [Berners-Lee et al., 2001], and it is now closer to realization through the standardization of the Web of Things (WoT) at the W3C and the IETF.

In the AI community, the vision of a world-wide open network of intelligent agents can be traced back to the late '90s. In 2002, the AgentCities initiative reported a network of 41 agent platforms deployed in 21 countries [Willmott et al., 2002], with up to 60 registered platforms reported in 2003 [Dale et al., 2003] and 160 platforms in 2005 [Bellifemine et al., 2005]. The network was based on the standards produced by the Foundation for Intelligent Physical Agents (FIPA), but it quickly faded after the mid-2000s as industry attention shifted to Web services. Another prominent initiative was the DARPA Control of Agent-Based Systems (CoABS) research program [TODO], which investigated the control, coordination, and management of large systems of autonomous software agents in military applications. Central to this program was CoABS Grid, the middleware that integrated heterogeneous agent-based systems, object-based applications, and legacy systems, relying on remote method invocation as the client-server style for network-based interaction.

The DARPA CoABS program demonstrated the use of agent technology in large-scale practical applications, but also raised a number of challenges, such as enabling software agents to dynamically identify and understand information sources [TODO]. To address these challenges, DARPA launched the DARPA Agent Markup Language (DAML) research program, which built on top of existing Web standards and paved the way for the Web Ontology Language (OWL), Semantic Markup for Web Services (OWL-S), and other cornerstones of the Semantic Web. The DAML program thus advanced the original vision of the Web as an information space not only for people but also for intelligent agents. It also promoted a shift away from custom-built middleware for MAS, such as CoABS Grid or FIPA implementations, towards offloading many of the middleware's responsibilities to the existing Web infrastructure. Web-based MAS received significant attention over the years, especially with the advent of service-oriented computing in the early 2000s [Singh and Huhns, 2006].

Recent years have brought renewed interest in Web-based MAS, as evidenced by Dagstuhl Seminar 21072 (Feb. 2021) and Dagstuhl Seminar 23081 (Feb. 2023) on "Agents on the Web", which led to the creation of the W3C Autonomous Agents on the Web (WebAgents) Community Group. One key development is the Web of Things (WoT) [TODO], which unlocks new practical use cases for agents on the Web and implements several visionary ideas expressed in the motivating scenarios of the original Semantic Web paper [Berners-Lee et al., 2001]. Another key development is the recent progress in language agents that can follow instructions and use tools: just like previous generations of agents, language agents are designed for specific tasks, highlighting the need for open networks of agents that complement each other's abilities to tackle more complex problems. New protocols and frameworks are rapidly emerging to allow agents to discover and use tools, or to discover and interact with other agents, and many of these initiatives build on Web standards to promote interoperability (e.g., see the Model Context Protocol, Agent2Agent Protocol, Agent Network Protocol, Eclipse LMOS).

State of Web-based Multi-Agent Systems

	Relevant Concepts	Agent Interaction	Tool Use	Identifiers	Descriptions	Discovery Mechanisms	Arch. Style
MCP	Tool, Resource, Prompt	N/A	Function calling	Strings (Tools and Prompts), URIs (Resources)	Tool definitions, Resource descriptions, Prompt definitions (JSON)	Directories (via */list)	Client-Server with streaming RPC connectors (JSON-RPC 2.0, HTTP+SSE)
A2A	Agent Card, Task	Task invocation	N/A	Strings?	Agent Card, Task description (JSON)	Well-known URIs, Directories	Async. Client-Server with streaming RPC connectors and webhooks (JSON-RPC 2.0, HTTP+SSE)
ANP	Agent, Agent Description, Communication Protocol	Communication protocols with protocol negotiation	N/A	W3C DID with custom Web-based Agent DID Method	Agent Description (RDF/JSON-LD)	Directories	Peer-to-Peer? (WebSocket subprotocol)
LMOS	Agent, Agent Group, Tool, Agent Description, Tool Description	Message passing? (in principle: TD interaction affordances)	Property Affordances, Event Affordances, Action Affordances (W3C WoT TD)	Uniform identifiers (IRIs, W3C DIDs)	Agent Description, Tool Description (W3C WoT TD; JSON, RDF/JSON-LD)	DNS-SD/mDNS, Well-known URIs, Directories (W3C WoT Discovery)	W3C WoT Arch.? with protocol bindings for HTTP and WebSocket subprotocol
FIPA	Agent, Agent Directory, Service Directory, Agent Communication Language, Interaction Protocol	FIPA Agent Communication Language, FIPA Agent Interaction Protocols	N/A	FIPA Agent Name	FIPA Agent Identifier Description	Directories	Service-oriented architecture
OWL-S	Service, Service Profile, Process, Service Grounding	N/A	Service grounding	URIs	Service Profile, Process, Service Grounding	Service Registry	Service-oriented architecture
hMAS	Agent, Artifact, Agent Body, Workspace, Signifier, Role, Group, Organization, Resource Profile	Message passing, Signifiers for agent body affordances	Signifiers (W3C WoT TD, hMAS ontology)	Uniform identifiers (IRIs, W3C DIDs)	Resource Profile (W3C WoT TD or hMAS ontology; RDF/Turtle)	Hypermedia crawling, Search engines, Directories	Async. Client-Server with REST connectors (HTTP) and brokered pub/sub (W3C WebSub)
Multi-Agent Microservices (MAMS)	Agent, Agent Body, Resource, Microservices	FIPA ACL (over HTTP), REST, HTTP API, JMS	REST, HTTP API, JMS, W3C WoT TD	URIs (Agents, Agent Bodies, Resources)	Agent Bodies (JSON, JSON-LD (incl. W3C WoT Hypermedia Controls Ontology), HAL)	Service Registries (Netflix Eureka), Link Crawling, Link Sharing	Microservices Architecture, Event Driven Architecture, REST
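
As an illustration of the "Directories (via */list)" entry in the MCP row, the following minimal Python sketch builds the JSON-RPC 2.0 request a client could send to enumerate an MCP server's tools and reads the tool names from a response. The helper names (`make_list_request`, `tool_names`) and the sample response are hypothetical and hand-written for this example; they are not taken from a real server or SDK, and the transport (for example HTTP+SSE) is left out.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids; any unique value would do

def make_list_request(kind: str) -> dict:
    """Build a JSON-RPC 2.0 request for an MCP `*/list` method.

    `kind` is one of "tools", "resources", or "prompts".
    """
    if kind not in {"tools", "resources", "prompts"}:
        raise ValueError(f"unknown MCP listing: {kind}")
    return {"jsonrpc": "2.0", "id": next(_ids), "method": f"{kind}/list"}

def tool_names(response: dict) -> list[str]:
    """Extract tool names from a `tools/list` result (assumed shape)."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "JSON-RPC error"))
    return [tool["name"] for tool in response["result"]["tools"]]

# Hypothetical response, hand-written for illustration only.
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "get_weather",
                "description": "Illustrative tool entry",
                "inputSchema": {"type": "object"},
            }
        ]
    },
}

request = make_list_request("tools")
print(json.dumps(request))
print(tool_names(sample_response))
```

The same request shape would apply to `resources/list` and `prompts/list`; only the method string changes, which is why the table summarizes MCP discovery as a single directory-style mechanism.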

Agents and Web Services

The integration of agents with Web services is about as old as Web services themselves. In 1996, the Foundation for Intelligent Physical Agents (FIPA) was created to develop standards for MAS. These standards include FIPA-ACL, an agent communication language, as well as a specification of the abstract architecture of a MAS and its components, such as directory facilitators for discovering services. These standards were not all specific to Web-based MAS and could also be used to develop local MAS applications, for example with JADE (Java Agent DEvelopment Framework), but many were applied to the development of Web-based MAS. In particular, the AgentCities project relied on FIPA standards to develop an international Web-based MAS [[AGENTCITIES]]. A related effort was the Semantic Web, which aims to provide machine-readable information describing Web resources for Web-based agents [[SEMWEB]]. The Semantic Web led to the creation of ontologies that describe Web services so that they can be used by machines. Other standards, such as the Web Services Description Language (WSDL), have also been developed to describe Web services, but Semantic Web representations can provide richer semantics through dedicated ontologies and thereby enable software agents to use these services. The OWL-S ontology is one of the ontologies developed for that purpose [[OWLS]]. OWL-S can be used to describe what a service does, the processes associated with the service, and how to use the service.

Early approaches to Web-based MAS considered the Web mostly as a transport layer for messages among agents rather than as an application architecture for a MAS across its different dimensions, including the environment dimension [[DECADE]]. New approaches to developing Web-based MAS that rely on the REST architectural style have since been created. One of them is Multi-Agent Microservices (MAMS), an approach that relies on the REST and microservice architectural styles to design Web-based Multi-Agent Systems [[MAMS]]. In a MAMS system, agents and other entities present in a MAS, such as artifacts, are exposed as microservices on the Web and can therefore be interacted with as such. The Web of Things (WoT) provides a way to integrate Internet of Things (IoT) devices into the Web. A core standard of the WoT is the Thing Description (TD), which provides a hypermedia semantic description of a Thing using JSON-LD. Given the similarities between the TD model and the concept of artifacts, which represent environmental entities in the Agents & Artifacts (A&A) meta-model, TDs can be used to integrate A&A MAS environments into the Web of Things [[YGGDRASIL2018]]. This insight leads to the practical instantiation of hypermedia MAS, which are MAS that are integrated within the Web architecture and in which hypermedia principles [[DECADE]] are used to enable the discovery of entities in the MAS and of how to interact with them. Yggdrasil is a platform that enables the creation of hypermedia MAS by exposing artifacts as Thing Descriptions [[YGGDRASIL2018]]. More recently, the HMAS Ontology can be used instead of the Thing Description Ontology to describe the components of a hypermedia MAS.
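
To give a sense of what an agent finds in a Thing Description, the following minimal Python sketch parses a small TD and lists its interaction affordances (properties, actions, events) together with the target URL of their first form. The lamp TD, its URLs, and the helper `list_affordances` are hypothetical and hand-written for illustration; the sketch treats the TD as plain JSON and does not perform JSON-LD processing or validation.

```python
import json

# Hand-written Thing Description for a hypothetical lamp; URLs are placeholders.
TD_JSON = """
{
  "@context": "https://www.w3.org/2019/wot/td/v1",
  "title": "ExampleLamp",
  "securityDefinitions": {"nosec_sc": {"scheme": "nosec"}},
  "security": ["nosec_sc"],
  "properties": {
    "status": {"type": "string", "forms": [{"href": "https://lamp.example.com/status"}]}
  },
  "actions": {
    "toggle": {"forms": [{"href": "https://lamp.example.com/toggle"}]}
  },
  "events": {
    "overheating": {"forms": [{"href": "https://lamp.example.com/oh"}]}
  }
}
"""

def list_affordances(td: dict) -> list[tuple[str, str, str]]:
    """Return (kind, name, first form href) for each interaction affordance."""
    found = []
    for kind in ("properties", "actions", "events"):
        for name, affordance in td.get(kind, {}).items():
            forms = affordance.get("forms", [])
            href = forms[0]["href"] if forms else ""
            found.append((kind, name, href))
    return found

td = json.loads(TD_JSON)
for kind, name, href in list_affordances(td):
    print(f"{kind}: {name} -> {href}")
```

In a hypermedia MAS, an agent would obtain such a description by following links rather than from a hard-coded string, and could then decide at run time which affordance to use.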

Agents and the Decentralized Social Web

Agentic AI

Conceptual Overview and Modeling Dimensions

Modeling Dimensions for Multi-Agent Systems

Modelling Dimensions for Engineering Multi-Agent Systems [Demazeau, 1995]

Introduction

Terminology

Agents on the Web

Visions of Agents on the Web

State of Web-based Multi-Agent Systems

Agents and Web Services

Agents and the Decentralized Social Web

Agentic AI

Conceptual Overview and Modeling Dimensions

Architectural Considerations

Identification

Relevant Standards and Initiatives

Agent Identification

Tool Identification

Discussion

Profiles

Relevant Standards and Initiatives

Agent Profiles

Tool Profiles

Discussion

Verifiable Credentials

Relevant Standards

Discussion

Discovery

Relevant Standards and Initiatives

Agent Discovery

Tool Discovery

Discussion

Agent-to-Agent Interaction

Relevant Standards and Initiatives

Agents and People

Discussion

Agent-Environment Interaction

Relevant Standards and Initiatives

Tool Use

Discussion

Norms, Policies, and Organizations

Relevant Standards and Initiatives

Discussion

Security and Privacy

Relevant Standards

Authentication and Authorization

Discussion

Conclusions: A Strategy for Agents on the Web

Acknowledgements