
Draft Protocol Spec

Below is the draft Google Wave Federation Protocol. The canonical copy is maintained in the Mercurial repository hosted at http://code.google.com/p/wave-protocol/. The intellectual property related to this protocol is licensed under a liberal Patent License, and the specification is made available under the Creative Commons Attribution 3.0 License. If you'd like to contribute to the specification, please review the community principles.

Google Wave Federation Protocol Over XMPP

Anthony Baxter, Jochen Bekmann, Daniel Berlin, Soren Lassen, Sam Thorogood

July 21, 2009


Document Status

This document represents work in progress. It omits details that we are unable to capture at this point and we expect parts of the protocol to change. Please also note that while we revise the protocol and white papers, some transient inconsistencies will occur.



Table of Contents

1.  Introduction
    1.1.  Overview
    1.2.  Terminology
2.  Architectural Overview
    2.1.  Wave Providers
    2.2.  Wavelets
    2.3.  Documents
    2.4.  Operations
    2.5.  Wave Service Architecture
        2.5.1.  Federation Host and Federation Remote
3.  Protocol Specification
    3.1.  Connection Initiation and Lifetime
    3.2.  Cryptographic Certificates and Signatures
    3.3.  Stanzas
        3.3.1.  Commonly used attributes
        3.3.2.  Commonly used elements
        3.3.3.  Update Stanzas
        3.3.4.  Service Stanzas
4.  Wavelet Update
5.  Get Signer
6.  Post Signer
7.  Documents
    7.1.  Document operations
    7.2.  Document operation components
8.  References
Appendix A.  Protocol Schema
Appendix B.  Protocol Buffers
§  Authors' Addresses





1.  Introduction




1.1.  Overview

The Google Wave Federation Protocol Over XMPP is an open extension to the XMPP core (Saint-Andre, P., Ed., “Extensible Messaging and Presence Protocol (XMPP): Core,” October 2004.) [RFC3920] protocol allowing near real-time communication of wave updates between two wave servers.




1.2.  Terminology

The capitalized key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14, RFC 2119 (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.) [TERMS].




2.  Architectural Overview

Google Wave is a communication and collaboration platform based on hosted conversations, called waves. A wave consists of XML documents and supports concurrent modifications and low-latency updates between participants on the wave.




2.1.  Wave Providers

The wave federation protocol enables everyone to become a wave provider and share waves with others. For instance, an organization can operate as a wave provider for its members, an individual can run a wave server as a wave provider for a single user or family members, and an Internet service provider can run a wave service as another Internet service for its users as a supplement to email, IM, ftp, etc.

A wave provider is identified by its Internet domain name(s).

Wave users have wave addresses which consist of a user name and a wave provider domain in the same form as an email address, namely <username>@<domain>. Wave addresses can also refer to groups, robots, gateways, and other services. A group address refers to a collection of wave addresses, much like an email mailing list. A robot can be a translation robot or a chess game robot. A gateway translates between waves and other communication and sharing protocols such as email and IM. In the remainder we ignore addressees that are services, including robots and gateways - they are treated largely the same as users with respect to federation.
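As a minimal sketch of the address form described above, the following Python helper splits a wave address into its user name and wave provider domain. The names `WaveAddress` and `parse_wave_address` are hypothetical and are not part of the protocol; the sketch performs no validation beyond checking that both parts are present.

```python
from typing import NamedTuple


class WaveAddress(NamedTuple):
    username: str
    domain: str  # identifies the wave provider


def parse_wave_address(address: str) -> WaveAddress:
    """Split '<username>@<domain>' into its two parts."""
    # Split on the last '@' so that the domain is everything after it.
    username, sep, domain = address.rpartition("@")
    if not sep or not username or not domain:
        raise ValueError(f"not a wave address: {address!r}")
    return WaveAddress(username, domain)
```

For example, `parse_wave_address("alice@acmewave.com").domain` yields the provider domain that a server would use to decide where the participant's copy of a wavelet lives.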

Wave users access all waves through their wave provider. If a wave has participants from different wave providers, their wave providers all maintain a copy of the wave and serve it to their users on the wave. The wave providers share updates to the wave with each other using the wave federation protocol which we describe below. For any given wave user, it is the responsibility of the wave provider for the user's domain to authenticate the user (using cookies and passwords, etc) and perform local access control.




2.2.  Wavelets

A wave consists of a set of wavelets. When a user has access to a wavelet, that user is called a participant of that wavelet. Each wavelet has a list of participants, and a set of documents that make up its contents. Different wavelets of a wave can have different lists of participants. Copies of a wavelet are shared across all of the wave providers that have at least one participant in that wavelet. Amongst these wave providers, there is a designated wave provider that has the definitive copy of that wavelet. We say that this particular provider is hosting that wavelet.

When a user opens a wave, a view of the wave is retrieved, namely the set of wavelets in the wave that the user is a participant of (directly, or indirectly via group membership). In general, different users have different wave views for a given wave. For example, per-user data for a user in a wave, such as the user's read/unread state for the wave, is stored in a user-data wavelet in the wave with the user as the only participant. The user-data wavelet only appears in this user's wave view. Another example is a private reply within a wave, which is represented as a wavelet with a restricted participant list. The private reply wavelet is only in the wave views of the restricted list of users.

A wave is identified by a globally unique wave id, which is a pair of a domain name and an id string. The domain names the wave provider where the wave originated.

A wavelet has a wavelet id which is unique within its wave. Like a wave id, a wavelet id is a pair of a domain name and an id string. The domain name in the wavelet id plays a special role: it names the wave provider that hosts the wavelet. A wavelet is hosted by the wave provider of the participant who creates the wavelet. The wave provider that hosts a wavelet is responsible both for operational transformation and application of wavelet operations to the wavelet and for sharing the applied operations with the wave providers of all the wavelet participants.

Wavelets in the same wave can be hosted by different wave providers. For example, a user-data wavelet is always hosted by the user's wave provider, regardless of where the rest of the wave is hosted. Indeed, user-data is not federated, i.e., not shared with other wave providers. Another example is a private reply wavelet. A particularly simple instance of this is when all the participants of the private reply are from the same wave provider. Then this wave provider will not share the private reply wavelet with other wave providers, regardless of where the other wavelets in the wave are hosted.




2.3.  Documents

Each wavelet is a container for any number of documents. Each document has an id that is unique within its containing wavelet. It is composed of an XML document and a set of annotations. Annotations are key-value pairs that span arbitrary ranges of the XML document and are independent of the XML document structure. They are used to represent text formatting, spelling suggestions and hyper-links.

Some documents represent rich-text messages in the wavelet. These are known as "blips". The blips in a wave form a threaded conversation. Others represent data, for example tags, and are not displayed to the user as part of the threaded conversation structure in the wave. For detailed information on the structure of documents, please refer to the Google Wave Operational Transformation white paper.




2.4.  Operations

Operations are mutations on wavelets. The state of a wavelet is entirely defined by a sequence of operations on that wavelet.

Clients and servers exchange operations in order to communicate modifications to a wavelet. Operations propagate through the system to all clients and servers interested in that wavelet. They each apply the operation to their own copy of the wavelet. The use of operational transformation (OT) guarantees all copies of the wavelet will eventually converge to the same state. In order for the guarantees made by OT to hold, all communication participants must use the same operational transformation and composition algorithms (i.e. all OT implementations must be functionally equivalent).




2.5.  Wave Service Architecture

A wave provider operates a wave service on one or more networked servers. The central pieces of the wave service are the wave store, which stores wavelet operations, and the wave server, which resolves wavelet operations by operational transformation and writes and reads wavelet operations to and from the wave store. Typically, the wave service serves waves to users of the wave provider who connect to the wave service frontend (see Google Wave Data Model and Client-Server Protocol), and we shall assume this in the following description of the wave service architecture. More importantly, for the purpose of federation, the wave service shares waves with participants from other providers by communicating with those wave providers' servers.

For a given wave provider, its wave server serves wave views to local participants, i.e., participants from its domain. This wave server stores the state of all wavelets that it knows about. Some are hosted by the wave server itself. These are "local wavelets" relative to this wave server. Others are copies of wavelets hosted by other wave providers. These are "remote". A wave view can contain both types of wavelet simultaneously.

At a particular wave provider, local wavelets are those created at that provider, namely by users who belong to the wave provider. The wave server is responsible for processing the wavelet operations submitted to the wavelet by local participants and by remote participants from other wave providers. The wave server performs concurrency control by ordering the submitted wavelet operations relative to each other using operational transformation. It also validates the operations before applying them to a local wavelet.

Remote wavelets are hosted by other wave providers. The wave server maintains cached copies locally and updates them with wavelet operations that it gets from the hosting wave providers. When a local participant submits a wavelet operation to a remote wavelet, the wave server forwards the operation to the wave server of the hosting provider. When the transformed and applied operation is echoed back, it is applied to the cached copy. Read access to local participants is done from the cached copy without a round trip to the hosting wave provider.

Local and remote wavelets are all stored in the wave server's persistent wave store.

We say that a wave provider is "upstream" relative to its local wavelets and that it is "downstream" relative to its remote wavelets.




2.5.1.  Federation Host and Federation Remote

The wave service uses two components for peering with other wave providers: a "federation host" and a "federation remote". (In an earlier revision of this draft specification these components were called "federation gateway" and "federation proxy", respectively.)

The federation host communicates local wavelet operations, i.e., operations on local wavelets:

  • It pushes new wavelet operations that are applied to a local wavelet to the wave providers of any remote participants.
  • It satisfies requests for old wavelet operations.
  • It processes wavelet operations submission requests.

The federation remote communicates remote wavelet operations and is the component of a wave provider that communicates with the federation host of upstream wave providers:

  • It receives new wavelet operations pushed to it from the wave providers that host the wavelets.
  • It requests old wavelet operations from the hosting wave providers.
  • It submits wavelet operations to the hosting wave providers.

An upstream wave provider's federation host connects to a downstream wave provider's federation remote to push wavelet operations that are hosted by the upstream wave provider.

The federation protocol has the following mechanisms to make operation delivery from host to remote reliable. The federation host maintains (in persistent storage) a queue of outgoing operations for each remote domain. Operations are queued until their receipt is acknowledged by the receiving federation remote. The federation host will continually attempt to establish a connection and reconnect after any connection failures (retrying with an exponential backoff). When a connection is established, the federation host will send queued operations. The receiving federation remote sends acknowledgements back to the sending federation host and whenever an acknowledgement is received, the sender dequeues the acknowledged operations.
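The delivery mechanism described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the reference implementation: the class name `OutgoingQueue`, the sequence numbers, the cumulative acknowledgement, and the delay parameters are all hypothetical, and the queue is held in memory here although the specification calls for persistent storage.

```python
from collections import deque


class OutgoingQueue:
    """Per-remote-domain queue of operations awaiting acknowledgement."""

    def __init__(self, base_delay=1.0, max_delay=300.0):
        self.pending = deque()   # (sequence number, operation), oldest first
        self.next_seq = 0
        self.failures = 0        # consecutive connection failures
        self.base_delay = base_delay
        self.max_delay = max_delay

    def enqueue(self, operation):
        """Queue an operation and return its sequence number."""
        seq = self.next_seq
        self.pending.append((seq, operation))
        self.next_seq += 1
        return seq

    def to_send(self):
        """Everything not yet acknowledged; sent again after a reconnect."""
        return list(self.pending)

    def acknowledge(self, seq):
        """Dequeue every operation up to and including `seq`."""
        while self.pending and self.pending[0][0] <= seq:
            self.pending.popleft()

    def connection_failed(self):
        """Record a failure and return the delay before the next attempt."""
        delay = min(self.base_delay * 2 ** self.failures, self.max_delay)
        self.failures += 1
        return delay

    def connection_established(self):
        """Reset the backoff once a connection succeeds."""
        self.failures = 0
```

Keeping operations queued until they are acknowledged means that a dropped connection causes resending rather than loss, at the cost of possible duplicates that the receiver has to tolerate.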




3.  Protocol Specification




3.1.  Connection Initiation and Lifetime

As an XMPP extension, this protocol expects a bidirectional stream to be established according to the XMPP core specification.

The connection MUST be secured using the TLS feature of XMPP. It is RECOMMENDED that communication is encrypted (namely by using a non-identity TLS cipher).

All communication except wavelet updates is sent via PubSub (“XMPP Publish Subscribe,” September 2008.) [XEP0060] events. Wavelet updates are sent using Message stanzas.




3.2.  Cryptographic Certificates and Signatures

In the section below there are references to cryptographic signatures and certificates used to generate them.

The paper by Kissner and Laurie, General Verifiable Federation, gives a detailed explanation of how we expect to make all changes to wavelets attributable to their originating servers and render the federation protocol immune to a number of attacks. The techniques described in the paper have not yet been fully implemented or incorporated into this protocol specification; however, certificates are exchanged using the get signer and post signer XMPP messages.




3.3.  Stanzas

The federation protocol involves two parties: a wave federation host and a wave federation remote, as described above (Federation Host and Federation Remote). The top-level stanzas are divided into two types: "update stanzas" and "service stanzas". The update stanzas (Update Stanzas) are initiated by a federation host to a federation remote and carry <wavelet-update/> elements (wavelet-update) from the host to the remote. The service stanzas (Service Stanzas) are initiated by the federation remote to the federation host and carry <submit-request/> (Submit Request) and <submit-response/> (submit-response) elements, <history-request/> (History Request) and <history-response/> (History Response) elements, <signer-get-request/> and <signer-get-response/> elements, and <signer-post-request/> and <signer-post-response/> elements.




3.3.1.  Commonly used attributes

These stanzas commonly contain the following attributes:




3.3.1.1.  wavelet-name

The 'wavelet-name' attribute is an encoded form of the following components:

  • A "wave id" specifies the domain of the wave provider that originally started the wave, plus an identifier unique within that domain.
  • A "wavelet id" specifies the domain of the wave provider that hosts the wavelet, plus an identifier that is unique among all wavelets with that domain within the wave.

These components are encoded into a single string in the format of a netpath of a URI. The wavelet id domain is used as the host part (since this is where the wavelet is hosted). The wave id is used as the first path element. If the wave id domain does not match the wavelet id domain, the wave id domain is prepended to the wave's unique identifier with a '$' delimiter; otherwise the domain is omitted. The unique identifier in the wavelet id is the final path element. URI generic delimiter characters (:/?#[]@) appearing in the id parts must be percent-escaped.

For example, a 'wavelet-name' might be "initech-corp.com/acmewave.com$w+4Kl2/conv+3sG7", where the wavelet id has domain "initech-corp.com" and unique identifier "conv+3sG7", and the wave id has domain "acmewave.com" and unique identifier "w+4Kl2".

If the wavelet was hosted at "initech-corp.com" and the wave had also been started on that domain, the 'wavelet-name' would be "initech-corp.com/w+4Kl2/conv+3sG7".
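The encoding rules above can be sketched in Python as follows. The function names `escape_id` and `encode_wavelet_name` are hypothetical, and the sketch assumes that only the listed generic delimiter characters need escaping; the text does not say how a literal '%' or '$' inside an id should be handled, so that case is left out.

```python
# URI generic delimiter characters that must be percent-escaped in id parts.
URI_GEN_DELIMS = ":/?#[]@"


def escape_id(part: str) -> str:
    """Percent-escape the generic delimiters; leave everything else as is."""
    return "".join(
        f"%{ord(ch):02X}" if ch in URI_GEN_DELIMS else ch for ch in part
    )


def encode_wavelet_name(wave_domain, wave_id, wavelet_domain, wavelet_id):
    """Build a 'wavelet-name' string from a wave id and a wavelet id."""
    wave_part = escape_id(wave_id)
    # The wave id domain is included only when it differs from the host.
    if wave_domain != wavelet_domain:
        wave_part = f"{wave_domain}${wave_part}"
    return f"{wavelet_domain}/{wave_part}/{escape_id(wavelet_id)}"
```

Called with the components of the two examples above, this sketch reproduces both strings: the cross-domain form with the '$' delimiter and the shorter same-domain form.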




3.3.2.  Commonly used elements




3.3.2.1.  hashed-version

A <hashed-version/> element contains the version and history hash pair of a wavelet.

  • 'version' -- REQUIRED attribute which contains the version of the wavelet.
  • 'history-hash' -- REQUIRED attribute which is the value of the rolling history hash at the given version.




3.3.2.2.  commit-notice

The <commit-notice/> element is a variant of the <hashed-version/> element. It is used to indicate that the wave server has committed deltas up to this point.

  • 'version' -- REQUIRED attribute which contains the version of the wavelet.




3.3.2.3.  delta

The <delta/> element contains a sequence of one or more operations grouped for communication to and between wave servers:

  • 'wavelet-name' -- REQUIRED wavelet-name (wavelet-name).
  • <operation/> -- The operation is carried as the text of the <delta> element as a Base64 encoded protocol buffer.




3.3.2.4.  applied-delta

The <applied-delta/> element contains a delta which has been successfully applied to a wavelet by a wave server, along with supplementary information about the result of the application.

  • <operation/> -- The operation is carried as the text of the <applied-delta> element as a Base64 encoded protocol buffer.




3.3.3.  Update Stanzas

The wavelet-update operation is sent as a Message stanza.




3.3.3.1.  wavelet-update

The <wavelet-update/> element is used within a Message stanza. It is used to push new wavelet operations applied to a local wavelet to the wave providers of any remote participants.

When the requester is resending updates after reconnecting an XMPP stream, the <wavelet-update/> MAY omit the <applied-delta/> elements but MUST resend the <commit-notice/> elements. In this case the <commit-notice/> informs the receiver of the existence of updates, and it is up to the receiver to request these using a <history-request/> on a service stream.




3.3.3.2.  Successful update and response example

An example of initech-corp.com's federation host pushing data to a federation remote:

Step 1: initech-corp.com's federation host sends an update to the federation remote:

<message type="normal"
  from="wave.initech-corp.com"
  id="1-1" to="wave.acmewave.com">
  <request xmlns="urn:xmpp:receipts"/>
  <event xmlns="http://jabber.org/protocol/pubsub#event">
    <items>
      <item>
        <wavelet-update
          xmlns="http://waveprotocol.org/protocol/0.2/waveserver"
          wavelet-name="acmewave.com/initech-corp.com!a/b">
          <applied-delta><![CDATA[CiI...MwE]]></applied-delta>
        </wavelet-update>
      </item>
    </items>
  </event>
</message>

Step 2: The federation remote acknowledges the update, indicating success.

<message id="1-1"
  from="wave.acmewave.com"
  to="wave.initech-corp.com">
  <received
    xmlns="urn:xmpp:receipts"/>
</message>
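The acknowledgement in Step 2 can be derived mechanically from the update in Step 1. The following Python sketch, using only the standard library, builds the receipt by reusing the message id and swapping the 'from' and 'to' attributes, as in the example above. The function name `build_receipt` is hypothetical, and the sketch assumes the stanza arrives as a standalone XML string rather than as part of an open XMPP stream.

```python
import xml.etree.ElementTree as ET

RECEIPTS_NS = "urn:xmpp:receipts"


def build_receipt(update_xml: str) -> str:
    """Return the receipt stanza acknowledging a wavelet update message."""
    update = ET.fromstring(update_xml)
    # Only acknowledge stanzas that actually ask for a receipt.
    if update.find(f"{{{RECEIPTS_NS}}}request") is None:
        raise ValueError("stanza does not request a receipt")
    receipt = ET.Element("message", {
        "id": update.get("id"),
        "from": update.get("to"),    # the receiver becomes the sender
        "to": update.get("from"),
    })
    ET.SubElement(receipt, f"{{{RECEIPTS_NS}}}received")
    return ET.tostring(receipt, encoding="unicode")
```

The serializer may emit the receipts namespace with a generated prefix instead of a default namespace declaration; the two forms are equivalent to a namespace-aware XML parser.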



3.3.4.  Service Stanzas

The service stanzas are for the submission of operations and wavelet history retrieval.




3.3.4.1.  History Request

The <delta-history/> element is used within a PubSub (“XMPP Publish Subscribe,” September 2008.) [XEP0060] event. It is sent by a federation remote to request wavelet operations from the hosting wave provider. The response by the hosting provider's federation host will contain the operations for the requested version range.

  • 'wavelet-name' -- REQUIRED attribute.
  • 'start-version' -- REQUIRED attribute with version number (inclusive) from which to retrieve the wavelet's history. (Note that the start version MUST fall on a delta boundary).
  • 'start-version-hash' -- REQUIRED attribute with the hash for the associated start version.
  • 'end-version' -- OPTIONAL attribute with ending version number (exclusive) up to which to retrieve the wavelet's history. (Note that the end version MUST fall on a delta boundary).
  • 'end-version-hash' -- REQUIRED attribute with the hash for the associated end version.
  • 'response-length-limit' -- OPTIONAL attribute containing advice from the requester about the preferred response limit, measured as the aggregate number of characters in the XML serialization of the applied deltas in the response. The responder is advised but not required to respect the limit. Moreover, the responder may operate with a lower limit of its own and send back a smaller message than requested. When the responder exercises either its own or the requester's limit, it will return only a prefix of the requested wavelet deltas. Unless the version range is empty, the responder will always return a minimum of one wavelet delta (the first) even if its length exceeds the responder's or requester's limits.
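The truncation rule for 'response-length-limit' can be sketched in Python as follows. The function name `select_history_prefix` and its parameters are hypothetical. The sketch assumes a responder that chooses to honor the requester's limit (which the text permits but does not require) and that already knows the serialized length of each applied delta.

```python
def select_history_prefix(delta_lengths, requester_limit=None,
                          responder_limit=None):
    """Return how many leading deltas to include in a history response.

    `delta_lengths` holds the serialized length in characters of each
    applied delta in the requested version range, in order.
    """
    if not delta_lengths:
        return 0  # empty version range: nothing to return
    limits = [x for x in (requester_limit, responder_limit) if x is not None]
    if not limits:
        return len(delta_lengths)
    limit = min(limits)  # the stricter of the two limits applies
    count, total = 0, 0
    for length in delta_lengths:
        # The first delta is always returned, even if it exceeds the limit.
        if count > 0 and total + length > limit:
            break
        total += length
        count += 1
    return count
```

When the returned count is smaller than the number of requested deltas, the response would be a truncated prefix and the requester can issue a further request starting at the version where the response stopped.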




3.3.4.2.  History Response

The response to a History Request contains:

  • <history-truncated/> -- OPTIONAL element indicating that the returned deltas were truncated at the version number given in its 'version' attribute (exclusive). Truncation will occur if the <delta-history/> (History Request) specified a 'response-length-limit' attribute or the responder imposed its own limit.
  • <applied-delta/> -- the response contains ZERO OR MORE <applied-delta/> elements, starting from the requested version up to the requested end version (exclusive), or until the latest version if the request did not contain the end version, or up to the version indicated in <history-truncated/>.
  • <commit-notice/> -- OPTIONAL element indicating that some range of the returned deltas has not been committed to persistent storage by the hosting wave server. The <commit-notice/> indicates up to which version the server has committed.




3.3.4.3.  Successful history request / history response example

Step 1: A federation remote makes a history-request (History Request) to the acmewave.com federation host:

<iq type="get" id="1-1" from="wave.initech-corp.com" to="wave.acmewave.com">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <items node="wavelet">
      <delta-history
        xmlns="http://waveprotocol.org/protocol/0.2/waveserver"
        start-version="12"
        start-version-hash=""
        end-version="2345"
        end-version-hash=""
        response-length-limit="300000"
        wavelet-name="acmewave.com/initech-corp.com!a/b"/>
    </items>
  </pubsub>
</iq>

Step 2: acmewave.com's federation host returns the requested history.

<iq type="result" id="1-1" from="wave.acmewave.com" to="wave.initech-corp.com">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <items>
      <item>
        <applied-delta
          xmlns="http://waveprotocol.org/protocol/0.2/waveserver">
            <![CDATA[CiI...MwE]]>
        </applied-delta>
      </item>
      <item>
        <commit-notice
          xmlns="http://waveprotocol.org/protocol/0.2/waveserver"
          version="2344"/>
      </item>
      <item>
        <history-truncated
          xmlns="http://waveprotocol.org/protocol/0.2/waveserver"
          version="2300"/>
      </item>
    </items>
  </pubsub>
</iq>



3.3.4.4.  Submit Request

The <submit-request/> element is used within a PubSub (“XMPP Publish Subscribe,” September 2008.) [XEP0060] event. The federation remote submits wavelet operations to the hosting wave provider. A <submit-response/> will be returned.

  • <delta/> -- REQUIRED delta element to be submitted.




3.3.4.5.  submit-response

A <submit-response/> element is used within a PubSub (“XMPP Publish Subscribe,” September 2008.) [XEP0060] response. It is returned by a federation host after the hosting wave server has processed the submitted delta.

  • 'operations-applied' -- REQUIRED attribute with the number of operations applied by the wave server after transforming the submitted delta.
  • 'application-timestamp' -- REQUIRED timestamp (milliseconds since epoch) attribute recording the time of delta application.
  • 'error-message' -- OPTIONAL string attribute containing an error message if an error occurred while applying the delta. Note that it is possible to partially apply a delta, in which case the error message will be present.
  • <hashed-version/> -- REQUIRED element with the version and history hash of the wavelet after the submitted delta was applied.




3.3.4.6.  Successful submit request / submit response example

Step 1: The federation remote makes a submit request to the acmewave.com federation host:

<iq type="set" id="1-1" from="wave.initech-corp.com" to="wave.acmewave.com">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <publish node="wavelet">
      <item>
        <submit-request
          xmlns="http://waveprotocol.org/protocol/0.2/waveserver">
          <delta wavelet-name="acmewave.com/initech-corp.com!a/b">
            <![CDATA[CiA...NvbQ==]]>
          </delta>
        </submit-request>
      </item>
    </publish>
  </pubsub>
</iq>

Step 2: The acmewave.com federation host returns a response to the submit request. Note that this example shows the case where a different party submitted a delta at version 100 with 3 operations before this submit-request was received. The requester's submitted delta was thus transformed before it was applied, and as a result the version number at which it was applied was 103.

<iq type="result" id="1-1" from="wave.acmewave.com" to="wave.initech-corp.com">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <publish>
      <item>
        <submit-response
          xmlns="http://waveprotocol.org/protocol/0.2/waveserver"
          application-timestamp="1234567890"
          operations-applied="2">
          <hashed-version
            history-hash=""
            version="1234"/>
        </submit-response>
      </item>
    </publish>
  </pubsub>
</iq>



4.  Wavelet Update

Wavelet update operations mutate wavelets. The actual operation is a signed protocol buffer that is included as Base64-encoded text in the applied-delta element. The wavelet update MAY contain multiple applied-delta elements and an optional commit-notice. The wavelet update response is an XMPP receipt of the form specified in XEP-0184.

Here is an example exchange:

<message type="normal" from="wave.initech-corp.com" id="1-1" to="wave.acmewave.com">
  <request xmlns="urn:xmpp:receipts"/>
  <event xmlns="http://jabber.org/protocol/pubsub#event">
    <items>
      <item>
        <wavelet-update xmlns="http://waveprotocol.org/protocol/0.2/waveserver" wavelet-name="acmewave.com/initech-corp.com!a/b">
          <applied-delta><![CDATA[CiIKIAoFCNIJEgASF2ZvenppZUBpbml0ZWNoLWNvcnAuY29tEgUI0gkSABgCINKF2MwE]]></applied-delta>
        </wavelet-update>
      </item>
    </items>
  </event>
</message>
<message id="1-1" from="wave.acmewave.com" to="wave.initech-corp.com">
  <received xmlns="urn:xmpp:receipts"/>
</message>



5.  Get Signer

A remote wave server issues a signer-request to request certificates for wavelets where the signer of the wavelet is currently unknown. The request is sent to the wave server that hosts the wavelet. The provided history-hash identifies the delta for which the certificate is being requested.

Here is an example exchange:

<iq type="get" id="1-1" from="wave.initech-corp.com" to="wave.acmewave.com">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <items node="signer">
      <signer-request xmlns="http://waveprotocol.org/protocol/0.2/waveserver"
        history-hash="somehash" version="1234"
        wavelet-name="acmewave.com/initech-corp.com!a/b"/>
    </items>
  </pubsub>
</iq>

The hosting wave server replies with a chain of certificates sent Base64 encoded in the certificate elements. Each certificate element represents a single certificate. The order of the certificate elements goes from the first which is the closest certificate, to the last certificate which is the root for the certificate chain. More details on signing are still to be added to this document.

<iq type="result" id="1-1" from="wave.acmewave.com" to="wave.initech-corp.com">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <items>
      <signature xmlns="http://waveprotocol.org/protocol/0.2/waveserver"
        domain="initech-corp.com" algorithm="SHA256">
        <certificate><![CDATA[Q0VS...VElPTg==]]></certificate>
        <certificate><![CDATA[QkV...LRQ==]]></certificate>
      </signature>
    </items>
  </pubsub>
</iq>
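A receiver has to extract the certificate chain from the <signature/> element before it can verify anything. The following Python sketch, using only the standard library, Base64-decodes the <certificate/> children in document order, so the first entry of the result is the closest certificate and the last is the root. The function name `decode_certificate_chain` is hypothetical, and the sketch stops at decoding: how the decoded bytes are parsed and validated is not specified in this document.

```python
import base64
import xml.etree.ElementTree as ET

WAVE_NS = "http://waveprotocol.org/protocol/0.2/waveserver"


def decode_certificate_chain(signature_xml: str) -> list:
    """Return the decoded certificates, closest first and root last."""
    signature = ET.fromstring(signature_xml)
    chain = []
    # findall preserves document order, which carries the chain order.
    for cert in signature.findall(f"{{{WAVE_NS}}}certificate"):
        chain.append(base64.b64decode(cert.text.strip()))
    return chain
```

The 'domain' and 'algorithm' attributes of the <signature/> element remain available on the parsed element for whatever verification step follows.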



6.  Post Signer

Before submitting a wavelet delta for the first time, a remote wave server will supply the certificate chain that will allow the hosting wave server to authenticate the signed wave delta. More details on signing are still to be added to this document.

Here is an example exchange:

<iq type="set" id="1-1" from="wave.initech-corp.com" to="wave.acmewave.com">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <publish node="signer">
      <item>
        <signature xmlns="http://waveprotocol.org/protocol/0.2/waveserver"
          domain="initech-corp.com" algorithm="SHA256">
          <certificate><![CDATA[Q0V...Tg==]]></certificate>
          <certificate><![CDATA[QkV...RQ==]]></certificate>
        </signature>
      </item>
    </publish>
  </pubsub>
</iq>

The hosting wave server acknowledges the message.

<iq type="result" id="1-1" from="wave.acmewave.com" to="wave.initech-corp.com">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <publish>
      <item node="signer">
        <signature-response xmlns="http://waveprotocol.org/protocol/0.2/waveserver" />
      </item>
    </publish>
  </pubsub>
</iq>



7.  Documents

A document is a sequence of items, where each item is a character, a start tag, or an end tag. Each item has a key-value map of annotations.

Characters are Unicode code points. Certain control characters, special characters and noncharacters are not permitted.

Start tags consist of a type and attributes. The type is an XML name. The attributes form a key-value map, where keys and values are strings. Certain Unicode control characters, special characters and noncharacters are permitted neither in the type nor in attribute names or values.

Each end tag terminates the rightmost unterminated start tag; the tag name is implicit. The number of start tags in the document equals the number of end tags, and for every prefix of the document, the number of start tags equals or exceeds the number of end tags. Thus, start and end tags nest properly, and there are no self-closing tags.
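The nesting condition above amounts to a single pass with a depth counter. In the following Python sketch the item representation is an assumption made for illustration: each item is a tuple whose first field is "char", "start", or "end". The function name `tags_nest_properly` is likewise hypothetical.

```python
def tags_nest_properly(items):
    """Check the start/end tag balance condition on a sequence of items."""
    depth = 0  # start tags seen so far minus end tags seen so far
    for item in items:
        kind = item[0]
        if kind == "start":
            depth += 1
        elif kind == "end":
            depth -= 1
            if depth < 0:
                # Some prefix has more end tags than start tags.
                return False
    # Overall, start tags and end tags must be equal in number.
    return depth == 0
```

Because end tags carry no name, no stack of open tag types is needed; a counter is enough to decide validity.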

Annotation keys and values are strings. Certain Unicode control characters, special characters and noncharacters are not permitted. If the map has no entry for a given key, we sometimes say that the value for that key is null. While each item conceptually has its own annotation map, implementations may find it more efficient to have just one annotation map for each consecutive run of items with the same annotations.

Note that a naive serialization of the document without annotations into a string is not formally an XML document because it can have multiple elements and characters at the top level, while XML requires a single root element. How to interpret the document as XML is up to the application; options include making sure at the application level that the entire document contents are inside a single element even if the protocol does not enforce this; ignoring all content other than the first element; or wrapping the entire document in an implicit root element whose type and attributes are not represented inside the document.




7.1.  Document operations

A document operation is a set of instructions that specify how to process an input document, reading its sequence of items from left to right, to generate an output document. For the purpose of this specification, the operation does not modify the input document, although implementations that perform destructive updates are possible.

Document operations are invertible; for any document operation op that turns an input document A into an output document B, an inverse operation that turns B into A can always be derived from op without knowledge of A or B.
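
One way to derive such an inverse, sketched here in Python, is to keep the components in their original order and swap each one for its mirror image: insertions become deletions and vice versa, and every (old, new) pair is reversed. This component-wise construction and the tuple encoding of components are assumptions of this example, chosen to be consistent with the component definitions in Section 7.2; the specification does not prescribe a particular algorithm.

```python
def _swap_pairs(update):
    """Reverse every (old, new) pair in a key -> (old, new) map."""
    return {key: (new, old) for key, (old, new) in update.items()}


def invert_component(component):
    """Return the mirror image of one component (hypothetical tuple encoding)."""
    kind = component[0]
    if kind == "retain":
        return component
    if kind == "characters":
        return ("deleteCharacters", component[1])
    if kind == "deleteCharacters":
        return ("characters", component[1])
    if kind == "elementStart":
        return ("deleteElementStart", component[1], component[2])
    if kind == "deleteElementStart":
        return ("elementStart", component[1], component[2])
    if kind == "elementEnd":
        return ("deleteElementEnd",)
    if kind == "deleteElementEnd":
        return ("elementEnd",)
    if kind == "replaceAttributes":
        _, old_attributes, new_attributes = component
        return ("replaceAttributes", new_attributes, old_attributes)
    if kind == "updateAttributes":
        return ("updateAttributes", _swap_pairs(component[1]))
    if kind == "annotationBoundary":
        _, ends, changes = component
        # The same keys end at the same place; only the values are reversed.
        return ("annotationBoundary", ends, _swap_pairs(changes))
    raise ValueError("unknown component: %r" % (kind,))


def invert(operation):
    """Invert an operation without looking at any document."""
    return [invert_component(component) for component in operation]


op = [
    ("retain", 2),
    ("characters", "ab"),
    ("deleteCharacters", "x"),
    ("updateAttributes", {"id": (None, "1")}),
]
inverse = invert(op)
```

Under this construction, inverting twice yields the original operation.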

A document operation consists of a sequence of document operation components that are executed in order. During this process, two pieces of state need to be maintained in addition to the document being processed:

  • the current location ('cursor') in the input document, either to the left of the first item, between two items, or to the right of the last item of the input document, and
  • the current annotations update, which is a map of annotation keys to pairs (old-value, new-value), where old-value and new-value are either null or an annotation value.

Initially, the cursor is to the left of the first item, and the annotations update is empty.

After the final component, the annotations update must be empty, and the cursor must be to the right of the last item in the input document.
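
The cursor handling and the end-of-operation condition can be illustrated with a deliberately reduced model. The Python sketch below treats the document as a plain string of characters, with no tags and no annotations, and supports only three of the components defined in Section 7.2 (retain, characters, deleteCharacters). The tuple encoding and the name apply_text_op are assumptions of this example.

```python
def apply_text_op(document, operation):
    """Apply a characters-only operation to a string and return the output.

    `operation` is a list of (kind, argument) tuples. The input string is
    not modified; the output is built separately.
    """
    cursor = 0   # position in the input document
    output = []  # pieces of the output document
    for kind, argument in operation:
        if kind == "retain":
            if argument <= 0 or cursor + argument > len(document):
                raise ValueError("retain out of range")
            output.append(document[cursor:cursor + argument])
            cursor += argument
        elif kind == "characters":
            # Insertion: generates output, cursor does not move.
            output.append(argument)
        elif kind == "deleteCharacters":
            # Deletion: the argument must match the input at the cursor.
            if document[cursor:cursor + len(argument)] != argument:
                raise ValueError("deleted characters do not match input")
            cursor += len(argument)
        else:
            raise ValueError("unsupported component: %r" % (kind,))
    if cursor != len(document):
        # The cursor must end to the right of the last input item.
        raise ValueError("operation does not span the whole input document")
    return "".join(output)


result = apply_text_op(
    "hello world",
    [("retain", 6), ("deleteCharacters", "world"), ("characters", "wave")],
)
```

In this example the result is "hello wave". An operation that stops before the end of the input, such as a single retain of 2 applied to a three-character document, is rejected.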




7.2.  Document operation components

Document operation components can be divided into four classes:

  • update components (retain, replaceAttributes, updateAttributes) move the cursor over a consecutive range of input items and generate corresponding but potentially modified items in the output document;
  • insertion components (characters, elementStart, elementEnd) generate items in the output document without moving the cursor;
  • deletion components (deleteCharacters, deleteElementStart, deleteElementEnd) move the cursor over a consecutive range of input items without generating any output;
  • annotation boundaries (annotationBoundary) change the current annotations update but do not directly affect the document or the cursor.
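
The four classes determine how many input items a component consumes and how many output items it generates. As an illustration, the following Python sketch totals both counts for a list of components; the tuple encoding and the name op_lengths are assumptions of this example.

```python
def op_lengths(operation):
    """Return (input items consumed, output items generated) for an operation."""
    consumed = 0
    generated = 0
    for component in operation:
        kind = component[0]
        if kind == "retain":
            # Update component over itemCount items.
            consumed += component[1]
            generated += component[1]
        elif kind in ("replaceAttributes", "updateAttributes"):
            # Update component over exactly one start tag.
            consumed += 1
            generated += 1
        elif kind == "characters":
            generated += len(component[1])
        elif kind in ("elementStart", "elementEnd"):
            generated += 1
        elif kind == "deleteCharacters":
            consumed += len(component[1])
        elif kind in ("deleteElementStart", "deleteElementEnd"):
            consumed += 1
        elif kind == "annotationBoundary":
            # Changes only the annotations update; no items involved.
            pass
        else:
            raise ValueError("unknown component: %r" % (kind,))
    return consumed, generated


op = [
    ("retain", 3),
    ("elementStart", "b", {}),
    ("annotationBoundary", [], {"bold": (None, "true")}),
    ("characters", "hi"),
    ("annotationBoundary", ["bold"], {}),
    ("elementEnd",),
    ("deleteCharacters", "xyz"),
]
consumed, generated = op_lengths(op)
```

Because the cursor must finish to the right of the last input item, the consumed count has to equal the number of items in the input document for the operation to be applicable.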

The different component classes have the following interaction with annotations:

  • For update components, the old values in the annotations update match the annotation values of each item in the input document that the component processes. The generated items in the output document will have the same annotations as the corresponding input items, except for the annotation keys in the annotations update; for those keys, the generated items will have the new values.
  • For insertion components, the old values in the annotations update match the annotations of the item to the left of the cursor. The inserted items are annotated with the new values from the annotations update in addition to any annotations on the item to the left of the cursor with keys that are not part of the annotations update.
    If the cursor is at the beginning of the document, the old values in the annotations update are null, and the inserted items are annotated with the new values from the annotations update.
  • For deletion components, the old values in the annotations update match the annotations of each item in the input document processed by the component, and the new values match the annotations of the rightmost item generated so far. All annotation keys that have different values in the processed item and the rightmost item generated so far are present in the annotations update.
    If no items have been generated so far, the new values are null, and all annotation keys of the deleted items must be present in the annotations update.
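
As an illustration of the rule for insertion components, the following Python sketch computes the annotations of an inserted item from the item to the left of the cursor and the current annotations update. The dict representation, the use of None for a null value and for "no item to the left", and the name annotate_inserted_item are assumptions of this example.

```python
def annotate_inserted_item(left_annotations, annotations_update):
    """Return the annotation map for an item generated by an insertion.

    `left_annotations` is the annotation map of the item to the left of
    the cursor, or None if the cursor is at the beginning of the document.
    `annotations_update` maps keys to (old_value, new_value) pairs.
    """
    left = left_annotations or {}
    # The old values must match the item to the left of the cursor;
    # a missing key counts as null.
    for key, (old, _new) in annotations_update.items():
        if left.get(key) != old:
            raise ValueError("old value mismatch for key %r" % (key,))
    # Keys outside the update are inherited from the left item.
    result = {k: v for k, v in left.items() if k not in annotations_update}
    # Keys inside the update take the new value; null means no entry.
    for key, (_old, new) in annotations_update.items():
        if new is not None:
            result[key] = new
    return result


inserted = annotate_inserted_item(
    {"bold": "true", "lang": "en"},
    {"bold": ("true", None), "link": (None, "http://example.com/")},
)
first = annotate_inserted_item(None, {"bold": (None, "true")})
```

Here the inserted item keeps "lang" from its left neighbour, loses "bold", and gains "link".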

retain(itemCount)
The cursor moves over the next itemCount items, and they are copied to the output document, with annotations as described above. The argument itemCount is a positive integer.
replaceAttributes(oldAttributes, newAttributes)
The cursor moves over the next item, which must be a start tag with the attributes oldAttributes. A start tag with the same type but the attributes newAttributes is generated in the output. Its annotations are as described above. The arguments oldAttributes and newAttributes are key-value maps.
updateAttributes(attributesUpdate)
The cursor moves over the next item, which must be a start tag. A start tag with the same type is generated in the output, with annotations as described above. The argument attributesUpdate is a map of attribute names to pairs (oldValue, newValue), where oldValue and newValue are either null or an attribute value. The oldValues match the attributes of the start tag in the input document; an oldValue of null means no such attribute is present. The generated start tag has the new values for the attributes in attributesUpdate. Attributes in the input whose names are not listed are transferred to the output unchanged.
characters(characters)
The specified characters are inserted into the output document, with annotations as described above.
elementStart(type, attributes)
An element start with type type and attributes attributes is inserted into the output document, with annotations as described above. This component must be terminated with an elementEnd. Between an elementStart and its corresponding elementEnd, only insertion components are permitted.
elementEnd
An element end is inserted into the output document, with annotations as described above. This component terminates the most recent unterminated elementStart. It must not occur without a corresponding elementStart.
deleteCharacters(characters)
This component moves the cursor over the specified characters in the input document without generating any output. The characters must match the actual characters in the input document, and the current annotations update must match as described above.
deleteElementStart(type, attributes)
This component moves the cursor over the specified element start in the input document without generating any output. There must be an element start to the right of the cursor, and its type and attributes must match the arguments. The current annotations update must match as described above. This component must be terminated with a deleteElementEnd. Between a deleteElementStart and its corresponding deleteElementEnd, only deletion components are permitted.
deleteElementEnd
This component moves the cursor over an element end in the input document without generating any output. There must be an element end to the right of the cursor. The current annotations update must match as described above. This component terminates the most recent unterminated deleteElementStart. It must not occur without a corresponding deleteElementStart.
annotationBoundary(ends, changes)
This component modifies the current annotations update. Ends is a set of annotation keys; these keys are removed from the annotations update. Changes is a map of annotation keys to pairs (oldValue, newValue), where oldValue and newValue are either null or an annotation value; these entries are added to the annotations update, or replace entries in the annotations update that have the same key. The keys in ends and changes must be disjoint. An operation must not contain two consecutive annotationBoundary components. Ends must only contain keys that are part of the current annotations update.
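
The effect of annotationBoundary on the current annotations update can be sketched as follows. This Python fragment is illustrative only; the dict representation of the annotations update and the name apply_annotation_boundary are assumptions of this example. The rule against two consecutive annotationBoundary components concerns the operation as a whole and is not checked here.

```python
def apply_annotation_boundary(current_update, ends, changes):
    """Return the annotations update after an annotationBoundary component.

    `current_update` and `changes` map annotation keys to
    (old_value, new_value) pairs; `ends` is a collection of keys.
    """
    ends = set(ends)
    if ends & set(changes):
        raise ValueError("ends and changes must be disjoint")
    if not ends <= set(current_update):
        raise ValueError("ends must only contain keys in the current update")
    # Remove ended keys, then add or replace the changed keys.
    updated = {k: v for k, v in current_update.items() if k not in ends}
    updated.update(changes)
    return updated


update = {}
update = apply_annotation_boundary(update, [], {"bold": (None, "true")})
after_start = dict(update)
update = apply_annotation_boundary(update, ["bold"], {})
```

After the second call the annotations update is empty again, as required once the final component of an operation has been processed.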




8. References

[RFC3920] Saint-Andre, P., Ed., “Extensible Messaging and Presence Protocol (XMPP): Core,” RFC 3920, October 2004 (TXT, HTML, XML).
[TERMS] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997 (TXT, HTML, XML).
[XEP0060] “XMPP Publish Subscribe,” September 2008.



Appendix A.  Protocol Schema

The protocol schema, as RelaxNG compact:

namespace default = ""
namespace pubsub = "http://jabber.org/protocol/pubsub"
namespace disco = "http://jabber.org/protocol/disco#info"
namespace rec   = "urn:xmpp:receipts"  # NAMESPACE_XMPP_RECEIPTS
namespace discoitems = "http://jabber.org/protocol/disco#items"  # NAMESPACE_DISCO_ITEMS
namespace pubsubevt = "http://jabber.org/protocol/pubsub#event" # NAMESPACE_PUBSUB_EVENT
namespace wavesrv = "http://waveprotocol.org/protocol/0.2/waveserver" # NAMESPACE_WAVE_SERVER
namespace rcpt = "urn:xmpp:receipts"


## Our possible iq children
start = update
      | submitrequest | submitresponse
      | history-request | history-response
      | signer-get-request | signer-get-response
      | signer-post-request | signer-post-response

## Envelope to push wavelet operations. Used in a message stanza.
update =
  element message {
    attribute id   { text },
    attribute from { text },
    attribute to   { text },
    attribute type { text },
    element rcpt:request { empty },
    element pubsubevt:event {
      element pubsubevt:items {
        element pubsubevt:item {
          element wavesrv:wavelet-update {
            attribute wavelet-name { xsd:string },
            element wavesrv:applied-delta { text }*,
            commit-notice?
          }
        }
      }
    }
  }


## Request for historical wavelet operations. Used in iq get stanza.
history-request =
  element iq {
    attribute id   { text },
    attribute from { text },
    attribute to   { text },
    attribute type { text },
    element pubsub:pubsub {
      element pubsub:items {
        attribute node { text }, # set to "wavelet"
        element wavesrv:delta-history {
          attribute wavelet-name { xsd:string },
          attribute start-version { xsd:integer },
          attribute start-version-hash { xsd:string },
          attribute end-version { xsd:integer }?,
          attribute end-version-hash { xsd:string }?,
          attribute response-length-limit { xsd:integer }?
        } *
      }
    }
  }

## Response to history-request. Used in iq result stanza.
history-response =
  element iq {
    attribute id   { text },
    attribute from { text },
    attribute to   { text },
    attribute type { text },
    element pubsub:pubsub {
      element pubsub:items {
        element pubsub:item {
          element wavesrv:applied-delta {
            text
          }
          | element wavesrv:history-truncated {
            attribute version { xsd:integer }
          }
          | commit-notice
        } *
      }
    }
  }

## Request to submit operations to a wavelet. Used in iq set stanza.
submitrequest =
  element iq {
    attribute id   { text },
    attribute from { text },
    attribute to   { text },
    attribute type { text },
    element pubsub:pubsub {
      element pubsub:publish {
        attribute node { xsd:string },
        element pubsub:item {
          element wavesrv:submit-request {
            element wavesrv:delta {
              attribute wavelet-name { xsd:string },
              text
            }
          }
        }
      }
    }
  }

## Response to submit-request. Used in iq result stanza.
submitresponse =
  element iq {
    attribute id   { text },
    attribute from { text },
    attribute to   { text },
    attribute type { text },
    element pubsub:pubsub {
      element pubsub:publish {
        element pubsub:item {
          element wavesrv:submit-response {
            attribute application-timestamp { xsd:integer },
            attribute operations-applied { xsd:integer },
            attribute error-message { xsd:string }?,
            hashed-version
          }
        }
      }
    }
  }


## Signer get request
signer-get-request =
  element iq {
    attribute id   { text },
    attribute from { text },
    attribute to   { text },
    attribute type { text },
    element pubsub:pubsub {
      element pubsub:items {
        attribute node { xsd:string }, # need to be hardcoded value "signer"
        element wavesrv:signer-request {
          attribute signer-id {xsd:string },
          attribute wavelet-name { xsd:string },
          version-hash-attributes
        }
      }
    }
  }

## Signer get response
signer-get-response =
  element iq {
    attribute id   { text },
    attribute from { text },
    attribute to   { text },
    attribute type { text },
    element pubsub:pubsub {
      element pubsub:items {
        element wavesrv:signature {
          attribute domain { text },
          attribute algorithm { text },
          element wavesrv:certificate { text } +
        }
      }
    }
  }



## Signer post request
signer-post-request =
  element iq {
    attribute id   { text },
    attribute from { text },
    attribute to   { text },
    attribute type { text },
    element pubsub:pubsub {
      element pubsub:publish {
        attribute node { xsd:string }, # need to be hardcoded value "signer"
        element pubsub:item {
          element wavesrv:signature {
            attribute domain { text },
            attribute algorithm { text },
            element wavesrv:certificate { text } +
          }
        }
      }
    }
  }

## Signer post response
signer-post-response =
  element iq {
    attribute id   { text },
    attribute from { text },
    attribute to   { text },
    attribute type { text },
    element pubsub:pubsub {
      element pubsub:publish {
        element pubsub:item {
          attribute node { xsd:string }, # need to be hardcoded value "signer"
          element wavesrv:signature-response { empty }
        }
      }
    }
  }


## A wavelet version and the wavelet's history hash at that version.
version-hash-attributes =
  attribute version { xsd:integer } & attribute history-hash { xsd:string }

## Notification of the fact that the host provider has persisted a
## wavelet up to the specified version.
commit-notice =
  element wavesrv:commit-notice {
    attribute version { xsd:integer }
  }

## Describes a wavelet version and the wavelet's history hash at that version.
hashed-version =
  element wavesrv:hashed-version {
    version-hash-attributes
  }
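
As an illustration of the history-request pattern above, the following Python sketch builds a matching stanza with the standard library's xml.etree.ElementTree. The function name, the addresses, the wavelet name and the hash string are placeholders invented for this example; in particular, the textual encoding of the version hash is not taken from the schema.

```python
import xml.etree.ElementTree as ET

PUBSUB_NS = "http://jabber.org/protocol/pubsub"
WAVESRV_NS = "http://waveprotocol.org/protocol/0.2/waveserver"


def build_history_request(iq_id, sender, recipient, wavelet_name,
                          start_version, start_version_hash,
                          end_version=None, end_version_hash=None,
                          response_length_limit=None):
    """Build an iq element following the history-request pattern."""
    iq = ET.Element("iq", {"id": iq_id, "from": sender,
                           "to": recipient, "type": "get"})
    pubsub = ET.SubElement(iq, "{%s}pubsub" % PUBSUB_NS)
    # The schema comment states that node is set to "wavelet".
    items = ET.SubElement(pubsub, "{%s}items" % PUBSUB_NS, {"node": "wavelet"})
    attributes = {
        "wavelet-name": wavelet_name,
        "start-version": str(start_version),
        "start-version-hash": start_version_hash,
    }
    # The remaining attributes are optional in the schema.
    if end_version is not None:
        attributes["end-version"] = str(end_version)
    if end_version_hash is not None:
        attributes["end-version-hash"] = end_version_hash
    if response_length_limit is not None:
        attributes["response-length-limit"] = str(response_length_limit)
    ET.SubElement(items, "{%s}delta-history" % WAVESRV_NS, attributes)
    return iq


# All argument values below are placeholders.
iq = build_history_request(
    "1-1", "wave.initech-corp.example", "wave.acmewave.example",
    "example-wavelet-name", 12, "PLACEHOLDER-HASH",
    response_length_limit=100000,
)
xml_text = ET.tostring(iq, encoding="unicode")
```

The sketch emits a single delta-history element, although the pattern permits zero or more.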





Appendix B.  Protocol Buffers

The protocol buffer definitions:

/**
 * Copyright 2009 Google Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 */
//
// Google Wave Federation Protocol data structures.
//
// They are intended to be equivalent to the data structures in the
// draft "Google Wave Federation Protocol Over XMPP" at
// http://code.google.com/p/wave-protocol/source
//

syntax = "proto2";

package protocol;

option java_package = "org.waveprotocol.wave.protocol";
option java_outer_classname = "common";

/**
 * An immutable list of operations for contribution to a wavelet.
 * Specifies the contributor and the wavelet version that the
 * operations are intended to be applied to.  The host wave server
 * may apply the operations to the wavelet at the specified wavelet version
 * or it may accept them at a later version after operational transformation
 * against the operations at the intermediate wavelet versions.
 */
message ProtocolWaveletDelta {
  // Wavelet version that the delta is intended to be applied to.
  required ProtocolHashedVersion hashedVersion = 1;

  // Wave address of the contributor. Must be an explicit wavelet participant,
  // and may be different from the originator of this delta.
  required string author = 2;

  // Operations included in this delta.
  repeated ProtocolWaveletOperation operation = 3;

  /*
   * The nodes on the "overt" path from the originator through the address
   * access graph leading up to (but excluding) the author. The path excludes
   * any initial segments of the complete path which come before a WRITE edge
   * in the graph. This field is empty if the author is either the originator's
   * entry point into the address graph or is accessed by a WRITE edge.
   *
   * For example, "wave-discuss@acmewave.com" may be the explicit participant of
   * a wavelet, and is set as the author of a delta. However, this group is
   * being asked to act on behalf of "peter@initech-corp.com", who is a member
   * of "wave-authors", which is in turn a member of "wave-discuss". In this
   * example, the delta would be configured as such:
   *  delta.author = "wave-discuss@acmewave.com"
   *  delta.addressPath = ["peter@initech-corp.com", "wave-authors@acmewave.com"]
   */
  repeated string addressPath = 4;
}

/**
 * Describes a wavelet version and the wavelet's history hash at that version.
 */
message ProtocolHashedVersion {
  required int64 version = 1;
  required bytes historyHash = 2;
}

/**
 * An operation within a delta. Exactly one of the following four fields must be set
 * for this operation to be valid.
 */
message ProtocolWaveletOperation {

  // A document operation. Mutates the contents of the specified document.
  message MutateDocument {
    required string documentId = 1;
    required ProtocolDocumentOperation documentOperation = 2;
  }

  // Adds a new participant (canonicalized wave address) to the wavelet.
  optional string addParticipant = 1;

  // Removes an existing participant (canonicalized wave address) from the wavelet.
  optional string removeParticipant = 2;

  // Mutates a document.
  optional MutateDocument mutateDocument = 3;

  // Does nothing. True if set.
  optional bool noOp = 4;
}

/**
 * A list of mutation components.
 */
message ProtocolDocumentOperation {

  /**
   * A component of a document operation.  One (and only one) of the component
   * types must be set.
   */
  message Component {

    message KeyValuePair {
      required string key = 1;
      required string value = 2;
    }

    message KeyValueUpdate {
      required string key = 1;
      // Absent field means that the attribute was absent/the annotation
      // was null.
      optional string oldValue = 2;
      // Absent field means that the attribute should be removed/the annotation
      // should be set to null.
      optional string newValue = 3;
    }

    message ElementStart {
      required string type = 1;
      // MUST NOT have two pairs with the same key.
      repeated KeyValuePair attribute = 2;
    }

    message ReplaceAttributes {
      // This field is set to true if and only if both oldAttributes and
      // newAttributes are empty.  It is needed to ensure that the optional
      // replaceAttributes component field is not dropped during serialization.
      optional bool empty = 1;
      // MUST NOT have two pairs with the same key.
      repeated KeyValuePair oldAttribute = 2;
      // MUST NOT have two pairs with the same key.
      repeated KeyValuePair newAttribute = 3;
    }

    message UpdateAttributes {
      // This field is set to true if and only if attributeUpdates are empty.
      // It is needed to ensure that the optional updateAttributes
      // component field is not dropped during serialization.
      optional bool empty = 1;
      // MUST NOT have two updates with the same key.
      repeated KeyValueUpdate attributeUpdate = 2;
    }

    message AnnotationBoundary {
      // This field is set to true if and only if both ends and changes are
      // empty.  It is needed to ensure that the optional annotationBoundary
      // component field is not dropped during serialization.
      optional bool empty = 1;
      // MUST NOT have the same string twice.
      repeated string end = 2;
      // MUST NOT have two updates with the same key.  MUST NOT
      // contain any of the strings listed in the 'end' field.
      repeated KeyValueUpdate change = 3;
    }

    optional AnnotationBoundary annotationBoundary = 1;
    optional string characters = 2;
    optional ElementStart elementStart = 3;
    optional bool elementEnd = 4;
    optional int32 retainItemCount = 5;
    optional string deleteCharacters = 6;
    optional ElementStart deleteElementStart = 7;
    optional bool deleteElementEnd = 8;
    optional ReplaceAttributes replaceAttributes = 9;
    optional UpdateAttributes updateAttributes = 10;
  }

  repeated Component component = 1;
}

/**
 * Information generated about this delta post-application. Used in
 * ProtocolUpdate and ProtocolHistoryResponse.
 */
message ProtocolAppliedWaveletDelta {
  required ProtocolSignedDelta signedOriginalDelta = 1;
  optional ProtocolHashedVersion hashedVersionAppliedAt = 2;
  required int32 operationsApplied = 3;
  required int64 applicationTimestamp = 4;
}

/**
 * A delta signed with a number of domain signatures.
 */
message ProtocolSignedDelta {
  required ProtocolWaveletDelta delta = 1;
  repeated ProtocolSignature signature = 2;
}

/**
 * A signature for a delta. It contains the actual bytes of the signature,
 * an identifier of the signer (usually the hash of a certificate chain),
 * and an enum identifying the signature algorithm used.
 */
message ProtocolSignature {

  enum SignatureAlgorithm {
    SHA1_RSA = 1;
  }

  required bytes signatureBytes = 1;
  required bytes signerId = 2;
  required SignatureAlgorithm signatureAlgorithm = 3;
}

/**
 * A certificate chain that a sender will refer to in subsequent signatures.
 *
 * The signerId field in a ProtocolSignature refers to a ProtocolSignerInfo
 * as follows: The certificates present in a ProtocolSignerInfo are encoded
 * in PkiPath format, and then hashed using the hash algorithm indicated in the
 * ProtocolSignerInfo.
 */
message ProtocolSignerInfo {

  enum HashAlgorithm {
    SHA256 = 1;
    SHA512 = 2;
  }

  // The hash algorithm senders will use to generate an id that will refer to
  // this certificate chain in the future
  required HashAlgorithm hashAlgorithm = 1;

  // The domain that this certificate chain was issued to. Receivers of this
  // ProtocolSignerInfo SHOULD reject the ProtocolSignerInfo if the target
  // certificate (the first one in the list) is not issued to this domain.
  required string domain = 2;

  // The certificate chain. The target certificate (i.e., the certificate issued
  // to the signer) is first, and the CA certificate (or one issued directly
  // by the CA) is last.
  repeated bytes certificate = 3;
}
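
ProtocolWaveletOperation and ProtocolDocumentOperation.Component both rely on the convention that exactly one of their optional fields is set. A receiver may wish to verify this before processing a message. The following Python sketch shows such a check on a plain dict standing in for a decoded message; the dict representation and the name single_set_field are assumptions of this example and do not use generated protocol buffer classes.

```python
# Field names copied from the Component message above.
COMPONENT_FIELDS = (
    "annotationBoundary", "characters", "elementStart", "elementEnd",
    "retainItemCount", "deleteCharacters", "deleteElementStart",
    "deleteElementEnd", "replaceAttributes", "updateAttributes",
)

# Field names copied from the ProtocolWaveletOperation message above.
WAVELET_OPERATION_FIELDS = (
    "addParticipant", "removeParticipant", "mutateDocument", "noOp",
)


def single_set_field(message, field_names):
    """Return the name of the one field that is set, or raise ValueError.

    `message` is a dict holding only the fields that are present.
    """
    present = [name for name in field_names if name in message]
    if len(present) != 1:
        raise ValueError(
            "expected exactly one field to be set, found %d" % len(present))
    return present[0]


kind = single_set_field({"retainItemCount": 3}, COMPONENT_FIELDS)
op_kind = single_set_field({"noOp": True}, WAVELET_OPERATION_FIELDS)
```

The check is based on field presence rather than field value, which is the same concern that the boolean "empty" fields in ReplaceAttributes, UpdateAttributes and AnnotationBoundary address at the serialization level.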




Authors' Addresses

  Anthony Baxter
  Google, Inc.
Email:  arb@google.com
  
  Jochen Bekmann
  Google, Inc.
Email:  jochen@google.com
  
  Daniel Berlin
  Google, Inc.
Email:  dannyb@google.com
  
  Soren Lassen
  Google, Inc.
Email:  soren@google.com
  
  Sam Thorogood
  Google, Inc.
Email:  thorogood@google.com