Work in Progress: HTTP-based Federation Protocol

    Authors

    Torben Weis, University Duisburg-Essen
    Joseph Gentle
    Tad Glines
    Chris Harvey, iotaWave.org
    John Edstrom
    Charles Goddard
    Michael Rollins
    Ian Roughley, Novell
    Michael MacFadden, SOLUTE Consulting

Purpose and Motivation

The purpose of this specification is to provide an alternative transport for wave federation. The original wave implementation uses XMPP, which adds considerable complexity to setting up, or even implementing, a wave server.

The design of the protocol follows a number of requirements:
    • Keep it simple. We want developers to build their own wave servers and make them federate easily
    • Keep it efficient. The HTTP-based protocol should not be slower than the XMPP version. In an ideal case it will even be faster
    • Keep it secure: Provide the same (or higher) level of security as with the XMPP-based federation

The federation protocol described here is HTTPS-based. Thus, it can run as a stand-alone server with an HTTPS front-end, or hidden behind a web server and connected via FCGI.

    Scope

    This document shows how the logical message flow of wave federation can be implemented using an HTTPS-based transport. Furthermore, this document specifies how to do discovery, i.e. given a wave user ID (e.g. joe@example.com) how can we find the corresponding wave server for this user.

The semantics of the messages being sent are discussed here only briefly. For a more in-depth discussion, see the specification of the logical message flow in wave federation [TODO].

    Discovery

    When a wave server wants to send a message to a user, it must find the corresponding wave server. For example, a hosting-server notices that user joe@example.com has been added to a wave. It needs to know how to talk to this user's wave server. The discovery protocol consists of three parts.
  1. Determine or guess the host-name and port of the wave server
  2. Try to fetch a capabilities document from this server using HTTPS. For example we suspect foo.com at port 443 to provide wave services. Using HTTPS we ask for the document https://foo.com:443/wave/fed/capabilities. This should return a document with a format as outlined below
  3. Inspect the capabilities file. It describes, for example, which version of the wave federation protocol this server supports. Optionally, the wave server may have special security requirements or policies which are described in the capabilities file.
    The protocol for this contains 3 options which are tried in the following order:
  1. For a user joe@foo.com, look up a DNS SRV record. Search for _wave-server._tcp.foo.com. If the result provides a target (i.e. host-name or IP-address) and a port, then try to fetch the capabilities file. If one can be obtained, discovery has succeeded.
  2. For a user joe@foo.com, try to obtain capabilities from https://wave.foo.com/wave/fed/capabilities. If one can be obtained, discovery has succeeded.
  3. For a user joe@foo.com, try to obtain capabilities from https://foo.com/wave/fed/capabilities. If one can be obtained, discovery has succeeded.
  4. Discovery has failed. It is up to the wave server implementation to repeat discovery at a later time or give up at this point.
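
The ordering of the options above can be sketched as a function that produces the candidate capabilities URLs. This is only an illustration: the function name and the srv_record parameter are assumptions, and the DNS SRV lookup itself is left to the caller.

```python
from typing import List, Optional, Tuple

CAPABILITIES_PATH = "/wave/fed/capabilities"


def candidate_capability_urls(
    user_id: str, srv_record: Optional[Tuple[str, int]] = None
) -> List[str]:
    """Return the capabilities URLs to try, in discovery order.

    srv_record is the (target, port) answer of a DNS SRV query for
    _wave-server._tcp.<domain>, or None if the query returned nothing.
    """
    _, sep, domain = user_id.rpartition("@")
    if not sep or not domain:
        raise ValueError("not a wave user ID: %r" % user_id)
    urls = []
    if srv_record is not None:
        target, port = srv_record
        urls.append("https://%s:%d%s" % (target, port, CAPABILITIES_PATH))
    urls.append("https://wave.%s%s" % (domain, CAPABILITIES_PATH))
    urls.append("https://%s%s" % (domain, CAPABILITIES_PATH))
    return urls
```

A caller would fetch each URL in turn and stop at the first one that yields a capabilities file; if none does, discovery has failed.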
The capabilities file is a newline-separated key/value list encoded in UTF-8. Key and value are separated by a colon. Empty lines are ignored. Whitespace characters (newline, carriage return, tab, space) at the end of the line, and before or after the colon, are ignored. The key is case-insensitive. Each capabilities file MUST include the federation protocol version number, i.e. the key "wave-version" MUST be present.

Currently the standard does not specify other keys. Developers are free to add their own keys by prefixing them with "X-". If a key is prefixed with "MUST-" or "X-MUST-" then the calling server MUST support the feature described by this key. If the calling server does not understand the meaning of the key or does not implement the feature, it SHOULD NOT attempt to initiate federation since it will be rejected anyway.
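
A calling server could check the "MUST-" rule roughly as follows. This is a sketch under assumptions: the capabilities are taken to be already parsed into a dictionary, and the function name and the example keys in the usage note are hypothetical.

```python
def unsupported_must_keys(capabilities, supported_features):
    """Return the MUST- / X-MUST- keys the calling server does not implement.

    Keys are compared case-insensitively, as capability keys are
    case-insensitive.
    """
    supported = {name.lower() for name in supported_features}
    return sorted(
        key
        for key in capabilities
        if key.lower().startswith(("must-", "x-must-"))
        and key.lower() not in supported
    )
```

If the returned list is non-empty, for example for a file containing a hypothetical key "X-MUST-foo" that the caller does not know, the caller would refrain from initiating federation.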

An example capabilities file looks as follows:

wave-version:1
wave-domain: foo.com
X-server:lightwave
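
A minimal parser for this format might look like the following sketch. It follows the rules stated above; how duplicate keys and lines without a colon are handled is not specified, so last-one-wins and rejecting the file are assumptions made here.

```python
def parse_capabilities(data: bytes) -> dict:
    """Parse a capabilities file into a dict with lower-cased keys."""
    result = {}
    for raw_line in data.decode("utf-8").split("\n"):
        line = raw_line.rstrip(" \t\r\n")  # trailing whitespace is ignored
        if not line:
            continue  # empty lines are ignored
        key, sep, value = line.partition(":")
        if not sep:
            # Assumption: a non-empty line without a colon is an error.
            raise ValueError("missing colon in line: %r" % line)
        # Whitespace before and after the colon is ignored; keys are
        # case-insensitive, so they are normalized to lower case.
        result[key.rstrip(" \t").lower()] = value.lstrip(" \t")
    if "wave-version" not in result:
        raise ValueError("capabilities file lacks the wave-version key")
    return result
```

Applied to the example file above, this yields the keys "wave-version", "wave-domain" and "x-server" with the values "1", "foo.com" and "lightwave".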

Once a wave server has obtained a capabilities file for a certain wave domain, it SHOULD cache it. By default, capabilities files expire after 1 day. Thus, after 1 day the wave server SHOULD run the discovery mechanism again to ensure that nothing has changed.
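
The expiry rule can be illustrated with a small in-memory cache. The class name and the injectable clock are assumptions made for this sketch; a real server might well use persistent storage instead.

```python
import time

ONE_DAY_SECONDS = 24 * 60 * 60


class CapabilitiesCache:
    """Keeps capabilities per wave domain and forgets them after one day."""

    def __init__(self, ttl=ONE_DAY_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock  # injectable so that expiry can be tested
        self._entries = {}

    def put(self, domain, capabilities):
        self._entries[domain.lower()] = (self._clock() + self._ttl, capabilities)

    def get(self, domain):
        """Return cached capabilities, or None if absent or expired."""
        entry = self._entries.get(domain.lower())
        if entry is None:
            return None
        expires_at, capabilities = entry
        if self._clock() >= expires_at:
            del self._entries[domain.lower()]
            return None
        return capabilities
```

When get returns None, the server would run discovery again and store the fresh result with put.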

    Transport

    HTTP-based federation is RESTful. This implies that given the URL and the HTTP method, a server knows already what the purpose of the HTTP request is. Furthermore, the protocol is stateless, i.e. each HTTP call is an independent and atomic operation. All federation-specific URL paths must be prefixed by /wave/fed. This simplifies server development since it provides an easy way to split wave requests from normal web server requests just by inspecting the URL.

All compliant wave servers MUST implement HTTP transport as specified in the HTTP 1.1 standard [1]. A wave server SHOULD implement persistent connections as specified by section 8.1 of [1].

For large-scale wave deployments, a front-end web server can perform load-balancing based on the URL. However, the protocol does not support HTTP redirects. The front-end wave server may internally route the requests to some wave server, but it must not ask the caller to redirect its request using an HTTP 3xx status code. A wave HTTP request either succeeds (status code 200) or it fails (status code 4xx or 5xx).
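
On the calling side, this rule amounts to a simple classification of the status code. The sketch below is illustrative only; the function name and the returned labels are assumptions.

```python
def classify_status(status: int) -> str:
    """Classify the HTTP status of a federation response for the caller."""
    if status == 200:
        return "success"
    if 300 <= status < 400:
        # Redirects are not part of the protocol; do not follow them.
        return "failure: redirect not supported"
    if 400 <= status < 600:
        return "failure"
    # Any other code is outside what the protocol describes.
    return "failure: unexpected status"
```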

Status Codes

The federation protocol makes use of HTTP status codes. However, there are some restrictions as discussed above. Furthermore, HTTP error codes MUST only be used for transport-related errors or encoding/decoding-related errors. For example, if the application of a wave delta failed then this is not a transport failure. The server must still report status code 200 OK and deliver the application-logic error message in the reply body. Conversely, if the decoding of a ProtoBuf message fails it is a transport error and the server reports 400 BAD REQUEST. If the mime type of the message could not be understood, the server reports 406 NOT ACCEPTABLE.
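
On the receiving side, the choice of status code could be sketched as follows. The function name and its two parameters are assumptions; the mime type is the one this document specifies for messages, and comparing it by exact string match is a simplification.

```python
from http import HTTPStatus

WAVE_MIME_TYPE = "application/x-protobuf-wave"


def transport_status(content_type, body_decodes):
    """Pick the HTTP status for a request, looking at transport issues only."""
    if content_type != WAVE_MIME_TYPE:
        return HTTPStatus.NOT_ACCEPTABLE
    if not body_decodes:
        return HTTPStatus.BAD_REQUEST
    # Application-level failures, such as a delta that cannot be applied,
    # are reported inside the reply body, so the status stays OK.
    return HTTPStatus.OK
```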

    Security

    All communication is carried out using HTTPS. For development purposes it is acceptable to resort to HTTP, but production systems MUST support HTTPS. They MUST support client-side and server-side certificates. Both wave servers MUST verify that the certificates presented by the other side during the HTTPS handshake are signed by a trusted root CA. If the calling server does not trust the certificate presented, it can simply close the underlying TCP connection. If the called server does not trust the certificate presented by the calling server, it returns the HTTP status code 401 and closes the underlying TCP connection.
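
As an illustration, a called server written in Python could require client certificates with the standard ssl module as sketched below. The function name and the file parameters are assumptions. Note that with this setting an untrusted client certificate already fails the TLS handshake, so a server that wants to answer with status code 401 instead would have to verify the certificate at a later stage.

```python
import ssl


def make_called_server_context(cert_file=None, key_file=None, ca_file=None):
    """Build a TLS context that demands a certificate from the caller."""
    # With ca_file=None the system's default trusted root CAs are loaded.
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH, cafile=ca_file)
    context.verify_mode = ssl.CERT_REQUIRED  # the caller must present a cert
    if cert_file is not None:
        # The called server's own certificate chain and private key.
        context.load_cert_chain(cert_file, key_file)
    return context
```

The calling side can use ssl.create_default_context() as is, since it verifies the server certificate by default, and load its own client certificate with load_cert_chain.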

    A wave server may opt to dial back (i.e. perform a discovery of the calling server's domain) when it receives a message from a server it has not talked to before. This way it gets to see this server's capabilities before it decides to accept or reject the message received. This adds another layer of security, because the server can verify that the caller does have a valid DNS entry in addition to the HTTPS certificate it has already presented. However, this is optional and transparent to the calling server.

      Messages

      The following message types can be sent:
      - POST Submit
      - PUT WaveletUpdate
      - GET GetSigner
      - GET HistoryRequest
      The semantics and contents of these messages are essentially the same as in the current federation protocol. Only PostSigner has been merged with the Submit message. Data that was formerly encoded as XML now becomes ProtoBuf data. XMPP-specific control messages are gone for good.

      The ProtoBuf message types used below are either part of the current Google Wave ProtoBuf definitions [TODO] or they are defined in this document. In the following, the pattern <wavelet-id> corresponds to the wavelet name as specified in section 4 of [2].

    The mime type of all messages exchanged between wave servers is application/x-protobuf-wave. The "-wave" extension is required because ProtoBuf encoding does not tell much. The receiver must know which message type to expect. This mime type together with the URL and HTTP method allows a server to judge which ProtoBuf message to expect and only with this knowledge it is possible to decode the ProtoBuf message. Later versions of the protocol may as well support other message types. In this case the mime type could change to application/x-protobuf-wave2 or application/json. Therefore, all compliant servers MUST send the mime type with each request. If the mime type is missing or not understood by the receiving wave server, it MUST respond with the status code 406.
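
How the URL and HTTP method determine the ProtoBuf message type can be pictured as a lookup table. The sketch below uses the message names defined in the following sections; the table layout and the function name are assumptions, and the path is expected without its query string.

```python
# (method, path prefix) -> (request message type, response message type)
# None means that no ProtoBuf payload is defined in that direction.
ROUTES = {
    ("POST", "/wave/fed/data/"): ("ProtocolSubmitRequest", "ProtocolSubmitResponse"),
    ("PUT", "/wave/fed/data/"): ("ProtocolWaveletUpdate", None),
    ("GET", "/wave/fed/signer/"): (None, "ProtocolSignerInfo"),
    ("GET", "/wave/fed/data/"): (None, "ProtocolWaveletHistory"),
}


def lookup_route(method, path):
    """Return the (request, response) message types, or None if unknown."""
    for (route_method, prefix), types in ROUTES.items():
        # The prefix must be followed by a wavelet id or a signer id.
        if method == route_method and path.startswith(prefix) and len(path) > len(prefix):
            return types
    return None
```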

    Message: Submit
      A Submit is sent from a remote wave-server to a hosting wave-server with the intent to modify a wavelet. The remote-server is sending a wavelet operation and expects a response that tells whether the operation has been applied successfully or not.

    POST http://hosting-server/wave/fed/data/<wavelet-id>

      The request payload is a ProtocolSubmitRequest which contains the signature and a ProtocolWaveletDelta (which is encoded as a byte stream). The signature signs exactly this byte stream. So far, we could even tolerate a non-canonical serialization as long as this byte stream is persisted. In addition, the message must be able to carry a ProtocolSignerInfo. This replaces the old PostSigner message. We integrate it as an option into Submit. This saves us one extra message and is more RESTful, i.e. stateless.

      message ProtocolSubmitRequest {
          required bytes delta = 1;
          repeated ProtocolSignature signature = 2;
          optional ProtocolSignerInfo signer = 3;
      }

      The response payload is a message of type

      message ProtocolSubmitResponse {
          required int32 operations_applied = 1;
          optional string error_message = 2;
          optional protocol.ProtocolHashedVersion hashed_version_after_application = 3;
          optional int64 application_timestamp = 4;
      }

      Message: WaveletUpdate

      A WaveletUpdate is sent by a hosting wave-server to a remote wave-server to inform it that a wavelet has been modified. The payload of the message is a set of wavelet operations.

    PUT http://remote-server/wave/fed/data/<wavelet-id>

      The payload is a ProtocolWaveletUpdate

      message ProtocolWaveletUpdate {
          required string wavelet_name = 1;
          repeated ProtocolAppliedWaveletDelta deltas = 2;
          // Tells up to which version the deltas have been committed to persistent storage
          optional int64 commit_notice = 3;
      }

      Message: GetSigner
      If a remote server receives a wavelet update which contains an unknown signer id, the remote server can ask the hosting server for the certificates corresponding to this signer id.
      The signer-id is a hash. For the purpose of this protocol, the hash is base64 encoded.

      GET http://hosting-server/wave/fed/signer/<signer-id>

      The response payload is a message of type ProtocolSignerInfo, which contains the full certificate chain corresponding to the signer-id.
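
Building the GetSigner URL from a raw signer-id hash could look like the sketch below. Two assumptions are made: the standard base64 alphabet is used, and the result is percent-encoded because base64 output may contain "+", "/" and "=", which are not safe in a URL path. The function name is hypothetical, and https is used since production systems communicate over HTTPS.

```python
import base64
from urllib.parse import quote


def signer_url(hosting_server: str, signer_id: bytes) -> str:
    """Return the GetSigner URL for a signer-id given as raw hash bytes."""
    encoded = base64.b64encode(signer_id).decode("ascii")
    # safe="" makes quote() escape "/" as well.
    return "https://%s/wave/fed/signer/%s" % (hosting_server, quote(encoded, safe=""))
```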

      Message: GetHistory

      A remote-server can ask the hosting server for the history of a wavelet. It has to specify the first delta v1 (inclusive) and the last delta v2 (exclusive) it wants to obtain. In addition, the hashes of the respective wavelet versions are required. These hashes are SECURITY RELEVANT. If all users of a remote server have been kicked out of a wavelet, this remote server can still ask for the history up to the point where its last user was kicked out. This is because the remote server knows the hashes of the wavelet up to this time. It does not know the hashes of the wavelet after its last user was kicked out.

    GET http://hosting-server/wave/fed/data/<wavelet-id>?v1=<start-version>&v1hash=<start-version-hash>&v2=<end-version>&v2hash=<end-version-hash>&limit=<bytes>

      - v1: An int64 encoded as a decimal number. Deltas starting at this version are to be included in the reply
      - v2: An int64 encoded as a decimal number. The reply will only include deltas up to this version.
      - v1hash: A base64 encoded hash value, corresponding to version v1 of the wavelet
      - v2hash: A base64 encoded hash value, corresponding to version v2 of the wavelet
      - limit: The maximum number of bytes the caller is willing to accept in the response.
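
The request URL with its parameters could be assembled as in the following sketch. The function name is an assumption, the wavelet id is expected to be already in a URL-safe form, and the hashes are passed as base64 strings. urlencode percent-encodes the base64 characters "+", "/" and "=" in the query string.

```python
from urllib.parse import urlencode


def history_url(hosting_server, wavelet_id, v1, v1hash, v2, v2hash, limit):
    """Return the GetHistory URL for versions v1 (inclusive) to v2 (exclusive)."""
    query = urlencode(
        [
            ("v1", v1),
            ("v1hash", v1hash),
            ("v2", v2),
            ("v2hash", v2hash),
            ("limit", limit),
        ]
    )
    return "https://%s/wave/fed/data/%s?%s" % (hosting_server, wavelet_id, query)
```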

      The response payload is a message of type ProtocolWaveletHistory

    message ProtocolWaveletHistory {
        repeated bytes deltas = 1;
        optional int64 truncated = 2;
        optional int64 commit_notice = 3;
    }

    The field deltas is a sequence of ProtoBuf encoded ProtocolAppliedWaveletDelta instances.

    If the requested history is larger than the limit specified by the calling server or larger than the limit imposed by the hosting-server, the hosting-server can send fewer deltas than requested. In this case, the parameter "truncated" indicates the first version that has not been sent due to size restrictions.

    If not all deltas returned by the history request are committed to persistent storage yet, then the "commit_notice" parameter specifies the version of the first delta that is not yet committed to persistent storage.
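
A calling server can use the "truncated" field to fetch a long history in several requests. The loop below is a sketch: fetch is a hypothetical callback that performs one history request for the versions from start (inclusive) to end (exclusive) and returns the deltas together with the value of "truncated" (None if the field is absent). Supplying the version hashes for each request is left to that callback.

```python
def fetch_full_history(fetch, v1, v2):
    """Collect all deltas from v1 (inclusive) to v2 (exclusive)."""
    deltas = []
    start = v1
    while start < v2:
        chunk, truncated = fetch(start, v2)
        deltas.extend(chunk)
        if truncated is None:
            break  # the reply was complete
        if truncated <= start:
            # Guard against looping forever on a misbehaving server.
            raise RuntimeError("history request made no progress")
        start = truncated  # continue at the first version not sent
    return deltas
```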

    Reliable Message Transport

        This section of the specification discusses how servers can recover from failed communication attempts. Two cases have to be distinguished:

        - The calling server fails to establish a connection or send the complete request
        - The calling server sent the complete request but it did not receive any response from the other wave server.

        If GetSigner or GetHistoryRequest fail, nothing bad can happen, since they are idempotent. The calling server SHOULD try again after an interval of its choice.

        If a remote-server submits a delta but does not get any response, it remains unclear whether the delta has been applied or not. The remote-server SHOULD re-submit its delta as soon as possible. The hosting-server MUST detect that this delta has already been applied by comparing it to its accepted deltas. Thus, while the delta undergoes OT-transformation, the server must check if it has already accepted this delta. If the delta has already been accepted, the hosting-server MUST send a valid submit response, but it MUST NOT send an additional wavelet update message. Thus, the hosting-server MUST make submission idempotent.
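
One way to picture idempotent submission is a lookup keyed by the submitted delta bytes. This is a simplified sketch and not the comparison during OT-transformation described above: identifying a delta by the SHA-256 hash of its serialized bytes, the class name and the apply_delta callback are all assumptions, and the table grows without bound. It also remembers every response, whereas a real server would probably remember only deltas that were actually applied.

```python
import hashlib


class SubmitDeduplicator:
    """Remembers submit responses so that re-submissions are idempotent."""

    def __init__(self):
        self._responses = {}

    def submit(self, delta_bytes, apply_delta):
        """Return (response, send_update).

        send_update is False for a re-submission, in which case no
        additional wavelet update message must be sent.
        """
        key = hashlib.sha256(delta_bytes).digest()
        if key in self._responses:
            return self._responses[key], False
        response = apply_delta(delta_bytes)
        self._responses[key] = response
        return response, True
```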

        If a wavelet update could not be sent to a remote-server or if the hosting-server is not sure that the remote-server received the wavelet update, then the hosting-server SHOULD try again to send the wavelet update after an interval of its choice. If multiple wavelet update messages are pending for some remote-server, then the hosting-server SHOULD only retry to send the latest of these messages. The remote-server can later use GetHistoryRequest to obtain the other deltas it has missed. This reduces the queue size of pending wavelet update messages on the hosting-server.
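
Keeping only the latest pending update can be sketched with a dictionary that overwrites older entries. The class and method names are assumptions, and so is the choice to keep one pending update per remote server and wavelet rather than one per remote server overall.

```python
class PendingUpdates:
    """Holds at most one pending wavelet update per (remote server, wavelet)."""

    def __init__(self):
        self._latest = {}

    def add(self, remote_server, wavelet_id, update):
        # A newer update replaces an older one that is still pending.
        self._latest[(remote_server, wavelet_id)] = update

    def take(self, remote_server):
        """Remove and return the pending updates for one remote server."""
        keys = [key for key in self._latest if key[0] == remote_server]
        return {key[1]: self._latest.pop(key) for key in keys}
```

A retry timer would call take for a remote server and attempt to send each returned update, adding it back with add if sending fails again.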

        If a server crashes and comes back up, it SHOULD ask all hosting-servers it has federated with for their capabilities file. This way these hosting-servers realize that the remote-server is back up. The hosting-server SHOULD then attempt to send a wavelet update to the remote-server. This wavelet-update consists only of the latest delta and the latest commit-notice. The remote-server can use GetHistoryRequest to obtain the other deltas it has missed out.

    References

    [1]  RFC 2616: The Hypertext Transfer Protocol -- HTTP/1.1