
New Wave Panel (Undercurrent)

    Authors: Reuben Kan, David Hearnden, David Wang

    Date: 19-Oct-2010


    This design's goal is to create a wave panel that behaves like the one in Google Wave, with the important improvement of lower user-perceived latency when opening a wave, especially when opening a wave via a URL. The design achieves this by loading a wave incrementally. The display of a wave must be fast even when the bulk of the JavaScript is not yet available (either because it has not been downloaded yet, or because it has not yet been compiled or evaluated). This wave panel will also support IE 7+, Firefox 3+, Chrome, Safari 4+, and mobile browsers (Android, iPhone, and iPad), with varying degrees of functionality.


    After nearly twelve months spent improving the speed of the wave panel in the Google Wave client, the ratio of performance gain to engineering effort spent on profiling and refactoring was diminishing significantly with each new round of optimizations. The existing wave panel has a number of architectural constraints that limit the scope of further optimizations. For example, like a typical GWT application, it constructs the rendering of a wave client-side. This requires downloading all the JavaScript code for rendering, along with the code for many rarely used features, before rendering even begins. Additionally, it uses a stateful presentation model, where an in-memory representation of the wave's presentation is constructed before any of that state is pushed into the DOM. These architectural properties impose a significant lower bound on the latency for showing a wave.

    Undercurrent is a complete redesign of the wave panel's architecture, based on the lessons learned from the Google Wave client's original approach. Undercurrent uses a pipelined architecture that optimizes for the initial display speed of a wave, rather than the speed to load all the javascript and data necessary to have a fully functional wave panel. In particular, unlike a regular GWT application, this design explicitly supports (but does not require) server-side rendering of wave content, attaching client-side behaviour to server-supplied HTML. Server-side rendering is the primary tool for reducing the latency of displaying waves.


    This design uses a pipelined process that reduces perceived latency by showing the wave content to the user as soon as possible. The pipelined rendering process allows the client to scale gracefully across different browser platforms by trading functionality against computational power. On low-end devices the panel offers a cut-down experience by exiting the pipeline early. The pipeline consists of four stages.

    Stage 1. Server-supplied rendering

    With server-side rendering enabled, the server has already supplied a rendering of wave content, so it has already been displayed on screen. Note that this rendering does not have to be complete. It only needs to have rendered enough content (blips) to fill a browser's screen; other blips can be rendered blank. This rendering also includes all the user-specific information, such as unread status and diff highlighting. The HTML DOM holds attributes that allow dynamic behaviour to be added later. In addition, the HTML holds some JSON that describes the blip structure of the wave.

    Stage one is instantiated on this existing HTML content, and installs a minimal set of features, such as the reading frame that focuses the user's attention on a particular blip.

    Stage 2. Live model (read only)

    The defining component of the second stage is the wave model code, including the operation-based infrastructure. Any feature that requires interaction with the wave model must be delayed until at least this stage.

    Client-side rendering, which is based on the wave model, is also included in this stage. If server-side rendering is not enabled, then the client renders the wave from scratch. If server-side rendering is enabled, then the client can complete the rendering of any part of the wave not rendered by the server. This stage also includes liveness, meaning the rendering is updated as the model changes due to incoming operations. The client can now also render items in the wave that are not capable of being rendered on the server, such as gadgets (blip components defined by external javascript) and doodads (blip components defined by plug-in GWT code). For robustness, this is achieved by making the client re-render every blip in the wave. Although this is redundant work in the overwhelming majority of cases, it is a transparent process if the rendering server supplied a correct rendering. It should be possible in the future to optimize this process not to re-render every blip.

    Other features for interacting with a wave in a reading context are installed in this stage, such as read/unread highlighting and thread collapsing.

    Stage 3. Write

    The defining component of the third stage is the installation of writing features, such as replying, editing, and deleting wave content. Since the code for these features is quite large, environments that have already exited the pipeline (such as a read-only mobile client) will save a significant download.

    Stage 4. Feature completeness

    The code for virtually all other features is included in this stage, and those features are installed on demand. This stage is not a strict part of Undercurrent's pipeline: it could be divided into more stages if necessary.
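
    The staging described above can be illustrated with a minimal sketch. The names Stage and StagePipeline are hypothetical and are not taken from the Undercurrent code; the sketch only shows the idea of loading stages in order and letting a constrained client exit the pipeline early.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical names for the four stages described above, in pipeline order. */
enum Stage { SERVER_RENDERING, LIVE_MODEL, WRITE, FEATURE_COMPLETE }

/** Loads stages in order and lets a constrained client exit the pipeline early. */
final class StagePipeline {
  /**
   * Returns the stages that would be loaded when the client stops after
   * {@code lastStage}. A read-only mobile client might stop at LIVE_MODEL.
   */
  static List<Stage> stagesUpTo(Stage lastStage) {
    List<Stage> loaded = new ArrayList<>();
    for (Stage stage : Stage.values()) {
      loaded.add(stage); // A real client would download and install code here.
      if (stage == lastStage) {
        break; // Early exit: later stages are never downloaded.
      }
    }
    return loaded;
  }
}
```

    In this sketch, a client that stops at LIVE_MODEL never touches the code for the write stage, which is the download saving mentioned under Stage 3.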

    Design principles

    • Low latency is more important than high throughput. It is better to respond faster with some initial feedback, at the cost of taking longer to complete an entire action, than it is to complete the entire action as fast as possible. A significant part of providing low-latency responses is favouring lazy evaluation over eager evaluation. For example, each blip's content model is not processed until that blip is rendered, rather than initializing all the blip content when the model is initially loaded in memory (stage two). Although this increases the total time for rendering the entire contents of a wave, the decrease in response time for the initial display of that wave is preferable.
    • Graceful degradation across browsers. The architecture should be able to support different browsers with varying ability to execute Javascript. The architecture should offer a gradual degradation of features depending on the amount of bandwidth/CPU available.
    • Modularization of functionality. Each feature should be able to be enabled/disabled separately on separate passes. This allows us to separately assess the cost of each feature and push slow features to a later stage of processing.
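
    The lazy-evaluation principle can be sketched as follows. Lazy and LazyBlip are hypothetical names, and trimming a string stands in for the real content-model processing; the point is only that the work happens on first render, not at load time.

```java
import java.util.function.Supplier;

/** Defers a computation until its value is first requested, then caches it. */
final class Lazy<T> {
  private final Supplier<T> loader;
  private T value;
  private boolean loaded;

  Lazy(Supplier<T> loader) {
    this.loader = loader;
  }

  T get() {
    if (!loaded) {
      value = loader.get();
      loaded = true;
    }
    return value;
  }

  boolean isLoaded() {
    return loaded;
  }
}

/** Hypothetical blip whose content is processed only when it is rendered. */
final class LazyBlip {
  private final Lazy<String> content;

  LazyBlip(String rawContent) {
    // Stand-in for the real (expensive) content-model processing.
    this.content = new Lazy<>(() -> rawContent.trim());
  }

  String render() {
    return "<div kind=\"blip\">" + content.get() + "</div>";
  }

  boolean isContentProcessed() {
    return content.isLoaded();
  }
}
```

    Constructing many LazyBlip objects is cheap in this sketch; the processing cost is paid per blip, and only for blips that are actually rendered.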

    Additional considerations for performance:

    • For JavaScript-enabled clients, server-side rendering requires more data to be downloaded, but reduces client-side processing, and is not gated by the download of the wave model state. This is a trade-off between the client's execution speed, download speed, the size of the HTML rendering, and the size of the model data required for client-side rendering. The balance is usually in favour of server-side rendering, but it is not unreasonable (for a very fast browser on a particularly slow network) for client-side rendering to be faster.
    • Server-side rendering requires server CPU resources. The client/server rendering protocol is such that the server is not expected to render all the blips. This allows the rendering server to increase and decrease the amount of wave content it renders in response to load and client capabilities, trading the rendering work between the client and server.

    Detailed design

    Initial HTML / bootstrap page

    The render server described in this part of the design will probably not be implemented immediately in the Wave in a Box project. However, this "optional" server was one of the driving motivations of the overall design, so it is described in detail in this section. The system still functions when this component is missing.

    In order to show the content of the wave as quickly as possible, the first screenful of wave content is rendered as HTML into the bootstrap page (the startup page for the client). The bootstrap page is also cacheable by wave id, so that the rendered HTML can be shown as quickly as possible for recently viewed waves. Some cache staleness is acceptable (currently 60 seconds, which can be reduced to 0 to be conservative) and is controlled by the expiry time of the bootstrap page. After the page's expiry time passes, the browser's cache validation mechanism kicks in: the browser revalidates the page with the server, which answers with HTTP 304 if the page is unchanged.

    Depending on which level of the cache the client hits, the client pays a different price:
    • Level 1: Browser cache hit before the expiry time - no network cost
    • Level 2: Browser cache miss because the expiry time has passed, but the document is still valid - one network round trip to the server to discover that the document is still valid
    • Level 3: Browser cache miss, document modified - full download cost
    This can result in significant savings: a randomly sampled set of frontend requests in Google Wave showed that approximately 75% of wave fetches were for waves that had not been modified.
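
    The three cache outcomes listed above can be written as a small decision function. This is an illustrative model only: CacheOutcome and BootstrapCache are hypothetical names, and a real browser makes this decision internally from the response's caching headers.

```java
/** The three outcomes described above, in increasing order of cost. */
enum CacheOutcome { FRESH_HIT, REVALIDATED, FULL_DOWNLOAD }

final class BootstrapCache {
  /**
   * Classifies a request for a cached bootstrap page.
   *
   * @param ageSeconds time since the page was cached
   * @param maxAgeSeconds expiry time set by the server (60 in the text above)
   * @param modifiedSinceCached whether the wave changed after it was cached
   */
  static CacheOutcome classify(long ageSeconds, long maxAgeSeconds, boolean modifiedSinceCached) {
    if (ageSeconds < maxAgeSeconds) {
      // Served from the browser cache with no network cost, even if stale.
      return CacheOutcome.FRESH_HIT;
    }
    // Past expiry: the browser asks the server whether its copy is still valid.
    return modifiedSinceCached ? CacheOutcome.FULL_DOWNLOAD : CacheOutcome.REVALIDATED;
  }
}
```

    Note that in this model a page that was modified 10 seconds after caching is still served from the cache until the expiry time passes; that is the accepted staleness window. Setting maxAgeSeconds to 0 forces revalidation on every request.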

    The skeleton bootstrap page looks like the following:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "">
    <html>
    <head>
    <meta http-equiv="X-UA-Compatible" content="IE=8"> 1
    <style type="text/css">
    ... 2
    </style>
    <script type="text/javascript" defer="defer"> 3
    function loadScript() { 4
      var n = document.createElement("script");
      n.type = "text/javascript";
      n.src = "/a/"; 5
      document.getElementsByTagName("head")[0].appendChild(n);
    }
    setTimeout(loadScript, 0);
    </script>
    </head>
    <body>
    <div id="initialHtml">
    ... 6
    </div>
    </body>
    </html>

    1. Applies to IE8 only: standards mode is easier to write for, for now.
    2. Styles needed for the server-supplied rendering.
    3. The "defer" attribute of the script element is specified so that the following code is not executed until the page is loaded. Note that this is only supported in IE.
    4. The application is loaded with programmatic script injection to avoid the cost of evaluating the JavaScript up front.
    5. Refers to the compiled GWT JavaScript.
    6. Prerendered content from the server. If this content is empty, due to server failure, the content is created dynamically by the JavaScript loaded at point 5. See the "HTML format" section for more information about the format of this content.


    An alternative approach is to make the bootstrap page non-cacheable, with an iframe pointing to a permanently cacheable page whose unique URL encodes the wavelet, its version, the user id, and the language. This has the advantage of allowing the content to be cached indefinitely at the proxy level, but has the additional cost of an extra round trip.

    HTML format

    In order to allow the Javascript to attach behavior to the server-supplied DOM efficiently, the HTML must be generated with annotations containing information needed to find appropriate elements quickly. HTML IDs are used for these annotations, illustrated by the following snippet.

    <div style="display:none">
      <div id="render_metadata"> 1
      {"idmap" : { 2
         "!w+Khns7Nd,!conv+root,t+NDiNWrZkA" : "a",
         "!w+Khns7Nd,!conv+root,b+NDiNWrZkB" : "b",
         ...
       },
       "versions" : [ 3
         {"wavelet" : "!conv+root", "version" : "83"},
         {"wavelet" : "!conv+kl20xE", "version" : "125"},
         ...
       ]}
      </div>
    </div>
    <div kind="thread" id="a"> 4
      <div kind="blip" id="b"> 5
        <span id="b-Time">Jun 4</span> 6
        ...
      </div>
    </div>

    1. A preamble containing metadata about the rendering, required for various client-side operations. The components of this metadata are explained below. The rendering metadata is delivered as hidden text in the page content, rather than a script tag, to avoid the delay from the browser evaluating it as Javascript upfront. When the rendering metadata is needed by the client, it is extracted as JSON from the inner text of the metadata DOM element, and evaluated.
    2. The rendering metadata contains a string map, mapping between strings that uniquely identify model objects (like blips and threads) and short, obfuscated strings used in HTML ids. Obfuscated HTML ids are used for a number of reasons:
      1. HTML ids in IE7 are not case sensitive. Ids for wave model objects are case sensitive. The obfuscation mapping ensures that HTML ids are unique ignoring case.
      2. Minor point, the strings that uniquely identify model objects are long. e.g., in order to be unique across multiple waves on the page, the unique id for a blip must contain a wave id, a wavelet id, and a blip id. Mapping these long identifiers to short ones reduces the size of the HTML, because a model id is used as a prefix for view components (see point 6). Note, this may not be a significant size reduction when the HTML is gzipped.
      Once the id map is loaded, the client can map directly between model objects and the DOM elements of their renderings.
    3. The rendering metadata contains the wavelet versions at which the rendering took place, so the client can fetch the model data at that correct state.
    4. HTML rendering for thread "t+NDiNWrZkA" in conversation "!conv+root" of wave "example.gom!w+Khns7Nd". This rendering is identified with "a", which is the obfuscated value of the thread's unique model id (which is composed of its wave id, conversation id, and thread id).
    5. HTML rendering for blip "b+NDiNWrZkB" in conversation "!conv+root" of wave "example.gom!w+Khns7Nd". This rendering is identified with "b", which is the obfuscated value of the blip's unique model id (which is composed of its wave id, conversation id, and blip id).
    6. The timestamp component of a blip's HTML rendering. The id of this span is computable from the HTML id of the blip that contains it. In this example, the timestamp element of a blip's HTML is identified by appending "-Time" to the blip's HTML id.
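
    The id mapping in points 2 and 6 can be sketched as below. IdMap is a hypothetical class, and the letter-only encoding is an assumption chosen so that the generated ids match the "a", "b" of the example and never differ by case alone; the actual obfuscation scheme is not specified in this document.

```java
import java.util.HashMap;
import java.util.Map;

/** Maps case-sensitive model ids to short HTML ids that stay unique ignoring case. */
final class IdMap {
  private final Map<String, String> modelToHtml = new HashMap<>();
  private final Map<String, String> htmlToModel = new HashMap<>();
  private int next;

  /** Returns the short HTML id for a model id, allocating one on first use. */
  String htmlIdOf(String modelId) {
    String htmlId = modelToHtml.get(modelId);
    if (htmlId == null) {
      htmlId = encode(next++);
      modelToHtml.put(modelId, htmlId);
      htmlToModel.put(htmlId, modelId);
    }
    return htmlId;
  }

  /** Reverse lookup, e.g. when an event arrives on a DOM element. */
  String modelIdOf(String htmlId) {
    return htmlToModel.get(htmlId);
  }

  /** Component ids are computed from the view's id, e.g. "b" and "Time" give "b-Time". */
  static String componentId(String htmlId, String component) {
    return htmlId + "-" + component;
  }

  /** Encodes a counter using lower-case letters only: 0 is "a", 1 is "b", and so on. */
  private static String encode(int n) {
    StringBuilder sb = new StringBuilder();
    do {
      sb.append((char) ('a' + n % 26));
      n /= 26;
    } while (n > 0);
    return sb.toString();
  }
}
```

    Because the short ids contain only lower-case letters, two model ids that differ only in case still receive HTML ids that are distinct under IE7's case-insensitive comparison.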

    The rendering server that produces this HTML shares its HTML-producing Java code with the GWT client. This means that if the prerendered HTML is missing, the client is able to produce it itself. This shared code is also necessary for the client to update the DOM in a way that remains consistent with what the rendering server produces.

    Event handling system - O(1) upfront setup

    The canonical approach for handling browser events in a GWT application is to construct the UI using the GWT widget library. However, GWT widgets are relatively heavyweight compared to plain HTML (both in terms of memory usage and execution speed), and the widget library's architecture requires that every event-handling element be a widget. This requires an initial setup cost that is linear in the number of event-handling UI components.

    This design uses an alternative event architecture that does not require a GWT widget for every UI component, providing constant-time setup cost, as well as saving other costs inherent to widgets. In its most extreme form, this architecture has only a single widget for the entire application. This is achieved by taking advantage of the browser's underlying event bubbling mechanism. A single event listener on the top-level DOM element receives all events on the page. Since the rendering pattern is to annotate the HTML of UI components with their kind (HTML attribute), this top-level event handler can trace up the DOM from the target of a browser event, and dispatch the event to application-level event handlers registered against those kinds. This bottom-up dispatch mechanism is analogous to the browser's native event bubbling mechanism. The kind-based dispatch model means that all the contextual information for delivering an event to the appropriate handler is part of the HTML, which can be rendered on the server and delivered to the client. For example:

    <div onclick="handle()"> 2
      <div kind="blip"> 3
        <div> 1
        </div>
      </div>
    </div>

    1. A browser event (e.g., a click) occurs on some element. There are no DOM event handlers on that element, so the event bubbles.
    2. The event reaches the root element, and its handler is invoked (that handler is Undercurrent's event system).
    3. Undercurrent's event system traverses up the DOM from the target element of the event (1) until it sees an element that has a kind. If there is an application-level handler registered for that event type (e.g., clicks) and that kind value ("blip"), then it is invoked with the appropriate context (the element that has the kind). The event system continues to traverse up the DOM looking for elements with kinds, dispatching the event to every application-level handler registered for that event type on that kind, until one of the handlers indicates that propagation should stop.

    In comparison with the traditional GWT Widget approach, this dispatch mechanism trades upfront setup cost against runtime cost of event delivery:

    • Traditional widgets: O(n) setup cost to construct the widgets and attach event listeners. O(n) memory cost from the widget objects, but O(1) cost to dispatch to application-level event handling.
    • Undercurrent's approach: O(1) setup cost for the root widget, O(1) memory cost, and O(n) worst case for dispatch (but usually O(1) in typical cases).

    In code, this event system is implemented by EventDispatcherPanel.
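
    The following is a minimal sketch of the kind-based dispatch described above. It is not the EventDispatcherPanel code: FakeElement, KindHandler, and KindDispatcher are hypothetical names, and a plain parent-linked object stands in for a DOM element so that the traversal logic can be shown without a browser.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for a DOM element: a parent link and an optional kind attribute. */
final class FakeElement {
  final FakeElement parent;
  final String kind; // null when the element has no kind attribute

  FakeElement(FakeElement parent, String kind) {
    this.parent = parent;
    this.kind = kind;
  }
}

/** Application-level handler; returns true to stop further propagation. */
interface KindHandler {
  boolean onEvent(FakeElement context);
}

/** The single root-level dispatcher: setup cost does not depend on page size. */
final class KindDispatcher {
  private final Map<String, List<KindHandler>> handlers = new HashMap<>();

  void register(String eventType, String kind, KindHandler handler) {
    handlers.computeIfAbsent(eventType + "/" + kind, k -> new ArrayList<>()).add(handler);
  }

  /**
   * Walks up from the event target, invoking handlers registered for each
   * kind found on the way. Returns the number of handlers invoked.
   */
  int dispatch(String eventType, FakeElement target) {
    int invoked = 0;
    for (FakeElement e = target; e != null; e = e.parent) {
      if (e.kind == null) {
        continue; // Not a UI component boundary; keep climbing.
      }
      List<KindHandler> registered = handlers.get(eventType + "/" + e.kind);
      if (registered == null) {
        continue;
      }
      for (KindHandler handler : registered) {
        invoked++;
        if (handler.onEvent(e)) {
          return invoked; // A handler asked for propagation to stop.
        }
      }
    }
    return invoked;
  }
}
```

    Registration touches only the handler table, never the elements themselves, which is why the setup cost stays constant while the dispatch cost grows with the depth of the target element.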


    This algorithm requires upwards DOM traversal from an event's target element to find an appropriate handler, which is bounded only by the depth of the element (O(n)). An alternative is to dispatch the event directly from elements with kinds. For example:

    <div kind="thread" onclick="window.__handle('click', 'thread', this);">
      <div kind="blip" onclick="window.__handle('click', 'blip', this);">
      </div>
    </div>

    This has the advantage that no explicit JavaScript router is needed to bubble the event. However, it has the disadvantages of making the HTML larger and of requiring a named, global dispatcher object.

    Model/view/presenter and flyweight patterns - O(1) start-up

    This section assumes the reader understands the MVP pattern. In order to support server-side rendering, the view state needs to be represented entirely by the DOM, disallowing stateful view objects. This allows Undercurrent to use flyweight objects for views. These flyweights are associated dynamically on-demand with DOM elements of interest, in order to interpret the DOM fragments meaningfully, and may be pooled and re-used. Furthermore, rather than fine-grained presenters for each UI component on screen, there is a single presenter object per wave for each category of presentation (one presenter for conversation structure, one presenter for profile information, one presenter for read state, etc). This keeps the startup cost for attaching behaviour to an existing HTML rendering minimal, and does not grow with the size of the wave.

    The following snippet illustrates the essential structure of these view classes:

    public final class BlipMetaDomImpl implements ... {
      /** The DOM element of this view. */
      private final Element self; 1

      /** The HTML id of {@code self}. */
      private final String id;

      // UI fields.

      private Element time;

      private Element getTime() {
        if (time == null) {
          // Construct
          time = load(id, Components.TIME); 2
        }
        return time;
      }

      public void setTime(String time) {
        ...
      }
    }

    1. View objects are bound to DOM elements. The current implementation uses disposable view objects for simplicity; for a pooling variation, these references would obviously be non-final.
    2. Parts of a view ("UI fields" in UI Binder terminology) are identified via enums, and loaded lazily using the DOM API's getElementById(). The ids of components are computable from the view's id and the component name.
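
    The pooling variation mentioned in point 1 could look roughly like the sketch below. All names here (FakeDom, BlipMetaFlyweight, BlipMetaPool) are hypothetical, and a map from HTML id to text stands in for the real DOM; the sketch only illustrates that a view object carries no state beyond its current binding, so it can be rebound and re-used.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Stand-in for the page's DOM: maps an HTML id to that element's text. */
final class FakeDom {
  final Map<String, String> textById = new HashMap<>();
}

/** A view with no state of its own beyond its current binding. */
final class BlipMetaFlyweight {
  private final FakeDom dom;
  private String id; // Non-final so that a pooled object can be rebound.

  BlipMetaFlyweight(FakeDom dom) {
    this.dom = dom;
  }

  BlipMetaFlyweight bind(String htmlId) {
    this.id = htmlId;
    return this;
  }

  // Component ids are computed from the view's id, as in "b-Time".
  void setTime(String time) {
    dom.textById.put(id + "-Time", time);
  }

  String getTime() {
    return dom.textById.get(id + "-Time");
  }
}

/** Hands out flyweights on demand and takes them back for re-use. */
final class BlipMetaPool {
  private final Deque<BlipMetaFlyweight> free = new ArrayDeque<>();
  private final FakeDom dom;
  private int created;

  BlipMetaPool(FakeDom dom) {
    this.dom = dom;
  }

  BlipMetaFlyweight acquire(String htmlId) {
    BlipMetaFlyweight view = free.poll();
    if (view == null) {
      view = new BlipMetaFlyweight(dom);
      created++;
    }
    return view.bind(htmlId);
  }

  void release(BlipMetaFlyweight view) {
    free.push(view.bind(null)); // Unbind before returning the object to the pool.
  }

  int createdCount() {
    return created;
  }
}
```

    Because the timestamp is stored in the (fake) DOM rather than in the view object, releasing a view loses nothing: any view later bound to the same id reads the same state back.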

    The transfer of model state into view state is performed by 'live renderers', which play the role of presenters. The following fragment (simplified from the actual code) illustrates the control flow for updating a blip's timestamp.

    public final class LiveConversationViewRenderer implements ObservableConversation.Listener, ... { 1

      public void onBlipTimestampChanged(ObservableConversationBlip blip, long oldTimestamp, long newTimestamp) {
        BlipView blipUi = viewProvider.getBlipView(blip); 2
        blipUi.setTime(formatDate(blip.getLastModifiedTime())); 3
      }
    }

    1. Each presenter is an observer of a live model.
    2. On a model change event, the live renderer is notified. It materializes views that need updating through a view provider. The provision of a view involves mapping the relevant model object to a unique view id (using the mapping process described earlier), querying the DOM for the element identified by that id, then binding a new (or re-used) view object of the appropriate kind to that DOM element.
    3. Model state is then pushed into the view using the view API for that kind (e.g., a BlipView exposes its timestamp, hiding the underlying HTML structure of the view).

    In particular, note that the presentation logic has no direct dependencies on GWT, DOM, or HTML: it simply defines a mapping from model state to view state, expressed using abstract view interfaces. This keeps the presentation logic portable and easily testable.
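
    To make the testability claim concrete, here is a minimal sketch in plain Java with no GWT or DOM dependency. The interfaces and classes (BlipTimeView, BlipViewProvider, TimestampRenderer, FakeViews) are simplified, hypothetical stand-ins for the real view interfaces and live renderers, and the timestamp formatting is a placeholder.

```java
import java.util.HashMap;
import java.util.Map;

/** Abstract view interface; the presentation logic depends only on this. */
interface BlipTimeView {
  void setTime(String time);
}

/** Supplies a view for a blip id, however the concrete UI chooses to do so. */
interface BlipViewProvider {
  BlipTimeView getBlipView(String blipId);
}

/** Presenter: maps a model change to a view update. */
final class TimestampRenderer {
  private final BlipViewProvider views;

  TimestampRenderer(BlipViewProvider views) {
    this.views = views;
  }

  void onBlipTimestampChanged(String blipId, long newTimestampMillis) {
    views.getBlipView(blipId).setTime(format(newTimestampMillis));
  }

  /** Placeholder formatting: whole seconds since the epoch. */
  static String format(long millis) {
    return (millis / 1000) + "s";
  }
}

/** Plain-Java fake for tests; records the last time pushed to each blip. */
final class FakeViews implements BlipViewProvider {
  final Map<String, String> times = new HashMap<>();

  @Override
  public BlipTimeView getBlipView(String blipId) {
    return time -> times.put(blipId, time);
  }
}
```

    A test can drive the presenter with a FakeViews instance and inspect its map afterwards, without a browser or a GWT test harness. This mirrors the role of the pojo view implementations used for testing.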

    Updated concurrency control (CC) stack

    The spirit of this design is to stage the code downloaded to the client so that content is shown immediately, and functionality is loaded in the order in which it is likely to be used. In that spirit, it is desirable to be able to show live streaming updates to the wave without having to load the entire operation transport stack, which includes much code that is not needed unless the user is writing (e.g., the operational transform code). To do this, the design introduces an empty CC stack, which simply passes through any operations from the server to the client. If the client generates any operations on a wavelet before the proper CC stack is downloaded and installed, the empty CC stack is paused. All operations, both client operations going out to the server and server operations coming in to the client, are then buffered and not processed. Once the real CC stack is in place, and the client has the ability to do operational transformation (OT), those buffered streams are flushed through transformation, and the CC stack resumes.
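
    The pass-through and buffering behaviour can be sketched as follows. The names (FullCcStack, RecordingCcStack, EmptyCcStack) are hypothetical, operations are represented as plain strings, and the order in which the two buffered streams are handed to the full stack is an assumption of this sketch; in the real system that ordering is resolved by operational transformation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical interface for the full CC stack, which can transform operations. */
interface FullCcStack {
  void onServerOp(String op);
  void onClientOp(String op);
}

/** Test double that records what it receives instead of transforming. */
final class RecordingCcStack implements FullCcStack {
  final List<String> log = new ArrayList<>();

  @Override
  public void onServerOp(String op) {
    log.add("server:" + op);
  }

  @Override
  public void onClientOp(String op) {
    log.add("client:" + op);
  }
}

/** Forwards server ops until the client writes, then buffers both directions. */
final class EmptyCcStack {
  private final List<String> appliedToModel = new ArrayList<>();
  private final List<String> bufferedServerOps = new ArrayList<>();
  private final List<String> bufferedClientOps = new ArrayList<>();
  private boolean paused;
  private FullCcStack full;

  void onServerOp(String op) {
    if (full != null) {
      full.onServerOp(op);
    } else if (paused) {
      bufferedServerOps.add(op);
    } else {
      appliedToModel.add(op); // Pass-through: no transformation is needed yet.
    }
  }

  void onClientOp(String op) {
    if (full != null) {
      full.onClientOp(op);
    } else {
      paused = true; // The first local edit pauses the pass-through.
      bufferedClientOps.add(op);
    }
  }

  /** Installs the full stack and flushes both buffered streams through it. */
  void install(FullCcStack stack) {
    for (String op : bufferedClientOps) {
      stack.onClientOp(op);
    }
    for (String op : bufferedServerOps) {
      stack.onServerOp(op);
    }
    bufferedClientOps.clear();
    bufferedServerOps.clear();
    paused = false;
    full = stack;
  }

  List<String> applied() {
    return appliedToModel;
  }
}
```

    In this sketch a read-only session never pauses and never needs the full stack, which is what allows live updates in stage two without downloading the operational transform code.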

    Code and Package Structure

    The client package in the wave libraries is primarily structured around Undercurrent's UI architecture for events, rendering, and staged loading. It also includes a wave panel with a core set of features, built in a plug-in style.

    In org.waveprotocol.wave.client.
    • uibuilder - Defines what UiBuilders are (HTML closures), and some helper classes for implementing them
    • render - The interfaces that need to be implemented to render a wave.
    • concurrencycontrol - Multi-stage wave stack (non-live, then live).
    • wavepanel -
      • event - implementation of the single-widget event dispatch system (should probably be in client.event rather than client.wavepanel.event)
      • impl - the skeleton structure of a wave panel feature. Fine-grained features exist in sub-packages, and are installed independently.
        • ...
      • render - wave rendering implementation, for plugging into the wave rendering mechanism in the ...client.render package
      • view - view interfaces for UI components within a wave (as per MVP separation), including both intrinsic definitions and full definitions (intrinsic + structure).
        • dom - DOM-based implementations of intrinsic parts of views
        • impl - flyweight wrappers for upgrading intrinsic implementations to a full view implementation (addition of structure handling)
        • fake - pojo implementations of the views, primarily for testing.

    Since the staged loading is a top-level concern (in order to use GWT's runAsync capabilities effectively), the stages and their loading sequence are part of the top-level client package. The stage implementations are designed to be configurable, so that in different client environments, arbitrary components within each stage can be substituted for alternative implementations.

    Using Undercurrent

    1. Follow the instructions here to start the server.
    2. Open http://localhost:9898?enableWavePanelHarness=true&enableUndercurrentEditing=true