author | Matt A. Tobin <mattatobin@localhost.localdomain> | 2018-02-02 04:16:08 -0500 |
---|---|---|
committer | Matt A. Tobin <mattatobin@localhost.localdomain> | 2018-02-02 04:16:08 -0500 |
commit | 5f8de423f190bbb79a62f804151bc24824fa32d8 (patch) | |
tree | 10027f336435511475e392454359edea8e25895d /gfx/doc | |
parent | 49ee0794b5d912db1f95dce6eb52d781dc210db5 (diff) | |
Add m-esr52 at 52.6.0
Diffstat (limited to 'gfx/doc')
-rw-r--r-- | gfx/doc/AsyncPanZoom-HighLevel.png | bin | 0 -> 67837 bytes | |||
-rw-r--r-- | gfx/doc/AsyncPanZoom.md | 299 | ||||
-rw-r--r-- | gfx/doc/B2GInputFlow.svg | 349 | ||||
-rw-r--r-- | gfx/doc/GraphicsOverview.md | 83 | ||||
-rw-r--r-- | gfx/doc/LayersHistory.md | 60 | ||||
-rw-r--r-- | gfx/doc/MainPage.md | 21 | ||||
-rw-r--r-- | gfx/doc/MozSurface.md | 124 | ||||
-rw-r--r-- | gfx/doc/SharedMozSurface.md | 147 | ||||
-rw-r--r-- | gfx/doc/Silk.md | 246 | ||||
-rw-r--r-- | gfx/doc/silkArchitecture.png | bin | 0 -> 221047 bytes |
10 files changed, 1329 insertions, 0 deletions
diff --git a/gfx/doc/AsyncPanZoom-HighLevel.png b/gfx/doc/AsyncPanZoom-HighLevel.png Binary files differnew file mode 100644 index 000000000..d19dcb7c8 --- /dev/null +++ b/gfx/doc/AsyncPanZoom-HighLevel.png diff --git a/gfx/doc/AsyncPanZoom.md b/gfx/doc/AsyncPanZoom.md new file mode 100644 index 000000000..1fc58e03d --- /dev/null +++ b/gfx/doc/AsyncPanZoom.md @@ -0,0 +1,299 @@ +Asynchronous Panning and Zooming {#apz} +================================ + +**This document is a work in progress. Some information may be missing or incomplete.** + +## Goals + +We need to be able to provide a visual response to user input with minimal latency. +In particular, on devices with touch input, content must track the finger exactly while panning, or the user experience is very poor. +According to the UX team, 120ms is an acceptable latency between user input and response. + +## Context and surrounding architecture + +The fundamental problem we are trying to solve with the Asynchronous Panning and Zooming (APZ) code is that of responsiveness. +By default, web browsers operate in a "game loop" that looks like this: + + while true: + process input + do computations + repaint content + display repainted content + +In browsers the "do computation" step can be arbitrarily expensive because it can involve running event handlers in web content. +Therefore, there can be an arbitrary delay between the input being received and the on-screen display getting updated. + +Responsiveness is always good, and with touch-based interaction it is even more important than with mouse or keyboard input. 
+In order to ensure responsiveness, we split the "game loop" model of the browser into a multithreaded variant which looks something like this: + + Thread 1 (compositor thread) + while true: + receive input + send a copy of input to thread 2 + adjust painted content based on input + display adjusted painted content + + Thread 2 (main thread) + while true: + receive input from thread 1 + do computations + repaint content + update the copy of painted content in thread 1 + +This multithreaded model is called off-main-thread compositing (OMTC), because the compositing (where the content is displayed on-screen) happens on a separate thread from the main thread. +Note that this is a very very simplified model, but in this model the "adjust painted content based on input" is the primary function of the APZ code. + +The "painted content" is stored on a set of "layers", that are conceptually double-buffered. +That is, when the main thread does its repaint, it paints into one set of layers (the "client" layers). +The update that is sent to the compositor thread copies all the changes from the client layers into another set of layers that the compositor holds. +These layers are called the "shadow" layers or the "compositor" layers. +The compositor in theory can continuously composite these shadow layers to the screen while the main thread is busy doing other things and painting a new set of client layers. + +The APZ code takes the input events that are coming in from the hardware and uses them to figure out what the user is trying to do (e.g. pan the page, zoom in). +It then expresses this user intention in the form of translation and/or scale transformation matrices. +These transformation matrices are applied to the shadow layers at composite time, so that what the user sees on-screen reflects what they are trying to do as closely as possible. 
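The idea of expressing user intent as a translation and/or scale applied to the shadow layers at composite time can be sketched as follows. This is a minimal illustration under stated assumptions, not the Gecko implementation: `makeAsyncTransform` and `applyToPoint` are hypothetical names, and the transform is reduced to a uniform scale plus a translation rather than a full matrix.

```javascript
// Hypothetical sketch; these names do not come from the Gecko source.
// Build the transform the compositor would apply to already-painted content
// when the user has scrolled by (scrollDeltaX, scrollDeltaY) layer pixels and
// zoomed by `zoom` since the last paint.
function makeAsyncTransform(scrollDeltaX, scrollDeltaY, zoom) {
  // Scrolling the content down by d moves painted pixels up by d on screen,
  // hence the negated deltas. The zoom then scales the shifted content.
  return {
    scale: zoom,
    tx: -scrollDeltaX * zoom,
    ty: -scrollDeltaY * zoom,
  };
}

// Map a point in painted-layer coordinates to where it appears on screen.
function applyToPoint(transform, point) {
  return {
    x: point.x * transform.scale + transform.tx,
    y: point.y * transform.scale + transform.ty,
  };
}
```

With a 10 pixel async scroll and no zoom, a layer point at y=300 is drawn at y=290, without the main thread having repainted anything.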
+ +## Technical overview + +As per the heavily simplified model described above, the fundamental purpose of the APZ code is to take input events and produce transformation matrices. +This section attempts to break that down and identify the different problems that make this task non-trivial. + +### Checkerboarding + +The content area that is painted and stored in a shadow layer is called the "displayport". +The APZ code is responsible for determining how large the displayport should be. +On the one hand, we want the displayport to be as large as possible. +At the very least it needs to be larger than what is visible on-screen, because otherwise, as soon as the user pans, there will be some unpainted area of the page exposed. +However, we cannot always set the displayport to be the entire page, because the page can be arbitrarily long and this would require an unbounded amount of memory to store. +Therefore, a good displayport size is one that is larger than the visible area but not so large that it is a huge drain on memory. +Because the displayport is usually smaller than the whole page, it is always possible for the user to scroll so fast that they end up in an area of the page outside the displayport. +When this happens, they see unpainted content; this is referred to as "checkerboarding", and we try to avoid it where possible. + +There are many possible ways to determine what the displayport should be in order to balance the tradeoffs involved (i.e. having one that is too big is bad for memory usage, and having one that is too small results in excessive checkerboarding). +Ideally, the displayport should cover exactly the area that we know the user will make visible. +Although we cannot know this for sure, we can use heuristics based on current panning velocity and direction to ensure a reasonably-chosen displayport area. +This calculation is done in the APZ code, and a new desired displayport is frequently sent to the main thread as the user is panning around. 
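A velocity-based displayport heuristic of the kind described above could look roughly like this. It is only a sketch: the function name, the `baseMargin` and `lookaheadMs` parameters, and their default values are all assumptions made for illustration, not values taken from the APZ code.

```javascript
// Hypothetical sketch of a displayport heuristic; all names and default
// values are illustrative assumptions, not the APZ implementation.
// viewport: {x, y, width, height} in page pixels
// velocity: {x, y} in page pixels per millisecond
// page:     {width, height} of the scrollable content
function computeDisplayport(viewport, velocity, page,
                            { baseMargin = 100, lookaheadMs = 200 } = {}) {
  // Predict how far the user will travel during the lookahead window.
  const extraX = velocity.x * lookaheadMs;
  const extraY = velocity.y * lookaheadMs;

  // Start from the visible area plus a fixed margin on every side, then
  // extend only the edges that lie in the direction of travel.
  let left = viewport.x - baseMargin + Math.min(extraX, 0);
  let top = viewport.y - baseMargin + Math.min(extraY, 0);
  let right = viewport.x + viewport.width + baseMargin + Math.max(extraX, 0);
  let bottom = viewport.y + viewport.height + baseMargin + Math.max(extraY, 0);

  // Never request painting outside the page itself.
  left = Math.max(left, 0);
  top = Math.max(top, 0);
  right = Math.min(right, page.width);
  bottom = Math.min(bottom, page.height);

  return { x: left, y: top, width: right - left, height: bottom - top };
}
```

The trade-off from the text shows up in the two parameters: a larger margin or lookahead reduces checkerboarding at the cost of more painted area held in memory.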
+ +### Multiple layers + +Consider, for example, a scrollable page that contains an iframe which itself is scrollable. +The iframe can be scrolled independently of the top-level page, and we would like both the page and the iframe to scroll responsively. +This means that we want independent asynchronous panning for both the top-level page and the iframe. +In addition to iframes, elements that have the overflow:scroll CSS property set are also scrollable, and also end up on separate scrollable layers. +In the general case, the layers are arranged in a tree structure, and so within the APZ code we have a matching tree of AsyncPanZoomController (APZC) objects, one for each scrollable layer. +To manage this tree of APZC instances, we have a single APZCTreeManager object. +Each APZC is relatively independent and handles the scrolling for its associated layer, but there are some cases in which they need to interact; these cases are described in the sections below. + +### Hit detection + +Consider again the case where we have a scrollable page that contains an iframe which itself is scrollable. +As described above, we will have two APZC instances - one for the page and one for the iframe. +When the user puts their finger down on the screen and moves it, we need to do some sort of hit detection in order to determine whether their finger is on the iframe or on the top-level page. +Based on where their finger lands, the appropriate APZC instance needs to handle the input. +This hit detection is also done in the APZCTreeManager, as it has the necessary information about the sizes and positions of the layers. +Currently this hit detection is not perfect, as it uses rects and does not account for things like rounded corners and opacity. + +Also note that for some types of input (e.g. when the user puts two fingers down to do a pinch) we do not want the input to be "split" across two different APZC instances. 
+In the case of a pinch, for example, we find a "common ancestor" APZC instance - one that is zoomable and contains all of the touch input points, and direct the input to that APZC instance. + +### Scroll Handoff + +Consider yet again the case where we have a scrollable page that contains an iframe which itself is scrollable. +Say the user scrolls the iframe so that it reaches the bottom. +If the user continues panning on the iframe, the expectation is that the top-level page will start scrolling. +However, as discussed in the section on hit detection, the APZC instance for the iframe is separate from the APZC instance for the top-level page. +Thus, we need the two APZC instances to communicate in some way such that input events on the iframe result in scrolling on the top-level page. +This behaviour is referred to as "scroll handoff" (or "fling handoff" in the case where analogous behaviour results from the scrolling momentum of the page after the user has lifted their finger). + +### Input event untransformation + +The APZC architecture by definition results in two copies of a "scroll position" for each scrollable layer. +There is the original copy on the main thread that is accessible to web content and the layout and painting code. +And there is a second copy on the compositor side, which is updated asynchronously based on user input, and corresponds to what the user visually sees on the screen. +Although these two copies may diverge temporarily, they are reconciled periodically. +In particular, they diverge while the APZ code is performing an async pan or zoom action on behalf of the user, and are reconciled when the APZ code requests a repaint from the main thread. + +Because of the way input events are stored, this has some unfortunate consequences. +Input events are stored relative to the device screen - so if the user touches at the same physical spot on the device, the same input events will be delivered regardless of the content scroll position. 
+When the main thread receives a touch event, it combines that with the content scroll position in order to figure out what DOM element the user touched. +However, because we now have two different scroll positions, this process may not work perfectly. +A concrete example follows: + +Consider a device with screen size 600 pixels tall. +On this device, a user is viewing a document that is 1000 pixels tall, and that is scrolled down by 200 pixels. +That is, the vertical section of the document from 200px to 800px is visible. +Now, if the user touches a point 100px from the top of the physical display, the hardware will generate a touch event with y=100. +This will get sent to the main thread, which will add the scroll position (200) and get a document-relative touch event with y=300. +This new y-value will be used in hit detection to figure out what the user touched. +If the document had an absolute-positioned div at y=300, then that would receive the touch event. + +Now let us add some async scrolling to this example. +Say that the user additionally scrolls the document by another 10 pixels asynchronously (i.e. only on the compositor thread), and then does the same touch event. +The same input event is generated by the hardware, and as before, the document will deliver the touch event to the div at y=300. +However, visually, the document is scrolled by an additional 10 pixels so this outcome is wrong. +What needs to happen is that the APZ code needs to intercept the touch event and account for the 10 pixels of asynchronous scroll. +Therefore, the input event with y=100 gets converted to y=110 in the APZ code before being passed on to the main thread. +The main thread then adds the scroll position it knows about and determines that the user touched at a document-relative position of y=310. + +Analogous input event transformations need to be done for horizontal scrolling and zooming. 
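The arithmetic of the worked example can be written out as a small sketch. The two function names are hypothetical and only the vertical axis is modelled; the numbers (100, 200, 10) are the ones used in the example.

```javascript
// Hypothetical sketch of the vertical-only example; names are illustrative.

// What the APZ code does: fold the async scroll offset (known only to the
// compositor side) into the screen-relative event coordinate.
function untransformForContent(screenY, asyncScrollY) {
  return screenY + asyncScrollY;
}

// What the main thread does: add the scroll position it knows about to get a
// document-relative coordinate for hit detection.
function documentRelativeY(eventY, mainThreadScrollY) {
  return eventY + mainThreadScrollY;
}

// Without async scrolling: y=100 on screen, main-thread scroll 200 -> y=300.
const withoutAsync = documentRelativeY(100, 200);

// With 10px of async scroll: y=100 is first converted to y=110, then the
// main thread adds its own scroll position of 200 -> y=310.
const withAsync = documentRelativeY(untransformForContent(100, 10), 200);
```

The main thread's scroll position stays at 200 in both cases; only the event coordinate changes, which is why the correction has to happen in the APZ code before dispatch.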
+ +### Content independently adjusting scrolling + +As described above, there are two copies of the scroll position in the APZ architecture - one on the main thread and one on the compositor thread. +Usually for architectures like this, there is a single "source of truth" value and the other value is simply a copy. +However, in this case that is not easily possible to do. +The reason is that both of these values can be legitimately modified. +On the compositor side, the input events the user is triggering modify the scroll position, which is then propagated to the main thread. +However, on the main thread, web content might be running JavaScript code that programmatically sets the scroll position (via window.scrollTo, for example). +Scroll changes driven from the main thread are just as legitimate and need to be propagated to the compositor thread, so that the visual display updates in response. + +Because the cross-thread messaging is asynchronous, reconciling the two types of scroll changes is a tricky problem. +Our design solves this using various flags and generation counters. +The general heuristic we have is that content-driven scroll position changes (e.g. scrollTo from JS) are never lost. +For instance, if the user is doing an async scroll with their finger and content does a scrollTo in the middle, then some of the async scroll would occur before the "jump" and the rest after the "jump". + +### Content preventing default behaviour of input events + +Another problem that we need to deal with is that web content is allowed to intercept touch events and prevent the "default behaviour" of scrolling. +This ability is defined in web standards and is non-negotiable. +Touch event listeners in web content are allowed to call preventDefault() on the touchstart or first touchmove event for a touch point; doing this is supposed to "consume" the event and prevent touch-based panning. 
+As we saw in a previous section, the input event needs to be untransformed by the APZ code before it can be delivered to content. +But, because of the preventDefault problem, we cannot fully process the touch event in the APZ code until content has had a chance to handle it. +Web browsers in general solve this problem by inserting a delay of up to 300ms before processing the input - that is, web content is allowed up to 300ms to process the event and call preventDefault on it. +If web content takes longer than 300ms, or if it completes handling of the event without calling preventDefault, then the browser immediately starts processing the events. + +The way the APZ implementation deals with this is that upon receiving a touch event, it immediately returns an untransformed version that can be dispatched to content. +It also schedules a 400ms timeout (600ms on Android) during which content is allowed to prevent scrolling. +There is an API that allows the main-thread event dispatching code to notify the APZ as to whether or not the default action should be prevented. +If the APZ content response timeout expires, or if the main-thread event dispatching code notifies the APZ of the preventDefault status, then the APZ continues with the processing of the events (which may involve discarding the events). + +The touch-action CSS property from the pointer-events spec is intended to allow eliminating this 400ms delay in many cases (although for backwards compatibility it will still be needed for a while). +Note that even with touch-action implemented, there may be cases where the APZ code does not know the touch-action behaviour of the point the user touched. +In such cases, the APZ code will still wait up to 400ms for the main thread to provide it with the touch-action behaviour information. + +## Technical details + +This section describes various pieces of the APZ code, and goes into more specific detail on APIs and code than the previous sections. 
+The primary purpose of this section is to help people who plan on making changes to the code, while also not going into so much detail that it needs to be updated with every patch. + +### Overall flow of input events + +This section describes how input events flow through the APZ code. +<ol> +<li value="1"> +Input events arrive from the hardware/widget code into the APZ via APZCTreeManager::ReceiveInputEvent. +The thread that invokes this is called the input thread, and may or may not be the same as the Gecko main thread. +</li> +<li value="2"> +Conceptually the first thing that the APZCTreeManager does is to associate these events with "input blocks". +An input block is a set of events that share certain properties, and generally are intended to represent a single gesture. +For example with touch events, all events following a touchstart up to but not including the next touchstart are in the same block. +All of the events in a given block will go to the same APZC instance and will either all be processed or all be dropped. +</li> +<li value="3"> +Using the first event in the input block, the APZCTreeManager does a hit-test to see which APZC it hits. +This hit-test uses the event regions populated on the layers, which may be larger than the true hit area of the layer. +If no APZC is hit, the events are discarded and we jump to step 6. +Otherwise, the input block is tagged with the hit APZC as a tentative target and put into a global APZ input queue. +</li> +<li value="4"> + <ol> + <li value="i"> + If the input events landed outside the dispatch-to-content event region for the layer, any available events in the input block are processed. + These may trigger behaviours like scrolling or tap gestures. + </li> + <li value="ii"> + If the input events landed inside the dispatch-to-content event region for the layer, the events are left in the queue and a 400ms timeout is initiated. 
+ If the timeout expires before step 9 is completed, the APZ assumes the input block was not cancelled and the tentative target is correct, and processes them as part of step 10. + </li> + </ol> +</li> +<li value="5"> +The call stack unwinds back to APZCTreeManager::ReceiveInputEvent, which does an in-place modification of the input event so that any async transforms are removed. +</li> +<li value="6"> +The call stack unwinds back to the widget code that called ReceiveInputEvent. +This code now has the event in the coordinate space Gecko is expecting, and so can dispatch it to the Gecko main thread. +</li> +<li value="7"> +Gecko performs its own usual hit-testing and event dispatching for the event. +As part of this, it records whether any touch listeners cancelled the input block by calling preventDefault(). +It also activates inactive scrollframes that were hit by the input events. +</li> +<li value="8"> +The call stack unwinds back to the widget code, which sends two notifications to the APZ code on the input thread. +The first notification is via APZCTreeManager::ContentReceivedInputBlock, and informs the APZ whether the input block was cancelled. +The second notification is via APZCTreeManager::SetTargetAPZC, and informs the APZ of the results of the Gecko hit-test during event dispatch. +Note that Gecko may report that the input event did not hit any scrollable frame at all. +The SetTargetAPZC notification happens only once per input block, while the ContentReceivedInputBlock notification may happen once per block, or multiple times per block, depending on the input type. +</li> +<li value="9"> + <ol> + <li value="i"> + If the events were processed as part of step 4(i), the notifications from step 8 are ignored and step 10 is skipped. + </li> + <li value="ii"> + If events were queued as part of step 4(ii), and steps 5-8 take less than 400ms, the arrival of both notifications from step 8 will mark the input block ready for processing. 
+ </li> + <li value="iii"> + If events were queued as part of step 4(ii), but steps 5-8 take longer than 400ms, the notifications from step 8 will be ignored and step 10 will already have happened. + </li> + </ol> +</li> +<li value="10"> +If events were queued as part of step 4(ii) they are now either processed (if the input block was not cancelled and Gecko detected a scrollframe under the input event, or if the timeout expired) or dropped (all other cases). +Note that the APZC that processes the events may be different at this step than the tentative target from step 3, depending on the SetTargetAPZC notification. +Processing the events may trigger behaviours like scrolling or tap gestures. +</li> +</ol> + +If the CSS touch-action property is enabled, the above steps are modified as follows: +<ul> +<li> + In step 4, the APZC also requires the allowed touch-action behaviours for the input event. + This might have been determined as part of the hit-test in APZCTreeManager; if not, the events are queued. +</li> +<li> + In step 6, the widget code determines the content element at the point under the input element, and notifies the APZ code of the allowed touch-action behaviours. + This notification is sent via a call to APZCTreeManager::SetAllowedTouchBehavior on the input thread. +</li> +<li> + In step 9(ii), the input block will only be marked ready for processing once all three notifications arrive. +</li> +</ul> + +#### Threading considerations + +The bulk of the input processing in the APZ code happens on what we call "the input thread". +In practice the input thread could be the Gecko main thread, the compositor thread, or some other thread. +There are obvious downsides to using the Gecko main thread - that is, "asynchronous" panning and zooming is not really asynchronous as input events can only be processed while Gecko is idle. 
+In an e10s environment, using the Gecko main thread of the chrome process is acceptable, because the code running in that process is more controllable and well-behaved than arbitrary web content. +Using the compositor thread as the input thread could work on some platforms, but may be inefficient on others. +For example, on Android (Fennec) we receive input events from the system on a dedicated UI thread. +We would have to redispatch the input events to the compositor thread if we wanted the input thread to be the same as the compositor thread. +This introduces a potential for higher latency, particularly if the compositor does any blocking operations - blocking SwapBuffers operations, for example. +As a result, the APZ code itself does not assume that the input thread will be the same as the Gecko main thread or the compositor thread. + +#### Active vs. inactive scrollframes + +The number of scrollframes on a page is potentially unbounded. +However, we do not want to create a separate layer for each scrollframe right away, as this would require large amounts of memory. +Therefore, scrollframes are designated as either "active" or "inactive". +Active scrollframes are the ones that do have their contents put on a separate layer (or set of layers), and inactive ones do not. + +Consider a page with a scrollframe that is initially inactive. +When layout generates the layers for this page, the content of the scrollframe will be flattened into some other PaintedLayer (call it P). +The layout code also adds the area (or bounding region in case of weird shapes) of the scrollframe to the dispatch-to-content region of P. + +When the user starts interacting with that content, the hit-test in the APZ code finds the dispatch-to-content region of P. +The input block therefore has a tentative target of P when it goes into step 4(ii) in the flow above. +When Gecko processes the input event, it must detect the inactive scrollframe and activate it, as part of step 7. 
+Finally, the widget code sends the SetTargetAPZC notification in step 8 to notify the APZ that the input block should really apply to this new layer. +The issue here is that the layer transaction containing the new layer must reach the compositor and APZ before the SetTargetAPZC notification. +If this does not occur within the 400ms timeout, the APZ code will be unable to update the tentative target, and will continue to use P for that input block. +Input blocks that start after the layer transaction will get correctly routed to the new layer as there will now be a layer and APZC instance for the active scrollframe. + +This model implies that when the user initially attempts to scroll an inactive scrollframe, it may end up scrolling an ancestor scrollframe. +(This is because in the absence of the SetTargetAPZC notification, the input events will get applied to the closest ancestor scrollframe's APZC.) +Only after the round-trip to the gecko thread is complete is there a layer for async scrolling to actually occur on the scrollframe itself. +At that point the scrollframe will start receiving new input blocks and will scroll normally. 
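The resolution of a queued input block (step 10 of the flow, together with the tentative-target behaviour described in this section) can be summarised as a small decision function. This is a sketch under stated assumptions: the `block` object and its field names are hypothetical, and a `confirmedTarget` of `null` is used here to stand for "Gecko reported that no scrollable frame was hit".

```javascript
// Hypothetical sketch; the block shape and field names are assumptions made
// for illustration and do not mirror the APZ data structures.
// block.timedOut         - true if the content response timeout expired first
// block.preventDefaulted - true if content cancelled the input block
// block.tentativeTarget  - target chosen by the APZ hit-test (step 3)
// block.confirmedTarget  - target reported via SetTargetAPZC, or null if
//                          Gecko found no scrollable frame under the event
function resolveInputBlock(block) {
  if (block.timedOut) {
    // Late notifications are ignored: assume the block was not cancelled and
    // that the tentative target was correct.
    return { action: "process", target: block.tentativeTarget };
  }
  if (block.preventDefaulted || block.confirmedTarget === null) {
    // Content cancelled the block, or there is nothing scrollable to target.
    return { action: "drop", target: null };
  }
  // Both notifications arrived in time; the confirmed target may differ from
  // the tentative one (e.g. a newly activated scrollframe).
  return { action: "process", target: block.confirmedTarget };
}
```

For the inactive-scrollframe case, a block that times out keeps targeting P, while a block whose notifications arrive in time is routed to the newly created layer's APZC.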
diff --git a/gfx/doc/B2GInputFlow.svg b/gfx/doc/B2GInputFlow.svg new file mode 100644 index 000000000..ee6f4332c --- /dev/null +++ b/gfx/doc/B2GInputFlow.svg @@ -0,0 +1,349 @@ +<?xml version="1.0"?> +<svg version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="1000" height="800"> + <title>Touch input event flow on B2G</title> + <g id="arrows"></g> + <style type="text/css"><![CDATA[ + text { + fill: black; + text-anchor: middle; + white-space: pre-line; + font-size: 14px; + } + rect { + fill: none; + } + line { + stroke: black; + } + .parentinput rect { + stroke: black; + } + text.parentinput { + fill: black; + text-anchor: start; + } + .parentmain rect { + stroke: orange; + } + text.parentmain { + fill: orange; + text-anchor: start; + } + .parentcompositor rect { + stroke: green; + } + text.parentcompositor { + fill: green; + text-anchor: start; + } + .childmain rect { + stroke: red; + } + text.childmain { + fill: red; + text-anchor: start; + } + .bothmain rect { + stroke: blue; + } + text.bothmain { + fill: blue; + text-anchor: start; + } + ]]></style> + <script type="text/javascript"><![CDATA[ + var svg = "http://www.w3.org/2000/svg"; + var maxY = 0; + + function breaks(text) { + var count = 0; + for (var i = text.length - 1; i >= 0; i--) { + if (text.charAt(i) == '\n') { + count++; + } + } + return count; + } + + function makeAction(text, x, y, thread) { + maxY = Math.max(maxY, y); + var g = document.createElementNS(svg, "g"); + g.setAttribute("class", "action " + thread); + g.setAttribute("transform", "translate(" + x + ", " + (y + 30) + ")"); + var r = document.createElementNS(svg, "rect"); + r.setAttribute("width", "100"); + r.setAttribute("height", "40"); + var t = document.createElementNS(svg, "text"); + t.setAttribute("x", "50"); + t.setAttribute("y", 25 - (7 * breaks(text))); + t.appendChild(document.createTextNode(text)); + g.appendChild(r); + g.appendChild(t); + return g; + } + + function makeChoice(text, x, y, 
thread) { + maxY = Math.max(maxY, y); + var g = document.createElementNS(svg, "g"); + g.setAttribute("class", "choice " + thread); + g.setAttribute("transform", "translate(" + (x + 15) + ", " + (y + 15) + ")"); + var g2 = document.createElementNS(svg, "g"); + g2.setAttribute("transform", "rotate(-45, 35, 35)"); + var r = document.createElementNS(svg, "rect"); + r.setAttribute("width", "70"); + r.setAttribute("height", "70"); + g2.appendChild(r); + var t = document.createElementNS(svg, "text"); + t.setAttribute("x", "35"); + t.setAttribute("y", 40 - (7 * breaks(text))); + t.appendChild(document.createTextNode(text)); + g.appendChild(g2); + g.appendChild(t); + return g; + } + + function makeLabelChoice(label, point) { + var t = document.createElementNS(svg, "text"); + t.setAttribute("x", point.x); + t.setAttribute("y", point.y); + t.appendChild(document.createTextNode(label)); + return t; + } + + function makeLine(sx, sy, ex, ey) { + maxY = Math.max(maxY, sy, ey); + var l = document.createElementNS(svg, "line"); + l.setAttribute("x1", sx); + l.setAttribute("y1", sy); + l.setAttribute("x2", ex); + l.setAttribute("y2", ey); + return l; + } + + function makeArrow(start, end) { + var g = document.createElementNS(svg, "g"); + g.appendChild(makeLine(start.x, start.y, end.x, end.y)); + if (start.x != end.x) { + start.x = end.x + (4 * Math.sign(start.x - end.x)); + g.appendChild(makeLine(start.x, start.y - 4, end.x, end.y)); + g.appendChild(makeLine(start.x, start.y + 4, end.x, end.y)); + } else if (start.y != end.y) { + start.y = end.y + (4 * Math.sign(start.y - end.y)); + g.appendChild(makeLine(start.x - 4, start.y, end.x, end.y)); + g.appendChild(makeLine(start.x + 4, start.y, end.x, end.y)); + } + return g; + } + + function makeVHArrow(start, end) { + var g = document.createElementNS(svg, "g"); + g.appendChild(makeLine(start.x, start.y, start.x, end.y)); + start.y = end.y; + g.appendChild(makeArrow(start, end)); + return g; + } + + function makeHVArrow(start, end) { + 
var g = document.createElementNS(svg, "g"); + g.appendChild(makeLine(start.x, start.y, end.x, start.y)); + start.x = end.x; + g.appendChild(makeArrow(start, end)); + return g; + } + + function makeVHVArrow(start, end, length) { + var g = document.createElementNS(svg, "g"); + g.appendChild(makeLine(start.x, start.y, start.x, start.y + length)); + start.y += length; + g.appendChild(makeLine(start.x, start.y, end.x, start.y)); + start.x = end.x; + g.appendChild(makeArrow(start, end)); + return g; + } + + function makeHVHArrow(start, end, length) { + var g = document.createElementNS(svg, "g"); + g.appendChild(makeLine(start.x, start.y, start.x + length, start.y)); + start.x += length; + g.appendChild(makeLine(start.x, start.y, start.x, end.y)); + start.y = end.y; + g.appendChild(makeArrow(start, end)); + return g; + } + + function translation(group) { + var r = new RegExp("translate\\((\\d+), (\\d+)\\)"); + var result = r.exec(group.getAttribute("transform")); + return { x: parseInt(result[1]), y: parseInt(result[2]) }; + } + + function isAction(group) { + return group.classList.contains("action"); + } + + function isChoice(group) { + return group.classList.contains("choice"); + } + + function offset(point, x, y) { + point.x += x; + point.y += y; + return point; + } + + function rightOf(group) { + var t = translation(group); + if (isAction(group)) { + return offset(t, 100, 20); + } + if (isChoice(group)) { + return offset(t, 85, 35); + } + return t; + } + + function leftOf(group) { + var t = translation(group); + if (isAction(group)) { + return offset(t, 0, 20); + } + if (isChoice(group)) { + return offset(t, -15, 35); + } + return t; + } + + function topOf(group) { + var t = translation(group); + if (isAction(group)) { + return offset(t, 50, 0); + } + if (isChoice(group)) { + return offset(t, 35, -15); + } + return t; + } + + function bottomOf(group) { + var t = translation(group); + if (isAction(group)) { + return offset(t, 50, 40); + } + if (isChoice(group)) { + 
return offset(t, 35, 85); + } + return t; + } + + function midpoint(start, end) { + return { x: (start.x + end.x) / 2, + y: (start.y + end.y) / 2 }; + } + + function makeLegend(label, thread) { + var t = document.createElementNS(svg, "text"); + t.setAttribute("x", "10"); + t.setAttribute("y", maxY); + t.setAttribute("class", thread); + maxY += 15; + t.appendChild(document.createTextNode(label)); + return t; + } + + var android = makeAction("Android/Gonk", 20, 0, "parentinput"); + var sendNative = makeAction("DOMWindowUtils\nsendNativeTouchPoint", 20, 100, "parentmain"); + var apzHitTest = makeAction("APZ hit test", 150, 0, "parentcompositor"); + var apzUntransform = makeAction("APZ\nuntransform", 300, 0, "parentcompositor"); + var apzGesture = makeAction("APZ gesture\ndetection", 450, 0, "parentcompositor"); + var apzTransform = makeAction("APZ transform\nupdate", 600, 0, "parentcompositor"); + var compositor = makeAction("Compositor", 750, 0, "parentcompositor"); + var nsAppShell = makeAction("nsAppShell", 150, 100, "parentmain"); + var rootHitTest = makeAction("Gecko hit test\n(root process)", 300, 100, "parentmain"); + var rootEsm = makeAction("Gecko ESM\n(root process)", 450, 100, "parentmain"); + var isEdgeGesture = makeChoice("Edge gesture?", 300, 200, "parentmain"); + var edgeConsume = makeAction("Consume\nevent block", 150, 200, "parentmain"); + var bepjsm = makeAction("BEParent.jsm\nsendTouchEvent", 450, 200, "parentmain"); + var iframeSend = makeAction("HTMLIFrameElement\nsendTouchEvent", 20, 275, "parentmain"); + var isApzTarget = makeChoice("Target\nhas APZ?", 600, 200, "parentmain"); + var sendTouchEvent = makeAction("Target\nsendTouchEventToWindow", 750, 100, "parentmain"); + var injectTouch = makeAction("injectTouchEvent", 750, 200, "parentmain"); + var targetESM = makeAction("Target window\nESM", 750, 450, "bothmain"); + var tabParent = makeAction("TabParent", 750, 350, "parentmain"); + var geckoUntransform = makeAction("Gecko\nuntransform", 600, 
350, "parentmain"); + var tabChild = makeAction("TabChild", 450, 350, "childmain"); + var isApzcEnabled = makeChoice("APZ\nenabled?", 300, 350, "childmain"); + var tabGesture = makeAction("TabChild gesture\ndetection", 150, 350, "childmain"); + var childHitTest = makeAction("Gecko hit test\n(child process)", 300, 450, "childmain"); + var childEsm = makeAction("Gecko ESM\n(child process)", 450, 450, "childmain"); + var childContent = makeAction("Content\n(child process)", 600, 450, "childmain"); + + document.documentElement.appendChild(android); + document.documentElement.appendChild(sendNative); + document.documentElement.appendChild(apzHitTest); + document.documentElement.appendChild(apzUntransform); + document.documentElement.appendChild(apzGesture); + document.documentElement.appendChild(apzTransform); + document.documentElement.appendChild(compositor); + document.documentElement.appendChild(nsAppShell); + document.documentElement.appendChild(rootHitTest); + document.documentElement.appendChild(rootEsm); + document.documentElement.appendChild(isEdgeGesture); + document.documentElement.appendChild(edgeConsume); + document.documentElement.appendChild(bepjsm); + document.documentElement.appendChild(iframeSend); + document.documentElement.appendChild(isApzTarget); + document.documentElement.appendChild(sendTouchEvent); + document.documentElement.appendChild(injectTouch); + document.documentElement.appendChild(targetESM); + document.documentElement.appendChild(tabParent); + document.documentElement.appendChild(geckoUntransform); + document.documentElement.appendChild(tabChild); + document.documentElement.appendChild(isApzcEnabled); + document.documentElement.appendChild(tabGesture); + document.documentElement.appendChild(childHitTest); + document.documentElement.appendChild(childEsm); + document.documentElement.appendChild(childContent); + + document.documentElement.appendChild(makeLabelChoice("Y", offset(leftOf(isEdgeGesture), -5, -5))); + 
document.documentElement.appendChild(makeLabelChoice("N", offset(rightOf(isEdgeGesture), 5, -5))); + document.documentElement.appendChild(makeLabelChoice("N", offset(topOf(isApzTarget), 8, -10))); + document.documentElement.appendChild(makeLabelChoice("Y", offset(rightOf(isApzTarget), 10, 14))); + document.documentElement.appendChild(makeLabelChoice("N", offset(leftOf(isApzcEnabled), -5, -5))); + document.documentElement.appendChild(makeLabelChoice("Y", offset(bottomOf(isApzcEnabled), 10, 14))); + + var arrows = document.getElementById('arrows'); + arrows.appendChild(makeArrow(rightOf(android), leftOf(apzHitTest))); + arrows.appendChild(makeVHVArrow(topOf(sendNative), midpoint(rightOf(android), leftOf(apzHitTest)), -20)); + arrows.appendChild(makeArrow(rightOf(apzHitTest), leftOf(apzUntransform))); + arrows.appendChild(makeArrow(rightOf(apzUntransform), leftOf(apzGesture))); + arrows.appendChild(makeArrow(rightOf(apzGesture), leftOf(apzTransform))); + arrows.appendChild(makeArrow(rightOf(apzTransform), leftOf(compositor))); + arrows.appendChild(makeVHVArrow(midpoint(leftOf(apzUntransform), rightOf(apzGesture)), topOf(nsAppShell), 40)); + arrows.appendChild(makeArrow(rightOf(nsAppShell), leftOf(rootHitTest))); + arrows.appendChild(makeArrow(rightOf(rootHitTest), leftOf(rootEsm))); + arrows.appendChild(makeVHVArrow(bottomOf(rootEsm), topOf(isEdgeGesture), 15)); + arrows.appendChild(makeArrow(leftOf(isEdgeGesture), rightOf(edgeConsume))); + arrows.appendChild(makeArrow(rightOf(isEdgeGesture), leftOf(bepjsm), 20)); + arrows.appendChild(makeHVArrow(rightOf(iframeSend), bottomOf(bepjsm))); + arrows.appendChild(makeArrow(rightOf(bepjsm), leftOf(isApzTarget))); + arrows.appendChild(makeArrow(rightOf(isApzTarget), leftOf(injectTouch))); + arrows.appendChild(makeArrow(bottomOf(injectTouch), topOf(tabParent))); + arrows.appendChild(makeVHArrow(topOf(isApzTarget), leftOf(sendTouchEvent))); + arrows.appendChild(makeHVHArrow(rightOf(sendTouchEvent), rightOf(targetESM), 30)); + 
arrows.appendChild(makeArrow(leftOf(tabParent), rightOf(geckoUntransform))); + arrows.appendChild(makeArrow(leftOf(geckoUntransform), rightOf(tabChild))); + arrows.appendChild(makeArrow(leftOf(tabChild), rightOf(isApzcEnabled))); + arrows.appendChild(makeArrow(leftOf(isApzcEnabled), rightOf(tabGesture))); + arrows.appendChild(makeArrow(bottomOf(isApzcEnabled), topOf(childHitTest))); + arrows.appendChild(makeVHArrow(bottomOf(tabGesture), leftOf(childHitTest))); + arrows.appendChild(makeArrow(rightOf(childHitTest), leftOf(childEsm))); + arrows.appendChild(makeArrow(rightOf(childEsm), leftOf(childContent))); + arrows.appendChild(makeVHVArrow(midpoint(leftOf(apzGesture), rightOf(apzTransform)), topOf(tabChild), 300)); + + document.documentElement.appendChild(makeLegend("Main process input thread", "parentinput")); + document.documentElement.appendChild(makeLegend("Main process main thread", "parentmain")); + document.documentElement.appendChild(makeLegend("Main process compositor thread", "parentcompositor")); + document.documentElement.appendChild(makeLegend("Child process main thread", "childmain")); + document.documentElement.appendChild(makeLegend("Undetermined process main thread", "bothmain")); + ]]></script> +</svg> diff --git a/gfx/doc/GraphicsOverview.md b/gfx/doc/GraphicsOverview.md new file mode 100644 index 000000000..b834172c8 --- /dev/null +++ b/gfx/doc/GraphicsOverview.md @@ -0,0 +1,83 @@ +Mozilla Graphics Overview {#graphicsoverview} +================= +## Work in progress. Possibly incorrect or incomplete. + +Overview +-------- +The graphics system is responsible for rendering (painting, drawing) the frame tree (rendering tree) elements as created by the layout system. Each leaf in the tree has content, bounded by a rectangle (or perhaps another shape, in the case of SVG). 
+ +The simple approach for producing the result would thus involve traversing the frame tree, in a correct order, drawing each frame into the resulting buffer and displaying (printing notwithstanding) that buffer when the traversal is done. It is worth spending some time on the "correct order" note above. If there are no overlapping frames, this is fairly simple - any order will do, as long as there is no background. If there is background, we just have to worry about drawing that first. Since we do not control the content, chances are the page is more complicated. There are overlapping frames, likely with transparency, so we need to make sure the elements are drawn "back to front", in layers, so to speak. Layers are an important concept, and we will revisit them shortly, as they are central to fixing a major issue with the above simple approach. + +While the above simple approach will work, the performance will suffer. Each time anything changes in any of the frames, the complete process needs to be repeated, everything needs to be redrawn. Further, there is very little space to take advantage of the modern graphics (GPU) hardware, or multi-core computers. If you recall from the previous sections, the frame tree is only accessible from the UI thread, so while we're doing all this work, the UI is basically blocked. + +### (Retained) Layers + +The Layers framework was introduced to address the above performance issues, by having a part of the design address each item. At the high level: + +1. We create a layer tree. The leaf elements of the tree contain all frames (possibly multiple frames per leaf). +2. We render each layer tree element and cache (retain) the result. +3. We composite (combine) all the leaf elements into the final result. + +Let's examine each of these steps, in reverse order. + +### Compositing +We use the term composite as it implies that the order is important. 
If the elements being composited overlap, whether there is transparency involved or not, the order in which they are combined will affect the result. +Compositing is where we can use some of the power of the modern graphics hardware. It is optimal for doing this job. In the scenarios where only the position of individual frames changes, without the content inside them changing, we see why caching each layer would be advantageous - we only need to repeat the final compositing step, completely skipping the layer tree creation and the rendering of each leaf, thus speeding up the process considerably. + +Another benefit is equally apparent in the context of the stated deficiencies of the simple approach. We can use the available graphics hardware accelerated APIs to do the compositing step. Direct3D and OpenGL can be used on different platforms and are well suited to accelerate this step. + +Finally, we can now envision performing the compositing step on a separate thread, unblocking the UI thread for other work, and doing more work in parallel. More on this below. + +It is important to note that the number of operations in this step is proportional to the number of layer tree (leaf) elements, so there is additional work and complexity involved, when the layer tree is large. + +#### Render and retain layer elements +As we saw, the compositing step benefits from caching the intermediate result. This does result in extra memory usage, so it needs to be considered during the layer tree creation. Beyond the caching, we can accelerate the rendering of each element by (indirectly) using the available platform APIs (e.g., Direct2D, CoreGraphics, even some of the 3D APIs like OpenGL or Direct3D) as available. This is actually done through a platform independent API (see Moz2D below), but it is important to realize it does get accelerated appropriately. 
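
To illustrate why retaining rendered layers pays off, here is a minimal C++ sketch of the render, retain, and composite steps described above. Every name in it (`Layer`, `RenderIfNeeded`, `Composite`, `SetContent`, `MoveTo`, `renderCount`) is hypothetical and invented for this example; none of it is Gecko's actual Layers API, and strings stand in for pixel buffers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of a retained layer (not Gecko's API).
struct Layer {
  std::string content;   // what the layer should display
  int x = 0;             // position, only consulted at composite time
  std::string cached;    // retained (cached) rendering result
  bool dirty = true;     // true when content changed since the last render
  int renderCount = 0;   // how many times the layer had to be repainted
};

// Repaint a layer only if its content changed; otherwise reuse the cache.
void RenderIfNeeded(Layer& layer) {
  if (!layer.dirty) {
    return;
  }
  layer.cached = "[" + layer.content + "]";
  layer.dirty = false;
  ++layer.renderCount;
}

// Combine the retained results back to front; the order matters.
std::string Composite(std::vector<Layer>& layers) {
  std::string frame;
  for (Layer& layer : layers) {
    RenderIfNeeded(layer);
    frame += layer.cached + "@" + std::to_string(layer.x) + " ";
  }
  return frame;
}

// Changing content invalidates the cache and forces a repaint.
void SetContent(Layer& layer, const std::string& content) {
  layer.content = content;
  layer.dirty = true;
}

// Moving a layer leaves the cached rendering valid: only compositing reruns.
void MoveTo(Layer& layer, int x) {
  layer.x = x;
}
```

In this toy model, calling `MoveTo` on a layer and compositing again produces a new frame without increasing any `renderCount`, which is the scenario the text describes as skipping the rendering of each leaf.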
+ +#### Creating the layer tree +We need to create a layer tree (from the frames tree), which will give us the correct result while striking the right balance between a layer per frame element and a single layer for the complete frames tree. As was mentioned above, there is an overhead in traversing the whole tree and caching each of the elements, balanced by the performance improvements. Some of the performance improvements are only noticed when something changes (e.g., one element is moving, we only need to redo the compositing step). + +### Refresh Driver + +### Layers + +#### Rendering each layer + +### Tiling vs. Buffer Rotation vs. Full paint + +#### Compositing for the final result + +### Graphics API + +#### Moz2D +* The Moz2D graphics API, part of the Azure project, is a cross-platform interface onto the various graphics backends that Gecko uses for rendering such as Direct2D (1.0 and 1.1), Skia, Cairo, Quartz, and NV Path. Adding a new graphics platform to Gecko is accomplished by adding a backend to Moz2D. +\see [Moz2D documentation on wiki](https://wiki.mozilla.org/Platform/GFX/Moz2D) + +#### Compositing + +#### Image Decoding + +#### Image Animation + +### Funny words +There are a lot of code words that we use to refer to projects, libraries, areas of the code. Here's an attempt to cover some of those: +* Azure - See Moz2D in the Graphics API section above. +* Backend - See Moz2D in the Graphics API section above. +* Cairo - http://www.cairographics.org/. Cairo is a 2D graphics library with support for multiple output devices. Currently supported output targets include the X Window System (via both Xlib and XCB), Quartz, Win32, image buffers, PostScript, PDF, and SVG file output. +* Moz2D - See Moz2D in the Graphics API section above. +* Thebes - Graphics API that preceded Moz2D. 
+* Reflow +* Display list + +### [Historical Documents](http://www.youtube.com/watch?v=lLZQz26-kms) +A number of posts and blogs that will give you more details or more background, or reasoning that led to different solutions and approaches. + +* 2010-01 [Layers: Cross Platform Acceleration](http://www.basschouten.com/blog1.php/layers-cross-platform-acceleration) +* 2010-04 [Layers](http://robert.ocallahan.org/2010/04/layers_01.html) +* 2010-07 [Retained Layers](http://robert.ocallahan.org/2010/07/retained-layers_16.html) +* 2011-04 [Introduction](https://blog.mozilla.org/joe/2011/04/26/introducing-the-azure-project/ Moz2D) +* 2011-07 [Layers](http://chrislord.net/index.php/2011/07/25/shadow-layers-and-learning-by-failing/ Shadow) +* 2011-09 [Graphics API Design](http://robert.ocallahan.org/2011/09/graphics-api-design.html) +* 2012-04 [Moz2D Canvas on OSX](http://muizelaar.blogspot.ca/2012/04/azure-canvas-on-os-x.html) +* 2012-05 [Mask Layers](http://featherweightmusings.blogspot.co.uk/2012/05/mask-layers_26.html) +* 2013-07 [Graphics related](http://www.basschouten.com/blog1.php) + diff --git a/gfx/doc/LayersHistory.md b/gfx/doc/LayersHistory.md new file mode 100644 index 000000000..2833aa3c5 --- /dev/null +++ b/gfx/doc/LayersHistory.md @@ -0,0 +1,60 @@ +This is an overview of the major events in the history of our Layers infrastructure. + +- iPhone released in July 2007 (Built on a toolkit called LayerKit) + +- Core Animation (October 2007) LayerKit was publicly renamed to Core Animation in OS X 10.5 + +- Webkit CSS 3d transforms (July 2009) + +- Original layers API (March 2010) Introduced the idea of a layer manager that + would composite. One of the first use cases for this was hardware accelerated + YUV conversion for video. + +- Retained layers (July 7 2010 - Bug 564991) +This was an important concept that introduced the idea of persisting the layer +content across paints in gecko controlled buffers instead of just by the OS. 
This introduced +the concept of buffer rotation to deal with scrolling instead of using the +native scrolling APIs like ScrollWindowEx + +- Layers IPC (July 2010 - Bug 570294) +This introduced shadow layers and edit lists and was originally done for e10s v1 + +- 3d transforms (September 2011 - Bug 505115) + +- OMTC (December 2011 - Bug 711168) +This was prototyped on OS X but shipped first for Fennec + +- Tiling v1 (April 2012 - Bug 739679) +Originally done for Fennec. +This was done to avoid situations where we had to do a bunch of work for +scrolling a small amount. i.e. buffer rotation. It allowed us to have a +variety of interesting features like progressive painting and lower resolution +painting. + +- C++ Async pan zoom controller (July 2012 - Bug 750974) +The existing APZ code was in Java for Fennec so this was reimplemented. + +- Streaming WebGL Buffers (February 2013 - Bug 716859) +Infrastructure to allow OMTC WebGL and avoid the need to glFinish() every +frame. + +- Compositor API (April 2013 - Bug 825928) +The planning for this started around November 2012. +Layers refactoring created a compositor API that abstracted away the differences between the +D3D vs OpenGL. The main piece of API is DrawQuad. + +- Tiling v2 (Mar 7 2014 - Bug 963073) +Tiling for B2G. This work is mainly porting tiled layers to new textures, +implementing double-buffered tiles and implementing a texture client pool, to +be used by tiled content clients. + + A large motivation for the pool was the very slow performance of allocating tiles because +of the sync messages to the compositor. + + The slow performance of allocating was directly addressed by bug 959089 which allowed us +to allocate gralloc buffers without sync messages to the compositor thread. + +- B2G WebGL performance (May 2014 - Bug 1006957, 1001417, 1024144) +This work improved the synchronization mechanism between the compositor +and the producer. 
+ diff --git a/gfx/doc/MainPage.md b/gfx/doc/MainPage.md new file mode 100644 index 000000000..70a9fc60a --- /dev/null +++ b/gfx/doc/MainPage.md @@ -0,0 +1,21 @@ +Mozilla Graphics {#mainpage} +====================== + +## Work in progress. Possibly incorrect or incomplete. + + +Introduction +------- +This collection of linked pages contains a combination of Doxygen +extracted source code documentation and design documents for the +Mozilla graphics architecture. The design documents live in the gfx/doc directory. + +This [wiki page](https://wiki.mozilla.org/Platform/GFX) contains +information about graphics and the graphics team at MoCo. + +Continue here for a [very high level introductory overview](@ref graphicsoverview) +if you don't know where to start. + +Useful pointers for creating documentation +------ +[The mechanics of creating these files](https://wiki.mozilla.org/Platform/GFX/DesignDocumentationGuidelines) diff --git a/gfx/doc/MozSurface.md b/gfx/doc/MozSurface.md new file mode 100644 index 000000000..ae8c45f42 --- /dev/null +++ b/gfx/doc/MozSurface.md @@ -0,0 +1,124 @@ +MozSurface {#mozsurface} +========== + +**This document is work in progress. Some information may be missing or incomplete.** + +## Goals + +We need to be able to safely and efficiently render web content into surfaces that may be shared across processes. +MozSurface is a cross-process and backend-independent Surface API and not a stream API. + +## Owner + +Nicolas Silva + +## Definitions + +## Use cases + +Drawing web content into a surface and sharing it with the compositor process to display it on the screen without copies. + +## Requirement + +* It must be possible to efficiently share a MozSurface with a separate thread or process through IPDL +* It must be possible to obtain read access to a MozSurface on both the client and the host side at the same time. +* The creation, update and destruction of surfaces must be safe and race-free. 
In particular, the ownership of the shared data must be clearly defined. +* MozSurface must be a cross-backend/cross-platform abstraction that we will use on all of the supported platforms. +* It must be possible to efficiently draw into a MozSurface using Moz2D. +* While it should be possible to share MozSurfaces across processes, it should not be limited to that. MozSurface should also be the preferred abstraction for use with surfaces that are not shared with the compositor process. + +## TextureClient and TextureHost + +TextureClient and TextureHost are the closest abstractions we currently have to MozSurface. The current plan is to evolve TextureClient into MozSurface. In its current state, TextureClient doesn't meet all the requirements and design decisions of MozSurface yet. + +In particular, TextureClient/TextureHost are designed around cross-process sharing specifically. See the SharedMozSurface design document for more information about TextureClient and TextureHost. + +## Locking semantics + +In order to access the shared surface data users of MozSurface must acquire and release a lock on the surface, specifying the open mode (read/write/read+write). + + bool Lock(OpenMode aMode); + void Unlock(); + +This locking API has two purposes: + +* Ensure that access to the shared data is race-free. +* Let the implementation do whatever is necessary for the user to have access to the data. For example it can be mapping and unmapping the surface data in memory if the underlying backend requires it. + +The lock is expected to behave as a cross-process blocking read/write lock that is not reentrant. + +## Immutable surfaces + +In some cases we know in advance that a surface will not be modified after it has been shared. This is for example true for video frames. In this case the surface can be marked as immutable and the underlying implementation doesn't need to hold an actual blocking lock on the shared data. 
+Trying to acquire a write lock on a MozSurface that is marked as immutable and already shared must fail (return false). +Note that it is still required to use the Lock/Unlock API to read the data, in order for the implementation to be able to properly map and unmap the memory. This is just an optimization and a safety check. + +## Drawing into a surface + +In most cases we want to be able to paint directly into a surface through the Moz2D API. + +A surface lets you *borrow* a DrawTarget that is only valid between Lock and Unlock. + + DrawTarget* GetAsDrawTarget(); + +It is invalid to hold a reference to the DrawTarget after Unlock, and a different DrawTarget may be obtained during the next Lock/Unlock interval. + +In some cases we want to use MozSurface without drawing into it. For instance to share video frames across processes. Some surface types may also not be accessible through a DrawTarget (for example YCbCr surfaces). + + bool CanExposeDrawTarget(); + +helps with making sure that a Surface supports exposing a Moz2D DrawTarget. + +## Using a MozSurface as a source for Compositing + +To interface with the Compositor API, MozSurface gives access to TextureSource objects. TextureSource is the cross-backend representation of a texture that Compositor understands. +While MozSurface handles memory management of (potentially shared) texture data, TextureSource is only an abstraction for Compositing. + +## Fence synchronization + +TODO: We need to figure this out. Right now we have a Gonk specific implementation, but no cross-platform abstraction/design. + +## Ownership of the shared data + +MozSurface (TextureClient/TextureHost in its current form) defines ownership rules that depend on the configuration of the surface, in order to satisfy efficiency and safety requirements. + +These rules rely on the fact that the underlying shared data is strictly owned by the MozSurface. This means that keeping direct references to the shared data is illegal and unsafe. 
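
The locking, immutability, and borrowed-DrawTarget rules from the sections above can be sketched as a small single-process C++ model. This is only an illustration under stated assumptions: `ToySurface`, `ToyDrawTarget`, `MarkImmutable`, `MarkShared`, and the `OpenMode` enumerator names are hypothetical, and where the real lock is described as blocking and cross-process, this toy simply returns `false` on a second `Lock` to show that it is not reentrant.

```cpp
#include <cassert>

// Hypothetical enumerators; the text only names read/write/read+write.
enum class OpenMode { Read, Write, ReadWrite };

// Stand-in for a Moz2D DrawTarget, for illustration only.
struct ToyDrawTarget {
  int pixelsWritten = 0;
};

// Toy single-process model of the rules described above (not Gecko's API).
class ToySurface {
 public:
  void MarkImmutable() { mImmutable = true; }
  void MarkShared() { mShared = true; }

  bool Lock(OpenMode aMode) {
    if (mLocked) {
      // Not reentrant. A real cross-process lock would block here instead.
      return false;
    }
    const bool wantsWrite = aMode != OpenMode::Read;
    if (wantsWrite && mImmutable && mShared) {
      // Write locks must fail on immutable surfaces that are already shared.
      return false;
    }
    mLocked = true;
    mMode = aMode;
    return true;
  }

  void Unlock() { mLocked = false; }

  // Borrowed pointer: only meaningful between Lock and Unlock, and only
  // when the surface was opened with write access.
  ToyDrawTarget* GetAsDrawTarget() {
    if (!mLocked || mMode == OpenMode::Read) {
      return nullptr;
    }
    return &mTarget;
  }

 private:
  bool mLocked = false;
  bool mImmutable = false;
  bool mShared = false;
  OpenMode mMode = OpenMode::Read;
  ToyDrawTarget mTarget;
};
```

A caller in this model always brackets access with `Lock` and `Unlock`, even for reads, which mirrors the note that the implementation needs those calls to map and unmap memory.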
+ +## Internal buffers / direct texturing + +Some MozSurface implementations use CPU-side shared memory to share the texture data across processes, and require a GPU texture upload when interfacing with a TextureSource. In this case we say that the surface has an internal buffer (because it is implicitly equivalent to double buffering where the shared data is the back buffer and the GPU side texture is the front buffer). We also say that it doesn't do "direct texturing" meaning that we don't draw directly into the GPU-side texture. + +Examples: + + * Shmem MozSurface + OpenGL TextureSource: Has an internal buffer (no direct texturing) + * Gralloc MozSurface + Gralloc TextureSource: No internal buffer (direct texturing) + +While direct texturing is usually the most efficient way, it is not always available depending on the platform and the required allocation size or format. Textures with internal buffers have fewer restrictions around locking since the host side will only need to read from the MozSurface once per update, meaning that we can often get away with single buffering where we would need double buffering with direct texturing. + +## Alternative solutions + +## Backends + +We have MozSurface implementations (classes inheriting from TextureClient/TextureHost) for OpenGL, Software, D3D9, and D3D11 backends. +Some implementations can be used with any backend (e.g. ShmemTextureClient/Host). + +## Users of MozSurface + +MozSurface is the mechanism used by layers to share surfaces with the compositor, but it is not limited to layers. It should be used by anything that draws into a surface that may be shared with the compositor thread. + +## Testing + +TODO - How can we make MozSurface more testable and what should we test? + +## Future work + +### Using a MozSurface as a source for Drawing + +MozSurface should be able to expose a borrowed Moz2D SourceSurface that is valid between Lock and Unlock similarly to how it exposes a DrawTarget. 
+ +## Comparison with other APIs + +MozSurface is somewhat equivalent to Gralloc on Android/Gonk: it is a reference counted cross-process surface with locking semantics. While Gralloc can interface itself with OpenGL textures for compositing, MozSurface can interface itself to TextureSource objects. + +MozSurface should not be confused with higher level APIs such as EGLStream. A swap-chain API like EGLStream can be implemented on top of MozSurface, but MozSurface's purpose is to define and manage the memory and resources of shared texture data. + diff --git a/gfx/doc/SharedMozSurface.md b/gfx/doc/SharedMozSurface.md new file mode 100644 index 000000000..3ff3e53dd --- /dev/null +++ b/gfx/doc/SharedMozSurface.md @@ -0,0 +1,147 @@ +Shared MozSurface {#mozsurface} +========== + +**This document is work in progress. Some information may be missing or incomplete.** + +Shared MozSurfaces represent an important use case of MozSurface; anything that is in the MozSurface design document also applies to shared MozSurfaces. + +## Goals + +We need to be able to safely and efficiently render web content into surfaces that may be shared across processes. +MozSurface is a cross-process and backend-independent Surface API and not a stream API. + +## Owner + +Nicolas Silva + +## Definitions + +* Client and Host: In Gecko's compositing architecture, the client process is the producer, while the host process is the consumer side, where compositing takes place. + +## Use cases + +Drawing web content into a surface and sharing it with the compositor process to display it on the screen without copies. + +## Requirement + +Shared MozSurfaces represent an important use case of MozSurface, and have the same requirements as MozSurface. + +## TextureClient and TextureHost + +TextureClient and TextureHost are the closest abstractions we currently have to MozSurface. 
+Inline documentation about TextureClient and TextureHost can be found in: + +* [gfx/layers/client/TextureClient.h](http://dxr.mozilla.org/mozilla-central/source/gfx/layers/client/TextureClient.h) +* [gfx/layers/composite/TextureHost.h](http://dxr.mozilla.org/mozilla-central/source/gfx/layers/composite/TextureHost.h) + +TextureClient is the client-side handle on a MozSurface, while TextureHost is the equivalent host-side representation. There can only be one TextureClient for a given TextureHost, and one TextureHost for a given TextureClient. Likewise, there can only be one shared object for a given TextureClient/TextureHost pair. + +A MozSurface containing data that is shared between a client process and a host process exists in the following form: + +``` + . + Client process . Host process + . + ________________ ______________ ______________ + | | | | | | + | TextureClient +----+ <SharedData> +----+ TextureHost | + |________________| |______________| |______________| + . + . + . + Figure 1) A Surface as seen by the client and the host processes +``` + +The above figure is a logical representation, not a class diagram. +`<SharedData>` is a placeholder for whichever platform specific surface type we are sharing, for example a Gralloc buffer on Gonk or a D3D11 texture on Windows. + +## Deallocation protocol + +The shared data is accessible by both the client-side and the host-side of the MozSurface. A deallocation protocol must be defined to handle which side deallocates the data, and to ensure that it doesn't cause any race condition. +The client side, which contains the web content's logic, always "decides" when a surface is needed or not. So the life time of a MozSurface is driven by the reference count of its client-side handle (TextureClient). +When a TextureClient's reference count reaches zero, a "Remove" message is sent in order to let the host side know that the shared data is not accessible on the client side and that it is safe for it to be deleted. 
The host side responds with a "Delete" message. + + +``` + client side . host side + . + (A) Client: Send Remove -. . + \ . + \ . ... can receive and send ... + \ + Can receive `--> (B) Host: Receive Remove + Can't send | + .-- (C) Host: Send Delete + / + / . ... can't receive nor send ... + / . + (D) Client: Receive Delete <--' . + . + Figure 2) MozSurface deallocation handshake +``` + +This handshake protocol is twofold: + +* It defines where and when it is possible to deallocate the shared data without races +* It makes it impossible for asynchronous messages to race with the destruction of the MozSurface. + +### Deallocating on the host side + +In the common case, the shared data is deallocated asynchronously on the host side. In this case the deallocation takes place at the point (C) of figure 2. + +### Deallocating on the client side + +In some rare cases, for instance if the underlying implementation requires it, the shared data must be deallocated on the client side. In such cases, deallocation happens at the point (D) of figure 2. + +In some exceptional cases, this needs to happen synchronously, meaning that the client-side thread will block until the Delete message is received. This is supported but it is terrible for performance, so it should be avoided as much as possible. +Currently this is needed when shutting down a hardware-decoded video stream with libstagefright on Gonk, because the libstagefright unfortunately assumes it has full ownership over the shared data (gralloc buffers) and crashes if there are still users of the buffers. + +### Sharing state + +The above deallocation protocol of a MozSurface applies to the common case that is when the surface is shared between two processes. A Surface can also be deallocated while it is not shared. 
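
The Remove/Delete handshake of figure 2 can be sketched as a tiny sequential C++ model. This is a sketch under assumptions, not the real IPC code: `ToyHandshake`, `DeallocSide`, and all member names are hypothetical, both sides run in one thread, and the messages are just log entries. It only shows at which step, (C) or (D), the shared data would be freed and which side may still send or receive afterwards.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical: which side is responsible for freeing the shared data.
enum class DeallocSide { Host, Client };

// Toy, single-threaded model of the handshake in figure 2.
struct ToyHandshake {
  explicit ToyHandshake(DeallocSide aSide) : side(aSide) {}

  // Runs steps (A) to (D) in order and returns the label of the step
  // at which the shared data was deallocated.
  char Run() {
    char freedAt = '?';

    // (A) The TextureClient refcount reached zero: the client sends Remove.
    // From now on the client can still receive, but no longer send.
    clientCanSend = false;
    steps.push_back("A: client sends Remove");

    // (B) The host receives Remove; it can still send and receive.
    steps.push_back("B: host receives Remove");

    // (C) The host replies with Delete. In the common case the shared
    // data is deallocated here, on the host side.
    if (side == DeallocSide::Host) {
      sharedDataAlive = false;
      freedAt = 'C';
    }
    hostCanSend = false;
    hostCanReceive = false;
    steps.push_back("C: host sends Delete");

    // (D) The client receives Delete. In the rare client-side case the
    // shared data is deallocated here instead.
    if (side == DeallocSide::Client) {
      sharedDataAlive = false;
      freedAt = 'D';
    }
    steps.push_back("D: client receives Delete");
    return freedAt;
  }

  DeallocSide side;
  bool clientCanSend = true;
  bool hostCanSend = true;
  bool hostCanReceive = true;
  bool sharedDataAlive = true;
  std::vector<std::string> steps;
};
```

Because no message can be sent by either side once step (C) has completed, nothing asynchronous can arrive after the surface is gone, which is the second property the handshake is meant to provide.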
+ +The sharing state of a MozSurface can be one of the following: + +* (1) Uninitialized (it doesn't have any shared data) +* (2) Local (it isn't shared with another thread/process) +* (3) Shared (the state you would expect it to be most of the time) +* (4) Invalid (when for some rare cases we needed to force the deallocation of the shared data before the destruction of the TextureClient object). + +Surfaces can move from state N to state N+1 and be deallocated in any of these states. It could be possible to move from Shared to Local, but we currently don't have a use case for it. + +The deallocation protocol above applies to the Shared state (3). +In the other cases: + +* (1) Uninitialized: There is nothing to do. +* (2) Local: The shared data is deallocated by the client side without need for a handshake, since it is not shared with other threads. +* (4) Invalid: There is nothing to do (deallocation has already happened). + +## Alternative solutions + +### Sending ownership back and forth between the client and host sides through message passing, instead of sharing. + +The current design of MozSurface makes the surface accessible from both sides at the same time, forcing us to do Locking and have a hand shake around deallocating the shared data, while using pure message passing and making the surface accessible only from one side at a time would avoid these complications. + +Using pure message passing was actually the first approach we tried when we created the first version of TextureClient and TextureHost. This strategy failed in several places, partly because of some legacy in Gecko's architecture, and partly because of some of the optimizations we do to avoid copying surfaces. 
+ +We need a given surface to be accessible on both the client and host for the following reasons: + +* Gecko can at any time require read access on the client side to a surface that is shared with the host process, for example to build a temporary layer manager and generate a screenshot. This is mostly a legacy problem. +* We do some copy-on-write optimizations on surfaces that are shared with the compositor in order to keep invalid regions as small as possible. Our tiling implementation is an example of that. +* Our buffer rotation code on scrollable non-tiled layers also requires a synchronization on the client side between the front and back buffers, while the front buffer is used on the host side. + +## Testing + +TODO - How can we make shared MozSurfaces more testable and what should we test? + +## Future work + +### Rename TextureClient/TextureHost + +The current terminology is very confusing. + +### Unify TextureClient and TextureHost + +TextureClient and TextureHost should live under a common interface to better hide the IPC details. The base class should only expose the non-IPC related methods such as Locking, access through a DrawTarget, access to a TextureSource. + +## Comparison with other APIs diff --git a/gfx/doc/Silk.md b/gfx/doc/Silk.md new file mode 100644 index 000000000..c71e79e43 --- /dev/null +++ b/gfx/doc/Silk.md @@ -0,0 +1,246 @@ +Silk Architecture Overview +================= + +#Architecture +Our current architecture is to align three components to hardware vsync timers: + +1. Compositor +2. RefreshDriver / Painting +3. Input Events + +The flow of our rendering engine is as follows: + +1. Hardware Vsync event occurs on an OS specific *Hardware Vsync Thread* on a per monitor basis. +2. The *Hardware Vsync Thread* attached to the monitor notifies the **CompositorVsyncDispatchers** and **RefreshTimerVsyncDispatcher**. +3. For every Firefox window on the specific monitor, notify a **CompositorVsyncDispatcher**. 
The **CompositorVsyncDispatcher** is specific to one window. +4. The **CompositorVsyncDispatcher** notifies a **CompositorWidgetVsyncObserver** when remote compositing, or a **CompositorVsyncScheduler::Observer** when compositing in-process. +5. If remote compositing, a vsync notification is sent from the **CompositorWidgetVsyncObserver** to the **VsyncBridgeChild** on the UI process, which sends an IPDL message to the **VsyncBridgeParent** on the compositor thread of the GPU process, which then dispatches to **CompositorVsyncScheduler::Observer**. +6. The **RefreshTimerVsyncDispatcher** notifies the Chrome **RefreshTimer** that a vsync has occurred. +7. The **RefreshTimerVsyncDispatcher** sends IPC messages to all content processes to tick their respective active **RefreshTimer**. +8. The **Compositor** dispatches input events on the *Compositor Thread*, then composites. Input events are only dispatched on the *Compositor Thread* on b2g. +9. The **RefreshDriver** paints on the *Main Thread*. + +The implementation is broken into the following sections and will reference this figure. Note that **Objects** are in bold while *Threads* are italicized. + +<img src="silkArchitecture.png" width="900px" height="630px" /> + +#Hardware Vsync +Hardware vsync events from (1) occur on a specific **Display** object. +The **Display** object is responsible for enabling / disabling vsync on a per connected display basis. +For example, if two monitors are connected, two **Display** objects will be created, each listening to vsync events for their respective displays. +We require one **Display** object per monitor as each monitor may have different vsync rates. +As a fallback solution, we have one global **Display** object that can synchronize across all connected displays. +The global **Display** is useful if a window is positioned halfway between the two monitors. +Each platform will have to implement a specific **Display** object to hook and listen to vsync events.
+As of this writing, both Firefox OS and OS X create their own hardware specific *Hardware Vsync Thread* that executes after a vsync has occurred. +OS X creates one *Hardware Vsync Thread* per **CVDisplayLinkRef**. +We do not currently support multiple displays, so we use one global **CVDisplayLinkRef** that works across all active displays. +On Windows, we have to create a new platform *thread* that waits for DwmFlush(), which works across all active displays. +Once the thread wakes up from DwmFlush(), the actual vsync timestamp is retrieved from DwmGetCompositionTimingInfo(), which is the timestamp that is actually passed into the compositor and refresh driver. + +When a vsync occurs on a **Display**, the *Hardware Vsync Thread* callback fetches all **CompositorVsyncDispatchers** associated with the **Display**. +Each **CompositorVsyncDispatcher** is notified that a vsync has occurred with the vsync's timestamp. +It is the responsibility of the **CompositorVsyncDispatcher** to notify the **Compositor** that is awaiting vsync notifications. +The **Display** will then notify the associated **RefreshTimerVsyncDispatcher**, which should notify all active **RefreshDrivers** to tick. + +All **Display** objects are encapsulated in a **VsyncSource** object. +The **VsyncSource** object lives in **gfxPlatform** and is instantiated only on the parent process when **gfxPlatform** is created. +The **VsyncSource** is destroyed when **gfxPlatform** is destroyed. +There is only one **VsyncSource** object throughout the entire lifetime of Firefox. +Each platform is expected to implement its own **VsyncSource** to manage vsync events. +On Firefox OS, this is through the **HwcComposer2D**. +On OS X, this is through **CVDisplayLinkRef**. +On Windows, it should be through **DwmGetCompositionTimingInfo**.
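The notification order described in this section (every per-window compositor dispatcher first, then the refresh timer dispatcher) can be illustrated with a minimal, single-threaded sketch. All names below (`VsyncObserverSketch`, `RecordingDispatcher`, `DisplaySketch`, `OnHardwareVsync`) are hypothetical stand-ins invented for illustration, not the actual Gecko classes or signatures, and a plain integer replaces the real timestamp type.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a vsync timestamp.
using VsyncTimestamp = std::uint64_t;

// Hypothetical interface for anything a Display notifies on vsync.
class VsyncObserverSketch {
 public:
  virtual ~VsyncObserverSketch() = default;
  virtual void NotifyVsync(VsyncTimestamp aTimestamp) = 0;
};

// Test double that records each notification as "name@timestamp".
class RecordingDispatcher : public VsyncObserverSketch {
 public:
  RecordingDispatcher(std::string aName, std::vector<std::string>* aLog)
      : mName(std::move(aName)), mLog(aLog) {}

  void NotifyVsync(VsyncTimestamp aTimestamp) override {
    mLog->push_back(mName + "@" + std::to_string(aTimestamp));
  }

 private:
  std::string mName;
  std::vector<std::string>* mLog;
};

// Sketch of one Display: many compositor dispatchers (one per window)
// and a single refresh timer dispatcher.
class DisplaySketch {
 public:
  void AddCompositorDispatcher(std::shared_ptr<VsyncObserverSketch> aDispatcher) {
    assert(aDispatcher);
    mCompositorDispatchers.push_back(std::move(aDispatcher));
  }

  void SetRefreshTimerDispatcher(std::shared_ptr<VsyncObserverSketch> aDispatcher) {
    mRefreshTimerDispatcher = std::move(aDispatcher);
  }

  // In the design described above this would run on the hardware vsync thread.
  void OnHardwareVsync(VsyncTimestamp aTimestamp) {
    for (auto& dispatcher : mCompositorDispatchers) {
      dispatcher->NotifyVsync(aTimestamp);
    }
    if (mRefreshTimerDispatcher) {
      mRefreshTimerDispatcher->NotifyVsync(aTimestamp);
    }
  }

 private:
  std::vector<std::shared_ptr<VsyncObserverSketch>> mCompositorDispatchers;
  std::shared_ptr<VsyncObserverSketch> mRefreshTimerDispatcher;
};
```

The sketch omits threading and enabling / disabling of hardware vsync entirely; it only shows the fan-out from one **Display** to its dispatchers.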
+ +#Compositor +When the **CompositorVsyncDispatcher** is notified of the vsync event, the **CompositorVsyncScheduler::Observer** associated with the **CompositorVsyncDispatcher** begins execution. +Since the **CompositorVsyncDispatcher** executes on the *Hardware Vsync Thread* and the **Compositor** composites on the *CompositorThread*, the **CompositorVsyncScheduler::Observer** posts a task to the *CompositorThread*. +The **CompositorBridgeParent** then composites. +The model where the **CompositorVsyncDispatcher** notifies components on the *Hardware Vsync Thread*, and the component schedules the task on the appropriate thread is used everywhere. + +The **CompositorVsyncScheduler::Observer** listens to vsync events as needed and stops listening to vsync when composites are no longer scheduled or required. +Every **CompositorBridgeParent** is associated and tied to one **CompositorVsyncScheduler::Observer**, which is associated with the **CompositorVsyncDispatcher**. +Each **CompositorBridgeParent** is associated with one widget and is created when a new platform window or **nsBaseWidget** is created. +The **CompositorBridgeParent**, **CompositorVsyncDispatcher**, **CompositorVsyncScheduler::Observer**, and **nsBaseWidget** all have the same lifetimes, which are created and destroyed together. + +##Out-of-process Compositors +When compositing out-of-process, this model changes slightly. +In this case there are effectively two observers: a UI process observer (**CompositorWidgetVsyncObserver**), and the **CompositorVsyncScheduler::Observer** in the GPU process. +There are also two dispatchers: the widget dispatcher in the UI process (**CompositorVsyncDispatcher**), and the IPDL-based dispatcher in the GPU process (**CompositorBridgeParent::NotifyVsync**). +The UI process observer and the GPU process dispatcher are linked via an IPDL protocol called PVsyncBridge. 
+**PVsyncBridge** is a top-level protocol for sending vsync notifications to the compositor thread in the GPU process. +The compositor controls vsync observation through a separate actor, **PCompositorWidget**, which (as a subactor for **CompositorBridgeChild**) links the compositor thread in the GPU process to the main thread in the UI process. + +Out-of-process compositors do not go through **CompositorVsyncDispatcher** directly. +Instead, the **CompositorWidgetDelegate** in the UI process creates one, and gives it a **CompositorWidgetVsyncObserver**. +This observer forwards notifications to a Vsync I/O thread, where **VsyncBridgeChild** then forwards the notification again to the compositor thread in the GPU process. +The notification is received by a **VsyncBridgeParent**. +The GPU process uses the layers ID in the notification to find the correct compositor to dispatch the notification to. + +###CompositorVsyncDispatcher +The **CompositorVsyncDispatcher** executes on the *Hardware Vsync Thread*. +It contains references to the **nsBaseWidget** it is associated with and has a lifetime equal to the **nsBaseWidget**. +The **CompositorVsyncDispatcher** is responsible for notifying the **CompositorBridgeParent** that a vsync event has occurred. +There can be multiple **CompositorVsyncDispatchers** per **Display**, one **CompositorVsyncDispatcher** per window. +The only responsibility of the **CompositorVsyncDispatcher** is to notify components when a vsync event has occurred, and to stop listening to vsync when no components require vsync events. +We require one **CompositorVsyncDispatcher** per window so that we can handle multiple **Displays**. +When compositing in-process, the **CompositorVsyncDispatcher** is attached to the CompositorWidget for the +window. When out-of-process, it is attached to the CompositorWidgetDelegate, which forwards +observer notifications over IPDL. In the latter case, its lifetime is tied to a CompositorSession +rather than the nsIWidget.
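As a rough illustration of the two responsibilities listed above (forward vsync to whoever is observing, and stop listening when nobody is), here is a small sketch. The class and method names (`CompositorObserverSketch`, `CountingObserver`, `CompositorVsyncDispatcherSketch`, `IsListeningToVsync`) are assumptions made up for this example and do not mirror the real Gecko API; the mutex reflects only the general point that notification and observer changes may happen on different threads.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical observer interface, standing in for the compositor side.
class CompositorObserverSketch {
 public:
  virtual ~CompositorObserverSketch() = default;
  virtual void NotifyVsync(std::uint64_t aTimestamp) = 0;
};

// Test double that counts notifications and remembers the last timestamp.
class CountingObserver : public CompositorObserverSketch {
 public:
  void NotifyVsync(std::uint64_t aTimestamp) override {
    ++mCount;
    mLastTimestamp = aTimestamp;
  }
  int mCount = 0;
  std::uint64_t mLastTimestamp = 0;
};

// Sketch of a per-window dispatcher: it forwards vsync to its observer and
// only "listens" to vsync while an observer is registered.
class CompositorVsyncDispatcherSketch {
 public:
  // Passing nullptr unregisters the observer and stops listening.
  void SetObserver(std::shared_ptr<CompositorObserverSketch> aObserver) {
    std::lock_guard<std::mutex> lock(mMutex);
    mObserver = std::move(aObserver);
    mListening = (mObserver != nullptr);
  }

  // Would be called on the hardware vsync thread in the described design.
  void NotifyVsync(std::uint64_t aTimestamp) {
    std::shared_ptr<CompositorObserverSketch> observer;
    {
      // Copy the reference under the lock, then call out without holding it.
      std::lock_guard<std::mutex> lock(mMutex);
      observer = mObserver;
    }
    if (observer) {
      observer->NotifyVsync(aTimestamp);
    }
  }

  bool IsListeningToVsync() const {
    std::lock_guard<std::mutex> lock(mMutex);
    return mListening;
  }

 private:
  mutable std::mutex mMutex;
  std::shared_ptr<CompositorObserverSketch> mObserver;
  bool mListening = false;
};
```

In this sketch a vsync that arrives after the observer has been cleared is simply dropped, which matches the stated goal of not delivering notifications to components that no longer need them.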
+ +###Multiple Displays +The **VsyncSource** has an API to switch a **CompositorVsyncDispatcher** from one **Display** to another **Display**. +For example, when one window either goes into full screen mode or moves from one connected monitor to another. +When one window moves to another monitor, we expect a platform specific notification to occur. +The detection of when a window enters full screen mode or moves is not covered by Silk itself, but the framework is built to support this use case. +The expected flow is that the OS notification occurs on **nsIWidget**, which retrieves the associated **CompositorVsyncDispatcher**. +The **CompositorVsyncDispatcher** then notifies the **VsyncSource** to switch to the correct **Display** the **CompositorVsyncDispatcher** is connected to. +Because the notification works through the **nsIWidget**, the actual switching of the **CompositorVsyncDispatcher** to the correct **Display** should occur on the *Main Thread*. +The current implementation of Silk does not handle this case and needs to be built out. + +###CompositorVsyncScheduler::Observer +The **CompositorVsyncScheduler::Observer** handles the vsync notifications and interactions with the **CompositorVsyncDispatcher**. +When the **Compositor** requires a scheduled composite, it notifies the **CompositorVsyncScheduler::Observer** that it needs to listen to vsync. +The **CompositorVsyncScheduler::Observer** then observes / unobserves vsync as needed from the **CompositorVsyncDispatcher** to enable composites. + +###GeckoTouchDispatcher +The **GeckoTouchDispatcher** is a singleton that resamples touch events to smooth out jank while tracking a user's finger. +Because input and composite are linked together, the **CompositorVsyncScheduler::Observer** has a reference to the **GeckoTouchDispatcher** and vice versa. + +###Input Events +One large goal of Silk is to align touch events with vsync events. 
+On Firefox OS, touchscreens often have different touch scan rates than the display refreshes. +A Flame device has a touch refresh rate of 75 HZ, while a Nexus 4 has a touch refresh rate of 100 HZ, while the device's display refresh rate is 60HZ. +When a vsync event occurs, we resample touch events, and then dispatch the resampled touch event to APZ. +Touch events on Firefox OS occur on a *Touch Input Thread* whereas they are processed by APZ on the *APZ Controller Thread*. +We use [Google Android's touch resampling](http://www.masonchang.com/blog/2014/8/25/androids-touch-resampling-algorithm) algorithm to resample touch events. + +Currently, we have a strict ordering between Composites and touch events. +When a touch event occurs on the *Touch Input Thread*, we store the touch event in a queue. +When a vsync event occurs, the **CompositorVsyncDispatcher** notifies the **Compositor** of a vsync event, which notifies the **GeckoTouchDispatcher**. +The **GeckoTouchDispatcher** processes the touch event first on the *APZ Controller Thread*, which is the same as the *Compositor Thread* on b2g, then the **Compositor** finishes compositing. +We require this strict ordering because if a vsync notification is dispatched to both the **Compositor** and **GeckoTouchDispatcher** at the same time, a race condition occurs between processing the touch event and therefore position versus compositing. +In practice, this creates very janky scrolling. +As of this writing, we have not analyzed input events on desktop platforms. + +One slight quirk is that input events can start a composite, for example during a scroll and after the **Compositor** is no longer listening to vsync events. +In these cases, we notify the **Compositor** to observe vsync so that it dispatches touch events. +If touch events were not dispatched, and since the **Compositor** is not listening to vsync events, the touch events would never be dispatched. 
+The **GeckoTouchDispatcher** handles this case by always forcing the **Compositor** to listen to vsync events while touch events are occurring. + +###Widget, Compositor, CompositorVsyncDispatcher, GeckoTouchDispatcher Shutdown Procedure +When the [nsBaseWidget shuts down](http://hg.mozilla.org/mozilla-central/file/0df249a0e4d3/widget/nsBaseWidget.cpp#l182) - It calls nsBaseWidget::DestroyCompositor on the *Gecko Main Thread*. +During nsBaseWidget::DestroyCompositor, it first destroys the CompositorBridgeChild. +CompositorBridgeChild sends a sync IPC call to CompositorBridgeParent::RecvStop, which calls [CompositorBridgeParent::Destroy](http://hg.mozilla.org/mozilla-central/file/ab0490972e1e/gfx/layers/ipc/CompositorBridgeParent.cpp#l509). +During this time, the *main thread* is blocked on the parent process. +CompositorBridgeParent::RecvStop runs on the *Compositor thread* and cleans up some resources, including setting the **CompositorVsyncScheduler::Observer** to nullptr. +CompositorBridgeParent::RecvStop also explicitly keeps the CompositorBridgeParent alive and posts another task to run CompositorBridgeParent::DeferredDestroy on the Compositor loop so that all ipdl code can finish executing. +The **CompositorVsyncScheduler::Observer** also unobserves from vsync and cancels any pending composite tasks. +Once CompositorBridgeParent::RecvStop finishes, the *main thread* in the parent process continues shutting down the nsBaseWidget. + +At the same time, the *Compositor thread* is executing tasks until CompositorBridgeParent::DeferredDestroy runs, which flushes the compositor message loop. +Now we have two tasks as both the nsBaseWidget releases a reference to the Compositor on the *main thread* during destruction and the CompositorBridgeParent::DeferredDestroy releases a reference to the CompositorBridgeParent on the *Compositor Thread*. 
+Finally, the CompositorBridgeParent itself is destroyed on the *main thread* once both references are gone due to explicit [main thread destruction](http://hg.mozilla.org/mozilla-central/file/50b95032152c/gfx/layers/ipc/CompositorBridgeParent.h#l148). + +With the **CompositorVsyncScheduler::Observer**, any accesses to the widget after nsBaseWidget::DestroyCompositor executes are invalid. +Any accesses to the compositor between the time the nsBaseWidget::DestroyCompositor runs and the CompositorVsyncScheduler::Observer's destructor runs aren't safe yet a hardware vsync event could occur between these times. +Since any tasks posted on the Compositor loop after CompositorBridgeParent::DeferredDestroy is posted are invalid, we make sure that no vsync tasks can be posted once CompositorBridgeParent::RecvStop executes and DeferredDestroy is posted on the Compositor thread. +When the sync call to CompositorBridgeParent::RecvStop executes, we explicitly set the CompositorVsyncScheduler::Observer to null to prevent vsync notifications from occurring. +If vsync notifications were allowed to occur, since the **CompositorVsyncScheduler::Observer**'s vsync notification executes on the *hardware vsync thread*, it would post a task to the Compositor loop and may execute after CompositorBridgeParent::DeferredDestroy. +Thus, we explicitly shut down vsync events in the **CompositorVsyncDispatcher** and **CompositorVsyncScheduler::Observer** during nsBaseWidget::Shutdown to prevent any vsync tasks from executing after CompositorBridgeParent::DeferredDestroy. + +The **CompositorVsyncDispatcher** may be destroyed on either the *main thread* or *Compositor Thread*, since both the nsBaseWidget and **CompositorVsyncScheduler::Observer** race to destroy on different threads. +nsBaseWidget is destroyed on the *main thread* and releases a reference to the **CompositorVsyncDispatcher** during destruction. 
+The **CompositorVsyncScheduler::Observer** has a race to be destroyed either during CompositorBridgeParent shutdown or from the **GeckoTouchDispatcher** which is destroyed on the main thread with [ClearOnShutdown](http://hg.mozilla.org/mozilla-central/file/21567e9a6e40/xpcom/base/ClearOnShutdown.h#l15). +Whichever object, the CompositorBridgeParent or the **GeckoTouchDispatcher** is destroyed last will hold the last reference to the **CompositorVsyncDispatcher**, which destroys the object. + +#Refresh Driver +The Refresh Driver is ticked from a [single active timer](http://hg.mozilla.org/mozilla-central/file/ab0490972e1e/layout/base/nsRefreshDriver.cpp#l11). +The assumption is that there are multiple **RefreshDrivers** connected to a single **RefreshTimer**. +There are two **RefreshTimers**: an active and an inactive **RefreshTimer**. +Each Tab has its own **RefreshDriver**, which connects to one of the global **RefreshTimers**. +The **RefreshTimers** execute on the *Main Thread* and tick their connected **RefreshDrivers**. +We do not want to break this model of multiple **RefreshDrivers** per a set of two global **RefreshTimers**. +Each **RefreshDriver** switches between the active and inactive **RefreshTimer**. + +Instead, we create a new **RefreshTimer**, the **VsyncRefreshTimer** which ticks based on vsync messages. +We replace the current active timer with a **VsyncRefreshTimer**. +All tabs will then tick based on this new active timer. +Since the **RefreshTimer** has a lifetime of the process, we only need to create a single **RefreshTimerVsyncDispatcher** per **Display** when Firefox starts. +Even if we do not have any content processes, the Chrome process will still need a **VsyncRefreshTimer**, thus we can associate the **RefreshTimerVsyncDispatcher** with each **Display**. + +When Firefox starts, we initially create a new **VsyncRefreshTimer** in the Chrome process. 
+The **VsyncRefreshTimer** will listen to vsync notifications from **RefreshTimerVsyncDispatcher** on the global **Display**. +When nsRefreshDriver::Shutdown executes, it will delete the **VsyncRefreshTimer**. +This creates a problem as all the **RefreshTimers** are currently manually memory managed whereas **VsyncObservers** are ref counted. +To work around this problem, we create a new **RefreshDriverVsyncObserver** as an inner class to **VsyncRefreshTimer**, which actually receives vsync notifications. It then ticks the **RefreshDrivers** inside **VsyncRefreshTimer**. + +With Content processes, the startup process is more complicated. +We send vsync IPC messages via the PBackground thread on the parent process, which allows us to send messages from the parent process without waiting on the *main thread*. +This sends messages from the Parent::*PBackground Thread* to the Child::*Main Thread*. +The *main thread* receiving IPC messages on the content process is acceptable because **RefreshDrivers** must execute on the *main thread*. +However, there is some amount of time required to set up the IPC connection upon process creation and during this time, the **RefreshDrivers** must tick to set up the process. +To get around this, we initially use software **RefreshTimers** that already exist during content process startup and swap in the **VsyncRefreshTimer** once the IPC connection is created. + +During nsRefreshDriver::ChooseTimer, we create an async PBackground IPC open request to create a **VsyncParent** and **VsyncChild**. +At the same time, we create a software **RefreshTimer** and tick the **RefreshDrivers** as normal. +Once the PBackground callback is executed and an IPC connection exists, we take all **RefreshDrivers** currently associated with the active **RefreshTimer** and switch them to use the **VsyncRefreshTimer**. +Since all interactions on the content process occur on the main thread, there is no need for locks.
+The **VsyncParent** listens to vsync events through the **RefreshTimerVsyncDispatcher** on the parent side and sends vsync IPC messages to the **VsyncChild**. +The **VsyncChild** notifies the **VsyncRefreshTimer** on the content process. + +During the shutdown process of the content process, ActorDestroy is called on the **VsyncChild** and **VsyncParent** due to the normal PBackground shutdown process. +Once ActorDestroy is called, no IPC messages should be sent across the channel. +After ActorDestroy is called, the IPDL machinery will delete the **VsyncParent/Child** pair. +The **VsyncParent**, due to being a **VsyncObserver**, is ref counted. +After **VsyncParent::ActorDestroy** is called, it unregisters itself from the **RefreshTimerVsyncDispatcher**, which holds the last reference to the **VsyncParent**, and the object will be deleted. + +Thus the overall flow during normal execution is: + +1. VsyncSource::Display::RefreshTimerVsyncDispatcher receives a Vsync notification from the OS in the parent process. +2. RefreshTimerVsyncDispatcher notifies VsyncRefreshTimer::RefreshDriverVsyncObserver that a vsync occurred on the parent process on the hardware vsync thread. +3. RefreshTimerVsyncDispatcher notifies the VsyncParent on the hardware vsync thread that a vsync occurred. +4. The VsyncRefreshTimer::RefreshDriverVsyncObserver in the parent process posts a task to the main thread that ticks the refresh drivers. +5. VsyncParent posts a task to the PBackground thread to send a vsync IPC message to VsyncChild. +6. Each VsyncChild receives a vsync notification on the main thread of its content process and ticks its respective RefreshDrivers. + +###Compressing Vsync Messages +Vsync messages occur quite often and the *main thread* can be busy for long periods of time due to JavaScript. +Consistently sending vsync messages to the refresh driver timer can flood the *main thread* with refresh driver ticks, causing even more delays.
+To avoid this problem, we compress vsync messages on both the parent and child processes. + +On the parent process, newer vsync messages update a vsync timestamp but do not actually queue any tasks on the *main thread*. +Once the parent process' *main thread* executes the refresh driver tick, it uses the most recent vsync timestamp to tick the refresh driver. +After the refresh driver has ticked, one single vsync message is queued for another refresh driver tick task. +On the content process, the IPDL **compress** keyword automatically compresses IPC messages. + +### Multiple Monitors +In order to have multiple monitor support for the **RefreshDrivers**, we have multiple active **RefreshTimers**. +Each **RefreshTimer** is associated with a specific **Display** via an id and ticks when its respective **Display** vsync occurs. +We have **N RefreshTimers**, where N is the number of connected displays. +Each **RefreshTimer** still has multiple **RefreshDrivers**. + +When a tab or window changes monitors, the **nsIWidget** receives a display changed notification. +Based on which display the window is on, the window switches to the correct **RefreshTimerVsyncDispatcher** and **CompositorVsyncDispatcher** on the parent process based on the display id. +Each **TabParent** should also send a notification to their child. +Each **TabChild**, given the display ID, switches to the correct **RefreshTimer** associated with the display ID. +When each display vsync occurs, it sends one IPC message to notify vsync. +The vsync message contains a display ID, to tick the appropriate **RefreshTimer** on the content process. +There is still only one **VsyncParent/VsyncChild** pair, but each vsync notification includes a display ID, which maps to the correct **RefreshTimer**. + +#Object Lifetime +1. CompositorVsyncDispatcher - Lives as long as the nsBaseWidget associated with the VsyncDispatcher +2.
CompositorVsyncScheduler::Observer - Lives and dies at the same time as the CompositorBridgeParent. +3. RefreshTimerVsyncDispatcher - Lives as long as the associated display object, which is the lifetime of Firefox. +4. VsyncSource - Lives as long as the gfxPlatform on the chrome process, which is the lifetime of Firefox. +5. VsyncParent/VsyncChild - Lives as long as the content process +6. RefreshTimer - Lives as long as the process + +#Threads +All **VsyncObservers** are notified on the *Hardware Vsync Thread*. It is the responsibility of the **VsyncObservers** to post tasks to their respective correct thread. For example, the **CompositorVsyncScheduler::Observer** will be notified on the *Hardware Vsync Thread*, and post a task to the *Compositor Thread* to do the actual composition. + +1. Compositor Thread - Nothing changes +2. Main Thread - PVsyncChild receives IPC messages on the main thread. We also enable/disable vsync on the main thread. +3. PBackground Thread - Creates a connection from the PBackground thread on the parent process to the main thread in the content process. +4. Hardware Vsync Thread - Every platform is different, but we always have the concept of a hardware vsync thread. Sometimes this is actually created by the host OS. On Windows, we have to create a separate platform thread that blocks on DwmFlush(). diff --git a/gfx/doc/silkArchitecture.png b/gfx/doc/silkArchitecture.png Binary files differnew file mode 100644 index 000000000..938c585e4 --- /dev/null +++ b/gfx/doc/silkArchitecture.png |