Diffstat (limited to 'toolkit/components/telemetry/docs/concepts')
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/archiving.rst | 12 |
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/crashes.rst | 23 |
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/index.rst | 23 |
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/pings.rst | 32 |
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/sessions.rst | 40 |
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/submission.rst | 34 |
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/subsession_triggers.png | bin | 0 -> 1219295 bytes |
7 files changed, 164 insertions, 0 deletions
diff --git a/toolkit/components/telemetry/docs/concepts/archiving.rst b/toolkit/components/telemetry/docs/concepts/archiving.rst
new file mode 100644
index 000000000..a2c57de43
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/archiving.rst
@@ -0,0 +1,12 @@
+=========
+Archiving
+=========
+
+When archiving is enabled through the relevant pref (``toolkit.telemetry.archive.enabled``), pings submitted to ``TelemetryController`` are also stored locally in the user profile directory, in ``<profile-dir>/datareporting/archived``.
+
+To allow for cheaper lookup of archived pings, storage follows a specific naming scheme for both the directory and the ping file name: `<YYYY-MM>/<timestamp>.<UUID>.<type>.jsonlz4`.
+
+* ``<YYYY-MM>`` - The subdirectory name, generated from the ping creation date.
+* ``<timestamp>`` - Timestamp of the ping creation date.
+* ``<UUID>`` - The ping identifier.
+* ``<type>`` - The ping type.
diff --git a/toolkit/components/telemetry/docs/concepts/crashes.rst b/toolkit/components/telemetry/docs/concepts/crashes.rst
new file mode 100644
index 000000000..c9f69a23b
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/crashes.rst
@@ -0,0 +1,23 @@
+=======
+Crashes
+=======
+
+There are many different kinds of crashes for Firefox, there is not a single system used to record all of them.
+
+Main process crashes
+====================
+
+If the Firefox main process dies, that should be recorded as an aborted session. We would submit a :doc:`main ping <../data/main-ping>` with the reason ``aborted-session``.
+If we have a crash dump for that crash, we should also submit a :doc:`crash ping <../data/crash-ping>`.
+
+The ``aborted-session`` information is first written to disk 60 seconds after startup, any earlier crashes will not trigger an ``aborted-session`` ping.
+Also, the ``aborted-session`` is updated at least every 5 minutes, so it may lag behind the last session state.
+
+Crashes during startup should be recorded in the next sessions main ping in the ``STARTUP_CRASH_DETECTED`` histogram.
+
+Child process crashes
+=====================
+
+If a Firefox plugin, content or gmplugin process dies unexpectedly, this is recorded in the main pings ``SUBPROCESS_ABNORMAL_ABORT`` keyed histogram.
+
+If we catch a crash report for this, then additionally the ``SUBPROCESS_CRASHES_WITH_DUMP`` keyed histogram is incremented.
diff --git a/toolkit/components/telemetry/docs/concepts/index.rst b/toolkit/components/telemetry/docs/concepts/index.rst
new file mode 100644
index 000000000..a49466f8d
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/index.rst
@@ -0,0 +1,23 @@
+========
+Concepts
+========
+
+There are common concepts used throughout Telemetry:
+
+* :doc:`pings <pings>` - the packets we use to submit data
+* :doc:`sessions & subsessions <sessions>` - how we slice a users' time in the browser
+* *measurements* - how we :doc:`collect data <../collection/index>`
+* *opt-in* & *opt-out* - the different sets of data we collect
+* :doc:`submission <submission>` - how we send data to the servers
+* :doc:`archiving <archiving>` - retaining ping data locally
+* :doc:`crashes <crashes>` - the different data crashes generate
+
+.. toctree::
+   :maxdepth: 2
+   :titlesonly:
+   :glob:
+   :hidden:
+
+   pings
+   crashes
+   *
diff --git a/toolkit/components/telemetry/docs/concepts/pings.rst b/toolkit/components/telemetry/docs/concepts/pings.rst
new file mode 100644
index 000000000..db7371b32
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/pings.rst
@@ -0,0 +1,32 @@
+.. _telemetry_pings:
+
+=====================
+Telemetry pings
+=====================
+
+A *Telemetry ping* is the data that we send to Mozillas Telemetry servers.
+
+That data is stored as a JSON object client-side and contains common information to all pings and a payload specific to a certain *ping types*.
+
+The top-level structure is defined by the :doc:`common ping format <../data/common-ping>` format.
+It contains:
+
+* some basic information shared between different ping types
+* the :doc:`environment data <../data/environment>` (optional)
+* the data specific to the *ping type*, the *payload*.
+
+Ping types
+==========
+
+We send Telemetry with different ping types. The :doc:`main <../data/main-ping>` ping is the ping that contains the bulk of the Telemetry measurements for Firefox. For more specific use-cases, we send other ping types.
+
+Pings sent from code that ships with Firefox are listed in the :doc:`data documentation <../data/index>`.
+
+Important examples are:
+
+* :doc:`main <../data/main-ping>` - contains the information collected by Telemetry (Histograms, hang stacks, ...)
+* :doc:`saved-session <../data/main-ping>` - has the same format as a main ping, but it contains the *"classic"* Telemetry payload with measurements covering the whole browser session. This is only a separate type to make storage of saved-session easier server-side. This is temporary and will be removed soon.
+* :doc:`crash <../data/crash-ping>` - a ping that is captured and sent after Firefox crashes.
+* ``activation`` - *planned* - sent right after installation or profile creation
+* ``upgrade`` - *planned* - sent right after an upgrade
+* :doc:`deletion <../data/deletion-ping>` - sent when FHR upload is disabled, requesting deletion of the data associated with this user
diff --git a/toolkit/components/telemetry/docs/concepts/sessions.rst b/toolkit/components/telemetry/docs/concepts/sessions.rst
new file mode 100644
index 000000000..088556978
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/sessions.rst
@@ -0,0 +1,40 @@
+========
+Sessions
+========
+
+A *session* is the time from when Firefox starts until it shut down.
+A session can be very long-running. E.g. for Mac users that are used to always put their laptops into sleep-mode, Firefox may run for weeks.
+We slice the sessions into smaller logical units called *subsessions*.
+
+Subsessions
+===========
+
+The first subsession starts when the browser starts. After that, we split the subsession for different reasons:
+
+* ``daily``, when crossing local midnight. This keeps latency acceptable by triggering a ping at least daily for most active users.
+* ``environment-change``, when a change to the *environment* happens. This happens for important changes to the Firefox settings and when addons activate or deactivate.
+
+On a subsession split, a :doc:`main ping <../data/main-ping>` with that reason will be submitted. We store the reason in the pings payload, to see what triggered it.
+
+A session always ends with a subsession with one of two reason:
+
+* ``shutdown``, when the browser was cleanly shut down. To avoid delaying shutdown, we only save this ping to disk and send it at the next opportunity (typically the next browsing session).
+* ``aborted-session``, when the browser crashed. While Firefox is active, we write the current ``main`` ping data to disk every 5 minutes. If the browser crashes, we find this data on disk on the next start and send it with this reason.
+
+.. image:: subsession_triggers.png
+
+Subsession data
+===============
+
+A subsessions data consists of:
+
+* general information: the date the subsession started, how long it lasted, etc.
+* specific measurements: histogram & scalar data, etc.
+
+This has some advantages:
+
+* Latency - Sending a ping with all the data of a subsession immediately after it ends means we get the data from installs faster. For ``main`` pings, we aim to send a ping at least daily by starting a new subsession at local midnight.
+* Correlation - By starting new subsessions when fundamental settings change (i.e. changes to the *environment*), we can correlate a subsessions data better to those settings.
+
+
+
diff --git a/toolkit/components/telemetry/docs/concepts/submission.rst b/toolkit/components/telemetry/docs/concepts/submission.rst
new file mode 100644
index 000000000..165917d40
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/submission.rst
@@ -0,0 +1,34 @@
+==========
+Submission
+==========
+
+*Note:* The server-side behaviour is documented in the `HTTP Edge Server specification <https://wiki.mozilla.org/CloudServices/DataPipeline/HTTPEdgeServerSpecification>`_.
+
+Pings are submitted via a common API on ``TelemetryController``.
+If a ping fails to successfully submit to the server immediately (e.g. because
+of missing internet connection), Telemetry will store it on disk and retry to
+send it until the maximum ping age is exceeded (14 days).
+
+*Note:* the :doc:`main pings <../data/main-ping>` are kept locally even after successful submission to enable the HealthReport and SelfSupport features. They will be deleted after their retention period of 180 days.
+
+Submission logic
+================
+
+Sending of pending pings starts as soon as the delayed startup is finished. They are sent in batches, newest-first, with up
+to 10 persisted pings per batch plus all unpersisted pings.
+The send logic then waits for each batch to complete.
+
+If it succeeds we trigger the next send of a ping batch. This is delayed as needed to only trigger one batch send per minute.
+
+If ping sending encounters an error that means retrying later, a backoff timeout behavior is
+triggered, exponentially increasing the timeout for the next try from 1 minute up to a limit of 120 minutes.
+Any new ping submissions and "idle-daily" events reset this behavior as a safety mechanism and trigger immediate ping sending.
+
+Status codes
+============
+
+The telemetry server team is working towards `the common services status codes <https://wiki.mozilla.org/CloudServices/DataPipeline/HTTPEdgeServerSpecification#Server_Responses>`_, but for now the following logic is sufficient for Telemetry:
+
+* `2XX` - success, don't resubmit
+* `4XX` - there was some problem with the request - the client should not try to resubmit as it would just receive the same response
+* `5XX` - there was a server-side error, the client should try to resubmit later
diff --git a/toolkit/components/telemetry/docs/concepts/subsession_triggers.png b/toolkit/components/telemetry/docs/concepts/subsession_triggers.png
Binary files differ
new file mode 100644
index 000000000..5717b00a9
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/subsession_triggers.png
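
Reader's note on submission.rst: the status-code rules and the retry backoff it describes can be sketched roughly as follows. This is an illustrative Python sketch, not the Firefox implementation. The function names (``should_resubmit``, ``next_backoff``, ``backoff_schedule``) are hypothetical, and the doubling factor is an assumption: the text only says the timeout grows "exponentially" from 1 minute up to 120 minutes. Status codes outside 2XX, 4XX and 5XX are not covered by the text; the sketch simply does not retry them.

```python
MIN_BACKOFF_MINUTES = 1    # from the text: backoff starts at 1 minute
MAX_BACKOFF_MINUTES = 120  # from the text: backoff is capped at 120 minutes


def should_resubmit(status_code: int) -> bool:
    """Return True only for 5XX responses.

    2XX means success and 4XX means the request itself was bad, so
    neither is retried. Other codes are unspecified in the text and
    are treated as "do not retry" here (an assumption).
    """
    return 500 <= status_code <= 599


def next_backoff(current_minutes: int) -> int:
    """Grow the timeout after a failed send, capped at the maximum.

    Doubling is an assumed growth factor; the text only says
    "exponentially increasing".
    """
    return min(current_minutes * 2, MAX_BACKOFF_MINUTES)


def backoff_schedule(consecutive_failures: int) -> list:
    """List the timeouts (in minutes) used for a run of failed sends."""
    schedule = []
    timeout = MIN_BACKOFF_MINUTES
    for _ in range(consecutive_failures):
        schedule.append(timeout)
        timeout = next_backoff(timeout)
    return schedule


if __name__ == "__main__":
    print(backoff_schedule(9))
```

Under the doubling assumption, the cap is reached on the eighth consecutive failure. Per the text, a new ping submission or an "idle-daily" event would reset the timeout back to the start of the schedule and trigger an immediate send.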