summaryrefslogtreecommitdiffstats
path: root/toolkit/components/telemetry/docs/concepts
diff options
context:
space:
mode:
Diffstat (limited to 'toolkit/components/telemetry/docs/concepts')
-rw-r--r--toolkit/components/telemetry/docs/concepts/archiving.rst12
-rw-r--r--toolkit/components/telemetry/docs/concepts/crashes.rst23
-rw-r--r--toolkit/components/telemetry/docs/concepts/index.rst23
-rw-r--r--toolkit/components/telemetry/docs/concepts/pings.rst32
-rw-r--r--toolkit/components/telemetry/docs/concepts/sessions.rst40
-rw-r--r--toolkit/components/telemetry/docs/concepts/submission.rst34
-rw-r--r--toolkit/components/telemetry/docs/concepts/subsession_triggers.pngbin0 -> 1219295 bytes
7 files changed, 164 insertions, 0 deletions
diff --git a/toolkit/components/telemetry/docs/concepts/archiving.rst b/toolkit/components/telemetry/docs/concepts/archiving.rst
new file mode 100644
index 000000000..a2c57de43
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/archiving.rst
@@ -0,0 +1,12 @@
+=========
+Archiving
+=========
+
+When archiving is enabled through the relevant pref (``toolkit.telemetry.archive.enabled``), pings submitted to ``TelemetryController`` are also stored locally in the user profile directory, in ``<profile-dir>/datareporting/archived``.
+
+To allow for cheaper lookup of archived pings, storage follows a specific naming scheme for both the directory and the ping file name: `<YYYY-MM>/<timestamp>.<UUID>.<type>.jsonlz4`.
+
+* ``<YYYY-MM>`` - The subdirectory name, generated from the ping creation date.
+* ``<timestamp>`` - Timestamp of the ping creation date.
+* ``<UUID>`` - The ping identifier.
+* ``<type>`` - The ping type.
diff --git a/toolkit/components/telemetry/docs/concepts/crashes.rst b/toolkit/components/telemetry/docs/concepts/crashes.rst
new file mode 100644
index 000000000..c9f69a23b
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/crashes.rst
@@ -0,0 +1,23 @@
+=======
+Crashes
+=======
+
+There are many different kinds of crashes for Firefox, there is not a single system used to record all of them.
+
+Main process crashes
+====================
+
+If the Firefox main process dies, that should be recorded as an aborted session. We would submit a :doc:`main ping <../data/main-ping>` with the reason ``aborted-session``.
+If we have a crash dump for that crash, we should also submit a :doc:`crash ping <../data/crash-ping>`.
+
+The ``aborted-session`` information is first written to disk 60 seconds after startup, any earlier crashes will not trigger an ``aborted-session`` ping.
+Also, the ``aborted-session`` is updated at least every 5 minutes, so it may lag behind the last session state.
+
+Crashes during startup should be recorded in the next sessions main ping in the ``STARTUP_CRASH_DETECTED`` histogram.
+
+Child process crashes
+=====================
+
+If a Firefox plugin, content or gmplugin process dies unexpectedly, this is recorded in the main pings ``SUBPROCESS_ABNORMAL_ABORT`` keyed histogram.
+
+If we catch a crash report for this, then additionally the ``SUBPROCESS_CRASHES_WITH_DUMP`` keyed histogram is incremented.
diff --git a/toolkit/components/telemetry/docs/concepts/index.rst b/toolkit/components/telemetry/docs/concepts/index.rst
new file mode 100644
index 000000000..a49466f8d
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/index.rst
@@ -0,0 +1,23 @@
+========
+Concepts
+========
+
+There are common concepts used throughout Telemetry:
+
+* :doc:`pings <pings>` - the packets we use to submit data
+* :doc:`sessions & subsessions <sessions>` - how we slice a users' time in the browser
+* *measurements* - how we :doc:`collect data <../collection/index>`
+* *opt-in* & *opt-out* - the different sets of data we collect
+* :doc:`submission <submission>` - how we send data to the servers
+* :doc:`archiving <archiving>` - retaining ping data locally
+* :doc:`crashes <crashes>` - the different data crashes generate
+
+.. toctree::
+ :maxdepth: 2
+ :titlesonly:
+ :glob:
+ :hidden:
+
+ pings
+ crashes
+ *
diff --git a/toolkit/components/telemetry/docs/concepts/pings.rst b/toolkit/components/telemetry/docs/concepts/pings.rst
new file mode 100644
index 000000000..db7371b32
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/pings.rst
@@ -0,0 +1,32 @@
+.. _telemetry_pings:
+
+=====================
+Telemetry pings
+=====================
+
+A *Telemetry ping* is the data that we send to Mozillas Telemetry servers.
+
+That data is stored as a JSON object client-side and contains common information to all pings and a payload specific to a certain *ping types*.
+
+The top-level structure is defined by the :doc:`common ping format <../data/common-ping>` format.
+It contains:
+
+* some basic information shared between different ping types
+* the :doc:`environment data <../data/environment>` (optional)
+* the data specific to the *ping type*, the *payload*.
+
+Ping types
+==========
+
+We send Telemetry with different ping types. The :doc:`main <../data/main-ping>` ping is the ping that contains the bulk of the Telemetry measurements for Firefox. For more specific use-cases, we send other ping types.
+
+Pings sent from code that ships with Firefox are listed in the :doc:`data documentation <../data/index>`.
+
+Important examples are:
+
+* :doc:`main <../data/main-ping>` - contains the information collected by Telemetry (Histograms, hang stacks, ...)
+* :doc:`saved-session <../data/main-ping>` - has the same format as a main ping, but it contains the *"classic"* Telemetry payload with measurements covering the whole browser session. This is only a separate type to make storage of saved-session easier server-side. This is temporary and will be removed soon.
+* :doc:`crash <../data/crash-ping>` - a ping that is captured and sent after Firefox crashes.
+* ``activation`` - *planned* - sent right after installation or profile creation
+* ``upgrade`` - *planned* - sent right after an upgrade
+* :doc:`deletion <../data/deletion-ping>` - sent when FHR upload is disabled, requesting deletion of the data associated with this user
diff --git a/toolkit/components/telemetry/docs/concepts/sessions.rst b/toolkit/components/telemetry/docs/concepts/sessions.rst
new file mode 100644
index 000000000..088556978
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/sessions.rst
@@ -0,0 +1,40 @@
+========
+Sessions
+========
+
+A *session* is the time from when Firefox starts until it shut down.
+A session can be very long-running. E.g. for Mac users that are used to always put their laptops into sleep-mode, Firefox may run for weeks.
+We slice the sessions into smaller logical units called *subsessions*.
+
+Subsessions
+===========
+
+The first subsession starts when the browser starts. After that, we split the subsession for different reasons:
+
+* ``daily``, when crossing local midnight. This keeps latency acceptable by triggering a ping at least daily for most active users.
+* ``environment-change``, when a change to the *environment* happens. This happens for important changes to the Firefox settings and when addons activate or deactivate.
+
+On a subsession split, a :doc:`main ping <../data/main-ping>` with that reason will be submitted. We store the reason in the pings payload, to see what triggered it.
+
+A session always ends with a subsession with one of two reason:
+
+* ``shutdown``, when the browser was cleanly shut down. To avoid delaying shutdown, we only save this ping to disk and send it at the next opportunity (typically the next browsing session).
+* ``aborted-session``, when the browser crashed. While Firefox is active, we write the current ``main`` ping data to disk every 5 minutes. If the browser crashes, we find this data on disk on the next start and send it with this reason.
+
+.. image:: subsession_triggers.png
+
+Subsession data
+===============
+
+A subsessions data consists of:
+
+* general information: the date the subsession started, how long it lasted, etc.
+* specific measurements: histogram & scalar data, etc.
+
+This has some advantages:
+
+* Latency - Sending a ping with all the data of a subsession immediately after it ends means we get the data from installs faster. For ``main`` pings, we aim to send a ping at least daily by starting a new subsession at local midnight.
+* Correlation - By starting new subsessions when fundamental settings change (i.e. changes to the *environment*), we can correlate a subsessions data better to those settings.
+
+
+
diff --git a/toolkit/components/telemetry/docs/concepts/submission.rst b/toolkit/components/telemetry/docs/concepts/submission.rst
new file mode 100644
index 000000000..165917d40
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/submission.rst
@@ -0,0 +1,34 @@
+==========
+Submission
+==========
+
+*Note:* The server-side behaviour is documented in the `HTTP Edge Server specification <https://wiki.mozilla.org/CloudServices/DataPipeline/HTTPEdgeServerSpecification>`_.
+
+Pings are submitted via a common API on ``TelemetryController``.
+If a ping fails to successfully submit to the server immediately (e.g. because
+of missing internet connection), Telemetry will store it on disk and retry to
+send it until the maximum ping age is exceeded (14 days).
+
+*Note:* the :doc:`main pings <../data/main-ping>` are kept locally even after successful submission to enable the HealthReport and SelfSupport features. They will be deleted after their retention period of 180 days.
+
+Submission logic
+================
+
+Sending of pending pings starts as soon as the delayed startup is finished. They are sent in batches, newest-first, with up
+to 10 persisted pings per batch plus all unpersisted pings.
+The send logic then waits for each batch to complete.
+
+If it succeeds we trigger the next send of a ping batch. This is delayed as needed to only trigger one batch send per minute.
+
+If ping sending encounters an error that means retrying later, a backoff timeout behavior is
+triggered, exponentially increasing the timeout for the next try from 1 minute up to a limit of 120 minutes.
+Any new ping submissions and "idle-daily" events reset this behavior as a safety mechanism and trigger immediate ping sending.
+
+Status codes
+============
+
+The telemetry server team is working towards `the common services status codes <https://wiki.mozilla.org/CloudServices/DataPipeline/HTTPEdgeServerSpecification#Server_Responses>`_, but for now the following logic is sufficient for Telemetry:
+
+* `2XX` - success, don't resubmit
+* `4XX` - there was some problem with the request - the client should not try to resubmit as it would just receive the same response
+* `5XX` - there was a server-side error, the client should try to resubmit later
diff --git a/toolkit/components/telemetry/docs/concepts/subsession_triggers.png b/toolkit/components/telemetry/docs/concepts/subsession_triggers.png
new file mode 100644
index 000000000..5717b00a9
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/subsession_triggers.png
Binary files differ