Diffstat (limited to 'toolkit/components/telemetry/docs/collection')
5 files changed, 328 insertions, 0 deletions
diff --git a/toolkit/components/telemetry/docs/collection/custom-pings.rst b/toolkit/components/telemetry/docs/collection/custom-pings.rst new file mode 100644 index 000000000..daad87bfe --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/custom-pings.rst @@ -0,0 +1,74 @@ +======================= +Submitting custom pings +======================= + +Custom pings can be submitted from JavaScript using: + +.. code-block:: js + + TelemetryController.submitExternalPing(type, payload, options) + +- ``type`` - a ``string`` that is the type of the ping, limited to ``/^[a-z0-9][a-z0-9-]+[a-z0-9]$/i``. +- ``payload`` - the actual payload data for the ping, has to be a JSON style object. +- ``options`` - optional, an object containing additional options: + - ``addClientId``- whether to add the client id to the ping, defaults to ``false`` + - ``addEnvironment`` - whether to add the environment data to the ping, defaults to ``false`` + - ``overrideEnvironment`` - a JSON style object that overrides the environment data + +``TelemetryController`` will assemble a ping with the passed payload and the specified options. +That ping will be archived locally for use with Shield and inspection in ``about:telemetry``. +If the preferences allow upload of Telemetry pings, the ping will be uploaded at the next opportunity (this is subject to throttling, retry-on-failure, etc.). + +Submission constraints +---------------------- + +When submitting pings on shutdown, they should not be submitted after Telemetry shutdown. +Pings should be submitted at the latest within: + +- the `observer notification <https://developer.mozilla.org/de/docs/Observer_Notifications#Application_shutdown>`_ ``"profile-before-change"`` +- the :ref:`AsyncShutdown phase <AsyncShutdown_phases>` ``sendTelemetry`` + +There are other constraints that can lead to a ping submission getting dropped: + +- invalid ping type strings +- invalid payload types: E.g. strings instead of objects. 
+- oversized payloads: We currently only drop pings >1MB, but targeting sizes of <=10KB is recommended. + +Tools +===== + +Helpful tools for designing new pings include: + +- `gzipServer <https://github.com/mozilla/gzipServer>`_ - a Python script that can run locally and receives and saves Telemetry pings. Making Firefox send to it allows inspecting outgoing pings easily. +- ``about:telemetry`` - allows inspecting submitted pings from the local archive, including all custom ones. + +Designing custom pings +====================== + +In general, creating a new custom ping means you don't benefit automatically from the existing tooling. Further work is needed to make data show up in re:dash or other analysis tools. + +In addition to the `data collection review <https://wiki.mozilla.org/Firefox/Data_Collection>`_, questions to guide a new ping's design are: + +- Submission interval & triggers: + - What events trigger ping submission? + - What interval is the ping submitted in? + - Is there a throttling mechanism? + - What is the desired latency? (submitting "at least daily" still leads to certain latency tails) + - Are pings submitted on a clock schedule? Or based on "time since session start", "time since last ping" etc.? (I.e. will we get sharp spikes in submission volume?) +- Size and volume: + - What’s the size of the submitted payload? + - What's the full ping size including metadata in the pipeline? + - What’s the target population? + - What's the overall estimated volume? +- Dataset: + - Is it opt-out? + - Does it need to be opt-out? + - Does it need to be in a separate ping? (why can’t the data live in probes?) +- Privacy: + - Is there a risk of leaking PII? + - How is that risk mitigated? +- Data contents: + - Does the submitted data answer the posed product questions? + - Does the shape of the data allow answering the questions efficiently? + - Is the data limited to what's needed to answer the questions? + - Does the data use common formats? (i.e. 
can we re-use tooling or analysis know-how) diff --git a/toolkit/components/telemetry/docs/collection/histograms.rst b/toolkit/components/telemetry/docs/collection/histograms.rst new file mode 100644 index 000000000..8d0233dbf --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/histograms.rst @@ -0,0 +1,5 @@ +========== +Histograms +========== + +Recording into histograms is currently documented in `an MDN article <https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Adding_a_new_Telemetry_probe>`_. diff --git a/toolkit/components/telemetry/docs/collection/index.rst b/toolkit/components/telemetry/docs/collection/index.rst new file mode 100644 index 000000000..e4084e62a --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/index.rst @@ -0,0 +1,35 @@ +=============== +Data collection +=============== + +There are different APIs and formats to collect data in Firefox, all suiting different use cases. + +In general, we aim to submit data in a common format where possible. This has several advantages, from common code and tooling to sharing analysis know-how. + +In cases where this isn't possible and more flexibility is needed, we can submit custom pings or consider adding different data formats to existing pings. + +*Note:* Every new data collection must go through a `data collection review <https://wiki.mozilla.org/Firefox/Data_Collection>`_. + +The current data collection possibilities include: + +* :doc:`scalars` allow recording of a single value (string, boolean, a number) +* :doc:`histograms` can efficiently record multiple data points +* ``environment`` data records information about the system and settings a session occurs in +* ``TelemetryLog`` allows collecting ordered event entries (note: this does not have supporting analysis tools) +* :doc:`measuring elapsed time <measuring-time>` +* :doc:`custom pings <custom-pings>` + +.. 
toctree:: + :maxdepth: 2 + :titlesonly: + :hidden: + :glob: + + scalars + histograms + measuring-time + custom-pings + +Browser Usage Telemetry +~~~~~~~~~~~~~~~~~~~~~~~ +For more information, see :ref:`browserusagetelemetry`. diff --git a/toolkit/components/telemetry/docs/collection/measuring-time.rst b/toolkit/components/telemetry/docs/collection/measuring-time.rst new file mode 100644 index 000000000..918c8a85a --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/measuring-time.rst @@ -0,0 +1,74 @@ +====================== +Measuring elapsed time +====================== + +To make it easier to measure how long operations take, we have helpers for both JavaScript and C++. +These helpers record the elapsed time into histograms, so you have to create suitable histograms for them first. + +From JavaScript +=============== +JavaScript can measure elapsed time using `TelemetryStopwatch.jsm <https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetryStopwatch.jsm>`_. + +``TelemetryStopwatch`` is a helper that simplifies recording elapsed time (in milliseconds) into histograms (plain or keyed). + +API: + +.. code-block:: js + + TelemetryStopwatch = { + // Start, cancel & finish recording elapsed time into a histogram. + // |aObject| is optional. If specified, the timer is associated with this + // object, so multiple time measurements can be done concurrently. + start(histogramId, aObject); + cancel(histogramId, aObject); + finish(histogramId, aObject); + // Start, cancel & finish recording elapsed time into a keyed histogram. + // |key| specifies the key to record into. + // |aObject| is optional and used as above. + startKeyed(histogramId, key, aObject); + cancelKeyed(histogramId, key, aObject); + finishKeyed(histogramId, key, aObject); + }; + +Example: + +.. code-block:: js + + TelemetryStopwatch.start("SAMPLE_FILE_LOAD_TIME_MS"); + // ... start loading file. 
+ if (failedToOpenFile) { + // Cancel this if the operation failed early etc. + TelemetryStopwatch.cancel("SAMPLE_FILE_LOAD_TIME_MS"); + return; + } + // ... do more work. + TelemetryStopwatch.finish("SAMPLE_FILE_LOAD_TIME_MS"); + +From C++ +======== + +API: + +.. code-block:: cpp + + // This helper class is the preferred way to record elapsed time. + template<ID id, TimerResolution res = MilliSecond> + class AutoTimer { + // Record into a plain histogram. + explicit AutoTimer(TimeStamp aStart = TimeStamp::Now()); + // Record into a keyed histogram, with key |aKey|. + explicit AutoTimer(const nsCString& aKey, + TimeStamp aStart = TimeStamp::Now()); + }; + + void AccumulateTimeDelta(ID id, TimeStamp start, TimeStamp end = TimeStamp::Now()); + +Example: + +.. code-block:: cpp + + { + Telemetry::AutoTimer<Telemetry::FIND_PLUGINS> telemetry; + // ... scan disk for plugins. + } + // When leaving the scope, AutoTimer's destructor will record the time that passed. diff --git a/toolkit/components/telemetry/docs/collection/scalars.rst b/toolkit/components/telemetry/docs/collection/scalars.rst new file mode 100644 index 000000000..2c48601a4 --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/scalars.rst @@ -0,0 +1,140 @@ +======= +Scalars +======= + +Historically we started to overload our histogram mechanism to also collect scalar data, +such as flag values, counts, labels and others. +The scalar measurement types are the suggested way to collect that kind of scalar data. +We currently only support recording of scalars from the parent process. +The serialized scalar data is submitted with the :doc:`main pings <../data/main-ping>`. + +The API +======= +Scalar probes can be managed either through the `nsITelemetry interface <https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/nsITelemetry.idl>`_ +or the `C++ API <https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/Telemetry.h>`_. 
+ +JS API +------ +Probes in privileged JavaScript code can use the following functions to manipulate scalars: + +.. code-block:: js + + Services.telemetry.scalarAdd(aName, aValue); + Services.telemetry.scalarSet(aName, aValue); + Services.telemetry.scalarSetMaximum(aName, aValue); + + Services.telemetry.keyedScalarAdd(aName, aKey, aValue); + Services.telemetry.keyedScalarSet(aName, aKey, aValue); + Services.telemetry.keyedScalarSetMaximum(aName, aKey, aValue); + +These functions can throw if, for example, an operation is performed on a scalar type that doesn't support it +(e.g. calling scalarSetMaximum on a scalar of the string kind). Please look at the `code documentation <https://dxr.mozilla.org/mozilla-central/search?q=regexp%3ATelemetryScalar%3A%3A(Set%7CAdd)+file%3ATelemetryScalar.cpp&redirect=false>`_ for +additional information. + +C++ API +------- +Probes in native code can use the more convenient helper functions declared in `Telemetry.h <https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/Telemetry.h>`_: + +.. 
code-block:: cpp + + void ScalarAdd(mozilla::Telemetry::ScalarID aId, uint32_t aValue); + void ScalarSet(mozilla::Telemetry::ScalarID aId, uint32_t aValue); + void ScalarSet(mozilla::Telemetry::ScalarID aId, const nsAString& aValue); + void ScalarSet(mozilla::Telemetry::ScalarID aId, bool aValue); + void ScalarSetMaximum(mozilla::Telemetry::ScalarID aId, uint32_t aValue); + + void ScalarAdd(mozilla::Telemetry::ScalarID aId, const nsAString& aKey, uint32_t aValue); + void ScalarSet(mozilla::Telemetry::ScalarID aId, const nsAString& aKey, uint32_t aValue); + void ScalarSet(mozilla::Telemetry::ScalarID aId, const nsAString& aKey, bool aValue); + void ScalarSetMaximum(mozilla::Telemetry::ScalarID aId, const nsAString& aKey, uint32_t aValue); + +The YAML definition file +======================== +Scalar probes are required to be registered, both for validation and transparency reasons, +in the `Scalars.yaml <https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/Scalars.yaml>`_ +definition file. + +The probes in the definition file are represented in a fixed-depth, two-level structure: + +.. code-block:: yaml + + # The following is a group. + a.group.hierarchy: + a_probe_name: + kind: uint + ... + another_probe: + kind: string + ... + ... + group2: + probe: + kind: int + ... + +Group and probe names need to follow a few rules: + +- they cannot exceed 40 characters each; +- group names must be alpha-numeric + ``.``, with no leading/trailing digit or ``.``; +- probe names must be alpha-numeric + ``_``, with no leading/trailing digit or ``_``. + +A probe can be defined as follows: + +.. code-block:: yaml + + a.group.hierarchy: + a_scalar: + bug_numbers: + - 1276190 + description: A nice one-line description. + expires: never + kind: uint + notification_emails: + - telemetry-client-dev@mozilla.com + +Required Fields +--------------- + +- ``bug_numbers``: A list of unsigned integers representing the number of the bugs the probe was introduced in. 
+- ``description``: A single or multi-line string describing what data the probe collects and when it gets collected. +- ``expires``: The version number in which the scalar expires, e.g. "30"; a version number of type "N" and "N.0" is automatically converted to "N.0a1" in order to expire the scalar also in the development channels. A telemetry probe acting on an expired scalar will print a warning into the browser console. For scalars that never expire the value ``never`` can be used. +- ``kind``: A string representing the scalar type. Allowed values are ``uint``, ``string`` and ``boolean``. +- ``notification_emails``: A list of email addresses to notify with alerts of expiring probes. More importantly, these are used by the data steward to verify that the probe is still useful. + +Optional Fields +--------------- + +- ``cpp_guard``: A string that gets inserted as an ``#ifdef`` directive around the automatically generated C++ declaration. This is typically used for platform-specific scalars, e.g. ``ANDROID``. +- ``release_channel_collection``: This can be either ``opt-in`` (default) or ``opt-out``. With the former the scalar is submitted by default on pre-release channels; on the release channel only if the user opted into additional data collection. With the latter the scalar is submitted by default on release and pre-release channels, unless the user opted out. +- ``keyed``: A boolean that determines whether this is a keyed scalar. It defaults to ``False``. + +String type restrictions +------------------------ +To prevent abuses, the content of a string scalar is limited to 50 characters in length. Trying +to set a longer string will result in an error and no string being set. + +Keyed Scalars +------------- +Keyed scalars are collections of one of the available scalar types, indexed by a string key that can contain UTF8 characters and cannot be longer than 70 characters. Keyed scalars can contain up to 100 keys. 
This scalar type is for example useful when you want to break down certain counts by a name, like how often searches happen with which search engine. + +Keyed scalars should only be used if the set of keys is not known beforehand. If the keys are from a known set of strings, other options are preferred if suitable, like categorical histograms or splitting measurements up into separate scalars. + +The processor scripts +===================== +The scalar definition file is processed and checked for correctness at compile time. If it +conforms to the specification, the processor scripts generate two C++ header files, included +by the Telemetry C++ core. + +gen-scalar-data.py +------------------ +This script is called by the build system to generate the ``TelemetryScalarData.h`` C++ header +file out of the scalar definitions. +This header file contains an array holding the scalar names and version strings, in addition +to an array of ``ScalarInfo`` structures representing all the scalars. + +gen-scalar-enum.py +------------------ +This script is called by the build system to generate the ``TelemetryScalarEnums.h`` C++ header +file out of the scalar definitions. +This header file contains an enum class with all the scalar identifiers used to access them +from code through the C++ API.
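
The submission constraints listed in ``custom-pings.rst`` (ping type pattern, object payload, 1MB drop limit) can be checked before calling ``submitExternalPing``. The following is only a minimal sketch under stated assumptions: ``checkPing`` and its constants are hypothetical names that are not part of ``TelemetryController``, "1MB" is read as 1024 * 1024 bytes of UTF-8 encoded JSON, and the real implementation may measure size differently.

```javascript
// Hypothetical pre-flight check for custom pings. Not part of Telemetry itself.

// Ping type pattern quoted from the custom pings documentation.
const PING_TYPE_PATTERN = /^[a-z0-9][a-z0-9-]+[a-z0-9]$/i;

// Assumption: "1MB" means 1024 * 1024 bytes of serialized, UTF-8 encoded JSON.
const MAX_PING_BYTES = 1024 * 1024;

// Returns a list of reasons why a submission would likely be dropped.
// An empty list means none of the documented constraints is violated.
function checkPing(type, payload) {
  const problems = [];

  if (typeof type !== "string" || !PING_TYPE_PATTERN.test(type)) {
    problems.push("invalid ping type");
  }

  // The docs ask for a "JSON style object"; strings, numbers and null fail.
  // Arrays pass this check; whether Telemetry accepts them is not stated.
  if (typeof payload !== "object" || payload === null) {
    problems.push("payload is not an object");
  } else {
    const bytes = new TextEncoder().encode(JSON.stringify(payload)).length;
    if (bytes > MAX_PING_BYTES) {
      problems.push("payload larger than 1MB");
    }
  }

  return problems;
}
```

A caller could run ``checkPing("my-custom-ping", payload)`` in a test and only submit when the returned list is empty. Note that the pattern requires at least three characters, so a two-letter type is rejected.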