Hypothesis owner: @apaskulin
Hypothesis owner delegate: @VirginiaPoundstone
Start date: Jul 8, 2024
Target completion date: Nov 1, 2024
Actual completion date: Nov 1, 2024
Hypothesis ID: SDS2.1.5
Hypothesis
If we design a documentation system that guides the experience of users building instrumentation using the Metrics Platform, we will enable those users to independently create instrumentation without direct support from Data Products teams, except in edge cases.
Scope
In scope:
- Creating an instrument
- Data contract and schema design docs
- Creating a place in the documentation system to add docs for to-be-developed features
- Client library documentation
- Template instrument docs standard
Out of scope:
- Documentation for concepts that are core to role-based expertise such as the basics of experiment design and data analysis
- Leading documentation for all features added during the fiscal year
- Maintainer docs for MP core and clients
- Migrating an existing instrument will be considered an edge case, and users will be directed to consult Data Platform
Ownership
Following the project's completion, the Data Products team will own all documentation created by this project as part of their overall ownership of the Metrics Platform. After the project, Tech Docs will be available for reviews and help through our regular support channels.
Project plan
- Phase 1: Research and planning
- Research existing documentation and create project plan
- T370335: Update guide to creating an instrument with Metrics Platform
- Phase 2: Pre-MPIC updates and organization
- T371562: Create a consolidated intro for Metrics Platform
- T372686: Create an interim process for documenting Metrics Platform instruments
- T372685: De-duplicate Metrics Platform stream configuration docs
- T372683: Combine Metrics Platform getting started and setup guides
- T372682: De-duplicate Metrics Platform API docs
- T372681: Update guide to creating a custom schema for Metrics Platform
- T372679: Create measurement plan template and update instrumentation spec template
- Phase 3: Post-MPIC updates and organization
- T372690: Design information architecture for Metrics Platform docs
- T372680: Investigate schema visualization tools for schema.wikimedia.org
- T374742: Update Metrics Platform docs on required contextual attributes
- T376434: Consolidate analytical sampling docs
- T376489: Split Metrics Platform API docs by language
- T372688: Add documentation links to the MPIC form
- T378219: Split instrument guide
- Phase 4: Cleanup
- T370363: Document when to use Event Platform vs Metrics Platform for instrumentation
- T372689: Follow up on duplicated and unused pages for Metrics Platform
- T372687: Replace the click example in the Metrics Platform instrument guide
- T378220: Create a Metrics Platform glossary
- T378223: Redirect Metrics Platform wiki pages
- T378222: Update links to Metrics Platform docs in client code
- T378224: Create a guide to maintaining the Metrics Platform docs
Hypothesis result
This hypothesis was a success! In September 2024, the Growth team used the updated docs to create an instrument. Growth engineer Sergio Gimeno Saldaña stated that the docs were very helpful and that Growth was able to set up their instrument without asking Data Products questions about documented functionality. This fulfills the original criteria of the hypothesis: to design documentation that enables users to independently create instrumentation without direct support from Data Products teams, except in edge cases. The Data Products team will continue to evaluate the effectiveness of the docs and iterate on them as more teams build instrumentation with the Metrics Platform.
Outcomes
- Documented end-to-end workflows: I created a complete guide to creating an instrument with Metrics Platform from start to finish, including processes that are required but fall outside Metrics Platform itself (like following the Data Collection Guidelines). Uncovering these hidden requirements allows users to feel confident creating instruments without direct support from Data Products.
- Created modular docs to support UI tools: The ultimate goal of Metrics Platform is to provide a tool that allows product managers to create instrumentation with minimal technical complexity. To support this, I designed a modular documentation structure, with pages covering specific topics that can be linked to from these UIs to give users additional guidance as they complete tasks. I then added these links to the current MPIC UI to improve the clarity and usability of the tool, which were critical issues identified in the MPIC UX study (SDS2.1.4).
- Reduced duplication: When I started working on the Metrics Platform docs, the most critical issue was duplication. This is a very common problem across all docs since information tends to proliferate naturally. As part of this project, I consolidated the existing Metrics Platform documentation so that all information is now stored in a single place. To support this, I used interlinking, page transclusion, and section transclusion. The scale of this consolidation is evident in the fact that I removed nearly as much content as I added during this project (~124,500 bytes added versus ~115,600 bytes removed).
- Ensured accuracy and consistency: With a rapidly evolving system like Metrics Platform, it's difficult to keep the docs up to date with the latest capabilities. I worked with Data Products to update the docs to reflect the latest system implementation. I also added citations in the docs with references to places in the codebases where functionality is defined, which will help ensure accuracy going forward.
- Created templates to streamline processes: Data collection activities require steps that reach beyond Metrics Platform. Any team creating instrumentation must also complete processes owned by other teams, such as Legal and Product Analytics. To make these processes easier and faster to complete, I worked with Product Analytics to create a standard template for data collection measurement plans and an updated template for instrumentation specs. These templates will improve the experience of everyone starting data collection, not just those using Metrics Platform.
- Improved understanding of system functionality within Data Products: While the main goal of the project was to improve user-facing docs, the improved docs also helped increase knowledge of system functionality within the Data Products team. Metrics Platform includes a significant amount of complexity in the interaction between schemas and clients, so by documenting this, I was able to help increase knowledge sharing within Data Products and make it easier to onboard new system maintainers in the future.
Documentation
- Metrics Platform docs: https://wikitech.wikimedia.org/wiki/Metrics_Platform
- Information architecture and principles: https://wikitech.wikimedia.org/wiki/Metrics_Platform/Documentation_maintenance
- Example: Guide to creating an instrument
Recommended next steps
As part of this project, I identified these recommended next steps for Data Products to take to improve the docs, to be prioritized alongside their other work as they see fit:
- T378544: Create an architecture overview doc for Metrics Platform: Metrics Platform contains several components. Documenting how these components interact (even in an interim state) will help Data Products understand and work with the system as it develops.
- T378547: Create a feature availability status chart for Metrics Platform: Because some Metrics Platform functionality depends on the client, it would be helpful for users to have a chart that tracks the feature availability per client. This would also help align product management and engineers when recruiting early adopters.
- T378548: Add advice for where to write instrument code: This is a small outstanding element of user guidance that should be clarified and added to the docs.
- T376841: Render human-readable schemas on schema.wikimedia.org: By improving the readability of raw schema definitions, we can more effectively rely on schema definitions as a single source of information about schema properties and avoid duplication.
In addition to these next steps, there are a few tasks related to this project that are still under review by Data Products. I'll continue to monitor any updates on these tasks over the next couple of weeks.
Risks
In the same way that there's always a risk of introducing tech debt in a codebase, there's an ongoing risk of introducing information debt in docs in the form of duplication. As Data Products works on potential training resources (T370183) and the experimentation scorecard (T374981), remember to integrate these efforts with the docs and not duplicate information.
Methodology
I used a unique methodology for this project that was tailored to documenting an evolving system. Typically, a documentation project begins with a complete audit of the existing content and a design for the information architecture. Instead, to adapt to evolving workflows and to support emerging users of the platform, I started by focusing on improvements to a central document, testing it with users, and planning improvements to other pages from there. This allowed me to gain clarity on how the system works and reach a usable version of the docs early on, which supported the Growth team in using the docs to create their instrument mid-quarter. As a second step, I took a more traditional, top-down approach and audited the existing content to identify further opportunities for consolidation and clarification. As a final step, I looked at the documentation needs relative to the planned user experience and designed an information architecture to meet these needs. This three-stage process was effective in supporting early prototyping and frequent feedback cycles and could be used as a model for documenting rapidly evolving products.