Milimetric (Dan Andreescu)
Staff Engineer (Data Engineering)

User Details

User Since: Oct 8 2014, 5:48 PM (506 w, 5 d)
Availability: Available
IRC Nick: Milimetric
LDAP User: Milimetric
MediaWiki User: Milimetric (WMF) [ Global Accounts ]

Recent Activity

Today

Milimetric closed T367526: Cloud VPS "dashiki" project Buster deprecation as Resolved.
Tue, Jun 25, 12:59 PM · Cloud-VPS (Debian Buster Deprecation)
Milimetric added a comment to T367526: Cloud VPS "dashiki" project Buster deprecation.

This is now done.

Tue, Jun 25, 12:59 PM · Cloud-VPS (Debian Buster Deprecation)
Milimetric updated the task description for T367526: Cloud VPS "dashiki" project Buster deprecation.
Tue, Jun 25, 12:59 PM · Cloud-VPS (Debian Buster Deprecation)
Milimetric added a comment to T342267: Investigate surprising "10% Other" portion of Analytics Browsers report.

Great question, @mforns. This was mostly for performance reasons. I couldn't find a way to get Spark to optimally work on the full day of pageviews without first aggregating it like this to > 250. But the execution plan I ended up with looks pretty wild. Let's talk tomorrow when you have some time. I'm attaching the change here.

Tue, Jun 25, 12:50 AM · Patch-For-Review, Data Products (Data Products Sprint 15), Analytics-Data-Problem, MediaWiki-Platform-Team (Radar), Data-Engineering, Data-Engineering-Dashiki

Yesterday

Milimetric moved T368183: MPIC: Build Location + Sample Rates component from Sprint Backlog to In Process on the Data Products (Data Products Sprint 15) board.
Mon, Jun 24, 4:15 PM · Metrics Platform Backlog, Data Products (Data Products Sprint 15)
Milimetric claimed T367526: Cloud VPS "dashiki" project Buster deprecation.

I've migrated and shut off the old instances. I will delete them in a couple of days, just in case. But everything's working fine without them. Did not know about the wmflabs -> wmcloud automatic redirect, that made everything very simple.

Mon, Jun 24, 2:46 PM · Cloud-VPS (Debian Buster Deprecation)
Milimetric added a comment to T366004: Add page-title to the x_analytics header.

I grouped a couple of tasks under this so we're less likely to lose them in the fray.

Mon, Jun 24, 1:52 PM · Data-Engineering
Milimetric added subtasks for T366004: Add page-title to the x_analytics header: T304362: Pageview definition relies on X-Analytics to determine special pages, T240676: Develop a consistent rule for which special pages count as pageviews.
Mon, Jun 24, 1:49 PM · Data-Engineering
Milimetric added a parent task for T304362: Pageview definition relies on X-Analytics to determine special pages: T366004: Add page-title to the x_analytics header.
Mon, Jun 24, 1:49 PM · Analytics-Data-Problem, Patch-Needs-Improvement, Data-Platform-SRE
Milimetric added a parent task for T240676: Develop a consistent rule for which special pages count as pageviews: T366004: Add page-title to the x_analytics header.
Mon, Jun 24, 1:49 PM · Movement-Insights, Data-Engineering-Icebox, Campaign-Registration

Fri, Jun 21

Milimetric added a comment to T342267: Investigate surprising "10% Other" portion of Analytics Browsers report.

The simpler way to do this, just two phases as opposed to progressive, gets us fairly similar results, with about 200 fewer rows which are all detailing specific browser versions.

Fri, Jun 21, 8:12 PM · Patch-For-Review, Data Products (Data Products Sprint 15), Analytics-Data-Problem, MediaWiki-Platform-Team (Radar), Data-Engineering, Data-Engineering-Dashiki
Milimetric added a comment to T342267: Investigate surprising "10% Other" portion of Analytics Browsers report.

We get a ton more detailed results this way, and the total coverage increases to 99.7%. Still not 99.9%, but I think we may have too much detail at some point. I'm fairly happy with these results, and I'm going to prepare the new browser general query as a gerrit change. It'll be good to get some review.

Fri, Jun 21, 8:03 PM · Patch-For-Review, Data Products (Data Products Sprint 15), Analytics-Data-Problem, MediaWiki-Platform-Team (Radar), Data-Engineering, Data-Engineering-Dashiki
Milimetric moved T368113: Design and merge the new tables of file tables from Incoming (new tickets) to To be estimated/discussed on the Data-Engineering board.

This might affect some data we sqoop into HDFS and some of how we compute commons impact metrics or similar future metrics. We have to wait until a schema change is proposed to know for sure.

Fri, Jun 21, 6:38 PM · Data-Engineering, Data Products, Schema-change, DBA
Milimetric added a comment to T342267: Investigate surprising "10% Other" portion of Analytics Browsers report.

From a discussion with @Krinkle about the data, a preliminary idea of how to roll up is:

Fri, Jun 21, 3:09 PM · Patch-For-Review, Data Products (Data Products Sprint 15), Analytics-Data-Problem, MediaWiki-Platform-Team (Radar), Data-Engineering, Data-Engineering-Dashiki

Thu, Jun 20

Milimetric moved T342267: Investigate surprising "10% Other" portion of Analytics Browsers report from Paused to Code Review / Tech Input on the Data Products (Data Products Sprint 15) board.
Thu, Jun 20, 9:17 PM · Patch-For-Review, Data Products (Data Products Sprint 15), Analytics-Data-Problem, MediaWiki-Platform-Team (Radar), Data-Engineering, Data-Engineering-Dashiki
Milimetric added a comment to T342267: Investigate surprising "10% Other" portion of Analytics Browsers report.

The long and the short of it is that we can get that "other" to about 2% if we simply roll up remaining data by browser family and os family. We could get fancier but let's see what folks think about just this approach.
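
As a rough illustration of this kind of two-phase roll-up, here is a minimal sketch. It is not the query attached to the task: the function name `roll_up`, the row layout, the sample rows, and the 0.1% `threshold` default are all assumptions made for the example.

```python
from collections import defaultdict


def roll_up(rows, threshold=0.001):
    """Two-phase roll-up of browser rows (hypothetical helper, illustration only).

    rows: iterable of (browser_family, browser_version, os_family, views).
    Phase 1 keeps fully detailed rows whose share of views exceeds threshold.
    Phase 2 re-aggregates the remaining rows by (browser_family, os_family)
    and keeps the groups that now exceed threshold; the rest becomes "Other".
    """
    rows = list(rows)
    total = sum(views for _, _, _, views in rows)

    detailed = []
    remainder = defaultdict(int)
    for browser, version, os_family, views in rows:
        if views / total > threshold:
            detailed.append((browser, version, os_family, views))
        else:
            remainder[(browser, os_family)] += views

    rolled = []
    other = 0
    for (browser, os_family), views in remainder.items():
        if views / total > threshold:
            rolled.append((browser, "Other", os_family, views))
        else:
            other += views

    return detailed, rolled, other


# Made-up sample data: 100,000 views in total.
sample = [
    ("Chrome", "125", "Windows", 60000),
    ("Safari", "17", "iOS", 39700),
    ("Firefox", "115", "Linux", 60),
    ("Firefox", "102", "Linux", 70),
    ("Firefox", "91", "Linux", 50),
    ("Opera", "12", "Windows", 40),
    ("Lynx", "2", "Linux", 80),
]

detailed, rolled, other = roll_up(sample)
print(detailed)
print(rolled)
print(other)
```

In this made-up sample, the three Firefox-on-Linux versions are each below the threshold but clear it once combined, so they move out of the "Other" bucket into a family-level row.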

Thu, Jun 20, 9:17 PM · Patch-For-Review, Data Products (Data Products Sprint 15), Analytics-Data-Problem, MediaWiki-Platform-Team (Radar), Data-Engineering, Data-Engineering-Dashiki
Milimetric added a comment to T342267: Investigate surprising "10% Other" portion of Analytics Browsers report.

Ok, so these rows represent more than 0.1% of all views for this day and they're aggregated all the way down, so this is what we were dumping before, followed by a big "Other" bucket:

Thu, Jun 20, 8:32 PM · Patch-For-Review, Data Products (Data Products Sprint 15), Analytics-Data-Problem, MediaWiki-Platform-Team (Radar), Data-Engineering, Data-Engineering-Dashiki
Milimetric added a comment to T342267: Investigate surprising "10% Other" portion of Analytics Browsers report.

ok, I have some results for us to peruse, from rolling up in different ways. First of all, my query so we can debate whether or not it's accurate.

Thu, Jun 20, 8:26 PM · Patch-For-Review, Data Products (Data Products Sprint 15), Analytics-Data-Problem, MediaWiki-Platform-Team (Radar), Data-Engineering, Data-Engineering-Dashiki

Mon, Jun 17

Milimetric claimed T367810: Spike: Can we recreate a skeleton page_change (revision_change) event from DB replica alone?.

Analysis available in this spreadsheet: https://docs.google.com/spreadsheets/d/1iSlH5XsRXV7mDoku0F5HbLNJmx1CMBm6ECakZMPUbU8/edit?usp=sharing

Mon, Jun 17, 7:43 PM · Dumps 2.0 (Kanban Board)
Milimetric added a comment to T342267: Investigate surprising "10% Other" portion of Analytics Browsers report.

Made up some slides to help think about this data:

Mon, Jun 17, 4:53 PM · Patch-For-Review, Data Products (Data Products Sprint 15), Analytics-Data-Problem, MediaWiki-Platform-Team (Radar), Data-Engineering, Data-Engineering-Dashiki
Milimetric moved T342267: Investigate surprising "10% Other" portion of Analytics Browsers report from In Process to Paused on the Data Products (Data Products Sprint 15) board.
Mon, Jun 17, 3:06 PM · Patch-For-Review, Data Products (Data Products Sprint 15), Analytics-Data-Problem, MediaWiki-Platform-Team (Radar), Data-Engineering, Data-Engineering-Dashiki

Wed, Jun 12

Milimetric claimed T342267: Investigate surprising "10% Other" portion of Analytics Browsers report.
Wed, Jun 12, 2:02 PM · Patch-For-Review, Data Products (Data Products Sprint 15), Analytics-Data-Problem, MediaWiki-Platform-Team (Radar), Data-Engineering, Data-Engineering-Dashiki
Milimetric moved T342267: Investigate surprising "10% Other" portion of Analytics Browsers report from Sprint Backlog to In Process on the Data Products (Data Products Sprint 15) board.
Wed, Jun 12, 2:02 PM · Patch-For-Review, Data Products (Data Products Sprint 15), Analytics-Data-Problem, MediaWiki-Platform-Team (Radar), Data-Engineering, Data-Engineering-Dashiki

Tue, Jun 11

Milimetric added a comment to T358373: [Dumps 2] Reconcillation job to detect and fetch missing/corrupted revisions.

This looks like a great system to get started with. I can think of some potential snags that come up, so as we build it let's keep an eye out for these and similar:

Tue, Jun 11, 7:01 PM · Patch-For-Review, Dumps 2.0 (Kanban Board)
Milimetric moved T366759: MPIC: Template form should update on post from In Process to Code Review / Tech Input on the Data Products (Data Products Sprint 14) board.
Tue, Jun 11, 12:49 AM · Data Products (Data Products Sprint 15), Metrics Platform Backlog
Milimetric moved T366758: MPIC: Modify form should prepopulate on instrument select when toggling between functions from In Process to Code Review / Tech Input on the Data Products (Data Products Sprint 14) board.
Tue, Jun 11, 12:49 AM · Data Products (Data Products Sprint 15), Metrics Platform Backlog

Fri, Jun 7

Milimetric added a comment to T364872: Unique devices per country spikes on wikifunctions.

select day, http_status, count(*) count_by_status
  from pageview_actor
 where year=2024 and month=4 and day in (19,26)
   and geocoded_data['country_code'] = 'HK'
   and normalized_host.project_class = 'wikifunctions'
 group by day, http_status

day  http_status  count_by_status
19   200          389
19   301          4311
19   302          931
26   200          198
26   301          133801
26   302          1028

Fri, Jun 7, 6:29 PM · Movement-Insights, Analytics-Data-Problem, Data-Platform

Thu, Jun 6

Milimetric created T366820: Wikistats Link with Language Option.
Thu, Jun 6, 3:52 PM · Data-Engineering, Data Products, Data-Engineering-Wikistats

Wed, Jun 5

Milimetric awarded T239378: Disable parent task metadata by default for new sub tasks a Like token.
Wed, Jun 5, 8:56 PM · Patch-For-Review, User-brennen, Release-Engineering-Team, Phabricator, Developer Productivity
Milimetric updated the task description for T366720: Public DataHub.
Wed, Jun 5, 4:33 PM · Data Products, Data-Engineering
Milimetric created T366720: Public DataHub.
Wed, Jun 5, 4:12 PM · Data Products, Data-Engineering

Tue, Jun 4

Milimetric moved T366604: [MPIC] Access to MPIC App within WMF systems from In Process to Done on the Data Products (Data Products Sprint 14) board.

https://mpic.svc.eqiad.wmnet:30443/ is the endpoint we should use to talk to the mpic service making sure we don't take unnecessary hops through DNS.

Tue, Jun 4, 3:19 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Data Products (Data Products Sprint 14)

Fri, May 31

Milimetric added a comment to T366369: MaxMind seems to be mapping the same IP to different countries.

hypothesis so far: maybe some workers are getting MaxMind updates on a staggered schedule from others, so there's always some variation?

Fri, May 31, 3:58 PM · Data-Engineering
Milimetric added a project to T366369: MaxMind seems to be mapping the same IP to different countries: Data-Engineering.

(and sub-country it's much worse)

Fri, May 31, 3:57 PM · Data-Engineering
Milimetric created T366369: MaxMind seems to be mapping the same IP to different countries.
Fri, May 31, 3:57 PM · Data-Engineering

Wed, May 29

Milimetric moved T360914: Update Dashiki Cloud Instances from In Process to Sign Off on the Data Products (Data Products Sprint 14) board.

The two instances have been moved, docs updated on wiki and in code, and proxies have been moved. The only problem is the new proxies can't use the old wmflabs.org domain. For now, I left the old proxies up and additionally set up the new proxies. So, for example, both https://pingback.wmflabs.org/ and https://pingback.wmcloud.org/ work. Whenever the old instances are deleted, the old URLs will stop working. I guess part of sign-off will be to communicate this and maybe delete the old instances?

Wed, May 29, 10:02 PM · Data-Engineering, Data-Engineering-Dashiki, Data Products (Data Products Sprint 14)
Milimetric added a comment to T360914: Update Dashiki Cloud Instances.

Keeping track of how I do this for future reference. (The previous task where I did this was T236586 and I failed to take good notes there)

Wed, May 29, 9:30 PM · Data-Engineering, Data-Engineering-Dashiki, Data Products (Data Products Sprint 14)
Milimetric moved T360914: Update Dashiki Cloud Instances from Sprint Backlog to In Process on the Data Products (Data Products Sprint 14) board.
Wed, May 29, 3:41 PM · Data-Engineering, Data-Engineering-Dashiki, Data Products (Data Products Sprint 14)
Milimetric edited projects for T360914: Update Dashiki Cloud Instances, added: Data Products (Data Products Sprint 14); removed Data Products.
Wed, May 29, 3:41 PM · Data-Engineering, Data-Engineering-Dashiki, Data Products (Data Products Sprint 14)

May 23 2024

Milimetric added a comment to T364178: MPIC: Replace Lookup field for Contextual Attributes when component is ready.

https://gitlab.wikimedia.org/repos/data-engineering/mpic/-/merge_requests/31 is including both the Menu Button (sorry I couldn't find a task for this) and the Multi-lookup, as well as using them from one field each. I'm happy to help with this more when I'm back next week.

May 23 2024, 3:15 PM · Data Products (Data Products Sprint 15), Metrics Platform Backlog
Milimetric claimed T364178: MPIC: Replace Lookup field for Contextual Attributes when component is ready.
May 23 2024, 3:10 PM · Data Products (Data Products Sprint 15), Metrics Platform Backlog
Milimetric moved T364178: MPIC: Replace Lookup field for Contextual Attributes when component is ready from Sprint Backlog to Code Review / Tech Input on the Data Products (Data Products Sprint 13) board.
May 23 2024, 3:10 PM · Data Products (Data Products Sprint 15), Metrics Platform Backlog
Milimetric moved T363432: MPIC: Build a button with dropdown component from In Process to Code Review / Tech Input on the Data Products (Data Products Sprint 13) board.
May 23 2024, 3:09 PM · Data Products (Data Products Sprint 14), Metrics Platform Backlog
Milimetric moved T363431: MPIC: Build a dynamically expanding Lookup field from In Process to Paused on the Data Products (Data Products Sprint 13) board.
May 23 2024, 3:09 PM · Data Products (Data Products Sprint 15), Metrics Platform Backlog

May 16 2024

Milimetric added a comment to T363858: MenuButton: Introduce a WIP MenuButton component to Codex.

quick update: resolved with Eric to work on this as a separate component. Will start on a patch now, keeping it in the Codex sandbox for now with T363432 as the goal.

May 16 2024, 5:37 PM · Design-System-Team (DST-Sprint-23 (2024-05-13 to 2024-05-24)), Codex

May 15 2024

Milimetric created T365074: Requesting access to cassandra-staging-devs for milimetric.
May 15 2024, 9:51 PM · SRE, SRE-Access-Requests

May 9 2024

Milimetric renamed T364391: commonswiki and enwiki dumps thrashing from commonswiki and enwiki dumps trashing to commonswiki and enwiki dumps thrashing.
May 9 2024, 11:22 AM · Data Products (Data Products Sprint 13), Dumps-Generation
Milimetric moved T358707: [Commons Impact Metrics] Create Airflow job that formats and loads the data to Cassandra for AQS from In Process to Paused on the Data Products (Data Products Sprint 13) board.
May 9 2024, 11:20 AM · Data Products (Data Products Sprint 14), Commons-Impact-Metrics
Milimetric moved T358718: [Commons Impact Metrics] Create a new AQS service with all the endpoints from In Process to Code Review / Tech Input on the Data Products (Data Products Sprint 13) board.
May 9 2024, 11:19 AM · Data Products (Data Products Sprint 15), Commons-Impact-Metrics
Milimetric moved T364250: May 1, 2024 wikidatawiki dump not started from Testing to Done on the Data Products (Data Products Sprint 13) board.
May 9 2024, 11:16 AM · Data Products (Data Products Sprint 14), Dumps-Generation
Milimetric moved T362552: Commons Impact AQS: integration tests and deployment for endpoints from In Process to Code Review / Tech Input on the Data Products (Data Products Sprint 13) board.
May 9 2024, 11:15 AM · Data Products (Data Products Sprint 14), Commons-Impact-Metrics
Milimetric moved T361669: Implement Category Metrics Snapshot API from Sign Off to Code Review / Tech Input on the Data Products (Data Products Sprint 13) board.
May 9 2024, 11:14 AM · Data Products (Data Products Sprint 15), Commons-Impact-Metrics

May 1 2024

Milimetric moved T358701: [Commons Impact Metrics] Create Airflow job that generates the public dumps from In Process to Code Review / Tech Input on the Data Products (Data Products Sprint 13) board.
May 1 2024, 8:16 PM · Data Products (Data Products Sprint 13), Data-Platform-SRE (2024.04.15 - 2024.05.05), Commons-Impact-Metrics
Milimetric added a comment to T358701: [Commons Impact Metrics] Create Airflow job that generates the public dumps.

ok, I didn't do much here, just provided a very short description and detailed out the schemas as Marcel had them in the design doc. Please let me know if anyone was imagining something else.

May 1 2024, 8:16 PM · Data Products (Data Products Sprint 13), Data-Platform-SRE (2024.04.15 - 2024.05.05), Commons-Impact-Metrics
Milimetric added a comment to T358701: [Commons Impact Metrics] Create Airflow job that generates the public dumps.

filling out the readme right now, thanks Ben!

May 1 2024, 6:57 PM · Data Products (Data Products Sprint 13), Data-Platform-SRE (2024.04.15 - 2024.05.05), Commons-Impact-Metrics
Milimetric moved T360646: [Sprint 11 GOAL] SDS 2.5.6 Define User and Technical Requirements for experimentation flagging engine and analyze 3rd party option against those requirements from Sprint Goals to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:51 PM · Data Products (Data Products Sprint 12)
Milimetric moved T360649: [Sprint 11 GOAL] Commons Impact Metrics: Deliver Data Pipeline from Sprint Goals to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:51 PM · Data Products (Data Products Sprint 12)
Milimetric moved T360650: [Sprint 11 GOAL] Commons Impact Metrics: Two endpoints pass unit tests from Sprint Goals to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:50 PM · Data Products (Data Products Sprint 12)
Milimetric moved T362424: [SPRINT 12 GOAL] MP Instrumentation Configuration working prototype demoed on staging and revised based on preliminary feedback and ready for user testing from Sprint Goals to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:50 PM · Data Products (Data Products Sprint 12)
Milimetric moved T362428: [SPRINT 12 GOAL] Decide on approach, toolset, and style for AQS user docs from Sprint Goals to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:49 PM · Data Products (Data Products Sprint 12)
Milimetric moved T361335: Deploy the MP Instrumentation Configuration application to the DSE k8s cluster from Paused to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:45 PM · Data-Platform-SRE (2024.06.17 - 2024.07.07), Data Products, Epic, Metrics Platform Backlog
Milimetric moved T362144: [User Story] Build the MPIC API endpoints - PATCH /instrument/:slug from Code Review / Tech Input to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:41 PM · Data Products (Data Products Sprint 13), Patch-For-Review, Metrics Platform Backlog
Milimetric moved T361343: Create the MPIC Kubernetes chart from Sign Off to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:40 PM · Data-Platform-SRE (2024.05.06 - 2024.05.26), Data Products (Data Products Sprint 12), Metrics Platform Backlog
Milimetric moved T360748: [User Story] Create the MPIC database schema from Sign Off to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:40 PM · Data Products (Data Products Sprint 12), Metrics Platform Backlog
Milimetric moved T361668: Go project and solution setup from Sign Off to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:40 PM · Data Products (Data Products Sprint 12), Commons-Impact-Metrics
Milimetric moved T358681: [Commons Impact Metrics] Productionize SparkSQL and Spark-Scala from Sign Off to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:39 PM · Data Products (Data Products Sprint 12), Commons-Impact-Metrics
Milimetric moved T358699: [Commons Impact Metrics] Create Airflow job that generates the datasets in Iceberg from Sign Off to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:39 PM · Data Products (Data Products Sprint 12), Commons-Impact-Metrics
Milimetric moved T362214: [PHP] Metrics Platform client library validation error(s) from To Deploy to Done on the Data Products (Data Products Sprint 12) board.
May 1 2024, 6:37 PM · MW-1.43-notes (1.43.0-wmf.2; 2024-04-23), Data Products (Data Products Sprint 12), Metrics Platform Backlog
Milimetric moved T361741: mediawiki_history_snapshot_config_dag fails since the last change about the AQS config table from Sprint Backlog to Done on the Data Products (Data Products Sprint 13) board.

I am not sure this is 100% squashed because the behavior is so weird. Here's what I found, in short:

May 1 2024, 4:54 PM · Data Products (Data Products Sprint 13)

Apr 18 2024

Milimetric added a comment to T361889: Decision: OpenAPI spec viewer for AQS.

It would be cool to do a quick spike into Scalar and the customization we'd need there. Abstain as a voter here, I like all the options just fine and I have bad aesthetics when it comes to reading docs because I just start hacking and see what happens :)

Apr 18 2024, 7:48 AM · Data Products (Data Products Sprint 12), Tech-Docs-Team, Documentation, AQS2.0
Milimetric added a comment to T361887: Decision: AQS user documentation approach.

+1 for Option 2. For what it's worth, when we initially put up the endpoint docs on wikitech we were just doing so while we waited for a better end user experience than the swagger UI afforded us. I especially like the integration with wikitech described in option 2 (the discovery pages that would lead wiki users to the docs)

Apr 18 2024, 7:45 AM · Data Products (Data Products Sprint 12), Tech-Docs-Team, Documentation, AQS2.0

Apr 16 2024

Milimetric added a comment to T362268: Design the technical architecture for MPIC.

+1, SSR is kind of a pain if done in fancier ways, but in this way you get a lot for free and it helps even reduce code. As a bonus the user gets a great experience.

Apr 16 2024, 11:14 AM · Data Products (Data Products Sprint 12), Metrics Platform Backlog

Apr 15 2024

Milimetric moved T362551: Commons Impact AQS: endpoints with unit tests from Sprint Backlog to Code Review / Tech Input on the Data Products (Data Products Sprint 12) board.
Apr 15 2024, 4:17 PM · Data Products (Data Products Sprint 14), Commons-Impact-Metrics
Milimetric updated the task description for T362552: Commons Impact AQS: integration tests and deployment for endpoints.
Apr 15 2024, 4:17 PM · Data Products (Data Products Sprint 14), Commons-Impact-Metrics
Milimetric updated the task description for T362551: Commons Impact AQS: endpoints with unit tests.
Apr 15 2024, 4:17 PM · Data Products (Data Products Sprint 14), Commons-Impact-Metrics
Milimetric changed the point value for T358718: [Commons Impact Metrics] Create a new AQS service with all the endpoints from 21 to 5.
Apr 15 2024, 4:16 PM · Data Products (Data Products Sprint 15), Commons-Impact-Metrics
Milimetric moved T358718: [Commons Impact Metrics] Create a new AQS service with all the endpoints from Code Review / Tech Input to Paused on the Data Products (Data Products Sprint 12) board.

I've broken this down into subtasks but I'm keeping it as something between an epic and an actual task. It's coordinating and has all the acceptance criteria, it was just too big. So I'll leave the other two subtasks on the boards while I'm on vacation and put this in paused. This can be resumed whenever you'd like to continue work on coordinating and deployment.

Apr 15 2024, 4:16 PM · Data Products (Data Products Sprint 15), Commons-Impact-Metrics
Milimetric created T362552: Commons Impact AQS: integration tests and deployment for endpoints.
Apr 15 2024, 4:15 PM · Data Products (Data Products Sprint 14), Commons-Impact-Metrics
Milimetric created T362551: Commons Impact AQS: endpoints with unit tests.
Apr 15 2024, 4:14 PM · Data Products (Data Products Sprint 14), Commons-Impact-Metrics
Milimetric moved T358718: [Commons Impact Metrics] Create a new AQS service with all the endpoints from In Process to Code Review / Tech Input on the Data Products (Data Products Sprint 12) board.
Apr 15 2024, 4:04 PM · Data Products (Data Products Sprint 15), Commons-Impact-Metrics

Apr 11 2024

Sj awarded T249419: RFC: Render data visualizations on the server a Love token.
Apr 11 2024, 1:33 PM · Wikimedia-Performance-recommendation, JavaScript, MediaWiki-extensions-Graph, covid-19, TechCom-RFC
Milimetric added a comment to T361742: Requesting access to shell access to analytics client servers for AndyRussG.

Approved, welcome back Andy :)

Apr 11 2024, 10:45 AM · Patch-For-Review, SRE, SRE-Access-Requests
Milimetric added a comment to T362113: Requesting access to analytics-privatedata-users for Steph Toyofuku.

Approved

Apr 11 2024, 10:43 AM · Patch-For-Review, SRE, SRE-Access-Requests

Apr 4 2024

Milimetric set the point value for T361669: Implement Category Metrics Snapshot API to 3.
Apr 4 2024, 11:12 AM · Data Products (Data Products Sprint 15), Commons-Impact-Metrics
Milimetric set the point value for T356748: Adding a AQS 2.0 endpoint guide to 2.
Apr 4 2024, 11:12 AM · Data Products, AQS2.0
Milimetric set the point value for T361668: Go project and solution setup to 8.
Apr 4 2024, 11:12 AM · Data Products (Data Products Sprint 12), Commons-Impact-Metrics
Milimetric changed the point value for T358718: [Commons Impact Metrics] Create a new AQS service with all the endpoints from 34 to 21.
Apr 4 2024, 11:11 AM · Data Products (Data Products Sprint 15), Commons-Impact-Metrics
Milimetric moved T354823: [PHP] Remove dispatch method from Code Review / Tech Input to To Deploy on the Data Products (Data Products Sprint 11) board.
Apr 4 2024, 11:09 AM · Data Products (Data Products Sprint 11), Patch-For-Review, Metrics Platform Backlog, good first task

Apr 3 2024

Milimetric moved T358699: [Commons Impact Metrics] Create Airflow job that generates the datasets in Iceberg from Sprint Backlog to In Process on the Data Products (Data Products Sprint 11) board.
Apr 3 2024, 9:43 PM · Data Products (Data Products Sprint 12), Commons-Impact-Metrics
Milimetric claimed T358699: [Commons Impact Metrics] Create Airflow job that generates the datasets in Iceberg.
Apr 3 2024, 9:43 PM · Data Products (Data Products Sprint 12), Commons-Impact-Metrics

Apr 1 2024

Milimetric moved T360501: Update ReadMe Doc for Tests Framework from In Process to Code Review / Tech Input on the Data Products (Data Products Sprint 11) board.
Apr 1 2024, 4:14 PM · Data Products (Data Products Sprint 13), AQS2.0
Milimetric moved T360735: [User Story] Build the backend service for MPIC from Code Review / Tech Input to Done on the Data Products (Data Products Sprint 11) board.
Apr 1 2024, 4:09 PM · Metrics Platform Backlog, Data Products (Data Products Sprint 11)

Mar 29 2024

Milimetric added a comment to T361242: Unique devices tables have missing or incorrect data for January and February 2024.

I found a candidate bug. The script used to ask for the year and month, and after the change it asks for the day. generate_druid_unique_devices_per_domain_daily_aggregated_monthly.hql seems to have been adapted to give the correct result, but despite that, the Druid output seems to cover only one day. Running now to prove or disprove.

Mar 29 2024, 6:32 PM · Data-Engineering, Movement-Insights, Data-Platform
Milimetric added a comment to T361242: Unique devices tables have missing or incorrect data for January and February 2024.
NOTE: one key finding here is that DataHub is not kept in sync with these data migrations. If we don't address this, DataHub will become more of a source of confusion than clarity.
Mar 29 2024, 6:06 PM · Data-Engineering, Movement-Insights, Data-Platform
Milimetric added a comment to T361242: Unique devices tables have missing or incorrect data for January and February 2024.

Merge request 582 seems to have changed how we do this monthly druid segment aggregation, so the answer must be around here. Again I checked the new source table, now Iceberg (wmf_readership.unique_devices_per_project_family_daily) and again that seems to have data for all of January, for example.

Mar 29 2024, 6:04 PM · Data-Engineering, Movement-Insights, Data-Platform
Milimetric added a comment to T361242: Unique devices tables have missing or incorrect data for January and February 2024.

Looked at this a bit today.

Mar 29 2024, 5:53 PM · Data-Engineering, Movement-Insights, Data-Platform

Mar 25 2024

Milimetric added a comment to T342577: Data Quality - requestctl not getting set.

@VirginiaPoundstone: Looks like Giuseppe patched varnish to send more requestctls, so maybe that completely or partially solves the problem. I'd have to look through the data to see. I'm going to do a good job focusing and only do that if you put it in the sprint :) (should take no more than an hour, but it's probably not like a few seconds if I want to be more thorough)

Mar 25 2024, 5:14 PM · Data Products, SRE, Traffic
Milimetric moved T358718: [Commons Impact Metrics] Create a new AQS service with all the endpoints from Sprint Backlog to In Process on the Data Products (Data Products Sprint 11) board.
Mar 25 2024, 4:07 PM · Data Products (Data Products Sprint 15), Commons-Impact-Metrics
Milimetric created T360914: Update Dashiki Cloud Instances.
Mar 25 2024, 3:59 PM · Data-Engineering, Data-Engineering-Dashiki, Data Products (Data Products Sprint 14)

Mar 22 2024

Milimetric moved T356444: NEW BUG REPORT Wikipedia clickstream datasets link on Dumps "Other" page should point to HTML readme from In Process to Code Review / Tech Input on the Data Products (Data Products Sprint 11) board.

I made the puppet change but I need an SRE to merge. This is not well documented indeed, we should talk about a better way to maintain this interface that so many people use.

Mar 22 2024, 4:21 PM · Patch-For-Review, Data Products (Data Products Sprint 11), Data-Platform