User:Chlod/Analysis/2021 Pacific typhoon season

The 2021 Pacific typhoon season article has accumulated a total of 4,268 revisions (and counting) after only 9 months. Many season articles in WikiProject Tropical cyclones suffer from inflated edit counts. For 2021 (not including two-year season articles), the North Atlantic basin accumulated a total of 3,582 revisions, the Eastern Pacific basin accumulated 2,454 revisions, and the North Indian Ocean basin accumulated 1,137 revisions. The 2020 Pacific typhoon season accumulated a total of 5,980 revisions.

The extremely inflated edit count prompted me to investigate what exactly causes all these edits. This page documents the results of that investigation.

A brief note: "diff size" refers to the change in bytes of a diff.
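
As a minimal sketch of that definition, a diff size can be derived from the byte sizes of consecutive revisions. The function name `diff_sizes` and the sample page sizes are illustrative assumptions, not values from the actual page history.

```python
from typing import List


def diff_sizes(revision_sizes: List[int]) -> List[int]:
    """Return the signed byte change introduced by each revision.

    `revision_sizes` lists page sizes in bytes, oldest revision first.
    The first revision is measured against an empty page (an assumption).
    """
    sizes = []
    previous = 0
    for size in revision_sizes:
        sizes.append(size - previous)
        previous = size
    return sizes


# Hypothetical page sizes for four consecutive revisions.
sample = [1200, 1200, 1195, 2300]
print(diff_sizes(sample))  # [1200, 0, -5, 1105]
```

The second revision in the sample is a zero-byte edit: the page content may have changed, but its size did not.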

Why bother?

Edits with barely any changes are rather mundane. They mostly change numbers and storm data around in order to present the latest values. The inflated edit counts, however, pose a significant problem for editors working on prose.

  1. An inflated edit count makes it harder to navigate the page history for significant additions. Special:PageHistory does not offer filters for diff size. This is especially hard on those looking for specific revisions, such as editors attributing intrawiki copies, which requires a revision ID wherever possible.
  2. Modifying current storm data on the season article will inevitably inflate edit counts on the storm articles as well, given that they are published while the storm is active. This is because editors will attempt to update the data on every article instead of only on the season article.
  3. For new editors who have edited no other article, this might be considered a form of gaming extended-confirmed rights.

Data

 
Figure: Edits to the English Wikipedia article "2021 Pacific typhoon season" over time, trimmed to 1000 (positive and negative) in order to remove outlier points.

Figure: Edits to the English Wikipedia article "2021 Pacific typhoon season" over time, trimmed to 100 (positive and negative) in order to remove outlier points.

Statistics

| Statistic | Value |
| --- | --- |
| Total edits | 4,267 |
| Average diff size | 58.48 |
| Edits with diff sizes < 1000 | 4,164 |
| Edits with diff sizes < 100 | 3,496 |
| Edits with diff sizes < 50 | 3,313 |
| Edits with diff sizes < 20 | 2,869 |
| Edits with diff sizes < 10 | 2,396 |
| Edits with diff sizes < 5 | 1,944 |
| Edits with diff sizes < 2 | 1,617 |
| Zero-byte diff size edits | 1,212 |
| Season effects edits | 325* |
| Current storm information edits | 217* |
| Watches and warnings edits | 149* |

* Estimated based on edit summaries. May be completely inaccurate.
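
The threshold rows in the Statistics table could be reproduced from a list of signed diff sizes along the following lines. This sketch assumes that each threshold applies to the absolute value of the diff size, which matches the "positive and negative" trimming of the charts but is not confirmed by the data. The name `count_below` and the sample values are hypothetical.

```python
from typing import Dict, Iterable, List


def count_below(diffs: Iterable[int], thresholds: List[int]) -> Dict[int, int]:
    """Count edits whose absolute diff size is strictly below each threshold."""
    diffs = list(diffs)
    return {t: sum(1 for d in diffs if abs(d) < t) for t in thresholds}


# Hypothetical diff sizes in bytes; not the real revision data.
sample_diffs = [0, 0, 1, -3, 12, -45, 80, 640, -2500]
print(count_below(sample_diffs, [1000, 100, 50, 20, 10, 5, 2]))
```

Because every bucket is cumulative, a zero-byte edit is counted in each of the rows above it as well.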
Sub-100 diff size edits by user

| User | Sub-100 diff size edits |
| --- | --- |
| HurricaneEdgar | 1,395 |
| Menia97! | 368 |
| Typhoon2013 | 336 |
| Beraniladri19 | 201 |
| Chlod | 131 |
| Akbermamps | 97 |
| Shadzarie | 79 |
| Meow | 60 |
| About123 | 58 |
| ALEJOph769 | 54 |
| Gummycow | 41 |
| CycloneEditor | 38 |
Zero-byte diff size edits by user

| User | Zero-byte diff size edits |
| --- | --- |
| HurricaneEdgar | 555 |
| Menia97! | 193 |
| Typhoon2013 | 105 |
| Beraniladri19 | 46 |
| Shadzarie | 36 |
| ALEJOph769 | 32 |
| Chlod | 22 |
| About123 | 19 |
| CycloneEditor | 15 |
| Hurricaneboy23 | 12 |
| Jasper Deng | 12 |
| Akbermamps | 11 |
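
A per-user tally like the two above could be produced roughly as follows. This is a sketch under assumptions: edits are taken to be available as (user, diff size) pairs, thresholds are applied to absolute diff sizes, and the user names in the sample are placeholders rather than real accounts.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def edits_by_user(edits: Iterable[Tuple[str, int]], limit: int) -> List[Tuple[str, int]]:
    """Rank users by the number of edits whose absolute diff size is below `limit`."""
    tally = Counter(user for user, diff in edits if abs(diff) < limit)
    return tally.most_common()


# Hypothetical (user, diff size) pairs.
sample_edits = [("UserA", 0), ("UserB", 15), ("UserA", -2), ("UserC", 500), ("UserB", 0)]
print(edits_by_user(sample_edits, limit=100))  # sub-100 edits
print(edits_by_user(sample_edits, limit=1))    # zero-byte edits only
```

Passing `limit=1` keeps only zero-byte edits, since no other absolute diff size is below 1.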

Interpretation

Based on the data above, at least 67.24% of edits can be attributed to low-byte data-related changes. These edits (whose content is eventually removed once a storm passes) do little but fill the edit history, making it hard to navigate and wasting computational resources. Because season articles are large (some exceed 250 kB), saving an edit may take a long time for some editors. When crawling through diffs to look for insertions, the ~250 kB page must be re-rendered over and over for small number changes that are, once the storm has passed, essentially in vain.
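
The 67.24% figure appears to correspond to the share of edits with a diff size under 20 bytes in the Statistics table (2,869 of 4,267). That reading is inferred from the arithmetic rather than stated outright, so treat the check below as a plausibility test only.

```python
total_edits = 4267      # "Total edits" from the Statistics table
edits_under_20 = 2869   # "Edits with diff sizes < 20" from the same table

share = edits_under_20 / total_edits * 100
print(f"{share:.2f}%")  # 67.24%
```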

Solutions

What can be done about it? We are presented with a few options.

Option 1

Entirely stop the usage of current storm information sections and current tropical cyclone infoboxes. This is the least recommended option.

Current storm information sections and {{infobox tropical cyclone current}} are highly visible parts of an article that provide the latest information to a reader. Wikipedia must recognize its role as a source of information; we should not remove these sections for the sake of edit counts, especially when readers most expect to see them.

Option 2

Move all data-related edits to the template space. This way, edits to templates are entirely isolated and kept separate from article edits, effectively reducing the edit count for articles.

This system follows a similar scheme to the {{X1}} sandbox template set, where each tropical cyclone will be assigned a placeholder value which it will hold until it dissipates. This is similar to how the Japan Meteorological Agency handles tropical cyclones on its cyclone details page (numbers assigned from 60 to 65). This method requires only wikitext, and is fairly easy to implement. It does, however, have the issue of inflating edit counts for the specific templates being used, or even worse, showing incorrect storm information when an earlier revision of the article is viewed after its placeholder has been reused for another storm. Thus, it is considered a less likely option.
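
The placeholder scheme can be sketched as a small pool of reusable slots. The Python below only illustrates the bookkeeping; the real scheme would be plain wikitext templates, and the class name `SlotPool`, the slot labels, and the storm names are all hypothetical.

```python
from typing import Dict, List, Optional


class SlotPool:
    """Assign each active storm a placeholder slot that it keeps until it dissipates."""

    def __init__(self, slots: List[str]) -> None:
        self._free = list(slots)              # slots not currently in use
        self._assigned: Dict[str, str] = {}   # storm name -> slot

    def assign(self, storm: str) -> Optional[str]:
        """Return the storm's slot, allocating the first free one if needed."""
        if storm in self._assigned:
            return self._assigned[storm]
        if not self._free:
            return None  # no placeholder available
        slot = self._free.pop(0)
        self._assigned[storm] = slot
        return slot

    def release(self, storm: str) -> None:
        """Free the slot when the storm dissipates so a later storm can reuse it."""
        slot = self._assigned.pop(storm, None)
        if slot is not None:
            self._free.append(slot)


pool = SlotPool(["Slot 1", "Slot 2"])
print(pool.assign("Storm A"))  # Slot 1
print(pool.assign("Storm B"))  # Slot 2
pool.release("Storm A")
print(pool.assign("Storm C"))  # Slot 1
```

The last line shows the drawback described above: an old article revision that transcluded "Slot 1" for Storm A would now display Storm C's data.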

Option 3

Move all current storm information and current tracking data to Wikidata, where it should be. This is the most recommended option, as a structured data system like Wikidata is already optimized to handle large edit counts and minute data changes.

This falls under the idea of keeping data edits on Wikidata and prose edits on Wikipedia, forming a clean separation between the two. Implementation would require a bit of Lua programming, but will definitely be useful in the long term. A problem that has to be addressed with this, however, is the Watches and warnings section, which cannot simply be imported from structured data without a complicated data structure. If this option is used, W&W sections may need to be kept in prose anyway. Nonetheless, this already cuts down a lot of edits that would otherwise be in the season article.
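
The separation of data from prose can be illustrated with a short sketch. An actual implementation would be a Lua module reading from Wikidata; the Python below is only a stand-in, and the `StormRecord` fields, the function name, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StormRecord:
    """Hypothetical structured record, standing in for data stored outside the article."""
    name: str
    wind_kmh: int
    pressure_hpa: int
    as_of: str


def render_current_info(record: StormRecord) -> str:
    """Build the reader-facing summary from the record alone."""
    return (
        f"As of {record.as_of}, {record.name} has sustained winds of "
        f"{record.wind_kmh} km/h and a central pressure of {record.pressure_hpa} hPa."
    )


# Placeholder values; updating them would not touch the article's own history.
record = StormRecord(name="Example", wind_kmh=120, pressure_hpa=975, as_of="00:00 UTC")
print(render_current_info(record))
```

Under this arrangement, a new advisory changes only the record, so the article text and its revision history stay untouched.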

Implementation plans for this option are detailed in User:Chlod/WikiProject Tropical cyclones data migration.
