This essay on benchmarking describes techniques and limits in measuring performance issues on Wikipedia. The term "benchmarking" has been used for many decades in testing computer performance. For Wikipedia pages, a common technique is to invoke a template repeatedly, perhaps 400 times copied down a page, during an edit-preview, and check the duration of the 400 instances. The time span for each instance is then the total time, minus 0.3 seconds (as the minimum page-load overhead), divided by 400. Timings of repeated text are limited to spans of about 1 minute of total time; otherwise the page could trigger a "WP:Wikimedia Foundation error", cancelling the reformat due to a page-timeout limit.
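
As an illustration of that arithmetic, the following is a minimal sketch in Python (the function name and the example total of 8.3 seconds are hypothetical; the 400 copies and 0.3-second overhead are the values described above):

    def per_instance_time(total_seconds, copies=400, overhead=0.3):
        """Estimate the time consumed by one template instance.

        Subtracts the assumed minimum page-load overhead, then divides
        the remainder evenly among the repeated copies.
        """
        return (total_seconds - overhead) / copies

    # Example: a preview of 400 copies that took 8.3 seconds total
    print(per_instance_time(8.3))  # -> about 0.02 s (20 ms) per instance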

Large swings in page-load times

Because articles are reformatted by various among the (400?) file servers, depending on availability, the time needed to load an article page can vary widely from minute to minute, not just at "very busy" times of the day. For example, during July 2012, one slow article, using dozens of large templates, took 12 seconds to reformat during an edit-preview; within 1 minute, a repeated edit-preview (no changes) ran 20 seconds of server time, followed within the minute by another repeated edit-preview of 13 seconds. The second preview, at 67% longer than the first, was an unusually large slowdown, beyond the more typical delays of 10%–40% for busy servers. That example shows how a very slow response can occur between two rapid responses, a large swing in page-load times. For that reason, timings should be compared over numerous runs, selecting the minimum times to represent the underlying page-load time, as the typical technique when benchmarking any article performance issue.
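
A minimal sketch of that minimum-of-runs technique, in Python, using the July 2012 timings above (the function name is illustrative, not part of any Wikipedia tooling):

    def underlying_time(timings):
        """Return the minimum of several measured page-load times.

        Any single preview can be inflated by server load, so the
        fastest run is taken as the underlying page-load time.
        """
        return min(timings)

    # Three edit-previews of the same article, in seconds
    preview_times = [12.0, 20.0, 13.0]
    print(underlying_time(preview_times))  # -> 12.0, the best estimate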