Saturday, April 19, 2014

Fantastic Report

Ars Technica recently posted a study on PC game sales:

Right now, I can tell you that about 37 percent of the roughly 781 million games registered to various Steam accounts haven’t even been loaded a single time. I can tell you that Steam users have put an aggregate of about 3.8 billion hours into Dota 2.

Essentially, they found a way to systematically sample Steam users' profile pages, scraping each user's game library and hours played:

[We] scrape through more than a 100,000 pages a day. Using our knowledge of the Steam Community ID structure (and some light PHP/MySQL coding), we’ve been conducting what amounts to a rolling, randomized poll of the Steam user universe for about two months now.

This is exactly the kind of thing I enjoy. The article goes on to describe their findings, the risks, and the key insights their data scraping provides. I'll be bookmarking it as a guide for my next project, both at work and in my personal "futzing" around.
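
For my own notes, here is a rough Python sketch of what a "randomized poll" over Steam Community IDs might look like. This is my guess at the general idea, not Ars Technica's actual code (they mention PHP/MySQL). It relies on the publicly documented fact that 64-bit IDs for individual accounts are an account number added to a fixed offset. The `max_account_id` bound and the `fake_fetch_library` function are assumptions for illustration only: the fake fetcher returns made-up hours rather than scraping anything, and a real run would also have to cope with numbers that belong to no account and with private profiles.

```python
import random

# Offset added to an account number to form an individual account's
# 64-bit Steam ID (0x0110000100000000).
STEAMID64_BASE = 76561197960265728


def random_steam_ids(n, max_account_id, rng=None):
    """Draw n distinct 64-bit IDs uniformly from account numbers 1..max_account_id.

    max_account_id is an assumed upper bound, not a known figure.
    """
    rng = rng or random.Random()
    account_ids = rng.sample(range(1, max_account_id + 1), n)
    return [STEAMID64_BASE + a for a in account_ids]


def unplayed_share(libraries):
    """Fraction of owned games with zero recorded playtime.

    `libraries` is an iterable of per-user lists of hours played,
    one number per game in that user's library.
    """
    total = 0
    unplayed = 0
    for hours in libraries:
        total += len(hours)
        unplayed += sum(1 for h in hours if h == 0)
    return unplayed / total if total else 0.0


def fake_fetch_library(steam_id):
    """Hypothetical stand-in for scraping one profile page; returns made-up hours."""
    rng = random.Random(steam_id)
    return [rng.choice([0, 0, 1.5, 12, 200]) for _ in range(rng.randint(0, 20))]


if __name__ == "__main__":
    ids = random_steam_ids(1000, max_account_id=1_000_000, rng=random.Random(42))
    share = unplayed_share(fake_fetch_library(i) for i in ids)
    print(f"{share:.1%} of sampled games never played")
```

Swapping `fake_fetch_library` for a real fetcher is where all the actual work (and the risk the article talks about) would live; the sampling and the aggregate are the easy part.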