Measurement by Proxy: On the Accuracy of Online Marketplace Measurements

Alejandro Cuevas, F.E.G. Miedema*, Kyle Soska, Nicolas Christin, R.S. van Wegberg

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review



A number of recent studies have investigated online anonymous (“dark web”) marketplaces. Almost all leverage a “measurement-by-proxy” design, in which researchers scrape market public pages, and take buyer reviews as a proxy for actual transactions, to gain insights into market size and revenue. Yet, we do not know if and how this method biases results. We build a framework to reason about marketplace measurement accuracy, and use it to contrast estimates projected from scrapes of Hansa Market with data from a back-end database seized by the police. We further investigate, by simulation, the impact of scraping frequency, consistency and rate-limits. We find that, even with a decent scraping regimen, one might miss approximately 46% of objects, with scraped listings differing significantly from not-scraped listings on price, views and product categories. This bias also impacts revenue calculations. We find Hansa’s total market revenue to be US $50M, which projections based on our scrapes underestimate by a factor of four. Simulations further show that studies based on one or two scrapes are likely to suffer from a very poor coverage (on average, 14% to 30%, respectively). A high scraping frequency is crucial to achieve reliable coverage, even without a consistent scraping routine. When high-frequency scraping is difficult, e.g., due to deployed anti-scraping countermeasures, innovative scraper design, such as scraping most popular listings first, helps improve coverage. Finally, abundance estimators can provide insights on population coverage when population sizes are unknown.
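To make the closing remark about abundance estimators concrete, here is a minimal sketch of one classic estimator from capture-recapture studies, the Chapman-corrected Lincoln-Petersen estimator, applied to two scrapes. The abstract does not say which estimators the paper uses, so the choice of estimator, the function names, and the idea of treating each scrape as a set of listing identifiers are all assumptions made for illustration.

```python
def chapman_estimate(first_scrape, second_scrape):
    """Estimate the total number of listings from two scrapes.

    Assumption: each scrape is an iterable of listing identifiers.
    This is the Chapman-corrected Lincoln-Petersen estimator, used here
    only as an illustration of an abundance estimator.
    """
    first, second = set(first_scrape), set(second_scrape)
    n1 = len(first)            # listings seen in the first scrape
    n2 = len(second)           # listings seen in the second scrape
    m = len(first & second)    # listings seen in both scrapes
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def estimated_coverage(first_scrape, second_scrape):
    """Fraction of the estimated population that the two scrapes observed."""
    seen = set(first_scrape) | set(second_scrape)
    n_hat = chapman_estimate(first_scrape, second_scrape)
    return len(seen) / n_hat if n_hat > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical listing IDs: two scrapes that overlap on 20 listings.
    scrape_a = range(0, 60)
    scrape_b = range(40, 100)
    print(chapman_estimate(scrape_a, scrape_b))
    print(estimated_coverage(scrape_a, scrape_b))
```

The estimator assumes a closed population and equal chance of observing each listing, which a marketplace with listings appearing and disappearing is unlikely to satisfy exactly, so the result is best read as a rough indication of coverage rather than a measurement.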
Original language: English
Title of host publication: Proceedings of the 31st USENIX Security Symposium
Publisher: USENIX Association
Number of pages: 18
Publication status: Published - 2022
Event: 31st USENIX Security Symposium - Boston, United States
Duration: 10 Aug 2022 to 12 Aug 2022
Conference number: 31


Conference: 31st USENIX Security Symposium
Country/Territory: United States


