Large-scale Author Verification: Temporal and Topical Influences

M. van Dam, C. Hauff

Research output: Chapter in Book/Conference proceedings/Edited volume > Conference contribution > Scientific > peer-review

8 Citations (Scopus)

Abstract

The task of author verification is concerned with the question of whether someone is the author of a given piece of text. Algorithms that extract writing-style features from texts are used to determine how close in style different documents are. Currently, evaluations of author verification algorithms are restricted to small-scale corpora, usually with fewer than one hundred test cases. In this work, we present a methodology to derive a large-scale author verification corpus based on Wikipedia Talkpages. We create a corpus based on the English Wikipedia that is significantly larger than existing corpora. On this corpus we investigate two dimensions that have so far not received sufficient attention: the influence of topic and the influence of time on author verification accuracy.
Original language: English
Title of host publication: Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval
Pages: 1039-1042
Number of pages: 4
DOIs: https://doi.org/10.1145/2600428.2609504
Publication status: Published - 2014
Event: SIGIR '14: 37th international ACM SIGIR conference on Research and development in information retrieval - Gold Coast, Australia
Duration: 6 Jul 2014 to 11 Jul 2014

Conference

Conference: SIGIR '14: 37th international ACM SIGIR conference on Research and development in information retrieval
Country: Australia
City: Gold Coast
Period: 6/07/14 to 11/07/14

Cite this

van Dam, M., & Hauff, C. (2014). Large-scale Author Verification: Temporal and Topical Influences. In Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval (pp. 1039-1042). https://doi.org/10.1145/2600428.2609504