Nowhere to hide: Finding plagiarized documents based on sentence similarity

Nathaniel Gustafson*, Maria Soledad Pera, Yiu Kai Ng

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

15 Citations (Scopus)

Abstract

Plagiarism is a serious problem that infringes copyrighted documents/materials, which is an unethical practice and decreases the economic incentive received by authors (owners) of the original copies. Unfortunately, plagiarism is getting worse due to the increasing number of on-line publications on the Web, which facilitates locating and paraphrasing information. In solving this problem, we propose a novel plagiarism-detection method, called SimPaD, which (i) establishes the degree of resemblance between any two documents D1 and D2 based on their sentence-to-sentence similarity computed by using pre-defined word-correlation factors, and (ii) generates a graphical view of sentences that are similar (or the same) in D1 and D2. Experimental results verify that SimPaD is highly accurate in detecting (non-)plagiarized documents and outperforms existing plagiarism-detection approaches.
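The abstract does not give SimPaD's formulas, so the following is only a minimal sketch of the general idea in step (i), under stated assumptions: a small hand-made table of word-correlation factors, a sentence similarity defined as the average best word match, and a document resemblance defined as the fraction of sentences in D1 that have a sufficiently similar sentence in D2. The table values, the function names, the averaging rule, and the 0.8 threshold are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical word-correlation factors in [0, 1]. Toy values for
# illustration only; they are not the pre-defined factors used by SimPaD.
CORRELATION: Dict[Tuple[str, str], float] = {
    ("car", "automobile"): 0.9,
    ("fast", "quick"): 0.8,
}


def word_corr(a: str, b: str) -> float:
    """Look up the correlation of two words; identical words score 1.0."""
    if a == b:
        return 1.0
    return CORRELATION.get((a, b), CORRELATION.get((b, a), 0.0))


def sentence_similarity(s1: str, s2: str) -> float:
    """Average, over the words of s1, of the best correlation with any word of s2.

    This averaging rule is an assumption made for the sketch.
    """
    w1, w2 = s1.lower().split(), s2.lower().split()
    if not w1 or not w2:
        return 0.0
    return sum(max(word_corr(a, b) for b in w2) for a in w1) / len(w1)


def resemblance(
    d1: List[str], d2: List[str], threshold: float = 0.8
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the fraction of d1 sentences matched in d2, plus the matched index pairs."""
    pairs = [
        (i, j)
        for i, s1 in enumerate(d1)
        for j, s2 in enumerate(d2)
        if sentence_similarity(s1, s2) >= threshold
    ]
    matched = {i for i, _ in pairs}
    score = len(matched) / len(d1) if d1 else 0.0
    return score, pairs


if __name__ == "__main__":
    d1 = ["the car is fast", "i like tea"]
    d2 = ["the automobile is quick", "dogs bark loudly"]
    score, pairs = resemblance(d1, d2)
    print(score, pairs)
```

The list of matched index pairs is the kind of data that a graphical view of similar sentences, as in step (ii), could be drawn from.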

Original language: English
Title of host publication: Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008
Pages: 690-696
Number of pages: 7
Publication status: Published - 2008
Externally published: Yes
Event: 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008 - Sydney, NSW, Australia
Duration: 9 Dec 2008 - 12 Dec 2008

Publication series

Name: Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008

Conference

Conference: 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008
Country/Territory: Australia
City: Sydney, NSW
Period: 9/12/08 - 12/12/08
