Nowhere to hide: Finding plagiarized documents based on sentence similarity

Nathaniel Gustafson; Maria Soledad Pera; Yiu Kai Ng

doi:10.1109/WIIAT.2008.16

Nathaniel Gustafson^*, Maria Soledad Pera, Yiu Kai Ng

^*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

15 Citations (Scopus)

Abstract

Plagiarism is a serious problem that infringes copyrighted documents/materials, which is an unethical practice and decreases the economic incentive received by authors (owners) of the original copies. Unfortunately, plagiarism is getting worse due to the increasing number of on-line publications on the Web, which facilitates locating and paraphrasing information. In solving this problem, we propose a novel plagiarism-detection method, called SimPaD, which (i) establishes the degree of resemblance between any two documents D1 and D2 based on their sentence-to-sentence similarity computed by using pre-defined word-correlation factors, and (ii) generates a graphical view of sentences that are similar (or the same) in D1 and D2. Experimental results verify that SimPaD is highly accurate in detecting (non-)plagiarized documents and outperforms existing plagiarism-detection approaches.
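
The abstract describes scoring two documents by sentence-to-sentence similarity derived from word-correlation factors. The sketch below illustrates that general idea only and is not the SimPaD implementation: the correlation values in `CORRELATION`, the averaging rule in `sentence_similarity`, the `threshold` of 0.7, and every function name are assumptions made up for illustration, since this record does not give the paper's formulas or factors.

```python
import re

# Hypothetical word-correlation factors in [0, 1]. The paper's pre-defined
# factors are not given in this record, so these values are invented.
CORRELATION = {
    frozenset(("car", "automobile")): 0.9,
    frozenset(("quick", "fast")): 0.8,
}


def word_correlation(a, b):
    """Return 1.0 for identical words, else the table value (0.0 if absent)."""
    if a == b:
        return 1.0
    return CORRELATION.get(frozenset((a, b)), 0.0)


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def sentence_similarity(s1, s2):
    """Assumed rule: average, over words of s1, of the best correlation in s2."""
    w1, w2 = tokenize(s1), tokenize(s2)
    if not w1 or not w2:
        return 0.0
    return sum(max(word_correlation(a, b) for b in w2) for a in w1) / len(w1)


def split_sentences(doc):
    """Naive sentence splitter on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"[.!?]+", doc) if s.strip()]


def document_resemblance(d1, d2, threshold=0.7):
    """Return (fraction of d1 sentences matched in d2, list of matched pairs)."""
    s1, s2 = split_sentences(d1), split_sentences(d2)
    if not s1 or not s2:
        return 0.0, []
    pairs = []
    for a in s1:
        # Best-matching sentence of d2 for this sentence of d1.
        best = max(s2, key=lambda b: sentence_similarity(a, b))
        if sentence_similarity(a, best) >= threshold:
            pairs.append((a, best))
    return len(pairs) / len(s1), pairs


if __name__ == "__main__":
    d1 = "The quick car stopped. Birds sing at dawn."
    d2 = "The fast automobile stopped. Markets closed early."
    score, pairs = document_resemblance(d1, d2)
    print(score)
    for a, b in pairs:
        print(a, "<->", b)
```

The returned pairs are the kind of data a side-by-side view of similar sentences in D1 and D2 could be drawn from. Note that this assumed measure is asymmetric: it scores how much of D1 is covered by D2, not the reverse.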

Original language	English
Title of host publication	Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008
Pages	690-696
Number of pages	7
DOIs	https://doi.org/10.1109/WIIAT.2008.16
Publication status	Published - 2008
Externally published	Yes
Event	2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008 - Sydney, NSW, Australia Duration: 9 Dec 2008 → 12 Dec 2008

Publication series

Name	Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008

Conference

Conference	2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008
Country/Territory	Australia
City	Sydney, NSW
Period	9/12/08 → 12/12/08

Access to Document

10.1109/WIIAT.2008.16

Cite this

Gustafson, N., Pera, M. S., & Ng, Y. K. (2008). Nowhere to hide: Finding plagiarized documents based on sentence similarity. In Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008 (pp. 690-696). Article 4740531 (Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008). https://doi.org/10.1109/WIIAT.2008.16

@inproceedings{9abed7e80d91422b80a699d2f423c206,

title = "Nowhere to hide: Finding plagiarized documents based on sentence similarity",

abstract = "Plagiarism is a serious problem that infringes copyrighted documents/materials, which is an unethical practice and decreases the economic incentive received by authors (owners) of the original copies. Unfortunately, plagiarism is getting worse due to the increasing number of on-line publications on the Web, which facilitates locating and paraphrasing information. In solving this problem, we propose a novel plagiarism-detection method, called SimPaD, which (i) establishes the degree of resemblance between any two documents D1 and D2 based on their sentence-to-sentence similarity computed by using pre-defined word-correlation factors, and (ii) generates a graphical view of sentences that are similar (or the same) in D1 and D2. Experimental results verify that SimPaD is highly accurate in detecting (non-)plagiarized documents and outperforms existing plagiarism-detection approaches.",

author = "Gustafson, Nathaniel and Pera, {Maria Soledad} and Ng, {Yiu Kai}",

year = "2008",

doi = "10.1109/WIIAT.2008.16",

language = "English",

isbn = "9780769534961",

series = "Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008",

pages = "690--696",

booktitle = "Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008",

note = "2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008 ; Conference date: 09-12-2008 Through 12-12-2008",

}

Gustafson, N, Pera, MS & Ng, YK 2008, Nowhere to hide: Finding plagiarized documents based on sentence similarity. in Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008., 4740531, Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008, pp. 690-696, 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008, Sydney, NSW, Australia, 9/12/08. https://doi.org/10.1109/WIIAT.2008.16

Nowhere to hide: Finding plagiarized documents based on sentence similarity. / Gustafson, Nathaniel; Pera, Maria Soledad; Ng, Yiu Kai.
Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008. 2008. p. 690-696 4740531 (Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008).

TY - GEN

T1 - Nowhere to hide

T2 - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008

AU - Gustafson, Nathaniel

AU - Pera, Maria Soledad

AU - Ng, Yiu Kai

PY - 2008

Y1 - 2008

N2 - Plagiarism is a serious problem that infringes copyrighted documents/materials, which is an unethical practice and decreases the economic incentive received by authors (owners) of the original copies. Unfortunately, plagiarism is getting worse due to the increasing number of on-line publications on the Web, which facilitates locating and paraphrasing information. In solving this problem, we propose a novel plagiarism-detection method, called SimPaD, which (i) establishes the degree of resemblance between any two documents D1 and D2 based on their sentence-to-sentence similarity computed by using pre-defined word-correlation factors, and (ii) generates a graphical view of sentences that are similar (or the same) in D1 and D2. Experimental results verify that SimPaD is highly accurate in detecting (non-)plagiarized documents and outperforms existing plagiarism-detection approaches.

AB - Plagiarism is a serious problem that infringes copyrighted documents/materials, which is an unethical practice and decreases the economic incentive received by authors (owners) of the original copies. Unfortunately, plagiarism is getting worse due to the increasing number of on-line publications on the Web, which facilitates locating and paraphrasing information. In solving this problem, we propose a novel plagiarism-detection method, called SimPaD, which (i) establishes the degree of resemblance between any two documents D1 and D2 based on their sentence-to-sentence similarity computed by using pre-defined word-correlation factors, and (ii) generates a graphical view of sentences that are similar (or the same) in D1 and D2. Experimental results verify that SimPaD is highly accurate in detecting (non-)plagiarized documents and outperforms existing plagiarism-detection approaches.

UR - http://www.scopus.com/inward/record.url?scp=62949215789&partnerID=8YFLogxK

U2 - 10.1109/WIIAT.2008.16

DO - 10.1109/WIIAT.2008.16

M3 - Conference contribution

AN - SCOPUS:62949215789

SN - 9780769534961

T3 - Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008

SP - 690

EP - 696

BT - Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008

Y2 - 9 December 2008 through 12 December 2008

ER -
