Abstract
Plagiarism is a serious problem: it infringes the copyright of documents and other materials, is unethical, and reduces the economic incentive of their legal owners. The problem is getting worse as the number of online publications grows and easy access on the Web makes it simple to locate and paraphrase information. To address this problem, we propose a novel plagiarism-detection method, called SimPaD, which (i) establishes the degree of resemblance between any two documents D1 and D2 based on their sentence-to-sentence similarity, computed using predefined word-correlation factors, and (ii) generates a graphical view of the sentences that are similar (or identical) in D1 and D2. Experimental results verify that SimPaD is highly accurate in detecting both plagiarized and non-plagiarized documents and outperforms existing plagiarism-detection approaches.
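To make the general idea concrete, here is a minimal sketch of sentence-to-sentence comparison driven by word-correlation factors. It is not the SimPaD algorithm itself, whose details the abstract does not give. The correlation values in `WORD_CORRELATION`, the averaging rule in `sentence_similarity`, the `threshold` of 0.7, and the `resemblance` ratio are all assumptions made for illustration.

```python
import re
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical word-correlation factors in [0, 1]; the values are invented
# for this example and do not come from the paper.
WORD_CORRELATION: Dict[FrozenSet[str], float] = {
    frozenset({"car", "automobile"}): 0.9,
    frozenset({"fast", "quick"}): 0.8,
}


def tokenize(sentence: str) -> List[str]:
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def word_corr(a: str, b: str) -> float:
    """Return 1.0 for identical words, else the table value (0.0 if absent)."""
    if a == b:
        return 1.0
    return WORD_CORRELATION.get(frozenset({a, b}), 0.0)


def sentence_similarity(s1: List[str], s2: List[str]) -> float:
    """Average, over the words of s1, of the best correlation with any word of s2."""
    if not s1 or not s2:
        return 0.0
    return sum(max(word_corr(w, v) for v in s2) for w in s1) / len(s1)


def similar_sentence_pairs(
    doc1: List[str], doc2: List[str], threshold: float = 0.7
) -> List[Tuple[int, int, float]]:
    """Return (i, j, score) for sentence pairs whose similarity, taken as the
    smaller of the two directions, reaches the assumed threshold."""
    pairs = []
    for i, s1 in enumerate(doc1):
        t1 = tokenize(s1)
        for j, s2 in enumerate(doc2):
            t2 = tokenize(s2)
            score = min(sentence_similarity(t1, t2), sentence_similarity(t2, t1))
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


def resemblance(doc1: List[str], doc2: List[str], threshold: float = 0.7) -> float:
    """Fraction of doc1 sentences that have at least one similar sentence in doc2."""
    if not doc1:
        return 0.0
    matched = {i for i, _, _ in similar_sentence_pairs(doc1, doc2, threshold)}
    return len(matched) / len(doc1)


d1 = ["The car is fast.", "I like tea."]
d2 = ["The automobile is quick.", "Dogs bark loudly."]
pairs = similar_sentence_pairs(d1, d2)
score = resemblance(d1, d2)
```

In this toy input, the first sentences of `d1` and `d2` share no content words beyond "the" and "is", yet they pair up because the assumed correlation table links "car" with "automobile" and "fast" with "quick". The list of matching index pairs is also the kind of data a graphical view of similar sentences could be drawn from.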
Original language | English |
---|---|
Pages (from-to) | 27-41 |
Number of pages | 15 |
Journal | Web Intelligence and Agent Systems |
Volume | 9 |
Issue number | 1 |
Publication status | Published - 2011 |
Externally published | Yes |
Keywords
- graphical view
- plagiarism
- sentence similarity
- word manipulation
- word-correlation factor