Abstract
Plagiarism is a serious problem: it infringes on copyrighted documents and materials, is unethical, and reduces the economic incentive received by their legal owners. The problem is worsening because the growing number of online publications and easy access on the Web make it simple to locate and paraphrase information. To address this problem, we propose a novel plagiarism-detection method, called SimPaD, which (i) establishes the degree of resemblance between any two documents D1 and D2 based on their sentence-to-sentence similarity, computed using pre-defined word-correlation factors, and (ii) generates a graphical view of sentences that are similar (or the same) in D1 and D2. Experimental results verify that SimPaD is highly accurate in detecting (non-)plagiarized documents and outperforms existing plagiarism-detection approaches.
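The general idea of sentence-to-sentence comparison driven by word-correlation factors can be sketched roughly as follows. This is only an illustration under stated assumptions, not the SimPaD algorithm itself: the correlation table, the best-match averaging in `sentence_similarity`, the whitespace tokenizer, and the 0.7 threshold are all hypothetical choices, since the abstract does not specify them.

```python
from typing import Dict, List, Tuple

# Hypothetical word-correlation factors in [0, 1]. The pre-defined factors
# used by SimPaD are not given in the abstract; these values are made up.
CORRELATION: Dict[Tuple[str, str], float] = {
    ("car", "automobile"): 0.9,
    ("quick", "fast"): 0.8,
}


def word_correlation(a: str, b: str) -> float:
    """Return 1.0 for identical words, else the (symmetric) table value or 0.0."""
    if a == b:
        return 1.0
    return CORRELATION.get((a, b), CORRELATION.get((b, a), 0.0))


def tokenize(sentence: str) -> List[str]:
    """Assumed tokenizer: lowercase and split on whitespace."""
    return sentence.lower().split()


def sentence_similarity(s1: List[str], s2: List[str]) -> float:
    """Assumed aggregation: average, over the words of s1, of the best
    correlation each word has with any word of s2."""
    if not s1 or not s2:
        return 0.0
    return sum(max(word_correlation(w, v) for v in s2) for w in s1) / len(s1)


def resemblance(
    doc1: List[str], doc2: List[str], threshold: float = 0.7
) -> Tuple[float, List[Tuple[int, int, float]]]:
    """Return the fraction of doc1 sentences whose best match in doc2 reaches
    the threshold, plus the matched (index1, index2, similarity) triples."""
    matches: List[Tuple[int, int, float]] = []
    for i, s1 in enumerate(doc1):
        t1 = tokenize(s1)
        best_j, best = -1, 0.0
        for j, s2 in enumerate(doc2):
            sim = sentence_similarity(t1, tokenize(s2))
            if sim > best:
                best_j, best = j, sim
        if best >= threshold:
            matches.append((i, best_j, best))
    degree = len(matches) / len(doc1) if doc1 else 0.0
    return degree, matches


if __name__ == "__main__":
    d1 = ["the quick car", "birds sing loudly"]
    d2 = ["the fast automobile", "rain falls"]
    print(resemblance(d1, d2))
```

In a sketch like this, the list of matched sentence pairs is the kind of data a graphical view of similar sentences could be drawn from, for example by linking sentence `i` of D1 to sentence `j` of D2.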
| Original language | English |
|---|---|
| Pages (from-to) | 27-41 |
| Number of pages | 15 |
| Journal | Web Intelligence and Agent Systems |
| Volume | 9 |
| Issue number | 1 |
| Publication status | Published - 2011 |
| Externally published | Yes |
Keywords
- graphical view
- Plagiarism
- sentence similarity
- word manipulation
- word-correlation factor