Using structured knowledge and traditional word embeddings to generate concept representations in the educational domain

Oghenemaro Anuyah, Ion Madrazo Azpiazu, Maria Soledad Pera

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

3 Citations (Scopus)

Abstract

To capitalize on the benefits associated with word embeddings, researchers working with data from domains such as medicine, sentiment analysis, or finance have dedicated efforts either to taking advantage of popular, general-purpose embedding-learning strategies, such as Word2Vec, or to developing new ones that explicitly consider domain knowledge in order to generate new domain-specific embeddings. In this manuscript, we instead propose a mixed strategy to generate enriched embeddings specifically designed for the educational domain. We do so by leveraging FastText embeddings pre-trained using Wikipedia, in addition to established educational standards that serve as structured knowledge sources to identify terms, topics, and subjects for each school grade. The results of an initial empirical analysis reveal that the proposed embedding-learning strategy, which infuses limited structured knowledge currently available for education into pre-trained embeddings, can better capture relationships and proximity among education-related terminology. Further, these results demonstrate the advantages of using domain-specific embeddings over general-purpose counterparts for capturing information that pertains to the educational area, along with potential applicability implications when it comes to text processing and analysis for K-12 curriculum-related tasks.

Original language: English
Title of host publication: The Web Conference 2019 - Companion of the World Wide Web Conference, WWW 2019
Publisher: ACM
Pages: 274-282
Number of pages: 9
ISBN (Electronic): 9781450366755
DOIs
Publication status: Published - 13 May 2019
Externally published: Yes
Event: 2019 World Wide Web Conference, WWW 2019 - San Francisco, United States
Duration: 13 May 2019 - 17 May 2019

Conference

Conference: 2019 World Wide Web Conference, WWW 2019
Country/Territory: United States
City: San Francisco
Period: 13/05/19 - 17/05/19

Keywords

  • Domain knowledge
  • Enriched embeddings
  • K-12 curriculum
  • Structured knowledge
  • Text representation
  • Word embeddings
