Use of semantic, syntactic and sentiment features to automate essay evaluation

Janda, Harneet Kaur

dc.contributor.advisor	Mago, Vijay
dc.contributor.advisor	Du, Shan
dc.contributor.author	Janda, Harneet Kaur
dc.date.accessioned	2019-06-20T14:11:46Z
dc.date.available	2019-06-20T14:11:46Z
dc.identifier.uri	http://knowledgecommons.lakeheadu.ca/handle/2453/4345
dc.description.abstract	Manual grading of essays is time-consuming and prone to inconsistencies and inaccuracies. The task, performed mostly within academic institutions, often involves grading hundreds of submitted essays, and the major hurdle is assessing them uniformly from the first to the last. It can take hours or sometimes even days to finish the assessment. Automating this tedious manual task not only relieves teachers but also assures students of consistent marking throughout. The challenge in automating it is to identify the aspects of natural language processing (NLP) that are vital for accurate automated essay evaluation. NLP is a subfield of artificial intelligence that deals with making computers understand, and then further process, the language humans use for expression. Since essays are a written form of expression and idea exchange, automating essay assessment with a computer system leverages progress in the NLP field and automates one of the biggest manual tasks in educational systems. In recent years, an abundance of research has been done to automate essay evaluation, yet little of it considers the syntax, semantic coherence and sentiment of the essay’s text together. Our proposed system incorporates not just rule-based grammar and surface-level coherence checks but also the semantic similarity of the sentences. We propose to use graph-based relationships within the essay’s content and the polarity of opinion expressions. Semantic similarity is determined between the statements of the essay to form these graph-based spatial relationships. Our algorithm uses 23 salient features with high predictive power, fewer than current systems use, while still covering the dimensions that a human grader focuses on.
Fewer features help us remove redundancies in the data so that the predictions are based on more representative features and are robust to noisy data. The scores are predicted with neural networks using the data released for the ASAP competition hosted on Kaggle. The agreement between the human graders’ scores and the system’s predictions is measured using Quadratic Weighted Kappa (QWK). Our system achieves a QWK of 0.793. Our results are repeatable and transparent, and every feature is explained in detail, in contrast to existing systems whose authors have not described their methodologies and feature extraction thoroughly enough for the results to be reproduced.	en_US
dc.language.iso	en_US	en_US
dc.subject	Natural language processing	en_US
dc.subject	Automated essay evaluation (Computer Science)	en_US
dc.title	Use of semantic, syntactic and sentiment features to automate essay evaluation	en_US
dc.type	Thesis	en_US
etd.degree.name	Master of Science	en_US
etd.degree.level	Master	en_US
etd.degree.discipline	Computer Science	en_US
etd.degree.grantor	Lakehead University	en_US
dc.contributor.committeemember	Yang, Yimin
dc.contributor.committeemember	Giabbanelli, Philippe
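
The abstract reports agreement between human and system scores as Quadratic Weighted Kappa (QWK). The following is a minimal sketch of the standard QWK definition for integer ratings, not the thesis's own code; the function name, its parameters, and the sample scores are illustrative assumptions.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Standard QWK for two equal-length lists of integer ratings.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rating lists must be non-empty and equal in length")
    n = max_rating - min_rating + 1
    if n < 2:
        raise ValueError("need at least two distinct rating levels")

    # Observed confusion matrix: rows are rater A, columns are rater B.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    total = len(rater_a)
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n):
        for j in range(n):
            # Quadratic penalty grows with the squared distance between ratings.
            weight = (i - j) ** 2 / (n - 1) ** 2
            # Expected count if the two raters were independent.
            expected = hist_a[i] * hist_b[j] / total
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Both raters gave one identical rating everywhere: treat as full agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Made-up scores on a 1-4 scale, for illustration only.
human = [1, 2, 3, 4]
identical = quadratic_weighted_kappa(human, [1, 2, 3, 4], 1, 4)
reversed_scores = quadratic_weighted_kappa(human, [4, 3, 2, 1], 1, 4)
```

Under this definition a value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement.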

Files in this item

Name: JandaH2019m-1a.pdf
Size: 1.038 MB
Format: PDF


This item appears in the following Collection(s)

Electronic Theses and Dissertations from 2009 [1746]
