
dc.contributor.advisor: Mago, Vijay
dc.contributor.author: Shah, Neel J.
dc.date.accessioned: 2019-08-13T15:38:49Z
dc.date.available: 2019-08-13T15:38:49Z
dc.date.created: 2019
dc.date.issued: 2019
dc.identifier.uri: http://knowledgecommons.lakeheadu.ca/handle/2453/4353
dc.description.abstract: Real-time online data processing is quickly becoming an essential tool in the analysis of social media for political trends, advertising, public health awareness programs, and policy making. Traditionally, offline analysis is productive and efficient only when data collection is a one-time process. Cutting-edge research now requires real-time data analysis, which comes with its own set of challenges, particularly the efficiency of continuous data fetching within the context of present NoSQL and relational databases. In this thesis, I demonstrate a solution that addresses the challenges of real-time analysis using a configurable Elasticsearch search engine. The solution uses a distributed database architecture, pre-built indexing, and a standardized Elasticsearch framework for large-scale text mining. The results from the Elasticsearch engine are visualized in near real time. We apply this solution to social media data to conduct a large-scale health analysis in Canada. Social media is a crucial database that provides information on a variety of topics such as health, food, feedback on products, and many others. At present, people use social media to share their daily lifestyles, for example, where they are going, what exercise they are doing, or what they are eating. By analyzing the information collected from these individuals, the health of the population can be gauged. This analysis can become an integral part of a government’s efforts to study the health of people on a large scale, because public health is becoming a primary concern for many governments around the world, and they believe it is necessary to analyze the present state of the population before creating any new policies. Traditionally, governments use door-to-door surveys, such as a census, or hospital information to decide their health policies.
This information is limited and sometimes takes a long time to collect and analyze well enough to aid decision making. Our approach addresses these problems through advances in natural language processing algorithms and large-scale data analysis. Results show that the proposed method provides the solution in less time with the same accuracy when compared to the traditional one. [en_US]
dc.language.iso: en_US [en_US]
dc.subject: Real-time data analysis [en_US]
dc.subject: Social media data analytics [en_US]
dc.subject: Elasticsearch [en_US]
dc.subject: Health analysis [en_US]
dc.title: The analysis of Canada's health through social media using machine learning [en_US]
dc.type: Thesis [en_US]
etd.degree.name: Master of Science [en_US]
etd.degree.level: Master [en_US]
etd.degree.discipline: Computer Science [en_US]
etd.degree.grantor: Lakehead University [en_US]
dc.contributor.committeemember: Yang, Yimin
dc.contributor.committeemember: Srivastava, Gautam

