    The analysis of Canada's health through social media using machine learning

    View/Open

    ShahN2019m-1a.pdf (1.428Mb)

    Date

    2019

    Author

    Shah, Neel J.

    Degree

    Master of Science

    Discipline

    Computer Science

    Subject

    Real-time data analysis
    Social media data analytics
    Elasticsearch
    Health analysis

    Abstract

    Real-time online data processing is quickly becoming an essential tool in the analysis of social media for political trends, advertising, public health awareness programs, and policy making. Traditionally, offline analysis is productive and efficient only when data collection is a one-time process. Cutting-edge research now requires real-time data analysis, which comes with its own set of challenges, particularly the efficiency of continuous data fetching within the context of present NoSQL and relational databases. In this thesis, I demonstrate a solution that effectively addresses the challenges of real-time analysis using a configurable Elasticsearch search engine. We use a distributed database architecture and pre-built indexing, and we standardize the Elasticsearch framework for large-scale text mining. The results from the Elasticsearch engine are visualized in almost real time. We apply our solution to social media to conduct a large-scale health analysis in Canada. Social media is a crucial database that provides information on a variety of topics such as health, food, feedback on products, and many others. At present, people use social media to share their daily lifestyles: for example, where they are going, what exercise they are doing, or what they are eating. By analyzing the information collected from these individuals, the health of the population can be gauged. This analysis can become an integral part of the government’s efforts to study the health of people on a large scale, because public health is becoming a primary concern for many governments around the world, and they believe it is necessary to analyze the present situation within the population before creating any new policies. Traditionally, governments use door-to-door surveys (for example, a census) or hospital information to decide their health policies. This information is limited and sometimes takes a long time to collect and analyze well enough to aid decision making. Our approach is to try to solve such problems through advances in natural language processing algorithms and large-scale data analysis. Results show that the proposed method provides a solution in less time, with the same accuracy, when compared to the traditional one.

    URI

    http://knowledgecommons.lakeheadu.ca/handle/2453/4353

    Collections

    • Electronic Theses and Dissertations from 2009

    Lakehead University Library
    Contact Us | Send Feedback

     

