dc.description.abstract | Visual classification is the perceptual and computational task of assigning objects and visual contexts to distinct labels. Humans and machines have each mastered this problem in their own contexts, yet challenges arising from the inherent variability of visual stimuli remain. This thesis analyses the different dimensions of visual classification by combining human cognition and machine vision. To that end, it presents novel approaches to joint multimodal learning that couple machine-learnt visual features with brain-visual embeddings learnt from EEG.
First, the thesis proposes a pipeline that encodes brain-evoked EEG signals as grayscale images, yielding a spatio-temporal feature representation with improved training convergence. This encoding achieves a new benchmark performance of 70% accuracy in multiclass EEG-based classification (40 classes, on the challenging EEG-ImageNet dataset), because the stretched spatial dimension accommodates all responses to a visual stimulus in a single visual sample.
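As a rough illustration of this kind of encoding, the sketch below maps one EEG trial (electrodes by time samples) onto a single grayscale image. The per-trial min-max normalization, 8-bit range, and the 128 x 440 trial shape are illustrative assumptions, not the thesis's exact parameters.

    import numpy as np

    def eeg_to_grayscale(eeg):
        # Encode one EEG trial (electrodes x time samples) as a grayscale image:
        # rows span the spatial (electrode) dimension, columns the temporal one,
        # so a single image holds the whole evoked response to one stimulus.
        # Per-trial min-max scaling to 0..255 is an assumption for illustration.
        lo, hi = eeg.min(), eeg.max()
        scaled = (eeg - lo) / (hi - lo + 1e-8)
        return (scaled * 255.0).round().astype(np.uint8)

    # Example: a hypothetical 128-electrode trial of 440 samples -> 128x440 image.
    trial = np.random.randn(128, 440)
    image = eeg_to_grayscale(trial)
    assert image.shape == (128, 440) and image.dtype == np.uint8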
As a second contribution, the thesis develops a new approach to cross-modal
deep learning based on the concept of model concatenation. This model operates on a mixed input: deep features extracted from the image combined with brain-evoked EEG data encoded via the grayscale image scheme. [...] | en_US |
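A minimal sketch of the model-concatenation idea described above, assuming two convolutional branches (one per modality) whose deep features are concatenated before a shared 40-class classifier. The PyTorch branch architectures, pooling, and feature dimension are hypothetical stand-ins, not the thesis's actual network.

    import torch
    import torch.nn as nn

    class ConcatFusionNet(nn.Module):
        # One branch embeds the natural image, another embeds the grayscale-
        # encoded EEG image; their deep features are concatenated (the "mixed
        # input") and passed to a shared classifier. Sizes are assumptions.
        def __init__(self, num_classes=40, feat_dim=128):
            super().__init__()
            # Image branch: small conv stack standing in for any CNN backbone.
            self.image_branch = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(16, feat_dim),
            )
            # EEG branch: takes the 1-channel grayscale EEG encoding.
            self.eeg_branch = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(16, feat_dim),
            )
            # Classifier over the concatenated feature vector.
            self.classifier = nn.Linear(2 * feat_dim, num_classes)

        def forward(self, image, eeg_image):
            fused = torch.cat(
                [self.image_branch(image), self.eeg_branch(eeg_image)], dim=1)
            return self.classifier(fused)

    # Example forward pass: batch of 4 RGB images with matching EEG encodings.
    logits = ConcatFusionNet()(torch.randn(4, 3, 224, 224),
                               torch.randn(4, 1, 128, 440))
    assert logits.shape == (4, 40)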