A Tight Coupling Context-Based Framework for Dataset Discovery

dc.contributor.advisor: Alagar, Vangalur
dc.contributor.advisor: Ormandjieva, Olga
dc.contributor.author: Alsaig, Alaa
dc.date.accessioned: 2023-06-20T07:03:50Z
dc.date.available: 2023-06-20T07:03:50Z
dc.date.issued: 2023-05-15
dc.description.abstract: Discovering datasets relevant to research goals is at the core of many analysis tasks that aim to prove proposed hypotheses and theories. In particular, researchers in Artificial Intelligence (AI) and Machine Learning (ML), where relevant datasets are essential for precise predictions, have identified how the absence of methods for discovering quality datasets leads to delays and, in many cases, failure of ML projects. Many research reports have highlighted the absence of dataset discovery methods that fill the gap between analysis requirements and available datasets, and have given statistics showing how this hinders the analysis process, with a completion rate of less than 2%. To the best of our knowledge, removing these inadequacies remains “an open problem of great importance”. In this context, the thesis contributes a context-based framework that tightly couples dataset providers and data analytics teams. Through this framework, dataset providers publish metadata descriptions of their datasets, and analysts formulate and submit rich queries with goal specifications and quality requirements. The dataset search engine component tightly couples the query specification with the metadata specifications of datasets through formal contextualized semantic matching and quality-based ranking, and discovers all datasets that are relevant to the analyst's requirements. The thesis gives a proof-of-concept prototype implementation and reports on its performance and efficiency through a case study.
dc.format.extent: 167
dc.identifier.uri: https://hdl.handle.net/20.500.14154/68418
dc.language.iso: en
dc.publisher: Concordia University
dc.subject: Context Awareness
dc.subject: Dataset Discovery
dc.subject: Dataset Model
dc.subject: Data Quality Features
dc.subject: Dataset Context Model
dc.subject: Data Discovery Framework
dc.title: A Tight Coupling Context-Based Framework for Dataset Discovery
dc.type: Thesis
sdl.degree.department: Software Engineering
sdl.degree.discipline: Dataset Discovery
sdl.degree.grantor: Concordia University
sdl.degree.name: Doctor of Philosophy
