Showing posts with label Presentations. Show all posts

Thursday, September 12, 2013

INCF 2013 - Using experimental design to design neuroinformatics data structures

Video Link: http://www.youtube.com/watch?v=NPlBejxhLJg&feature=youtu.be

The interdisciplinary nature of neuroscience research leads to an explosion of different informatics tools, data structures, platforms and terminologies. A central difficulty faced by developers is that knowledge representations for any neuroscience subdomain must serve the domain-specific needs of that particular sub-community. Related representations overlap, contradict each other, and follow competing standards. The process of standardization is itself difficult to organize within the community and even harder to enforce in practice. It raises complex issues of ease of use, computability and data availability, as well as scientific correctness and philosophical purity.

In this talk, I present a novel, relatively simple conceptual design that makes a clear distinction between interpretive and observational knowledge to build a general framework for scientific data. Our methodology (called 'Knowledge Engineering from Experimental Design' or KEfED) uses an experiment's protocol to define the dependencies between its independent and dependent variables. These dependencies support the construction of a data structure that can capture (a) data points, (b) mean values, (c) statistical significance relations and (d) correlations. We will describe the underlying formalism of the KEfED approach, the tools we provide to help researchers build their own models, our approach to unifying and standardizing the definition of variables, the application of KEfED to complex neuroscience knowledge and possible future research directions for this technology.
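
To make the idea of protocol-derived dependencies a little more concrete, here is a minimal Python sketch. It is not the KEfED implementation or its API: the class name ExperimentDesign, its methods, and the variable names (labeling_density, injection_site, tracer, animal_id) are all hypothetical and chosen only for illustration. The sketch assumes that each dependent variable is indexed by the values of the independent variables it depends on, and it covers only data points and mean values, not significance relations or correlations.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Any, Dict, List, Tuple


@dataclass
class ExperimentDesign:
    """Hypothetical container, not the actual KEfED model.

    `dependencies` maps each dependent variable to the independent
    variables it depends on (as would be read off a protocol).
    """
    dependencies: Dict[str, Tuple[str, ...]]
    # data[dependent][tuple of independent values] -> list of measurements
    data: Dict[str, Dict[Tuple[Any, ...], List[float]]] = field(default_factory=dict)

    def record(self, dependent: str, context: Dict[str, Any], value: float) -> None:
        """Store one data point, indexed only by the variables it depends on."""
        index = tuple(context[name] for name in self.dependencies[dependent])
        self.data.setdefault(dependent, {}).setdefault(index, []).append(value)

    def mean_values(self, dependent: str) -> Dict[Tuple[Any, ...], float]:
        """Mean of the recorded data points for each combination of independent values."""
        return {idx: mean(vals) for idx, vals in self.data.get(dependent, {}).items()}


# Illustrative (made-up) variables and values.
design = ExperimentDesign({"labeling_density": ("injection_site", "tracer")})
design.record("labeling_density",
              {"injection_site": "CA1", "tracer": "PHAL", "animal_id": 7}, 3.0)
design.record("labeling_density",
              {"injection_site": "CA1", "tracer": "PHAL", "animal_id": 8}, 2.0)
means = design.mean_values("labeling_density")
```

In this sketch, animal_id is ignored when indexing because the declared dependencies do not include it, so the two measurements fall under the same key and are averaged together.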

Sunday, August 25, 2013

The SciKnowMine Project: Bridging BioNLP and Biocuration

Biological Natural Language Processing ('BioNLP') holds great promise to support and accelerate biocuration (organizing published biomedical knowledge into online resources such as databases) but has not yet generated viable open technology for use within the community. This is an area of active research and is the subject of shared evaluations such as 'BioCreative 4'. As the closing meeting of an NSF-funded infrastructure project (called 'SciKnowMine', #0849977), we held a workshop to (A) present an implementation of a system for document triage that we are currently deploying to the Mouse Genome Informatics (MGI) system, and (B) present and develop a strategic plan for open-source, community-driven tools that bridge between curators committed to improving the quality of their informatics resources and computer science specialists developing novel NLP technology. The meeting was well attended by experts from both communities. In keeping with this blog's vision of examining the issues inherent in developing scientific breakthroughs by explicitly describing the paradigms that different disciplines inhabit, the workshop was designed around the theme of finding connecting points between these two interdependent paradigms.

The workshop page is here: 


And my introduction and talk (which goes into some detail about the way we use paradigms) is here:


and here: