Welcome to the Semantic Mining of Activity, Social, and Health data Project (SMASH)

Two-thirds of the US population is now overweight or obese, which incurs significant health risks and financial costs to society. Traditionally, support groups and other social reinforcement approaches have been popular and effective in addressing unhealthy behaviors, including those that lead to overweight. Of the factors associated with sustained weight loss, one of the most important is continued intervention with frequent social contact. Research on the design and implementation of the SMASH (Semantic Mining of Activity, Social, and Health data) system will address a critical need for data mining tools that help researchers understand the influence of healthcare social networks, such as YesiWell, on sustained weight loss, where the data are multi-dimensional, temporal, semantically heterogeneous, and very sensitive.

System design and implementation rest on five specific aims. First, we develop a novel data mining and statistical learning approach to understand the key factors that enable the spread of healthy behaviors in a social network (Aim 1). Second, we develop a formal and expressive Semantic Web ontology for the concepts used to describe the semantic features of healthcare data and social networks, and we bridge the domain knowledge in healthcare and social networks with formal mappings across those ontological concepts (Aim 2). Third, we develop novel recommendation approaches that build on the influence modeling and prediction, along with methods that use these recommendations to better organize the social network so that the adoption of optimal health behaviors can spread quickly and sustainably (Aim 3). Fourth, to protect the privacy of human subjects while mining social network and health data, we consider enforcing differential privacy through a privacy-preserving analysis layer, and we develop novel solutions that preserve differential privacy when mining the dynamic health data and social activities of human subjects (Aim 4). Finally, to support this research, we develop a web-accessible portal so that other researchers with little training in data mining will have shared access to data mining tools, ontologies, and social network analysis results (Aim 5). At the end of this project, data resources, tools, ontologies, and technologies will be made available to the larger research community.
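
As a rough illustration of what enforcing differential privacy on a query can look like, the sketch below adds Laplace noise to a simple count. It is not the SMASH privacy layer, and the project description does not say which mechanism it uses; the Laplace mechanism is swapped in here only because it is the standard textbook example. The function names, the step-count data, and the 8,000-step threshold are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so Laplace noise with scale
    1 / epsilon gives epsilon-differential privacy for this one query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy data: one daily step count per participant.
daily_steps = [3200, 8100, 12000, 4500, 10050, 9800]

# How many participants reached 8,000 steps? Smaller epsilon means
# stronger privacy and a noisier answer.
noisy_answer = private_count(
    daily_steps, lambda steps: steps >= 8000, epsilon=0.5
)
print(f"noisy count: {noisy_answer:.2f}")
```

Each released answer consumes part of a privacy budget, so a real analysis layer would also need to track the total epsilon spent across queries, which this sketch does not attempt.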

This work is an interdisciplinary collaboration among PI Dejing Dou, an expert in ontologies and semantic data mining at the University of Oregon; Co-I Brigitte Piniewski, MD, the chief medical officer at PeaceHealth Laboratories and the lead of YesiWell; Co-I Ruoming Jin, an expert in complex network and graph mining at Kent State University; Co-I Xintao Wu, an expert in privacy-preserving mining at the University of North Carolina at Charlotte; Co-I Jessica Greene, an expert in health policy and online intervention at the University of Oregon/George Washington University; Co-I Daniel Lowd, an expert in statistical machine learning at the University of Oregon; Consultant David Kil, the former Chief Scientist at SKT Americas, program manager of YesiWell, and founder of HealthMantic; and Co-I Junfeng Sun, a mathematical statistician at the NIH and an expert in the design of clinical trials.

This project is supported by a three-year R01 grant from the NIH/NIGMS titled "Understanding the Mechanism of Social Network Influence in Health Outcomes through Multidimensional and Semantic Data Mining Approaches" (Award Number: R01GM103309, $1.54M, 5/1/2013 - 2/29/2016).