UK-HDAN Proposal Development Workshop: Well Sorted Materials
6th January 2017
Contents
Introduction
Original Question
Dendrogram
Tree Map
Heat Map
Raw Group Data
For an online, interactive version of the visualisations in this document, go here:
www.well-sorted.org/output/January2017UKHDANProjectIdeas
Introduction
Dear participant,
Thank you for taking part in submitting and sorting your ideas.
This document contains several visualisations of your ideas, grouped by the average of your online sorts. They
are:
Dendrogram - This tree shows each submitted idea and its similarity to the others. The lower down two ideas 'join',
the more people grouped those two ideas together. For example, if two ideas join at the bottom, every person
grouped those two together.
Tree Map - This visualisation presents an 'average' grouping. It is calculated by 'cutting' the Dendrogram at the
dashed line so that any items which join lower than that line are placed in the same group. In addition, rectangles
which share a side of the same length are more similar to each other than their peers.
Heat Map - This visualisation shows a similarity matrix where each idea is given a colour at the intersection with
another idea, showing how similar the two are. This is useful to see how well formed a group is. The more red
there is in a group (shown by the black lines), the more similar the ideas inside it were judged to be.
Raw Group Data - This table shows every submitted idea and its longer description. They are shown in the same
order as the Dendrogram (so similar ideas are close to each other) and split into the coloured groups used in the
Tree Map. In addition, each idea has been given a unique number so they are easier to find.
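The averaging and cutting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code Well Sorted actually runs: the function names, the toy sorts, the choice of average linkage and the cut height of 0.5 are all invented for the example.

```python
from itertools import combinations

def similarity_matrix(ideas, sorts):
    """Fraction of participants who placed each pair of ideas in the same group."""
    counts = {frozenset(pair): 0 for pair in combinations(ideas, 2)}
    for groups in sorts:                      # one participant's sort
        for group in groups:
            for pair in combinations(sorted(group), 2):
                counts[frozenset(pair)] += 1
    return {pair: n / len(sorts) for pair, n in counts.items()}

def cut_dendrogram(ideas, sim, cut_height):
    """Average-linkage agglomerative clustering, stopped at cut_height.

    Distance is 1 - similarity, so two ideas that every participant grouped
    together join at height 0, the bottom of the tree.
    """
    clusters = [[idea] for idea in ideas]

    def distance(a, b):
        dists = [1.0 - sim[frozenset((x, y))] for x in a for y in b]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]))
        if distance(clusters[i], clusters[j]) > cut_height:
            break                             # every remaining join lies above the cut
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(c) for c in clusters]

# Hypothetical ideas and three participants' sorts.
ideas = ["A", "B", "C", "D"]
sorts = [
    [{"A", "B"}, {"C", "D"}],
    [{"A", "B"}, {"C"}, {"D"}],
    [{"A", "B", "C"}, {"D"}],
]
sim = similarity_matrix(ideas, sorts)         # the values a heat map would colour
groups = cut_dendrogram(ideas, sim, cut_height=0.5)
```

Here every participant grouped A with B, so they join at the bottom and end up in the same group, while C and D join the tree above the cut and stay separate.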
UK-HDAN Proposal Development Workshop - Materials Generated by Well-Sorted.Org
Page 1
Original Question
On this page, we present the original question which was posed by the event organiser.
This was originally presented above input boxes on the Well Sorted website itself, so we present it here as close
to how it originally appeared as possible.
The original question was as follows:
Please submit your Health Data Analytics Project Ideas below.
Project Title – this should be a short (max. 50 characters) informative summary of the idea in the form
<methodology required> for [<problem area>].
e.g. Time series analysis for continuous mental health data
If you are purely a methodologist, just omit the second part
Project Description – this should detail the healthcare opportunity/data science challenge you are seeking to
address (max. 255 characters)
Because we are trying to bring together methodologists & problem holders, we need you to express your ideas in
a way that is accessible to both groups.
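The two length limits in the question are easy to check before submitting. The sketch below is a hypothetical helper, not part of the Well Sorted website; only the limits of 50 and 255 characters come from the question itself.

```python
MAX_TITLE_CHARS = 50         # limit stated in the question
MAX_DESCRIPTION_CHARS = 255  # limit stated in the question

def check_submission(title, description):
    """Return a list of problems; an empty list means the idea fits the limits."""
    problems = []
    if not title.strip():
        problems.append("title is empty")
    if len(title) > MAX_TITLE_CHARS:
        problems.append(f"title is {len(title)} characters (max {MAX_TITLE_CHARS})")
    if len(description) > MAX_DESCRIPTION_CHARS:
        problems.append(
            f"description is {len(description)} characters (max {MAX_DESCRIPTION_CHARS})"
        )
    return problems

# A title taken from the submitted ideas, with a placeholder description.
problems = check_submission("Causal inference methods for big data", "x" * 255)
```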
Dendrogram
Tree Map
Heat Map
Raw Group Data: Blue Group
Colour | # | Title | Description
Blue | 5 | Improving early disease detection | Could understanding patterns of health care use and existing conditions help predict who will develop a serious/chronic health condition, or improve its early diagnosis?
Blue | 6 | Predictive Analytics for Emergency Departments | Using internal / external / historic / forecast data to create a set of predictive models for hospital EDs that allow resources to be better utilised and assets leveraged. Improve throughput / patient experience / budgets.
Blue | 7 | Prevalence and patterns of Multimorbidity | The ageing population has resulted in an increase in the prevalence of people living with more than one long term condition. To better manage and care for these patients we need to identify the highly prevalent combinations of these conditions.
Blue | 8 | Development of Multimorbidity over the life course | The ageing population has resulted in an increase in the prevalence of people living with more than one long term condition. To develop prevention strategies we need to understand the typical development of different conditions over a person's life.
Blue | 9 | Using cohort intelligence to improve svce redesign | Using algorithm-driven cohort analytics to provide transformation teams with a deep understanding of complex, multimorbid population needs. This will inform the exploration and evaluation of new care model responses that meet those needs.
Blue | 10 | ProjectDataSphere: Open Oncology & Models | ProjectDataSphere is an open-source data source which contains ~36000 patients' worth of clinical trial time-series data across numerous cancer types. A combination of mathematical biology and biostatistics approaches could lead to new disease insights.
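Idea 7 asks for the most prevalent combinations of long term conditions. A first, purely descriptive step could look like the sketch below; the patient identifiers, condition names and function name are hypothetical, and a real analysis would also need to account for age, sex and how conditions are coded.

```python
from collections import Counter
from itertools import combinations

def combination_prevalence(patients, size=2):
    """Share of patients who have each combination of `size` conditions."""
    counts = Counter()
    for conditions in patients.values():
        # sorted(set(...)) makes each combination order-independent and unique
        for combo in combinations(sorted(set(conditions)), size):
            counts[combo] += 1
    total = len(patients)
    return {combo: n / total for combo, n in counts.most_common()}

# Hypothetical patients and their recorded long term conditions.
patients = {
    "p1": ["diabetes", "hypertension"],
    "p2": ["diabetes", "hypertension", "copd"],
    "p3": ["copd"],
    "p4": ["hypertension", "diabetes"],
}
prevalence = combination_prevalence(patients)
```

In this toy data three of the four patients have both diabetes and hypertension, so that pair comes first in the result.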
Raw Group Data: Green Group
Colour | # | Title | Description
Green | 11 | A platform for rapid learning in healthcare | OpenClinical.net is a toolset for creating and deploying executable models of clinical decision-making and workflow. The key challenge now is to "close the loop", to capture, aggregate and learn from data acquired when using the models for supporting care.
Green | 12 | Data tools driving collaborative transformation | Using robust epidemiology, clinical ontologies and financial structures to co-develop 'trusted' pre-validated service redesign tools. These will facilitate system-wide collaborative working and the development of sustainable transformation plans.
Green | 13 | Bridging the Gap: From Problems to Methods | A common obstacle to interdisciplinary research is the use of different terminologies or descriptions of research problems and methods in different areas of science. We need a new approach for mapping between problems and methods across disciplines.
Raw Group Data: Orange Group
Colour | # | Title | Description
Orange | 14 | Recognising topological patterns in data | Assumes data clusters have geometric meanings and studies them using topological theories. Problem: How can data be clustered geometrically?
Orange | 15 | GM method for continuous MH data for evaluation | We propose the genetic matching (GM) method to maximize balance of observed covariates between treated and control groups for economic evaluation within observational studies. We estimate the average treatment effect on the treated by one-to-one matching.
Orange | 16 | Causal inference methods for big data | Championed benefits of big data rely on prediction; not a single claim involves causal inference. For health and wellbeing, interest lies in understanding putative causes, in order to intervene. We desperately need causal inference methods for big data.
Orange | 17 | Mathematical Biology meets Biostatistics | There is a growth in new algorithms to analyse healthcare data. These approaches come from a "top down" perspective and so lack any link to basic science. There is clearly an opportunity to marry basic science with "top down" approaches via biomathematics.
Orange | 18 | Algorithmic bias in health research | As researchers, it is important to learn from, and try not to reproduce, the inherent biases present in our data sets and researchers. How will lack of diversity amongst data scientists and under-recorded users of healthcare exacerbate health inequalities?
Orange | 19 | Data science meets knowledge engineering | Data science & machine learning are strongest in "analysis tasks" (diagnosis, prediction...). Patient care also requires "synthesis tasks" (planning, design...) studied by knowledge engineers. This project will show how to unify the two approaches.
Orange | 20 | Analysing data by Machine learning algorithms | It is specifically applied in streaming data. Machine learning is used to optimise the patterns of data. Problem: what about errors and missing values? Machine learning accumulates errors iteratively.
Orange | 21 | Missing data methods for big data | Dealing with missingness is a challenge for big data, relying on multiple imputation or inverse probability weighting, but each requires careful specification of the missingness model, which becomes intractable for big data. We seek to address this.
Orange | 22 | Multiple imputation of outcome variables | We propose a multiple imputation method to address missing outcome data values for economic evaluation within RCTs. We first impute missing variables, then analyse the completed data set and integrate it into the final economic evaluation results.
Orange | 23 | Time series analysis of daily self-report | Large dataset of daily self-report data (up to 365 days per individual) and hourly weather data; how to do time series analysis / case crossover, correcting for seasonality and other confounders and handling missing data.
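Idea 15 estimates the average treatment effect on the treated (ATT) by one-to-one matching. The sketch below shows only that final step, using plain nearest-neighbour matching with replacement on a single covariate. It is not genetic matching, which additionally searches for covariate weights that optimise balance; the function name and the toy data are assumptions.

```python
def att_one_to_one(treated, controls):
    """Estimate the ATT by pairing each treated unit with its nearest control.

    Each unit is a (covariate, outcome) pair. Matching is with replacement,
    so one control may serve as the match for several treated units.
    """
    diffs = []
    for x_t, y_t in treated:
        # nearest control on the covariate
        x_c, y_c = min(controls, key=lambda unit: abs(unit[0] - x_t))
        diffs.append(y_t - y_c)
    return sum(diffs) / len(diffs)

# Hypothetical (age, cost) pairs for treated and control patients.
treated = [(50, 12.0), (60, 15.0)]
controls = [(49, 10.0), (61, 12.0), (30, 5.0)]
att = att_one_to_one(treated, controls)
```

The two matched differences are 2.0 and 3.0, giving an estimated ATT of 2.5 for this toy data.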
Raw Group Data: Purple Group
Colour | # | Title | Description
Purple | 24 | Analysing detailed event data to stratify people | Health records include tens of thousands of different event types (e.g., diagnoses). If we can stratify (group) people according to the events in their records, then health professionals will be able to make better clinical and planning decisions.
Purple | 25 | Top-down clustering to identify new cohorts | Use unsupervised machine learning techniques to discover the 'natural segments' of patient populations. Success can point towards new definitions of co-morbidities and disease/treatment outcomes. This in turn can influence the way care is commissioned.
Purple | 26 | Feature selection to Identify children at risk | To define a minimum set of features (Read codes) that helps GPs identify children at risk of disease. Case study: Asthma. The set can later be used to assess completeness of data gathered during consultations.
Purple | 27 | Deep Learning for Phenotyping | Recent advances in deep learning include generative adversarial networks and variational autoencoders; these systems offer the capacity to discover latent structure in complex data, and could therefore be complementary to conventional biomedical analysis.
Purple | 28 | Hierarchical events in temporal data mining | Algorithms searching for patterns are limited due to the number of events. Composing them into a hierarchical structure (e.g. Read codes) and deciding what level of the event description should be used could lead to better discoveries in the health domain.
Purple | 29 | Classification using dynamic data | Patient data do not stay idle over time and the evolution of data includes a wealth of information. Identifying patterns in time (and possibly space) can enhance preventative medicine and early interventions.
Purple | 30 | Sensor Informatics via Bayesian Gaussian Processes | Bayesian methods for time-series analysis have been explored for various applications in health informatics (often ICU applications), but much remains to be done in the integration of hospital care (and home care) via reliable mobile monitoring.
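Idea 28 turns on choosing the level of a code hierarchy at which events are described. The sketch below assumes 5-character codes in which each further character adds detail and unused positions are padded with dots, as in Read codes version 2. The codes listed are illustrative only, and the function name is hypothetical.

```python
from collections import Counter

def roll_up(code, level):
    """Describe a 5-character, dot-padded code using only its first `level` characters."""
    if not 1 <= level <= 5:
        raise ValueError("level must be between 1 and 5")
    return code[:level].ljust(5, ".")

# Illustrative event codes; a real analysis would read these from patient records.
events = ["H3300", "H3301", "H331.", "H32..", "G30.."]

# Count events at three levels of the hierarchy: coarser levels give fewer,
# more frequent event types for a pattern-mining algorithm to work with.
counts_by_level = {
    level: Counter(roll_up(code, level) for code in events)
    for level in (1, 2, 3)
}
```

At level 3 the three codes starting with H33 collapse into one event type, and at level 1 four of the five events share a single chapter-level code.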
Raw Group Data: Yellow Group
Colour | # | Title | Description
Yellow | 31 | Reliability Assessment of Reported Methods | The reliability of research methods reported in scientific papers often depends on the replicability of the original studies. We propose to develop a Knowledge Base of recent methodologies that can be used for reliability/validity assessment purposes.
Yellow | 32 | Dealing with temporality & vagueness in EHR text | Text analytics is successfully used to support health data analytics. However, there are open questions. For example, the vague and nuanced nature of some EHR text makes analysis difficult, and the linking of events to times is largely unsolved.
Yellow | 33 | Ambiguous diagnoses in primary care data | Primary care data is usually in the form of codes and free text. Users can add codes, the codes have varying levels of diagnostic specificity and their use is often based on personal preferences, resulting in ambiguity and inconsistency in diagnoses.
Yellow | 34 | Business rules to thoroughly profile data quality | Organisations such as NHS Digital have mature data quality procedures, but can only perform basic checks. If they can develop more comprehensive rules then the data quality improvements will have widespread benefit for secondary uses of health data.
Yellow | 35 | Data quality of longitudinal health data | New Visual Analytics tools to investigate data quality of longitudinal EHR data. Completeness and Correctness are the most important features to first investigate since the others (concordance, consistency) depend on the quality of these.
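Idea 35 names completeness as one of the first data quality features to investigate. One simple way to measure it is the fraction of records with a usable value for each field, sketched below. The record layout, the field names and the choice to treat `None` and empty strings as missing are assumptions made for illustration.

```python
def completeness(records, fields):
    """Fraction of records that hold a non-missing value for each field."""
    missing = (None, "")  # assumption: these two values mean "not recorded"
    return {
        field: sum(r.get(field) not in missing for r in records) / len(records)
        for field in fields
    }

# Hypothetical longitudinal records for two patients.
records = [
    {"patient": "p1", "date": "2016-01-04", "weight_kg": 71.5},
    {"patient": "p1", "date": "2016-07-11", "weight_kg": None},
    {"patient": "p2", "date": "", "weight_kg": 64.0},
    {"patient": "p2", "date": "2016-09-30"},
]
scores = completeness(records, ["date", "weight_kg"])
```

A visual analytics tool could plot such scores per field and per time period to show where recording is patchy.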
Raw Group Data: Pink Group
Colour | # | Title | Description
Pink | 36 | Joint models - intensive longitudinal MH-SWB data | Joint models for intensive longitudinal health data will develop ways of combining intensive longitudinal data on mental health and well-being measures to explore whether trajectories are parallel, divergent or overlapping to some degree (GLLAMM/SEM model).
Pink | 37 | Attrition in longitudinal mHealth studies | mHealth / smartphones provide new opportunities for healthcare research. Engagement is, however, often lower than in traditional studies. New methods for increasing engagement and correcting for attrition are therefore essential.
Pink | 38 | "digital enhancing" of 3-D understanding of HRQOL | Digital enhancement of three-dimensional understanding of health-related quality of life (HRQOL): develops methods to identify if a social dimension captured digitally (e.g. social media) can clarify "health-related QOL" as primary/1st, 2nd and 3rd dimensions.
Raw Group Data: Silver Group
Colour | # | Title | Description
Silver | 39 | Agile safety assurance methodology for mHealth app | We are developing a new approach to assuring the safety of mHealth apps based on integrating safety assurance activities into agile practices in a way that can address the challenges resulting from mHealth apps.
Silver | 40 | Principled validation for smartphone applications | The market is a blizzard of health devices and apps, few of which are validated in any sense of the word, and fewer still which have any measurable health benefit. Behavioural science needs to be more central to this endeavour and linked to ultimate gain.
Silver | 41 | Mobile data for personal health forecasting | How can we make use of mobile and sensor data, in combination with digital models of disease, to predict and prevent negative health episodes, and to encourage positive behaviour? Issues related to the Vs of big data, but also integration and interface.
Silver | 42 | Analyse computer interaction for mental health | Opportunity: early diagnosis helps to mitigate, and possibly cure, many mental health conditions (e.g. dementia). Develop a computer-based monitoring system to non-intrusively track and data mine people's behaviour with computers.
Silver | 43 | Behaviour change from IT healthcare management | Challenge is to present the data mining results in health care (e.g. mental health / dementia onset) to users to effect behaviour change, either for self-referral or to modify their behaviour when their treatment needs to be changed.
Silver | 44 | Measurement of social behaviour in Parkinson's | Existing smartphone apps and wearables for Parkinson's disease tend to focus on motor symptoms. Social and communicative difficulties can be significant in Parkinson's; a tool to track social behaviour and communication could inform therapy development.
Raw Group Data: Brown Group
Colour | # | Title | Description
Brown | 45 | Linking commercial and clinical data | To move towards prediction of sickness, how and where do we capture data from our "social determinants of health"? What can we learn from the commercial marketing world and online behaviours it captures, in order to predict clinical episodes?
Brown | 46 | Distributed analysis: administrative & health data | Secure linkage & co-analysis of distributed health & administrative data that is vertically partitioned. We have approaches under development in DataSHIELD (www.datashield.ac.uk) based on secure multi-party computing. Seeking collaborators & applications.
Brown | 47 | Privacy-preserving data mining | Building systems that allow analysis on the lowest level of details for confidential/sensitive data from one or more sources without revealing sensitive information to end users.
Brown | 48 | Data Protection and Research Ethics Committees | A study of the issues RECs have with data protection, particularly focussing on any differences with the Confidentiality Advisory Group of the HSCIC.
Brown | 49 | What are users doing with the data? | How can we audit / track user analyses to determine if there is misuse of sensitive research data? Applications include identifying potential inferential disclosure and data misuse. Seeking existing solutions or collaborators to co-develop solutions.
Brown | 50 | A Pilot Study of an Anonymisation Matrix | We have developed an anonymisation framework to guide decision-making by RECs. We would like to trial this framework to assess its utility.
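Ideas 46 and 47 both rely on computing over data that no single party may see in full. One textbook building block of secure multi-party computing is a sum over additive secret shares, sketched below as a single-process simulation. This is not how DataSHIELD is implemented; the modulus, the function names and the per-site counts are assumptions for the example.

```python
import secrets

MODULUS = 2**61 - 1  # arbitrary large modulus chosen for this sketch

def make_shares(value, n_parties):
    """Split `value` into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_values):
    """Simulate a sum in which no party's raw value is revealed to another."""
    n = len(private_values)
    # Party i splits its value and hands share j to party j.
    all_shares = [make_shares(value, n) for value in private_values]
    # Each party adds up the shares it holds and publishes only that subtotal.
    subtotals = [sum(row[j] for row in all_shares) % MODULUS for j in range(n)]
    # The published subtotals combine to the true total.
    return sum(subtotals) % MODULUS

site_counts = [120, 45, 300]  # hypothetical per-site patient counts
total = secure_sum(site_counts)
```

Each individual share is a uniformly random number, so a party learns nothing about another site's count from the shares it receives, yet the subtotals still add up to 465.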