
Why data science is unlocking new doors in mental health research

Data could be the key to new discoveries in mental health research – but it should be handled in the right way

Image by Gerd Altmann from Pixabay

Dr Elizabeth Kirkham

The importance of involving people with lived experience of mental illness in mental health research is becoming widely recognised.

However, this can sometimes turn into a tick-box exercise, whereby the views of people with lived experience are merely tacked on at the end, rather than used to shape the research itself.

Involving people with lived experience throughout the research process can help ensure that the views of those most affected are heard.

Data science makes new discoveries possible

A lot of my work focuses on ‘mental health data science’. This typically involves bringing together routinely collected health information, such as NHS records, from thousands of people. This data is then ‘de-identified’, which means the names and other identifying information have been removed before anyone, including myself or my colleagues, sees it.
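
As a rough illustration of what removing identifying information can look like, here is a minimal Python sketch. The field names and the example record are hypothetical, invented for this illustration, and de-identification of real health records is carried out by the organisations that hold the data and involves considerably more than dropping a few fields.

```python
# Hypothetical field names, chosen for illustration only; real health
# datasets define their own lists of identifying information.
IDENTIFYING_FIELDS = {"name", "address", "date_of_birth", "nhs_number"}


def de_identify(record: dict) -> dict:
    """Return a copy of the record with the identifying fields removed."""
    return {
        field: value
        for field, value in record.items()
        if field not in IDENTIFYING_FIELDS
    }


# A made-up record; no real person or real data is represented here.
example_record = {
    "name": "Example Person",
    "nhs_number": "0000000000",
    "age_band": "30-39",
    "diagnosis": "depression",
}

# Only the non-identifying fields are passed on to the research team.
research_record = de_identify(example_record)
```

In this sketch the researcher would only ever see `research_record`, which keeps the age band and diagnosis but not the name or NHS number.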

Data science is important because it allows us to make discoveries that would not be possible with traditional methods, which typically involve small samples, made up of only the people who have the time and resources to take part in a study.

As mental health data science develops, it is essential that the expertise of people with lived experience is embedded within its practice. To support this aim, my research group at the University of Edinburgh worked with a team of experts to create a best practice checklist for mental health data science. This checklist can be used by researchers to ensure that their work is both scientifically rigorous and beneficial to the people who provide the data.

The opportunities and challenges of mental health data science

Mental health data science involves analysing data from thousands of people to spot trends which allow us to learn new things about mental illness and identify how we can improve mental health treatments.

Sometimes this means using data from studies in which large numbers of people have taken part, but it can also mean using routine data which was originally collected for a different purpose, such as when people received health care.

Greater use of routine data, such as electronic NHS records, has substantial potential for mental health research. Traditional research methods, in which people volunteer to take part in studies, often end up with biased samples which don’t reflect the real picture.

This is a particular problem for mental health research, as the people who are most in need may be missed out because they do not have the time or resources to take part. The use of routine data can help make findings applicable to a much wider range of people.

Whilst routine data presents many opportunities for mental health research, some people have understandable concerns about researchers accessing it.

Researchers aim to address these concerns in various ways, including making sure people cannot be identified from their data, ensuring that all studies receive strict ethical approval before data can be accessed, holding the data in secure digital and physical locations, and requiring that staff undergo training in safe data handling.

Centring people with lived experience in mental health data science

Mental health data science could not exist without information from people who live with mental illness. As such, its governance and practices must reflect the views of people with lived experience. To support this, we conducted a piece of research which resulted in the creation of a best practice checklist for mental health data science.

Clearly, views on how to use this kind of data will differ considerably across individuals, and it should go without saying that not everyone with lived experience has the same perspective on data science. Therefore, to create a useful checklist, we needed a research method that would allow us to distil everyone’s views into some form of consensus. The “Delphi” method was a great way to do this.

Although Delphi studies can be conducted in different ways, a typical online Delphi study works in the following manner. Experts are presented with an online survey in which they read information and provide individual, anonymous feedback on this information. The researchers combine the experts’ feedback and extract the key points. They then create an updated version of the information and send this new version back to the experts.

The process continues for three “rounds”, ultimately resulting in a document (in our case the checklist) which represents the condensed and combined knowledge from across the group of experts.
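
For readers who think in code, the round-by-round structure described above can be sketched in a few lines of Python. This is a simplified illustration, not the procedure or software we used: the function names, the idea of representing each expert as a function, and the `revise` step are all assumptions made for the example. In a real Delphi study, combining the feedback is careful work done by the research team rather than by a program.

```python
from typing import Callable, List

# Assumption for this sketch: an expert is modelled as a function that
# reads the current draft and returns one piece of written feedback.
Expert = Callable[[str], str]


def run_delphi(
    draft: str,
    experts: List[Expert],
    revise: Callable[[str, List[str]], str],
    rounds: int = 3,
) -> str:
    """Repeat the feedback-and-revise cycle for a fixed number of rounds."""
    for _ in range(rounds):
        # Each expert responds individually; no names are attached to
        # the feedback, so it stays anonymous.
        feedback = [expert(draft) for expert in experts]
        # The researchers combine the feedback into an updated draft,
        # which is sent back to the experts in the next round.
        draft = revise(draft, feedback)
    return draft


# Toy usage: two stand-in "experts" and a stand-in revise step that
# simply appends the combined feedback to the draft.
toy_experts = [lambda draft: "shorter", lambda draft: "clearer"]


def toy_revise(draft: str, feedback: List[str]) -> str:
    return draft + " | " + ", ".join(sorted(feedback))


final_draft = run_delphi("v0", toy_experts, toy_revise, rounds=2)
```

The default of three rounds mirrors the number of rounds in our study; other Delphi studies may use a different number.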

Importantly, we worked with people who had both expertise in data science (such as a related university degree, or experience working in a research team), and lived experience of mental illness.

As such, the experts were able to produce balanced and thorough perspectives entirely from their own experience, and there was no need for researchers to attempt to integrate views from people artificially separated into different groups (e.g., data scientists in one group and people with lived experience in another). This reduced the risk of bias, however unintentional, from the researchers.

It is important to mention here that, whilst well-intentioned events and publications often mistakenly assume that people with research expertise and lived experience represent separate groups of people, many professionals have experience with both. Indeed, it was not especially difficult for us to recruit more than 30 research experts with lived experience of mental illness, though the anonymity of the project may have helped with this.

The best practice checklist for mental health data science

The best practice checklist is the output of our Delphi study. It represents a set of guidelines for researchers and institutions who want to ensure that their research is both scientifically rigorous and in line with the views of people with lived experience of mental illness.

The checklist contains 14 items, each of which is divided into two sections: what researchers can do right away, and what the wider research community should seek to do in the future. These items cover four topics: security, anonymity, transparency, and community. Research teams can go through this list at the start of a project to make sure they have considered all the elements of best practice identified by our experts. To assist in this process, we have brought together a series of case studies from researchers across the UK to give practical examples of how the checklist can be implemented.

We would encourage all researchers working in mental health data science to download the checklist, which is available for free on our website.

Find out more

If you’d like to find out more about how we made the checklist and what it contains, have a look at this talk by Sue Fletcher-Watson, which includes a follow-up Q&A session with Mahmud Al-Gailani (formerly of VOX Scotland), senior data scientist Matthew Iveson, and me, Elizabeth Kirkham.

To stay up to date with the work McPin does, follow us on Twitter @McPinFoundation or sign up to our newsletter.


Dr Elizabeth Kirkham is a Research Associate at the University of Edinburgh. You can follow her on Twitter at @EK_Neuro.

For more on why data science is so important to mental health research, read this blog from a lived experience researcher.