Professor John Hopper AM is a renowned Australian genetic epidemiologist. As director of Twins Research Australia, he uses statistical analysis to untangle enormous datasets and find out how we can improve our health.
What happens when you combine skills in mathematical statistics with a career in medical research? If you’re alumnus Professor John Hopper AM (La Trobe University Distinguished Alumni Award, 2019), you start saving lives.
Throughout his career as a genetic epidemiologist, John has run hundreds of statistical analyses – and authored as many scientific papers – to make sense of large-scale health data. His models aim to improve people’s health by uncovering the genetic (‘nature’) and environmental (‘nurture’) factors that drive disease risk. Armed with this information, doctors can mitigate or even prevent the onset of disease, improving patients’ health outcomes through personalised care.
“My training at La Trobe was in mathematical statistics. It's not that I have great training in medicine. I'm not a biologist and I don't presume to know the answers, but I believe the information is in the data. And, thanks to La Trobe, I have the ability to uncover it,” John says.
John’s work has helped enable earlier diagnosis of diseases and conditions such as breast cancer, colorectal cancer, melanoma and asthma. His latest research at Twins Research Australia investigates how the COVID-19 pandemic is affecting families’ health and life experiences. So it’s remarkable to learn that he might have missed his calling altogether, had he not discovered the power of working with real data during his PhD at La Trobe.
“My PhD at La Trobe was the first opportunity I’d had to analyse real-world data. That was very important to me. Before that, the datasets we were using were a bit ‘Mickey Mouse’ – they were hypothetical examples that weren’t grounded in reality. And I think that's really what made me want to have a career in medical research, because I realised I had a skill that worked and could actually make a difference,” says John.
“Once you start to get into analyses like that, it's not just something you do while you're sitting at the computer. For example, I’d walk to the car and suddenly have a new idea and have to walk back to work at six o'clock at night to try it out!”
Making sense of what’s in the data
All statisticians face an unrelenting challenge: to make sense of what they’ve found in the data. For John, that challenge is also where the excitement lies.
“Darwin famously said, ‘Contrary to what I first expected’ – and that's what it's like as a statistician. The fun is in finding new things, especially if they're not what people expect. Once, I found an association of smoking with an outcome, and it took me two days to realise it was in the opposite direction to what I’d thought it was in!” John says.
As a statistician, you have an incredibly powerful scientific tool. You can find information people didn't know, just by analysing the data. You’re able to get a signal from the noise, to find the truth. You keep on thinking and thinking about how you can do better, about what other opportunities are there. You're never satisfied.
When John was completing his La Trobe PhD back in the 1980s, today’s culture of big data, machine learning and predictive analytics didn’t exist. Despite this, John embraced computers early on – to great success.
“I sat at a little computer terminal on my own and I slowly learnt how to make the computer help me solve problems. I was doing it in a very amateurish way, but I was recognising the value of the computer in fitting more sophisticated models and answering questions you couldn't with pen and paper. It seemed to be cheating,” John says.
“At that time, most people in the maths department weren’t using the computer, they were doing the analyses in their brains. All my other colleagues were brighter than me. They could work it out in their heads. I was just using the computer to get the answer, but it turned out I got a better answer.”
John’s time at La Trobe developed not only his passion for discovery, but also his talent for trying to do things differently. This sensibility benefited his career, as emerging approaches to health data collection resulted in vast family and population-level datasets.
“As knowledge of genetics advanced, we realised there was a need for population-based studies. We began collecting much more data – family data, epidemiological information, DNA and blood samples – to allow more sophisticated studies to be done. We spent a lot of time, my friends and colleagues in the '80s and '90s, laying the foundation for population-based family studies that are now proving to be extremely useful.”
Why twins make the perfect dataset
After graduating, John went to work at The University of Melbourne as a statistician in medical research. There, he was introduced to what would become one of the most important databases of his career: a twin registry.
“Twins put a magnifying glass on what it is to be human. They let us look at both genetic factors and environmental and lifestyle factors, not so much in competition, but rather how they combine to determine our health,” John says.
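Twin studies allow statisticians to separate how the environment affects an individual from how their genes affect them. Identical twins have the same DNA, so when you compare one identical twin with the other you can exclude, or ‘control for’, the impact of genetic factors: any difference between them points to environment and lifestyle. Comparing how alike identical pairs are with how alike non-identical pairs are, who share on average about half of their DNA, then shows how much genetic factors contribute.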
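The comparison of identical and non-identical twins described above can be made concrete with a small worked example. The sketch below uses Falconer's formula, a classical textbook approximation, and is not a description of the models John or Twins Research Australia actually fit. The function names and all of the measurements are invented for illustration, and the formula assumes, among other things, that both types of twin pairs share their environments to the same degree.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def falconer(r_identical, r_non_identical):
    """Split trait variation into three shares using Falconer's formula.

    genetic      = 2 * (r_identical - r_non_identical)
    shared_env   = 2 * r_non_identical - r_identical
    unshared_env = 1 - r_identical
    The three shares always add up to 1.
    """
    return {
        "genetic": 2 * (r_identical - r_non_identical),
        "shared_env": 2 * r_non_identical - r_identical,
        "unshared_env": 1 - r_identical,
    }


# Invented measurements of some trait, one (twin 1, twin 2) tuple per pair.
# These numbers are made up for illustration and are not real registry data.
identical_pairs = [(10.1, 10.4), (12.3, 12.0), (9.8, 10.2), (11.5, 11.1), (13.0, 12.6)]
non_identical_pairs = [(10.1, 11.6), (12.3, 11.0), (9.8, 10.9), (11.5, 12.4), (13.0, 11.9)]

# How alike are the two twins in each kind of pair?
r_identical = pearson(*zip(*identical_pairs))
r_non_identical = pearson(*zip(*non_identical_pairs))

for share, value in falconer(r_identical, r_non_identical).items():
    print(f"{share}: {value:.2f}")
```

With only a handful of pairs the estimates are very noisy and can even fall outside the range 0 to 1, which is one reason real twin studies rely on large registries and more sophisticated models.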
Twins are special. They give us wonderful ways to differentiate between the roles of genetic and environmental factors on health and disease. And they allow us to answer questions about genes and environment in ways we couldn't otherwise do.
Since 1990, John has been Director of Twins Research Australia – the voice of the Australian Twin Registry. One in six Australian twins is listed in the database, which captures data from opposite-sex twins, identical twins, non-identical twins and twin families. The registry has grown under John’s leadership to comprise 40,000 pairs of twins and has positioned Australia as a world leader in twin studies.
For John, becoming an expert in analysing twin and family data using statistics has also led to his biggest discovery: that genetic and environmental factors work in combination, rather than in competition.
“One of the really important things I've discovered, which now seems so obvious, is that it's not nature versus nurture: it's nature and nurture, together. For example, the more you're at genetic risk, the more important your environment becomes. And by contrast, the more you're at environmental risk, the more important your genes become.”
From population data to personalised healthcare
Such a refined understanding of disease risk is, John believes, the future of personalised healthcare. With advances in data collection, machine learning algorithms and predictive analytics set to continue, John and fellow genetic epidemiologists hope to use statistical modelling to identify which patients are at greatest risk of disease. What they learn can, in turn, inform decisions about how to mitigate a person’s risk of illness and improve their long-term health.
Rather than everyone being treated as if they're all at equal risk of everything, which is really a last-century way of thinking about health, the future will be understanding what you're at risk for: what diseases really are a problem for you in your life, and what's the best thing to do about those diseases?
As an example, John describes how his work might affect women at risk of breast cancer. Data collected from a mammogram can now be modelled to estimate a woman's breast cancer risk and to identify a care pathway to reduce it.
“I've realised we can learn more about a woman's risk of breast cancer from looking at her mammogram than looking at her DNA. Now mammography is digital, the computer can learn what it is about a mammogram that predicts a woman's future risk of breast cancer,” he says.
“In the future, that information might be given to a woman immediately, at the time of her mammogram. We might say to her, ‘Well, you need to come back in two years. You need to go and have a different scan. You need to do X and Y if you're going to reduce your risk of breast cancer.’”
John’s research on breast cancer is driven by the same thing that sparked his love of computer-based statistics as a PhD student at La Trobe: real datasets, delivering real impact.
“The most important thing about research is can it, does it, will it make a difference? Are you changing the way things operate, are you saving lives? I’ve seen how important it is to focus on where your work is actually relevant, by having contact with the people my research will impact. It’s for women affected by breast cancer and their families that I’m doing this work, to prevent them developing the disease, or to be diagnosed earlier and have much better outcomes.”
To John, the future of health is data-driven. He feels lucky, and a little amazed, at his career journey, as well as grateful for the important role La Trobe University played. Looking ahead, he’s excited by the next generation of statisticians who, in his opinion, are set to achieve extraordinary things.
“Back when I finished my PhD, we were trying to work out how to use new data and figure out the statistical ideas, the design ideas. Now we're in the 21st century. We have amazing computing power to analyse big datasets on epigenetics and genomics. And we also have a whole new generation of computer scientists, statisticians and mathematicians who can make it work. They’re opening up a whole world of machine learning and artificial intelligence. That's the brave new future that I see: really bright people with powerful tools doing great science.”
Last updated: 1st June 2020