In Conversation With Gagandeep Kang

We speak to the eminent scientist about machine learning, artificial intelligence, and public health research.
BY DEBDUTTA PAUL

Professor Gagandeep Kang is a medical scientist with a distinguished voice for equality in public health. Currently Director in the Division of Global Health at the Bill and Melinda Gates Foundation and formerly a professor at the Christian Medical College, Vellore, she has contributed significantly to research on diarrhoea and rotavirus vaccines. From 2013 to 2022, she chaired the World Health Organization (WHO) Southeast Asian Regional Immunization Technical Advisory Group; from 2019 to 2022, she was a member of a working group on COVID-19 vaccines established by the Strategic Advisory Group of Experts at the WHO; and until 2023, she was a member of the Global Health Scientific Advisory Committee of the Bill and Melinda Gates Foundation. A recipient of many awards, among them the Infosys Prize in Life Sciences in 2016, she was elected a Fellow of the Royal Society in 2019 and to the US National Academy of Medicine in 2022.

During her visit to ICTS-TIFR for the program on Machine Learning for Health and Disease, Professor Kang (GK) spoke to Debdutta Paul (DP). The full text of the interview is reproduced below. The answers are lightly edited for clarity. The questions and initials are in bold, and DP’s additions are in square brackets.

DP: Earlier, at ICTS, we discussed machine learning and artificial intelligence and their role in society. Coming to the context of health that the present meeting is about, we would love to know your thoughts on whether machine learning and artificial intelligence can revolutionise global health challenges.

GK: I think the potential exists; we have to make sure that we achieve the potential in the right way, which is to take an approach that calculates both benefits and risks. And the risks of not being able to do things properly are much greater in low- and middle-income countries than they are in high-income countries. Some of these challenges are contextual, and many of the approaches that have been developed have not been developed specifically for the kinds of settings and geographies and populations that we work with. How do we ensure that the results that are generated from the models that are built are actually relevant to us and are not sending us down a path that may not be appropriate for us? So [there are] lots of benefits, but [it needs] to be approached very carefully.

DP: Given the private nature of health data, is it risky to interpret machine-learned data and results to design medical programs?

GK: Well, the problem really is where is it a benefit and where is it a risk. If we think about ourselves, if I have a medical problem, the more information the doctor has about me, the better it is for the doctor to be able to make a diagnosis and then decide on what is an appropriate treatment for me. Right? Now, if that information gets into the wrong hands and doesn’t stay contained within the medical system, then that is a problem. Perhaps I have a disease that is stigmatising, [say] I have cancer, and an employer learns about that and decides that they shouldn’t employ me. That is a danger to me. That’s why privacy is really important. But in order for models to be built and for systems to learn, what we need is a lot of well-annotated data. So how do you annotate data really well without also creating a situation that results in the identification of individuals? It needs to be very carefully done. Some of these issues have already been discussed in traditional clinical research. There you have the system of an honest broker, which is somebody who is responsible for being an intermediary between the data producers and the data users and is the person that ensures that privacy is maintained to the extent that is possible, protecting the interests of the individuals whose data has been contributed. As the amount of information we collect gets more and more, these kinds of systems and their appropriate functioning are going to be really important for us to develop and test.

DP: What are your thoughts on the availability of primary data, especially in India, as compared to the global north, especially the US?

GK: I frequently say that I feel like I wasted about 20 years of my life because I was a primary data collector; I built the systems to collect data. Now, this is public health data, it should have been available within the public health system, but I couldn’t trust what was available, or it just wasn’t available. If I needed data to inform what I wanted to do, I had to go out and collect it myself because I wanted quality data. When you know that your primary data is of high quality, then you can rely on it to do things that can benefit populations. But if you have questions about the quality of your primary data, then you’re always uncertain that you’re travelling in the right direction.

DP: What are the biggest challenges that you have faced in this exercise?

GK: Much of what I do is actually [a] measurement of how much disease is out there, and they can be huge underestimates if you’re not looking in the right place. For example, I work on typhoid. Now, if you read papers that come from hospitals, they will tell you that over the last 20 years, typhoid has declined very significantly. Why do they say that? It’s because typhoid is actually being treated outside the hospital, with people getting antibiotics from pharmacies or informal providers. So the burden of typhoid gets hidden because you’re looking in the wrong place for typhoid. That’s an example of making sure that the framing of your question is right so that you don’t get lost because you went to the wrong data source.

DP: Is there any big challenge that you face in terms of social and gender inequalities that can creep into collecting data?

GK: It’s very interesting to me. I work on diarrhoeal disease, and diarrhoea is a neglected disease in any case. Everybody is more interested in cancer and neuroscience, etc. But if you look at diarrhoea in hospitals, 60% of all people who come in, especially children who come in with diarrhoea, are boys. Okay? When you talk to people and they look at data, they ask, are boys more susceptible to diarrhoea? How do you address that question? You then begin to look at diarrhoea within households in the community. And you find that actually, girls and boys have the same amount of diarrhoea; it’s just that when it is a question of going to a healthcare facility where you need to pay for care, it becomes 60-40 instead of 50-50. Not only that, when we did a deeper dive into this data, we found that girls were brought to care 24 hours later than boys. A girl has to be sick for three days, but a boy sick for two days gets taken to care. So, there certainly are gender divides that can be hidden if you’re not collecting the right data.

DP: Can you tell us how these problems can be mitigated in the long run?

GK: I think higher quality primary data collection, and potentially for certain kinds of problems, being able to put the contribution of data in the hands of the people who have the disease, rather than leaving it to a data system to collect that data. In health data, we frequently have surveys. What if, instead of that, you could upload your information? Would that result in more accurate information? There is potential for introducing biases there. It may be the more educated, the richer people, the ones more interested in monitoring themselves that will upload data, but it is at least one of the potential solutions that we could be thinking about.

DP: Coming to your work, can you tell us any one problem that you’re working on currently and why you’re interested in that?

GK: I moved three months ago to work for the Gates Foundation. And one of the areas that my group handles is the Institute for Health Metrics and Evaluation. And that is responsible for producing every year the Global Burden of Disease Study. They have started now to use AI to answer questions for themselves and the work that we have asked them to do. One example is every time they produce a paper, they ask everyone around the world to put in their comments. Sometimes, they might wind up with [about] 5000 comments on a paper. And they have to decide, ‘Do we wait one year while we collate those 5000 comments, decide which ones to use, and change the paper based on that, and which ones are just general comments that we don’t need to respond to?’ They are training an AI model to look at that to see if they can decrease the human need for [a] review of comments that come in from collaborators. Similar to that, there are many other areas that they are working on. But that is a resource-rich group. They have lots of people, lots of money, lots of modellers, [and] lots of scientists who can help build these models. And they have more access to AI tools than other people do. They work with Microsoft, for example. My worry is we are enabling a good group to become even better. What about the other end of the spectrum, where you want to work with local partners for local problems? And towards that, there are some efforts that are being made to see how we can identify people that are interested in these issues, in agriculture, in human health, in pandemic preparedness, that are located in Asia and Africa whom we could support to develop their models.

DP: If you have any final thoughts for researchers here or for the general public, please feel free to share.

GK: Well, I think it’s a great time to be thinking about research and especially with the new tools that are being developed. If you have tools that can take much of the grunt work out of what you’re doing — sensibly — then it gives you the opportunity for much greater explorations than you’ve had in the past. So your time becomes even more valuable than it was [earlier]. Have fun with it.

DP: Thank you so much for your time and answers.


Header photograph by Shantaraj S.