Purvesh Khatri, PhD

Adventures of a Proud Data Parasite

In an era in which analyzing other people’s data has been likened to research parasitism by no less an authority than The New England Journal of Medicine, Purvesh Khatri, PhD, assistant professor of medicine (biomedical informatics research–Institute for Immunity, Transplantation and Infection), declares, “I’m not a research parasite, because that implies that I’m stealing somebody else’s idea. I am repurposing data to ask and answer questions that are not addressable using traditional approaches. I’m a proud data parasite.”

Khatri’s research has asked many questions and produced many answers of significance in the past three years. That is especially unusual for someone who came to this country in the late 1990s with a degree in communications and wanting to be a software engineer. After several career turns, he finds himself a bioinformatician on a quest to improve diagnostics and therapies for infectious diseases. And, perhaps next, autoimmune diseases. 

Informaticians study reams of data about diseases in hopes of recognizing patterns in the data, understanding the causes of these patterns, and designing algorithms to recognize further patterns. Using such a process, Khatri and his group recently showed that they could diagnose patients with an infection two to five days before those patients could be clinically diagnosed. They do this by looking at gene expression data that reflect the body’s response to a given infection. This work was published in Science Translational Medicine in 2015. While this was a useful discovery, it lacked the specificity that Khatri sought: “The problem with that approach was that we could not differentiate between bacterial and viral infections.”

The next step was to try to identify the host response specific to viral infections. That study, published in Immunity and again based on gene expression, demonstrated not only that multiple viruses share a common host response that differs from the response to bacteria, but also that host responses to viruses differed from one another, so that “we could distinguish among viruses.”

With this information in hand, Khatri naturally asked a related question: “Now that we’ve seen this on the viral side, does the immune response recognize different bacteria?” The bacterium they studied was Mycobacterium tuberculosis, and again the result was positive: “Yes, there is a very specific host response to tuberculosis that allows us to distinguish active tuberculosis patients from patients with other bacterial infectious diseases, other respiratory diseases, latent tuberculosis infection and so on.”

The immediate clinical action once it’s known that a patient has tuberculosis is to give curative antibiotics. But Khatri wondered if the host response might also serve as a biomarker for treatment response, and his group performed another study that was reported in Lancet Respiratory Medicine in 2016: “If the treatment is working, bacteria are going to die. Once there are no bacteria left in the system there will be no host response, and you will know the patient is cured. So it’s not just diagnostic; it should also allow you to monitor patients upon successful treatment.”

Collaboration is the best thing about Stanford School of Medicine, especially for a data parasite like me.

Without antibiotic-like therapies for viral infections, Khatri’s lab sought to look at data from vaccinated patients for more biomarkers. “When you think about vaccination,” he explains, “you are giving patients the infection without the corresponding symptoms. Knowing that there is a virus-specific host response, we wondered if that would also show up when a patient is successfully vaccinated. And the answer was yes!”

In other words, patients who respond to the vaccination they were given — “those we call successfully vaccinated” — have the same response to the vaccination as patients who get the viral infection. The importance of this finding, Khatri says, is that “this gives us the opportunity to develop new immune metrics for successful vaccination.” 

Influenza was the virus of choice for this research because nearly everyone is advised to get vaccinated against it every year. The actual strain of influenza in a given year turns out to be unimportant, as this study demonstrates, because Khatri’s group found “the same host response to 17 different strains of influenza. It doesn’t matter if it’s a Vietnam strain or a California strain or an Australian strain. If you have influenza, then you are going to have this same response as long as you are successfully responding to it. Therefore, as long as a patient mounts a response to an influenza vaccine, you know that they would mount that response to all strains of the flu.”

The next step in this ongoing campaign is to try to determine if it is possible to identify patients who might not need vaccination. Since 50 percent of patients who inhale live virus do not get infected, Khatri explains, “we want to know what is different about the people who literally put their nose into the virus and don’t get infected. The key is to know whether an individual patient falls into the never or the always category of influenza patients.” In the era of personalized medicine, Khatri says, this will help reduce the disease burden by prevention, not by treatment.

A basic question that continues to intrigue Khatri is “How do you understand the immune system? One of the things that my lab is starting to show is that there are different immune responses to different groups of diseases. Organ transplantation looks very different from infectious diseases. And autoimmune diseases look very different from organ transplant and infectious diseases. My lab is working in each one of these areas.” 

Khatri believes that the more heterogeneous the data he has access to, the better his results will be. It is not surprising, then, that he sees better ways to study diseases. He explains, “The way we have been studying autoimmune diseases may not be the best way to look at them. There are similarities among autoimmune diseases, and a better way to study them might be to look at them in groups. For example, fibrosis. Everybody looks at fibrosis differently depending on whether it is in lung, skin, heart or kidney. Madeleine Scott, a student in the Medical Scientist Training Program in our lab, is looking at fibrosis across organs, so we can narrow down what causes it. It’s an important disease to study because if you have idiopathic pulmonary fibrosis, the median survival is three years.”

As a researcher whose chief tool is a computer with access to volumes of publicly available data, Khatri is quick to explain that this “doesn’t mean that we don’t need to do experiments. We are a 100 percent dry lab, but we have been really lucky to have some fantastic collaborators here at Stanford to work with us and validate our findings. The way our collaborators believe our data and our analyses is just fantastic.”

Two examples Khatri mentioned were Jason Andrews, MD, and Shirit Einav, MD, both assistant professors of infectious diseases. “Jason has essentially created two cohorts for us, one in Nepal and one in Brazil, to further test our biomarkers. Shirit is now testing the drugs we predict will work in patients in mice in her lab. She’s amazing; I just have to show her our analyses, and she designs the experiments to test hypotheses from our analyses.”
