How biostatistics can improve cancer care, pediatric heart health and health equity

A University of Michigan researcher explains how data, teamwork and statistical science can reveal disparities, guide care and improve health outcomes
Mousumi Banerjee grew up in Kolkata listening to her father reciting Shakespeare, Shelley, Keats and Byron’s poetry from memory. Her mother taught Bengali literature, and Banerjee spent long summer afternoons reading Tagore and other stalwarts of the Bengali Renaissance period.
A high school teacher introduced her to the beauty of mathematics. Calculus was her favorite subject because the “concept of limit going to infinity fascinated me.”
“From my childhood, I loved numbers,” Banerjee said. “I don’t know where it came from because everyone in my family was in literature and languages, so from that respect I was somewhat an outlier.”
Banerjee followed that fascination to the Indian Statistical Institute in Kolkata, where she earned bachelor’s and master’s degrees in statistics. She later earned a PhD in statistics from the University of Wisconsin.
At first, she saw statistics mostly as a branch of mathematics, a field full of puzzles to solve. But her first week at the Indian Statistical Institute quickly changed her view. She had been known in high school as a “math genius,” but at ISI she found herself surrounded by students who were just as gifted, or even stronger. She was also the only woman in her cohort.
The experience was humbling, but it widened her understanding of what statistics could be.
That understanding deepened at Wisconsin. Banerjee arrived with strong theoretical training, but little experience handling messy data and using statistics to answer real-world questions. As a graduate research assistant, she worked on projects involving air pollution in Midwest parks, income tax data and sleep apnea in young adults. Those projects helped her see how data, when studied carefully, could reveal patterns that mattered in people’s lives.
Her path to public health was not planned. It began in earnest when she joined Wayne State University, where her faculty appointment was in the medical school’s departments of pathology and urology. Wayne State did not have a school of public health at the time, and Banerjee found herself in a new world.
She joined a large NIH-funded prostate cancer research group as its statistician. The group met weekly at 6:30 a.m. with more than two dozen physicians, many of them surgeons. Banerjee was often the only statistician in the room.
“At first, I sat through those meetings without understanding what they were talking about,” she said.
She realized she had two choices: leave or learn the language of medicine. She chose to learn. Banerjee sat in on cancer biology classes, shadowed doctors in clinics and worked to understand the science behind the data.
The effort paid off. A few years later, after she presented research on racial and ethnic disparities in prostate cancer at an American Society of Clinical Oncology meeting, an audience member told her she seemed to know “a lot of statistics for a clinician.”
Around the same time, Banerjee began working with the Surveillance, Epidemiology, and End Results (SEER) program, a national cancer registry. The Detroit SEER registry, one of the oldest in the country, at that time included Wayne, Oakland and Macomb counties and had a large African American population.
That data helped Banerjee and her collaborators study racial and ethnic disparities in cancer care and outcomes. Their work included research on church-based screening programs, cancer survival and the roles that biology and access to care play in health disparities.
Through that work, Banerjee came to see public health as a place where statistics could help answer urgent questions.
“Thus began my journey in public health,” she said.
Today, Banerjee is the Anant M. Kshirsagar Collegiate Research Professor of Biostatistics and research professor of Global Public Health at the University of Michigan School of Public Health.
Her path has moved through literature, mathematics, medicine and public health. In this Q&A, Banerjee reflects on the challenges that shaped her, the storytelling power of biostatistics and the ways data can be used to improve lives.
What parts of public health are the most interesting for you?
Equations and numbers might be my first love, but as beautiful as pure logic is, it’s much more rewarding and meaningful to me to apply that love for numbers to public health research, with the possibility of helping people live healthier lives.
Has there been an obstacle or challenge that you've overcome to get where you are today?
Absolutely yes, both in my personal and professional journey. I think my life has been like Gaudi’s architecture: no straight lines, only full of bends and curves and twists. At age 4, I had encephalitis and almost died due to misdiagnosis. Instead, I suffered severe neurologic damage to the brain. As a child I had impaired learning and the doctors treating me told my parents that I would not make it beyond middle school. Against all odds and with my mother’s ardent support, I earned National Merit scholarships at the high school level. There have been many other challenges in my journey, many sharp bends. My mother has been my steadfast champion. I am who I am today because of her.
My experiences have taught me to be a fighter, and to be unafraid of being an outlier. Being an outlier, I strive to be influential (pun intended, especially for my BIO 650 students).
What drew you to biostatistics?
In addition to my love for problem solving with numbers, I also love the story-telling aspect of biostatistics.
Biostatistics provides the fundamental framework to uncover evidence in data to tell a story about health and healthcare. One of my broad interests is to study variation in cancer care in the population, and its downstream implications on outcomes. For example, in an earlier study we looked at the use of imaging tests after primary treatment of thyroid cancer in the United States and found that there has been a marked rise in the use of imaging tests since the late 1990s. This has increased subsequent treatment for recurrence, but with no clear improvement in downstream survival.
Our study showed that except for radioiodine scans in presumably iodine avid disease, it is unclear whether more imaging equals better care. It is also not clear whether the benefits of greater imaging outweigh the financial costs, heightened patient anxiety, and risk of patient harm from the treatment for recurrence. Our study provided the foundation needed to define the appropriate long-term follow-up of patients, and groundwork for future cost effectiveness studies, randomized controlled trials, and studies assessing the role of the patient and the physician in determining the optimal surveillance plan.
Biostatistics empowered us to establish the importance of curbing unnecessary imaging and tailoring imaging for surveillance to patient risk.
What is your main area of research?
My methodological research spans diverse areas of biostatistics including predictive modeling, causal inference, machine learning, big data, correlated data, multilevel modeling, and survival analyses. My substantive areas of interest are cancer and pediatric heart disease. I love the team science aspect of my work.
I serve as director of Biostatistics for the Pediatric Cardiac Critical Care Consortium (PC4) Analytic Center, a consortium involving more than 70 cardiac intensive care units with a goal of improving pediatric cardiac care. The PC4 registry provides key information about outcomes for the most vulnerable patients and serves as a fundamental tool to improve quality. Being part of this multidisciplinary team and contributing my biostatistical expertise in a meaningful way has been equally gratifying and humbling, because our work has made it possible to move the needle in helping children with critical heart disease.
Your work spans predictive modeling and machine learning for health outcomes. What are some of the biggest challenges and breakthroughs you've experienced applying these methods to complex health data, especially in cancer and pediatric heart disease?
I have done work in predictive modeling of diverse, heterogeneous data, such as population disease registries, electronic health records, and administrative healthcare data. The rapid expansion of health data has created both opportunities and challenges, requiring principled statistical methods to address issues of high dimensionality, format variability, dependencies, and observational biases. Machine learning and other modern statistical methods are providing new opportunities to take advantage of these previously untapped resources for patient benefit. My focus has been on developing statistical methods to learn from diverse data types, facilitate better understanding of underlying mechanisms, and improve clinical decision making.
How do you see biostatistics enabling collaborations in global public health, and can you share an example where cross-border statistical analysis led to real-world improvements?
I have led initiatives to enhance biostatistical support of global public health research, education, and training at Michigan and with international partners. My focus region has primarily been South Asia. I have established research and educational partnerships with academic and nonprofit organizations in India, Bangladesh and Nepal to address issues in maternal and child health, water and sanitation, and cancer screening and surveillance. I recently completed a Rotary International-funded project to bring safe and sustainable water and sanitation to the Sunderbans, a distressed rural area of Eastern India. It has also been my tremendous pleasure and privilege to serve as primary faculty mentor for multiple Michigan student organizations working at the interface of research, practice, and implementation to bring social change.
Health disparities are a key focus in your research. How do you leverage causal inference and advanced statistical methods to identify and address inequities in care delivery—and what has surprised you most in your findings so far?
In healthcare delivery, complex interactions between patient, provider, and hospital factors can influence practice patterns, quality of care, and outcomes. Multi-institution collaboratives and nationwide registries linked to healthcare claims data offer the opportunity to delve into questions around health disparities, factors influencing care delivery and outcomes, variation in care quality, over/under-use of optimal treatments, diffusion of medical technology, and missed diagnoses. Together with my students and collaborators, we have developed novel statistical methods to study racial, ethnic, and other hidden disparities in healthcare, variation in care delivery and outcomes, and healthcare quality metrics.
Written by Bob Cunningham
Media Contact
Destiny Cook
PR and Communications Manager
University of Michigan School of Public Health
734-647-8650





