#018 - Predicting hemodynamic significance of the PDA using multi-modal physiological data
- Mickael Guigui

Hello friends 👋
In this episode of Fellows Friday, Rupa welcomes Giulia Lima, a third-year neonatal-perinatal medicine fellow at Harvard and incoming Assistant Professor at the University of Miami. Giulia walks us through her fellowship research journey, from a large epidemiological study of premature infants with congenital heart disease through the Children's Hospitals Neonatal Consortium, to her Marshall Klaus Award-winning work on optimal surgical timing. The centerpiece of the conversation is her machine learning model designed to predict hemodynamically significant PDA using continuous physiological data, including heart rate variability and diastolic blood pressure trends, without relying on echocardiography. Giulia also reflects on the power of mentorship and her vision for integrating artificial intelligence into real-time clinical decision-making in the NICU.
----
Short Bio: Dr. Giulia Lima is a third-year neonatology fellow at the Harvard Neonatal-Perinatal Medicine Fellowship Program. She earned her MD from the Federal University of Rio de Janeiro and completed her pediatrics residency at the University of Florida before joining Harvard in 2023. Her work focuses on large data analysis and predictive analytics in the NICU, with the goal of integrating advanced quantitative methods into front-line care for premature and high-risk infants. She is the recipient of the 2024 Children’s Hospitals Neonatal Consortium Mentored Fellow Research Award and the 2025 American Academy of Pediatrics Marshall Klaus Neonatal-Perinatal Research Award. She will soon be transitioning to her new role as an Assistant Professor at the University of Miami, where she hopes to continue her work in predictive analytics.
----
The transcript of today's episode can be found below 👇
Srirupa (00:00) Hello everyone, welcome to another fantastic episode of Rupa's Fellows Friday. I am so excited to welcome Giulia Lima from the Harvard Neonatal-Perinatal Medicine Fellowship Program. She is a third-year NICU fellow. I'm so excited to have her and highlight some fantastic work that Giulia has been doing as a fellow at Harvard. Giulia earned her medical degree from the Federal University of Rio de Janeiro and completed her pediatric residency at the University of Florida before joining Harvard in 2023. She has done some fantastic work on large data analysis, which we're all going to dig into and understand, because I think most of our listeners are still getting familiar with machine learning and data analysis. So it would be fantastic to understand that from Giulia. She's also the recipient of the 2024 Children's Hospitals Neonatal Consortium (CHNC) Mentored Fellow Research Award and the 2025 AAP Marshall Klaus Neonatal-Perinatal Research Award. We're going to hear all about the great research she's been doing as she transitions into her new role as assistant professor at the University of Miami. Welcome Giulia, how is your day going so far?
Giulia Lima (01:19) Thank you, Rupa. I'm so excited to be here. Pretty good, I'm towards the end of fellowship, so I have a lot of time to finish up my research, a lot of clinical time right now, all good.
Srirupa (01:29) It sounds like you've kept yourself busy these three years. I love that June is coming up, such a big transition month for all of us. Share with us the work that earned you the Marshall Klaus award, and the work you want to talk about today involving your machine learning skills.
Giulia Lima (01:50) Absolutely. I did three big projects during fellowship. The first one I started was with the Children's Hospitals Neonatal Consortium (CHNC), through their Mentored Fellow Research Award. It was more of an epidemiological study, using a very large dataset of 11,000 premature babies with congenital heart disease. It taught me how to deal with multi-institutional data and large data analysis, and helped me understand what my next steps would be. With the Marshall Klaus Award, I worked with single-institution data, all Boston Children's Hospital patients, also premature babies with congenital heart disease. That's when I really got my hands on the data and learned to do more advanced statistical analysis. We looked at the question of optimal surgical timing for premature babies with congenital heart disease, which I presented at PAS this year with my amazing co-mentors, Dr. Sarah Morton and Phil Levy. Now, as I move into my assistant professor role, I'm dedicating myself to learning how to deal with continuous monitoring data, which is truly big data. Even though we don't have a very large number of patients, you have second-to-second data: EKG, heart rate, respiratory rate, SpO2, NIRS, blood pressure. It's very complex and interesting data to work with, and it lets me move from the epidemiological work I did with CHNC toward individualized precision medicine analysis, still within clinical research, which is what I intend to do at Miami.
Srirupa (04:15) That's amazing, and I think that's the future, honestly. We have so much data at the bedside, plus additional data from echo, and the question is how best to integrate data on complex medical questions. One of the most complex questions we face as neonatologists, and one I personally love asking, is how do you define a hemodynamically significant PDA (patent ductus arteriosus)? There have been so many trials, and it's hard to interpret them given the variation in definitions. Walk us through your project, and break down for us what machine learning is and how we could integrate it going forward.
Giulia Lima (05:04) Absolutely. This project is being done at Brigham and Women's Hospital, our birthing hospital where we have our small babies, in partnership with Dr. Mohamed El-Dib's lab. At Brigham they don't routinely collect physiological data, which is the case for a lot of NICUs. I'm hoping that as prediction modeling and machine learning gain traction, that will change, and it's one of the reasons I'm excited to go to Miami: they collect this kind of data and have been doing this research for a long time. Basically, once you have the data, you have to decide what each variable can tell you, so you go through an exercise based largely on clinical practice. One of the big things used for prediction modeling from heart rate is heart rate variability, since high variability usually means a baby is doing well and low variability usually means something is wrong. For PDA specifically, we looked at diastolic blood pressure, slope trends in diastolic blood pressure, and trends in mean arterial pressure. I was excited to look at NIRS data, because regional saturation, especially renal saturation, seemed very relevant for PDA. But it didn't make it into the final model because there were too many missing data points: the time babies had NIRS on didn't overlap enough with our timeframe of interest. Since I didn't have NIRS, I used variation in BUN and creatinine values to see if that was predictive instead. I also looked at clinical data points like whether mom had chorioamnionitis or whether the baby was on respiratory support.
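For readers curious what turning monitor data into features might look like, here is a minimal Python sketch. The episode does not say which heart rate variability metric or which trend calculation the study used, so the choices below are assumptions: SDNN and RMSSD are two common time-domain variability measures, and the trend is a plain least-squares slope. The function names and all the numbers are invented for illustration and are not patient data.

```python
from math import sqrt
from statistics import mean, stdev


def sdnn(rr_ms):
    """Standard deviation of beat-to-beat (RR) intervals, in milliseconds."""
    return stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences, in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return sqrt(mean(d * d for d in diffs))


def slope(values, step=1.0):
    """Least-squares slope of evenly spaced samples, in units per step."""
    xs = [i * step for i in range(len(values))]
    x_bar, y_bar = mean(xs), mean(values)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Invented example values (assumptions, not study data).
rr_ms = [400, 410, 395, 405, 400, 415]   # RR intervals in ms
dbp_mmhg = [30, 29, 28, 27, 26, 25]      # one diastolic reading per hour

features = {
    "sdnn": sdnn(rr_ms),
    "rmssd": rmssd(rr_ms),
    "dbp_slope": slope(dbp_mmhg),        # mmHg per hour; negative means falling
}
print(features)
```

In a real pipeline these would presumably be computed over rolling windows of the continuous signal, after artifact removal, rather than over six hand-typed values.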
Giulia Lima (07:28) The truth is, you don't know ahead of time which variables will actually be predictive. That's where machine learning comes in: identifying which features, and which combinations of features, matter. Mean blood pressure might be just as important as diastolic blood pressure, but they're so correlated that putting both in the model can actually make it perform worse. So you go through a standardized feature selection process. You start by thinking through what's likely important, then the model runs through variations of features automatically, testing whether adding a given feature helps or not. Once you have the best feature set, you can look at how the model performs. My model, still preliminary, ended up including heart rate variability, which was very significant, diastolic blood pressure, also very significant, and type of respiratory support (mechanical ventilation versus CPAP), which was the most important distinction. Gestational age did not make it in because it was too collinear with other features, and oxygen saturation also didn't make the final model for the same reason. What I found interesting is that we had extremely high sensitivity, above 90%, with modest specificity, around 70 to 75%. So for screening purposes, the model seems to work well: if it says a baby does not have a hemodynamically significant PDA, that's probably correct. This gets back to your question of how we define hemodynamically significant PDA, because the model is only as good as the labels I give it. The way a prediction model learns is I give it a training set, telling it which babies have a hemodynamically significant PDA and which don't, and based on those features it learns to predict that label in babies it hasn't seen. If I give it the wrong label, it learns incorrectly. There are two ways we're considering doing this labeling.
One option is simply going by whether the clinical team decided to treat the PDA, but that has flaws. The clinical team doesn't always correctly identify whether a PDA is truly hemodynamically significant, and there's a strong temporal trend. Three or four years ago, teams were much more likely to treat even a moderate PDA on echo without clear signs like diastolic flow reversal. Now, with changes in practice and new evidence, even with large PDAs, teams often decide not to treat. The good news is that at Brigham they were using a PDA score based on what Iowa does, which gives us the second, more objective way to define hemodynamic significance. Whether that score's clinical meaningfulness is fully established is still an open question. Our next question becomes whether or not to treat, though that's not really the goal of this particular study. We're partnering with cardiology and going back to retrospectively score echoes that didn't have a PDA score done, so we can train the model on more accurate labels.
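The feature selection process Giulia describes can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's code: the episode does not name the algorithm, so the sketch uses plain greedy forward selection, and `toy_score` is a hypothetical stand-in for "fit the model on this feature set and return its cross-validated performance". Every number below is invented, chosen only to mimic the behaviour she describes, where a feature that is useful on its own gets left out because it is redundant with one already selected.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def forward_select(candidates, score_fn, min_gain=0.0):
    """Greedy forward selection.

    Repeatedly adds the single feature that most improves score_fn(selected)
    and stops when no remaining feature improves it by more than min_gain.
    """
    selected, best = [], score_fn([])
    remaining = list(candidates)
    while remaining:
        top_score, top_feature = max((score_fn(selected + [f]), f) for f in remaining)
        if top_score - best <= min_gain:
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Invented toy values (assumptions): each feature's stand-alone contribution.
toy_value = {"hrv": 0.20, "diastolic_bp": 0.15, "mean_bp": 0.14, "resp_support": 0.10}
# Pairs treated as redundant because they are highly correlated.
redundant = [{"diastolic_bp", "mean_bp"}]


def toy_score(features):
    """Hypothetical stand-in for cross-validated model performance."""
    score = 0.5 + sum(toy_value[f] for f in features)
    for pair in redundant:
        if pair <= set(features):
            score -= 0.16  # including both correlated features hurts the toy model
    return score


# Invented blood pressure readings showing how tightly the two can track each other.
dbp = [25, 27, 30, 28, 26]
mbp = [33, 35, 39, 36, 34]
print("correlation:", round(pearson(dbp, mbp), 3))

selected, best = forward_select(list(toy_value), toy_score)
print("selected features:", selected)
```

In this toy run mean blood pressure is never added, even though its stand-alone value is close to that of diastolic blood pressure, because adding it alongside its correlated partner lowers the score.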
Giulia Lima (12:14) I was a bit concerned about our specificity not being as good. That can be adjusted in the model: you can trade some sensitivity for higher specificity. But I think high sensitivity gives more useful information, especially in places without 24/7 echo availability, or when a baby is too unstable and you don't think doing an echo right now would change management. You get that extra piece of information without relying on echo. I'm very pro echo and I love what Iowa does for hemodynamic assessment, but that's not the reality in a lot of places. So I think having this extra tool would be valuable. The model did predict babies with SIP (spontaneous intestinal perforation), NEC, or sepsis as having a hemodynamically significant PDA, which shows the model may be a bit too sensitive in some sense, but I was actually glad about that, because it tells me the model is genuinely picking up on hemodynamic instability. Whether we need to adjust for that is something we'll keep working on, but I think it was a good start, and overall I'd favor higher sensitivity over higher specificity for this kind of screening tool.
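The trade-off between sensitivity and specificity comes from where you set the cutoff on the model's predicted probability. Here is a small Python sketch of that idea. The labels and probabilities are invented for illustration (they are not study data), and `sensitivity_specificity` is a hypothetical helper name.

```python
def sensitivity_specificity(y_true, y_prob, threshold):
    """Return (sensitivity, specificity) when probabilities >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for label, prob in zip(y_true, y_prob):
        predicted_positive = prob >= threshold
        if label and predicted_positive:
            tp += 1          # significant PDA, flagged
        elif label:
            fn += 1          # significant PDA, missed
        elif predicted_positive:
            fp += 1          # no significant PDA, flagged anyway
        else:
            tn += 1          # no significant PDA, correctly not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Invented example: 1 = hemodynamically significant PDA, 0 = not.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.35, 0.65, 0.4, 0.3, 0.2, 0.2, 0.1]

for threshold in (0.3, 0.5, 0.7):
    sens, spec = sensitivity_specificity(y_true, y_prob, threshold)
    print(f"threshold={threshold:.1f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

With these made-up numbers, a low cutoff catches every positive baby at the cost of more false alarms, and a high cutoff does the opposite. A screening tool would typically sit toward the low-cutoff end, which matches the preference Giulia describes.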
Srirupa (13:36) Absolutely, and this matters because, as you rightly said, the unknown part is how best to define hemodynamically significant PDA, so anything that gets us closer to that definition helps. On the model itself, my question is whether it changed with gestational age. I ask because now that we're resuscitating babies as early as 22 weeks, given myocardial immaturity, I'd think there might be heart rate variability explained just by prematurity itself. Did your model perform differently across gestational ages?
Giulia Lima (14:25) That's such an interesting question. In the univariable analysis, using standard statistical methods, gestational age was the most significant variable, with a very low p-value, unsurprisingly. But when added to the model, it didn't improve performance. I think there are a couple of reasons. First, we only included babies at 28 weeks gestational age or less. If we'd included older babies, it probably would have made more of a difference. Second, gestational age was likely correlated with other, more important variables already in the model. That's part of the black box nature of these models: you don't fully understand it until you test it. I also tried running the same model excluding all the 28-weekers, since almost none of them had a hemodynamically significant PDA, and it performed slightly worse. I think that's because those babies helped the model learn what it looks like to not have a significant PDA, so it was important to include everyone. But gestational age and birth weight did not end up in the final model, which was very interesting.
Srirupa (16:08) Interesting, all new layers being peeled back. I could talk about this all day, it's so interesting, but I'd love for you to speak to how important mentorship was during your three years. You've clearly kept yourself busy with these projects. Share with us, especially for incoming fellows interested in this line of research, advice on mentorship, and your biggest takeaway from your three years.
Giulia Lima (16:43) I don't think I can highlight my mentors enough, they were fundamental to the researcher I am today. In my first year, the fellowship helped guide me toward the ideal mentor based on my interests, and I had pretty broad interests. The only thing I really knew was that I wanted to work with data and learn how to manage big data myself. That was very vague, I didn't have a specific disease focus, I knew I liked hemodynamics broadly, but wasn't clear on specifics. The program identified Sarah Morton, who has a strong genetics and genomics background but also the skills to work with the kind of data I liked. At first I wasn't particularly interested in genomics and genetics and wasn't sure that was the right direction, but I'm glad the program didn't just let me do whatever I wanted, it was absolutely the right decision. Sarah was an amazing mentor, and since she does a lot of work in congenital heart disease, Phil Levy was part of that too, and they were both wonderful. I think the most important thing when choosing a mentor is, first, think about what you want to learn, don't restrict yourself to something too specific. Think about what skills you want once you graduate fellowship, then choose people genuinely committed to your success and your growth. That makes all the difference, surrounding yourself with people like that opens up opportunities you wouldn't have thought possible. The other thing is, of course, know your limitations and learn to say no, but I feel fellowship is the time to explore, learn, and get your hands on the data. So while knowing my limitations, I really tried to say yes to almost everything, and I think that made a big difference over these three years, especially with mentors who support you and know what would be helpful, you have to trust them. That was very meaningful to me. 
I learned more than I can describe about handling data, writing papers, being a corresponding author, and doing reviews, and I wouldn't have been able to do that without the right mentors.
Srirupa (19:59) That's amazing, and you've said such important words. I truly relate to the fact that fellowship is three extremely valuable years where you learn, explore your limitations, and discover things you can do that you thought you couldn't, that's what makes up for all the hard work of getting to fellowship in the first place. One final question, Giulia. How do you envision your research moving forward? I know you're transitioning to Miami, how do you envision the next five years of your career?
Giulia Lima (20:39) I'm so excited about going to Miami. Not every place has the kind of data I want to work with. They were part of the original HeRO trial, the prediction model for sepsis, and they've been working with this kind of data for many years. What I envision is using this data to really change management, individualizing care for each patient. That most likely means integrating prediction models into electronic medical records. I think that might very well be the reality within the next five years, and I want to be part of that change. There are a lot of things that need to happen: more places need to be doing this kind of research, you need multi-institutional projects, you need validation, and ultimately you need randomized clinical trials, because it doesn't matter how good a model is at predicting something if it doesn't actually change outcomes. There are a lot of steps ahead, but I think the way to lead this change is full integration with electronic medical records, using artificial intelligence to help clinicians, because I've always been amazed by how much data we have every day, every second. Our brains just aren't capable of interpreting all of it, and with the growth of artificial intelligence, I think that's the way forward, and I hope to be part of that movement.
Srirupa (22:23) Absolutely, and you said the right words, we have so much data being actively collected right at the bedside while we're talking. It's so important to have physicians like you wanting to integrate that into real clinical practice, that's wonderful. This was fantastic, Giulia. I'm so glad I caught you before you graduated fellowship, because I think a lot of our incoming fellows can take a lot of notes from this episode on how best to integrate their interests into their three years of fellowship. Thank you so much for joining us, and I hope to see wonderful things from you in the future.
Giulia Lima (22:40) Thank you, it's a pleasure to be here.



