Karim Galil: Welcome to the Patientless podcast. We discuss the good, the bad and the ugly about real world data and AI in clinical research. This is your host, Karim Galil, Co-Founder and CEO of Mendel AI. I invite key thought leaders across the broad spectrum of believers and dissenters of AI to share their experiences with actual AI and real world data initiatives.
All right. Welcome to another episode of Patientless. Today we have a very interesting guest from a very interesting company. I'm happy to welcome Eze Abosi to our podcast. Eze is Vice President of New Products at Optum Life Sciences. Why don't you introduce yourself, Eze?
Eze Abosi: Thank you so much, Karim.
It's been a pleasure to interact with you and the broader Mendel team. I'm excited to be part of the Patientless Podcast. And so in terms of my background generally, I've been in data analytics services and associated solutions supporting life sciences for about 16 years now. I spent 11 of my 16-ish years in the industry with one company called Decision Resources Group, which has since been acquired by a public firm known as Clarivate.
In addition to my time at Decision Resources Group, or DRG as it's well known, I've also worked for some very large organizations in this space, such as IQVIA. Likewise, I've been part of startup solutions and employers, which all have one general theme: leveraging data analytics and insights to ultimately support pharma in terms of the various needs in their workflows.
And so whether it's discovery, whether it's medical, R&D or even commercial use cases, I've been able to collaborate with my colleagues as well as with my pharmaceutical clients to find solutions that help them address their business issues. But currently I'm launching new products for life sciences on behalf of Optum Life Sciences.
Really cool role. In particular, with new products, I'm laser focused on clinical genomics: being able to take high quality genomic data, integrate that in an appropriate way into Optum's core EHR and claims assets, and use that insight to derive novel ways of discovering molecules, novel ways of gauging access, or of supporting clients with the proper stories that they can communicate to their stakeholders, such as regulators. We're also launching some very unique and powerful NLP solutions, natural language processing, which of course is near and dear to your heart, and clinical trial solutions. So above and beyond those products, I also support partnerships for Optum Life Sciences, which gives me the opportunity to interact with a variety of different innovative companies. So that's a concise summary of what I'm currently doing and where I've been over my career.
Karim Galil: Tons of exciting things. The clinical genomic one specifically is probably where we're gonna spend a lot of time. I think the holy grail of data is combining the phenotypic and the genotypic data together.
But before we get into that: part of your role, if I understand correctly, is that you're able to reach out to pretty much everyone that's working on the next big thing or on an exciting solution in healthcare, and you are able to understand what they do and see if there is any sort of fit into the Optum ecosystem, whether through partnership, commercial agreement or acquisition.
And you obviously have the luxury of an Optum.com email, which means everyone is gonna respond to that. Is that part of what you do?
Eze Abosi: That is spot on. I'm incredibly surprised every day at how willing individuals are to respond to my emails.
Karim Galil: Well, if you have 80 million patient lives, everyone will respond to your emails. Which basically means like you have firsthand access to what's actually happening in healthcare. Right? So the way I look at the healthcare ecosystem, right? It's not only the regular payers, providers, pharma, the three P thing, there is now a whole ecosystem of small, medium size and even big startups that are changing healthcare as we know it today.
And there are two kinds of people through whose eyes you can get to know those companies. One is the investment community, the VC community. The other is folks like you, who actually understand healthcare, are actually doing healthcare, and are, I wouldn't say interrogating, but investigating what's happening out there.
So how do you look at the healthcare technical ecosystem today? What's exciting, what's working, what's not working?
Eze Abosi: Well, that's a great question. Three main pillars. Data, analytics, consulting.
There are legacy players, and certainly innovative entrepreneurs and fairly large startups, that are incredibly impactful at each of those core pillars. And of course there's an interplay. There could be some notable companies with fantastic data assets and also great platforms.
But generally, I try to think about the marketplace in those three core silos. That being said, I think the analytical pillar can really be peeled back. Within the analytical pillar, I consider, basically, the method: the actual analytical expertise or IP that differentiates how a certain company, or a certain entrepreneur in some cases, analyzes data.
I compare that to the actual technology that allows the client, whoever it may be, to interact with the data. And then I think what's also very important is the visualization, because ultimately we are analyzing this data to derive insights that'll support how we address a business issue. And so that analytical layer can be broken down, from my perspective, into technology versus analytical methods versus visualization platforms.
Karim Galil: That's very interesting. So you're basically looking at it as like a tech stack: some sort of interfaces, and a layer in between that actually interrogates the data and asks the right kind of questions.
So what's exciting? Like what are you seeing that you are personally excited about now?
Eze Abosi: So if you think about everything we've just talked about, and pay unique attention to those three core pillars: for every single workflow within healthcare, broadly, there are gonna be some very unique players, either legacy or just starting up, in each of those core pillars.
Likewise, if you wanna get one level deeper for the life sciences, my perspective on, for example, innovative analytical platforms will differ if we're referring to the discovery use case versus medical versus clinical trial optimization versus, for example, commercial. And so that's what's most exciting: depending on the primary objective of your business question or need, there could be a wide variety of different intellectual properties, different platforms or different companies that you may want to consider, if not interact with. That's what I find so exciting.
Karim Galil: The foundation of all of that is, at the end of the day, access to the data, right?
So you can have all the best interfaces and tech stack and IP, but if you don't have access to the data, it's like you're Google and you build the best page ranking system, but you don't have access to the internet, right? So are you guys at Optum willing and open to collaborate on that? Because you have very significant access to data.
Also, what are you seeing entrepreneurs and companies solving for? Obviously not everyone is gonna go to Optum and say, Hey, can we have access to data? So what creative solutions are you seeing out there that have to do with interoperability and access to the data, so that you're able to power those applications?
Eze Abosi: That's an incredibly good question. I guess the simple answer is that our clients are requesting, at a higher and higher frequency, that we become more flexible in the way that we allow them to partner with other key suppliers in their value chains. And so ultimately the clients are driving it.
That being said, Optum is such a relevant stakeholder in many different aspects of the healthcare marketplace, especially here in the US, that we have a very unique opportunity to collaborate and drive efficiencies across the broader healthcare ecosystem. Because of those opportunities, we're constantly assessing how we can better partner and leverage our data, analytics and technology to, for example, improve the way drugs are delivered to patients, by channel.
Or to support how data, analytics and technology can help not only providers but also pharma keep patients aware of clinical trials as a care option. And certainly using the aforementioned data, analytics and technology to support how a pharma company can leverage evidence to speak to the value of their technology or their drug to a regulator or to a payer.
So the opportunities to collaborate are plentiful from the Optum, or the greater UnitedHealth Group, perspective.
Karim Galil: Hearing you describe that, and getting to know you over the last few weeks, I believe you're almost building like an app store on top of the data that you guys have been able to successfully accumulate. Is that a good way of thinking about it?
Eze Abosi: We're building in that direction.
What's really exciting from my perspective is that my manager, who's the Chief Growth Officer of Optum Life Sciences, his name is Brian Irwin, comes from a very unique background. He started in pharma, working for a Japanese multinational pharmaceutical company. He then moved on to what may very well be, I think, the most notable data-agnostic clinical trial platform in the industry today.
From there, after successfully helping the management team sell the asset off to a larger organization, he moved to Optum Ventures. And so he's very strategic in how he views the opportunity in life sciences, and he recognizes the importance of platform.
And so when I hear application, what I'm thinking about and rationalizing in my mindset are different platforms that the user can leverage as needed to ultimately answer a business question. Exactly. And so yes, we are absolutely thinking that way. Whether we realize that through organic or potential strategic initiatives, I think that's gonna be the exciting thing to watch.
Karim Galil: That's super exciting, because you're pretty much building an ecosystem around a very unique data asset and looking at it as teamwork rather than an Optum thing. It's like, what's best for the end client and what can we bring there? But one thing that I hate is when we just say data, because data is a very generic term. What data?
Is it structured? Is it unstructured? Is it images? To your point, is it clinical? Is it genomic data? It's a pretty broad term. And lately you hear more and more things like data became a commodity, which I agree with if we say structured data became a commodity, claims became a commodity. But data is not a commodity.
Unstructured data remains very hard to access. You guys are one of the very few players that did a good job there. Genomic data, and marrying it to the clinical unstructured data, remains very hard to access. So let's speak about this a little bit more. What are you seeing happening in the unstructured data world, and why?
I mean, why is the future unstructured data, not claims data?
Eze Abosi: Great question, Karim. I think what's driving the relevance of unstructured data is the continual focus on specializing drug development to highly precise populations. Think about the nineties, when the industry was laser focused on developing molecules based on the volume of patients available, and making available to patients therapeutics that could support highly prevalent, likely chronic diseases like diabetes, COPD, asthma, et cetera. But as this shift from volume to value continues to transpire, ultimately it's about something like triple negative breast cancer. Being able to take a broader population and then specifically tailor drug development to accommodate very specialized populations within a broader marketplace, that's what's driving the need for, and the value of, the unstructured data. Because in the broader structured elements, or tables, if you will, in data science lingo, you may not have the information found in the physician notes, the symptoms, et cetera, that you need to better understand specific cohorts like the triple negative population within breast cancer. And so that's the value.
It's the industry's overall emphasis on driving value rather than volume.
Karim Galil: A hundred percent and so here's the thing. I believe in the next 10 years, there is not gonna be any clinical trial that is submitted to the FDA without some element of real world data combined with it.
Like, I'm not saying, and I'm not suggesting that real world data is gonna, um, I I think we tend to always be extreme in healthcare thinking zeros and ones. So it's real world data or traditional clinical trials. I think the answer is some way in between, in that spectrum. It's gonna, you're gonna find phase one, phase two, but phase three, phase four, there's a lot of things that you can get smaller cohort sizes, you can eliminate some end.
Like you can do a lot, you can better design the trial, but there is gonna be an element of real world data in every clinical trial in the next 10 years. And to your point, you're gonna need it to find the cohort of patients, but also the complexity of the endpoints that the FDA is asking for simply does not exist in a structured format.
Like, good luck trying to find progression in cancer by querying some tables, right? Which actually makes me very surprised that we are still finding most of the data players, the folks who are selling access to data, are selling access to structured data.
And I'm like, Hello? Are you guys seeing the switch that's happening? Do you know of anyone that is actually giving access to unstructured data at volume, like you guys at Optum?
Eze Abosi: Wow. So the answer is yes, but they're few and far between when you compare that number to the broader real world data supplier marketplace overall.
And so I'll actually revert back to a comment you made earlier about apps, how you see Optum and other innovative real world data and analytical suppliers to the life sciences moving towards a suite of applications in the very near future. I agree, not only because of the seamless access you can deliver to the consumer of that data.
In your traditional app mentality, I think about being able to pull out your phone or your tablet and access that insight. With very complex data like EHR or claims, or the integration of both, which is of course the goal, it's a little bit more difficult than what I just described.
But the beauty is to be able to seamlessly deliver that insight in real time, on demand, to the user. That's fantastic for the consumer of the data. But from the supplier perspective, what the application interface allows us to do is manage and protect the data accordingly.
And so rather than, for example, delivering that data into your native environment, when we can deliver it through a platform, like a SaaS-based platform, if you will, that allows us to ensure that it's being used appropriately. And then, most importantly, the confidentiality and the privacy of the patients or consumers from whom the insights are being derived remain protected.
And so that's absolutely key: keeping in mind that we must protect the consumer as we look to combine novel data types across the value chain and the ecosystem.
Karim Galil: But when I look at the unstructured data landscape, the argument that folks with access to structured and claims data would make is: Listen, yeah, I have a pixelated idea of the patient through the structured data, but I have 300 million patient lives, so I have the breadth.
And if I go the unstructured data route, I'm bottlenecked by abstraction. So maybe I have a higher resolution picture, but only of a 30,000 patient population. Right? You guys at Optum are, as far as I know, one of the very few who have solved that problem, where you are able to give access to breadth and depth at the same time.
Eze Abosi: Yeah. And so I wanted to begin commenting on that question with an emphasis on privacy, because the unstructured data may contain private information, whether it's personally identifiable information or protected health information. And so the first thing we need to do when we, broadly the industry, are thinking about commercializing an unstructured data asset is to ensure that it's completely compliant, to the best of our ability.
But I think that benchmark is well over 99% at this point. And so assuming that the data is protected, I think we then have to consider the delivery modality. Being able to take a massive amount of unstructured data that is hopefully compliant and then deliver it to a client of yours in their native environment, that may be feasible.
But I think that there's a level of risk there that some companies may prefer to avoid. And at Optum we tend to be very conservative. So going back to that initial point you made about applications: being able to provide an on-demand resource to answer your business question, yet one that is a very appropriate medium for a company like Optum to deliver insights such as unstructured data, that's really interesting.
We're continually investigating that paradigm, because at scale across therapeutic areas, that may be one potential avenue that we can explore. That way, we can deliver the insights and allow you to execute, for example, models as you see fit, but we can ultimately ensure that it's being used in the appropriate way and, again, that the privacy of the consumer is completely protected.
So if you're going to deliver unstructured data at scale, it must be done in the most appropriate way, because it's incredibly sensitive information. I think it's potentially even more sensitive than your classic structured elements that you would see in, for example, the EHR tables or in claims.
Karim Galil: Definitely. I mean, in structured data it's like: hash first name, hash last name, hash social security. So you're basically saying there are two elements of complexity here when it comes to unstructured data. One is PHI: how can you make sure that the unstructured data is PHI-free at scale?
And to your point, the benchmark here, like the safe harbor, is that you need to exceed 99% accuracy. That's the level of tolerance that HIPAA can afford. The second element of complexity is, now that this data is PHI-free, you need to be able to abstract or index this data at scale. And today, the state of the art is to do it via humans.
And humans would take, we did the math before, half a century and a quarter of a trillion dollars just to abstract the patients of 2021. If you take all patients, all medical records that were generated in 2021, all unstructured data, and hire 80,000 abstractors, which seems to be the total number of abstractors in the US, it will take them that much time and that much money to do just this one single year.
Right. So abstraction at scale is a problem, but your point, which is also an interesting one, is de-identifying at scale. Would you say there's also a third level of complexity, which is: how can you tokenize this data and marry it to, like, genomic data and structured data? Correct? Would that also be a third level of complexity?
Eze Abosi: That is the dream. Especially when you're thinking about complex markets like autoimmune and certainly oncology, the key value add that our clients are looking for, really across all use cases, across the entire value chain, whether you're discovering molecules or you're trying to protect the life cycle of your asset that's about to go off patent, is being able to marry the genotypic and the phenotypic insights at a high rate of veracity.
Across these disparate data types, that's exactly what we're trying to do and achieve on behalf of our clients. I think the ideal is certainly the claims and EMR linkage, and by EMR I'm talking about both the structured as well as the unstructured insights, but also integrating imaging and certainly integrating the appropriate genomic information relevant to that category.
And so at Optum, within oncology, we tend to work with high quality NGS data that's derived from either liquid biopsies or IHC, and within non-oncology, broad whole genome sequencing. Being able to take those insights and pair them with EHR and claims is very powerful. If you can include the images, even more powerful.
And I would certainly argue that what differentiates our approach at Optum is that our linkage truly does incorporate not only the genomic elements, but also the clinical as well as claims elements. Look at a topic like adherence. When we're talking about these highly complex disease populations, the therapeutics available for them, if they're available, tend to have some pretty significant side effects.
And so when you're assessing the patient journey and trying to use that information to support your business, being able to understand whether this patient population is not adherent because of, for example, the efficacy of the drug, the side effect profile of the drug, or the affordability of the drug is a major, major issue.
And so by integrating the claims elements, we can differentiate versus your standard clinical genomics asset available in the marketplace today.
Karim Galil: That's very evident even from the dynamics of the market today. I mean, we're seeing companies like Datavant, which was very focused on the structured data elements, now expanding their business to how can you tokenize, how can you merge unstructured and structured data assets, which a few years ago wasn't something that we felt was a priority for them.
Today we're seeing that becoming more and more of a priority. Speaking about genomic data, and I wanna come back to unstructured data in a second, I just saw on your LinkedIn that you have an interesting webinar tomorrow with Guardant Health. What's happening? What is Optum doing with Guardant?
Eze Abosi: Yeah, so it's gonna be a very dynamic conversation about the relevance of real world evidence in terms of clinical genomics and how we can support the life sciences. This theme has been permeating throughout the entire year, especially given the FDA guidance.
Guidance is the key word there, on how to leverage real world evidence in your regulatory submissions and broadly in your clinical trial design, optimization, et cetera. And so that's gonna be the focus of the conversation, and I think it's gonna be a very lively discussion, because we're not only gonna have thought leadership from Optum and from a clinical genomics standpoint.
I'm also very much looking forward to speaking with Nuray Yurt, who is the worldwide artificial intelligence lead for Novartis. She has historically focused on oncology, so this is a topic that certainly resonates with her workflow. And likewise Naveen Kumar from Guardant, who leads their core business of commercializing a liquid biopsy to ultimately expedite a diagnosis for patients.
Karim Galil: Are we gonna see an Optum slash Guardant Health combined data asset of clinical genomic data?
Eze Abosi: We might. We might. We at Optum partner actively with a variety of different firms in the healthcare ecosystem, but we tend to be very conservative and confidential.
Karim Galil: Hey, I'm trying to get an exclusive for this podcast.
Eze Abosi: Yeah, absolutely. Absolutely. But I think the audience can read between the lines. Yeah.
Karim Galil: So, the first attempt to combine clinical and genomic data was Foundation Medicine and Flatiron. I believe a few years ago, three or four years ago, they wanted to build the first clinical genomic database, and they did some analysis trying to find patients that exist in the Flatiron network but had also taken a Foundation Medicine test.
I believe the result was fewer than 20,000 patients that they were able to put together, which was a surprisingly low number given the access that the two companies had. What's your take on that? Again, talking about breadth and depth, right? Do you think that every clinical genomic database is gonna suffer from the breadth problem, where we're gonna see only a few thousand patients?
Or are there any current attempts, that you may or may not disclose, that you may or may not be aware of, where we're gonna see a 100,000 or a 200,000 patient cohort where you can see sequenced, structured and unstructured data all together in one platform?
Eze Abosi: Yeah, so great question.
Especially given that we're sitting in New York City, very close to the Flatiron District. The key there, though, in terms of touching on your point about breadth, is that you have essentially one very relevant EMR and one flagship, if you will, genomics firm combining efforts. And so, love the innovation, but those are two unique entities. We are for pan-therapeutic insights; in other words, we don't specialize just in oncology.
We can give you insights that are relevant to oncology as well as autoimmune, as well as neurology, et cetera. And so that is how we are approaching it. That being said, the N for the oncology-relevant population today within our clinical genomics asset exceeds a hundred thousand patients.
Wow. And that is compelling breadth. But in terms of depth, what really makes our asset very unique, and I talk about this all the time with our product lead internally, is the availability of sequential genomic analyses. In other words, some patients in our asset have up to 20 genomic tests associated with their patient profiles.
And so you can, for example, gauge the initial genomic profile and see how that potentially evolves over time after various lines of therapy are initiated. In other words, did this targeted therapeutic work impressively, or did this immunotherapy deliver better results at certain lines across the patient journey?
And so that's sequential analysis, and being able to temporally provide insights from a genomic as well as an EHR and claims based perspective, that's what's really interesting. Secondarily, because we're pan-therapeutic, we can look at this one particular, I don't know, lung cancer patient with, like, an exon 20 mutation.
And we can not only understand their interactions with their oncology specialists or oncology providers, but we can also, for example, understand their interactions with the different specialties or primary care providers that are also managing them, to facilitate a true patient journey across the entire healthcare continuum.
Not just within the siloed focus of what the oncologist or the oncology nurse has been interacting with the patient on.
Karim Galil: And for that to happen, I'm positive you're not partnering with a diagnostic company, you're partnering with an ecosystem of diagnostic companies.
That's a really impressive product. There are three elements any scientist would like to see in data, right? Depth: they wanna see as much data as possible, like structured, unstructured, clinical, images. Breadth. And the third, which you touched on, is longitudinality. If you give me great access but only to a snapshot of a patient, well, a patient is like a book, it's a journey.
A snapshot really affects my analysis and limits what I can do. So it seems like you guys are hitting on all three of those aspects of this: the longitudinal aspect, the breadth aspect, and the depth aspect.
Eze Abosi: I would agree. I would agree. And in terms of depth, I have to tell you that the unstructured insights are what really differentiate, what I think makes our data at Optum deeper than your traditional suppliers.
Yes, integrating genomic, EHR and claims is extremely helpful. It's even better if I can tell the clients what stage the cancer patient's tumor may be at, at any given point in time in their journey. Likewise, and what's becoming incredibly powerful, is treatment response.
Especially when you start looking at markets or disease categories outside of oncology, with an emphasis on autoimmune. In other words, we have this very complex patient with this very complex autoimmune disease. It's incredibly valuable to understand their genomic profile. But why did they not respond to this core biologic that seems to work with most other patients in this population? Or why were their side effects so much more acute for this particular genomic profile versus that genomic profile?
And so being able to leverage different techniques to extract that insight via, for example, NLP, for those unique and very complex populations, that is a key differentiator and an area of focus for our clients moving forward.
Karim Galil: NLP, you asked for it. I was planning not to talk about it, but...
So, what are you seeing there? Because obviously with breadth comes complexity. As I've already mentioned, I don't think a human force, a brute force, can help you actually achieve wide scale indexing or structuring of the data, even if you have access to the breadth.
So what's exciting in NLP?
Eze Abosi: um, um, so, uh, specifically in oncology, um, staging biomarkers, tumor response, um, the elements required by the FDA for approval of your medication that are not available in tructure data simply put, um, an autoimmune severity, um, is a very, and also treatment response, which is actually very much in tune, um, with my remarks in oncology.
It basically, it resonates very well with the autoimmune categories. Neurology being able to take insights from the pathology reports. , um, redact appropriately and then extract the relevant insights to understand what the pathologist saw in their reports and how could that can help inform your drug developments.
Um, so for instance, within the case of Alzheimer's and PET scans being able to kind of assess whether it's through structured data, whether it's through real world data, whether it's through your own clinical trial data, being able to assess, um, how, what types of patients may respond to those therapies, uh, based on not only what's written in the unstructured data, but also what is available in the structured elements, and of course what's available in the actual images, a major area for drug development and just adding value moving forward.
And lastly, I'll just touch on the ability to extract the unstructured elements via NLP and use them as part of the evidence that you leverage in, for example, a synthetic control arm supporting your regulatory application. I think what's really interesting is the amount of data that you'll ultimately uncover with any good NLP solution. And so it becomes incredibly important to sift through the elements that actually matter to your question.
And so although it's helpful to understand that, for example, 10 years ago, per this EMR record and the unstructured physician notes, this patient was diagnosed with a certain disease and was prescribed Tylenol, is that really that relevant when you're trying to assess, for example, the triple negative breast cancer patient dynamic?
Karim Galil: So it's more of a targeted extraction task.
Eze Abosi: Exactly. I think in terms of unmet need, that is where there is a significant amount of opportunity for the industry to take advantage of.
How can you help me, at scale, understand what information abstracted by the NLP technology is relevant to my business question so I can run more efficiently? That is the next arena that I think novel technology platforms will be able to support in terms of NLP and real world data.
Karim Galil: Same way that you guys, I'm sure, suffer from that statement, "data is a commodity," which is, again, a very broad statement. Structured data is a commodity, but nobody's saying that. We also, on the tech side of things, hear a lot: NLP is a commodity. NLP is a solved problem. We're gonna use GPT-3, or we're gonna use transformers, we're gonna use BERT.
So there are a lot of existing technologies. Do you believe in that? Do you think it's a commodity? Again, the purpose of this podcast is to bring in folks who are excited and folks who are not excited about AI. It's not, in any shape or form, a promotion for Mendel.
So I'm not gonna take it personally. I truly wanna know: do you believe that NLP today is a commodity, or does it remain an unsolved problem for the use cases, or to your point, the questions that folks are asking today?
Eze Abosi: Yes. NLP is a commodity.
Karim Galil: Okay.
Eze Abosi: What is not a commodity...
Karim Galil: I'm gonna take it personally.
Eze Abosi: You asked me for direct feedback. And so what's not a commodity is the NLP frameworks that understand the nuances of healthcare.
More specifically, the NLP frameworks that can understand the life sciences workflows. A fantastic NLP solution should look incredibly different supporting drug development versus commercial operations. I was approached by, I won't say their name, an NLP platform today.
And they wanted to, frankly, sell me a solution that we could install locally within our Optum servers, and it would help us, you know, derive NLP at scale. "Well, that sounds fantastic. Please share with me your experience in life sciences and the specific disease categories that you have focused on with this NLP framework."
"Well, that's not really how it works. We basically sell you a box, and what you do with it is up to you." Okay. No. Delete.
Karim Galil: It's a very interesting point. So you're basically saying if you look at NLP as a generic field, it is a commodity. But if you look at NLP that understands the context of drug development and the context of the endpoints that pharma would be interested in, that becomes a different story.
Eze Abosi: Absolutely. And so if you, theoretically, being a potential partner, are trying to position an NLP solution to me that can solve all problems in healthcare, that's a red flag.
Karim Galil: A hundred percent.
Eze Abosi: If you are trying to position an NLP project to me that can solve all life science workflow needs, that's also a red flag, because I inherently believe that the approach for generally supporting discovery, medical and R&D, that core medical function, must be distinct and different from the approach for supporting commercial. Let me give you some tangible examples there.
If I am an appropriate NLP solution supporting the commercial workflows, the thought process within the extraction is going to be: how can I take this information and better support my field force or HCP engagement? That's just a much different mindset than being able to leverage that same NLP technology to support the discovery of new biomarkers, or new ways we can capture disease severity, or new ways to find misdiagnosed or undiagnosed patients, or patients broadly progressing in their disease.
Although it's all still pharma, from my perspective the use cases are distinct enough that it can't be one size fits all, and I've yet to discover one particular solution that can appease both. And that's even just within life sciences: pharma commercial and pharma R&D.
Karim Galil: That's very true. We were interviewing someone, and he's from outside healthcare. He never touched healthcare before, but he comes in from the tech world, and we were giving a demo of our product, making sure that he understands how it works.
And he asked us that question: can the NLP extract anything? And our answer was no. Because what we have seen is that for some data variables, say medications, AI can do better than humans, because for medications you have to read a thousand-page record. Humans get tired, so AI will do better. But outcomes are hard for both humans and AI.
Three humans will not agree on the outcome of a specific patient after reading the record, so you cannot expect a machine to do that. So anyways, we were going through that and he was like, do you guys have the same thing as autonomous driving, the L1, L2, L3, L4s? Like, what the heck is that?
I was like, well, in autonomous driving there's four categories, right? There's the L1, which is like fully autonomous, and then there's the L2, which is like a Tesla where your hands are on the wheel, or I forgot the categorization, but the idea was like going on a spectrum from full autonomous to actually just speed control, right?
And there are many things in between. I believe the mistake that is happening in health tech today is packaging NLP as a Google driverless car: fully autonomous, your hands are off the wheel, we got you. Whether you are in Egypt or Nigeria or in New York or California, this car is gonna be fully autonomous.
Whereas the reality of the matter is, it's somewhere in between. It really depends on the use case. It depends, to your point, even on the therapeutic area. And there is a lot of variability in how autonomous you can get with an NLP solution. And the more realistic we get about it, I think the better we're gonna be able to tailor solutions and use NLP as an augmentation tool rather than an L1, a fully autonomous, your-hands-are-off-the-wheel, we-got-you kind of approach.
Eze Abosi: Yeah, I completely agree. Which actually sparks a thought from my perspective. How interesting would it be, and who will be the first NLP solution for the life sciences, if not just medical R&D, that can accommodate multiple languages at scale? I think it's a very rational approach, or maybe it's just my very myopic Western mindset, to focus on the English language at this stage in this particular type of product development.
But it begs a very interesting question that you just touched on as you think about the global rollout of artificial intelligence, or basically smart cars that can drive themselves. Just the nuances in healthcare: although guidelines generally may be consistent across cultures, there are many, many nuances, especially by language, especially by dialect of a certain language.
Karim Galil: I practiced medicine in Egypt and I wrote medical records in English, but in Egypt, I can tell you, the way I would approach a patient or the way I would describe the context of a patient is different than what I would do here in the US.
The drugs that we can prescribe there are different. The healthcare system there is different. I wanna make sure the patient gets covered. So everything is different. One of our competitors has on their website, "We support five languages," and I was like, maybe you support five languages, but you don't support five different contexts of how to practice medicine.
It is just different. That's actually a very interesting point.
Eze Abosi: Yeah. It actually raises another point that I wanted to share, which is that there are some really unique categories, from a data science perspective, within neurology, because a number of these neurological conditions, especially for the elderly, are literally life changing. A diagnosis will not only formally confirm you have a certain disease, but your life will be changed dramatically.
Dementia, Alzheimer's, multiple sclerosis, Parkinson's. It's an unwritten rule, from my understanding, that providers will do everything in their power to elongate the diagnostic journey for those patients. Because once you're formally diagnosed with Alzheimer's, you can no longer retire, for example, in a long-term care facility; it must be a facility that accommodates dementia patients.
And so there are many ad hoc NLP exercises that I have been a part of throughout my career where we're basically looking for signals of individuals that look like they have Alzheimer's but have yet to be formally diagnosed.
And this has become incredibly pertinent as, again, the industry looks to specialize its therapeutic development by disease and focus more on tailoring therapies for early-onset neurological patients rather than those that are severe and in the later stages of their disease.
Karim Galil: That's very interesting.
Eze Abosi: Yeah, it's very interesting from an NLP perspective, and today I don't think there's a clear winner in that arena, since from my understanding natural language processing tends to be really focused on not just oncology, but specifically solid tumors. It's another unmet need that I think an innovative entrepreneur will ultimately solve within that very unique category of neurology.
Karim Galil: I can tell you why. AI and machine learning are used synonymously, right? People think when you say machine learning it's AI, and AI is machine learning, which is not really true. Machine learning is a subset of AI, and most of it relies on statistical modeling.
So you wanna give it a lot of data, then the machine learns certain rules and things from this data, right? Oncology tends to be the one where there's a lot of data out there, relatively a lot of data compared to other therapeutic areas. So it becomes a lower-hanging fruit for any NLP company to build a statistical model on top of it, because you have access to the data, right?
At Mendel, the way we solve that is we do statistical modeling, but half of our platform is not reliant on statistical modeling, simply because you don't have enough data out there. My co-founder always makes a joke. He says, if we ask an AI system today what's the best treatment for something, it's probably gonna say opioids.
And that's not the treatment. It's just, you know, because of the opioid crisis: learning from the behaviors of humans, AI is gonna learn something that is not actually clinically correct, just because it exists in the data at scale. So how you build an AI system that is not just relying on statistics, but also relying on what is true and what's false from a nuanced clinical perspective, becomes a tough problem.
Eze Abosi: Yeah, I completely agree. I would take that same exact mindset and apply it to ultra-rare disease, where the diagnostic odyssey often transpires over 6, 8, 10-plus years. And so how can you train a model for a population where there's not a lot of volume to start off with, and the volume that does exist is likely going to be relevant to patients that are undiagnosed?
It's a very complex problem, but that's why we have individuals like you and your team, thankfully, solving these complex problems for us in the life sciences.
Karim Galil: The answer lies in symbolic AI. And if you actually go to Google Scholar and search machine learning, you're gonna find papers published as of, like, yesterday, right?
A lot of papers. If you put in symbolic AI, probably the last paper published was like the late nineties. It's an approach that nobody's using anymore, just because it's not trending, it's not sexy enough. And that's the problem: if you're trying to build a drill without looking at the hole that the customer's actually trying to use the drill for, you end up with single platformed.
I wanna be conscious of your time. Super exciting work that you guys are doing on, again, getting clinical data, unstructured with structured. The more exciting thing is that you have a network effect that a lot of companies don't have, and you are opening that network effect for innovation, and you're basically getting everyone to participate so that the end user gets the highest value.
And that's what we are seeing as a winning solution today, right? Had the iPhone not opened the platform for others to build apps, or Android, they would have been a BlackBerry, right? And healthcare has a lot of BlackBerries, and it's refreshing to see a company like you moving towards being the iPhone or the Android of healthcare, because at the end of the day, the industry that we're in is touching patients.
I'm sure when you see a PubMed paper published on top of an Optum data asset, you feel like your job is worthwhile, right?
Eze Abosi: Really just happy to have the opportunity to serve our clients and all of our key stakeholders. It really is a privilege. Before we wrap up, I want to ask you just one key question, Karim.
Diversity, equity and inclusion has been a major topic resonating throughout our society here in the US as well as globally, more so than most, throughout the pandemic and as we come out of the pandemic. And that has certainly been pertinent to healthcare as well.
And one of the key focuses of our clients moving forward is ensuring that during clinical development, the clinical trial process, they have a population that represents the actual potential users of that investigational drug. So circling back to your team at Mendel, I'm really curious, because a great way, from a data science perspective, to understand the relevance of health equity and diversity is to look at SDOH, or social determinants of health.
And so are there any novel approaches, methods, et cetera, that you can comment on for how you can extract SDOH variables using NLP? Along with the genotypic and phenotypic insights, that is another data type that we're laser focused on at Optum.
Karim Galil: That's an awesome question. So I think the future is not gonna be clinical data. It's not gonna be clinical genomic data. It's gonna be social clinical genomic data. That's actually the reality of it. And there are two answers for that.
One is obviously in NLP and AI in general, trying to extract certain data variables like ethnicity, and that's usually tough. It's not easy. And I can tell you why: usually ethnicity exists in a checkbox that the patient fills in, if at all, when they come in, and a machine detecting a checkbox today is a very tough problem.
It's a really tough problem. You need to have top-notch OCR. You need to have top-notch form detection. There are companies built just on form detection. So ethnicity is an endpoint that, unfortunately, still lives in forms, and getting it out is not easy. But we're doing a lot of work on that.
The approaches that I'm seeing that I'm personally very excited about are actually outside of NLP. We're seeing folks getting grocery data and trying to understand from your shopping list. Say you get your Safeway data: you can understand how much the patient actually makes, right? Is this patient buying expensive brands or cheap brands? Is this patient buying every week or every two weeks? Is there malnutrition? What's the quality of the things that they're buying? So getting data from Costco or Safeway and actually marrying it to the EMR data and to the genomic data is pretty impressive. We're even seeing folks who are getting water quality data for the zip code that the patient lives in and marrying that to it.
There are a lot of data variables and parameters that we're seeing some of our clients adding there, but unfortunately not all of them are easily available. But to your point, it's definitely a move that's happening. The other way of looking at it is from an international perspective, to your point earlier, right?
A lot of the data you have today is based on white male Caucasians in Western Europe or the US. But then take that same drug and go to Egypt, for example, right? It's a whole different story. You raised the point about affordability. Maybe it's a great drug, but nobody can afford it. It's a hundred-thousand-dollar drug.
So Sovaldi was a great example of that. A hundred-thousand-dollar drug, great for hepatitis C, but no Egyptian has a hundred thousand dollars to spend on a drug, and they had to work to make this happen. And those elements still don't exist in the data. And I'm excited to see that this is becoming more of a topic that folks are talking about today.
Eze Abosi: Absolutely. Yeah. Great answer.
Karim Galil: But it's interesting that every time someone is taking a Pfizer vaccine, a Covid vaccine, there is a data element in it that is coming in from Optum. And here is what's exciting. I mean, I'm all for privacy and all of that. For the record, I'm all for privacy. Yet, when you look at the grand benefit to the population, right, is it really worth prohibiting data sharing over all the concerns about privacy when you can truly, truly save lives if you integrate those data into the day-to-day practice of drug development or commercialization?
I don't know. The answer is more philosophical than just the 99% accuracy that HIPAA is asking for.
Eze Abosi: Agreed.
Karim Galil: Yeah. Hey, Eze, thank you so much for taking the time. Time passed by. It was a great discussion. We touched on a lot of different things. And again, thank you for making the trip. It's time for us to get some dinner.
So we're gonna have to wrap up. Any final thoughts or comments?
Eze Abosi: So thank you. Thank you to the Mendel team. This was truly a pleasure, truly an organic and free-flowing discussion. Thank you for having me. Looking forward to keeping the lines of communication open, and hopefully we can further collaborate moving forward. Thank you again. Appreciate it.
Karim Galil: Thank you. Today's episode was recorded by Benny Pham. Thank you, Benny. He's behind the camera and yeah, we'll see you in the next episode. Bye.
Delays in clinical trial enrollment and difficulties enrolling representative samples continue to vex sponsors, sites, and patient populations. Here we investigated use of an artificial intelligence-powered technology, Mendel.ai, as a means of overcoming bottlenecks and potential biases associated with standard patient prescreening processes in an oncology setting.
The application of Artificial Intelligence (AI) in healthcare has been revolutionary, especially with the recent advancements in transformer-based Large Language Models (LLMs). However, the task of understanding unstructured electronic medical records remains a challenge given the nature of the records (e.g., disorganization, inconsistency, and redundancy) and the inability of LLMs to derive reasoning paradigms that allow for comprehensive understanding of medical variables. In this work, we examine the power of coupling symbolic reasoning with language modeling toward improved understanding of unstructured clinical texts. We show that such a combination improves the extraction of several medical variables from unstructured records. In addition, we show that the state-of-the-art commercially-free LLMs enjoy retrieval capabilities comparable to those provided by their commercial counterparts. Finally, we elaborate on the need for LLM steering through the application of symbolic reasoning as the exclusive use of LLMs results in the lowest performance.
We’ve changed our look. Our goal remains the same: make medicine objective. The new site highlights the way our proprietary AI enables organizations to achieve quality and scale when structuring unstructured data. It comes down to supercharging your clinical abstraction. We’ve validated that our human-in-the-loop abstraction approach can support a machine that understands medical context like a physician. In our own experiments, the number of variables needing correction decreased by 40%. High quality abstraction = high quality data for cohort selection, real-world evidence, and registries.
The customer, a key player in the genomics space, had a strategic initiative to build a clinico-genomic database to support their life sciences customers.
One clinical trial organization was using manual chart review and was looking to reduce the time it takes to find eligible patients.
From the Desk of the AI Team
Organizations that use patient data for internal or external research need to take steps to prevent the exposure of PHI to those who are not authorized to view it. They do this by redacting specific categories of identifiers from every patient document. Once the identifiers are masked, the risk profile of these datasets is significantly reduced. But how do you ensure that redaction engines are working to the highest accuracy?
The Mendel team is still buzzing from our week-long retreat in Cairo. The theme of the retreat was “coming together” and it was the first time the American and other remote employees were united with their Egyptian counterparts. Although there were many adventures–missing flights, seeing the pyramids, haggling at Khan el-Khalili–the highlight of the trip was collaborating together, as one global organization.
Competence via comprehension
Artificial intelligence (AI) is playing an increasingly important role in the healthcare industry. But to fully leverage the potential of AI, it must be equipped with clinical reasoning skills - the ability to truly comprehend clinical data, or in other words, to read it as a doctor would. When it comes to data processing tools, only a tool capable of clinical reasoning can effectively process unstructured clinical data.
Sailu Challapalli, our Chief Product Officer, spoke at a recent Harvard Business School Healthcare panel. The event brought together different healthcare and AI experts to discuss large language models and their impact.
Manually abstracting patient data at scale is a herculean task for humans alone. It is slow, expensive, difficult, and requires extreme precision and accuracy. Organizations have to choose between breadth and depth when it comes to making data useful for decision making. Because of these challenges, the Mendel team created Carbon. Carbon is an easy-to-use workspace that allows clinical abstraction teams to efficiently curate high quality clinical datasets at scale. The foundation of Carbon is Mendel’s AI. Carbon pulls directly from Mendel’s AI platform to give abstractors a head start in identifying relevant data elements within a patient’s chart.
Within the real world evidence space, the generally accepted process for creating a regulatory grade data set is to have two human abstractors work with the same set of documents and bring in a third reviewer to adjudicate the differences. These datasets also serve a second purpose - as a reference standard against which the performance of human abstractors can be measured. Although this remains the industry standard, it is expensive, time consuming and difficult to scale.
From the Desk of the AI Team
AI projects have created tangible results for a wide range of industries. Despite the innovation, it is important to remember that AI is not a magic wand that will solve every problem in every industry with a single wave.