How we're using generative AI to support outpatient care in Peru

A blog by Peter Sahota, PhD, from Inavya, a Frontier Tech implementing Partner

Pilot: EmpatIA - AI to enhance healthcare in remote areas of Peru

 

Artificial Intelligence (AI) can be positioned as a copilot (an assistant or companion that supports humans in achieving their goals), as an agent that carries out tasks semi-autonomously, or as a collaborative partner working alongside humans.  Large language models (LLMs) in particular, which are capable of human-like conversational understanding, can also act as a powerful coach or tutor.  The wide availability of LLMs and other generative AI models provides a unique opportunity to develop a bespoke AI solution that meets the identified needs of outpatients.  Inavya’s concept of an AI Avatr draws on all of these ideas in a way that works in healthcare and in a global context.  For us, the term Avatr is about empowering people to act smartly and connectedly by dynamically representing their interests and concerns in digital form.

Working in partnership with hospitals such as Detecta Clinic to understand the needs of healthcare teams and outpatients, we have been developing and testing an App-based AI conversation agent to support outpatient care. This gives outpatients who are following a treatment plan the level of healthcare literacy they aspire to, and the medical information their healthcare team would want to share with them in response to their queries.  Functionally, this means the AI agent holds a natural human conversation with an outpatient that reflects exactly what their healthcare team would say, if only they could be available to the patient at any time and for any length of time.

In the context of Peru, we also focused on two further aspects: the language used for the conversation (‘multi-lingual ability’) and the ability to give answers illustrated with relevant images or videos (‘multi-modal content’).  A final development underway is a multi-disciplinary team of agents that search, reason and act to perform tasks on behalf of the outpatient (‘multi-agent development’).

AI conversation agent

In an era of widespread digital technologies including websites, Apps and ChatGPT, expectations are rising about how healthcare should be delivered.  Further, in an era of social media, there are new ways to educate and inform patients effectively.  In this context, using digital tools for healthcare information, advice, support and encouragement becomes very intuitive.  However, when providing support to an outpatient with a given medical condition and an agreed treatment plan, a higher threshold is needed for such communication: accuracy of information, consistency of advice, relevance of support, and encouragement aligned with the goals of the agreed treatment plan.

As such, outpatients find it very valuable to be able to turn to an App on their smartphone that can hold a conversation at the level of a healthcare professional who is fully informed about their current medical status, their treatment plan, and the care protocols and pathways in use and validated by their hospital.  Feedback from Detecta Clinic is that their patients find it easier to take in information that is delivered in different ways, especially as videos and images.

Our AI conversation agent has been developed by orchestrating the interaction between LLMs, tools and frameworks. This means that, in contrast with general-purpose chat models like ChatGPT, we optimise the way in which LLMs and other AI capabilities are deployed to meet only the identified needs of this specific user group.  We use retrieval augmented generation (RAG), which means it learns all and only the information provided by each healthcare team, and draws on just that information when talking with that team’s outpatients.  This knowledge can be quickly and easily loaded into a knowledge base by members of the hospital team.  In this way, the AI agent becomes a bespoke service for each clinical unit, providing advice and guidance that is soundly based on their patient profile and treatment protocols.  This is important as such protocols differ from one country to another, and to some extent even from one hospital to another, based on local and regional needs and priorities.  By setting out values and norms that we want the AI agent to adhere to in an AI constitution, which can draw on a hospital’s own code of conduct for its healthcare team, we align the behaviour of the AI with the standards of knowledge, professionalism and tone of the best hospital clinicians.  We continue to develop and test this agent, and to improve it in response to feedback.
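To make the RAG pattern concrete, here is a minimal sketch using the open-source chromadb vector store. The documents, IDs and the generate_answer stub are hypothetical placeholders for illustration, not our production stack.

```python
# Minimal RAG sketch: the agent answers only from documents the
# hospital team has loaded into its knowledge base.
# Assumes `pip install chromadb`; the LLM call is a stub placeholder.
import chromadb

client = chromadb.Client()
kb = client.create_collection(name="hospital_protocols")

# The hospital team loads its own factsheets and protocols (examples only).
kb.add(
    ids=["diet-001", "meds-001"],
    documents=[
        "Post-surgery patients should limit red meat to 100g per day.",
        "Take the prescribed anticoagulant every morning with food.",
    ],
)

def generate_answer(prompt: str) -> str:
    # Placeholder so the sketch runs; in production this would call an
    # LLM with an instruction to answer ONLY from the retrieved context.
    return f"[LLM would answer from this prompt]\n{prompt}"

def ask(question: str) -> str:
    # Retrieve the most relevant passages from the hospital's own material...
    hits = kb.query(query_texts=[question], n_results=2)
    context = "\n".join(hits["documents"][0])
    # ...and ground the model's answer in just that context.
    prompt = (
        "Answer using only the context below. If the answer is not in the "
        f"context, say you will check with the care team.\n\nContext:\n{context}"
        f"\n\nQuestion: {question}"
    )
    return generate_answer(prompt)

print(ask("How much red meat can I eat?"))
```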

The behaviour of LLMs is non-deterministic: the conversation agent will not necessarily give the same response to the same query asked a second time.  This means the exact behaviour of the conversation agent cannot be predicted with certainty or exactly controlled.  As such, to ensure the safety and appropriateness of each response, various checks and constraints have to be applied.  These may include a separate peer review agent that critiques the response and passes its suggestions for improvement back to the first agent.  Pre-built guardrails frameworks that enforce alignment with expected behaviour can also be used.  AI capabilities continue to improve, and we expect these limitations to be minimised or overcome in future.
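The peer-review pattern can be sketched as a draft-critique-revise loop. This is an illustration only; call_llm is a stand-in for a real model API rather than any specific framework.

```python
# Sketch of a draft-critique-revise loop: a second "peer review" agent
# checks the first agent's answer before it reaches the outpatient.
# `call_llm` is a stand-in for a real model API call.

def call_llm(system: str, user: str) -> str:
    # Placeholder so the sketch runs; production would call a hosted LLM.
    return f"[{system[:30]}...] response to: {user[:40]}..."

def answer_with_review(question: str, max_rounds: int = 2) -> str:
    draft = call_llm("You are a clinical assistant. Answer from the "
                     "approved knowledge base only.", question)
    for _ in range(max_rounds):
        critique = call_llm(
            "You are a peer reviewer. Check the draft for accuracy, "
            "safety and tone. Reply APPROVED if no changes are needed.",
            f"Question: {question}\nDraft: {draft}")
        if "APPROVED" in critique:
            break
        # Feed the reviewer's suggestions back to the first agent.
        draft = call_llm("Revise your answer using this feedback.",
                         f"Feedback: {critique}\nQuestion: {question}")
    return draft
```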

In this example, the system can give bespoke advice based on its knowledge of the patient’s details, condition and treatment plan

Multi-lingual ability

In terms of language, the Spanish spoken in Peru is very similar to standard Spanish, with some differences in vocabulary and pronunciation.  In addition, Peru is home to indigenous languages such as Quechua and Aymara, which are spoken particularly in rural areas; both are official languages of Peru alongside Spanish.  Quechua is today spoken by about seven million Peruvians, mostly in the southern and central highlands, and especially in the area around Cusco, where the proportion of speakers rises to 41%.  Aymara is spoken by around half a million Peruvians, mostly in the southern Puno region near Lake Titicaca and the border with Bolivia.  As these regions are hard to reach for existing healthcare services, an AI agent that can talk fluently in these languages would bring a disproportionate benefit.

For the AI agent, attempts were thus made to find an LLM that could converse in these languages.  Peruvian Spanish was not a problem.  A number of LLMs were identified that claim to support Quechua and Aymara, but we were not able to get good conversational responses from them in testing.  On closer examination, the amount of training data in these languages appears to have been too small: these models have likely been released as base models that could be improved through fine-tuning on more training data in these languages.  Finding a way to work with these languages would bring distinctive value to the work, and could be a pathfinder for similar efforts with less-resourced languages elsewhere in the world.
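For illustration, the kind of fine-tuning run this implies might look like the sketch below, using the Hugging Face transformers and datasets libraries. The base model choice and the quechua_corpus.txt file are assumptions for the example, not assets any particular vendor has released.

```python
# Sketch: continue training a multilingual base model on extra Quechua
# text. The corpus file is hypothetical; model choice is an example.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "bigscience/bloom-560m"  # example multilingual base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Plain-text corpus of Quechua sentences, one per line (hypothetical file).
corpus = load_dataset("text", data_files={"train": "quechua_corpus.txt"})
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="quechua-ft",
                           per_device_train_batch_size=4,
                           num_train_epochs=1),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```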

Multi-modal content

In terms of multi-modal content, we learnt from Detecta Clinic and other hospitals that they have videos and factsheets with images that they provide to patients. An AI agent that can deliver these resources to outpatients as and when needed, in a more personalised and targeted way, should be more effective.

We have thus improved the conversation agent by adding a text-to-image similarity search capability.  When the agent talks with the outpatient, it can add images to illustrate its response where appropriate.  For example, if the outpatient asks how much meat they should eat, the agent can give an answer and also show a picture of the recommended portion size, which may differ between patients depending on factors such as their gender.  As this is a RAG system, the agent will only draw on images previously loaded into the knowledge base by the hospital team.  As a first step, the agent builds up its own image bank by extracting images from PDFs and other documents as well as directly from JPEGs.
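The retrieval step itself can be sketched with an open CLIP-style model that embeds text and images into the same vector space. This is a minimal illustration using the sentence-transformers library; the file names are hypothetical.

```python
# Sketch of text-to-image similarity search: embed the hospital's image
# bank and the patient's query with a CLIP model, then pick the closest
# image. File names are hypothetical examples.
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")  # embeds text AND images

# Image bank previously extracted from the hospital's PDFs and JPEGs.
files = ["meat_portion.jpg", "insulin_pen.jpg", "stretch_exercise.jpg"]
image_embs = model.encode([Image.open(f) for f in files])

def best_image(query: str) -> str:
    query_emb = model.encode(query)
    scores = util.cos_sim(query_emb, image_embs)[0]
    return files[int(scores.argmax())]

print(best_image("How much meat should I eat per day?"))
```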

However, in initial testing, we found a couple of weaknesses which will be addressed in a further phase of development -

  • Extraction of images from PDF docs is slow (~2 minutes per PDF)

  • Selection of an image to illustrate the answer is inconsistent: it sometimes works well and sometimes retrieves a less relevant image

In the future, we can also add video capability, and the AI may be able to generate images and short-form videos that are relevant to the patient’s question and reflect their individual needs.  Videos can support the patient by, for example, demonstrating an exercise they should do or showing them how to use a given medical device.

The text-to-image retrieval ability also lays a foundation for further powerful use-cases.  For example, it can be used to bring up visualisations that help inform patients, improve communication between healthcare professionals, and support medical education.  In medical education, for instance, an image bank can be queried to show what patterns might be seen in scans of a certain rare disease.  In drug discovery, related text-conditioned models can generate visual models of protein structures from textual descriptions or genetic information.

Building on this, an image-to-image similarity search can be used to bring up images similar to those of someone being newly tested and diagnosed, which can support diagnosis.  An AI model trained on an image bank of medical scans, such as mammography, ultrasound and MRI scans for breast cancer, can improve diagnostic accuracy and speed by comparing each new set of medical images with the labelled images in the image bank.  Detection of anomalies in medical scans that could indicate the presence of a disease such as breast cancer can also be done more quickly and accurately with a trained AI model.
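In the same embedding space, image-to-image search reduces to comparing a new scan against the labelled bank. The sketch below illustrates the retrieval mechanics only, not a validated diagnostic tool; the files and labels are hypothetical.

```python
# Sketch: retrieve the labelled scan most similar to a new scan.
# Illustration only: a real diagnostic aid needs a model trained and
# validated on medical imaging, plus regulatory approval.
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")
bank = {"scan_001.png": "benign", "scan_002.png": "malignant"}  # hypothetical
bank_embs = model.encode([Image.open(f) for f in bank])

new_emb = model.encode(Image.open("new_patient_scan.png"))
scores = util.cos_sim(new_emb, bank_embs)[0]
nearest = list(bank)[int(scores.argmax())]
print(f"Most similar labelled scan: {nearest} ({bank[nearest]})")
```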

Multi-agent development

AI agents are software entities that use the intelligence of LLMs to process requests from users in a flexible way. They have skills, memory and access to tools and knowledge, so they can create and execute plans: reasoning through problems, deciding how to handle input and which tools to use, and adapting their approach based on the outcome of each step.

Our ongoing multi-agent development will improve the quality of the response by routing the outpatient’s query to one or more AI experts in a multi-disciplinary AI team, such as a nutritionist or an exercise physiologist.  Each of these AI experts may have access to different skills and tools, e.g. searching a database and crafting a report for the user or for another AI expert.
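A simple version of this routing can be sketched as follows. The expert functions and keyword table are illustrative stand-ins; a production router would use an LLM classifier and tool-equipped agents rather than keywords.

```python
# Sketch of routing an outpatient query to one or more expert agents.
# Keyword routing keeps the sketch self-contained; in production an LLM
# would classify the query and each expert would call its own tools.

def nutritionist(query: str) -> str:
    return f"[nutritionist agent] advice for: {query}"

def exercise_physiologist(query: str) -> str:
    return f"[exercise physiologist agent] plan for: {query}"

EXPERTS = {
    "diet": nutritionist, "meat": nutritionist, "food": nutritionist,
    "exercise": exercise_physiologist, "walk": exercise_physiologist,
}

def route(query: str) -> list[str]:
    # Dispatch to every expert whose domain the query touches.
    matched = {fn for kw, fn in EXPERTS.items() if kw in query.lower()}
    return [fn(query) for fn in matched] or ["[general agent] " + query]

print(route("How much meat can I eat, and should I walk daily?"))
```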

This same agentic capability can later be deployed for a range of other use-cases including -

  • Automating workflows for clinicians, such as preparing the discharge plan and doing routine documentation, which currently take about 10 hours per week per clinician;

  • Clinical decision support, providing clinicians with relevant and timely information they need for treatment planning and patient care;

  • Medical research and education, drawing on the latest peer-reviewed publications and medical datasets to provide timely knowledge and insight;

  • Supporting the development of evidence-based and data-driven clinical guidelines and care pathways, and especially by drawing on the vast amounts of dark data that are not otherwise being used.

AI safety and alignment

AI alignment refers to the need to ensure that what AI systems are actually doing is aligned with what we want and expect them to do.  AI safety is a wider concept, which includes risks of harm through unintentional effects or malicious use.

In a healthcare setting, it is important that the advice provided is as accurate and clear as possible.  We need to balance the need to provide a helpful response against the imperative not to provide a harmful one.  Every effort should be made to avoid hallucinations: plausible-sounding responses that are inaccurate or misleading.  Conversations not relevant to the patient’s condition and treatment should also be avoided, as should content that is offensive, illicit, or that perpetuates bias and stereotypes.

We are continuing to work on a variety of issues related to AI safety and alignment as we move towards a production deployment.  These include - 

  • the use of explicit guardrails to moderate the response (see the sketch after this list)

  • red-teaming exercises to test the AI's responses to unexpected or adversarial inputs, to probe for security vulnerabilities, and to evaluate bias in the AI's decision-making processes

  • improving the interpretability of the system, and ensuring that users have confidence in the system

  • testing with and feedback from clinicians, patients and other stakeholders on alignment with ethical values and the treatment protocol, including a feedback mechanism within the chat agent that outpatients can use to report problematic behaviour

  • data protection compliance, including access and audit controls, encryption, user informed consent mechanisms, and data collection minimization
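As a minimal illustration of the first item in this list, a guardrail layer can screen both the incoming query and the outgoing answer before anything reaches the outpatient. The topic and block lists here are hypothetical; a production system would use a dedicated guardrails framework.

```python
# Minimal guardrail sketch: screen the query for scope and the answer
# for unsafe content before it reaches the outpatient. Topic lists are
# hypothetical; production would use a dedicated guardrails framework.

IN_SCOPE = ("diet", "medication", "exercise", "appointment", "symptom")
BLOCKED = ("other patients", "dosage override", "diagnose me")

REFUSAL = ("I'm sorry, I can only discuss your own condition and "
           "treatment plan. Please contact your care team directly.")

def guarded_reply(query: str, generate) -> str:
    q = query.lower()
    # Input guardrail: refuse off-topic or suspicious queries outright.
    if any(b in q for b in BLOCKED) or not any(t in q for t in IN_SCOPE):
        return REFUSAL
    answer = generate(query)
    # Output guardrail: suppress answers that leak blocked content.
    if any(b in answer.lower() for b in BLOCKED):
        return REFUSAL
    return answer

print(guarded_reply("Can you show me other patients' records?",
                    lambda q: "Here is some advice about your diet."))
```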

In this example, we see that the system can illustrate its answer if useful, and also how it can provide responses that balance helpfulness and harmlessness for the outpatient

In this example, we see how the system politely declines any query that is out-of-scope or that comes from a malicious user attempting to access unauthorized information

The capabilities and benefits outlined above represent a much-needed advance on current processes at hospitals in Peru, which typically involve handing out printed factsheets or sharing a limited range of videos in ways that are neither customised nor personalised and are not delivered in real time as outpatients need them. These features are also expected to be especially helpful to people living in remote areas who cannot access healthcare services in person.

Further challenges are to make a robust deployment in a production environment that fully addresses AI safety and alignment, regulatory compliance, security auditing, authentication mechanisms, and easy scalability.

The literature on medication adherence has established that patients’ lack of knowledge, worries, concerns and misconceptions are a major factor in why patients don’t adhere to their treatment plan.  Although healthcare professionals are advised to address these systematically, they typically lack the time to do so.  By directly tackling these barriers, the outpatient's confidence in managing their health condition should improve, leading to better patient-physician concordance and increased adherence to the treatment plan.

Feedback from Peru healthcare experts

During a March 2024 working visit to Peru, meetings with clinicians from Detecta Clinic provided an opportunity to explore GenAI's potential benefits for supporting clinical teams and patients.  

Comments and recommendations from these discussions are documented in our report, reflecting strong interest in the development and deployment of GenAI to meet pressing healthcare access challenges for patients across Peru, particularly those who live in the highlands and rainforest. Distance creates a significant barrier to healthcare access, leading to further complications and costs to patients and their families.

The project team also presented to students and academics from the Cayetano Heredia University about GenAI's potential utility to support healthcare transformation. The presentation was followed by a rich discussion during the Q&A segment, which included topics such as GenAI’s potential to support innovation development for medical and engineering students and faculty. 

Cayetano Heredia University: GenAI Seminar


If you’d like to dig in further…

🚀 Explore this pilot’s profile page

📚 Read about the pilot’s first Sprint and their three key phases — “Project EmpatIA Starts”

📚 Learn about the pilot’s local impact and global reach — “Project EmpatIA: Local Impact, Global Reach”

📚 Explore learnings from the pilot’s third Sprint — “Project EmpatIA: Innovating within the Peru Ecosystem”

📚 Read the fourth sprint findings on engaging healthcare providers — “Project EmpatIA: Engaging with Healthcare Providers & Patients in Peru”

📚 Read the final learnings from the pilot here — “Project EmpatIA: Enabling Healthcare Transformation in Peru”

Frontier Tech Hub
The Frontier Technologies Hub works with UK Foreign, Commonwealth and Development Office (FCDO) staff and global partners to understand the potential for innovative tech in the development context, and then test and scale their ideas.