I'm going to switch up topics for this Friday. Normally I focus on Credit Union Data and AI journeys. But some conversations stay with you long after the lunch bill is paid.
On Tuesday I was catching up with former colleagues, and the topic that kept resurfacing was GenAI and AI agents. How do you let people experiment with new tools in clinical and research environments without creating ungoverned risk or undermining patient trust?
The examples we touched on felt familiar. People want to try GenAI on complex research data to see whether it can surface patterns or hypotheses they might otherwise miss. They would like to explore potential patient cohorts or study populations with natural language, instead of waiting weeks for a traditional request to work its way through the queue. Underneath those ideas is a common tension: how do you explore what is possible without stepping outside HIPAA, NIH or NCI expectations, IRB oversight, and institutional policies?
Quick guardrails before perfect frameworks
Walking away from that conversation, I was reminded that policy usually trails practice. People are already using tools. In most organizations, some form of ghost AI (often called shadow AI: tools used for work outside anything the organization has sanctioned) is here whether anyone has written a standard for it or not.
That made me think about some of the ways we tackled what started as a GenAI “fad” and turned it into a useful supporting tool. The moves that helped most were small and concrete. Not a long AI strategy, but a short, narrow policy that does two things clearly. It sets a few basic boundaries for GenAI at work, and it points people to an enterprise option they can actually start using.
The intent is not to shut anything down. It is to acknowledge reality and give people a safer path. If all they hear is “no,” ghost usage will keep growing quietly. If they hear “here is where this is okay, and here is the tool we support,” most will choose the supported path.
In plain language, I think about it as saying: work information belongs in AI tools the organization can see and manage, and some kinds of content belong only in specific places for specific purposes. Nothing elaborate. Just enough clarity that people are not guessing where the lines are.
Not all AI use cases need the same guardrails
That lunch also reinforced a pattern I keep seeing. AI questions get tangled because we talk about them as if they were one thing. In practice, most of the GenAI demand in clinical and research settings falls into two different tracks.
The first track is everyday GenAI that supports the work around the work. Drafting and editing documents. Summarizing papers or guidelines. Turning messy notes into something coherent. For many people, that is where GenAI will live most of the time.
The second track is data‑intensive AI and agents that touch clinical and research data. Using GenAI with specialized models or codecs on clinical, imaging, or research datasets to surface patterns or hypotheses. Exploring potential cohorts or study populations through natural language against governed data. Agents that reach into multiple systems to assemble information for studies or operations.
Both tracks matter. They just do not need the same scaffolding. Everyday GenAI is still bounded by privacy and confidentiality, but it is a different risk story than a system that queries live data or influences trial design and operations. Separating the two makes it easier to imagine what “just enough” governance looks like for each.
A cohorting example that keeps it high level
Cohort exploration is one place where this shows up clearly.
Right now, a common pattern looks like this. Someone has a question about a potential population, submits a request, and waits while an analyst builds a cohort. By the time the dataset is ready, the question may have shifted.
The version many teams are aiming for is more iterative. Describe a population in plain language through a secure interface. Let the system translate that into structured queries against a governed data mart. Get back a count and some high‑level characteristics, clearly labeled as exploratory rather than IRB‑ready or suitable for individual decisions.
At that level, it is less about replacing existing processes and more about giving researchers and clinicians a faster way to see whether an idea is worth pursuing.
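To make that flow a little more concrete, here is a minimal sketch of the last two steps: running a structured query against a governed data mart and returning a labeled, aggregate-only result. It assumes the plain-language description has already been translated into structured criteria, which is the hard part and is not shown. The table name `deidentified_patients`, its columns, the `CohortCriteria` fields, and the demo rows are all hypothetical, and an in-memory SQLite database stands in for whatever governed store an organization actually uses.

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical label text; real wording would come from local policy.
EXPLORATORY_LABEL = "EXPLORATORY: not IRB-ready, not for individual decisions"


@dataclass
class CohortCriteria:
    """Structured criteria, assumed to be produced upstream from plain language."""
    diagnosis_code: str
    min_age: int
    max_age: int


def explore_cohort(conn: sqlite3.Connection, criteria: CohortCriteria) -> dict:
    """Return aggregate-only results; no row-level data leaves this function."""
    count, mean_age = conn.execute(
        "SELECT COUNT(*), AVG(age) FROM deidentified_patients "
        "WHERE diagnosis_code = ? AND age BETWEEN ? AND ?",
        # Parameters are bound, never interpolated into the SQL string.
        (criteria.diagnosis_code, criteria.min_age, criteria.max_age),
    ).fetchone()
    return {
        "count": count,
        "mean_age": round(mean_age, 1) if mean_age is not None else None,
        "label": EXPLORATORY_LABEL,
    }


def build_demo_mart() -> sqlite3.Connection:
    """Create a tiny in-memory table with made-up rows, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE deidentified_patients "
        "(patient_key INTEGER, age INTEGER, diagnosis_code TEXT)"
    )
    conn.executemany(
        "INSERT INTO deidentified_patients VALUES (?, ?, ?)",
        [(1, 54, "C50"), (2, 61, "C50"), (3, 47, "C34"), (4, 70, "C50")],
    )
    return conn


if __name__ == "__main__":
    mart = build_demo_mart()
    print(explore_cohort(mart, CohortCriteria("C50", 50, 65)))
```

The design choice worth noticing is that the function only ever returns a count, a summary statistic, and the label. Anything beyond that would go back through the existing request process.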
Thinking in tiers instead of a single switch
When I try to make all of this practical, I keep coming back to a three‑tier picture rather than a single on or off switch. It will not fit every organization exactly, but it has been a useful mental model.
- Tier 1 is low‑risk exploration. Synthetic or de‑identified data, schemas, documentation. The point is learning what GenAI and simple agents can do, without any live clinical data in the mix.
- Tier 2 is a more controlled data environment. De‑identified or limited datasets under existing governance, with logging, time‑bounded access, and clear expectations about re‑identification. This is where those cohorting ideas and exploratory AI on research data feel more comfortable.
- Tier 3 is anything close to production clinical use. Systems that touch near‑real‑time EHR, registries, or operational data in ways that might change care, access, or trial pathways. Here it becomes hard to avoid the language of validation, monitoring, humans firmly in the loop, and clear ownership when something goes wrong.
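One way to make the tiers usable at intake is to write them down as a small rule. The sketch below is only my reading of the three tiers above, not a standard: the `UseCase` fields, the order of the checks, and the control lists are illustrative assumptions that any organization would need to adapt.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    EXPLORATION = 1       # synthetic data, schemas, documentation
    CONTROLLED_DATA = 2   # de-identified or limited datasets under governance
    NEAR_PRODUCTION = 3   # near-real-time EHR, registries, operational data


@dataclass
class UseCase:
    """Hypothetical intake record for a proposed GenAI or agent use case."""
    name: str
    uses_governed_research_data: bool   # de-identified or limited datasets
    touches_live_operational_data: bool  # near-real-time EHR, registries
    may_change_care_or_trials: bool      # could alter care, access, pathways


# Controls paraphrased from the tier descriptions; the grouping is an assumption.
CONTROLS = {
    Tier.EXPLORATION: ["no live clinical data"],
    Tier.CONTROLLED_DATA: [
        "logging",
        "time-bounded access",
        "re-identification expectations",
    ],
    Tier.NEAR_PRODUCTION: [
        "validation",
        "monitoring",
        "human in the loop",
        "named owner",
    ],
}


def classify(use_case: UseCase) -> Tier:
    """Assign the highest tier whose trigger applies (strictest check first)."""
    if use_case.touches_live_operational_data or use_case.may_change_care_or_trials:
        return Tier.NEAR_PRODUCTION
    if use_case.uses_governed_research_data:
        return Tier.CONTROLLED_DATA
    return Tier.EXPLORATION


if __name__ == "__main__":
    cohorting = UseCase("natural-language cohort counts", True, False, False)
    tier = classify(cohorting)
    print(tier.name, CONTROLS[tier])
```

Checking the strictest condition first means a use case that qualifies for more than one tier always lands in the higher one.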
Underneath that, I picture a small, cross‑functional group asking the same questions whenever something new appears. What can this system reach? How much can it do on its own? Who is on the hook for the outcomes? That feels less like building a big governance machine, and more like making sure someone is consistently looking at the right things.
For me, that is where Tuesday’s lunch landed. The conversation itself stays in the room, as it should. What it surfaced, though, is a broader question that applies in a lot of places. Given that people are already using GenAI and agents, how visible do you want that to be, and what is the lightest structure that still keeps patients, research, and trust where they need to be?
In your world, what would a version of that lightest structure look like over the next year, and where are you starting to notice ghost AI at the edges of your research and clinical workflows?