Work Next to AI, Not With It

July 19, 2023      Kevin Schulman, Founder, DonorVoice and DVCanvass

I’m struck by the hubris of AI prognosticators – perhaps including myself in that category, though at least I’m admitting it…

There are some forecasters with a pretty good track record – geopolitical gurus and bridge (the card game) experts do well, as do meteorologists (really). Technology forecasters, on the other hand, are pretty lousy, experts included. MIT’s Technology Review has done a historically bad job of reading the tea leaves and predicting the future of tech. Its authors missed smartphones entirely…

One sliver of the AI future replete with chest-thumping harrumphs is the notion that AI does, or will, work best in partnership with humans. Maybe, maybe not.

One instance where it currently doesn’t?  Radiology, one of many fields where AI disruption is already here.

The analysis compared:

  • Radiologist accuracy without AI
  • AI alone
  • Radiologist + AI

The AI beat the radiologists 65% of the time, but on average the radiologist + AI combo did no better than either of the other two conditions. This despite the radiologists having what was determined to be useful, contextual, patient-specific information the AI didn’t – e.g., doctor’s notes and patient history.

In theory this is the perfect combo: a human with unique, special, “small data” paired with the massive computing power of AI. But the combo failed. Why?

That the combo failed on average hides a lot: there was heterogeneity in the data. When the AI was highly confident, the human + AI combo worked as intended. But when the AI offered anything less than a very-high-probability diagnosis, the humans intervened and produced sub-optimal outcomes. They also took longer per patient to produce a worse result.
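To make the mechanism concrete, here is a minimal toy simulation of that pattern. Every number in it is an invented assumption for illustration – none comes from the study itself.

    import random

    random.seed(0)

    # All parameters below are invented assumptions, not figures from the study.
    P_HIGH_CONF = 0.6          # share of cases where the AI is highly confident
    AI_ACC_HIGH = 0.95         # AI accuracy on those high-confidence cases
    AI_ACC_LOW = 0.75          # AI accuracy on the rest
    HUMAN_OVERRIDE_ACC = 0.70  # human accuracy when second-guessing the AI

    def simulate(n_cases, humans_override_low_confidence):
        """Share of correct diagnoses under a simple decision rule."""
        correct = 0
        for _ in range(n_cases):
            if random.random() < P_HIGH_CONF:
                acc = AI_ACC_HIGH            # both policies defer to the confident AI
            elif humans_override_low_confidence:
                acc = HUMAN_OVERRIDE_ACC     # human weights own opinion too heavily
            else:
                acc = AI_ACC_LOW             # AI alone handles the harder cases
            correct += random.random() < acc
        return correct / n_cases

    print("AI alone:         %.3f" % simulate(100_000, False))
    print("Human + AI combo: %.3f" % simulate(100_000, True))

With these made-up numbers the combo lands below the AI alone, precisely because the human overrides are concentrated on the cases where the human is no better than the machine.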

Human bias intervened. When the machine said “I think it’s this…” the human weighted their own opinion too heavily. The humans also ignored the fact that most of the time their opinion and the machine’s were the same – i.e., highly correlated. That should have increased confidence and shortened decision times, but people seem to exhibit an anti-automation bias.

The human experts under-respond to information other than their own.  Where does this happen in our world?

Selection, for one. I can remember many an instance where good old-fashioned statistical modeling of RFM data was used to select X cases for a mailing, only to be second-guessed by fundraising “experts” who insisted on forcing every $50-100 donor with 0-to-12 recency into the select, regardless of the model.
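As a hypothetical sketch of that override – donor file, scores, and cutoffs all invented, with “score” standing in for the output of an RFM response model:

    import pandas as pd

    # Invented donor file; 'score' stands in for a modeled response probability.
    donors = pd.DataFrame({
        "donor_id": range(1, 9),
        "last_gift": [55, 500, 75, 25, 90, 60, 300, 80],
        "recency_months": [3, 2, 10, 20, 6, 11, 1, 8],
        "score": [0.12, 0.81, 0.09, 0.05, 0.22, 0.07, 0.77, 0.15],
    })

    N = 3  # mail budget

    # What the model says: mail the top-N scorers.
    model_select = donors.nlargest(N, "score")

    # What the 'expert' insists on: every $50-100 donor with 0-12 month
    # recency goes in regardless of score; any remaining slots fill by score.
    forced = donors[donors["last_gift"].between(50, 100)
                    & donors["recency_months"].between(0, 12)]
    remainder = donors.drop(forced.index).nlargest(max(N - len(forced), 0), "score")
    expert_select = pd.concat([forced, remainder])

    print("Model select mean score:  %.2f" % model_select["score"].mean())
    print("Expert select mean score: %.2f" % expert_select["score"].mean())

On this toy file the forced segment crowds out the high scorers entirely, and the expected response rate of the mailing drops with it.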

At least the radiologists have a medical degree and a residency focused on nothing but reading scans. The fundraising experts forcing their pet segments into the mailing tend to have the biggest title in the room, but that’s about it.

Kevin

5 responses to “Work Next to AI, Not With It”

  1. Hi, thanks Kevin – not happy about you throwing experts under the bus!

    I still see models that work sometimes, but not all of the time, and sometimes adding a few extra segments into the mix for comparison works just fine or even better. I think the message should be that experts with years of experience, AI, and testing together make the best combination. Smart experts are always willing to learn.

  2. You might like this one – new study out in Science. College-educated writers who had access to ChatGPT decreased writing time by 40% and increased writing quality by 18%: https://www.science.org/doi/10.1126/science.adh2586

    Working together not only worked better, but those who used the system during the test were also likely to continue using it after.

    • Kevin says:

      Nick, yep, saw that study – thought I wrote about it, but who the hell knows if I did or just drafted it or imagined it. You can relate. We did write about a similar study, but with professional writers of various flavors vs. college kids. The results were mixed:

      - The average overall liking was 2.5, slightly positive.
      - The more creative the writer (e.g., fiction writers and poets), the less they liked it, though the differences were slight.
      - Writers were neutral about embedding this tech into their word processor, though creatives were much more negative on the idea.
      - The writers got less positive the more they used it.
      - Sentiment analysis of open-ended comments indicated more negative sentiment than positive.
      - The negative emotions were dominated by fear, with anger (far less common) and sadness.
      - The positive signals were joy and an analytical tone.

      We use it, we’re very focused on using it more and more, and we’re convinced it is making us more productive – and equally convinced that prompting GPT to write a letter for you is a total waste of time. It does require knowledge transfer and fine-tuning – e.g., “small data” to augment the big.

  3. Kathryn H says:

    I haven’t read this study. Did they control for gender of the human radiologist?

    • Kevin says:

      Hi Kathryn, yes – among a host of other, more interesting possible causal factors.