How to Use Chatbots for Fact-Checking, Prompts to Spot Made-up Answers

July 24, 2023
Generative-AI chatbots have an accuracy problem and are prone to making things up.

A journalist breaks down the prompts he uses to identify errors Google Bard introduces.

They include telling the bot to list the facts it relied on or to explain its thinking step-by-step.

I'm someone who loathes busywork, so generative-AI chatbots have been something of a silver bullet for me. After initially dismissing them as glorified toys, I've been won over by their convenience.

I'm a journalist who treats Google's Bard as a souped-up personal assistant for the tedious life admin I don't want to do, like whipping up emails and summarizing meeting notes.

But if you're using it as an assistant, it's not one you should leave unsupervised. No matter how specific your prompts are, it will occasionally cite made-up sources and introduce outright errors. These are problems inherent to large language models, and there's no getting around them.

Fact-checking is key — I'd never rely on Bard's responses without combing through them. The trick, then, is to make fact-checking as quick, easy, and straightforward as possible.

By using a few carefully honed prompts, I can identify and deal with any inaccuracies at a glance. Sure, I still need to manually verify whatever Bard spits out, but these four prompts help me fact-check quickly, saving me time by making the artificial intelligence do the heavy lifting.

1. 'Give me a list of the fundamental facts on which your response relied'

I've found Bard is great for quickly generating answers to basic questions, how-to queries, and buying prompts. But it can take forever to pick out every implicit assumption or overt statement that needs verifying. That's why I get the model to do it for me.

After throwing it a question, I tell it: "Give me a list of the fundamental facts on which your response relied." It tends to generate a bullet-point rundown that, right off the bat, lets me check for self-consistency: Are all the listed facts reflected in the text, and are there any major statements that it's missed? From there, I can verify each individually.

Depending on the complexity of my instructions, I've found it sometimes also returns the names of its sources. If I can't find any mention of them from a quick Google search, they're likely made up. I'll take what I can and move on.
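For anyone who pastes Bard's replies into a script, the bullet-point rundown can be turned into a checklist to tick off while verifying. What follows is a minimal Python sketch under stated assumptions, not part of the workflow described above: the names `parse_fact_list` and `checklist` are hypothetical, and it assumes the reply marks each fact with `-`, `*`, `•`, or a number, which a chatbot won't always do.

```python
import re

# The follow-up prompt quoted in the article.
FACTS_PROMPT = "Give me a list of the fundamental facts on which your response relied."

# Assumption: each fact sits on its own line and starts with a bullet or "1." / "1)".
BULLET = re.compile(r"^(?:[-*•]|\d+[.)])\s+(.*)")


def parse_fact_list(reply):
    """Collect the text of every bulleted or numbered line in a pasted reply."""
    facts = []
    for line in reply.splitlines():
        match = BULLET.match(line.strip())
        if match:
            facts.append(match.group(1).strip())
    return facts


def checklist(facts):
    """Render the facts as plain-text tick boxes to work through by hand."""
    return "\n".join(f"[ ] {fact}" for fact in facts)
```

The script only organizes the list. Each item still has to be verified by hand, and any source names in it still need a search.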

2. 'Base your answer on these facts'

When I use Bard to draft an email, I usually want it to hit several key points. I'll tell it: "Base your answer on the following facts." Then, I'll type out a numbered list of statements. As a final instruction, I'll say: "When you use each fact in a sentence, label it by referencing its corresponding number."

That last bit is key. It lets me instantly check whether Bard has included every statement I gave it, just by reading off the references. If one is missing, a quick re-prompt telling it to add in or make more explicit "fact X" usually does the trick.

I've found that if Bard doesn't follow my instructions precisely, it tends to fabricate ideas. Using references to track its statements like this is an easy way of keeping it on course.
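The reference check can also be done mechanically. Here is a minimal Python sketch, assuming the chatbot is told to write its labels in square brackets such as `[2]`. That bracket format is an assumption added for this example, as are the hypothetical function names `build_fact_prompt` and `missing_facts`. The reply is pasted in by hand, and no chatbot API is called.

```python
import re


def build_fact_prompt(task, facts):
    """Assemble a prompt that lists numbered facts and asks for labeled references."""
    numbered = "\n".join(f"{i}. {fact}" for i, fact in enumerate(facts, start=1))
    return (
        f"{task}\n\n"
        "Base your answer on these facts:\n"
        f"{numbered}\n\n"
        "When you use each fact in a sentence, label it by referencing "
        "its corresponding number in square brackets, like [1]."
    )


def missing_facts(reply, fact_count):
    """Return the fact numbers that never show up as [n] in the reply."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", reply)}
    return [n for n in range(1, fact_count + 1) if n not in cited]


facts = [
    "The meeting has moved to Thursday.",
    "It will be held in Room 4.",
    "Attendees should bring laptops.",
]
prompt = build_fact_prompt("Draft a short email to the team.", facts)

# A made-up reply for illustration; fact 2 was left out.
reply = "The meeting has moved to Thursday [1]. Please bring your laptop [3]."
print(missing_facts(reply, len(facts)))  # prints [2]
```

A missing number only shows that a label is absent. It doesn't confirm that the labeled sentences state the facts correctly, so the draft still needs a read-through.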

3. 'Think step-by-step'

Bard is a hardworking silent partner, which is a blessing and a curse: It will always produce an answer but won't ask for clarifications. When I use the chatbot for problem-solving, such as calculating figures or setting up a schedule, I've found it makes basic arithmetic errors and hides the assumptions behind its calculations.

To make its thought process a touch more transparent, I use chain-of-thought prompting. At the end of a prompt, I add an extra line asking Bard to "think step-by-step," and it breaks down its solution into bite-size chunks.

AI researchers have found this kind of prompting increases the likelihood that AI systems will land on the correct answer. But it also lets you see the model's working, so you can follow along and pinpoint where dubious assumptions or mistakes have crept in.

I also use examples whenever I can. As a demonstration, I'll show Bard a step-by-step solution to the kind of thing I want to think through — which could be as simple as typing out a very basic dummy calculation and arranging it in a format I can understand. This encourages the AI to produce an output that follows the same template.
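The two habits in this section, ending with "think step-by-step" and showing a worked example first, can be combined in a small prompt template. This is a rough Python sketch: the function name `build_step_by_step_prompt`, the ordering of the parts, and the dummy calculation are all assumptions made for illustration, not a format the article prescribes.

```python
def build_step_by_step_prompt(question, worked_example=None):
    """Put an optional worked example first, then the question, then the step-by-step cue."""
    parts = []
    if worked_example:
        parts.append("Here is an example of the format I want:\n" + worked_example)
    parts.append(question)
    parts.append("Think step-by-step.")
    return "\n\n".join(parts)


# A very basic dummy calculation, laid out the way the answer should look.
example = (
    "Question: 3 notebooks at $2 each, plus $1 shipping. What is the total?\n"
    "Step 1: 3 x $2 = $6\n"
    "Step 2: $6 + $1 = $7\n"
    "Answer: $7"
)

print(build_step_by_step_prompt(
    "5 tickets at $12 each, plus a $4 booking fee. What is the total?",
    worked_example=example,
))
```

With the steps written out in the reply, each line of arithmetic can be checked on its own rather than trusting the final figure.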

4. 'Rewrite with these changes in mind'

As in real conversations, it can sometimes take a few questions to get the answer you want from Bard. When I asked it to summarize the transcript of a meeting, it misunderstood a key piece of jargon and generated a muddled answer.

When I can immediately see a factual error in its response, I'll ask it to "rewrite the answer with these changes in mind," clearly listing the issues it needs to correct. These could be as simple as typos in names, or as fundamental as the meaning of a complex concept.

Generally, the more esoteric and jargon-filled my requests, the more rounds of fine-tuning are needed. Even so, I've found specifying a change with a single re-prompt is often quicker than rewriting the whole thing myself. And after all, time is what I'm trying to save.
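When several corrections pile up, listing them one per line keeps the re-prompt unambiguous. A minimal Python sketch of that, with a hypothetical helper name `build_rewrite_prompt` and made-up corrections:

```python
def build_rewrite_prompt(corrections):
    """Turn a list of corrections into a single rewrite request, one issue per line."""
    listed = "\n".join(f"- {item}" for item in corrections)
    return "Rewrite the answer with these changes in mind:\n" + listed


# Made-up corrections for illustration: one typo in a name, one misread term.
print(build_rewrite_prompt([
    'The attendee\'s name is spelled "Katarina", not "Katrina".',
    '"Churn" here means customers canceling, not staff turnover.',
]))
```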

Source: Business Insider