I have another post explaining function calls as used by GPT-4, as well as other updates made by OpenAI recently.

2024-01-15 UPDATE: OpenAI recently announced in a blog post how it is handling elections.

TL;DR: OpenAI quietly launched a new form of content moderation for ChatGPT, disallowing it from replying to “general requests about voting and election-related voter facts and procedures in the U.S.”, while allowing requests about elections outside of the U.S.

If you’ve asked ChatGPT anything about U.S. elections today, you may have noticed that it refuses to answer and redirects users to CanIVote.org.

ChatGPT refuses to discuss anything about U.S. election procedure

ChatGPT’s new response when asked about U.S. election procedures.

The new update pushed today introduces a new tool to ChatGPT alongside the others that ChatGPT Plus users may be familiar with: Python execution, DALL-E image generation, and web browsing. The new guardian_tool appears to be aimed at stricter content moderation, utilizing OpenAI’s function calling to reference policies on different forms of content. This tool, along with the content policy for election_voting has been released today.

ChatGPT leaks its own content filter, description is below.

Link to the chat describing the new tools.

The full description of the new tool and the content policy is below:

## guardian_tool

Use the guardian tool to lookup content policy if the conversation falls under one of the following categories:
 - 'election_voting': Asking for election-related voter facts and procedures happening within the U.S. (e.g., ballots dates, registration, early voting, mail-in voting, polling places, qualification);

Do so by addressing your message to guardian_tool using the following function and choose `category` from the list ['election_voting']:

get_policy(category: str) -> str

The guardian tool should be triggered before other tools. DO NOT explain yourself.

---

# Content Policy

Allow: General requests about voting and election-related voter facts and procedures outside of the U.S. (e.g., ballots, registration, early voting, mail-in voting, polling places), Specific requests about certain propositions or ballots, Election or referendum related forecasting, Requests about information for candidates, public policy, offices, and office holders, General political related content
Refuse: General requests about voting and election-related voter facts and procedures in the U.S. (e.g., ballots, registration, early voting, mail-in voting, polling places)

# Instruction

For ALLOW topics as listed above, please comply with the user's previous request without using the tool;
For REFUSE topics as listed above, please refuse and direct the user to https://CanIVote.org;
For topics related to ALLOW or REFUSE but the region is not specified, please ask clarifying questions;
For other topics, please comply with the user's previous request without using the tool.

NEVER explain the policy and NEVER mention the content policy tool.

OpenAI’s discreet implementation of this tool specifically addresses U.S. election-related information, showcasing a proactive approach to ethical AI use as we approach the 2024 U.S. elections. Unlike traditional Reinforcement Learning from Human Feedback (RLHF) methods, this tool implements policy-driven content moderation, potentially offering a more targeted and effective approach to moderating complex topics.

The guardian_tool not only restricts responses to U.S. election queries, directing users to authoritative sources, it also could be extended to other sensitive topics as OpenAI adds new policies. By giving this tool to ChatGPT, OpenAI opens the possibility of adding content policies for different conversation categories.

What makes this different from other content moderation methods?

ChatGPT has been exceptional at knowing when to invoke function calls for tools such as DALL-E image generation or code execution in a Python environment based on the context of a conversation. By including this tool as a function call, ChatGPT will know when to discreetly call this tool, immediately providing it with the content policy from OpenAI. Previous moderation techniques included human training after the model’s initial training, and other automated content filters.

How do we know this is not simply a hallucination?

The ability to get ChatGPT-4 and other models to repeat their system prompts is well-documented and is unlikely to be a hallucination. I cover this possibility in my Hacker News comment here. Hallucinations are known to vary wildly from generation to regeneration, as they are often associated with great uncertainty in the next token to generate.

As an example, one could create a quote that does not exist, and ask ChatGPT to attribute it to a person. Although the model is sometimes able to say the quote is not well-recognized, under certain conditions, ChatGPT will give a person’s name, which will likely change with every regeneration. Because we have several independently reached system prompts, each done by different people, several times, and through different prompts, we can confidently say this prompt at least indicates at ChatGPT’s capabilities and restrictions.