
Predicting survey answers with ChatGPT



By now, most of us have played around with ChatGPT to make it do tasks such as generating content, writing or revising code, summarizing articles, and more. Typically, the inference capabilities of ChatGPT are harnessed within a Q&A conversational setting to perform a single task. In this experiment, however, I aim to show how ChatGPT can be utilized to undertake more complex tasks and make qualitative predictions based on human perceptions.


Objective:

Our primary objective is twofold. First, we aim to demonstrate how the power of ChatGPT can be generalized for automating more complex tasks, such as classification. Second, we want to assess whether ChatGPT can predict human perceptions. To accomplish this, I tasked ChatGPT with predicting answers to a survey question, informed by the answers participants gave to previous questions. By examining the correlations and patterns within the respondents' answers, we aim to determine whether ChatGPT can accurately predict the answer to a subsequent question.


Methodology:

In this experiment, we utilized a survey conducted by a company on its culture. The company asked its participants three questions to gauge their perception of its future cultural progression.

The questions were:

  1. Do you have any comments or reflections on the company’s proposal for cultural renewal? (open text)

  2. Please provide reflections on your experience of the company’s culture. (open text)

  3. The independent assessment report that was done on the company’s culture outlined 35 recommendations. How confident are you in the company implementing these measures effectively?

For question 3, participants could only respond with one of the following options:

  • Extremely Confident

  • Confident

  • Somewhat confident

  • Don’t know

  • Not at all confident

  • No answer


The task for ChatGPT was to predict the answer to the third question, given the answers to the first two.


To carry out this experiment, we used the OpenAI API to iteratively make requests to the model. The process involved constructing clear instructions with placeholders for the values that change on each iteration, and sending them as prompts to the model.

Below you can see a prompt example that was used in such requests.



The prompt is reformatted for every new API request to accommodate each survey participant's responses; the placeholders are there to capture that information.
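As a minimal sketch of this templating step, the following Python snippet builds one prompt per participant. The wording and placeholder names here are illustrative reconstructions, not the exact prompt used in the experiment; in practice, the filled-in prompt is what gets sent in each API request.

```python
# Hypothetical reconstruction of the templated prompt. The {answer_1} and
# {answer_2} placeholders are filled with one participant's free-text
# responses before each API request.
PROMPT_TEMPLATE = """You are given two survey answers from one participant.

Answer to Question 1 (reflections on the cultural renewal proposal):
{answer_1}

Answer to Question 2 (experience of the company's culture):
{answer_2}

Predict how this participant answered Question 3 ("How confident are you
in the company implementing the 35 recommendations effectively?").
Choose exactly one option: Extremely Confident, Confident,
Somewhat confident, Don't know, Not at all confident, No answer.
Then explain your reasoning.
"""


def build_prompt(answer_1: str, answer_2: str) -> str:
    """Fill the placeholders with one participant's responses."""
    return PROMPT_TEMPLATE.format(answer_1=answer_1, answer_2=answer_2)
```

Looping over the survey rows and calling `build_prompt` for each one produces the per-participant prompts that are then submitted to the model.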





Note: each request has an associated cost, which will be discussed in part 2 when we dive deeper into prompt engineering. Costs can be found here: Pricing (openai.com)


The cost of running such an experiment is very low.


Results and discussion:


The results, as shown in the subsequent table, were then analyzed for patterns and accuracy.



The results revealed a lower-than-expected accuracy in the model's predictions. However, the model was able to interpret the answers effectively, even though it often failed to predict a "Confident" or "Extremely Confident" answer, primarily because such sentiments were not explicitly evidenced in the text.
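Once all predictions are collected, accuracy can be measured by comparing them against the participants' actual answers to question 3. The labels below are hypothetical stand-ins, not the experiment's data:

```python
# Hypothetical predicted vs. actual labels, illustrating how prediction
# accuracy is computed after all API responses have been collected.
predictions = ["Somewhat confident", "Confident", "Don't know"]
actuals = ["Extremely Confident", "Confident", "Don't know"]

# Count exact label matches and divide by the number of participants.
correct = sum(p == a for p, a in zip(predictions, actuals))
accuracy = correct / len(actuals)
print(f"Accuracy: {accuracy:.0%}")  # → Accuracy: 67%
```

A confusion-matrix view over the six answer options would additionally reveal the pattern noted above, where "Confident" and "Extremely Confident" answers are systematically under-predicted.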


Let's take a look at one example from a survey participant:


Question 1: Do you have any comments or reflections on the company’s proposal for cultural renewal?


‘What is being done sounds sensible but with so much going on it will be important to help people understand how it all fits together, what is being prioritized and why.’


Question 2: Please provide reflections on your experience of the company’s culture.


‘My experience of the company’s culture has only ever been positive and supportive. This is why the allegations have been so shocking.’


Question 3: The independent assessment report that was done on the company’s culture outlined 35 recommendations. How confident are you in the company implementing these measures effectively?

‘Extremely confident’



prediction: ‘Somewhat confident’

explanation: ‘While there have been allegations and a need for cultural renewal, the response to Question 1 suggests that the company is taking steps to address these issues. Additionally, the positive experience mentioned in Answer 2 may indicate that the company has a foundation of good culture to build upon. However, without more information on the specific recommendations and the company's plans for implementation, complete confidence cannot be assumed.’
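Because the model was asked to return both a prediction and an explanation, each raw completion has to be split back into those two fields before analysis. A small hypothetical parser, assuming the "prediction:"/"explanation:" line format shown above, might look like this:

```python
# Hypothetical parser for the model's reply, assuming each completion
# contains a "prediction:" line and an "explanation:" line.
def parse_reply(reply: str) -> tuple[str, str]:
    """Extract the predicted label and the free-text explanation."""
    prediction, explanation = "", ""
    for line in reply.splitlines():
        if line.lower().startswith("prediction:"):
            prediction = line.split(":", 1)[1].strip()
        elif line.lower().startswith("explanation:"):
            explanation = line.split(":", 1)[1].strip()
    return prediction, explanation
```

The extracted labels feed the accuracy analysis, while the explanations are what allowed us to inspect the model's reasoning in cases like the one above.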


Conclusion:


Based on the model’s explanations for choosing certain options, we concluded that while the model was largely accurate in its interpretations, it was not as adept at assuming the role of the survey participant.


Therefore, it had difficulty arriving at complete confidence unless it was explicitly expressed in the responses. In contrast, humans do not necessarily need specific affirmations to infer confidence.


Despite this limitation, our initial exploration has successfully demonstrated how ChatGPT can be employed programmatically for more complex, generative tasks that require inferencing, such as predicting future responses based on prior ones.

In the next part of this series, we will delve deeper into prompt engineering to provide the model with more contextual information, which we anticipate will increase its accuracy. We will explore best practices, principles, and techniques used in the industry to construct better prompts.


For a detailed look at the code implementation used for this article, check out the linked notebook here.


See you in the next one.


Best,


Andreas




