What is the Open-Ended Score?

Open-ended quality checks evaluate free-text responses for relevance, informativeness, language conformity, duplication, and potential synthetic (automated) generation. The goal is to ensure that collected open-ended answers are meaningful and usable for analysis.

How does ReDem classify responses?

ReDem classifies each open-ended response into distinct quality categories, ensuring a clear and consistent assessment of respondent performance. These categories capture all essential dimensions of open-ended response quality, from high-effort, meaningful engagement to outright fraud.

1. Valid Answer (Meaningful Answer with Varying Effort)

Definition: The question was read and answered meaningfully. The response is relevant to the question and demonstrates a varying degree of elaboration or cognitive effort.

2. No Answer (“Refusal or Inability to Answer”)

Definition: The question was read but not meaningfully answered. The respondent signals unwillingness or inability to answer the question.
Typical Indicators:
  • Explicit refusal to provide an answer
  • Dismissive statements
  • Abbreviations indicating “no answer”
  • One or more question marks
Example Question: Please describe your ideal summer vacation:
Example Answers:
  • Don’t know
  • I don’t go on vacation
  • None of your business
  • I hate vacations
  • ???????????
  • n/a

Effort Levels (Valid Answer & No Answer)

Effort is categorized into low, medium, or high, depending on the level of detail, specificity, and engagement shown in the response.
This effort scale is applied to the Valid Answer and No Answer categories.
All examples below answer the question “Please describe your ideal summer vacation:”
  • Low Effort: Minimal response, short and factual, without elaboration. Example (Valid Answer): In Italy.
  • Medium Effort: Some elaboration with relevant details; the response provides some context or specificity. Example (Valid Answer): At a vineyard in Tuscany.
  • High Effort: Detailed and thoughtful response including context, reasoning, and multiple elements. Example (Valid Answer): Two weeks at a Tuscan vineyard near Siena with my best friends, good food, and a swimming pool.
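
For illustration, the sketch below shows one way the effort scale could be attached to a response label in code. The names and types here are hypothetical modeling choices, not ReDem's actual schema.

    # Hypothetical sketch (not ReDem's schema): attaching an effort level to a
    # response label. Effort applies only to Valid Answer and No Answer.
    from dataclasses import dataclass
    from enum import Enum
    from typing import Optional


    class Effort(Enum):
        LOW = "low"        # short and factual, no elaboration ("In Italy.")
        MEDIUM = "medium"  # some context or specificity ("At a vineyard in Tuscany.")
        HIGH = "high"      # detailed, multiple elements and reasoning


    @dataclass
    class Label:
        category: str             # e.g. "valid_answer" or "no_answer"
        effort: Optional[Effort]  # None for categories without an effort scale


    print(Label("valid_answer", Effort.MEDIUM))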

3. Off Topic (“Irrelevant” - Misunderstanding or Lack of Motivation)

Definition: The question was not properly read or understood. The response may be meaningful in another context but is irrelevant to the actual question, indicating that the respondent did not engage with the question’s topic or intent.
Typical Indicators:
  • Response unrelated to the question
  • General or misplaced statements
  • One-word or minimal replies that don’t address the question
Example Question: Please describe your ideal summer vacation:
Example Answers:
  • My hobbies are cycling, reading, and swimming.
  • My ideal vacation is at Christmas time in the mountains of Tyrol.
  • Thanksgiving
  • Everything
  • Nothing
  • No

4. Gibberish (“Nonsense” - Clear Fraud)

Definition: The question was not read. The response is completely meaningless or incoherent, consisting of random text fragments, repetitions of the question or its instructions, or nonsensical character combinations.
Typical Indicators:
  • Jumbled or unrelated text fragments
  • Copying or repeating the question or parts of it (“parroting”)
  • Random letters, symbols, or punctuation (“text soup”)
  • Question marks combined with other random characters
Example Question: Please describe your ideal summer vacation:
Example Answers:
  • Hello! I call you because I am happy
  • Please describe your ideal summer vacation
  • your ideal summer vacation
  • It is something when I have but never one How do you do?
  • Gdhj2
  • ………
  • ?.
  • x?
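
For intuition only, the toy heuristics below would catch two of the indicators above, parroting and “text soup”. They are illustrative assumptions, not ReDem's classifier; ReDem's actual classification is LLM-based (see “Use of GPT for Open-Ended Score” below).

    # Toy heuristics for two gibberish indicators; NOT ReDem's actual method.
    def looks_like_parroting(answer: str, question: str) -> bool:
        # Flags answers that largely repeat the question text.
        a, q = answer.lower().strip(), question.lower()
        return len(a) > 3 and a in q

    def looks_like_text_soup(answer: str) -> bool:
        # Flags answers dominated by non-letter characters ("?.", "………", "x?").
        letters = sum(ch.isalpha() for ch in answer)
        return len(answer.strip()) > 0 and letters / len(answer) < 0.5

    print(looks_like_parroting("your ideal summer vacation",
                               "Please describe your ideal summer vacation"))  # True
    print(looks_like_text_soup("?."))  # True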

5. AI-Suspect (Probable Non-Human-Generated Response)

Definition: The response shows characteristics of AI-generated text, indicating it was likely produced by a chatbot rather than a human respondent. Such answers often appear syntactically perfect, overly balanced, or emotionally neutral, lacking natural human imperfections, spontaneity, or personal perspective.
Typical Indicators:
  • Unusually polished or “too perfect” language
  • Balanced, structured, and generic phrasing without individuality
  • Overly coherent style inconsistent with typical survey responses
  • Lack of typos, personal references, or informal language
Example Question: Please describe your ideal summer vacation:
Example Answer: My ideal summer vacation would balance exploration, relaxation, and inspiration. It would start somewhere by the sea — perhaps a quiet coastal town in southern Italy or Greece — where mornings begin with espresso on a terrace overlooking the water, followed by swimming, reading, and writing in the shade.

6. Bad Language

Definition: The response contains insults, profanity, or vulgar expressions that are unrelated to the context of the question. Such language indicates disrespectful or hostile behavior rather than a genuine attempt to answer.
Typical Indicators:
  • Direct insults or offensive remarks toward others
  • Swearwords or vulgar language without contextual relevance
  • Aggressive tone or hostility unrelated to the question content
Example Question: Please describe your ideal summer vacation:
Example Answers:
  • Bad Language Example: This survey is shit.
  • No Bad Language Example: Similar to my last vacation in Ibiza. It kicked ass.

7. Wrong Language

The response is provided in a language that does not match the expected language(s) specified for the data point. This category helps identify responses that may have been misunderstood or that came from respondents who did not understand the question.
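
ReDem's internal detection method is not documented here. As a conceptual illustration only, an off-the-shelf library such as langdetect can compare a response's detected language against the expected one:

    # Conceptual sketch: comparing detected vs. expected language using the
    # open-source langdetect library (pip install langdetect). This is an
    # assumption for illustration, not ReDem's actual approach.
    from langdetect import detect  # returns an ISO 639-1 code such as "en"

    def is_wrong_language(answer: str, expected: set[str]) -> bool:
        return detect(answer) not in expected

    print(is_wrong_language("Mi vacanza ideale è in Toscana.", {"en"}))  # typically True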

8. Duplicate Respondent

The response is identical or highly similar to responses provided by other respondents in the survey, indicating potential data quality issues or coordinated responses. Activate the duplicate check only if identical or highly similar responses to open-ends among the interviews are implausible.

9. Duplicate Answer

The response is identical or highly similar to another response provided by the same respondent within the survey, indicating potential lack of engagement or copy-paste behavior. Activate the duplicate check only if identical or highly similar responses to open-ends within interviews are implausible.
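
The similarity measure and threshold ReDem applies are not specified here. As a rough illustration of “identical or highly similar”, a character-level ratio such as Python's difflib can express the idea; the threshold below is an assumption:

    # Illustrative only: a simple similarity check in the spirit of the duplicate
    # categories. The method and 0.9 threshold are assumptions, not ReDem's algorithm.
    from difflib import SequenceMatcher

    def is_near_duplicate(a: str, b: str, threshold: float = 0.9) -> bool:
        # Ratio of matching characters in the best alignment, from 0.0 to 1.0.
        return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio() >= threshold

    # Duplicate Respondent: same text across different respondents.
    print(is_near_duplicate("Two weeks in Tuscany.", "Two weeks in Tuscany"))   # True
    # Duplicate Answer: same text across open-ends within one interview.
    print(is_near_duplicate("Two weeks in Tuscany.", "A road trip in Norway"))  # False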

Use of GPT for Open-Ended Score

ReDem uses advanced GPT large language models (LLMs) from OpenAI to analyze and categorize open-ended responses. This enables precise and reliable scoring by leveraging state-of-the-art language understanding. To ensure data privacy and compliance, GPT is integrated into the ReDem Open-Ended Score (OES) with strict safeguards:

Individual Responses & Anonymity

Each response is sent to OpenAI individually, using a fully anonymized ID. Only the single response is transmitted per API request—complete survey datasets are never shared.
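
In code, the one-response-per-request pattern could look roughly like the sketch below. The model name, prompt, and ID handling are illustrative assumptions, not ReDem's actual integration.

    # Sketch of the one-response-per-request pattern described above. The model
    # name, prompt, and anonymized-ID scheme are assumptions for illustration.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def score_response(anonymized_id: str, question: str, answer: str) -> str:
        # One API call per single response; no other survey data is included.
        completion = client.chat.completions.create(
            model="gpt-4o-mini",  # assumption; ReDem's model choice may differ
            messages=[
                {"role": "system",
                 "content": "Classify the survey answer as valid_answer, no_answer, "
                            "off_topic, gibberish, or ai_suspect. Reply with the label only."},
                {"role": "user",
                 "content": f"ID: {anonymized_id}\nQuestion: {question}\nAnswer: {answer}"},
            ],
        )
        return completion.choices[0].message.content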

Exclusive Use by ReDem

Only ReDem communicates with OpenAI. The platform does not share any details about the origin or source of the responses.

Data Storage & Retention

OpenAI retains data for up to 30 days, after which it is permanently deleted. The data is never used to train AI models.

GDPR-Compliant Data Transfers

ReDem and OpenAI operate under a Data Processing Agreement based on EU Standard Contractual Clauses (SCCs). This ensures that all data transfers, including those involving personal data, fully comply with the GDPR.
This setup gives you both the advanced capabilities of GPT and the data protection required for responsible AI use in survey research.