College professors gave ChatGPT a science exam, and its grade was a 'low D'
- A Washington State University study found ChatGPT gave different answers to the same question across ten repeated runs.
- The system showed a bias toward agreement, performing especially poorly on questions whose correct answer was a false claim.
- Overall accuracy was roughly 60% once scoring accounted for random guessing, consistent with the "low D" grade.
- ChatGPT can sound confident yet may not reflect grounded, verifiable knowledge.
- Researchers warn against treating AI as a final decision-maker in high-stakes tasks.
- The study tested hundreds of hypotheses from published scientific research papers.
- Repeated questioning revealed instability in answers even with unchanged prompts.
- The study notes better performance on simple cause-and-effect tasks than on context-dependent judgments.
- Experts advise using AI as a drafting tool with human review for decisions.
- The Rutgers Business Review paper calls for broader comparisons and longer prompt runs.
