test-retest reliability

  • AI responded that the test-retest reliability of AI responses varies “drastically” with the model and with the task. When the same question was asked again, the word “drastically” was not in the response. “Quis custodiet ipsos custodes?” AI has agreed that quotation marks are used excessively in AI.