> perhaps humans also mostly reason using previous examples rather than thinking...

oliwary · on Oct 11, 2024

That is an excellent point! I feel like people have two modes of reasoning: a lazy mode where we assume we already know the problem, and an active mode where something prompts us to pay attention and actually reason about the problem. Perhaps LLMs only have the lazy mode?

letmevoteplease · on Oct 11, 2024

I prompted o1 with "analyze this problem word-by-word to ensure that you fully understand it. Make no assumptions." and it solved the "riddle" correctly.

https://chatgpt.com/share/6709473b-b22c-8012-a30d-42c8482cc6...

hoosieree · on Oct 11, 2024

My classifier is not very accurate:

    is_trick(question)  # 50% accurate

To make the client happy, I improved it:

    is_trick(question, label)  # 100% accurate

But the client still isn't happy because if they already knew the label they wouldn't need the classifier!
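A minimal Python sketch of the analogy above, in case it helps. Everything here is made up for illustration: the body of `is_trick`, the `accuracy` helper, and the toy `data` are assumptions, not a real classifier. It only shows that passing the label in makes the score perfect without the function learning anything about the question.

```python
import random

def is_trick(question, label=None):
    """Hypothetical classifier: a coin flip unless the answer is handed in."""
    if label is not None:
        return label  # perfect score, but only by echoing the hint
    return random.random() < 0.5  # ignores the question entirely

def accuracy(predict, data):
    """Fraction of (question, label) pairs that predict gets right."""
    return sum(predict(q, y) == y for q, y in data) / len(data)

# Toy balanced dataset: half the questions are labeled as tricks.
data = [(f"question {i}", i % 2 == 0) for i in range(1000)]

random.seed(0)
blind = accuracy(lambda q, y: is_trick(q), data)     # near chance
leaky = accuracy(lambda q, y: is_trick(q, y), data)  # label leaked in
print(f"without label: {blind:.2f}, with label: {leaky:.2f}")
```

The second number is 1.0 by construction, which is the whole problem: the "improvement" comes from the caller, not the classifier.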

...

If ChatGPT had "sense", your extra prompt should do nothing. The fact that adding the prompt changes the output should be a clue that nobody should ever trust an LLM anywhere correctness matters.

[edit]

I also tried the original question but followed up with "is it possible that the doctor is the boy's father?"

ChatGPT said:

Yes, it's possible for the doctor to be the boy's father if there's a scenario where the boy has two fathers, such as being raised by a same-sex couple or having a biological father and a stepfather. The riddle primarily highlights the assumption about gender roles, but there are certainly other family dynamics that could make the statement true.

PoignardAzur · on Oct 11, 2024

It's not like GP gave task-specific advice in their example. They just said "think carefully about this".

If that's all it takes, then maybe the problem isn't a lack of capabilities but a tendency to not surface them.

hoosieree · on Oct 15, 2024

The main point I was trying to make is that adding the prompt "think carefully" moves the model toward the "riddle" vector space, which means it will draw tokens from there instead of the original space.

And I doubt there are any such hidden capabilities, because if there were, it would be valuable to OpenAI to surface them (e.g. by adding "think carefully" to the default/system prompt). Since adding "think carefully" changes the output significantly, it's safe to assume this is not part of the default prompt, perhaps because adding it does not help with average queries.

s-macke · on Oct 11, 2024

I have found multiple definitions in the literature for what you describe.

1. Fast thinking vs. slow thinking.

2. Intuitive thinking vs. symbolic thinking.

3. Interpolated thinking (in terms of pattern matching or curve fitting) vs. generalization.

4. Level 1 thinking vs. level 2 thinking (in terms of OpenAI's definitions of levels of intelligence).

The definitions all describe the same thing.

Currently, all of the LLMs are trained to use the "lazy" thinking approach. o1-preview is advertised as being the exception. It is trained or fine-tuned on countless reasoning patterns.