GPT-4 was just released, so I threw some mathematics problems at it to see what it's capable of. Here's a preview.
Problem 1: Cheryl’s Birthday Problem
There's an infamous logic problem from the 2015 Singapore and Asian Schools Math Olympiad, widely misreported at the time as a primary school question. Here's the full problem from the Wikipedia page (quoted verbatim).
And… here’s the output.
That’s correct: the first line of deduction tells us the month is July or August.
Correct!
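For readers who want to check the deduction chain themselves, here is a minimal brute-force sketch. It assumes the ten candidate dates from the widely circulated statement of the puzzle (the quoted problem text is not reproduced here), and the function names are my own, purely for illustration.

```python
# Candidate dates, assumed from the widely circulated statement of the puzzle.
DATES = [
    ("May", 15), ("May", 16), ("May", 19),
    ("June", 17), ("June", 18),
    ("July", 14), ("July", 16),
    ("August", 14), ("August", 15), ("August", 17),
]

def months_with_day(day, dates):
    """All months in which the given day number appears."""
    return {m for m, d in dates if d == day}

def days_in_month(month, dates):
    """All day numbers that appear in the given month."""
    return {d for m, d in dates if m == month}

def solve_cheryl(dates):
    # Statement 1: Albert (who knows the month) is sure that Bernard
    # (who knows the day) cannot know the birthday yet. So no day in
    # Albert's month may be unique among all candidates.
    step1 = [
        (m, d) for m, d in dates
        if all(len(months_with_day(d2, dates)) > 1
               for d2 in days_in_month(m, dates))
    ]
    # Statement 2: Bernard now knows, so his day is unique among step1.
    step2 = [(m, d) for m, d in step1
             if len(months_with_day(d, step1)) == 1]
    # Statement 3: Albert now knows too, so his month is unique among step2.
    step3 = [(m, d) for m, d in step2
             if len(days_in_month(m, step2)) == 1]
    return step1, step2, step3

step1, step2, step3 = solve_cheryl(DATES)
print(sorted({m for m, _ in step1}))  # months surviving the first deduction
print(step3)                          # the single remaining date
```

Under these assumptions, the first filter leaves only July and August, matching the first line of GPT-4's deduction, and the final filter leaves a single date.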
Ok, but this is a well-known viral problem. Maybe let’s try something a little different.
Problem 2: Combinatorics
Here’s the problem.
I'd say it's a decent inter-school-level problem. Here's the output:
A bit of a bummer, since it incorrectly assumed there must be four +1s and four -1s in each row. But the penultimate paragraph is impressive. So maybe a little nudging would help?
Result:
Extremely impressive!
Problem 3: Construction Problem
Here’s a rather tricky problem. [ Hint: the smallest solution is close to a million. ]
This is its reply:
The astute reader immediately sees the problem with this "solution": 1118 has digit sum 1 + 1 + 1 + 8 = 11, which is a multiple of 11. I pointed it out; GPT-4 then responded with another (wrong) 4-digit example. I pointed out the error again and it responded with yet another wrong example. At this point, I gave up.
To avoid cluttering up the reader’s bandwidth, I won’t be sharing those wrong answers.
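The check that sinks GPT-4's example is a one-liner. Here is a small sketch; the helper names `digit_sum` and `digit_sum_divisible_by_11` are mine, for illustration only.

```python
def digit_sum(n: int) -> int:
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(ch) for ch in str(n))

def digit_sum_divisible_by_11(n: int) -> bool:
    """True if the digit sum of n is a multiple of 11."""
    return digit_sum(n) % 11 == 0

print(digit_sum(1118))                  # 1 + 1 + 1 + 8
print(digit_sum_divisible_by_11(1118))  # why the proposed example fails
```

Running any candidate answer through a helper like this takes a second, which is exactly the kind of self-check GPT-4 did not do.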
Problem 4: PSLE Coin Problem
This problem from PSLE 2021 again caused some controversy due to its perceived difficulty:
Here’s part of the answer I got:
Right at the start it made the wrong assumption: that Helen and Ivan have the same number of 50-cent coins. I pointed this out:
Then GPT-4 realised its mistake:
But it still wouldn’t tell me who has more money. Guess wealth is a sensitive question even for a few dozen bucks.
I nudged it further:
And:
Goal!!
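To see why GPT-4's initial assumption was wrong, here is a rough sketch of the comparison. The problem statement is not reproduced here, so the figures are an assumption: as I recall the widely reported version, Helen and Ivan have the same total number of coins, with Helen holding 64 twenty-cent coins and Ivan 104. Treat the numbers as illustrative; the function name is hypothetical.

```python
def compare_holdings(twenty_a: int, twenty_b: int):
    """A and B hold the same total number of coins, all 50-cent or 20-cent.

    Since fifty_a + twenty_a == fifty_b + twenty_b, we get
    fifty_a - fifty_b == twenty_b - twenty_a, so the 50-cent counts
    differ whenever the 20-cent counts differ.
    Returns (extra 50-cent coins A has, extra money A has in cents).
    """
    extra_fifty_a = twenty_b - twenty_a
    # A swaps each of those 20-cent coins for a 50-cent coin: +30 cents each.
    extra_cents_a = extra_fifty_a * (50 - 20)
    return extra_fifty_a, extra_cents_a

# Assumed figures: Helen has 64 twenty-cent coins, Ivan has 104.
extra_fifty, extra_cents = compare_holdings(64, 104)
print(extra_fifty)        # how many more 50-cent coins Helen has
print(extra_cents / 100)  # how many more dollars Helen has
```

The point is that equal totals plus unequal 20-cent counts force unequal 50-cent counts, so the two cannot have the same number of 50-cent coins.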
Problem 5: Group Theory
Here’s one which can be quite tricky if you’re unfamiliar with concrete examples of groups.
Long story short: it tried to come up with three examples, failed, and concluded that such an example is impossible. What impressed me was the diversity of its examples. Its attempts included:
- The free group.
- The dihedral group of order 8.
The astute reader should immediately know why each of these examples fails.