Solving easy problems the hard way

There’s a charming little brain teaser that’s going around the Interwebs. It’s got various forms, but they all look something like this:

This problem can be solved by pre-school children in 5-10 minutes, by programmers – in 1 hour, by people with higher education … well, check it yourself!

8809=6
7111=0
2172=0
6666=4
1111=0
3213=0
7662=2
9313=1
0000=4
2222=0
3333=0
5555=0
8193=3
8096=5
7777=0
9999=4
7756=1
6855=3
9881=5
5531=0
2581=?

SPOILER ALERT…

The answer has to do with how many circles are in each number. The digit 8 has two circles in its shape, so it counts as 2, and 0 is one big circle, so it counts as 1. So 2581=2, because the 8 contributes two circles and the 2, 5, and 1 contribute none. OK, that’s cute: it’s an alternative mapping of values with implied addition.
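As a quick sanity check, the circle-counting rule can be sketched as a lookup table in R. The names circles and countCircles are made up for this illustration, and the value for 4 is an assumption, since that digit never appears in the puzzle.

```r
## circles (closed loops) in each digit; the entry for "4" is an assumption
## because no row of the puzzle contains a 4
circles <- c("0"=1, "1"=0, "2"=0, "3"=0, "4"=0, "5"=0, "6"=1, "7"=0, "8"=2, "9"=1)

## split a row such as "8809" into single characters and add up their circles
countCircles <- function(s) {
  digits <- strsplit(s, "")[[1]]
  sum(circles[digits])
}

countCircles("8809")  # first training row, listed as 6
countCircles("2581")  # the unsolved row
```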

What bugged me was how I might solve this if the mapping of values were not based on shape. How could I program a computer to solve this puzzle? I gave it a little thought, and since I like to pretend I’m an econometrician, this looked a LOT like a system of equations that could be solved with an OLS regression. So how can I refactor the problem and data into a trivial OLS? I need to convert each row of the training data into a table of digit frequencies. So instead of 8809=6 I need something like:

1,0,0,0,0,0,0,0,2,1 = 6

In this format the independent variables are the digits 0-9, and each one’s value is the number of times that digit occurs in the row. I couldn’t figure out how to build the frequency table so, as is my custom, I created a concise simplification of the problem and put it on StackOverflow.com, which yielded a great solution. Once I had the frequency table built, it was simply a matter of a linear regression with 10 independent variables, one dependent variable, and no intercept term.
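For a single row, one way to build such a frequency vector is to append the digits 0-9 so that every digit shows up at least once, tabulate, and then subtract one again. This is only a sketch for the row 8809; the variable names X and counts are chosen here for illustration.

```r
## digits of one training row (8809)
X <- c(8, 8, 0, 9)

## appending 0:9 forces table() to report every digit;
## subtracting 1 removes that padding again
counts <- table(c(X, 0:9)) - 1

## counts for the digits 0 through 9, in order
as.vector(counts)
```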

My whole script, which you should be able to cut and paste into R, if you are so inclined, is the following:

## read in the training data
## more lines than it should be because of the https requirement in Github
temporaryFile <- tempfile()
download.file("https://raw.github.com/gist/2061284/44a4dc9b304249e7ab3add86bc245b6be64d2cdd/problem.csv",destfile=temporaryFile, method="curl")
series <- read.csv(temporaryFile)

## munge the data to create a frequency table
freqTable <- as.data.frame( t(apply(series[,1:4], 1, function(X) table(c(X, 0:9))-1)) )
names(freqTable) <- c("zero","one","two","three","four","five","six","seven","eight","nine")
freqTable$dep <- series[,5]

## now a simple OLS regression with no intercept
myModel <- lm(dep ~ 0 + zero + one + two + three + four + five + six + seven + eight + nine, data=freqTable)
round(myModel$coefficients)


The final result looks like this:

> round(myModel$coefficients)
zero   one   two three  four  five   six seven eight  nine
   1     0     0     0    NA     0     1     0     2     1

So we can see that zero, six, and nine all get mapped to 1 and eight gets mapped to 2. Everything else is zero. And four is NA because there were no fours in the training data, so its column is all zeros and its coefficient cannot be estimated.
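To close the loop, here is a rough sketch that applies the fitted coefficients to the unsolved row 2581. It assumes the training rows are typed in by hand from the puzzle listing instead of downloaded, and the names puzzle, answer, digitNames, and countDigits are invented for this example. Only digits that actually occur in the new row are scored, so the NA for four does not get in the way here.

```r
## training rows typed in from the puzzle (no download needed)
puzzle <- c("8809","7111","2172","6666","1111","3213","7662","9313","0000","2222",
            "3333","5555","8193","8096","7777","9999","7756","6855","9881","5531")
answer <- c(6,0,0,4,0,0,2,1,4,0,0,0,3,5,0,4,1,3,5,0)

digitNames <- c("zero","one","two","three","four","five","six","seven","eight","nine")

## count how often each digit 0-9 occurs in a string such as "8809"
countDigits <- function(s) {
  digits <- as.integer(strsplit(s, "")[[1]])
  counts <- tabulate(digits + 1, nbins = 10)
  names(counts) <- digitNames
  counts
}

## one row of counts per training row, plus the dependent variable
freqTable <- as.data.frame(t(sapply(puzzle, countDigits, USE.NAMES = FALSE)))
freqTable$dep <- answer

## same no-intercept regression as in the script above
myModel <- lm(dep ~ 0 + zero + one + two + three + four + five + six + seven + eight + nine,
              data = freqTable)
coefs <- round(coef(myModel))

## score the unsolved row, skipping digits that do not occur in it
newCounts <- countDigits("2581")
prediction <- sum((newCounts * coefs)[newCounts > 0])
prediction
```

If the new row contained a 4, this would return NA, which seems like the honest answer given that the training data says nothing about that digit.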

There. I’m as smart as a preschooler. And I have code to prove it.

Source: http://www.cerebralmastication.com/2012/03/solving-easy-problems-the-hard-way/
