Learn Data Science from Data School 📊

Tuesday Tip #19: Solution to Code Challenge #1 🏆

Published 11 months ago • 4 min read

Hi Reader,

Last week, I shared a probability puzzle and asked YOU to simulate the problem using Python!

Today, I'll share my solution as well as some great solutions submitted by readers!


👉 Recap of the Monty Hall problem

You are a contestant on a game show. In front of you are three closed doors. Behind one of the doors is a car, and behind the other two doors are goats. Your goal is to pick the door with the car.

The host asks you to choose a door. You tell the host your choice. Instead of telling you whether your choice was correct, the host (who knows which door contains the car) opens one of the two doors you didn't choose and reveals a goat.

You now have the opportunity to keep your original choice or switch your choice to the door that is still closed. Which should you choose?

Specifically, I want you to simulate that you are a contestant on this show 1000 times. Each time, you pick a random door as your first choice, let the host open a door that reveals a goat, and then switch your choice to the door that the host didn't open. With that strategy (known as the "always switch" strategy), how often do you win the car?


My solution code

I'll explain this code piece-by-piece:

👆 Since we need to simulate random selections, the usual choice is Python's random module.

👆 I set the number of games to simulate, a counter to track the number of wins, and a list to represent the three doors.

👆 This for loop simulates 1000 independent games.

During each game, the host randomly selects one of the three doors for the car, and the player randomly selects one of the three doors as their first pick for where the car might be. To do this, the random.choice() function selects one of the three elements from the doors list.
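As a rough sketch of this step (the names doors and car are assumed here for illustration, since the text only names first_pick), two independent calls to random.choice() might look like this:

```python
import random

doors = [1, 2, 3]  # assumed representation of the three doors

# The host hides the car and the player guesses, independently of each other
car = random.choice(doors)         # hypothetical variable name
first_pick = random.choice(doors)  # the player's first guess
```

Because the two calls are independent, the first pick matches the car's location about one time in three.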

👆 The host must open one of the closed doors, and I need to make sure that they're not opening the door with the car or the door that the player already chose.

To do this, I convert the doors list to a set, convert the car's location and the player's first pick to another set, and then take the difference. Thus, host_can_open is a set of all of the doors that don't contain the car and weren't picked by the player.

host_can_open contains either one or two doors, and I use random.choice() to select which door the host actually opens and store that in host_opens. (random.choice() doesn't work with sets, which is why I converted it to a list.)
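The set logic described above could be sketched roughly like this. The helper name choose_host_door and the door labels are assumptions made for the example; host_can_open and host_opens follow the names used in the text:

```python
import random

doors = [1, 2, 3]  # assumed door labels

def choose_host_door(car, first_pick):
    """Return a door the host may open: not the car, not the player's pick."""
    host_can_open = set(doors) - {car, first_pick}  # one or two doors remain
    return random.choice(list(host_can_open))       # random.choice() needs a sequence, not a set

# When the player's first pick is the car, the host has two doors to choose from
host_opens = choose_host_door(car=1, first_pick=1)
```

When the car and the first pick are different doors, host_can_open holds a single door and the host has no real choice.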

👆 Next, we need the player to switch their pick to the one door that is still closed. Again, we use a set difference operation. But why is the min() function there?

Well, the set difference operation returns a set that contains a single integer, such as {3}. In order to access just the integer, the most elegant solution is to take the min (or max) of that set!
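Here is a minimal sketch of that idea. The helper name switch_pick and the example values are hypothetical:

```python
doors = [1, 2, 3]  # assumed door labels

def switch_pick(first_pick, host_opens):
    """Return the one door that is neither the first pick nor the opened door."""
    remaining = set(doors) - {first_pick, host_opens}  # a one-element set, such as {3}
    return min(remaining)  # unwrap the single integer

second_pick = switch_pick(first_pick=1, host_opens=2)
print(second_pick)  # 3
```

Since the set always holds exactly one element here, min() and max() return the same value.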

👆 Finally, I check whether the player's second pick matches the location of the car. If so, we increment the win count.

👆 This is just a bit of debugging code. It runs during the first five games, and shows me the value of four of the objects. This allows me to double-check that the values being selected obey the rules of the game.

The f-string uses "self-documenting expressions" (an equals sign placed after the expression), a feature added in Python 3.8. Because of the equals sign, the f-string will print "first_pick=2" (for example). Without the equals sign, the f-string would just print "2", and I'd have to remember what the 2 represents.
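To see the difference side by side, here is a small standalone example. The values are made up for illustration, and it needs Python 3.8 or later:

```python
first_pick = 2   # example values, chosen only for illustration
host_opens = 3

with_equals = f'{first_pick=} {host_opens=}'
without_equals = f'{first_pick} {host_opens}'

print(with_equals)     # first_pick=2 host_opens=3
print(without_equals)  # 2 3
```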

👆 Finally, we print the win percentage, which is calculated by dividing the win count by the number of games.

I start the f-string with a \n in order to add a line break, and I add the :.1% at the end to format the result as a percentage with 1 digit after the decimal.
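A sketch of that formatting is shown below. The variable names and the "Win percentage" label are assumptions, and the counts are picked to match the 68.8% from the sample run:

```python
wins = 688    # example counts, not real simulation output
games = 1000

# \n adds a blank line before the result; :.1% multiplies by 100,
# keeps one digit after the decimal point, and appends a percent sign
message = f'\nWin percentage: {wins / games:.1%}'
print(message)
```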


How often does the player win the car?

Here's an example of the output:

To be clear, the first five lines represent the first five games (out of 1000). In those five games, the player won the car three times.

The win percentage is how often the player won the car by using the "always switch" strategy across 1000 games. In this case it was 68.8%, but you'll find that this percentage hovers around 66.7% (two-thirds). If you want to confirm this for yourself, you can run my code online.

This result is counter-intuitive to many people, but the "always switch" strategy is twice as likely to win as the "never switch" strategy! (To be clear, this result only holds if we follow the specific rules above, which I clarified in detail in last week's email.)
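If you'd like to compare the two strategies directly, one possible way to assemble the steps is sketched below. This is a reconstruction for illustration and not necessarily the code used above; the function names, the switch flag, and the seed parameter are all assumptions:

```python
import random

def play_game(switch, rng=random):
    """Simulate one game and return True if the player wins the car."""
    doors = [1, 2, 3]
    car = rng.choice(doors)
    first_pick = rng.choice(doors)
    # The host opens a door that hides a goat and was not picked by the player
    host_opens = rng.choice(sorted(set(doors) - {car, first_pick}))
    if switch:
        final_pick = min(set(doors) - {first_pick, host_opens})
    else:
        final_pick = first_pick
    return final_pick == car

def win_rate(switch, games=1000, seed=None):
    """Return the fraction of games won with the given strategy."""
    rng = random.Random(seed)  # seed=None gives different results on each run
    wins = sum(play_game(switch, rng) for _ in range(games))
    return wins / games

print(f'Always switch: {win_rate(switch=True):.1%}')
print(f'Never switch:  {win_rate(switch=False):.1%}')
```

With enough games, the first number should land near two-thirds and the second near one-third.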

If you want to read more about this puzzle, check out the Wikipedia article about the Monty Hall problem.


Reader-submitted solutions

One of my favorite solutions was from Fábio C., who received help from Bing AI when writing this code:

Here's what I like about Fábio's code:

  • To select which door the host opens, he uses a while loop that generates a random integer until it doesn't match the location of the car or the player's first pick. Brilliant!
  • To select which door the player changes to, he subtracts the player's first pick and the door the host opens from 6. Why? Because those three values (1, 2, and 3) must add up to 6!
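Both ideas can be sketched in a few lines. This is an illustration of the two techniques rather than Fábio's actual code, and the variable names are assumed:

```python
import random

car = random.randint(1, 3)         # doors are numbered 1 to 3
first_pick = random.randint(1, 3)

# Keep drawing a door until it is neither the car nor the player's first pick
host_opens = random.randint(1, 3)
while host_opens in (car, first_pick):
    host_opens = random.randint(1, 3)

# The three door numbers add up to 6, so the remaining door is what's left over
second_pick = 6 - first_pick - host_opens
```

The subtraction trick works because first_pick and host_opens are always two different doors.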

Another solution I really enjoyed was from Aaron S.:

Here's what I like about Aaron's code:

  • He uses a list of emoji (which are valid characters) to represent the prizes, and then uses random.shuffle() to shuffle their locations.
  • To select which door the host opens, he uses a clever list comprehension that zips together the doors and prizes and checks for two conditions.
  • To count the number of wins, he runs the function 1000 times using a generator expression and then sums the results. (This works because his function returns False for a loss and True for a win, and these get converted to 0's and 1's by the sum() function.)
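A sketch in the spirit of those three ideas is shown below. It is not Aaron's actual code; the function name, the specific emoji, and the way the second pick is found are assumptions:

```python
import random

def play_game():
    """Play one "always switch" game; return True for a win, False for a loss."""
    doors = [1, 2, 3]
    prizes = ['🚗', '🐐', '🐐']
    random.shuffle(prizes)  # prizes[i] now sits behind doors[i]
    first_pick = random.choice(doors)
    # Doors the host may open: not the player's pick and not hiding the car
    host_options = [door for door, prize in zip(doors, prizes)
                    if door != first_pick and prize != '🚗']
    host_opens = random.choice(host_options)
    # Switch to the only door that is still closed
    second_pick = next(door for door in doors
                       if door not in (first_pick, host_opens))
    return prizes[second_pick - 1] == '🚗'

wins = sum(play_game() for _ in range(1000))  # True counts as 1, False as 0
print(f'{wins / 1000:.1%}')
```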

José P. performed an increasing number of simulations to demonstrate that as the number of games increases, the result converges towards a winning percentage of 66.7% (two-thirds):
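The convergence can be explored with a sketch like the one below. It is not José's code: the function name, the game counts, and the seed are assumptions, and it uses a shortcut (switching wins exactly when the first pick was wrong) instead of simulating the host:

```python
import random

def switch_win_rate(games, rng):
    """Return the fraction of games won by always switching."""
    wins = 0
    for _ in range(games):
        car = rng.randint(1, 3)
        first_pick = rng.randint(1, 3)
        # Switching wins exactly when the first pick was wrong
        wins += first_pick != car
    return wins / games

rng = random.Random(42)  # fixed seed so repeated runs give the same numbers
for games in [10, 100, 1_000, 10_000, 100_000]:
    print(f'{games:>7,} games: {switch_win_rate(games, rng):.1%}')
```

Small runs can be far from two-thirds, while the larger runs should settle close to it.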

Finally, Conor G. sent me a link to his collection of Monty Hall simulations written in Lua, Perl, C, Fortran, COBOL, Pascal, PHP, and (of course) Python! 🤯

Thank you so much to everyone who submitted a solution! 👏👏👏

I look forward to doing another "Code Challenge" in a future Tuesday Tip!


If you enjoyed this week's newsletter, please forward it to a friend! Takes only a few seconds, and it really helps me out! 🙌

Watch out for a special announcement next week... 😉

- Kevin

P.S. Eminem raps pillow quotes (video, but safe for work!)


Kevin Markham
