Hi Reader,

I appreciate everyone who has emailed to check on me and my family post-Helene! It has been more than 6 weeks since the hurricane, and most homes in Asheville (mine included) still don't have clean, running water. We're hopeful that water service will return within the next month. In the meantime, we're grateful for all of the aid agencies providing free bottled water, free meals, places to shower, and so much more. ❤️

Thanks for allowing me to share a bit of my personal life with you! Now, back to the Data Science. 😄

🔗 Link of the week

The Present Future: AI's Impact Long Before Superintelligence (Ethan Mollick)

A short, compelling article demonstrating the impact that today's multimodal models can achieve when interacting with the real world!

👉 Tip #49: Calculating the confidence of your classifier

Although Generative AI is the focus of everyone's attention (I'm even working on a GenAI course! 😲), supervised Machine Learning is still the optimal tool for solving most real-world predictive problems.

Today's tip answers the question: How certain is my classification model about its predictions? It comes directly from my course, Master Machine Learning with scikit-learn.

Let's say you need to predict whether individual users are likely to buy your product. You might build a classifier that outputs "1" if they are likely to buy, and "0" otherwise:

But what if there were 50,000 users who are likely to buy, but you can only afford to market to 500 of them? In that case, you would use your marketing budget to reach the 500 users who are most likely to buy.

In Machine Learning terms, we're looking for the users with the highest "predicted probability" of buying. Here's how we would find these users:

Here's what we did:
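In rough terms, the steps look like the sketch below. Treat it as a minimal illustration rather than the exact code from the lesson: the data is synthetic (generated with `make_classification`), and names such as `X_new`, `buy_proba`, and `k` are made up for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Hypothetical data: 1,010 synthetic users with 5 numeric features each.
# The first 1,000 play the role of past users with known outcomes
# (1 = bought, 0 = didn't buy); the last 10 are "new" users to score.
X, y = make_classification(n_samples=1010, n_features=5, random_state=1)
X_train, y_train = X[:1000], y[:1000]
X_new = X[1000:]

# Fit a logistic regression classifier on the past users
clf = LogisticRegression()
clf.fit(X_train, y_train)

# Class predictions: 1 if the model thinks the user is likely to buy, 0 otherwise
class_preds = clf.predict(X_new)

# predict_proba returns one column per class, in the order of clf.classes_
# (here [0, 1]), so column 1 holds the predicted probability of buying
buy_proba = clf.predict_proba(X_new)[:, 1]

# Positions of the k users with the highest predicted probability, highest first
k = 3
top_k = np.argsort(buy_proba)[::-1][:k]

print(class_preds)
print(buy_proba.round(3))
print(top_k)
```

With real data, you would swap in your own feature matrix for `X_new` and set `k` to however many users you can afford to reach.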
Because we're using a well-calibrated classifier called logistic regression, these predicted probabilities can be directly interpreted as the model's confidence in each prediction.

In this example, the model thinks the 8th user is the most likely to buy since they have the highest predicted probability (0.612).

Conclusion: If you had 50,000 users and you needed to choose which 500 users to target, you would calculate the predicted probability for all 50,000 and then select the 500 users with the highest probabilities!

Did you enjoy this short lesson? There are 148 more video lessons like this in my newest ML course, Master Machine Learning with scikit-learn!

👋 See you next week!

If you liked this week's tip, please share it with a friend! It really helps me out.

- Kevin

P.S. I spent my entire life savings on pasta 🍝

Did someone AWESOME forward you this email? Sign up here to receive more Data Science tips!