Tuesday Tip #1: Speed up your grid search 🔎


Hi Reader!

Welcome to the first issue of “Tuesday Tips,” a new series in which I’ll share a data science tip with you every Tuesday!

These tips will come from all over the data science spectrum: Machine Learning, Python, data analysis, NLP, Jupyter, and much more!

I hope they will help you to learn something new, work more efficiently, or just motivate and inspire you ✨


👉 Tip #1: Speed up your hyperparameter search

In supervised Machine Learning, “hyperparameter tuning” is the process of finding the model settings (the “hyperparameters” you choose before training) that make your model most effective. For example, if you’re trying to improve your model’s accuracy, you want to find the hyperparameter values that maximize its accuracy score.

One common way to tune your model is through a “grid search”, which basically means that you define a set of values to try for each parameter, and your model evaluation procedure (like cross-validation) checks every combination of those values to see which one works best.
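Here’s a minimal sketch of what a grid search can look like in scikit-learn. The dataset (iris), the model (logistic regression), and the parameter values are placeholders picked purely for illustration, so swap in your own:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Placeholder dataset and model, chosen only for illustration
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Values to try for each parameter: 3 x 2 = 6 combinations
param_grid = {
    "C": [0.1, 1, 10],
    "class_weight": [None, "balanced"],
}

# 5-fold cross-validation scores every combination by accuracy
grid = GridSearchCV(model, param_grid, cv=5, scoring="accuracy")
grid.fit(X, y)

print(grid.best_params_)  # the combination with the best mean score
print(grid.best_score_)   # its mean cross-validated accuracy
```

After fitting, grid.cv_results_ holds the scores for all six combinations if you want to compare them yourself.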

Sounds great, right?

Well, one big problem with grid search is that if your model is slow to train or you have a lot of parameters you want to try, this process can take a LONG TIME.
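To get a feel for how quickly the work adds up, here’s a rough back-of-the-envelope sketch in plain Python. The parameter names and values are hypothetical, and it assumes 5-fold cross-validation:

```python
from itertools import product

# Hypothetical grid: 4 x 3 x 5 values
param_grid = {
    "n_estimators": [100, 200, 500, 1000],
    "max_depth": [3, 5, 10],
    "learning_rate": [0.01, 0.03, 0.1, 0.3, 1.0],
}
cv_folds = 5  # assumed number of cross-validation folds

# Grid search tries every combination of the values above
combinations = list(product(*param_grid.values()))

# Each combination is trained once per fold
n_fits = len(combinations) * cv_folds

print(len(combinations))  # 60
print(n_fits)             # 300
```

Adding just one more parameter with 4 values would push this to 1,200 model fits, which is why a slow model makes grid search painful.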

So what’s the solution? I've got two solutions for you:

1. If you’re using GridSearchCV in scikit-learn, use the “n_jobs” parameter to turn on parallel processing. Set it to -1 to use all processors, though be careful about using that setting in a shared computing environment!

🔗 2-minute demo of parallel processing

2. Also in scikit-learn, swap out GridSearchCV for RandomizedSearchCV. Whereas grid search checks every combination of parameters, “randomized search” checks a random sample of combinations. You specify how many combinations you want to try (based on how much time you have available), and it often finds an “almost best” set of parameters in far less time than grid search!

🔗 5-minute demo of randomized search
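Both tips can be combined. Here’s a minimal sketch, again assuming a placeholder dataset (iris) and model (logistic regression), with a parameter range made up for illustration. It uses loguniform from SciPy to sample values of C, which is one reasonable choice rather than a requirement:

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

# Placeholder dataset and model, chosen only for illustration
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# A distribution to sample C from, and a list to pick class_weight from
param_distributions = {
    "C": loguniform(1e-3, 1e2),
    "class_weight": [None, "balanced"],
}

search = RandomizedSearchCV(
    model,
    param_distributions,
    n_iter=10,        # how many random combinations to try
    cv=5,
    scoring="accuracy",
    n_jobs=-1,        # tip 1: use all processors
    random_state=42,  # makes the random sampling repeatable
)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

The n_iter value is the dial for your time budget: raise it when you can afford more model fits, lower it when you can’t.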


If you enjoyed this issue, please forward it to a friend! 📬

See you next Tuesday!

- Kevin

P.S. Shout-out to my long-time pal, Ben Collins, who inspired and encouraged me to start this series. He has been sharing weekly Google Sheets tips for almost 5 years! Check out his site if you want to improve your Sheets skills!

Learn Artificial Intelligence from Data School 🤖
