Hi Reader,
Do you believe in magic? You will after you read this week’s tip! ✨
But first: I’m including a “link of the week” in each issue to share something that I think is worth checking out!
An Introduction to Statistical Learning
Nine years after first reading it, this remains my top recommended text for learning the foundational principles of Machine Learning. Although the book’s code was originally written in R, the authors released a Python version which uses libraries like scikit-learn, statsmodels, NumPy, Matplotlib, and PyTorch.
The labs from the book can be accessed as Jupyter Notebooks on GitHub or as a searchable Jupyter Book.
I’m going to show you some “magic” tricks that will improve your coding experience in Jupyter. But first, a quick history lesson… 📖
In its early days, the Jupyter Notebook was called the IPython Notebook. That’s because it was initially built for Python only, whereas the Notebook now supports many other programming languages.
But why was it called the “IPython Notebook”, not the “Python Notebook”?
That’s because the IPython Notebook was built on top of IPython, which is an Interactive Python shell. IPython is basically a better version of the standard Python shell.
One of the neat features of IPython is “magic commands”, which I’ll demonstrate below. And because the IPython Notebook (and thus the Jupyter Notebook) was built on top of IPython, you can use IPython magic commands from within Jupyter! 🪄
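There are two types of IPython magic commands:
Line magics start with % and operate on a single line of code.
Cell magics start with %% and operate on an entire cell.
For example, you can use the line magic %lsmagic to list all of the magic commands.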
You can use another line magic, %quickref, to open a “quick reference card” that briefly describes each of the commands. (Try it out!)
Below are some of my favorite magic commands... 👇
%time and %timeit
%time runs a line of code once, times how long it took to run, and displays the output of the code.
%timeit runs a line of code many times and averages the timing results (for greater accuracy), but it does not display the output of the code.
I use %time for long-running processes (like a scikit-learn grid search) when I want to know how long the process took to run but I don’t actually want to watch it run!
I use %timeit when I need to accurately compare the performance of two different lines of code.
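As a rough sketch of the difference, here is how the two commands might look in a notebook cell (shown as comments), followed by a plain-Python approximation built on the standard library's time and timeit modules. The function slow_square_sum is a made-up workload, and the standard-library code only approximates what the magics report.

```python
import time
import timeit


def slow_square_sum(n):
    """Hypothetical workload: sum of squares from 0 to n-1."""
    return sum(i * i for i in range(n))


# In Jupyter you would write:
#   %time slow_square_sum(10_000)
#   %timeit slow_square_sum(10_000)

# Rough equivalent of %time: run once, keep the result, report elapsed time
start = time.perf_counter()
result = slow_square_sum(10_000)
elapsed = time.perf_counter() - start
print(f"result={result}, wall time: {elapsed:.4f} s")

# Rough equivalent of %timeit: run many times and average, discarding the result
runs = timeit.repeat(lambda: slow_square_sum(10_000), repeat=5, number=10)
mean_per_call = sum(runs) / len(runs) / 10
print(f"mean per call: {mean_per_call:.6f} s (5 runs of 10 loops each)")
```

The single timed run gives you the result plus one measurement, while the repeated runs smooth out noise at the cost of throwing the result away.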
%%time and %%timeit
The use cases for %%time and %%timeit are the same as for the line magics above, except that these cell magics time the entire cell.
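In a notebook, a cell magic has to be the first line of the cell. Outside a notebook, a small context manager gives a comparable "time this whole block" effect. The names timed_block and timings below are hypothetical helpers for illustration, not part of IPython.

```python
import time
from contextlib import contextmanager

# In Jupyter, the cell magic goes on the first line of the cell:
#   %%time
#   squares = [i * i for i in range(50_000)]
#   total = sum(squares)

timings = {}  # hypothetical store for measured durations


@contextmanager
def timed_block(label):
    """Time everything inside the with-block, similar in spirit to %%time."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start
        print(f"{label}: {timings[label]:.4f} s")


with timed_block("build and sum squares"):
    squares = [i * i for i in range(50_000)]
    total = sum(squares)
```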
%who and %whos
%who shows you all of the variables you’ve defined in the current session.
%whos is similar, but it prints some extra information about each variable.
Both %who and %whos can be filtered by data type.
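For instance, %who int lists only the integer variables, and %whos str prints the detailed table for strings only. The plain-Python sketch below imitates that filtering on a dictionary of variables. The namespace dict and the who helper are illustrative assumptions, not how IPython implements these magics.

```python
# In Jupyter you would write:
#   %who          (all variable names)
#   %who int      (only variables of type int)
#   %whos str     (name, type, and value for str variables)

# Hypothetical stand-in for the variables defined in a session
namespace = {"count": 3, "ratio": 0.5, "name": "Ada", "city": "Paris"}


def who(variables, type_name=None):
    """Return sorted variable names, optionally filtered by type name."""
    return sorted(
        key
        for key, value in variables.items()
        if type_name is None or type(value).__name__ == type_name
    )


print(who(namespace))
print(who(namespace, "str"))

# A %whos-like table with one row per matching variable
for key in who(namespace, "str"):
    value = namespace[key]
    print(f"{key:<8}{type(value).__name__:<8}{value!r}")
```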
%history and %pastebin
%history shows your input history from the current session.
There are many useful options for %history, which you can learn about by adding a question mark after the command (%history?).
(The question mark allows you to get help with any object in Jupyter, not just magic commands!)
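In a notebook, this looks like %history? for a magic command or json.dumps? for an ordinary object. Outside a notebook, the closest standard-library tools are help() and the inspect module. The sketch below uses json.dumps purely as an example object.

```python
import inspect
import json

# In Jupyter you would write:
#   %history?
#   json.dumps?

# Plain-Python approximation: fetch the signature and docstring directly
signature = inspect.signature(json.dumps)
doc = inspect.getdoc(json.dumps)

print(f"json.dumps{signature}")
print(doc.splitlines()[0])  # first line of the docstring
```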
One useful option is -n, which adds line numbers.
If you really want to blow your mind, use the -g (global) option to see your entire history, meaning every command you've ever typed into Jupyter 🤯
That much output may be too long for Jupyter to display, so you can include the -f option to save it to a text file instead.
(The last line in my text file is “5230/15”, which means that I’ve started 5230 Jupyter sessions on this computer 😅)
A more practical use of %history -g is to search for a particular line of code that you’ve written in the past. For example, I can filter my history to only show input that included “df”.
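In a notebook, that search is %history -g df. IPython documents the argument to -g as a glob pattern, so a rough plain-Python analogue is glob matching over a list of past inputs. The past_inputs list is made-up sample data, search_history is a hypothetical helper, and wrapping the text in asterisks is my assumption about matching it anywhere in a line.

```python
from fnmatch import fnmatchcase

# In Jupyter you would write:
#   %history -g df

# Made-up sample of past inputs, as (session, line, source) tuples
past_inputs = [
    (12, 1, "import pandas as pd"),
    (12, 2, "df = pd.read_csv('data.csv')"),
    (12, 3, "df.head()"),
    (13, 1, "print('hello')"),
]


def search_history(entries, text):
    """Return entries whose source contains `text`, using a glob match."""
    pattern = f"*{text}*"  # assumption: match the text anywhere in the line
    return [entry for entry in entries if fnmatchcase(entry[2], pattern)]


# Print matches in a session/line format
for session, line, source in search_history(past_inputs, "df"):
    print(f"{session}/{line}: {source}")
```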
Another line magic that pairs well with %history is %pastebin, which makes it easy to share your code with someone else. For example, %pastebin 1-6 uploads lines 1 through 6 of your current session’s input history to a pastebin website.
You can then share the unique URL with anyone you like, and they will see those lines of code.
There are many more magic commands, which you can read about in the IPython documentation.
There are also other IPython features worth learning about, such as the ability to run shell commands from within Jupyter!
If you enjoyed this week’s tip, please forward it to a friend! It only takes a few seconds, and it really helps me grow the newsletter! 🚀
See you next Tuesday!
- Kevin
P.S. Officer: pop the trunk
Did someone awesome forward you this email? Sign up here to receive data science tips every week!