Learn Data Science from Data School 📊

Tuesday Tip #13: Use pandas resample() with time series data 📈

Published 12 months ago • 1 min read

Hi Reader,

Last week, I released a 3-hour video, My top 50 scikit-learn tips.

I also finished Chapter 10 of my next ML course, which I'll tell you about once all 20 chapters are done 😅

Anyway, let’s get to today’s tip!

👉 Tip #13: Use resample() with time series data

For fun, I’ve been building an interactive dashboard using pandas, Plotly Express, and Shiny for Python. (Check out a screenshot here.)

The goal is to help me analyze sales of my online courses. And since I’m working with sales data, I’m reminded of how much I love the pandas resample() method!

Let’s see an example of how to resample 😉

Pretend you have a DataFrame of sales data that looks like this:
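Here is a minimal stand-in you can run yourself. The product names, sale amounts, and the index name "Date" are made up for illustration. The only details taken from this tip are the Product and Sale columns, a datetime index, and the lack of sales on 2023-03-31:

```python
import pandas as pd

# Hypothetical sales data: values and product names are invented for this sketch
df = pd.DataFrame(
    {
        "Product": ["A", "B", "A", "B", "A"],
        "Sale": [10, 20, 30, 40, 50],
    },
    index=pd.to_datetime(
        ["2023-03-29", "2023-03-30", "2023-03-30", "2023-04-01", "2023-04-02"]
    ),
)
df.index.name = "Date"  # assumed name for the datetime index

print(df)
```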

You might ask: What are my total sales for each product?

In that case, you would use groupby:
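As a sketch of that groupby call, using the same made-up data as the stand-in DataFrame (product names and amounts are assumptions, not real sales figures):

```python
import pandas as pd

# Hypothetical sales data with a datetime index
df = pd.DataFrame(
    {"Product": ["A", "B", "A", "B", "A"], "Sale": [10, 20, 30, 40, 50]},
    index=pd.to_datetime(
        ["2023-03-29", "2023-03-30", "2023-03-30", "2023-04-01", "2023-04-02"]
    ),
)

# For each Product, sum the Sale column
per_product = df.groupby("Product")["Sale"].sum()
print(per_product)  # A -> 90, B -> 60
```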

You can read that code as: For each Product, this is the sum of the Sale column.

Another similar question you might ask is: What are my total sales for each day?

Instead of groupby, you would use resample, which I think of as “groupby for time series data”:
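A sketch of the daily version, again on made-up data (the amounts are assumptions, and the data deliberately has no rows for 2023-03-31):

```python
import pandas as pd

# Hypothetical sales data with a datetime index and a gap on 2023-03-31
df = pd.DataFrame(
    {"Product": ["A", "B", "A", "B", "A"], "Sale": [10, 20, 30, 40, 50]},
    index=pd.to_datetime(
        ["2023-03-29", "2023-03-30", "2023-03-30", "2023-04-01", "2023-04-02"]
    ),
)

# For each day, sum the Sale column; 'D' means calendar day
daily = df.resample("D")["Sale"].sum()
print(daily)  # 03-29: 10, 03-30: 50, 03-31: 0, 04-01: 40, 04-02: 50
```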

You can read that code as: For each day, this is the sum of the Sale column.

(Notice that it inserted 2023-03-31 with a value of 0, since there were no sales on that day.)

By changing the 'D' to an 'M', you can resample by month instead:
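A sketch of the monthly version on the same made-up data. One caveat: pandas 2.2 deprecated the 'M' alias in favor of 'ME' (month end), so this example tries 'ME' first and falls back to 'M' on older versions. The fallback is my own addition, not part of the original tip:

```python
import pandas as pd

# Hypothetical sales data with a datetime index
df = pd.DataFrame(
    {"Product": ["A", "B", "A", "B", "A"], "Sale": [10, 20, 30, 40, 50]},
    index=pd.to_datetime(
        ["2023-03-29", "2023-03-30", "2023-03-30", "2023-04-01", "2023-04-02"]
    ),
)

try:
    # pandas 2.2 and later: 'ME' is the month-end alias
    monthly = df.resample("ME")["Sale"].sum()
except ValueError:
    # older pandas versions only know 'M'
    monthly = df.resample("M")["Sale"].sum()

print(monthly)  # March: 60, April: 90 (labeled by month end)
```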

'D' and 'M' are known as “offset aliases”, and there are many other offset aliases you can use. (In pandas 2.2 and later, 'M' is deprecated in favor of 'ME' for month end.)
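For instance, 'W' is the weekly alias (weeks ending on Sunday by default). A quick sketch on the same made-up data, where every row happens to fall in the week ending 2023-04-02:

```python
import pandas as pd

# Hypothetical sales data with a datetime index
df = pd.DataFrame(
    {"Product": ["A", "B", "A", "B", "A"], "Sale": [10, 20, 30, 40, 50]},
    index=pd.to_datetime(
        ["2023-03-29", "2023-03-30", "2023-03-30", "2023-04-01", "2023-04-02"]
    ),
)

# For each week (ending Sunday), sum the Sale column
weekly = df.resample("W")["Sale"].sum()
print(weekly)
```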

Finally, let’s say that the index is not a datetime column:
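One way to get into that situation, sketched on the same made-up data, is to call reset_index(), which moves the dates into a regular column (named "Date" here by assumption):

```python
import pandas as pd

# Hypothetical sales data with a datetime index named "Date"
df = pd.DataFrame(
    {"Product": ["A", "B", "A", "B", "A"], "Sale": [10, 20, 30, 40, 50]},
    index=pd.to_datetime(
        ["2023-03-29", "2023-03-30", "2023-03-30", "2023-04-01", "2023-04-02"]
    ),
)
df.index.name = "Date"

# Move the dates out of the index: the result has a default integer index
df_flat = df.reset_index()
print(df_flat)
```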

In that case, you need to use the 'on' parameter to specify the datetime column:
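A sketch of the 'on' parameter, assuming the dates live in a column named "Date" (a hypothetical name, like the rest of this made-up data):

```python
import pandas as pd

# Hypothetical sales data where the dates are a regular column, not the index
df_flat = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2023-03-29", "2023-03-30", "2023-03-30", "2023-04-01", "2023-04-02"]
        ),
        "Product": ["A", "B", "A", "B", "A"],
        "Sale": [10, 20, 30, 40, 50],
    }
)

# Tell resample which column holds the datetimes
daily = df_flat.resample("D", on="Date")["Sale"].sum()
print(daily)  # same daily totals as resampling on a datetime index
```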

If you work with time series data, I bet you’ll find a use for resample!

If you enjoyed this week’s tip, please forward it to a friend! Takes only a few seconds, and it really helps me out! 🙌

See you next Tuesday!

- Kevin

P.S. What’s the worst volume control interface? (my favorite is #12)

Did someone awesome forward you this email? Sign up here to receive data science tips every week!

Kevin Markham
