Tuesday Tip #11: Power up your pandas DataFrame 🐼


Hi Reader!

Back in tip #5, I showed you how to visualize your pandas code with Pandas Tutor.

Today, I’ve got four exciting pandas tools that can help you to:

  1. Speed up your data exploration
  2. Explore your dataset visually
  3. Treat your DataFrame like a spreadsheet
  4. Write pandas code faster with the help of AI

Let’s go! 🚀


👉 Tip #11: 4 tools to improve your pandas workflow

There are tons of free tools designed to improve your pandas workflow, but which ones are worth trying out?

I only considered tools that are being actively developed and maintained, since it’s not worth investing your time into a tool that will quickly become outdated, buggy, or broken.

Here are my top four picks...

1️⃣ ydata-profiling: “One-line Exploratory Data Analysis”

  • Summary: You run one line of code, and it creates an interactive report that makes it easy to examine each variable in your DataFrame. It also visualizes the interactions between variables, and alerts you to possible problems with the dataset. The report can even be exported to HTML!
  • Example: HTML report
  • Installation: pip or conda
  • Notes: It used to be known as pandas-profiling, but was renamed since it now also supports Spark DataFrames.
  • Takeaway: It’s a huge time-saver for getting an overview of a new dataset.
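
To make the "one line" concrete, here is a minimal sketch. It assumes the package is installed (pip install ydata-profiling) and that the ProfileReport class and its to_file method behave as documented at the time of writing. The sample DataFrame and the helper names make_sample_frame and write_profile_report are made up for illustration.

```python
import pandas as pd


def make_sample_frame() -> pd.DataFrame:
    """Build a tiny, made-up DataFrame so the sketch is self-contained."""
    return pd.DataFrame(
        {
            "product": ["A", "B", "C", "D"],
            "units_sold": [12, 7, 30, 18],
            "rating": [4.5, 3.8, None, 4.1],  # one missing value on purpose
        }
    )


def write_profile_report(df: pd.DataFrame, path: str = "report.html") -> None:
    """Profile df and export the report to an HTML file (hypothetical helper)."""
    # Imported inside the function so the rest of the sketch runs
    # even if ydata-profiling is not installed.
    from ydata_profiling import ProfileReport

    # This is the "one line": build the report from the DataFrame.
    report = ProfileReport(df, title="Sample profiling report")
    report.to_file(path)


# Typical use in a script or notebook:
# write_profile_report(make_sample_frame())
```

With real data, you would pass your own DataFrame instead of the sample one. The missing rating is the kind of thing the report's alerts are meant to surface.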

2️⃣ PyGWalker: “Turn your pandas DataFrame into a Tableau-style User Interface”

  • Summary: You run one line of code, and it creates a Tableau-like interface for visually exploring your pandas (or Polars) DataFrame. It works within Jupyter, Google Colab, Kaggle Code, VS Code, Streamlit, and more.
  • Example: Kaggle notebook
  • Installation: pip or conda
  • Notes: According to the repository, PyGWalker is pronounced “Pig Walker.”
  • Takeaway: It looks useful if you’re already familiar with Tableau (which I am not!)
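
Here is a rough sketch of what that one line looks like. It assumes PyGWalker is installed (pip install pygwalker), that you are running inside a notebook environment such as Jupyter, and that the walk function is still the entry point. The sample data and the helper names make_sample_frame and explore_visually are invented for this example.

```python
import pandas as pd


def make_sample_frame() -> pd.DataFrame:
    """Build a small, made-up sales table to explore."""
    return pd.DataFrame(
        {
            "region": ["North", "North", "South", "South"],
            "quarter": ["Q1", "Q2", "Q1", "Q2"],
            "sales": [100, 120, 90, 140],
        }
    )


def explore_visually(df: pd.DataFrame):
    """Open the drag-and-drop interface for df (hypothetical helper)."""
    # Imported inside the function so the rest of the sketch runs
    # even if PyGWalker is not installed.
    import pygwalker as pyg

    # This is the "one line": hand the DataFrame to PyGWalker.
    return pyg.walk(df)


# Typical use in a notebook cell:
# explore_visually(make_sample_frame())
```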

3️⃣ Mito: “Edit spreadsheet, generate Python code”

  • Summary: It’s essentially spreadsheet software that you run inside of Jupyter. The killer feature is that as you point-and-click (or write Excel-style formulas) to transform your data, Mito writes the corresponding pandas code for you. You can even create interactive, customizable graphs!
  • Example: Watch the demo video
  • Installation: pip (virtual environment recommended)
  • Notes: Most features are available for free, though a few features are limited to paid plans.
  • Takeaway: It’s designed to help you automate your spreadsheet workflow, though you could also use it to help you learn pandas!
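
As a minimal sketch of how you might open a DataFrame in Mito: this assumes the package is installed (pip install mitosheet), that you are working inside Jupyter, and that mitosheet.sheet accepts a DataFrame as shown. The sample data and the helper names make_sample_frame and open_in_mito are invented for illustration.

```python
import pandas as pd


def make_sample_frame() -> pd.DataFrame:
    """Build a small, made-up table to edit like a spreadsheet."""
    return pd.DataFrame(
        {
            "item": ["pen", "notebook", "stapler"],
            "price": [1.5, 4.0, 9.0],
            "quantity": [10, 3, 1],
        }
    )


def open_in_mito(df: pd.DataFrame) -> None:
    """Render df as an editable Mito spreadsheet (hypothetical helper)."""
    # Imported inside the function so the rest of the sketch runs
    # even if Mito is not installed.
    import mitosheet

    # Edits made in the spreadsheet are mirrored as generated pandas code.
    mitosheet.sheet(df)


# Typical use in a Jupyter cell:
# open_in_mito(make_sample_frame())
```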

4️⃣ Sketch: “AI code-writing assistant for pandas”

  • Summary: You write out what you want to do with a DataFrame, and Sketch writes the pandas code for you! You can also ask it questions about your dataset.
  • Example: Colab notebook or watch the demo video
  • Installation: pip
  • Notes: Sketch shares information about your DataFrame with OpenAI, which improves the relevance of its suggestions.
  • Takeaway: It could help you to speed up your pandas workflow, though it’s important that you double-check the code suggestions (since they are not guaranteed to be correct).
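
Here is a rough sketch of both uses (asking a question and requesting code). It assumes Sketch is installed (pip install sketch), that importing it registers a .sketch accessor on DataFrames, and that the ask and howto methods work as described in the project's README at the time of writing. The sample data, the prompts, and the helper names are invented for this example.

```python
import pandas as pd


def make_sample_frame() -> pd.DataFrame:
    """Build a small, made-up table to ask questions about."""
    return pd.DataFrame(
        {
            "customer": ["Ana", "Ben", "Cleo", "Dev"],
            "orders": [3, 1, 5, 2],
            "total_spent": [60.0, 15.0, 125.0, 40.0],
        }
    )


def ask_about_data(df: pd.DataFrame, question: str):
    """Ask a plain-English question about df (hypothetical helper)."""
    # Importing sketch is what adds the .sketch accessor to DataFrames.
    import sketch  # noqa: F401

    return df.sketch.ask(question)


def request_code(df: pd.DataFrame, task: str):
    """Ask for pandas code that performs a task on df (hypothetical helper)."""
    import sketch  # noqa: F401

    return df.sketch.howto(task)


# Typical use in a notebook cell (this sends information about df to a remote service):
# ask_about_data(make_sample_frame(), "Which columns are numeric?")
# request_code(make_sample_frame(), "Compute the average amount spent per order")
```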

What did I miss? Reply and let me know your favorite pandas tool!


If you enjoyed this week’s tip, please forward it to a friend! It takes only a few seconds, and it really helps me out 🙏

See you next Tuesday!

- Kevin

P.S. Reality TV

