Today, I’ve got four exciting pandas tools that can help you to:
- Speed up your data exploration
- Explore your dataset visually
- Treat your DataFrame like a spreadsheet
- Write pandas code faster with the help of AI
Let’s go! 🚀
👉 Tip #11: 4 tools to improve your pandas workflow
There are tons of free tools designed to improve your pandas workflow, but which ones are worth trying out?
I only considered tools that are being actively developed and maintained, since it’s not worth investing your time into a tool that will quickly become outdated, buggy, or broken.
Here are my top four picks...
1️⃣ ydata-profiling: “One-line Exploratory Data Analysis”
- Summary: You run one line of code, and it creates an interactive report that makes it easy to examine each variable in your DataFrame. It also visualizes the interactions between variables, and alerts you to possible problems with the dataset. The report can even be exported to HTML!
- Example: HTML report
- Installation: pip or conda
- Notes: It used to be known as pandas-profiling, but was renamed since it now also supports Spark DataFrames.
- Takeaway: It’s a huge time-saver for getting an overview of a new dataset.
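Here's a rough sketch of what that "one line" looks like in practice. It assumes ydata-profiling is installed, and the tiny DataFrame, the `make_report` helper, and the file name are all made up for illustration:

```python
import pandas as pd

# Tiny made-up dataset, just for illustration
df = pd.DataFrame({
    "product": ["apple", "banana", "cherry"],
    "price": [2, 5, 3],
    "qty": [10, 4, 7],
})

def make_report(data, path="report.html"):
    """Profile a DataFrame and save the report as an HTML file."""
    # Imported here so the rest of the script still runs without the package
    from ydata_profiling import ProfileReport

    report = ProfileReport(data, title="Quick EDA")  # the "one line"
    report.to_file(path)  # export the report to HTML
    return path
```

Calling `make_report(df)` should write the report next to your script. Inside Jupyter, you can display it inline with `report.to_notebook_iframe()` instead of saving it.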
2️⃣ PyGWalker: “Turn your pandas DataFrame into a Tableau-style User Interface”
- Summary: You run one line of code, and it creates a Tableau-like interface for visually exploring your pandas (or Polars) DataFrame. It works within Jupyter, Google Colab, Kaggle Code, VS Code, Streamlit, and more.
- Example: Kaggle notebook
- Installation: pip or conda
- Notes: According to the repository, PyGWalker is pronounced “Pig Walker.”
- Takeaway: It looks useful if you’re already familiar with Tableau (which I am not!).
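A minimal sketch of the PyGWalker one-liner, assuming the package is installed and you're working in a notebook. The sample data and the `explore` helper are placeholders of my own:

```python
import pandas as pd

# Tiny made-up dataset, just for illustration
df = pd.DataFrame({
    "product": ["apple", "banana", "cherry"],
    "price": [2, 5, 3],
    "qty": [10, 4, 7],
})

def explore(data):
    """Open the drag-and-drop interface for a DataFrame."""
    # Imported here so the rest of the script still runs without the package
    import pygwalker as pyg

    return pyg.walk(data)  # the "one line"
```

Run `explore(df)` in a notebook cell, and the interface should render right below it.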
3️⃣ Mito: “Edit spreadsheet, generate Python code”
- Summary: It’s essentially spreadsheet software that you run inside of Jupyter. The killer feature is that as you point-and-click (or write Excel-style formulas) to transform your data, Mito writes the corresponding pandas code for you. You can even create interactive, customizable graphs!
- Example: Watch the demo video
- Installation: pip (virtual environment recommended)
- Notes: Most features are available for free, though a few features are limited to paid plans.
- Takeaway: It’s designed to help you automate your spreadsheet workflow, though you could also use it to help you learn pandas!
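To give you a feel for it, here's a small sketch that assumes the `mitosheet` package is installed and that you're in Jupyter. The `open_in_mito` helper is my own placeholder, and the `revenue` line is hand-written to show the kind of pandas code a spreadsheet-style formula corresponds to (it is not actual Mito output):

```python
import pandas as pd

# Tiny made-up dataset, just for illustration
df = pd.DataFrame({
    "product": ["apple", "banana", "cherry"],
    "price": [2, 5, 3],
    "qty": [10, 4, 7],
})

def open_in_mito(data):
    """Open a DataFrame as an editable spreadsheet inside Jupyter."""
    # Imported here so the rest of the script still runs without the package
    import mitosheet

    mitosheet.sheet(data)

# Hand-written equivalent of an Excel-style formula like =price*qty
# (an illustration only, not code generated by Mito)
df["revenue"] = df["price"] * df["qty"]
```

Call `open_in_mito(df)` from a notebook cell to get the spreadsheet view.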
4️⃣ Sketch: “AI code-writing assistant for pandas”
- Summary: You write out what you want to do with a DataFrame, and Sketch writes the pandas code for you! You can also ask it questions about your dataset.
- Example: Colab notebook or watch the demo video
- Installation: pip
- Notes: Sketch shares information about your DataFrame with OpenAI, which improves the relevance of its suggestions.
- Takeaway: It could help you to speed up your pandas workflow, though it’s important that you double-check the code suggestions (since they are not guaranteed to be correct).
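Here's a hedged sketch of both uses, assuming the `sketch` package is installed and you're in a notebook. The helper names and the sample prompts are my own, and calling either helper sends information about your DataFrame over the network, as noted above:

```python
import pandas as pd

# Tiny made-up dataset, just for illustration
df = pd.DataFrame({
    "product": ["apple", "banana", "cherry"],
    "price": [2, 5, 3],
    "qty": [10, 4, 7],
})

def ask_about(data, question):
    """Ask a plain-English question about a DataFrame."""
    # Importing sketch adds the .sketch accessor to DataFrames
    import sketch  # noqa: F401

    return data.sketch.ask(question)

def get_code_for(data, request):
    """Ask Sketch to suggest pandas code for a task."""
    import sketch  # noqa: F401

    return data.sketch.howto(request)
```

For example, you might try `ask_about(df, "Which columns are numeric?")` or `get_code_for(df, "Add a revenue column equal to price times qty")`, then check the suggested code before running it.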
What did I miss? Reply and let me know your favorite pandas tool!
If you enjoyed this week’s tip, please forward it to a friend! It takes only a few seconds, and it really helps me out 🙏
See you next Tuesday!
P.S. Reality TV