
Learn Data Science from Data School 📊

Tuesday Tip #34: What are conda, Anaconda, and Miniconda? 🐍

Published 3 months ago • 2 min read

Hi Reader,

Soon it will be winter break for my 6-year-old, so this is going to be my last Tuesday Tip of the year! ⛄


👉 Tip #34: What's the difference between conda, Anaconda, and Miniconda?

If you've ever taken one of my courses, you may have noticed that I frequently recommend the Anaconda distribution of Python.

You might be left wondering:

  • What is the Anaconda distribution, and why do people recommend it?
  • How is it related to conda?
  • How is it related to Miniconda?
  • As a data scientist, which of these do I need to be familiar with?

I'll answer those questions below! 👇


What is Anaconda?

​Anaconda is a Python distribution aimed at data scientists that includes 250+ packages (with easy access to 7,500+ additional packages). Its value proposition is that you can download it (for free) and "everything just works." It's available for Mac, Windows, and Linux.

A new Anaconda distribution is released a few times a year. Within each distribution, the versions of the included packages have all been tested to work together.

If you visit the installation page for many data science packages (such as pandas), they recommend Anaconda because it makes installation easy!


What is conda?

​conda is an open source package and environment manager that comes with Anaconda.

As a package manager, conda lets you install, update, and remove packages and their "dependencies" (the packages they depend upon):

  • If Anaconda doesn't include a package that you need, you use conda to download and install it.
  • If Anaconda doesn't have the version of a package you need, you use conda to update it.
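
To make those two cases concrete, here's a minimal sketch in Python that only assembles conda command lines and prints them. Nothing is executed, so it runs even without conda installed. The helper name conda_command, the package names, and the version number are my own placeholders rather than recommendations.

```python
import shlex


def conda_command(action, *specs):
    """Return a conda command as a list of arguments (nothing is executed)."""
    return ["conda", action, *specs]


# Package names and the version number are placeholders for illustration.
install = conda_command("install", "seaborn")   # add a package
pin = conda_command("install", "pandas=2.1")    # request a specific version
update = conda_command("update", "pandas")      # move to the latest compatible version
remove = conda_command("remove", "seaborn")     # uninstall a package

for cmd in (install, pin, update, remove):
    # shlex.join turns the argument list into a line you could paste into a terminal
    print(shlex.join(cmd))
```

To actually run one of these, type the printed line into a terminal, or pass the list to subprocess.run(cmd, check=True) on a machine where conda is installed.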

As an environment manager, conda lets you manage virtual environments:

  • If you're not familiar with virtual environments, they allow you to maintain isolated environments with different packages and versions of those packages.
  • conda is an alternative to virtualenv, pipenv, and other related tools.
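
Here's a rough sketch of a typical environment workflow, again written as Python that just prints the conda commands instead of running them. The environment name "ds-demo" and the Python version are assumptions I picked for the example.

```python
import shlex

ENV_NAME = "ds-demo"  # hypothetical environment name

# Each entry is one step of a typical workflow, in order.
env_commands = [
    ["conda", "create", "--name", ENV_NAME, "python=3.11", "pandas"],  # new isolated environment
    ["conda", "activate", ENV_NAME],   # switch into it
    ["conda", "list"],                 # show the packages installed in the active environment
    ["conda", "deactivate"],           # switch back out
    ["conda", "env", "list"],          # show all environments on this machine
]

for cmd in env_commands:
    print(shlex.join(cmd))
```

One caveat: conda activate changes the state of your current shell, so it's meant to be typed in a terminal rather than launched from a Python subprocess.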

conda has a few huge advantages over other tools:

  • It's a single tool to learn, rather than using multiple tools to manage packages, environments, and Python versions.
  • Package installation is predictably easy because you're installing pre-compiled binaries.
  • Unlike with pip, you never need to build packages from source code, which can be especially difficult for some data science packages.
  • You can use conda with languages other than Python.
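
As a small illustration of the "Python versions" and "other languages" points, this sketch prints conda commands that would create environments with different Python versions, plus one with R from the conda-forge channel. The environment names are hypothetical, and the specific versions are just examples.

```python
import shlex

# Hypothetical environment names, chosen for illustration only.
commands = {
    "older Python": ["conda", "create", "--name", "py310", "python=3.10"],
    "newer Python": ["conda", "create", "--name", "py312", "python=3.12"],
    "R from conda-forge": [
        "conda", "create", "--name", "r-demo",
        "--channel", "conda-forge", "r-base",
    ],
}

for label, cmd in commands.items():
    # Nothing is executed; each command is only printed for reference.
    print(f"{label}: {shlex.join(cmd)}")
```

Each environment gets its own interpreter, so they can coexist on one machine without interfering with each other.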

What is Miniconda?

​Miniconda is a Python distribution that only includes Python, conda, their dependencies, and a few other useful packages.

Miniconda is a great choice if you prefer to only install the packages you need, and you're sufficiently familiar with conda. (Here's how to choose between Anaconda and Miniconda.)


Summary:

  • ​Anaconda and Miniconda are both Python distributions.
  • Anaconda includes hundreds of packages, whereas Miniconda includes just a few.
  • ​conda is an open source tool that comes with both Anaconda and Miniconda, and it functions as both a package manager and an environment manager.

Personally, I make extensive use of conda for creating environments and installing packages. And since I'm comfortable with conda, I much prefer Miniconda over Anaconda.

Would you be interested in taking a short course about conda? Reply and let me know! 💌


If you enjoyed this week's tip, please forward it to a friend! It takes only a few seconds, and it really helps me reach more people!

I'll see you again in January! 👋

- Kevin

P.S. Christmas decorating injuries 🎄

Kevin Markham
