In this post, you’re going to learn the 20% of the command line that you’ll use 80% of the time.
(This guide is emphatically not meant to be comprehensive. Instead, it will show you how to get up and running quickly with the most useful commands.)
Before we dive in, just a couple of items.
In case you want to learn more than just the command line, here are the other Project Data Science 80/20 Guides:
And if you need to get a professional data science environment set up on your computer, we have a guide for that: Step-by-Step Guide to Setting Up a Professional Data Science Environment.
Alright, ready to get started?
Table of Contents
- 80/20 Command Line
- Different Names for the Command Line
- A Note about the Windows Operating System
- What Is a Shell?
- Basic Shell Commands
- Common Programs and Applications
- Additional Resources
80/20 Command Line
Different Names for the Command Line
The one-sentence definition for the command line is “a text-based way to interact with and run programs on your computer.” That’s really all it is.
But first, before we go any deeper, we need to get this out of the way—there are a lot of different names you might hear for the command line, and we’ll list them here just so you’re aware of the different terminology.
Obviously, “command line” is one way to talk about the command line. This is one of the most general terms.
Another very popular term for the command line is the shell, which is also general.
You’ll also hear people talk about specific shells such as bash and zsh, or they may say “bash” when they really just mean “the command line, whatever shell it is”. (We usually use zsh at Project Data Science, which is a separate shell that is largely compatible with bash and adds some extra features.) These shell names reference specific programs that have slightly different functionality, although the core functionality is often the same.
Finally, the terminal is another way to reference the command line. Terminal is also the official name of the command line program on macOS, so this term gets used a lot by people who are using Macs.
For the rest of this article, we’ll mostly use either “shell” or “terminal” since those are shorter and are a bit more commonly used in the industry.
If you want to go deeper (you don’t necessarily need to if you don’t want), check out this excellent answer on the Ask Ubuntu StackExchange site: What is the difference between Terminal, Console, Shell, and Command Line?
A Note about the Windows Operating System
One quick note about Windows before we go on. The Windows operating system has two main command lines, CMD and PowerShell. These command lines are powerful and can do many of the same things as the Linux and MacOS command lines, but they have different commands for doing so and they have a very different way of operating in general.
Most programmers and data scientists use the Linux/Unix/MacOS style command lines, because most of the world’s servers are run on Linux. (Linux and MacOS are both built on top of Unix, or are Unix-like, and so they can both run popular shells like bash and zsh.) If you’re going to be working on a server at some point, it’s probably going to be a Linux server, and you’re probably going to be using bash to interact with that server.
We won’t be covering the CMD or PowerShell commands in this article, since those aren’t used as commonly. If you work on a Windows computer, consider installing the Windows Subsystem for Linux (WSL). We dive into all of your command line options for Windows in this article: Step-by-Step Guide to Setting Up a Professional Data Science Environment on Windows.
What Is a Shell?
We covered this briefly above, but essentially a shell is a text-based way to interact with and run programs on your computer.
Want to create a new file or directory? There’s a shell command for that. Want to hop into Python or run a Python script? You can do that from the shell too. Want to connect to another computer? That’s also a program you can run from the shell.
You’ll typically start off with one of the most popular shell applications, like bash or zsh, which has a lot of commands ready for you to use. You’ll use this shell to run applications and executables on your computer (like the Python executable), and if you need to install additional applications you can do that with the shell too.
But this is all a lot easier to understand by simply going through the most common commands you’ll run on the shell. Let’s fire up a terminal!
Basic Shell Commands
Starting Your Terminal
First, we’ll launch the terminal. I’m on a Mac, so I’ll just search “Terminal” and run that application.
Tada! Here we are, inside of the shell. I’m using the shell zsh, which you can install by following the instructions in this article if you want: Step-by-Step Guide to Setting Up a Professional Data Science Environment. Or you can use bash, which is totally fine and is probably the default shell on your computer.
Printing Your Working Directory
The first thing we might ask is… Uh, where are we? Let’s use the “pwd” command to print our working directory.
It looks like we’re in the “stevenrouk” directory under the “Users” directory. Directory is just another word for a folder.
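As a minimal sketch, the command looks like this. The path in the comment is the author's home directory as described in this post; your output will differ.

```shell
# Print the absolute path of the directory the shell is currently in.
pwd
# On the author's Mac this prints /Users/stevenrouk; on your machine
# it will be whatever directory your shell happens to be in.
```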
Printing the Directory Contents
So now that we know where we are, what all is in this directory? We can use the “ls” command to list the contents of the directory. (Note that the letters are lowercase L and S.)
You can see that I have some typical directories—like Desktop, Documents, Downloads, Pictures, etc.—as well as some custom directories that I’ve created, such as my-project-dir and ProjectDataScience.
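Here is a short sketch of listing a directory. The “-a” and “-l” flags are standard “ls” options that this walkthrough doesn't use at this point; they are included only to show how flags change the output.

```shell
ls       # names of the files and directories in the current directory
ls -a    # same, but also show hidden entries (names starting with a period)
ls -l    # long format: permissions, owner, size, and modification time
```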
Now I want to check out one of these directories, like the my-project-dir directory. Let’s change directories into my-project-dir using the “cd” command.
Now let’s list the contents of this new directory that we’re in.
Looks like a Python project of some sort.
(We could have also listed the contents of this directory without changing into it with the command “ls my-project-dir”.)
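The steps above can be sketched as follows. To keep the example safe to run anywhere, it works in a throwaway directory and recreates a stand-in for my-project-dir; the file main.py is hypothetical and is not one of the files in the author's project.

```shell
# Work in a throwaway directory so this sketch is safe to run anywhere.
cd "$(mktemp -d)"

# Recreate a small stand-in for the project directory (main.py is hypothetical).
mkdir my-project-dir
touch my-project-dir/main.py

ls my-project-dir    # list a directory without changing into it
cd my-project-dir    # change into the directory
pwd                  # confirm where we are now
ls                   # list the contents of the new working directory
```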
Creating New Files and New Directories
We can create a new file using the “touch” command. I’ll use it to create a new Python script called “test_shell_commands.py”.
If we want to make a new directory, we can use the “mkdir” command.
Well, maybe I actually wanted to put the “test_shell_commands.py” file into my-new-directory. I can move it using the “mv” command.
And if I’m done with it, I can delete (remove) the file using the “rm” command.
You’ll notice that to go into a directory you put a forward slash (“/”) after the directory name. That’s how Unix-style paths (file paths or directory paths) work.
To remove a directory and everything in it, I need to use the “-r” flag.
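Put together, the file and directory commands from this section look roughly like this. The sketch runs in a throwaway directory (an assumption made so nothing in your real folders is touched) and uses the file and directory names from the text.

```shell
# Work in a throwaway directory so nothing important can be deleted.
cd "$(mktemp -d)"

touch test_shell_commands.py                     # create an empty file
mkdir my-new-directory                           # create a new directory
mv test_shell_commands.py my-new-directory/      # move the file into it
ls my-new-directory                              # the file is now listed here

rm my-new-directory/test_shell_commands.py       # delete the file
rm -r my-new-directory                           # delete the directory and its contents
```

Be careful with “rm”: files removed this way do not go to the trash, so there is no easy undo.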
What if I want to open up a file or directory and look at it like normal, using the graphical user interface (GUI) that we usually use on our computers? I can just use the “open” command with the file or directory, and our computer will open it using the default program.
I’ll open up the directory we’re currently in.
Note that I used a single period in the shell to indicate “the current directory”. Anytime you want to reference the current directory that you’re in, you can use a single period to indicate that.
There are some other special characters that are used all of the time. Two periods in a row means “the parent directory”. Let’s use the “cd” command with two periods to go up one directory, back to where we first started.
Do you see that tilde (“~”) there? That’s a special character that means “home directory”. In this case, my home directory is “/Users/stevenrouk”. Anytime I want to go to the home directory, I can just do “cd ~”.
Finally, one more special character to discuss: a single forward slash (“/”). This indicates the root directory, which is the top-level parent of all of the other directories on the computer. This is the highest you can go in the tree hierarchy. Let’s use “ls /” to list the contents of the root directory.
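The four special path characters can be tried out with a sketch like the one below. It starts in a throwaway directory with a stand-in my-project-dir, which is an assumption made so the example runs the same way on any machine.

```shell
# Start in a throwaway directory containing a stand-in project directory.
cd "$(mktemp -d)"
mkdir my-project-dir
cd my-project-dir

ls .      # a single period means the current directory
cd ..     # two periods mean the parent directory
pwd       # we are back in the directory we started from

cd ~      # the tilde means your home directory
pwd       # prints the path of your home directory

ls /      # a single forward slash means the root directory
```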
Common Programs and Applications
Programs and applications are essentially “executable files” that can be run by your computer, and these can be launched from the shell. There are certain programs that are very commonly used in the shell.
Python, IPython, and Jupyter Notebooks
Data scientists will often use the shell to launch a Python interactive shell, an IPython interactive shell, or a Jupyter notebook server. Here, we’ll launch all three in quick succession just to show you how it’s done.
Once you launch the Python shells or Jupyter notebooks, then you’re just working with Python at that point!
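All three launch commands are interactive, so the sketch below first checks which tools are installed and then shows the launch commands as comments. The command name python3 is an assumption; on some systems the Python executable is simply “python”.

```shell
# Check which of the three tools are installed before trying to launch them.
# (python3 is an assumption; on some systems the command is just "python".)
for tool in python3 ipython jupyter; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool found at $(command -v "$tool")"
  else
    echo "$tool is not installed"
  fi
done

# Each launch command is interactive, so they are shown here as comments:
#   python3            # Python interactive shell; leave with exit() or Ctrl-D
#   ipython            # IPython interactive shell; leave with exit or Ctrl-D
#   jupyter notebook   # starts a notebook server; stop it with Ctrl-C
```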
Editing a File with Vim
What happens if you need to edit a Python file using just the command line, though? This is very common if you’re connecting to a server that doesn’t have any kind of graphical user interface. First, let’s create a Python file to work with.
Now we’ll edit the Python file using one of the most common command line based text editors, Vim. Vim does have a bit of a steep learning curve because it uses all kinds of special commands for navigating around your text, but it’s worth learning the basics.
We’ll open up our file with Vim using the command “vim demo.py”.
Now we’re in the program Vim! Looks pretty plain because… well, it’s just a blank text file.
I’ll add a few lines of Python code, then save the file and quit out of Vim.
Now I can run that Python script and see what it prints out!
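Vim is interactive, so it can't be scripted in a copy-and-paste example. The sketch below lists the basic Vim keystrokes as comments and then uses a here-document as a stand-in to write the file. The one-line script and the python3 command name are assumptions for illustration, not the code from the original walkthrough.

```shell
# Work in a throwaway directory and create the empty file.
cd "$(mktemp -d)"
touch demo.py

# Interactive step (not run here): vim demo.py
#   i      enter insert mode, then type your code
#   Esc    return to normal mode
#   :wq    write (save) the file and quit, then press Enter

# Non-interactive stand-in that writes a one-line script to the same file.
cat > demo.py <<'EOF'
print("Hello from demo.py")
EOF

# Run the script with the Python executable.
python3 demo.py
```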
If you want to learn a bit more about Vim, check out this introductory tutorial: Getting started with Vim: The basics.
Opening a File or Directory with VS Code
Of course, very often we’re editing code on our computers which do have graphical user interfaces and useful coding programs like VS Code. We can open up files or directories in VS Code by using the “code” command. If I wanted to open up the current directory in VS Code I could run “code .” (notice the single period there), but right now I’ll just open up the demo.py file with “code demo.py”.
Connecting to Another Computer
One final program that we’ll briefly touch on is SSH, which stands for “secure shell”. This is how you’ll very often connect to other computers, such as servers or virtual machines (which are often hosted by cloud providers like Amazon Web Services (AWS) or Google Cloud Platform).
Let’s say I currently have a virtual machine provisioned on AWS—which basically means I have a computer in the cloud managed by AWS. These virtual machines are usually running Linux, and I can use the SSH program to connect to them.
I’ll connect to a virtual machine I have running right now. Here’s how I would connect (this isn’t my real server URL).
Once I’m connected to the other server, I’m in another shell and can use the same shell commands! These commands are getting executed on my virtual machine in the cloud. When I’m done, I can exit out of the SSH program and return to the shell on my own computer.
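The general shape of an SSH session is sketched below. The username and hostname are hypothetical placeholders, so the example prints the command instead of running it; substitute the values for your own server.

```shell
# Hypothetical placeholders: replace with your own username and server address.
REMOTE_USER="ubuntu"
REMOTE_HOST="my-server.example.com"

# Print the command instead of running it, because this host does not exist.
echo "ssh ${REMOTE_USER}@${REMOTE_HOST}"

# Run interactively, a session looks roughly like this:
#   ssh ubuntu@my-server.example.com   # connect; you are now in the remote shell
#   pwd                                # this runs on the remote machine
#   exit                               # close the connection and return to your own shell
```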
And with that, we’ve covered the 20% of the command line you need to know to get 80% of the value! (We probably only covered about 5% of the command line, actually, but these are still the commands you’ll use ~80% of the time.) Knowing how to effectively use the command line will not only make your life much easier, but it’s also an important skill to a lot of companies looking to hire data scientists.
Additional Resources
There’s so much more to learn though, so I wanted to leave you with some additional resources and topics for future learning if you want to go deeper.
First, here are in-depth guides that can help you go deeper. Both of these are excellent resources for learning more about Linux in general (including the command line).
- The Linux Documentation Project Guides: https://tldp.org/guides.html
- Linux Journey: https://linuxjourney.com/
Second, here are some topics that you’ll want to learn more about at some point.
- Aliases
- Aliases are used to define custom shell commands. For example, a very common alias is using “ll” (two lowercase L’s) for the command “ls -alF”, which is the “ls” command with a few flags to print additional information. To define this alias in bash, you can run the command “alias ll='ls -alF'” (note the plain, straight single quotes).
- To learn more: https://tldp.org/LDP/abs/html/aliases.html
- Bash Profile
- Your bash profile (or bashrc, or zshrc) is a file that controls some configuration options for bash. One common use of this file is to define aliases in order to make them immediately available when you start the shell. Another common use is to export environment variables so that they’re available. But, there are many other uses for the bash profile. The bash profile is stored in the home directory and usually has the file name .bash_profile, .bashrc, or .zshrc.
- To learn more: https://friendly-101.readthedocs.io/en/latest/bashprofile.html
- Shell Scripts
- Not only can you use the shell in a real-time interactive manner, but you can also write shell scripts just like you write Python scripts, using the shell commands as a scripting language.
- To learn more: https://tldp.org/LDP/abs/html/index.html
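The three topics above can be sketched in a few lines. The alias is the one from the text; the environment variable name PROJECT_DATA_DIR, its value, and the script hello.sh are hypothetical examples. The first two commands are the kind of lines you might place in .bashrc or .zshrc so they take effect in every new shell.

```shell
# 1. Alias: the kind of line you might add to .bashrc or .zshrc.
alias ll='ls -alF'
alias ll    # print the alias definition to confirm it exists

# 2. Environment variable: exported so programs started from this shell can read it.
#    (PROJECT_DATA_DIR and its value are hypothetical.)
export PROJECT_DATA_DIR="/tmp/project-data"
echo "$PROJECT_DATA_DIR"

# 3. Shell script: write a tiny script in a throwaway directory and run it.
cd "$(mktemp -d)"
cat > hello.sh <<'EOF'
#!/usr/bin/env bash
# A script is just shell commands saved in a file.
echo "Hello from a shell script"
EOF
bash hello.sh
```

In an interactive shell, typing “ll” after defining the alias runs “ls -alF”.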
And with that, you’ve just learned the 20% of the command line that will get you 80% of the value. A lot more to learn, but you’re well on your way!