Chapter 6 Introduction to the Command Line
Objectives:
- To start using the command line for basic coding
- To learn how to navigate in the command line
- To create our first lines of code to process text files
The UNIX command line is a program that interprets commands, written as text instructions, and has the computer execute them.
This means that you can tell the computer to list files, print text on the screen, or do a mathematical operation using a set of commands, and the computer will execute them and provide you with an answer.
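As a quick illustration, here is a minimal sketch of how those three tasks might look in a typical UNIX shell. The text and the numbers are made up for this example, and the output of the first command will depend on the folder you are in.

```shell
# List the files in the current folder
ls

# Print text on the screen
echo "Hello, command line!"

# Do a mathematical operation: $(( ... )) evaluates integer arithmetic
answer=$((6 * 7))
echo "$answer"
```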
Before we start using the command line, here is a list of the most commonly used commands in UNIX for your future reference.
6.1 Commonly used commands
6.1.1 Logging in
- Logging into the cluster: 140.232.222.154 in your browser
- Logging into the cluster via FTP to upload/download files (e.g., FileZilla, Cyberduck): ssh username@140.232.222.228
6.2 Section 1: Basics of the command line
Today, we will do very basic things on the cluster. First, we will learn about our whereabouts (where our home folder is), then we will learn how to create a folder for the lab within your home folder. Then, we will create a folder for lab 2 and download a file to it. Finally, we will do a similar exercise to the one in class, and count the number of occurrences of a pattern in a text file.
6.2.1 Logging into the cluster.
- Log into your RStudio session through your browser as indicated above
- Go to the Terminal tab in the Command Pane Viewer at the bottom of your screen
WELCOME TO SMAUG!!
6.2.2 Where are we? Who am I?
First thing: you land in the magical realm of the cluster. Now… where are you?
Question 1
- What command would you use to check which working directory you are in?
In addition, in case you forget your username, you can always use the command whoami.
This means two things:
- Our home directory, where all of our files are, is located within the /Smaug_SSD/MBB101/username folder
- Our username, in case we forget it, can always be looked up.
6.2.3 Running basic commands
So, the idea of the command line is to run commands, right? Well, congrats! You have already run two commands!
Question 2
- Which commands have you already run?
So, as you now know, to run a command you just type it and that’s it!
For example, do you want to know what time it is? Use the command date.
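A minimal sketch of what this might look like is shown below. The second command is an optional extra that is not part of the lab: it assumes you only want the day, and uses a format string (the part after the +) to ask for year-month-day. The variable name today is just an illustrative choice.

```shell
# Print the current date and time in the default format
date

# Print only the date as year-month-day; the + introduces a format string
today=$(date +%Y-%m-%d)
echo "Today is $today"
```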
We will learn to use more commands as we use the cluster more and more.
6.2.5 Downloading files into the cluster.
After creating your Lab_2 folder, we need to populate it with files.
I have created a file for us to do a quick and fun exercise today and have uploaded it to GitHub, a public repository of files, at this link: https://raw.githubusercontent.com/Tabima/MBB101/master/Lab_2/text_file.txt
To download the file, run the wget command followed by the link above. (The name wget comes from World Wide Web and get.)
Question 5
- What are the contents of this file? What command from the list above would you use to open or view a file?
Question 6
- How many lines does this file have? What command from the list above would you use to count the number of lines? What are the three outputs from the command?
6.2.6 Extracting patterns from a text file
Finally, let’s do the examples from Monday’s course:
- How many times is the word “Dursley” found in the document?
- How many times is Harry mentioned?
- How was the weather on that Tuesday?
To find and match a pattern, we can use the command grep. (grep is short for g/re/p: globally search for a regular expression and print matching lines.)
So, if we want to find out if the word “Dursley” is mentioned in the document we use the command:
grep Dursley text_file.txt
NOTE: Before you run the command, do you see anything weird?
In this case, we are using two arguments after the grep command: the pattern of interest (Dursley) and the file that we want to search in (text_file.txt).
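To make the "pattern, then file" order concrete, here is a small sketch on a made-up file. The file name grep_sample.txt and its three lines are hypothetical and are not part of the lab; the same idea applies to text_file.txt.

```shell
# Create a small hypothetical sample file (not the lab's text_file.txt)
printf 'the cat sat\nthe dog ran\na cat slept\n' > grep_sample.txt

# grep PATTERN FILE prints every line of FILE that contains PATTERN
matches=$(grep cat grep_sample.txt)
echo "$matches"

# The pattern can also be a regular expression: ^ means "start of the line",
# so this prints only the lines that begin with "the"
starts_with_the=$(grep '^the' grep_sample.txt)
echo "$starts_with_the"
```

Note that grep prints whole lines, not just the matching word, which is worth keeping in mind for the questions that follow.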
To learn how to use these commands, we can google the command or use the man command (short for manual).
Alright, what does the output of grep look like?
Ah cool, we can see if the pattern is present or not!
Question 7
- What happens if you try to answer these questions using grep?
  - How many times is the word “Dursley” found in the document?
  - How many times is Harry mentioned?
  - How was the weather on that Tuesday?
Summarize how grep can help you find these patterns of interest.
Question 8
- Using the man grep command, how can you use grep to count the occurrences of a text string? For example, can grep count the number of times the word Dursley exists in the file? Write the command you would use and the results.
6.3 Section 2: Advanced command line.
6.3.1 Identifying paths
We now know how to download and check files, but we also need to know how to find where files are located, in case we need to use their paths in the future.
For example, what is the path of the text_file.txt we were working on?
Question 9
- Using the readlink -f command, can you indicate the full path of this file?
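As a rough illustration of the difference between a relative and a full (absolute) path, here is a sketch using a made-up folder and file. The names path_demo and notes.txt are hypothetical, and the sketch assumes a Linux system where readlink supports the -f option, as on the cluster.

```shell
# Create a hypothetical folder and an empty file just for this illustration
mkdir -p path_demo
touch path_demo/notes.txt

# A relative path is written starting from the folder you are currently in
relative_path=path_demo/notes.txt
ls "$relative_path"

# readlink -f resolves it to the full (absolute) path, which starts at /
full_path=$(readlink -f "$relative_path")
echo "$full_path"
```

The full path printed will depend on the folder from which you run these commands.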
Question 10
- If you are in your home (~) folder, and you have a file there called file.txt, what is the relative path of this file?
6.3.2 Creating copies of a file into a new folder.
We want to modify text_file.txt to add, remove, replace or count elements in a new version of it.
So, the idea is to create a new folder inside your Lab_2/ folder, named modified/.
Create the folder and copy the text_file.txt file into a new file called modified_file.txt inside the new modified/ folder using the cp command.
Question 11
- Please present the syntax of the command that carries out the instructions above.
Question 12
- What is the absolute path of modified_file.txt?
Let’s check whether it worked by looking at the contents of the modified/ folder.
6.3.3 Replacing patterns using sed
So, if you recall the theory class, we learnt about the sed command to replace patterns.
The idea is that you will replace some patterns in the modified_file.txt file and then check if the results make sense.
- Change the Dursley patterns into Potter and save the file as potter.txt using the sed s/Dursley/Potter/g modified_file.txt > potter.txt command.
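To see what the s/old/new/g expression does on its own, here is a small sketch on a made-up file that prints the result on the screen. The file name sed_sample.txt and its contents are hypothetical and are not part of the lab.

```shell
# Create a small hypothetical sample file (not the lab's modified_file.txt)
printf 'cat and cat\none cat\nno match here\n' > sed_sample.txt

# s/cat/dog/ substitutes "cat" with "dog"; the trailing g applies the
# substitution to every match on a line, not just the first one
replaced=$(sed 's/cat/dog/g' sed_sample.txt)
echo "$replaced"

# Without g, only the first match on each line is replaced
first_only=$(sed 's/cat/dog/' sed_sample.txt)
echo "$first_only"
```

In both cases sed leaves sed_sample.txt itself unchanged and only prints the modified text.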
Question 13
- What does the > in the command mean?
Question 14
- Check the number of occurrences of the word Potter in the potter.txt file using grep.
Question 15
- How many times does the word Potter appear? Is it the same number as in question 7 when you checked the number of times that Harry was mentioned? Why?
Question 17
Count the number of lines of the potter.txt file and compare it to that of the original file.
- Is the number of lines similar between potter.txt and modified_file.txt? Very briefly justify your answer.
- Extract all the lines that say Potter from potter.txt into a new file called potter_lines.txt.
Question 18
- Present the syntax to execute the previous instruction.
Question 19
- What is the number of lines that say Potter in the file? Is the number different from questions 14 and 15? Explain your answer very briefly.
6.3.4 Removing files, renaming files and creating symbolic links
Now that we are done creating files and modifying them, we need to remove all the files we are not using anymore. The rm command will allow us to do that.
- Remove the potter_lines.txt and the modified_file.txt files from the folder.
Question 20
- What are the commands you used to remove the files? How can you make sure the files do not exist anymore?
- Rename the potter.txt file as Potter_story.txt using the mv command, as such: mv potter.txt Potter_story.txt
Question 22
- Explain the syntax of the previous command briefly.
- Go back to your Lab_2/ folder (remember to use the cd .. command) and create a symbolic link to your Potter_story.txt file in the Lab_2/ folder using the ln -s modified/Potter_story.txt . command.
Question 23
- Explain the syntax of the previous command briefly.
Question 24
- Use the ls -l command while in your Lab_2/ folder. What is a weird pattern you see on the new Potter_story.txt file? Is that its absolute or its relative path?