Chapter 6 Introduction to the Command Line
Objectives:
- To start using the command line for basic coding
- To learn how to navigate in the command line
- To create our first lines of code to process text files
The UNIX command line is a program that interprets commands, given as text instructions, and has the computer execute them.
This means that you can tell the computer to list files, print text on the screen, or do a mathematical operation using a set of commands, and the computer will execute them and provide you with an answer.
Before we start using the command line, I have put together a list of the most commonly used UNIX commands for your future reference.
6.1 Commonly used commands
6.1.1 Logging in
- Logging into the cluster: 140.232.222.14 in your browser
- Logging into the cluster via FTP to upload/download files (e.g. FileZilla, Cyberduck):
ssh username@140.232.222.14
6.2 Section 1: Basics of the command line
Today, we will do very basic things on the cluster. First, we will find out where we are (where our home folder is). Then, we will create a folder for lab 2 within the home folder and download a file to it. Finally, we will do an exercise similar to the one in class and count the number of occurrences of a word in a text file.
6.2.1 Logging into the cluster.
- Log into your RStudio session through your browser as indicated above
- Go to the Terminal tab in the Command Pane Viewer at the bottom of your screen
WELCOME TO SMAUG!!
6.2.2 Where are we? Who am I?
First thing: you have landed in the magical realm of the cluster. Now… where are you?
Question 1
- What command would you use to check which working directory you are in?
In addition, in case you forget what your username is, you can always use the command whoami:
This means two things:
- Our home directory, where all of our files are, is located within the /Smaug_SSD/MBB101/username folder
- Our username, in case we forget it, can always be looked up.
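As a small sketch of how to look both of these up from the terminal: whoami prints your username, and the HOME environment variable holds the path of your home directory. HOME is not mentioned above, so treat its use here as an assumption; it is normally set by default on UNIX systems. The output will be different for each user.

```shell
# Print the username of the account you are logged in as
whoami

# Print the path of your home directory
# (HOME is a standard environment variable; using it here is our assumption)
echo "$HOME"
```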
6.2.3 Running basic commands
So, the idea of the command line is to run commands, right? Well, congrats! You have already run two commands!
Question 2
- Which commands have you already run?
So, as you now know, to run a command you just type it and press Enter, and that’s it!
For example: you want to know what time it is? Use the command date:
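A minimal sketch of what that looks like in the terminal. The output depends on when you run it, so none is shown here. The second command, with a +FORMAT argument, is an optional extra added for illustration and is not part of the lab.

```shell
# Print the current date and time (the output changes every time you run it)
date

# Optional: ask for a specific format, here year-month-day
# (the +FORMAT argument is our addition for illustration)
date +%Y-%m-%d
```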
We will learn to use more commands as we use the cluster more and more.
6.2.5 Downloading files into the cluster.
After creating your Lab_2 folder, we need to populate it with files.
I have created a file for us to do a quick and fun exercise today and have uploaded it to a public repository on GitHub, at this link: https://raw.githubusercontent.com/Tabima/MBB101/master/Lab_2/text_file.txt.
To download the file, use the wget command (the name wget comes from World Wide Web and get):
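A sketch of the download step, assuming you are already inside your Lab_2 folder and that the cluster can reach the internet. The -q flag (quiet) is our addition and only hides the progress report; the variable names url and status are hypothetical and are used here just to report whether the download worked.

```shell
# Address of the file, copied from the link above
url="https://raw.githubusercontent.com/Tabima/MBB101/master/Lab_2/text_file.txt"

# Download the file into the current folder, keeping its name (text_file.txt)
if wget -q "$url"; then
    status="downloaded"
else
    status="failed"   # e.g. no internet connection, or wget is not installed
fi
echo "Download $status"
```

In day-to-day use, typing wget followed by the link is enough; the if/else wrapper is only there to make a failed download obvious.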
Question 5
- What are the contents of this file? What command from the list above would you use to open or view a file?
Question 6
- How many lines does this file have? What command from the list above would you use to count the number of lines? What are the three outputs from the command?
6.2.6 Extracting patterns from a text file
Finally, let’s do the examples from Monday’s course:
- How many times is the word “Dursley” found in the document?
- How many times is Harry mentioned?
- How was the weather on that Tuesday?
To find and match a pattern, we can use the command grep. (grep is short for g/re/p: globally search for a regular expression and print matching lines)
So, if we want to find out if the word “Dursley” is mentioned in the document we use the command:
grep Dursley text_file.txt
NOTE: Before you run the command, do you see anything weird?
In this case, we are using two more arguments after the
grep command: The pattern of interest we want (Dursley) and
the file that we want to search in (text_file.txt).
To learn how to use these commands, we can google the command or use
the man command (short for manual).
Alright, what does the output of grep look like?
Ah cool, we can see if the pattern is present or not!
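To make this behavior concrete without spoiling the exercise, here is a minimal sketch on a tiny made-up file (sample.txt and its three lines are hypothetical, not part of the lab data). grep prints every line that contains the pattern and prints nothing when no line matches.

```shell
# Work in a temporary folder so nothing in Lab_2 is touched
cd "$(mktemp -d)"

# Create a small three-line file to search in
printf 'the cat sat\nthe dog ran\nanother cat\n' > sample.txt

# Print every line that contains the pattern "cat"
matches=$(grep cat sample.txt)
echo "$matches"

# A pattern that is not in the file prints nothing,
# so we print our own message instead
grep bird sample.txt || echo "no lines matched"
```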
Question 7
- What happens if you try and answer these questions using grep?
  - How many times is the word “Dursley” found in the document?
  - How many times is Harry mentioned?
  - How was the weather on that Tuesday?
Summarize how grep can help you with finding these patterns of interest.
Question 8
- Using the man grep command, how can you use grep to count the occurrences of a text string? For example, can grep count the number of times the word Dursley appears in the file? Write the command you would use and the results.
6.3 Section 2: Advanced command line.
6.3.1 Identifying paths
We know now how to download and check files, but we also need to remember how to look at where the files are in case we need to use paths in the future.
For example, what is the path of the text_file.txt we were working on?
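As a hedged illustration on a made-up folder (notes/ and todo.txt are hypothetical names), the sketch below contrasts a relative path, which only works from the folder you are currently in, with the absolute path reported by readlink -f, which starts at the root folder /.

```shell
# Work in a temporary folder so nothing in Lab_2 is touched
cd "$(mktemp -d)"

# Create a folder with an empty file inside it
mkdir notes
touch notes/todo.txt

# Relative path: only valid from the current folder
ls notes/todo.txt

# Absolute path: valid from anywhere, always starts with /
full_path=$(readlink -f notes/todo.txt)
echo "$full_path"
```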
Question 9
- Using the readlink -f command, can you indicate what the full path of this file is?
Question 10
- If you are in your home (~) folder, and you have a file there called file.txt, what is the relative path of this file?
6.3.2 Creating copies of a file into a new folder.
We want to modify the text_file.txt to add, remove, replace or count elements in a new version of it.
So, the idea is to create a new folder inside your Lab_2/ folder, named modified/.
Create the folder, then use the cp command to copy the text_file.txt file into a new file called modified_file.txt inside modified/.
Question 11
- Please present the syntax of the commands that carry out the instructions above.
Question 12
- What is the absolute path of modified_file.txt?
Let’s check if it worked:
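One way to check is to list the contents of the new folder with ls. The sketch below shows the general create, copy, and check pattern on made-up names (report.txt, backup/ and report_copy.txt are hypothetical), so you will need to adapt it to the files of this lab.

```shell
# Work in a temporary folder so nothing in Lab_2 is touched
cd "$(mktemp -d)"
echo "some text" > report.txt

# Create a new folder, then copy the file into it under a new name
mkdir backup
cp report.txt backup/report_copy.txt

# Check: the copy is listed in the new folder,
# and the original is still in the current folder
ls backup
ls
```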
6.3.3 Replacing patterns using sed
So, if you recall the theory class, we learnt about the sed command to replace patterns.
The idea is that you will replace some patterns in the modified_file.txt file and then check if the results make sense.
- Change the Dursley patterns into Potter and save the file as potter.txt using the sed s/Dursley/Potter/g modified_file.txt > potter.txt command.
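To see what a substitution does before running it on the lab file, here is a minimal sketch on a made-up file (pets.txt and pets_new.txt are hypothetical). In s/cat/dog/g, s stands for substitute, cat is the pattern to find, dog is the replacement, and the trailing g applies the change to every match on a line rather than only the first. The quotes around the expression are our addition; they are a common precaution so that the shell passes it to sed unchanged.

```shell
# Work in a temporary folder so nothing in Lab_2 is touched
cd "$(mktemp -d)"
printf 'my cat likes your cat\nno pets here\n' > pets.txt

# Replace every "cat" with "dog" and save the result as a new file
sed 's/cat/dog/g' pets.txt > pets_new.txt

# The original file is unchanged; the new file holds the edited text
cat pets.txt
cat pets_new.txt
```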
Question 13
- What does the > in the command mean?
Question 14
- Check the number of occurrences of the word Potter in the potter.txt file using grep.
Question 15
- How many times does the word Potter appear? Is it the same number as in question 7 when you checked the number of times that Harry was mentioned? Why?
Question 17
Count the number of lines of the potter.txt file and compare them to the original file.
- Is the number of lines similar between potter.txt and modified_file.txt? Very briefly justify your answer.
- Extract all the lines that say Potter from potter.txt into a new file called potter_lines.txt.
Question 18
- Present the syntax to execute the previous instruction.
Question 19
- What is the number of lines that say Potter in the file? Is the number different from questions 14 and 15? Explain your answer very briefly.
6.3.4 Removing files, renaming files and creating symbolic links
Now that we are done with creating files and modifying them, we need to remove all the files we are not using anymore. The rm command will allow us to do that.
- Remove the potter_lines.txt and the modified_file.txt files from the folder.
Question 20
- What are the commands you used to remove the files? How can you make sure the files do not exist anymore?
- Rename the potter.txt file as Potter_story.txt using the mv command as such: mv potter.txt Potter_story.txt
Question 22
- Explain the syntax of the previous command briefly.
- Go back to your Lab_2/ folder (remember to use the cd .. command) and create a symbolic link of your Potter_story.txt file into the Lab_2/ folder using the ln -s modified/Potter_story.txt . command.
Question 23
- Explain the syntax of the previous command briefly.
Question 24
- Use the ls -l command while in your Lab_2/ folder. What is a weird pattern you see on the new Potter_story.txt file? Is that its absolute or its relative path?