|
| 1 | +--- |
| 2 | +title: Introducing the Shell |
| 3 | +teaching: 20 |
| 4 | +exercises: 10 |
| 5 | +--- |
| 6 | + |
| 7 | +::::::::::::::::::::::::::::::::::::::: objectives |
| 8 | + |
| 9 | +- Describe key reasons for learning shell. |
| 10 | +- Navigate your file system using the command line. |
| 11 | +- Access and read help files for `bash` programs and use help files to identify useful command options. |
| 12 | +- Demonstrate the use of tab completion, and explain its advantages. |
| 13 | + |
| 14 | +:::::::::::::::::::::::::::::::::::::::::::::::::: |
| 15 | + |
| 16 | +:::::::::::::::::::::::::::::::::::::::: questions |
| 17 | + |
| 18 | +- What is a command shell and why would I use one? |
| 19 | +- How can I move around on my computer? |
| 20 | +- How can I see what files and directories I have? |
| 21 | +- How can I specify the location of a file or directory on my computer? |
| 22 | + |
| 23 | +:::::::::::::::::::::::::::::::::::::::::::::::::: |
| 24 | + |
| 25 | +## What is a shell and why should I care? |
| 26 | + |
| 27 | +A *shell* is a computer program that presents a command line interface |
| 28 | +which allows you to control your computer using commands entered |
| 29 | +with a keyboard instead of controlling graphical user interfaces |
| 30 | +(GUIs) with a mouse/keyboard/touchscreen combination. |
| 31 | + |
| 32 | +There are many reasons to learn about the shell: |
| 33 | + |
| 34 | +- Many bioinformatics tools can only be used through a command line interface. Many more |
| 35 | + have features and parameter options which are not available in the GUI. |
| 36 | + BLAST is an example. Many of the advanced functions are only accessible |
| 37 | + to users who know how to use a shell. |
| 38 | +- The shell makes your work less boring. In bioinformatics you often need to repeat tasks with a large number of files. With the shell, you can automate those repetitive tasks, leaving you free to do more exciting things. |
| 39 | +- The shell makes your work less error-prone. When humans do the same thing a hundred different times |
| 40 | + (or even ten times), they're likely to make a mistake. Your computer can do the same thing a thousand times |
| 41 | + with no mistakes. |
| 42 | +- The shell makes your work more reproducible. When you carry out your work in the command-line |
| 43 | + (rather than a GUI), your computer keeps a record of every step that you've carried out, which you can use |
| 44 | + to re-do your work when you need to. It also gives you a way to communicate unambiguously what you've done, |
| 45 | + so that others can inspect or apply your process to new data. |
| 46 | +- Many bioinformatic tasks require large amounts of computing power and can't realistically be run on your |
| 47 | + own machine. These tasks are best performed using remote computers or cloud computing, which can only be accessed |
| 48 | + through a shell. |
| 49 | + |
| 50 | +In this lesson you will learn how to use the command line interface to move around in your file system. |
| 51 | + |
| 52 | +## How to access the shell |
| 53 | + |
| 54 | +On a Mac or Linux machine, you can access a shell through a program called "Terminal", which is already available |
| 55 | +on your computer. The Terminal is a window into which we will type commands. If you're using Windows, |
| 56 | +you'll need to download a separate program to access the shell. |
| 57 | + |
| 58 | +To save time, we are going to be working on a remote server where all the necessary data and software are available. |
| 59 | +When we say a 'remote server', we are talking about a computer that is not the one you are working on right now. |
| 60 | +You will access the Carpentries remote server where everything is prepared for the lesson. |
| 61 | +We will learn the basics of the shell by manipulating some data files. Some of these files are very large |
| 62 | +and would take time to download to your computer. |
| 63 | +We will also be using several bioinformatic packages in later lessons, and installing all of the software |
| 64 | +would take up even more time. A 'ready-to-go' server lets us focus on learning. |
| 65 | + |
| 66 | +## How to access the remote server |
| 67 | + |
| 68 | +You can log in to the remote server using the instructions |
| 69 | +[here](https://datacarpentry.org/cloud-genomics/02-logging-onto-cloud#logging-onto-a-cloud-instance). |
| 70 | +Your instructor will supply you with the `ip_address` and password that you need to log in. |
| 71 | + |
| 72 | +Each of you will have a different `ip_address`. This will |
| 73 | +prevent us from accidentally changing each other's files as we work through the |
| 74 | +exercises. The password will be the same for everyone. |
| 75 | + |
| 76 | +After logging in, you will see a screen showing something like this: |
| 77 | + |
| 78 | +```output |
| 79 | +Welcome to Ubuntu 20.04.5 LTS (GNU/Linux 5.4.0-137-generic x86_64) |
| 80 | +
|
| 81 | + * Documentation: https://help.ubuntu.com |
| 82 | + * Management: https://landscape.canonical.com |
| 83 | + * Support: https://ubuntu.com/advantage |
| 84 | +
|
| 85 | + System information as of Mon 13 Mar 2023 03:57:46 AM UTC |
| 86 | +
|
| 87 | + System load: 0.0 Processes: 192 |
| 88 | + Usage of /: 20.3% of 98.27GB Users logged in: 0 |
| 89 | + Memory usage: 25% IPv4 address for eth0: 172.31.12.214 |
| 90 | + Swap usage: 0% |
| 91 | +
|
| 92 | + Get cloud support with Ubuntu Advantage Cloud Guest: |
| 93 | + http://www.ubuntu.com/business/services/cloud |
| 94 | +
|
| 95 | +178 updates can be applied immediately. |
| 96 | +108 of these updates are standard security updates. |
| 97 | +To see these additional updates run: apt list --upgradable |
| 98 | +
|
| 99 | +
|
| 100 | +Last login: Fri Mar 10 03:14:44 2023 from 72.83.168.14 |
| 101 | +``` |
| 102 | + |
| 103 | +This provides a lot of information about the remote server that you're logging into. We're not going to use most of this information for |
| 104 | +our workshop, so you can clear your screen using the `clear` command. |
| 105 | + |
| 106 | +Type the word `clear` into the terminal and press the `Enter` key. |
| 107 | + |
| 108 | +```bash |
| 109 | +$ clear |
| 110 | +``` |
| 111 | + |
| 112 | +This will scroll your screen down to give you a fresh screen and will make it easier to read. |
| 113 | +You haven't lost any of the information on your screen. If you scroll up, you can see everything that has been output to your screen |
| 114 | +up until this point. |
| 115 | + |
| 116 | +::::::::::::::::::::::::::::::::::::::::: callout |
| 117 | + |
| 118 | +## Tip |
| 119 | + |
| 120 | +Hot-key combinations are shortcuts for performing common commands. |
| 121 | +The hot-key combination for clearing the console is `Ctrl+L`. Feel free to try it and see for yourself. |
| 122 | + |
| 123 | +:::::::::::::::::::::::::::::::::::::::::::::::::: |
| 124 | + |
| 125 | +## Navigating your file system |
| 126 | + |
| 127 | +The part of the operating system that manages files and directories |
| 128 | +is called the **file system**. |
| 129 | +It organizes our data into files, |
| 130 | +which hold information, |
| 131 | +and directories (also called "folders"), |
| 132 | +which hold files or other directories. |
| 133 | + |
| 134 | +Several commands are frequently used to create, inspect, rename, and delete files and directories. |
| 135 | + |
| 136 | +::::::::::::::::::::::::::::::::::::::::: callout |
| 137 | + |
| 138 | +## Preparation Magic |
| 139 | + |
| 140 | +You may have a prompt (the characters to the left of the cursor) that looks different from the `$` sign character used here. |
| 141 | +If you would like to change your prompt to match the example prompt, first type the command: |
| 142 | +`echo $PS1` |
| 143 | +into your shell, followed by pressing the <kbd>Enter</kbd> key. |
| 144 | + |
| 145 | +This will print the bash special characters that are currently defining your prompt. |
| 146 | +To change the prompt to a `$` (followed by a space), enter the command: |
| 147 | +`PS1='$ '` |
| 148 | +Your window should look like our example in this lesson. |
| 149 | + |
| 150 | +To change back to your original prompt, type in the output of the previous command `echo $PS1` (this will be different depending on the |
| 151 | +original configuration) between the quotes in the following command: |
| 152 | +`PS1=""` |
| 153 | + |
| 154 | +For example, if the output of `echo $PS1` was `\u@\h:\w $ `, |
| 155 | +then type those characters between the quotes in the above command: `PS1="\u@\h:\w $ "`. |
| 156 | +Alternatively, you can reset your original prompt by exiting the shell and opening a new session. |
| 157 | + |
| 158 | +This isn't necessary to follow along (in fact, your prompt may have other helpful information you want to know about). This is up to you! |
| 159 | + |
| 160 | +:::::::::::::::::::::::::::::::::::::::::::::::::: |
| 161 | + |
| 162 | +```bash |
| 163 | +$ |
| 164 | +``` |
| 165 | + |
| 166 | +The dollar sign is a **prompt**, which shows us that the shell is waiting for input; |
| 167 | +your shell may use a different character as a prompt and may add information before |
| 168 | +the prompt. When typing commands, either from these lessons or from other sources, |
| 169 | +do not type the prompt, only the commands that follow it. |
| 170 | + |
| 171 | +Let's find out where we are by running a command called `pwd` |
| 172 | +(which stands for "print working directory"). |
| 173 | +At any moment, our **current working directory** |
| 174 | +is our current default directory, |
| 175 | +i.e., |
| 176 | +the directory that the computer assumes we want to run commands in, |
| 177 | +unless we explicitly specify something else. |
| 178 | +Here, |
| 179 | +the computer's response is `/home/dcuser`, |
| 180 | +which is our home directory on the cloud system: |
| 181 | + |
| 182 | +```bash |
| 183 | +$ pwd |
| 184 | +``` |
| 185 | + |
| 186 | +```output |
| 187 | +/home/dcuser |
| 188 | +``` |
| 189 | + |
| 190 | +Let's look at how our file system is organized. We can see what files and subdirectories are in this directory by running `ls`, |
| 191 | +which stands for "listing": |
| 192 | + |
| 193 | +```bash |
| 194 | +$ ls |
| 195 | +``` |
| 196 | + |
| 197 | +```output |
| 198 | +R r_data shell_data |
| 199 | +``` |
| 200 | + |
| 201 | +`ls` prints the names of the files and directories in the current directory in |
| 202 | +alphabetical order, |
| 203 | +arranged neatly into columns. |
| 204 | +We'll be working within the `shell_data` subdirectory, and creating new subdirectories, throughout this workshop. |
| 205 | + |
| 206 | +The command to change locations in our file system is `cd`, followed by a |
| 207 | +directory name to change our working directory. |
| 208 | +`cd` stands for "change directory". |
| 209 | + |
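
As a minimal sketch of how `cd` and `pwd` work together, the commands below build a throwaway directory and move into it. The `scratch` variable and the temporary location are assumptions made for illustration (on the lesson server you would simply start from `/home/dcuser`), and the `$` prompt is omitted so the block can be pasted as a script.

```bash
# Hypothetical scratch area so the sketch does not depend on the lesson data.
scratch=$(mktemp -d)
mkdir "$scratch/shell_data"

cd "$scratch"     # make the scratch directory the working directory
pwd               # prints the path of the scratch directory
cd shell_data     # move into the subdirectory by name
pwd               # the path now ends in /shell_data
```

The second `pwd` prints a path ending in `/shell_data`, confirming that the working directory changed.
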
| 210 | +Let's say we want to navigate to the `shell_data` directory we saw above. We can |
| 211 | +use the following command to get there: |
| 212 | + |
| 213 | +```bash |
| 214 | +$ cd shell_data |
| 215 | +``` |
| 216 | + |
| 217 | +Let's look at what is in this directory: |
| 218 | + |
| 219 | +```bash |
| 220 | +$ ls |
| 221 | +``` |
| 222 | + |
| 223 | +```output |
| 224 | +sra_metadata untrimmed_fastq |
| 225 | +``` |
| 226 | + |
| 227 | +We can make the `ls` output more comprehensible by using the **flag** `-F`, |
| 228 | +which tells `ls` to add a trailing `/` to the names of directories: |
| 229 | + |
| 230 | +```bash |
| 231 | +$ ls -F |
| 232 | +``` |
| 233 | + |
| 234 | +```output |
| 235 | +sra_metadata/ untrimmed_fastq/ |
| 236 | +``` |
| 237 | + |
| 238 | +Anything with a "/" after it is a directory. Things with a "\*" after them are programs. If |
| 239 | +there are no decorations, it's a file. |
| 240 | + |
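
To see all three cases side by side, here is a small sketch that creates one directory, one plain file, and one executable file in a scratch location and lists them with `-F`. The names `results`, `notes.txt`, and `run_me` are hypothetical and are not part of the lesson data.

```bash
# Hypothetical scratch area; nothing here touches the lesson files.
scratch=$(mktemp -d)
cd "$scratch"

mkdir results       # a directory
touch notes.txt     # a plain file
touch run_me        # an empty file that we mark as executable below
chmod +x run_me

ls -F               # list the three entries with their type markers
```

In the listing, `results/` carries a trailing slash, `run_me*` carries an asterisk, and `notes.txt` is undecorated.
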
| 241 | +`ls` has lots of other options. To find out what they are, we can type: |
| 242 | + |
| 243 | +```bash |
| 244 | +$ man ls |
| 245 | +``` |
| 246 | + |
| 247 | +`man` (short for manual) displays detailed documentation (also referred to as a man page or man file) |
| 248 | +for `bash` commands. It is a powerful resource for exploring `bash` commands and understanding |
| 249 | +their usage and flags. Some manual files are very long. You can scroll through the |
| 250 | +file using your keyboard's down arrow or use the <kbd>Space</kbd> key to go forward one page |
| 251 | +and the <kbd>b</kbd> key to go backwards one page. When you are done reading, hit <kbd>q</kbd> |
| 252 | +to quit. |
| 253 | + |
| 254 | +::::::::::::::::::::::::::::::::::::::: challenge |
| 255 | + |
| 256 | +## Challenge |
| 257 | + |
| 258 | +Use the `-l` option for the `ls` command to display more information for each item |
| 259 | +in the directory. What is one piece of additional information this long format |
| 260 | +gives you that you don't see with the bare `ls` command? |
| 261 | + |
| 262 | +::::::::::::::: solution |
| 263 | + |
| 264 | +## Solution |
| 265 | + |
| 266 | +```bash |
| 267 | +$ ls -l |
| 268 | +``` |
| 269 | + |
| 270 | +```output |
| 271 | +total 8 |
| 272 | +drwxr-x--- 2 dcuser dcuser 4096 Jul 30 2015 sra_metadata |
| 273 | +drwxr-xr-x 2 dcuser dcuser 4096 Nov 15 2017 untrimmed_fastq |
| 274 | +``` |
| 275 | + |
| 276 | +The additional information given includes the name of the owner of the file, |
| 277 | +when the file was last modified, and whether the current user has permission |
| 278 | +to read and write to the file. |
| 279 | + |
| 280 | +::::::::::::::::::::::::: |
| 281 | + |
| 282 | +:::::::::::::::::::::::::::::::::::::::::::::::::: |
| 283 | + |
| 284 | +No one can possibly learn all of these arguments; that's what the manual page |
| 285 | +is for. You can (and should) refer to the manual page or other help files |
| 286 | +as needed. |
| 287 | + |
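
One convention worth knowing when combining options: for `ls`, single-letter flags can be given separately or bundled behind one `-`. The sketch below compares the two spellings in a hypothetical scratch directory whose two empty subdirectories are stand-ins named after the lesson's folders.

```bash
# Hypothetical scratch area with two empty stand-in directories.
scratch=$(mktemp -d)
mkdir "$scratch/sra_metadata" "$scratch/untrimmed_fastq"
cd "$scratch"

separate=$(ls -l -F)   # long format plus type markers, flags given one by one
bundled=$(ls -lF)      # the same two flags bundled behind a single dash

# Report whether the two spellings produced identical listings.
if [ "$separate" = "$bundled" ]; then
    echo "same output"
else
    echo "different output"
fi
```

Both spellings should produce the same long listing, so the choice between them is a matter of typing convenience.
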
| 288 | +Let's go into the `untrimmed_fastq` directory and see what is in there. |
| 289 | + |
| 290 | +```bash |
| 291 | +$ cd untrimmed_fastq |
| 292 | +$ ls -F |
| 293 | +``` |
| 294 | + |
| 295 | +```output |
| 296 | +SRR097977.fastq SRR098026.fastq |
| 297 | +``` |
| 298 | + |
| 299 | +This directory contains two files with `.fastq` extensions. FASTQ is a format |
| 300 | +for storing information about sequencing reads and their quality. |
| 301 | +We will be learning more about FASTQ files in a later lesson. |
| 302 | + |
| 303 | +### Shortcut: Tab Completion |
| 304 | + |
| 305 | +Typing out file or directory names can waste a |
| 306 | +lot of time and it's easy to make typing mistakes. Instead we can use tab complete |
| 307 | +as a shortcut. When you start typing out the name of a directory or file, then |
| 308 | +hit the <kbd>Tab</kbd> key, the shell will try to fill in the rest of the |
| 309 | +directory or file name. |
| 310 | + |
| 311 | +Return to your home directory: |
| 312 | + |
| 313 | +```bash |
| 314 | +$ cd |
| 315 | +``` |
| 316 | + |
| 317 | +then enter: |
| 318 | + |
| 319 | +```bash |
| 320 | +$ cd she<tab> |
| 321 | +``` |
| 322 | + |
| 323 | +The shell will fill in the rest of the directory name for |
| 324 | +`shell_data`. |
| 325 | + |
| 326 | +Now change directories to `untrimmed_fastq` in `shell_data`: |
| 327 | + |
| 328 | +```bash |
| 329 | +$ cd shell_data |
| 330 | +$ cd untrimmed_fastq |
| 331 | +``` |
| 332 | + |
| 333 | +Using tab complete can be very helpful. However, it will only autocomplete |
| 334 | +a file or directory name if you've typed enough characters to provide |
| 335 | +a unique identifier for the file or directory you are trying to access. |
| 336 | + |
| 337 | +For example, if we now try to list the files whose names start with `SR` |
| 338 | +by using tab complete: |
| 339 | + |
| 340 | +```bash |
| 341 | +$ ls SR<tab> |
| 342 | +``` |
| 343 | + |
| 344 | +The shell auto-completes your command to `SRR09`, because all file names in |
| 345 | +the directory begin with this prefix. When you hit |
| 346 | +<kbd>Tab</kbd> again, the shell will list the possible choices. |
| 347 | + |
| 348 | +```bash |
| 349 | +$ ls SRR09<tab><tab> |
| 350 | +``` |
| 351 | + |
| 352 | +```output |
| 353 | +SRR097977.fastq SRR098026.fastq |
| 354 | +``` |
| 355 | + |
| 356 | +Tab completion can also fill in the names of programs, which can be useful if you |
| 357 | +remember the beginning of a program name. |
| 358 | + |
| 359 | +```bash |
| 360 | +$ pw<tab><tab> |
| 361 | +``` |
| 362 | + |
| 363 | +```output |
| 364 | +pwck pwconv pwd pwdx pwunconv |
| 365 | +``` |
| 366 | + |
| 367 | +This displays the name of every program that starts with `pw`. |
| 368 | + |
| 369 | +## Summary |
| 370 | + |
| 371 | +We now know how to move around our file system using the command line. |
| 372 | +This gives us an advantage over interacting with the file system through |
| 373 | +a GUI: it allows us to work on a remote server, carry out the same set of operations |
| 374 | +on a large number of files quickly, and use |
| 375 | +bioinformatic software that is only available in command line versions. |
| 376 | + |
| 377 | +In the next few episodes, we'll be expanding on these skills and seeing how |
| 378 | +using the command line shell enables us to make our workflow more efficient and reproducible. |
| 379 | + |
| 380 | +:::::::::::::::::::::::::::::::::::::::: keypoints |
| 381 | + |
| 382 | +- The shell gives you the ability to work more efficiently by using keyboard commands rather than a GUI. |
| 383 | +- Useful commands for navigating your file system include: `ls`, `pwd`, and `cd`. |
| 384 | +- Most commands take options (flags) which begin with a `-`. |
| 385 | +- Tab completion can reduce errors from mistyping and make work more efficient in the shell. |
| 386 | + |
| 387 | +:::::::::::::::::::::::::::::::::::::::::::::::::: |
| 388 | + |
| 389 | + |