Commit 0dea6db

committed
chapter 7 edits
1 parent 3acc775 commit 0dea6db

1 file changed

+20
-22
lines changed

07-durable-code.Rmd

Lines changed: 20 additions & 22 deletions
@@ -17,12 +17,12 @@ ottrpal::include_slide("https://docs.google.com/presentation/d/1LMurysUhCjZb7DVF
1717
<img src="resources/images/iterative.png" width="12%">
1818

1919
Getting your code to work the first time is the first step, but don't stop there!
20-
Just like in writing a manuscript you wouldn't consider your first draft a final draft, your polishing code works best in an iterative manner. Although you may need to set it aside for the day to give your brain a rest, return to your code later with fresh eyes and try to look for ways to improve upon it!
20+
Just like in writing a manuscript you wouldn't consider your first draft a final draft, your code polishing works best in an iterative manner. Although you may need to set it aside for the day to give your brain a rest, return to your code later with fresh eyes and try to look for ways to improve upon it!
2121

2222
#### Prioritize readability over cleverness
2323
<img src="resources/images/readable.png" width="12%">
2424

25-
Some cleverness in code can be helpful, too much can make it difficult for others (including your future self!) to understand. If cleverness comprises the readability of your code, it probably is not worth it. Clever but unreadable code won't be re-used or trusted by others (AGAIN, including your future self!).
25+
Some cleverness in code can be helpful, too much can make it difficult for others (including your future self!) to understand. If cleverness compromises the readability of your code, it probably is not worth it. Clever but unreadable code won't be re-used or trusted by others (AGAIN, this includes your future self!).
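
As a small illustration of this trade-off, here is a hypothetical Python sketch (our own example, not one taken from the cited sources). The function names and the sentence are made up. Both functions count how often each word appears, but the second spells out each step:

```python
# Hypothetical example: count how often each word appears in a sentence.

# "Clever" version: compact, but the reader has to unpack it all at once.
def word_counts_clever(text):
    return {w: text.lower().split().count(w) for w in set(text.lower().split())}


# Readable version: the same result, built one step at a time.
def word_counts_readable(text):
    words = text.lower().split()
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


sentence = "the cat saw the other cat"
# Both versions agree, so the comparison prints True.
print(word_counts_clever(sentence) == word_counts_readable(sentence))
```

The one-liner is shorter, but a reviewer is likely to need more time to confirm what it does, which is the cost this section is warning about.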
2626

2727
What does readable code look like? @Orosz2019 has some thoughts on writing readable code:
2828

@@ -34,7 +34,7 @@ What does readable code look like? @Orosz2019 has some thoughts on writing reada
3434
3535
> **The real test of readable code is others reading it.** So get feedback from others, via code reviews. Ask people to share feedback on how clear the code is. Encourage people to ask questions if something does not make sense. Code reviews - especially thorough code reviews - are the best way to get feedback on how good and readable your code is.
3636
>
37-
> Readable code will attract little to no clarifying questions, and reviewers won't misunderstand it. So pay careful attention to the cases when you realize someone misunderstood the intent of what you wrote or asked a clarifying question. Every question or misunderstanding hints to opportunities to make the code more readable.
37+
> Readable code will attract little to no clarifying questions, and reviewers won't misunderstand it. So pay careful attention to the cases when you realize someone misunderstood the intent of what you wrote or asked a clarifying question. Every question or misunderstanding hints at opportunities to make the code more readable.
3838
>
3939
> A good way to get more feedback on the clarity of your code is to ask for feedback from someone who is not an expert on the codebase you are working on. Ask specifically for feedback on how easy to read your code is. Because this developer is not an expert on the codebase, they'll focus on how much they can follow your code. Most of the comments they make will be about your code's readability.
4040
@@ -60,7 +60,7 @@ If you find yourself writing something more than once, you might want to write a
6060
DRY code is easier on the reviewer because they don't have to review the same thing twice, but also because they don't have to review the same thing twice. ;)
6161
DRYing code is something that takes some iterative passes and edits through your code, but in the end DRY code saves you and your collaborators time and can be something you reuse again in a future project!
6262

63-
Here's an slightly modified example from @Bernardo2021 for what DRY vs non-DRY code might look like:
63+
Here's a slightly modified example from @Bernardo2021 for what DRY vs non-DRY code might look like:
6464

6565
```
6666
paste('Hello','John', 'welcome to this course')
@@ -109,9 +109,9 @@ ottrpal::include_slide("https://docs.google.com/presentation/d/1LMurysUhCjZb7DVF
109109

110110
<details> <summary> *Why do you need to refresh your kernel/session?* </summary>
111111

112-
As a quick example of why refreshing your kernel/session, let's suppose you are troubleshooting something that centers around an object named `some_obj` but then you rename this object to `iris_df`. When you rename this object you may need to update this other places in the code. If you don't refresh your environment while working on your code, `some_obj` will still be in your environment. This will make it more difficult for you to find where else the code needs to be updated.
112+
As a quick example of why refreshing your kernel/session helps, let's suppose you are troubleshooting something that centers around an object named `some_obj` but then you rename this object to `iris_df`. When you rename this object you may need to update this other places in the code. If you don't refresh your environment while working on your code, `some_obj` will still be in your environment. This will make it more difficult for you to find where else the code needs to be updated.
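
The renaming scenario can be simulated in a few lines of Python. This is only a sketch under stated assumptions: each dictionary stands in for the namespace of one kernel/session, the object names `some_obj` and `iris_df` come from the text above, and the data values are made up.

```python
# A minimal sketch: each dict stands in for the namespace of one kernel/session.

old_session = {}
# Earlier in the work session, the object was created under its old name.
exec("some_obj = [5.1, 4.9, 4.7]", old_session)

# After the rename, the code creates `iris_df`, but one reference was missed.
renamed_code = "iris_df = [5.1, 4.9, 4.7]\ntotal = sum(some_obj)"

# In the old session the leftover `some_obj` hides the mistake.
exec(renamed_code, old_session)
print("Stale session ran without error:", "total" in old_session)

# In a fresh session the missed rename is reported right away.
fresh_session = {}
try:
    exec(renamed_code, fresh_session)
except NameError as error:
    print("Fresh session caught the missed rename:", error)
```

In the stale session the code appears to work, so the leftover reference goes unnoticed until someone else runs it from scratch.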
113113

114-
Refreshing your kernel/session goes beyond objects defined in your environment, and also can affect packages and dependencies loaded or all kinds of other things attached to your kernel/session.
114+
Refreshing your kernel/session goes beyond objects defined in your environment, and also can affect the packages and dependencies loaded, or all kinds of other things attached to your kernel/session.
115115

116116
As a quick experiment, try this in your Python or R environment:
117117

@@ -185,7 +185,7 @@ Try to avoid using variable names that have no meaning like `tmp` or `x`, or `i`
185185
> 2 Use consistent notation for naming convention.
186186
> 3 Use standard terms.
187187
> 4 Do not number a variable name.
188-
> 5 When you find another way to name variable, refactor as fast as possible.
188+
> 5 When you find another way to name a variable, refactor as fast as possible.
189189
190190
[@Hobert2018]
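
To make these naming points concrete, here is a hypothetical Python sketch. The values and variable names are invented for illustration. The same mean is calculated twice, first with names that carry no meaning and then with names that describe their contents:

```python
# Hypothetical example: the same calculation with uninformative and informative names.

# Hard to follow: what do `x`, `tmp`, `i`, and the numbered `tmp2` hold?
x = [12.0, 15.5, 9.5]
tmp = 0
for i in x:
    tmp = tmp + i
tmp2 = tmp / len(x)

# Easier to follow: the names say what each value holds.
gene_expression_values = [12.0, 15.5, 9.5]
mean_expression = sum(gene_expression_values) / len(gene_expression_values)

# Both approaches give the same number; only the readability differs.
print(tmp2 == mean_expression)
```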
191191

@@ -198,9 +198,9 @@ Try to avoid using variable names that have no meaning like `tmp` or `x`, or `i`
198198
#### Follow a code style
199199
<img src="resources/images/style.png" width="12%">
200200

201-
Just like when writing doesN"t FoLLOW conv3nTi0Ns OR_sPAcinng 0r sp3llinG, it can be distracting, the same goes for code. Your code may even work all the same, just like you understood what I wrote in that last sentence, but a lack of consistent style can make require more brain power from your readers for them to understand. For reproducibility purposes, readability is important! The easier you can make it on your readers, the more likely they will be able to understand and reproduce the results.
201+
Just like when writing doesN"t FoLLOW conv3nTi0Ns OR_sPAcinng 0r sp3llinG, it can be distracting, the same goes for code. Your code may run correctly, just like you understood what I wrote in that last sentence, but a lack of consistent style can require more brain power from your readers for them to understand. For reproducibility purposes, readability is important! The easier you can make it on your readers, the more likely they will be able to understand and reproduce the results.
202202

203-
There are different style guides out there that people adhere to. It doesn't matter so much which one you choose, so much that you pick one and stick to it for a particular project.
203+
There are different style guides out there that people adhere to. It doesn't matter which one you choose, as long as you pick one and stick to it for a particular project.
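
As a hypothetical Python illustration (the function names are made up), the two functions below behave identically. The first mixes spacing, capitalization, and layout, while the second sticks to one convention, here roughly following PEP 8:

```python
# Hypothetical example: both functions behave the same; only the style differs.

# Inconsistent spacing, naming, and layout.
def CalcMean( vals ):
    Total=0
    for v in vals: Total+=v
    return Total/len( vals )


# Consistent style (roughly following PEP 8).
def calculate_mean(values):
    total = 0
    for value in values:
        total += value
    return total / len(values)


# The results match, so the comparison prints True.
print(CalcMean([2, 4, 6]) == calculate_mean([2, 4, 6]))
```

Python runs both versions without complaint, so the benefit of the second version goes entirely to the people reading it.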
204204

205205
_Python style guides_:
206206

@@ -212,7 +212,7 @@ _R style guides_:
212212
- [Hadley Wickham's Style guide](http://adv-r.had.co.nz/Style.html) @Wickham.
213213
- [Google R style guide](https://google.github.io/styleguide/Rguide.html) @GoogleR.
214214

215-
Although writing code following a style as you are writing is a good practice, we're all human and that can be tricky to do, so we recommend using an automatic styler on your code to fix up your code for you.
215+
Although writing code that follows a style is a good practice, we're all human and it can be tricky to do, so we recommend using an automatic styler on your code to fix up your code for you.
216216
For Python code, you can use [python black](https://black.readthedocs.io/en/stable/) and for R, [styler](https://www.tidyverse.org/blog/2017/12/styler-1.0.0/).
217217

218218
#### Organize the structure of your code
@@ -358,9 +358,9 @@ For example, for this `make-heatmap` notebook we want to:
358358

359359
**The exercise: Polishing code**
360360

361-
1. Start up JupyterLab with running `jupyter lab` from your command line.
361+
1. Start up JupyterLab by running `jupyter lab` from your command line.
362362
2. Activate your conda environment using `conda activate reproducible-python`.
363-
3. Open up your notebook you made in the previous chapter `make-heatmap.ipynb`
363+
3. Open up the notebook you made in the previous chapter `make-heatmap.ipynb`
364364
4. Work on organizing the code chunks and adding documentation to reflect the steps we've laid out in the [previous section](#organize-the-big-picture-of-the-code), you may want to work on this iteratively as we dive into the code.
365365
5. As you clean up the code, you should run and re-run chunks to see if they work as you expect. You will also want to refresh your environment to help you develop the code (sometimes older objectives stuck in your environment can inhibit your ability to troubleshoot). In Jupyter, you refresh your environment by using the `refresh` icon in the toolbar or by going to `Restart Kernel`.
366366

@@ -489,7 +489,7 @@ _More reading on the tidyverse:_
489489
2. Open up the notebook you created in the previous chapter.
490490
3. Now we'll work on applying the principles from this chapter to the code. We'll cover some of the points here, but then we encourage you to dig into the fully transformed notebook we will link at the end of this section.
491491
4. Work on organizing the code chunks and adding documentation to reflect the steps we've laid out in the [previous section](#organize-the-big-picture-of-the-code), you may want to work on this iteratively as we dive into the code.
492-
5. As you clean up the code, you should run and re-run chunks to see if they work as you expect. You will also want to refresh your environment to help you develop the code (sometimes older objectives stuck in your environment can inhibit your ability to troubleshoot). In RStudio, you refresh your environment by going to the `Run` menu and using `Restart R and refresh clear output`.
492+
5. As you clean up the code, you should run and re-run chunks to see if they work as you expect. You will also want to refresh your environment to help you develop the code (sometimes older objects stuck in your environment can inhibit your ability to troubleshoot). In RStudio, you refresh your environment by going to the `Run` menu and using `Restart R and Clear Output`.
493493

494494
***
495495

@@ -510,7 +510,7 @@ set.seed(1234)
510510
**Get rid of setwd**
511511

512512
_Rationale:_
513-
`setwd()` almost never work for anyone besides the one person who wrote it. And in a few days/weeks it may not work for them either.
513+
`setwd()` almost never works for anyone besides the one person who wrote it. And in a few days/weeks it may not work for them either.
514514

515515
_Before:_
516516
```
@@ -526,9 +526,7 @@ _Related readings:_
526526
**Give the variables more informative names**
527527

528528
_Rationale:_
529-
`xx` doesn't tell us what is in the data here. Also by using the `readr::read_tsv()` from tidyverse we'll get a cleaner, faster read and won't have to specify `sep` argument. Note we are also fixing some spacing and using `<-` so that we can stick to readability conventions.
530-
531-
You'll notice later
529+
`xx` doesn't tell us what is in the data here. Also by using the `readr::read_tsv()` from tidyverse we'll get a cleaner, faster read and won't have to specify the `sep` argument. Note we are also fixing some spacing and using `<-` so that we can stick to readability conventions.
532530

533531
_Before:_
534532
```
@@ -551,10 +549,10 @@ What is happening with df1 and df2? What's being filtered out? etc.
551549
Code comments would certainly help understanding, but even better, we can DRY this code up and make the code clearer on its own.
552550

553551
_Before:_
554-
It may be difficult to tell from looking at the before code because there are no comments and it's a bit tricky to read, but the goal of this is to:
552+
It may be difficult to tell from looking at the before code because there are no comments and it's a bit tricky to read, but the goal of this code is to:
555553

556-
1) Calculate variances for each row (each row is a gene).
557-
2) Filter the original gene expression matrix to only genes have a bigger variance (here we use arbitrarily 10 as a filter cutoff).
554+
1) Calculate the variance for each row (each row is a gene with expression values from a number of samples).
555+
2) Filter the original gene expression matrix to only genes that have a bigger variance (here we arbitrarily use 10 as a filter cutoff).
558556

559557
```
560558
df=read.csv("SRP070849.tsv", sep="\t")
@@ -575,10 +573,10 @@ Let's see how we can do this in a DRY'er and clearer way.
575573
We can:
576574
1) Add comments to describe our goals.
577575
2) Use variable names that are more informative.
578-
3) Use the apply functions to do the loop for us -- this will eliminate the need for unclear variable `i` as well.
576+
3) Use the apply functions to do the loop for us -- this will eliminate the need for the unclear variable `i` as well.
579577
4) Use the tidyverse to do the filtering for us so we don't have to rename data frames or store extra versions of `df`.
580578

581-
Here's what the above might look like after some refactoring. Hopefully you find this is easier to follow and total there's less lines of code (but also has comments too!).
579+
Here's what the above might look like after some refactoring. Hopefully you find this is easier to follow and there's less total lines of code (but it also has comments now too!).
582580
```
583581
# Read in data TSV file
584582
expression_df <- readr::read_tsv(data_file) %>%
