Getting your code to work the first time is the first step, but don't stop there!
20
-
Just like in writing a manuscript you wouldn't consider your first draft a final draft, your polishing code works best in an iterative manner. Although you may need to set it aside for the day to give your brain a rest, return to your code later with fresh eyes and try to look for ways to improve upon it!
20
+
Just like in writing a manuscript you wouldn't consider your first draft a final draft, your code polishing works best in an iterative manner. Although you may need to set it aside for the day to give your brain a rest, return to your code later with fresh eyes and try to look for ways to improve upon it!
Some cleverness in code can be helpful, too much can make it difficult for others (including your future self!) to understand. If cleverness comprises the readability of your code, it probably is not worth it. Clever but unreadable code won't be re-used or trusted by others (AGAIN, including your future self!).
25
+
Some cleverness in code can be helpful, too much can make it difficult for others (including your future self!) to understand. If cleverness compromises the readability of your code, it probably is not worth it. Clever but unreadable code won't be re-used or trusted by others (AGAIN, this includes your future self!).
26
26
27
27
What does readable code look like? @Orosz2019 has some thoughts on writing readable code:
28
28
@@ -34,7 +34,7 @@ What does readable code look like? @Orosz2019 has some thoughts on writing reada
34
34
35
35
> **The real test of readable code is others reading it.** So get feedback from others, via code reviews. Ask people to share feedback on how clear the code is. Encourage people to ask questions if something does not make sense. Code reviews - especially thorough code reviews - are the best way to get feedback on how good and readable your code is.
36
36
>
37
-
> Readable code will attract little to no clarifying questions, and reviewers won't misunderstand it. So pay careful attention to the cases when you realize someone misunderstood the intent of what you wrote or asked a clarifying question. Every question or misunderstanding hints to opportunities to make the code more readable.
37
+
> Readable code will attract little to no clarifying questions, and reviewers won't misunderstand it. So pay careful attention to the cases when you realize someone misunderstood the intent of what you wrote or asked a clarifying question. Every question or misunderstanding hints at opportunities to make the code more readable.
38
38
>
39
39
> A good way to get more feedback on the clarity of your code is to ask for feedback from someone who is not an expert on the codebase you are working on. Ask specifically for feedback on how easy to read your code is. Because this developer is not an expert on the codebase, they'll focus on how much they can follow your code. Most of the comments they make will be about your code's readability.
40
40
@@ -60,7 +60,7 @@ If you find yourself writing something more than once, you might want to write a
60
60
DRY code is easier on the reviewer because they don't have to review the same thing twice, but also because they don't have to review the same thing twice. ;)
61
61
DRYing code is something that takes some iterative passes and edits through your code, but in the end DRY code saves you and your collaborators time and can be something you reuse again in a future project!
62
62
63
-
Here's an slightly modified example from @Bernardo2021 for what DRY vs non-DRY code might look like:
63
+
Here's a slightly modified example from @Bernardo2021 for what DRY vs non-DRY code might look like:
<details> <summary> *Why do you need to refresh your kernel/session?* </summary>
111
111
112
-
As a quick example of why refreshing your kernel/session, let's suppose you are troubleshooting something that centers around an object named `some_obj` but then you rename this object to `iris_df`. When you rename this object you may need to update this other places in the code. If you don't refresh your environment while working on your code, `some_obj` will still be in your environment. This will make it more difficult for you to find where else the code needs to be updated.
112
+
As a quick example of why refreshing your kernel/session helps, let's suppose you are troubleshooting something that centers around an object named `some_obj` but then you rename this object to `iris_df`. When you rename this object you may need to update it in other places in the code. If you don't refresh your environment while working on your code, `some_obj` will still be in your environment. This will make it more difficult for you to find where else the code needs to be updated.
113
113
114
-
Refreshing your kernel/session goes beyond objects defined in your environment, and also can affect packages and dependencies loaded or all kinds of other things attached to your kernel/session.
114
+
Refreshing your kernel/session goes beyond objects defined in your environment, and also can affect the packages and dependencies loaded, or all kinds of other things attached to your kernel/session.
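The stale-object problem described above can be sketched in a few lines of Python. This is only an illustration, not part of the chapter's own experiment: the `session` and `fresh` dictionaries are hypothetical stand-ins for a long-running kernel and a freshly restarted one, and the list of numbers stands in for real data.

```python
# Simulate a long-running session's namespace with a plain dict (an assumption
# made for illustration; a real kernel keeps these names in its global scope).
session = {}

# First pass through the notebook: the object is called `some_obj`.
exec("some_obj = [5.1, 4.9, 4.7]", session)

# Later the definition is renamed to `iris_df`, but one line still uses the old name.
exec("iris_df = [5.1, 4.9, 4.7]", session)
# This runs without error only because the stale `some_obj` is still in the session.
exec("n_rows = len(some_obj)", session)
stale_session_ok = "n_rows" in session

# After a restart the namespace is empty, so the forgotten reference fails loudly.
fresh = {}
exec("iris_df = [5.1, 4.9, 4.7]", fresh)
try:
    exec("n_rows = len(some_obj)", fresh)
    fresh_session_error = None
except NameError as err:
    fresh_session_error = str(err)

print(stale_session_ok)
print(fresh_session_error)
```

In the long-running session the leftover name hides the bug, while the fresh namespace raises a `NameError` that points straight at the line that still needs updating.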
115
115
116
116
As a quick experiment, try this in your Python or R environment:
117
117
@@ -185,7 +185,7 @@ Try to avoid using variable names that have no meaning like `tmp` or `x`, or `i`
185
185
> 2 Use consistent notation for naming convention.
186
186
> 3 Use standard terms.
187
187
> 4 Do not number a variable name.
188
-
> 5 When you find another way to name variable, refactor as fast as possible.
188
+
> 5 When you find another way to name a variable, refactor as fast as possible.
189
189
190
190
[@Hobert2018]
191
191
@@ -198,9 +198,9 @@ Try to avoid using variable names that have no meaning like `tmp` or `x`, or `i`
198
198
#### Follow a code style
199
199
<img src="resources/images/style.png" width="12%">
200
200
201
-
Just like when writing doesN"t FoLLOW conv3nTi0Ns OR_sPAcinng 0r sp3llinG, it can be distracting, the same goes for code. Your code may even work all the same, just like you understood what I wrote in that last sentence, but a lack of consistent style can make require more brain power from your readers for them to understand. For reproducibility purposes, readability is important! The easier you can make it on your readers, the more likely they will be able to understand and reproduce the results.
201
+
Just like when writing doesN"t FoLLOW conv3nTi0Ns OR_sPAcinng 0r sp3llinG, it can be distracting, the same goes for code. Your code may run correctly, just like you understood what I wrote in that last sentence, but a lack of consistent style can require more brain power from your readers for them to understand. For reproducibility purposes, readability is important! The easier you can make it on your readers, the more likely they will be able to understand and reproduce the results.
202
202
203
-
There are different style guides out there that people adhere to. It doesn't matter so much which one you choose, so much that you pick one and stick to it for a particular project.
203
+
There are different style guides out there that people adhere to. It doesn't matter which one you choose, as long as you pick one and stick to it for a particular project.
-[Google R style guide](https://google.github.io/styleguide/Rguide.html)@GoogleR.
214
214
215
-
Although writing code following a style as you are writing is a good practice, we're all human and that can be tricky to do, so we recommend using an automatic styler on your code to fix up your code for you.
215
+
Although writing code that follows a style is a good practice, we're all human and it can be tricky to do, so we recommend using an automatic styler on your code to fix up your code for you.
216
216
For Python code, you can use [python black](https://black.readthedocs.io/en/stable/) and for R, [styler](https://www.tidyverse.org/blog/2017/12/styler-1.0.0/).
217
217
218
218
#### Organize the structure of your code
@@ -358,9 +358,9 @@ For example, for this `make-heatmap` notebook we want to:
358
358
359
359
**The exercise: Polishing code**
360
360
361
-
1. Start up JupyterLab with running `jupyter lab` from your command line.
361
+
1. Start up JupyterLab by running `jupyter lab` from your command line.
362
362
2. Activate your conda environment using `conda activate reproducible-python`.
363
-
3. Open up your notebook you made in the previous chapter `make-heatmap.ipynb`
363
+
3. Open up the notebook you made in the previous chapter `make-heatmap.ipynb`
364
364
4. Work on organizing the code chunks and adding documentation to reflect the steps we've laid out in the [previous section](#organize-the-big-picture-of-the-code), you may want to work on this iteratively as we dive into the code.
365
365
5. As you clean up the code, you should run and re-run chunks to see if they work as you expect. You will also want to refresh your environment to help you develop the code (sometimes older objectives stuck in your environment can inhibit your ability to troubleshoot). In Jupyter, you refresh your environment by using the `refresh` icon in the toolbar or by going to `Restart Kernel`.
366
366
@@ -489,7 +489,7 @@ _More reading on the tidyverse:_
489
489
2. Open up the notebook you created in the previous chapter.
490
490
3. Now we'll work on applying the principles from this chapter to the code. We'll cover some of the points here, but then we encourage you to dig into the fully transformed notebook we will link at the end of this section.
491
491
4. Work on organizing the code chunks and adding documentation to reflect the steps we've laid out in the [previous section](#organize-the-big-picture-of-the-code), you may want to work on this iteratively as we dive into the code.
492
-
5. As you clean up the code, you should run and re-run chunks to see if they work as you expect. You will also want to refresh your environment to help you develop the code (sometimes older objectives stuck in your environment can inhibit your ability to troubleshoot). In RStudio, you refresh your environment by going to the `Run` menu and using `Restart R and refresh clear output`.
492
+
5. As you clean up the code, you should run and re-run chunks to see if they work as you expect. You will also want to refresh your environment to help you develop the code (sometimes older objects stuck in your environment can inhibit your ability to troubleshoot). In RStudio, you refresh your environment by going to the `Run` menu and using `Restart R and Clear Output`.
493
493
494
494
***
495
495
@@ -510,7 +510,7 @@ set.seed(1234)
510
510
**Get rid of setwd**
511
511
512
512
_Rationale:_
513
-
`setwd()` almost never work for anyone besides the one person who wrote it. And in a few days/weeks it may not work for them either.
513
+
`setwd()` almost never works for anyone besides the one person who wrote it. And in a few days/weeks it may not work for them either.
514
514
515
515
_Before:_
516
516
```
@@ -526,9 +526,7 @@ _Related readings:_
526
526
**Give the variables more informative names**
527
527
528
528
_Rationale:_
529
-
`xx` doesn't tell us what is in the data here. Also by using the `readr::read_tsv()` from tidyverse we'll get a cleaner, faster read and won't have to specify `sep` argument. Note we are also fixing some spacing and using `<-` so that we can stick to readability conventions.
530
-
531
-
You'll notice later
529
+
`xx` doesn't tell us what is in the data here. Also by using the `readr::read_tsv()` from tidyverse we'll get a cleaner, faster read and won't have to specify the `sep` argument. Note we are also fixing some spacing and using `<-` so that we can stick to readability conventions.
532
530
533
531
_Before:_
534
532
```
@@ -551,10 +549,10 @@ What is happening with df1 and df2? What's being filtered out? etc.
551
549
Code comments would certainly help understanding, but even better, we can DRY this code up and make the code clearer on its own.
552
550
553
551
_Before:_
554
-
It may be difficult to tell from looking at the before code because there are no comments and it's a bit tricky to read, but the goal of this is to:
552
+
It may be difficult to tell from looking at the before code because there are no comments and it's a bit tricky to read, but the goal of this code is to:
555
553
556
-
1) Calculate variances for each row (each row is a gene).
557
-
2) Filter the original gene expression matrix to only genes have a bigger variance (here we use arbitrarily 10 as a filter cutoff).
554
+
1) Calculate the variance for each row (each row is a gene with expression values from a number of samples).
555
+
2) Filter the original gene expression matrix to only genes that have a bigger variance (here we arbitrarily use 10 as a filter cutoff).
558
556
559
557
```
560
558
df=read.csv("SRP070849.tsv", sep="\t")
@@ -575,10 +573,10 @@ Let's see how we can do this in a DRY'er and clearer way.
575
573
We can:
576
574
1) Add comments to describe our goals.
577
575
2) Use variable names that are more informative.
578
-
3) Use the apply functions to do the loop for us -- this will eliminate the need for unclear variable `i` as well.
576
+
3) Use the apply functions to do the loop for us -- this will eliminate the need for the unclear variable `i` as well.
579
577
4) Use the tidyverse to do the filtering for us so we don't have to rename data frames or store extra versions of `df`.
580
578
581
-
Here's what the above might look like after some refactoring. Hopefully you find this is easier to follow and total there's less lines of code (but also has comments too!).
579
+
Here's what the above might look like after some refactoring. Hopefully you find this is easier to follow and there are fewer total lines of code (but it also has comments now too!).
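The chapter's own refactored code is in R and is not part of this diff. As a rough Python analogue of the same two steps (calculate a variance per gene, then keep only genes above the cutoff), here is a minimal sketch. The toy `expression` data, the gene names, and the helper `filter_by_variance` are all hypothetical and made up for illustration; only the cutoff of 10 comes from the text.

```python
from statistics import variance

# Same arbitrary cutoff the text mentions.
VARIANCE_CUTOFF = 10

# Hypothetical toy expression matrix: one row per gene, one value per sample.
expression = {
    "gene_a": [1.0, 1.5, 0.5, 1.0],
    "gene_b": [2.0, 12.0, 4.0, 10.0],
    "gene_c": [30.0, 31.0, 29.0, 30.0],
}


def filter_by_variance(matrix, cutoff):
    """Keep only genes whose sample variance is greater than the cutoff."""
    # Step 1: calculate the variance for each row (each row is a gene).
    gene_variances = {gene: variance(values) for gene, values in matrix.items()}
    # Step 2: keep only the genes with a variance above the cutoff.
    return {
        gene: values
        for gene, values in matrix.items()
        if gene_variances[gene] > cutoff
    }


high_variance_genes = filter_by_variance(expression, VARIANCE_CUTOFF)
print(sorted(high_variance_genes))  # ['gene_b']
```

Naming the cutoff and putting each step behind a comment means a reader does not have to guess what an index like `i` or a data frame like `df2` is for, and no intermediate copies of the data are left lying around.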