11 literate programming with R markdown
Showing your work, to (future) you as well as others, is a key part of reproducible science. R Markdown documents facilitate this, as they allow you to include comments, code, and results in a single place. But before we consider R markdown, we begin with two more elemental ideas: scripts (R4DS, Chapter 6) and projects (Chapter 8).
11.1 scripts are files of code
We begin with R4DS Chapter 6, which shows the R studio interface and encourages you to save your work using scripts, written in the source (editor) window in the upper left quadrant of the default R studio screen.
Note the recommendations - for example, include packages (libraries) at the beginning of your code. One more thing - in setting up R studio, consider adjusting the “insert spaces for tab” setting to something more than 2. This will allow you to more easily see the nested structure of functions, loops, etc. - and will create a modest disincentive against making these nested structures too deep or complex:
Note, too, the code diagnostics in R. Consider enabling all of these, including the R style diagnostics, to help you keep your code readable:
11.1.1 some elements of coding style
Good coding is often a combination of several skills ranging from puzzle-solving to communication. I can’t claim that these are the elements of coding style (apologies to Strunk & White), but rather that these are merely some of the elements.
Good coding is clear and thus commented. You are writing for your future self as well as others, so be explicit about the purpose of each chunk of code.
Good coding is concise. When you can write code in 3 lines instead of 30, your code may be more clear and efficient. Take pleasure in writing parsimonious, efficient code. But where efficiency and clarity conflict, choose the latter.
Good code should be complete, including all steps from reading the data to producing output. Where appropriate, comment-out (rather than delete) informative errors, again for the future you.
Good code may be creative. The coolest solutions are those which pull from and synthesize a number of ideas. Creativity often requires walking away from a problem in order to ultimately arrive at a solution (Wertheimer’s Productive Thinking).
Finally, good code should be considered. Reflect on the impacts of your work - just because you can analyze something doesn’t mean that you should.
11.3 R markdown documents integrate rationale, script, and results
R Markdown documents allow you to include comments, scripts, and results in a single place. The basics of R markdown are presented in Chapter 27 of R4DS. I encourage you to use R markdown for nearly everything you do in R.
Within R studio, open up a new R markdown document. There are as many as four parts of an R markdown document:
- A YAML (yet another markdown language) header
- Text formatted in markdown
- R code (chunks) surrounded by code fences
- and, occasionally, inline code
There is a handy R Markdown cheat sheet which can give you a sense of what R markdown is about. It describes eight steps, from “workflow” to “publish” (and a ninth, “learn more”). Don’t worry about all of the detail here, but do get a sense of how it works.
Exercise 11.1:
Working in groups, do the exercises in section 27.4.7 of R4DS.
Begin with the R markdown file that is included at the beginning of Chapter 27. You can download it here.
Study the code, and annotate it so that you have a better sense of how it works. For example, “this block loads needed libraries, then takes the _____ dataset and ___________ .”
Play with the graph. Change one or more parameters of it to make it more useful. Again, annotate your changes.
11.4 What to do when you are stuck
Google. pay attention to your error messages
Ask for help, make your questions clear and reproducible (see R4DS Chapter 1)
Take a break, think outside the box and kludge something together if you have to
Document your struggle and your cleverness for a future you