RMarkdown Driven Development (RmdDD) - 22 August 2019



RMarkdown is an excellent platform for capturing narrative analysis and code to create reproducible reports, blogs, slides, books, and more. One benefit of rmarkdown is its ability to keep an analyst in the “flow” of their work and to capture their thought process along the way. However, thought processes are rarely linear; as a result, first-draft rmarkdown scripts rarely are either. This is fine for some individual analysis and preliminary exploration but can significantly decrease how understandable and resilient an rmarkdown will be in the future.

First, the promise of re-use and reproducibility helps justify any incremental time spent on constructing a single analysis with “good practices”. Second, for newer project users or package developers, it hopefully helps emphasize that the learning gap between being a user and a developer is much smaller than it may seem. Finally, I have found that this approach to package development leads to very intuitive, user-friendly packages. Essentially, you, a humble user, have conducted rigorous UX research before you, the tool developer, ever show up!

Each of these is discussed in more detail below. I also want to emphasize that this sequence is not a recommendation that every rmarkdown needs to become a project or a package. Clearly, that is just silly and in many cases causes unnecessary fragmentation and overhead. However, I believe imagining a spectrum from a single-file rmarkdown to a fully functioning R package is helpful for consciously deciding where to draw the line. “Optimal stopping” at different stages may be the subject of a future post.

A taxonomy of rmarkdown chunks

• Infrastructure: These chunks set up the environment in which the rmarkdown is rendered. This includes code that adds functions to your environment (e.g. library(), source()), loads data (e.g. functions from readr, data.table::fread(), or functions calling APIs or databases), or defines analysis parameters (e.g. hardcoded values that change behavior later in the script).

• Do not hardcode passwords. This is a good general principle for scripts, but it is especially important in an rmarkdown, where they might accidentally “leak” into the rendered output (e.g. HTML) in a non-visible way without your realizing. If something in your file absolutely requires a password, one approach is to create a parameter and supply it upon knitting.

• Do not hardcode absolute file paths. No one else has your specific setup of files, nor are you likely to if you change computers. This can lead to a lot of frustration. Try to use relative paths to reference any external files (e.g. data) being brought into your report. This is significantly easier once the analysis becomes an R project. At minimum, move any brittle dependencies like this to the top of your script, where they will at least be found more quickly when debugging.

• Do not do complicated database queries. For simple rmarkdown files, it may sometimes be convenient to use the sql language engine and query a database. However, at least in my experience, it is generally not the best approach to attempt database queries in your rmarkdown. Queries can take a long time to run, and you do not want to wait for them every time you fix a typo or tweak some plot styling and want to reknit your document. Consider making your data pull a separate script and reading the results into your rmarkdown.

• Don’t litter. Resist the temptation to save everything you tried that didn’t work and isn’t part of your analysis or narrative. Note that I’m not advocating against transparency around reporting all the tests you ran, all the models you attempted to fit, etc. More precisely, don’t leave half-written code just in case you want to try to make some specific plot or graph later. Don’t let your rmarkdown become a “junk drawer”, and don’t abuse the ability to store unneeded code with the eval = FALSE chunk option.

• Don’t load unnecessary libraries. Often, you may add library loads in exploratory analysis “just in case”, or you may have tried out one package before deciding on a different approach. After you’ve removed the “litter” discussed previously, also be sure to clean up any side effects of such litter. There isn’t a huge cost to excess library loads, except that they can be confusing to users and raise (however slightly) the chance of a NAMESPACE conflict. You might also cause some other user to install extra packages unnecessarily, which, while not tragic, is inconvenient if there is no actual benefit.
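As a rough illustration of separating a slow data pull from the report, the sketch below caches the result of a query on disk and reads it back on later runs. The helper name cached_pull, the stand-in slow_query function, and the data/extract.rds location are all hypothetical; in practice the pull might simply live in its own script.

```r
# Hypothetical helper: run a slow data pull once and cache the result on disk.
# `pull_fun` stands in for whatever code queries your database or API.
cached_pull <- function(pull_fun, cache_path, refresh = FALSE) {
  if (!refresh && file.exists(cache_path)) {
    return(readRDS(cache_path))   # fast path: reuse the saved extract
  }
  result <- pull_fun()            # slow path: actually run the query
  dir.create(dirname(cache_path), recursive = TRUE, showWarnings = FALSE)
  saveRDS(result, cache_path)
  result
}

# Stand-in for a real query; replace with your own data pull.
slow_query <- function() {
  data.frame(id = 1:3, value = c(10, 20, 30))
}

# A relative path keeps the report portable across machines.
cache_file <- file.path("data", "extract.rds")
dat <- cached_pull(slow_query, cache_file)
```

Reknitting after a typo fix then reads the saved extract instead of repeating the query; passing refresh = TRUE forces a fresh pull.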

Moving all infrastructure chunks to the beginning of your rmarkdown makes it clear what dependencies your rmarkdown has. Instantly upon opening a file, a new analyst can understand what libraries and external files it requires. This is preferable to the case where some obscure library is loaded in the penultimate chunk of a large rmarkdown, which then almost runs to completion before erroring out due to a missing dependency.
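One way to surface missing dependencies before any real work starts is a small check in the first chunk. This is only a sketch: the helper name missing_packages and the package list are placeholders, and the check relies on base R's requireNamespace().

```r
# Hypothetical helper: report which required packages are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Placeholder list; replace with the packages your report actually uses.
required <- c("stats", "utils")

not_installed <- missing_packages(required)
if (length(not_installed) > 0) {
  stop("Install these packages before knitting: ",
       paste(not_installed, collapse = ", "))
}
```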

The rationale for front-loading a lot of wrangling chunks is similar. Because these chunks are the most computationally intense, they are the most likely to throw errors. Having them at the beginning of your file means you will learn about your errors sooner; having all of them together will make it easier to debug. Additionally, in my experience, these are the chunks I most often want to edit, so it’s efficient not to have to scroll through code for plots and tables simply to find them.


Of course, in this step, do be careful of being too “prescient”. If some of your data wrangling chunks are motivated by certain output and discussion later in your file, it may be confusing for this computation to be removed from its context and placed at the beginning. I’m reasonably convinced of this advice for creating standardized reporting frameworks, but I caution that the best structure becomes far more subjective for more creative analyses.
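Front-loaded chunks can also fail early on purpose. The base-R sketch below checks that a freshly loaded data frame has the expected columns and types before anything else runs; the check_data helper and the sales example data are made up for illustration, and dedicated validation packages offer much richer checks.

```r
# Hypothetical check: confirm a data frame has the columns and types expected.
check_data <- function(df, expected_types) {
  missing_cols <- setdiff(names(expected_types), names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  for (col in names(expected_types)) {
    if (!inherits(df[[col]], expected_types[[col]])) {
      stop("Column '", col, "' is not of class ", expected_types[[col]])
    }
  }
  invisible(df)
}

# Made-up example data standing in for a freshly loaded file.
sales <- data.frame(region = c("east", "west"), amount = c(100.5, 200),
                    stringsAsFactors = FALSE)

check_data(sales, c(region = "character", amount = "numeric"))

# A logical assumption about the data, checked the same way.
stopifnot(all(sales$amount >= 0))
```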

Bonus points: Now that your data load chunks are right at the top of your script, consider adding validation for the data being loaded. Is it in the same form that your code expects? Does it have the right variable names and types? Does it meet any logical checks or assumptions you deem necessary? A few good packages for this are validate and assertr. Depending on your project, you could put these checks in a separate code chunk with the chunk option include = FALSE so that data validation can be run manually, or include them in your script to throw errors and prevent attempts to render incorrect data structures.

(3) Reduce duplication with functions


Reorganization offers other clear benefits, one of which is that code with similar purposes ends up physically closer together in your document. This may make it easier for you to spot similarities. As you notice similarities across wrangling or reporting chunks, keep in mind the rule of three: similar code repeated multiple times should be turned into a function.

For example, I often encounter situations where I need to produce the same plots or tables for many different groups. While an analyst is in exploratory mode, they might reasonably copy-paste such code, edit some key parameters, and eagerly proceed to analyzing the results. Converting this code to functions makes it significantly easier to test and maintain. It also has the benefit of converting reporting code into infrastructure code, which can be moved to the top of the rmarkdown with the previously described benefits. Generally, I define any local functions after my library() and source() commands.
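To make the copy-paste-to-function step concrete, here is a small sketch using the mtcars data that ships with R. The function name summarise_group and its arguments are hypothetical; the point is only that one function plus a loop replaces several near-identical chunks.

```r
# Hypothetical reporting function: one summary row for one group of the data.
summarise_group <- function(df, group_col, group_value, value_col) {
  rows <- df[df[[group_col]] == group_value, , drop = FALSE]
  data.frame(
    group = group_value,
    n     = nrow(rows),
    mean  = mean(rows[[value_col]])
  )
}

# Instead of one copy-pasted chunk per group, loop over the groups.
# In mtcars, cyl is the number of cylinders and mpg is miles per gallon.
cyl_values <- sort(unique(mtcars$cyl))
summaries <- do.call(
  rbind,
  lapply(cyl_values, function(v) summarise_group(mtcars, "cyl", v, "mpg"))
)
print(summaries)
```

Adding a new group or changing the summary now means editing one function rather than hunting down every pasted copy.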

Bonus points: Now that you have functions, it’s a good time to think about testing them. You could add a few tests of your functions in a chunk with include = FALSE, as described in the last section for data validation. One helpful package here is testthat. Even if you don’t include these in your rmarkdown, save any informal tests you run in a text file. They will be useful if you decide to turn your analysis all the way into a package.

(4) Convert your rmarkdown to a project


At this point, your rmarkdown ideally has clear requirements (library, file, and data dependencies) and minimal duplicated code. Particularly if you find that you are source()ing in a large number of files, defining many local functions, or reading in many different datasets, it’s worth considering whether to convert your single-file rmarkdown into an R project.

Additionally, by using a standardized file structure within your project, you can help others easily navigate your repository. If an entire team or organization decides on a single file structure convention, collaborators can easily navigate each other’s folders and have a good intuition for where to find a specific file in someone else’s project.
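A shared convention is easy to apply consistently if it is scripted. The sketch below creates one possible set of folders; the folder names and the scaffold_project helper are illustrative assumptions, not a prescribed layout.

```r
# Hypothetical folder convention; the names are illustrative, not prescribed.
project_dirs <- c("analysis", "src", "data", "output")

scaffold_project <- function(root, dirs = project_dirs) {
  paths <- file.path(root, dirs)
  for (p in paths) {
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

# Demonstrated in a temporary directory so nothing in your workspace changes.
demo_root <- file.path(tempdir(), "my-analysis")
scaffold_project(demo_root)
list.files(demo_root)
```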

Bonus points: Now that you have a project, consider taking a more proactive stance on package management to ensure the future user has correct / compatible versions of any packages on which your project relies. As of this writing, RStudio’s new package management solution renv is still in development, but follow that project for more details!

(5) Convert your project to a package


One of the beautiful things about R packages is their shocking simplicity. Before I wrote my first package, I always assumed that there was some mystical step change in the level of effort between writing everyday R code and writing a package. This is a misconception I frequently hear repeated by newer R users. In reality (admittedly, painting with a very broad brush), writing an R package is simply the art of putting things (R files) where they belong (in the right folders).


It’s also instructive to notice the differences between projects and packages. Following the description above, the biggest notable gaps are the lack of unit tests (which would live in the tests/ folder) and function documentation (which can be autogenerated from roxygen2 comments and lives in man/). These can easily be added when converting your project to a package.
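For a sense of how little changes, the sketch below shows a function annotated with roxygen2-style comments, followed by the kind of informal check that could later become a unit test. The format_pct function is a hypothetical example; the #' lines are ordinary comments to R, so the file runs as-is even without roxygen2 installed.

```r
#' Convert a proportion to a percentage label
#'
#' Hypothetical helper used only to illustrate roxygen2 comment syntax.
#'
#' @param x Numeric vector of proportions between 0 and 1.
#' @param digits Number of decimal places to keep.
#' @return Character vector such as "12.5%".
#' @export
format_pct <- function(x, digits = 1) {
  paste0(formatC(100 * x, format = "f", digits = digits), "%")
}

# An informal check of the kind that could later move into tests/.
stopifnot(identical(format_pct(0.125), "12.5%"))
```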
