Roger Linters
The package roger provides the tools to analyse the coding style and documentation of R scripts for the automated grading system Roger the Omni Grader.
The tools are actually linters that analyse the code and documentation and report whether or not they respect good coding practices.
This page lists the linters included in roger and provides for each one a description and usage examples.
In a Nutshell
Using the linters first involves parsing the script file with the function getSourceData. The resulting object is then passed to as many linters as you wish.
srcData <- getSourceData("script.R")
assignment_style(srcData)
commas_style(srcData)
...
All linters in roger return TRUE if the code respects good coding practices, or else FALSE and a message indicating the nature of the error and the faulty lines.
The utility function all_style eases the process by running all available style linters at once.
srcData <- getSourceData("script.R")
all_style(srcData)
Style Linters
assignment_style
Check that the left assign symbol (<-) is used to assign values to objects instead of the equation assign symbol (=).
Problematic code
x = 2
z = c(x = 42, y = 43)
Correct code
x <- 2
z <- c(x = 42, y = 43)
Rationale
The correct and safe way to assign values to objects in R is with the assignment operator <- (or, in rare instances, ->). Furthermore, <- can be used anywhere, whereas = is only allowed at the top level.
The second line of the problematic code shows the ambiguity that arises when using = for assignment: the first instance is indeed an assignment operator, whereas the other two are used to name arguments.
Just use <- for assignment. R-aware editors provide shortcuts to type the symbol.
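The two roles of = can be seen in the parse data produced by base R. The following is a minimal sketch that does not use roger; it relies only on parse and utils::getParseData, and assumes the token names EQ_ASSIGN and EQ_SUB reported by current versions of R.

```r
## Parse the problematic code, keeping source references so that
## token-level parse data is available.
code <- c("x = 2",
          "z = c(x = 42, y = 43)")
pd <- utils::getParseData(parse(text = code, keep.source = TRUE))

## Keep only the "=" symbols and look at the role given by the parser.
eq <- pd[pd$text == "=", c("line1", "col1", "token")]
print(eq)

## Number of "=" used for assignment and for naming arguments.
n_assign <- sum(eq$token == "EQ_ASSIGN")
n_named <- sum(eq$token == "EQ_SUB")
```

Only the symbols tagged EQ_ASSIGN are candidates for replacement by <-.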
close_brace_style
Check that the closing braces are positioned according to standard bracing style rules.
Problematic code
foo <- function(x, y) {
if (x > 2)
{ z <- 3
y <- y + z}
x^2 + y^3 + z^4
}
Correct code
foo <- function(x, y)
{
    if (x > 2)
    {
        z <- 3
        y <- y + z
    }
    x^2 + y^3 + z^4
}
Rationale
See open_brace_style.
close_bracket_style
Check that spacing around closing square brackets is valid.
Problematic code
z[1 ]
z[1,]
Correct code
z[1]
z[1, ]
Rationale
Closing brackets should not be immediately preceded by a space, unless that space is after a comma (as required by commas_style).
close_parenthesis_style
Check that spacing around closing parentheses is valid.
Problematic code
1 + (x + y )
for (i in 1:10 )
Correct code
1 + (x + y)
for (i in 1:10)
Rationale
Typeset parentheses in code as you would in prose. Case in point: a closing parenthesis should not be immediately preceded by a space.
commas_style
Check that commas are never preceded by a space and always followed by one, unless the comma ends the line.
Problematic code
z <- c(42,43)
z <- c(42 ,43)
Correct code
z <- c(42, 43)
z <- c(42,
       43)
Rationale
Spaces after commas allow code to breathe. You put spaces after commas — but never before — in prose, right? Well, just do the same in code. Spaces are cheap.
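The rule can be approximated with a regular expression applied line by line. This is a naive base R sketch, not the implementation used by roger: the function name has_bad_comma is hypothetical, and commas inside character strings or comments are not treated specially.

```r
## Flag a comma preceded by whitespace, or a comma followed by anything
## other than whitespace or the end of the line.
has_bad_comma <- function(lines)
    grepl("\\s,|,(?!\\s|$)", lines, perl = TRUE)

lines <- c("z <- c(42,43)",
           "z <- c(42 ,43)",
           "z <- c(42, 43)",
           "z <- c(42,")
bad <- has_bad_comma(lines)
print(bad)
```

A token-based check on the parse data would be more robust, since it can tell commas in code apart from commas in strings.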
comments_style
Check that comment delimiters and the text of comments, when there is any, are separated by at least one space.
Problematic code
##comment
##---comment
Correct code
##
## comment
##--- comment
Rationale
Comments are just easier to read when set apart from their delimiters (one or many symbols #, possibly followed by other punctuation symbols). Spaces are cheap in comments too.
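As an illustration of the rule, here is a rough base R sketch that handles full-line comments only. The function name has_tight_comment is hypothetical and the regular expression is an assumption about how the rule could be expressed, not the check performed by roger.

```r
## Flag full-line comments where text immediately follows the
## delimiter, that is, one or more "#" possibly followed by other
## punctuation, with no space in between.
has_tight_comment <- function(lines)
    grepl("^\\s*#+[[:punct:]]*[^[:space:][:punct:]]", lines)

lines <- c("##comment",
           "##---comment",
           "##",
           "## comment",
           "##--- comment")
tight <- has_tight_comment(lines)
print(tight)
```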
left_parenthesis_style
Check that spacing around left (or opening) parentheses is valid.
Problematic code
1 +(x + y)
1 + ( (x + y) + z)
for(i in 1:10)
sqrt (4)
z <- function (x) x^2
Correct code
1 + (x + y)
1 + ((x + y) + z)
for (i in 1:10)
sqrt(4)
z <- function(x) x^2
Rationale
Typeset parentheses in code as you would in prose. Case in point: a left parenthesis should always be preceded by a space, except in function calls or at the start of sub-expressions.
See open_parenthesis_style for rules regarding the space after an opening parenthesis.
line_length_style
Check that the length of code and comment lines does not exceed a given limit in number of characters.
Problematic code
## Some very long line of comment that should be split on multiple lines to avoid horizontal scrolling.
x <- c(x^2 + y^3 + z^4 + 3 * x * y - 6 * y * z^2 + x * z^3, x^3 + y^2 + z^3 + 3 * x * y)
Correct code
## Some very long line of comment that should be split on multiple lines
## to avoid horizontal scrolling.
x <- c(x^2 + y^3 + z^4 + 3 * x * y - 6 * y * z^2 + x * z^3,
       x^3 + y^2 + z^3 + 3 * x * y)
Rationale
Limiting line length to 70-80 characters greatly improves readability. You have a large screen? Use the space to display windows side by side, not to write long lines of code or comments.
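Checking line length by hand takes one call to nchar. The sketch below is base R only; the function name too_long and the limit of 80 characters are assumptions made for the example.

```r
## Return the indices of the lines longer than a given limit.
too_long <- function(lines, limit = 80L)
    which(nchar(lines, type = "chars") > limit)

lines <- c("x <- c(x^2 + y^3 + z^4,",
           strrep("#", 81),    # one character over the limit
           strrep("#", 80))    # exactly at the limit
long <- too_long(lines)
print(long)
```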
nomagic_style
Check the absence of magic numbers in code.
Problematic code
fooBar <- 2^32
runif(123)
x[3] * 7 + 2
Correct code
FOOBAR <- 2^32
SIZE <- 42
runif(SIZE)
BAR <- 7
x[3] * BAR + 2
Rationale
Magic numbers are unnamed or insufficiently documented numerical constants in code. Magic numbers make programs hard to read, understand and debug. For example, in the expression y <- x + 42, the constant 42 is a magic number.
A magic number should be assigned to an appropriately named variable. In roger, assigning the value of a “simple” expression (see ?nomagic_style for details) to a variable whose name is all in uppercase is recognized as the assignment of a magic number. This explains why fooBar <- 2^32 is not valid, but FOOBAR <- 2^32 is.
Furthermore, the common constants -1, 0, 1, 2 and 100, and numbers used as the only expression in indexing are not considered magic. In the expression x[3] * 7 + 2, only 7 is a magic number.
open_brace_style
Check that the opening braces are positioned according to standard bracing style rules.
Problematic code
foo <- function(x, y)
{ if (x > 2)
{ z <- 3
y <- y + z
}
x^2 + y^3 + z^4
}
Correct code
foo <- function(x, y)
{
    if (x > 2)
    {
        z <- 3
        y <- y + z
    }
    x^2 + y^3 + z^4
}
Rationale
Bracing and indent styles are a minefield subject to holy wars. Still, using a standard and consistent style remains important as it greatly improves the readability of code.
roger supports two bracing styles dubbed “R” and “1TBS”. The R bracing style — also known as Allman, BSD or C++ style — has opening and closing braces on their own lines, left aligned with their corresponding statement. This is the style used in the examples.
The 1TBS style — also widely known as K&R style — has the opening brace on the same line as its corresponding statement, separated by a space. The closing brace appears on its own line, left aligned with the statement:
foo <- function(x, y) {
    if (x > 2) {
        z <- 3
        y <- y + z
    }
    x^2 + y^3 + z^4
}
open_brace_unique_style
Check that only one bracing style is used throughout the script.
Problematic code
foo <- function(x, y) {
    if (x > 2) {
        z <- 3
        y <- y + z
    }
    x^2 + y^3 + z^4
}
bar <- function(x)
{
    x^2
}
Correct code
foo <- function(x, y)
{
    if (x > 2)
    {
        z <- 3
        y <- y + z
    }
    x^2 + y^3 + z^4
}
bar <- function(x)
{
    x^2
}
or
foo <- function(x, y) {
    if (x > 2) {
        z <- 3
        y <- y + z
    }
    x^2 + y^3 + z^4
}
bar <- function(x) {
    x^2
}
Rationale
Using a standard bracing and indent style is good, but being consistent throughout a script (or project) is even better. This linter checks that either the R or the 1TBS style is used in the script, but not both.
See open_brace_style for additional details.
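The idea of detecting mixed bracing styles can be sketched in base R by classifying the lines that end with an opening brace. The function name brace_styles is hypothetical and the line-based regular expressions are a simplification; roger itself works on parsed source data.

```r
## Detect which bracing styles appear in a script, line by line.
## "R" style: the opening brace is alone on its line.
## "1TBS" style: the opening brace ends a line that contains code.
brace_styles <- function(lines)
    c("R" = any(grepl("^\\s*\\{\\s*$", lines)),
      "1TBS" = any(grepl("\\S\\s*\\{\\s*$", lines)))

mixed <- c("foo <- function(x, y) {",
           "    x + y",
           "}",
           "bar <- function(x)",
           "{",
           "    x^2",
           "}")
consistent <- mixed[1:3]

## A script is consistent when a single style is detected.
n_mixed <- sum(brace_styles(mixed))
n_consistent <- sum(brace_styles(consistent))
```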
open_bracket_style
Check that spacing around opening square brackets is valid.
Problematic code
z[ 1]
z[ , 1]
Correct code
z[1]
z[, 1]
Rationale
Similar to parentheses, opening brackets should not be immediately followed by a space.
open_parenthesis_style
Check that spacing around opening parentheses is valid.
Problematic code
1 + ( x + y)
for ( i in 1:10)
Correct code
1 + (x + y)
for (i in 1:10)
Rationale
Typeset parentheses in code as you would in prose. Case in point: an opening parenthesis should not be immediately followed by a space.
See also left_parenthesis_style for rules regarding the space in front of an opening parenthesis.
ops_spaces_style
Check that spacing around infix and unary operators is valid.
Problematic code
x+y
x<-y+ 3
x<- -2
2+!x
Correct code
x + y
x <- y + 3
x <- -2
2 + !x
Rationale
Spaces around operators are like spaces after commas: they allow the code to breathe and they improve readability.
Infix binary operators should have a space (or a line break) on both sides. As for unary operators, they should be immediately followed by their argument.
Note that the assignment operator <- is an infix binary operator.
trailing_blank_lines_style
Check that a script file does not contain trailing blank lines.
Problematic code
----- <- start of file marker
x + y


----- <- end of file marker
Correct code
-----
x + y
-----
Rationale
No, superfluous empty lines at the end of a script file do no harm. But they also make for an untidy file. Just get rid of them. Good text editors can automagically do it for you.
trailing_whitespace_style
Check that a script file does not contain whitespace at the end of lines.
Problematic code
x + y | <- end of line marker
Correct code
x + y|
Rationale
Just like trailing blank lines, trailing whitespace has no real impact on code; it just makes for untidy scripts. Remove unnecessary whitespace (space or tabulation) at the end of lines. Your text editor may automagically do it for you upon saving a file. Check this option and never look back.
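If your editor does not offer this option, trailing whitespace is easy to find and remove with base R. In this short sketch, the function names has_trailing_ws and strip_trailing_ws are hypothetical helpers, not part of roger.

```r
## Indices of the lines ending with spaces or tabs.
has_trailing_ws <- function(lines)
    which(grepl("[ \t]+$", lines))

## Remove trailing spaces and tabs from every line.
strip_trailing_ws <- function(lines)
    sub("[ \t]+$", "", lines)

lines <- c("x + y  ", "z <- 3", "w <- 4\t")
faulty <- has_trailing_ws(lines)
clean <- strip_trailing_ws(lines)
```

The lines of an actual script can be obtained with readLines and written back with writeLines.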
unneeded_concatenation_style
Check that function c is used with more than one argument.
Problematic code
x <- c()
y <- c(42)
Correct code
x <- numeric(0)
y <- 42
Rationale
In R, the role of the function c is to combine objects. It does not make sense to combine nothing or a single object, right? Then never use c with no or only one argument.
If you mean to create an empty vector, use numeric(0), logical(0) or character(0), depending on the type needed.
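A quick check at the prompt shows why these calls to c are unneeded. The sketch uses base R only.

```r
## c() with no argument returns NULL, not an empty vector of a given type.
is.null(c())

## c() with a single argument returns that argument unchanged.
identical(c(42), 42)

## A typed empty vector has length zero and a definite type.
x <- numeric(0)
length(x)
is.numeric(x)
```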
Documentation Linters
Check that functions are properly documented in the comments of a script file and that certain mandatory sections are present.
any_comments checks that the file contains non-empty comments.
any_doc checks that the file contains at least some documentation.
signature_doc checks that the signature (or usage information) of every function is present in the documentation.
section_doc checks that the documentation contains a section title matching a regular expression pattern for every function definition in the file (or as many such titles as there are function definitions).
formals_doc checks that the description of every formal argument is present in the documentation.
Problematic code
foo <- function(x, y = 2)
    x + y
Correct code
###
### foo(x, y = 2)
###
## Adding two vectors
##
## Arguments
##
## x: a vector
## y: another vector
##
## Value
##
## Sum of the two vectors.
##
## Examples
##
## foo(1:5)
##
foo <- function(x, y = 2)
    x + y
Rationale
For R scripts that are not part of a package with a proper help page, every top-level function should be preceded by a block of documentation in comments. The documentation should normally at least contain:
- the signature, or usage information, of the function (its name followed by all the arguments with their default values, if any);
- a short description of the function (completing the sentence “This function allows to…”);
- the list of all formal arguments with their meaning and admissible values, when pertinent;
- the value returned by the function;
- one or more examples of usage, depending on the complexity of the function.
This expected documentation format is not unlike R help pages.
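To give an idea of what checking the documentation of formal arguments involves, here is a rough base R sketch. It is not the implementation of formals_doc: it assumes that each argument is documented on a comment line of the form "## name:", as in the example above, and it evaluates the script in a scratch environment, which is only appropriate for scripts without side effects.

```r
## A script where the argument y is not documented.
script <- c("## Arguments",
            "##",
            "## x: a vector",
            "##",
            "foo <- function(x, y = 2)",
            "    x + y")

## Evaluate the script in a scratch environment to get at the function.
env <- new.env()
eval(parse(text = script), envir = env)
args <- names(formals(env$foo))

## Look for a comment line of the form "## name:" for each argument.
comments <- grep("^\\s*#", script, value = TRUE)
documented <- vapply(args,
                     function(a)
                         any(grepl(paste0("^#+\\s*", a, "\\s*:"), comments)),
                     logical(1))
missing_doc <- names(documented)[!documented]
print(missing_doc)
```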