
R Package Quality: Code Quality

Author: Colin Gillespie

Published: July 10, 2025

tags: r, litmus, validation, code, scoring

This is part four of a five-part series on validating R packages. The other posts in the series are:

  • Validation Guidelines
  • Package Popularity
  • Package Documentation
  • Code Quality (this post)
  • Maintenance

In this post, we’ll take a closer look at code quality and how automated tools can quickly give us a feel for a package. The obvious check is R CMD check. Anyone who has created a package is familiar with repeatedly running R CMD check to ensure that their package is free of notes, warnings and errors. However, that’s not the only tool we can draw on. Codebase size, security vulnerabilities and the number of exported functions all hint at package quality.

When validating R packages, code quality contributes around 50% of the total score. Remember to check out our dashboard for an overview.

Need help with R package validation to unleash the power of open source? Check out the Litmusverse suite of risk assessment tools.

Score 1: Passing R CMD check

The bedrock of all good R packages! Each package is downloaded, installed, and the standard R CMD check is run. The score is one minus the weighted sum of errors (weight 1) and warnings (weight 0.25), with a maximum score of 1 (no errors or warnings) and a minimum score of 0. Essentially, the metric allows up to one error or four warnings before returning the lowest score of 0.
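The weighted scheme above can be sketched in a few lines of base R. This is a minimal illustration of the rule as described, not the exact implementation; the weights (1 per error, 0.25 per warning) are taken from the text.

```r
# Weighted R CMD check score: errors cost 1, warnings cost 0.25,
# with the result floored at 0
check_score <- function(n_errors, n_warnings) {
  penalty <- 1 * n_errors + 0.25 * n_warnings
  max(0, 1 - penalty)
}

check_score(0, 0) # 1: a clean check
check_score(0, 2) # 0.5: two warnings
check_score(1, 0) # 0: a single error hits the floor
```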

We are working on being more discerning about notes and warnings, but for now it’s a relatively simple metric that highlights packages with potential issues.

Score 2: Codebase Size

This score is based on the size of the R codebase, as measured by the number of lines of R code. The general idea is that larger codebases are harder to maintain. Of course, the obvious question is: what counts as a large codebase?

Instead of choosing arbitrary thresholds, we analysed all packages on CRAN (2025/03). If a package is in the lower quartile for codebase size, it scores 1. Otherwise, the empirical CDF is used.
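The lower-quartile/ECDF rule can be sketched in base R. The line counts below are hypothetical stand-ins for the CRAN-wide population, not real figures.

```r
# Score 1 for packages in the lower quartile of the population; otherwise
# 1 minus the empirical CDF, so larger codebases score lower.
size_score <- function(value, population) {
  if (value <= quantile(population, 0.25)) {
    return(1)
  }
  1 - ecdf(population)(value)
}

loc <- c(200, 500, 800, 1500, 3000, 6000, 12000, 40000) # hypothetical
size_score(150, loc)  # 1: well inside the lower quartile
size_score(6000, loc) # 0.25: larger than six of the eight packages
```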

For those who are interested, the largest R package on CRAN had 100,000+ lines of R code!

[Figure: score for code length]

Score 3: Security Vulnerabilities

If a package has a known security vulnerability, it receives a score of 0. This uses the {oysteR} package to detect issues.
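The scoring rule itself is binary. In practice the vulnerability count would come from an {oysteR} audit against Sonatype’s OSS Index; the audit call below is commented out, since it needs a network connection, and the column name is an assumption to be checked against the package’s documentation.

```r
# Binary score: any known vulnerability zeroes the score
vuln_score <- function(n_vulnerabilities) {
  if (n_vulnerabilities > 0) 0 else 1
}

# The count would typically come from {oysteR}, e.g. (assumed API):
# audit <- oysteR::audit_installed_r_pkgs()
# vuln_score(sum(audit$no_of_vulnerabilities))

vuln_score(0) # 1: no known issues
vuln_score(3) # 0: any vulnerability fails the check
```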

Score 4: Release

This is a binary score: if the package under assessment is the latest version, it scores 1; otherwise, 0. We did investigate a more sophisticated scheme based on minor and major releases, but within the R community semantic versioning isn’t consistently followed, so we opted for a simpler rule.

Score 5: Exported Namespace Size

We score a package based on the number of exported objects. Fewer exported objects mean a smaller risk surface, and bugs are potentially less likely. As with codebase size, the question is: what counts as large? Analysing all packages on CRAN gave us suitable cut-offs. If a package is in the lower quartile for the number of exports, it scores 1. Otherwise, the empirical CDF is used.

Our analysis of CRAN suggests that most packages export relatively few objects. A modest package exporting 11 objects scores 0.5. Exporting around 26 objects reduces this to around 0.25.
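Counting exports is straightforward in base R, and the same quartile/ECDF rule then applies. The population below is made up, chosen so the resulting scores roughly line up with the figures quoted above.

```r
# Number of exported objects for an installed package
n_exports <- length(getNamespaceExports("stats"))

# Same quartile/ECDF rule as for codebase size
export_score <- function(n, population) {
  if (n <= quantile(population, 0.25)) {
    return(1)
  }
  1 - ecdf(population)(n)
}

exports <- c(2, 4, 6, 11, 18, 26, 45, 90) # hypothetical CRAN-wide counts
export_score(11, exports) # 0.5: a modest number of exports
export_score(26, exports) # 0.25: a larger namespace
```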

Score 6: Unit Test Coverage

We score based on the fraction of lines of code covered by a unit test. For validation of packages in the pharmaceutical sector, we also provide additional unit tests (remediated code coverage) and investigate exported-function test coverage.
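The score is simply the covered fraction. A coverage percentage could be obtained with {covr}; the call is commented out below, as it needs a package source tree, and the path is a placeholder.

```r
# Coverage score: fraction of lines exercised by unit tests
coverage_score <- function(pct_covered) {
  pct_covered / 100
}

# The percentage would typically come from {covr}, e.g.:
# cov <- covr::package_coverage("path/to/pkg") # placeholder path
# coverage_score(covr::percent_coverage(cov))

coverage_score(75) # 0.75
```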

Score 7: Dependencies

We score based on the number of dependencies a package has: the more dependencies, the lower the score. ‘Suggests’, ‘Enhances’, and base or recommended packages are not counted as dependencies when calculating this score.

[Figure: score for number of dependencies]

This is a data-driven score, based on all packages on CRAN (2025/03). If a package is in the lower quartile for the number of dependencies, it scores 1. Otherwise, the empirical CDF is used. In practice, this means a package with around 5 dependencies scores 0.5, decreasing to 0 at around 20 dependencies.
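Hard dependencies can be counted with base tools, filtering out base and recommended packages as described; the scoring rule mirrors the other ECDF-based metrics. The dependency counts below are hypothetical, and counting requires a package database such as `available.packages()`.

```r
# Count hard dependencies (Depends, Imports, LinkingTo), excluding base
# and recommended packages; `db` is e.g. available.packages()
count_hard_deps <- function(pkg, db) {
  deps <- tools::package_dependencies(
    pkg, db = db, which = c("Depends", "Imports", "LinkingTo")
  )[[1]]
  base_rec <- rownames(installed.packages(priority = c("base", "recommended")))
  length(setdiff(deps, base_rec))
}

# Quartile/ECDF rule, as for the other size-based scores
dep_score <- function(n, population) {
  if (n <= quantile(population, 0.25)) {
    return(1)
  }
  1 - ecdf(population)(n)
}

n_deps <- c(0, 1, 3, 5, 7, 9, 14, 20) # hypothetical CRAN-wide counts
dep_score(5, n_deps)  # 0.5: around five dependencies
dep_score(20, n_deps) # 0: around twenty dependencies
```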

Dependencies can be an emotive topic! As with all the other scores, this metric isn’t the “be-all and end-all”; it’s just an indication of package fragility.

Examples

For simplicity, we’ve removed the columns for vulnerabilities, R CMD check and release, as the score was 1 for all packages.

| Package | Dependencies | Exported Namespace | Test Coverage | Codebase Size |
|---|---|---|---|---|
| {drat} | 1.00 | 0.56 | 0.75 | 0.73 |
| {microbenchmark} | 1.00 | 1.00 | 0.56 | 0.84 |
| {shinyjs} | 0.82 | 0.13 | 0.03 | 0.66 |
| {tibble} | 0.36 | 0.12 | 0.82 | 0.17 |
| {tsibble} | 0.20 | 0.04 | 0.87 | 0.11 |

The scores above indicate that {tibble} and {tsibble} are relatively large, complex packages. These packages export many functions and have multiple dependencies. Reassuringly, they have high test coverage.

The {shinyjs} package has worryingly low test coverage. However, inspection of the code shows that there are many manual tests that aren’t captured. This highlights a key point: automated checks aren’t enough, especially in a validated setting. Part of Litmus is having a qualified person assess the package.

