Should I Use Your R Package?

Authors: Astrid Radermacher & Colin Gillespie

Published: March 31, 2025

tags: r, litmus, validation

The answer to this simple, innocuous question is: it depends.

It depends on the package in question, of course. Perhaps less obviously, but just as importantly, it depends on who’s asking the question.

We’re sure that if we asked you about “package quality”, we could all come up with a list of what makes a good package:

  • Documentation
  • Unit tests
  • Author credibility
  • Does the package have a web page?
  • Security vulnerabilities
  • Bug closure rate
  • Are there multiple maintainers?
  • Does the package have any reverse dependencies?

We could (and have) come up with another twenty of these attributes. With 95% confidence, we’re sure that most people would agree that everything we’ve thought of is important. But with 100% confidence, we are certain we would disagree on the relative importance of these characteristics. Surely unit testing is more important than the popularity of the package? But how important is documentation quality relative to the number of maintainers?

It all depends on why we are asking. It’s all about your risk appetite.

What is Risk Appetite?

Risk appetite is all about the risks you are and aren’t willing to take. It ranges from “our packages need to be vaguely sensible, not compromise our systems, and have a place where I can log bugs” to “if our packages aren’t thoroughly tested and proven to be fit for purpose, I can’t use them in production”. The former is fairly easy to report on, whereas the latter is quite a bit more complicated.

Figure: R users’ risk appetite, from least to most risk averse.

The Risk Seekers!

Who amongst us wouldn’t want a top-quality R package? So who are the risk seekers? Most of us, at some point or another. If you are experimenting with building Shiny applications, as long as the package is “secure”, any old package is fine - you just want to experiment. Likewise, if you are an academic and you want to compare your method to one already published, as long as the package is “correct”, that’s good enough.

During our training courses, we are often asked this question about quality. How bad can a package be and still be usable? A thought experiment we like to pose is: “suppose you had an R package with only one version. It’s never been updated, and no one has heard from the maintainer in ten years. But it provides code for an algorithm you want to use. What would you do?” The obvious answer for those with a high risk appetite is “something is better than nothing” and “proceed with caution”.

Risk Averse

There are lots of examples of where we are (and should be) risk-averse when it comes to R packages. For example:

  • In the pharmaceutical industry, we need reassurance that the statistics used in reporting are correct. It’s vital that these packages are highly regulated!
  • Accuracy and stability are crucial for official Government reports on the state of the economy. A minor bug could have significant consequences.
  • Banks also work in a regulated environment, running complex models, so they have to be careful about the accuracy of their data.

Another crucial aspect: organisations not only need to consider which packages they are using, but must also demonstrate this thinking in an auditable manner. This is not dissimilar to the ISO 9001 process. In the pharmaceutical industry, the holy grail is using R packages in FDA submissions for new therapies.

The R Validation Hub is Paving the Way

The pharmaceutical industry is the first to address these requirements in a meaningful way. The R Validation Hub published a white paper addressing the use of R and its packages for statistical analysis in pharmaceutical regulatory submissions, proposing a risk-based approach to validating R packages within validated infrastructure. The paper suggests that base R packages present minimal risk, whilst contributed packages require risk assessment based on their purpose, maintenance practices, community usage, and testing protocols.

The proposed framework classifies packages as either “Intended for Use” (loaded directly by users) or “Imports” (supporting dependencies), focusing validation efforts primarily on the former. Risk assessment should evaluate whether packages are statistical or non-statistical in nature, examine development practices, consider community adoption metrics, and review testing coverage. Organisations can use this assessment to determine package inclusion in validated systems and identify additional testing requirements, with high-risk packages needing more rigorous validation.
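
To make the classification concrete, here is a minimal sketch (not the R Validation Hub’s tooling) of how you might split a project’s packages into the two categories, using tools::package_dependencies() to look up dependencies from your configured repository. The package names are hypothetical examples.

```r
# A minimal sketch, NOT the R Validation Hub's tooling: split a project's
# packages into "Intended for Use" (loaded directly by users) and "Imports"
# (supporting dependencies). Package names are hypothetical examples.
intended_for_use <- c("dplyr", "ggplot2")

# Recursive dependencies, looked up from the configured CRAN repository
deps <- tools::package_dependencies(intended_for_use, recursive = TRUE)
imports <- setdiff(unlist(deps, use.names = FALSE), intended_for_use)

intended_for_use  # focus validation effort here
imports           # assess as supporting dependencies
```

Validation effort then concentrates on the first vector, with a lighter-touch assessment of the second.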

The approach required for those not working in regulated industries will probably not be as serious as this, but it gives an idea of what the gold standard for R package validation looks like, and we can draw inspiration from it for less strict applications. They’ve also created some helpful tools, such as {riskmetric}, which allows us to pull metadata about packages and create quality scores from these data.
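
As an illustration, the {riskmetric} documentation describes a three-step pipeline: create a package reference, assess it, then score the assessments. A minimal run (the metrics reported will vary with your riskmetric version) looks something like:

```r
library(riskmetric)

# The documented riskmetric pipeline: reference -> assess -> score.
# Each assessment (unit test coverage, news, bug closure rate, ...) is
# converted into a numeric score.
pkg_ref("ggplot2") |>
  pkg_assess() |>
  pkg_score()
```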

How Do We Enable Risk Assessment for Everyone Across the Risk Spectrum?

This is the question we have been grappling with over the past few months. How do we gather all of the information required to make informed decisions about including packages in production environments, using a flexible framework that meets the needs of everyone on the risk appetite spectrum? Especially considering…

There are so many packages on CRAN!

This is both a blessing and a curse, as anyone who’s ever worked in a regulated environment can tell you. The obvious answer is to automate, automate, automate! This is exactly what we’ve done in creating the Litmus package validation framework.

Our process relies on automation wherever possible:

  • We have written code based on {riskmetric} that pulls package metadata from CRAN, git repositories and Posit Package Manager to provide a comprehensive overview of the package’s qualities
  • We have created a framework to analyse and score packages based on these data
  • We have created reporting and dashboarding workflows that allow us to generate package- and collection-level overviews of the scores for each package
  • We’ve implemented automatic acceptance/rejection of a package based on client-specified criteria (sketched in the example after this list)
  • Our process also enables automated reporting of any additional manual steps taken to save a package from the bin, for example writing additional remedial tests or documentation
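
To give a flavour of the scoring and acceptance steps, here is a minimal sketch. This is not the actual Litmus implementation; the metric names, weights and cut-off are hypothetical stand-ins for client-specified criteria.

```r
# A minimal sketch of weighted scoring and automatic accept/reject.
# NOT the actual Litmus implementation: metric names, weights and the
# threshold are hypothetical stand-ins for client-specified criteria.
scores <- data.frame(
  package       = c("pkgA", "pkgB", "pkgC"),  # hypothetical packages
  has_tests     = c(1.0, 0.2, 0.9),           # each metric scaled to [0, 1]
  documentation = c(0.8, 0.5, 0.7),
  maintenance   = c(0.9, 0.1, 0.6)
)
weights <- c(has_tests = 0.5, documentation = 0.2, maintenance = 0.3)

# Weighted mean of the metric columns, one overall score per package
scores$overall <- drop(as.matrix(scores[, names(weights)]) %*% weights)

# Automatic acceptance/rejection against a client-specified threshold
threshold <- 0.6
scores$decision <- ifelse(scores$overall >= threshold, "accept", "reject")
scores
```

Packages that fall below the threshold can then be queued for the manual remediation steps mentioned above, with those actions logged for the audit trail.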

Keep an eye out for future blogs on this topic, as we dive a little deeper into the underlying principles driving our approach to package validation.

Does Your Package Pass the Litmus Test?

Ready to find out how we can help you validate your R package collection? Check out the Litmusverse and get in touch.

