
Diffify - Python release

Author: Myles Mitchell

Published: November 15, 2022

tags: python, r, diffify, packages

It has been six months since the launch of Diffify, our website for comparing package releases. We are delighted to announce that, in addition to CRAN’s 20,000 R packages, you can now track 1,600 popular Python packages!

What’s included?

The current criteria for a Python package to be included in Diffify are:

  • The package is listed in the top 2000 PyPI packages according to download statistics.
  • The package has had at least one version release since 1st May 2020.
  • The package wheel is downloadable from pypi.org.
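As a rough illustration, the three criteria can be written as a single check. This is only a sketch: the function name, its arguments and the decision to treat the cut-off date as inclusive are assumptions made for the example, not Diffify's actual code.

```python
from datetime import date

# Thresholds taken from the criteria above.
TOP_N = 2000
CUTOFF = date(2020, 5, 1)  # assumed to be inclusive


def is_included(download_rank, release_dates, has_wheel):
    """Return True if a package meets all three inclusion criteria.

    download_rank: position in the PyPI download ranking (1 = most downloaded)
    release_dates: dates of the package's version releases
    has_wheel:     whether a wheel can be downloaded from pypi.org
    """
    in_top = download_rank <= TOP_N
    recently_released = any(d >= CUTOFF for d in release_dates)
    return in_top and recently_released and has_wheel


# Hypothetical packages, purely to exercise the check.
print(is_included(12, [date(2019, 1, 1), date(2021, 11, 16)], True))  # True
print(is_included(2500, [date(2022, 1, 1)], True))  # False: outside the top 2000
```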

If your favourite package is not currently accessible, don’t worry! We are actively working to expand the list to as many PyPI packages as possible, as we’ll explain below.

New content

The first change you’ll notice is to our homepage, where we now have buttons for both R and Python.

A screenshot of the new Diffify homepage: In the sidebar there are links to home, R and Python. The main body has some introduction text, and now contains a link to Get started with Python.

Clicking on the Python button will take you through to the package search bar. For this walkthrough, we will compare versions 3.3.0 and 3.5.0 of the Matplotlib package. Diffify provides a breakdown of the changes to the package dependencies, functions and classes.

A screenshot of the version comparison page for the Python package Matplotlib: The later version is set to 3.5.0 and the earlier version is set to 3.3.0. Collapsible windows are displayed which contain changes to Dependencies, Functions and Classes.

Dependencies

We consider three kinds of dependencies:

  • The Python version requirement.
  • Required Python packages - these must be installed.
  • Optional Python packages - installing these will enable extra package features.

A screenshot of the Dependencies window: This includes tabs for the Python, Required and Optional dependencies. The Python requirement has changed from 3.6 to 3.7.

In our example, we see that the Python version requirement has changed from >=3.6 to >=3.7.
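For readers curious where this information comes from: a wheel's metadata records the Python requirement in a `Requires-Python` field and each package dependency in a `Requires-Dist` field, with optional dependencies carrying an `extra == "..."` marker. The sketch below sorts such entries into the three groups. The function name and the sample entries are made up for illustration, and the string matching is deliberately simple; a dedicated parser such as the `packaging` library would be a more robust choice.

```python
def classify_requirements(requires_python, requires_dist):
    """Split wheel metadata into Python, required and optional dependencies."""
    required, optional = [], []
    for entry in requires_dist:
        # Anything after ';' is an environment marker.
        spec, _, marker = entry.partition(";")
        # Entries guarded by an 'extra' marker are only installed on request.
        target = optional if "extra" in marker else required
        target.append(spec.strip())
    return {"python": requires_python, "required": required, "optional": optional}


# Illustrative entries only, not the real metadata of any package.
example = classify_requirements(
    ">=3.7",
    ["numpy>=1.17", "pillow>=6.2.0", 'pytest>=6.0; extra == "test"'],
)
print(example["python"])    # >=3.7
print(example["required"])  # ['numpy>=1.17', 'pillow>=6.2.0']
print(example["optional"])  # ['pytest>=6.0']
```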

Functions

Here we provide a list of functions that have been added, removed or changed between the two versions.

A screenshot of the Functions window: A list of package functions is displayed. Each entry displays the function name prefixed by the module path on the left, and a button for accessing the function Details on the right. Each function is colour-coded based on whether it has been added, removed or changed.

Clicking on the “Details” dropdown will bring up the function arguments, including the argument name and default value. If type annotations are included in the package source code, Diffify will also display the argument type and the function return type.

A screenshot of the expanded Details for the matplotlib.pyplot.grid function: A table is displayed showing the function arguments for each version, including the argument name, default value, and type. Changed arguments are highlighted. The return type of the function is displayed above this table.

For the pyplot.grid() function, the name of the first positional argument has changed from b to visible.
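You can reproduce this kind of comparison on a small scale with the standard library's `inspect` module. In the sketch below, `grid_old` and `grid_new` are simplified stand-ins for the two versions of the function, and `describe` and `changed_arguments` are hypothetical helpers. Note that this approach needs the functions to be importable, which is an assumption of the example rather than a description of how Diffify works.

```python
import inspect


# Simplified stand-ins for the two versions of pyplot.grid(); only the
# signatures matter here, so the bodies are empty.
def grid_old(b=None, which="major", axis="both", **kwargs):
    pass


def grid_new(visible=None, which="major", axis="both", **kwargs):
    pass


def describe(func):
    """Return (name, default) pairs for each argument, in order."""
    rows = []
    for param in inspect.signature(func).parameters.values():
        # An empty string marks an argument without a default value.
        has_default = param.default is not inspect.Parameter.empty
        rows.append((param.name, repr(param.default) if has_default else ""))
    return rows


def changed_arguments(old, new):
    """Pair up arguments by position and keep the pairs that differ.

    zip() stops at the shorter signature, so added or removed trailing
    arguments are not reported by this simple version.
    """
    return [(o, n) for o, n in zip(describe(old), describe(new)) if o != n]


print(changed_arguments(grid_old, grid_new))
# [(('b', 'None'), ('visible', 'None'))]
```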

Classes

Here we provide a list of classes that have been added, removed or changed.

A screenshot of the Classes window: A list of package classes is displayed. Each entry displays the class name prefixed by the module path on the left, and a button for accessing the class methods is displayed on the right. Each class is colour-coded based on whether it has been added, removed or changed.

Clicking on the “Methods” button for a class will bring up a pop-up that lists the methods that belong to that class. The example below shows the methods .__init__() and .from_dict(), which belong to the spines.Spines class.

A screenshot of the Methods pop-up window for the matplotlib.spines.Spines class: A list of methods belonging to the class is displayed. Each entry displays the method name on the left, and a button for accessing the method Details is displayed on the right. Each method is colour-coded based on whether it has been added, removed or changed.

Similar to functions, you can access the method arguments by clicking on “Details”.
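To give a feel for how methods might be labelled as added, removed or changed, here is a minimal sketch that parses two versions of a class with the standard library's `ast` module and compares their methods by argument names. The `Palette` class and the helper names are invented for this example, and comparing argument names alone is a simplifying assumption.

```python
import ast

# Two versions of a made-up class, used only to exercise the comparison.
OLD = '''
class Palette:
    def __init__(self):
        pass

    def keys(self):
        pass

    def close(self):
        pass
'''

NEW = '''
class Palette:
    def __init__(self):
        pass

    @classmethod
    def from_dict(cls, d):
        pass

    def keys(self, sort):
        pass
'''


def methods(source, class_name):
    """Map each method name to its list of argument names for one class."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return {
                item.name: [arg.arg for arg in item.args.args]
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            }
    return {}


def compare(old, new):
    """Label methods as added, removed or changed; unchanged ones are omitted."""
    status = {}
    for name in sorted(old.keys() | new.keys()):
        if name not in old:
            status[name] = "added"
        elif name not in new:
            status[name] = "removed"
        elif old[name] != new[name]:
            status[name] = "changed"
    return status


print(compare(methods(OLD, "Palette"), methods(NEW, "Palette")))
# {'close': 'removed', 'from_dict': 'added', 'keys': 'changed'}
```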

Removing clutter

The functions and classes listed above have been detected by analysing the package source code. We have taken various steps to filter out code that is intended for internal use by the package developers, including:

  • ignoring functions and scripts whose names start with an underscore
  • ignoring functions whose names start with test and classes whose names start with Test
  • leaving out scripts whose names start with test_ or end with _test.py

These criteria are intended to leave out internal code and unit tests.
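The naming rules above can be sketched in a few lines using the standard library's `ast` module. The helper names and the sample module are hypothetical, and only top-level definitions are inspected; treat this as an illustration of the rules rather than Diffify's actual filter.

```python
import ast

# A made-up module used only to exercise the filters.
SOURCE = '''
def plot(x, y): ...
def _helper(): ...
def test_plot(): ...
class Figure: ...
class TestFigure: ...
'''


def keep_script(filename):
    """Apply the script-level rules to a file name such as 'axes.py'."""
    stem = filename[:-3] if filename.endswith(".py") else filename
    return not (stem.startswith(("_", "test_")) or stem.endswith("_test"))


def public_names(source):
    """Return the top-level function and class names that pass the filters."""
    functions, classes = [], []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Functions: skip leading underscores and test* names.
            if not node.name.startswith(("_", "test")):
                functions.append(node.name)
        elif isinstance(node, ast.ClassDef):
            # Classes: skip Test* names.
            if not node.name.startswith("Test"):
                classes.append(node.name)
    return functions, classes


print(public_names(SOURCE))           # (['plot'], ['Figure'])
print(keep_script("axes.py"))         # True
print(keep_script("test_axes.py"))    # False
print(keep_script("axes_test.py"))    # False
```

Taken literally, the underscore rule would also skip `__init__.py`. The rules above do not say how that case is handled, so the sketch leaves it as is.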

Looking ahead

Python has been around for quite a while, and consequently it has many packages - roughly 400,000 on PyPI! Perhaps unsurprisingly, analysing so many packages for Diffify has proven to be a bit of a challenge…

This is why we have initially chosen to focus on the 2000 most popular PyPI packages. We will soon extend this to the top 5000, according to Top PyPI Packages. And we won’t be stopping there! It remains to be seen whether we will manage to add all 400,000, but we will certainly try our utmost.

Despite our best efforts to filter out clutter, you may still come across some functions and classes that are clearly intended for internal use or unit testing. We will continue to look at ways to improve our filters.

We hope you enjoy the new content! As always, if you spot any bugs or have any suggestions, please open an issue on our public GitHub.

Stay tuned for more updates…

