What's new for Python in 2025?

Author: Russ Hyde

Published: October 16, 2025

tags: r, python

Python 3.14 was released on 7th October 2025. Here we summarise some of the more interesting changes and some trends in Python development and data-science over the past year. We will highlight the following:

  • the colourful Python command-line interface;
  • project-management tool uv;
  • free-threading;
  • and a brief summary of other developments.

The Python 3.14 release notes also describe the changes to base Python.

Colourful REPL

At Jumping Rivers we have taught a lot of people to program in Python. Throughout a programming career you get used to making, and learning from, mistakes. The most common mistakes made in introductory programming lessons may still trip you up in 10 years' time: unmatched parentheses, typos, missing quote symbols, unimported dependencies.

Our Python training courses are presented using Jupyter. Jupyter notebooks have syntax highlighting that makes it easy to identify an unfinished string, or a mis-spelled keyword.

But most Python learners don't use Jupyter (or other high-level programming tools) on day one - they experiment with Python at the command line. You can type "python" into your shell/terminal window and start programming in the "REPL" (read-evaluate-print loop).

Any effort to make the REPL easier to work with helps beginning programmers, so the introduction of syntax highlighting in the Python 3.14 REPL is a particularly welcome change.

Whether you want to start from scratch, or improve your skills, Jumping Rivers has a training course for you.

uv and package development

One of the big trends in Python development in 2025 is the rise of the project-management tool uv. This is a Rust-based command-line tool that can be used to initialise a package / project structure, to specify the development and runtime environment of a project, and to publish a package to PyPI.

At Jumping Rivers, we have used poetry for many of the jobs that uv excels at. Python is used for the data-preparation tasks for diffify.com, and we use poetry to ensure that our developers each use precisely the same package versions when working on that project (see our current blog series on Poetry). But poetry doesn't prevent developers from using different versions of Python. For that, we need a second tool like pyenv (which allows switching between different Python versions), or each developer must have the same Python version installed on their machine.

uv goes a step further than poetry and allows us to pin Python versions for a project. Let’s use uv to install Python 3.14, so that we can test out features in the new release.

First follow the instructions for installing uv.

Then at the command line, we will use uv to create a new project where we’ll use Python 3.14.

# [bash]
cd ~/temp
mkdir blog-py3.14
cd blog-py3.14

# Which versions of Python 3.14 are available via uv?
uv python list | grep 3.14
# cpython-3.14.0rc2-linux-x86_64-gnu                <download available>
# cpython-3.14.0rc2+freethreaded-linux-x86_64-gnu   <download available>

You'll see something similar regardless of the operating system that you use. The listing shows two builds of Python 3.14: a standard build and one with an optional feature called "free threading" (see later). We'll install both:

uv python install cpython-3.14.0rc2-linux-x86_64-gnu
uv python install cpython-3.14.0rc2+freethreaded-linux-x86_64-gnu

Users of pyenv will be able to install Python 3.14 in a similar manner.
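
For example, a rough pyenv equivalent (assuming a recent pyenv with the 3.14 definitions available; the exact version string may differ) might look like:

# Which 3.14 builds does pyenv know about?
pyenv install --list | grep 3.14

# Install and select Python 3.14 for the current directory
pyenv install 3.14.0
pyenv local 3.14.0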

We can select between the two different Python versions at the command line. First using the version that does not have free threading:

uv run --python=3.14 python
# Python 3.14.0rc2 (main, Aug 18 2025, 19:19:22) [Clang 20.1.4 ] on linux
# ...
>>> import sys
>>> sys._is_gil_enabled()
# True

Then using the version with free threading (note the t suffix):

uv run --python=3.14t python
# ...
# Python 3.14.0rc2 free-threading build (main, Aug 18 2025, 19:19:12) [Clang 20.1.4 ] on linux
# ...
>>> import sys
>>> sys._is_gil_enabled()
# False

Project creation and management with uv

uv is capable of much more than allowing us to switch between different versions of Python. The following commands initialise a Python project with uv:

# From ~/temp/blog-py3.14

# Indicate the default python version for the project
uv python pin 3.14

# Initialise a project in the current directory
uv init .

# Check the Python version
uv run python --version
# Python 3.14.0rc2

This adds some files for project metadata (pyproject.toml, README.md) and version control:

tree -a -L 1
# .
# ├── .git
# ├── .gitignore
# ├── main.py
# ├── pyproject.toml
# ├── .python-version
# ├── README.md
# ├── uv.lock
# └── .venv
#
# 2 directories, 6 files

Now we can add package dependencies using uv add <packageName> and carry out other standard project-management tasks. But one thing I wanted to highlight is that uv allows us to start a Jupyter notebook, using the project's Python interpreter, without either adding jupyter as a dependency or explicitly defining a kernel for jupyter:

uv run --with jupyter jupyter lab

If you create a new notebook with the default Python 3 kernel in the JupyterLab session that starts, you should be working in the currently active Python 3.14 environment.
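
Adding regular dependencies is similarly direct. For example (the package names here are purely illustrative, not dependencies of this project):

uv add requests          # add a runtime dependency to pyproject.toml and uv.lock
uv add --dev pytest      # add a development-only dependency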

Threading

Python 3.13 introduced an experimental feature, ‘Free-threading’, that is now officially supported as of 3.14.

First though, what is a 'thread'? When a program runs on your computer, there are lots of different tasks going on, and some of those tasks could run independently of each other. You, as the programmer, may need to tell the computer which tasks those are. A thread is, roughly speaking, a way of cordoning off one of those tasks: it bundles up the logic for a piece of work and tells the computer that this task can run separately from the others.
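
As a minimal illustration (separate from the benchmark later in this post), here is how two independent tasks can be handed to separate threads using the standard-library threading module:

import threading
import time

def task(name):
    time.sleep(1)             # stand-in for some independent work
    print(f"{name} finished")

threads = [threading.Thread(target=task, args=(f"task-{i}",)) for i in range(2)]
for t in threads:
    t.start()                 # both tasks now run concurrently
for t in threads:
    t.join()                  # wait for both tasks to complete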

Python has allowed developers to define threads for a while. If you have a few tasks that are largely independent of each other, each of these tasks can run in a separate thread. Threads can access the same memory space, meaning that they can access and modify shared variables in a Python session. In general, this also means that a computation in one thread could update a value that is used by another thread, or that two different threads could make conflicting updates to the same variable. This freedom can lead to bugs. The CPython interpreter was originally written with a locking mechanism (the Global Interpreter Lock, GIL) that prevented different threads from running at the same time (even when multiple processors were available) and limited the reach of these bugs.
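
To make that concrete, here is a small sketch of the hazard being described: two threads updating the same counter, with a lock guarding the update (this example is illustrative and not part of the benchmark below):

import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:        # without the lock, updates from the two threads can be lost
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)            # 200000 with the lock; possibly fewer without it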

Traditionally, you would have used threads for "non-CPU-bound tasks" in Python. These are the kinds of tasks that would be unaffected by having more, or faster, processors available to the Python instance: network traffic, file access, waiting for user input. For CPU-bound tasks, like calculations and data-processing, you could use Python's 'multiprocessing' library, which starts multiple Python instances, each doing a portion of the processing, so that a workload can be partitioned across multiple processors (although some libraries, like 'numpy', have their own low-level mechanisms for splitting work across cores).
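
For comparison, a minimal multiprocessing sketch for a CPU-bound workload might look like this (the calculation is just a placeholder):

from multiprocessing import Pool

def expensive(n):
    return sum(i * i for i in range(n))    # stand-in for a CPU-heavy calculation

if __name__ == "__main__":
    with Pool() as pool:                   # one worker process per CPU by default
        results = pool.map(expensive, [200_000] * 8)
    print(len(results))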

The other main differences between threading and multiprocessing in Python are in memory and data management. With threading, you have one Python instance, with each thread having access to the same memory space. With multiprocessing, you have multiple Python instances that work independently: the instances do not share memory, so to partition a workload using multiprocessing, Python has to send copies of (subsets of) your data to the new instances. This could mean that you need to store two or more copies of a large dataset in memory when using multiprocessing on it.

Simultaneous processing across threads that share memory-space is now possible using the free-threaded build of Python. Many third-party packages have been rewritten to accommodate this new build and you can learn more about free-threading and the progress of the changes in the “Python Free-Threading Guide”.

As a simple-ish example, let's consider natural language processing. There is a wonderful blog post about parallel processing with the nltk package on the "WZB Data Science Blog". We will extend that example to use free-threading.

nltk provides access to some of the Project Gutenberg books, and we can access this data as follows:

# main.py
import nltk

def setup():
    nltk.download("gutenberg")
    nltk.download("punkt_tab")
    nltk.download('averaged_perceptron_tagger_eng')
    corpus = {
        f_id: nltk.corpus.gutenberg.raw(f_id)
        for f_id in nltk.corpus.gutenberg.fileids()
    }
    return corpus

corpus = setup()

The key-value pairs in corpus are the abbreviated book-title and contents for 18 books. For example:

corpus["austen-emma.txt"]
# [Emma by Jane Austen 1816]
#
# VOLUME I
#
# CHAPTER I
#
#
# Emma Woodhouse, handsome, clever, and rich, with a comfortable home ...

A standard part of a text-processing workflow is to tokenise and tag the “parts-of-speech” (POS) in a document. We can do this using two nltk functions:

# main.py ... continued
def tokenise_and_pos_tag(doc):
    return nltk.pos_tag(nltk.word_tokenize(doc))
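
To illustrate the shape of the output (the exact tags depend on the tagger model), a call like the following returns a list of (token, part-of-speech) pairs:

tokenise_and_pos_tag("Emma Woodhouse, handsome, clever, and rich")
# [('Emma', 'NNP'), ('Woodhouse', 'NNP'), (',', ','), ('handsome', 'JJ'), ...]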

We can write a function that sequentially tokenises and POS-tags the contents of a corpus of books:

# main.py ... continued
def tokenise_seq(corpus):
    tokens = {
        f_id: tokenise_and_pos_tag(doc)
        for f_id, doc in corpus.items()
    }
    return tokens

You need to install or build Python in a particular way to make use of "free-threaded" Python. Above, we installed Python "3.14t" using uv, so we can compare the speed of free-threaded processing against sequential, single-core processing.

We will use the timeit module, run from the command line, to measure processing speed.

# Activate the threaded version of Python 3.14
uv python pin 3.14t

# Install the dependencies for our main.py script
# (timeit is part of the standard library, so only nltk needs to be added)
uv add nltk

# Time the `tokenise_seq()` function
# -- but do not time any setup code...
PYTHON_GIL=0 \
  uv run python -m timeit \
  --setup "import main; corpus = main.setup()" \
  "main.tokenise_seq(corpus)"

# [lots of output messages]
# 1 loop, best of 5: 53.1 sec per loop

After some initial steps in which the nltk datasets were downloaded and the corpus object was created (neither of which was timed, because these steps were part of the timeit --setup block), tokenise_seq(corpus) was run several times and the fastest run took around 53 seconds.

A small note: we have used the environment variable PYTHON_GIL=0 here. This makes it explicit that we are using free-threading (turning off the GIL). This wouldn’t normally be necessary to take advantage of free-threading (in Python “3.14t”), but was needed because one of the dependencies of nltk hasn’t been validated for the free-threaded build yet.

To write a threaded version of the same, we introduce two functions. The first is a helper that takes (filename, document-content) pairs and returns (filename, processed-document) pairs:

def tupled_tokeniser(pair):
    file_id, doc = pair
    return file_id, tokenise_and_pos_tag(doc)

The second function creates a thread pool, taking advantage of as many CPUs as are available on my machine (16, counted by multiprocessing.cpu_count()). Each document is processed as a separate task on the pool, and we wait for all of the documents to be processed before returning results to the caller:

import multiprocessing as mp
from concurrent.futures import ThreadPoolExecutor, wait
# ...
def tokenise_threaded(corpus):
    with ThreadPoolExecutor(max_workers=mp.cpu_count()) as tpe:
        try:
            futures = [
                tpe.submit(tupled_tokeniser, pair)
                for pair in corpus.items()
            ]
            wait(futures)
        finally:
            # output is a list of (file-id, data) pairs
            tokens = [f.result() for f in futures]
    return tokens
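
As an aside, the same submit-and-wait pattern could be written more compactly with ThreadPoolExecutor.map; a sketch (reusing the imports above, and not the version used for the timings below):

def tokenise_threaded_map(corpus):
    with ThreadPoolExecutor(max_workers=mp.cpu_count()) as tpe:
        # one task per (file-id, document) pair; results come back in input order
        return list(tpe.map(tupled_tokeniser, corpus.items()))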

# Time the `tokenise_threaded()` function
# -- but do not time any setup code...
PYTHON_GIL=0 \
  uv run python -m timeit \
  --setup "import main; corpus = main.setup()" \
  "main.tokenise_threaded(corpus)"
# [lots of output messages]
# 1 loop, best of 5: 32.5 sec per loop

Using the htop tool on Ubuntu, I could see that every core was used when processing the documents. At points during the run, each of the 16 CPUs was at close to 100% usage (whereas only one or two CPUs were busy at any time during the sequential run):

[Figure: htop output showing all 16 CPUs busy during the threaded run]

But, despite using 16x as many CPUs, the multithreaded version of the processing script only took about 40% less time. There were only 18 books in the dataset, and there is some disparity between the book lengths (the Bible, containing millions of words, took much longer to process than the others). Maybe the speed-up would be greater with a larger or more balanced dataset.

In the post on the WZB Data Science blog, there is a multiprocessing implementation of the above. Running their multiprocessing code with 16 CPUs gave a similar speed-up to multithreading (minimum time 31.2 seconds). Indeed, if I were writing this code for a real project, multiprocessing would remain my choice, because the analysis for one book can proceed independently of that for any other book and the data volumes aren't that big.
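
For reference, a minimal multiprocessing version of this workload, reusing the helpers defined above, might look something like this (a sketch, not the WZB implementation):

import multiprocessing as mp
# ...
def tokenise_multiprocess(corpus):
    # each worker process receives copies of the documents it is sent
    with mp.Pool(processes=mp.cpu_count()) as pool:
        return pool.map(tupled_tokeniser, list(corpus.items()))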

Other News

Python 3.14 has also introduced some improvements to exception-handling, a new approach to string templating and improvements to the use of concurrent interpreters. See the Python 3.14 release notes for further details.

In the wider Python Data Science ecosystem, a few other developments have occurred or are due before the end of 2025:

  • The first stable release of the Positron IDE was made in August;
  • Pandas 3.0 is due before the end of the year, and will introduce strings as a data-type, copy-on-write behaviour, and implicit access to columns in DataFrame-modification code;
  • Tools that ingest DataFrames are becoming agnostic to the DataFrame library used, through the Narwhals project (a rough sketch of the idea follows this list). See the Plotly write-up on this subject.
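
A rough sketch of the dataframe-agnostic idea behind Narwhals (the exact function names should be checked against the Narwhals documentation):

import narwhals as nw

def column_mean(native_df, column):
    # native_df can be a pandas, Polars, ... DataFrame
    df = nw.from_native(native_df)
    return df.select(nw.col(column).mean()).to_native()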

Python data science progresses at such a speed that we can only really scratch the surface here. Have we missed anything in the wider Python ecosystem (2025 edition) that will make a huge difference to your data work? Let us know on LinkedIn or Bluesky.

