If you've never done it, I recommend using the `dir` function in a REPL: find interesting things inside your objects, do `dir` on those, and keep the recursion going. It is a very eye-opening experience as to just how deep the objects in Python go.
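A quick sketch of that recursion from a REPL (the exact attribute names vary by Python version):

```python
# Start anywhere and keep drilling down with dir().
x = 42
attrs = [name for name in dir(x) if not name.startswith('_')]
print(attrs[:4])  # e.g. ['as_integer_ratio', 'bit_count', ...] (varies by version)

# Pick something interesting and recurse: methods are objects too,
# with their own attributes to explore.
method = x.to_bytes
print(type(method))               # <class 'builtin_function_or_method'>
print('__call__' in dir(method))  # -> True; dir(method.__call__) keeps going
```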
Although, there are also modern, beautiful, user friendly languages where allocation is mostly obvious. Like Fortran.
I ended up with the notation:

    # Initialization
    head = ()

    # Push
    head = data, head

    # Safe pop
    if head:
        data, head = head

    # Safe top
    head[0] if head else None
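A minimal self-contained run of this tuple stack (same names as above):

```python
# Tuple-based stack: each node is a 2-tuple (data, rest); the empty stack is ().
head = ()

# Push 1, 2, 3.
for data in (1, 2, 3):
    head = data, head

# Safe top: peek without popping.
top = head[0] if head else None
print(top)  # -> 3

# Safe pop: unpack the 2-tuple back into (data, head).
if head:
    data, head = head
print(data)      # -> 3
print(head[0])   # -> 2 (new top)
```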
And for many stack-based algorithms I've found this to be quite fast, in part because the length-2 tuples get recycled (and also because there are no function calls, member accesses, etc.). But I'm rather embarrassed to put it into a codebase: others expect Python code to be beautiful, and this seems weird.

> Integers are likely the most used data type of any program, that means a lot of heap allocations.
I would guess strings come first, then floats, then booleans, and then integers. Are there any data available on that?
zahlman•4d ago
> Let’s take out the print statement and see if it’s just the addition:
Just FWIW: the assignment is not required to prevent the useless addition from being optimized out. The compiler isn't doing any static analysis here: it doesn't know that `range` is the builtin, and thus doesn't know that `i` is an integer, and thus doesn't know that `+` will be side-effect-free.
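A sketch using the stdlib `dis` module to confirm this: the bare `i + 1` survives compilation even with no assignment (the opcode is `BINARY_OP` on 3.11+, `BINARY_ADD` earlier):

```python
import dis

# Compile the loop body with no assignment at all.
code = compile("for i in range(10):\n    i + 1\n", "<demo>", "exec")
names = [ins.opname for ins in dis.get_instructions(code)]

# The addition is still emitted (its result is just popped afterwards):
# the compiler can't prove `range` is the builtin, so it can't assume
# `+` is side-effect-free and must keep it.
print(any(op.startswith("BINARY") for op in names))  # -> True
```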
> Nope, it seems there is a pre-allocated list of objects for integers in the range of -5 -> 1025. This would account for 1025 iterations of our loop but not for the rest.
1024 iterations, because the check is for numbers strictly less than `_PY_NSMALLPOSINTS` and the value computed is `i + 1` (so, `1` on the first iteration).
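The off-by-one can be checked directly (assuming the cutoff value 1025 from the quoted snippet; `_PY_NSMALLPOSINTS` here is just a stand-in constant):

```python
_PY_NSMALLPOSINTS = 1025  # assumed value, per the quoted article

# The cache check is strict: value < _PY_NSMALLPOSINTS, and the loop
# computes i + 1, so i = 0..1023 qualify.
hits = sum(1 for i in range(2000) if i + 1 < _PY_NSMALLPOSINTS)
print(hits)  # -> 1024
```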
Interesting. I knew of them only ranging up to 256 (https://stackoverflow.com/questions/306313).
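The 256 cutoff is observable with `is` in CPython (an implementation detail, not a language guarantee); constructing the ints from strings sidesteps constant folding, which would otherwise share the literal:

```python
# Values from -5 to 256 come from the small-int cache, so two
# independent constructions return the very same object (CPython detail).
print(int('256') is int('256'))  # -> True

# 257 is outside the cache (through 3.14), so each call allocates afresh.
print(int('257') is int('257'))  # -> False
```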
It turns out (https://github.com/python/cpython/commit/7ce25edb8f41e527ed4...) that the change is barely a month old in the repository; so it's not in 3.14 (https://github.com/python/cpython/blob/3.14/Include/internal...) and won't show up until 3.15.
> Our script appears to actually be reusing most of the PyLongObject objects!
The interesting part is that it can somehow do this even though the values are increasing throughout the loop (i.e., to values not seen on previous iterations), and it also doesn't need to allocate for the value of `i` retrieved from the `range`.
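A rough way to observe that reuse (hedged: this depends on allocator behavior, so the exact count isn't guaranteed): each temporary `i + big` dies before the next one is allocated, so CPython can hand back the same memory block, and the `id()` values collapse to far fewer than one per iteration:

```python
big = 10_000  # well outside the small-int cache
# id() is taken while the temporary is alive; the temporary is then freed
# before the next iteration allocates, so addresses tend to repeat.
seen = {id(i + big) for i in range(100)}
print(len(seen) < 100)  # -> True on CPython (typically just a handful of ids)
```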
> But realistically the majority of integers in a program are going to be less than 2^30 so why not introduce a fast path which skips this complicated code entirely?
This is the sort of thing where PRs to CPython are always welcome, to my understanding. It probably isn't a priority, or something that other devs have thought of, because that allocation presumably isn't a big deal compared to the time taken for the actual conversion, which in turn is normally happening because of some kind of I/O request. (Also, real programs probably do simple arithmetic on small numbers much more often than they string-format them.)
petters•1h ago
I think that is mostly of historical interest: for example, it still does not support Python 3 and has not been updated in a very long time.
cogman10•41m ago
[1] https://www.graalvm.org/python/