SQLAlchemy and Idle in transaction queries

Orphaned transactions are a problem that could lead to messy states in the database. This is usually a good indicator that the something is not right in the manner in which connections are established and closed on the services that query the database.

How does one find these queries

This query lists all the database connections and their current state

This usually ends up returning a bunch of transactions

We have one transaction in the idle in transaction state, one in the active state and the rest in idle state.


SQLAlchemy, Pools and Idle connections

Now, the idle state indicates that a bunch of connections have been established and remain connected without performing any operations on the database. With respect to SQLAlchemy, this refers to the pool configurations. You’ll usually notice a large number of connections remaining idle depending on how you checkout and release connections back to your pool.


For example

In this example, we tell SQLAlchemy to create a pool of 35 connections (with a maximum of 65, indicated by the max_overflow).

The pool_use_lifo specifies the mechanism with which connections are checked out and closed. Using LIFO reduces the number of connections used during non-peak periods of use.

LIFO essentially specifies that the last used connection (released back to the pool) to the pool is the one that is used for the next request.


This setting is largely a preference and depends on how you want your application to behave. It’s important that though having idle connections is not a major drawback, a large number of idle connections do consume resources on the database.


Terminating Idle sessions

The most effective approach is to handle this using the session pool manager and ensuring the pool size is adequate and uses the LIFO strategy.

However, this can also be configured on postgresql using the idle_session_timeout attribute. Setting this to an integer (indicating duration in milliseconds) in postgresql.conf ensures that idle sessions are terminated after the period.

A more brute force approach to this is described in this post.


SQLAlchemy, Pools and Idle In Transaction sessions

The idle in transaction connections are slightly trickier and occur due to connections stuck in a transaction. This usually occurs when a connection has issued a BEGIN statement (and possibly a few other queries post that) and is waiting for the transaction to be committed or rolled back.


In my opinion, the culprit for the issue is the way in which SQLAlchemy defines the configurations for this. It’s important to acknowledge that it adheres to the approach defined in the PEP 249


Now, when a connection is checked out on an engine defined below it issues a BEGIN statement by default.

Now, let’s assume that we make the following query

We end up seeing our transaction in the Idle in transaction state.

This brings us to Postgres’ default state, which is the AUTO COMMIT mode, which is intuitive. The database doesn’t expect us to wrap queries in a BEGIN and COMMIT block and invokes the command immediately when returned. Note, we’re referring to PostgreSQL here and not SQLAlchemy.


When we need an explicit transaction, we’re expected to wrap those in a BEGIN - COMMIT or ROLLBACK block.


But, with SQLAlchemy explicitly invoking a BEGIN when a connection is checked out, we’re forced to wrap all SELECT queries in a transaction block. Now, a select clause doesn’t really warrant a transaction block and not doing so leaves us with dangling orphaned sessions.


Terminating Idle sessions

As before the brute force approach is still an option and is described in this post here but one can also configure this directly with PostgreSQL in the postgresql.conf file setting the idle_in_transaction_session_timeout. Now, this can get tricky as this could leave the application in an unpredictable state.


So, the only reasonable thing to do here is to call a ROLLBACK or COMMIT at the end of each invoked session. Closing the session automatically issues a ROLLBACK

SQLAlchemy and ISOLATION_LEVEL=AUTOCOMMIT

So, this threw me off, given that one rarely refers to AUTOCOMMIT as an isolation level. Now, this isn’t really a database isolation level. The transaction isolation levels supported by Postgres are UNCOMMITTED, READ COMMITTED, REPEATABLE READ and SERIALIZABLE.


But for some reason SQLAlchemy defines a configuration called isolation_level on the engine which sets the database API into a non-transactional auto commit mode. What this means is that the default BEGIN issued when checking out a connection is no longer issued. The onus is now on the user to issue a BEGIN, ROLLBACK clause when needed. It also seems to set the actual database isolation level to READ COMMITTED.

Previous
Previous

PostgreSQL - Cancelling statement due to conflict with recovery

Next
Next

Transactional Unit Tests with Pytest and Async SQLAlchemy