35

I have created a library in Python that contains functions for accessing a database. It is a wrapper library around a third-party application's database, written because the third-party application does not offer a decent API. Originally I let each function open a database connection for the duration of the function call, which was fine until my program logic used nested calls to the functions and I ended up calling a particular function a few thousand times. That wasn't very performant. Profiling showed that the overhead was in the database connection setup, once per function call. So I moved the connection open from within the function(s) to the module itself, so that the database connection is opened when the library module is imported. This gives me acceptable performance.
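In outline, the change looked something like this (names simplified; db_module stands in for the actual driver):

# Before: every call opened and closed its own connection
def lookup_widget(widget_id):
    conn = db_module.connect('host', 'user', 'password', 'db')
    try:
        cur = conn.cursor()
        cur.execute('SELECT * FROM widgets WHERE id = %s', (widget_id,))
        return cur.fetchone()
    finally:
        conn.close()

# After: one connection, opened once when the module is imported
_conn = db_module.connect('host', 'user', 'password', 'db')

def lookup_widget(widget_id):
    cur = _conn.cursor()
    cur.execute('SELECT * FROM widgets WHERE id = %s', (widget_id,))
    return cur.fetchone()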

Now I have two questions regarding this. Firstly, do I need to be concerned that I am no longer explicitly closing the database connection, and how could I close it explicitly with this set-up? Secondly, does what I have done fall anywhere close to good practice, and how might I otherwise approach this?

leancz

4 Answers

42

It really depends on the library you're using. Some of them could be closing the connection on their own (note: I checked the built-in sqlite3 library, and it does not). Python will call a destructor when an object goes out of scope, and these libraries might implement a destructor that closes the connection gracefully.

However, that might not be the case! I would recommend, as others have in the comments, wrapping it in an object.

class MyDB(object):

    def __init__(self):
        # db_module stands in for whichever DB-API 2.0 driver you use
        self._db_connection = db_module.connect('host', 'user', 'password', 'db')
        self._db_cur = self._db_connection.cursor()

    def query(self, query, params):
        return self._db_cur.execute(query, params)

    def __del__(self):
        self._db_connection.close()

This will instantiate your database connection at the start, and close it when the scope in which your object was instantiated ends. Note: if you instantiate this object at the module level, it will persist for your entire application. Unless this is intended, I would suggest separating your database functions from the non-database functions.

Luckily, Python has standardized the Database API (PEP 249), so this will work with all of the compliant DBs for you :)
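Usage is then just a matter of creating an instance and querying through it (the table and column below are made up, and db_module stands in for your actual DB-API driver):

db = MyDB()                                       # the connection is opened here
db.query('SELECT name FROM users WHERE id = %s', (42,))
rows = db._db_cur.fetchall()                      # or expose a fetch method on MyDB
# When `db` goes out of scope (or the interpreter exits), __del__ closes the connection.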

Travis
  • How do you avoid that self in def query(self,? – samayo Nov 09 '15 at 19:08
  • define avoid? Self is what defines that as an instance method, rather than a class method. I guess you could create the database as a static property on the class, and then only use class methods (no self needed anywhere), but then the database would be global to the class, not just the individual instantiation of it. – Travis Nov 09 '15 at 22:46
  • Yeah, because I tried to use your example to make a simple query as db.query('SELECT ...', var) and it complained about needing a third argument. – samayo Nov 09 '15 at 23:30
  • @samson, you need to instantiate MyDB object first: db = MyDB(); db.query('select...', var) – cowbert Sep 07 '17 at 18:00
  • This prevented messages ResourceWarning: unclosed <socket.socket... – Bob Stein Sep 14 '17 at 16:36
  • Shouldn't a new cursor instance be created on each query? https://stackoverflow.com/questions/8099902/should-i-reuse-the-cursor-in-the-python-mysqldb-module – skeller88 Jan 17 '18 at 19:14
  • shouldn't the cursor instance be closed? also I don't think the query method actually works - per the same linked documentation - Return values are not defined. – cpd1 Nov 02 '18 at 21:37
  • Isn't using the destructor __del__ bad practice in python? – Michele Piccolini Feb 28 '20 at 15:32
  • @skeller88 No, you should not create a new cursor for each query. Reuse one for everything in that operation if you can (you can, if it gets all rows before moving on). Even if the columns are completely different and everything. If the database is different, then you need a cursor from that other database. Advice saying to make a new cursor every time is superstition. – doug65536 Nov 06 '21 at 22:33
8

When handling database connections there are two things to be concerned about:

  1. Preventing multiple connection instantiations. Letting each function open its own database connection is considered bad practice: given the limited number of database sessions, you would run out of sessions, or at the very least your solution would not scale. Instead, use the singleton pattern, so that your class is instantiated only once; for further information about this pattern see link.

  2. Closing the connection on application exit. Suppose you didn't, and you had at least a dozen instances of the app running and doing the same thing. At first everything would go fine, but eventually you would run out of database sessions, and the only fix would be to restart the database server, which is not a good thing for a live app. Hence, reuse the same connection wherever possible.

To solidify these concepts, see the following example, which wraps psycopg2:

import psycopg2

class Postgres(object):
    """Singleton wrapper around a single psycopg2 connection."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = object.__new__(cls)
            # Normally the DB credentials would be fetched from a config file
            # or the environment, i.e. they should not be hardcoded as follows.
            db_config = {'dbname': 'demo', 'host': 'localhost',
                         'password': 'postgres', 'port': 5432, 'user': 'postgres'}
            try:
                print('connecting to PostgreSQL database...')
                connection = Postgres._instance.connection = psycopg2.connect(**db_config)
                cursor = Postgres._instance.cursor = connection.cursor()
                cursor.execute('SELECT VERSION()')
                db_version = cursor.fetchone()

            except Exception as error:
                print('Error: connection not established {}'.format(error))
                Postgres._instance = None

            else:
                print('connection established\n{}'.format(db_version[0]))

        return cls._instance

    def __init__(self):
        self.connection = self._instance.connection
        self.cursor = self._instance.cursor

    def query(self, query, params=None):
        try:
            result = self.cursor.execute(query, params)
        except Exception as error:
            print('error executing query "{}", error: {}'.format(query, error))
            return None
        else:
            return result

    def __del__(self):
        self.cursor.close()
        self.connection.close()
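A quick usage sketch (my_table is just a placeholder table name); because the class is a singleton, every later instantiation reuses the first connection:

db = Postgres()      # first call: opens the connection and prints the server version
db2 = Postgres()     # same instance, same connection, no new database session
assert db is db2

db.query('SELECT * FROM my_table WHERE id = %s', (1,))
print(db.cursor.fetchall())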

rachid
  • Hi! Thank you for your answer. But when I try to implement it in my case I have if Database._instance is None: NameError: name 'Database' is not defined. I cannot understand what Database is and how I could fix this. – franchb Oct 15 '17 at 18:16
  • @IlyaRusin that was my fault; in fact, Database is just a parent class in which I put common methods for handling different RDBMSs, since I connect not only to Postgres. Sorry for the mistake, and I hope the corrected version is useful to you. Feel free to add to or modify the code for your needs, and if you have any related question, don't hesitate. – rachid Oct 15 '17 at 19:38
  • If I would repeatedly call Postgres.query(Postgres(), some_sql_query) in a while loop, would it still open and close the connection in each iteration, or keep it open for the entire time of the while loop until the program exits? –  Oct 10 '19 at 11:51
  • @Michael the connection class is implemented as a singleton, so it is instantiated only once. Overall, though, I would recommend against the suggested way of calling it; initialise it in a variable instead. – rachid Oct 11 '19 at 07:56
  • @ponach Thanks, it does exactly what I wanted to achieve. I adapted your code a bit and tried to use an UPDATE statement in your query() function, but it seems there is a problem with my code when I run my app in "parallel". I made a separate question about it: https://softwareengineering.stackexchange.com/questions/399582/update-statements-from-multiple-database-connections-not-seeing-their-updates –  Oct 11 '19 at 09:54
  • Nice class! The only improvement I'd see is the adding of a "value=()" parameter for the query function. AFAIK it's the best practice to let the library fill the "%s" placeholders in the query string. https://www.psycopg.org/docs/usage.html#passing-parameters-to-sql-queries – Zapnuk Apr 17 '20 at 13:07
  • @Zapnuk I totally agree; at the time of writing I wasn't handling parameters, but as you said it is a much-needed change. – rachid Apr 17 '20 at 22:27
  • It's late, but whenever I instantiate Postgres it connects but does not execute the query with the query method. Any suggestions? – Pranav Bhatt Mar 29 '21 at 07:24
7

It would be interesting to offer context-manager capabilities for your objects. That means you could write code like this:

class MyClass:
    def __init__(self):
        ...  # connect to the DB here
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        ...  # close the connection here

This gives you a handy way to close the connection to the database automatically, by using the class in a with statement:

with MyClass() as my_class:
   # do what you need
# at this point, the connection is safely closed.
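For a concrete, runnable sketch of the same pattern, here is a version using the standard library's sqlite3 module (an in-memory database and a throwaway table, purely for illustration; swap in your own driver and connection parameters):

import sqlite3

class MyClass:
    def __init__(self, path=':memory:'):
        # connect to the DB
        self._conn = sqlite3.connect(path)

    def query(self, sql, params=()):
        return self._conn.execute(sql, params)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # close the connection
        self._conn.close()

with MyClass() as db:
    db.query('CREATE TABLE t (x INTEGER)')
    db.query('INSERT INTO t VALUES (?)', (1,))
    print(db.query('SELECT x FROM t').fetchall())
# here the connection has already been closed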
1

Create a file conn.py and save it in the /usr/local/lib/python3.5/site-packages/conn/ folder (I use FreeBSD, and this is the path of my site-packages folder). In my conn.py:

conn = "dbname=omnivore user=postgres password=12345678"

and in the script where I want to use the connection, I write:

import psycopg2
import psycopg2.extras
import psycopg2.extensions

from conn import conn

try:
    conn = psycopg2.connect(conn)  # conn starts out as the DSN string from conn.py
except psycopg2.Error:
    page = "Can not access database"

cur = conn.cursor()

phong