Basic Mistakes in Database Testing
There is a huge number of books about TDD. Those books usually concentrate on using test for units of work. Units of work are understood in many different ways, usually it means a class. So the books say: write a lot of tests, make code in such a way the the tests pass. All external resources should be mocked so you can test only this unit.
That’s cool, but unfortunately in many cases all of the testing stops at this moment. There are some queries (written by hand or generated by some ORMs) but they are not tested, usually. Some programmers test those using integration tests – tests which connect to a real database and perform real queries. This usually means testing the happy path – I’ve got my ORM so it takes care of everything, I don’t need to think.
Database is usually the most valuable asset a company owns. Applications are rewritten from time to time. Old applications are thrown away, new applications are installed. Nobody throws away the database full of data when changing the applications. The database is carefully migrated. There are systems where many different applications use the same database at the same time. That’s why it is important to have good database model full of constraints and why database should be treated with care. You really don’t want to destroy the data consistency as this could cost your company too much.
This article is about often forgotten database testing. Integration tests using real databases. It really doesn’t matter what kind of database engine you are using. You can use PostgreSQL, MySQL, Oracle or even those funny noSQL databases like MongoDB. Below rules can be applied to all kinds of databases and all kinds of applications. Maybe not all of them, as noSQL databases cannot enforce data integrity.
Your application usually consists of many different parts. There is some <put here any language you like> code, some configuration files, some SQL queries, some external systems. Testing an application means testing each part separately (only because that makes finding bugs easier) and testing how all the parts cooperate. Database is one of suchs parts, and you should test it thoroughly. Continue reading
I was explaining the PostgreSQL version naming convention to my colleague a couple of days ago. The differences between e.g. 8.3 and 8.4 and what is different between 8.4 and 9.0 and what should be used during database migration to another version. I knew all that just because somebody told me that many years ago (I think it was depesz – thanks).
However I was unaware that there is documentation of the whole versioning issue. All you can find here.
Another reason why love this project. And yes, you should have that versioning policy in you project too (it really doesn’t matter if it looks different – better to have a convention than not).
I observe the endless war about TDD. On one side there are those who claim that TDD is the best thing since sliced bread. On the other side there are those who claim that TDD is just a waste of time and is too slow.
But what about databases? Nobody writes about them too seriously.
In some books there are some wise words like:
- we should use TDD for DAO
- we should use TDD for ORM objects
- we should use TDD for queries
This is very good for testing unique, foreign and primary keys. Very good for all checks. What is missing here? Continue reading
Some time ago I wrote about ensuring that there is only one row with a specific value in a column.
Today a very similar problem: there is a table like this:
CREATE TABLE test (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
value TEXT NOT NULL
We need to be sure that there is only one row withing specific name with value=’a’. There can be more rows with this name, but only one can have value ‘a’. Continue reading
During the last database training that I led, I was talking about selecting rows from database and sorting them (including some problems with collations). Later I was talking about triggers. For this entry assume that there are only row level triggers.
When you perform e.g. INSERT in PostgreSQL, there are fired triggers (if they exists, of course). At the very beginning there are fired all BEFORE INSERT triggers. Then INSERT is made, and later all triggers AFTER INSERT are fired.
If more than one trigger is defined for the same event on the same relation, the triggers will be fired in alphabetical order by trigger name.
And then suddenly I was asked:
- OK, that’s clear, but which collation is used for ordering the triggers?
Good point, I really had no idea. So let’s check that.