A simple definition for what a database management system (DBMS) is, would be that it is a complex set of software programs that control the organization, storage and retrieval of data in a database. It also controls the security and integrity of the database.
This article will not attempt to give a detailed explanation of database technology, rather it will serve to introduce the IT auditor to some of the concepts that will be necessary to be understood and performed to support an audit of a DBMS.
But first, in order to understand DBMS there is some database terminology and definitions you will need to understand:
- Concurrency Control – Refers to the class of controls used in database management systems (DBMS) to ensure that transactions are processed in an atomic, consistent, isolated and durable manner (ACID). This implies that only serial and recoverable schedules are permitted, and that committed transactions are not discarded when undoing aborted transactions.
- Data Structure – The relationships among files in a database and among data items within each file.
- Database – A stored collection of related data needed by organizations and individuals to meet their information processing and retrieval requirements.
- Database Administrator (DBA) – An individual or department responsible for the security and information classification of the shared data stored on a database system. This responsibility includes the design, definition and maintenance of the database.
- Database Specifications – These are the requirements for establishing a database application. They include field definitions, field requirements, and reporting requirements for the individual information in the database.
- Foreign Key – A foreign key is a value that represents a reference to a tuple (a row in a table) containing the matching candidate key value (in the relational theory it would be a candidate key, but in real DBMS implementations it is always the primary key). The problem of ensuring that the database does not include any invalid foreign key values is therefore known as the referential integrity problem. The constraint that values of a given foreign key must match values of the corresponding candidate key is known as a referential constraint. The relation (table) that contains the foreign key is referred as the referencing relation and the relations that contain the corresponding candidate key as the referenced relation or target relation.
- Normalization – The elimination of redundant data.
- Repository – The central database that stores and organizes data.
- Transaction log – A manual or automated log of all updates to data files and databases.
- Tuple – A tuple is a row in a database table.
When we speak about Database Management Systems (DBMS), there are three basic types:
- Hierarchical – a database structured in a tree/root or parent/child relationship. Each parent can have many children; however, each child may have only one parent.
- Network – the basic data modeling construct is called a set. A set is formed by an owner record type, a member record type and a name. A member record type can have that role in more than one set, so a multi-owner relationship is allowed. An owner record type can also be a member or owner in another set. Usually, a set defines a 1:N relationship, although one-to-one (1:1) is permitted. A disadvantage of the network model is that such structures can be extremely complex and difficult to comprehend, modify or reconstruct in case of failure.
- Relational – This model is based on the set theory and relational calculations. A relational database allows the definition of data structures, storage/retrieval operations and integrity constraints. In such a database the data and relationships among these data are organized in tables. A table is a collection of rows, also known as tuples, and each tuple in a table contains the same number of columns. Columns, called domains or attributes, correspond to fields. Tuples are equal to records in a conventional file structure.
Relational tables have the following characteristics:
- Values are atomic
- Each row is unique
- Column values are of the same kind
- The sequence of columns is insignificant
- The sequence of rows is insignificant
- Each column has a unique name
Some of the advantages of the relational model over the hierarchical and network model are that it is easier:
- For users to understand and implement a physical database system
- To convert from other database structures
- To implement projection and join operations
- To create new relations for applications
- To implement access control over sensitive data
- To modify the data base
When auditing the controls of a database, the auditor would check to see that the following controls have been implemented and maintained to ensure database integrity and availability:
- Definition standards
- Data backup and recovery procedures
- Access controls
- Only authorized personnel can update the database
- Controls to handle concurrent access problems such as multiple users trying to update the same record at the same time
- Controls to ensure the accuracy, completeness and consistency of data elements and relationships.
- Checkpoints to minimize data loss
- Database re-organizations
- Monitoring database performance
- Capacity planning
- Who can access the database without going through the application?
When we speak of who can access the database, we have already identified one of the major audit concerns and that is what access does the DBA have? As everyone knows the DBA basically has the “keys to the kingdom” and can do (read, write, change, delete) anything. What you have to make sure of is that someone is watching. Someone is monitoring (logging) the actions the DBA takes. And the DBA, doesn’t have the ability to de-activate the log nor do they have access to the log.
It goes without saying that Access Control is the number one issue with database management systems. That being said let’s not forget to audit disaster recovery and restoration, patch management, change management, incident logging and all the other issues an auditor should look for.
There is another issue that auditors need to deal with when auditing DBMS and that is to perform some type of data integrity testing. Data integrity testing is a set of substantive tests (NOTE: Substantive not Compliance testing) that examines accuracy, completeness, consistency and authorization of data presently held in a system. There are two common types of data integrity tests; relational and referential. Relational integrity tests are performed at the data element and record-based levels. It is enforced through data validation routines built into the application or by defining the input condition constraints and data characteristics at the table definition in the database stage. Sometimes it is a combination of both.
Referential integrity test define existence relationships between entities in different tables of a database that needs to be maintained by the DBMS. Referential integrity checks involve ensuring that all references to a primary key from another table actually exist in their original table.
With respect to data integrity in online transaction processing systems there are four online data integrity requirements known collectively as the ACID principle. For those of you that are old enough to remember ACID, congratulations, your brain isn’t completed fried.
The “A” stands for atomicity and from a user’s perspective, a transaction is either completed in its entirety or not at all.
The “C” stands for consistency. Basically, all integrity conditions in the database are maintained with each transaction, taking the database from one consistent state into another consistent state.
The “I” stands for isolation. Each transaction is isolated from other transactions and hence each transaction only accesses data that are part of a consistent database state.
The “D” stands for durability. If a transaction has been reported back to a user as complete, the resulting changes to the database survive subsequent hardware of software failures.
As a parting comment, I would be remiss, if I didn’t mention how the database was populated in a test environment. As many times as I have audited databases, I have found that the production environment was being copied to the test environment to ensure an accurate copy so that changes would not fail once they were moved to production. At least that was the logical in the client’s explanation. What they fail to realize is that the security controls in test are significantly weaker that they are in production and yet there is a mirror unprotected copy sitting there for all to see.
At least as an auditor, you should recommend that the data be sanitized before being used in test.
I hope you’ve enjoyed this brief overview of DBMS and have an appreciation of some things you might check as an auditor.
Until next time, happy reading.