Deadly Sins of Database Design
By Anton Schaffer – Technical Consultant. Getting the data right is key – without it, everything else falls apart. However, in the chaos and buzz of focusing on applications, we often forget the importance of solid database design. Inexperience, skill gaps, tight schedules and scarce resources can all ruin your database game.
Let’s dive into some common slip-ups in database design and how to avoid them.
1) What Documentation?
Poor or nonexistent documentation for production databases is something we see all too often; what documentation does exist tends to be incomplete or inaccurate. This leaves developers, DBAs, architects, and business analysts guessing at what the data actually means. One of the many possible solutions is to centralise data models and automate reporting from them. The process doesn’t end there, though: implementing validation and quality metrics improves model quality over time, supports data management and expands metadata capture.
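One low-effort habit that helps is keeping descriptions with the schema itself, so they are never out of reach. As a rough sketch, assuming a PostgreSQL-style database that supports COMMENT ON (the table and column names here are invented for illustration):

    -- Attach descriptions directly to schema objects so they travel with the database
    COMMENT ON TABLE customer IS 'One row per billable customer; source of truth for invoicing.';
    COMMENT ON COLUMN customer.status IS 'Lifecycle state: prospect, active, suspended or closed.';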
2) Normalise Till it Hurts, Denormalise Till it Works
Occasionally, denormalising a database structure is necessary for optimal performance, but doing so at the expense of flexibility can lead to hardships down the road. Contrary to the “traditional” developer belief, a single, all-encompassing table is rarely the best approach. Another issue that always tends to rear its ugly head is duplicating values within a table, which reduces flexibility and complicates data updates. Grasping the fundamentals of normalisation not only enhances design flexibility but also minimises redundant data.
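To make the duplication problem concrete, here is a minimal sketch with invented table names: the first design repeats customer details on every order, while the normalised version keeps them in one place.

    -- Denormalised: customer details repeated on every order, so a change of city
    -- means updating many rows and hoping none are missed
    CREATE TABLE order_flat (
        order_id      INT PRIMARY KEY,
        customer_name VARCHAR(100),
        customer_city VARCHAR(100),
        order_total   DECIMAL(10,2)
    );

    -- Normalised: customer details live in one place and orders simply reference them
    CREATE TABLE customer (
        customer_id INT PRIMARY KEY,
        name        VARCHAR(100),
        city        VARCHAR(100)
    );

    CREATE TABLE customer_order (
        order_id    INT PRIMARY KEY,
        customer_id INT REFERENCES customer (customer_id),
        order_total DECIMAL(10,2)
    );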
3) Lack of Data Modeling!
Continuous modeling is of the utmost importance for adapting to database changes while maintaining flexibility and consistency. Changes made during the production phase should always be reflected back in the model.
4) Reference Data Scattered Around?
Reference data should be defined in one central location, preferably within the data model, to avoid redundancy and ensure consistency across the application.
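As a small illustration (the table and status codes are hypothetical), a single lookup table keeps the allowed values and their meaning in one place, and every table that needs them references it:

    -- One central lookup table instead of free-text status strings scattered around
    CREATE TABLE order_status (
        status_code VARCHAR(10) PRIMARY KEY,
        description VARCHAR(100) NOT NULL
    );

    INSERT INTO order_status (status_code, description) VALUES
        ('NEW',       'Order received'),
        ('SHIPPED',   'Order dispatched'),
        ('CANCELLED', 'Order cancelled');

    -- Any table that stores a status references the lookup, so no invalid values creep in
    CREATE TABLE sales_order (
        order_id    INT PRIMARY KEY,
        status_code VARCHAR(10) NOT NULL REFERENCES order_status (status_code)
    );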
5) Down with the Left (Join) Policies!
In older database systems, there was a belief that implementing foreign keys and check constraints could adversely impact performance. Consequently, it was deemed more practical to perform referential integrity checks and validations through the application. However, if validation can be feasibly executed within the database, it is advisable to do so. This approach significantly streamlines error handling, leading to a notable enhancement in data quality.
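As a sketch (again with invented names, reusing the hypothetical customer table from the earlier example), the database can carry both the referential integrity and the simple validation rules itself:

    -- The database enforces the rules once, instead of every application re-implementing them
    CREATE TABLE invoice (
        invoice_id  INT PRIMARY KEY,
        customer_id INT NOT NULL REFERENCES customer (customer_id),  -- referential integrity
        amount      DECIMAL(10,2) NOT NULL CHECK (amount >= 0),      -- basic validation
        status      VARCHAR(10) NOT NULL
                    CHECK (status IN ('DRAFT', 'SENT', 'PAID'))      -- allowed values only
    );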
6) Standards are a Joke!
Integrating domains and adhering to naming standards are likely among the most critical aspects to incorporate into modeling practices. Domains enable the creation of reusable attributes, preventing the duplication of the same attributes with different properties across various locations. Meanwhile, naming standards ensure a consistent and clear identification of these attributes throughout the modeling process.
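For databases that support SQL domains (PostgreSQL does, for instance), a rough sketch of the idea looks like this; the domain name and its check are purely illustrative:

    -- Define the attribute's rules once as a domain...
    CREATE DOMAIN email_address AS VARCHAR(254)
        CHECK (VALUE LIKE '%_@_%');   -- deliberately loose sanity check

    -- ...and reuse it wherever the attribute appears, so its properties never drift apart
    CREATE TABLE customer_contact (
        contact_id INT PRIMARY KEY,
        email      email_address NOT NULL
    );

    CREATE TABLE supplier_contact (
        contact_id INT PRIMARY KEY,
        email      email_address NOT NULL
    );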
7) No Foreign Keys without Primary Keys
When selecting a primary key, the fundamental principle to keep in mind is SUM – Static, Unique, Minimal. It’s not necessary to dive into the debate over natural versus surrogate keys. However, it is important to understand that while surrogate keys may guarantee unique identification for a record, they don’t always ensure unique identification for the data itself. That said, there is a time and place for everything, and if a surrogate is used as the primary key, creating an alternate key on the natural key is always an option.
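A small sketch of that arrangement, with a hypothetical product table and PostgreSQL-style syntax: the surrogate key identifies the record, and a unique alternate key on the natural identifier keeps the data itself unique.

    CREATE TABLE product (
        product_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,  -- surrogate: static, unique, minimal
        sku        VARCHAR(20)  NOT NULL,                         -- natural business identifier
        name       VARCHAR(100) NOT NULL,
        CONSTRAINT uq_product_sku UNIQUE (sku)                    -- alternate key protects the natural key
    );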
8) Composite Keys – The Red-headed Stepchild of Keys
A composite primary key comprises two or more columns, serving the singular purpose of uniquely identifying a row within the system. Consequently, this primary key is employed in other tables as a foreign key. The use of a composite primary key necessitates carrying each of its columns in these related tables to establish a connection with the original table. This process is not as straightforward or efficient as using a single column, but it is sometimes necessary.
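A quick sketch with invented order tables shows the knock-on effect: every referencing table has to carry the full set of key columns.

    -- Composite primary key: both columns are needed to identify a row
    CREATE TABLE order_line (
        order_id    INT NOT NULL,
        line_number INT NOT NULL,
        quantity    INT NOT NULL,
        PRIMARY KEY (order_id, line_number)
    );

    -- ...so any table that references it must repeat both columns in its foreign key
    CREATE TABLE order_line_note (
        note_id     INT PRIMARY KEY,
        order_id    INT NOT NULL,
        line_number INT NOT NULL,
        note_text   VARCHAR(500),
        FOREIGN KEY (order_id, line_number) REFERENCES order_line (order_id, line_number)
    );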
9) My Queries are Running Slow!
Database indexes are tools designed to enhance the efficiency of specific queries. However, it’s important to recognise that indexes are not a one-size-fits-all fix for performance issues; they cannot address every concern. Three mistakes are commonly made in relation to indexes (a short sketch follows the list):
- Lack of indexes: Without any indexes, queries are likely to slow down as the table accumulates more data.
- Excessive indexes: Too many indexes can pose challenges for databases. While indexes assist in reading data from a table, they can impede the Data Manipulation Language (DML) processes.
- Indexes on every column: Although the idea of adding indexes to every field in a table might be enticing, doing so can lead to a slowdown in database performance – this is a bad idea!
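By way of a sketch (the table and column are hypothetical, following the earlier examples), the cure is usually a small number of targeted indexes that support the queries you actually run:

    -- One targeted index to support a frequent lookup, rather than an index on every column
    CREATE INDEX ix_customer_order_customer_id ON customer_order (customer_id);

    -- Remember: every extra index has to be maintained on INSERT, UPDATE and DELETE,
    -- which is exactly how over-indexing slows DML down.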
10) I saw a great movie about databases today! I bet the SQL will be a letdown.
Choosing incorrect data types, often due to an incomplete understanding of business processes, can lead to complications in both storing and retrieving data. When a data type does not align with the actual business requirement, it hinders the extraction of information when it is needed. This underscores the importance of knowing the business flows thoroughly and ensuring precise data representation for optimal utilisation.
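A small before-and-after sketch (hypothetical payment tables) shows the kind of trouble this causes:

    -- Dates and amounts stored as text: sorting, arithmetic and validation all become harder
    CREATE TABLE payment_bad (
        payment_id INT PRIMARY KEY,
        paid_on    VARCHAR(20),   -- '03/04/2024': is that March or April?
        amount     VARCHAR(20)    -- '1,299.00' must be parsed before it can be summed
    );

    -- Types that match the business meaning keep the data usable
    CREATE TABLE payment (
        payment_id INT PRIMARY KEY,
        paid_on    DATE NOT NULL,
        amount     DECIMAL(10,2) NOT NULL
    );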
So, by eliminating these common hiccups, you’ll be on the road to better data management and ongoing system performance. Cheers to a streamlined database!