The foreign key constraint is an important aspect of database design. This article explains why.
Foreign key constraint advantages
The purpose of a foreign key constraint is to enforce referential integrity, but there are also performance benefits to be had by including foreign keys in your database design.
First, let's look at an example of how they are used.
So here are my two tables.
CREATE TABLE Accounts
(
    ID INT PRIMARY KEY IDENTITY(1,1),
    Name VARCHAR(100)
)
GO

CREATE TABLE Orders
(
    ID INT PRIMARY KEY IDENTITY(1,1),
    OrderDate DATETIME DEFAULT(GETDATE()),
    AccountID INT NOT NULL CONSTRAINT FKAccountID REFERENCES Accounts(ID)
)
GO
Now we’ll insert some data.
INSERT INTO Accounts(Name)
VALUES ('Test Company 1'), ('Test Company 2');

INSERT INTO Orders(AccountID)
VALUES (1),(2),(1),(2),(1),(2),(1),(2),(1),(2),(1),(2),
       (1),(2),(1),(2),(1),(2),(1),(2),(1),(2),(1),(2);
Let’s have a quick look at the data.
SELECT * FROM Accounts;
SELECT TOP (5) * FROM Orders;
ID          Name
----------- ---------------
1           Test Company 1
2           Test Company 2

(2 row(s) affected)

ID          OrderDate               AccountID
----------- ----------------------- -----------
1           2011-12-04 11:03:08.533 1
2           2011-12-04 11:03:08.533 2
3           2011-12-04 11:03:08.533 1
4           2011-12-04 11:03:08.533 2
5           2011-12-04 11:03:08.533 1

(5 row(s) affected)
So we have inserted rows into table “Orders” which relate to “Accounts” by the AccountID and ID columns respectively. No problems with that. What happens if we try and insert a new row into “Orders” for an account which does not exist in table “Accounts”?
INSERT INTO Orders(AccountID) VALUES(3);
We get an error.
Msg 547, Level 16, State 0, Line 1
The INSERT statement conflicted with the FOREIGN KEY constraint "FKAccountID". The conflict occurred in database "DBADiaries", table "dbo.Accounts", column 'ID'.
The statement has been terminated.
So the foreign key constraint is doing its job and only allowing recognised account ids to be added to the “Orders” table.
Now let's say that, for whatever reason, someone attempts to remove a row from table “Accounts” which has related records in table “Orders”:
DELETE Accounts WHERE ID = 2;
We get an error.
Msg 547, Level 16, State 0, Line 1
The DELETE statement conflicted with the REFERENCE constraint "FKAccountID". The conflict occurred in database "DBADiaries", table "dbo.Orders", column 'AccountID'.
The statement has been terminated.
The constraint was created without ON DELETE CASCADE, so cascading deletes are turned off in this instance. As well as stopping bad data getting into the table, the foreign key constraint is preventing referenced data from being deleted, which in this case is exactly what I need it to do.
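Whether deletes cascade is recorded against the constraint itself, so it can be checked rather than assumed. A minimal sketch, assuming the FKAccountID constraint created above still exists:

```sql
-- Show what happens to Orders rows when the referenced Accounts row
-- is deleted or updated. A constraint created without an ON DELETE
-- or ON UPDATE clause reports NO_ACTION for both.
SELECT name,
       delete_referential_action_desc,
       update_referential_action_desc
FROM sys.foreign_keys
WHERE name = 'FKAccountID';
```

NO_ACTION is the behaviour seen above: the DELETE is rejected while related rows exist.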
Foreign key constraint performance benefits
How can a foreign key constraint benefit performance? Well let’s have a look at this simple example using the tables previously created.
Activate “Include Actual Execution Plan” in Management Studio using either Ctrl + M or the button on the toolbar. Then run a simple query that checks for records in table “Orders” which relate to a row in table “Accounts”, and check the execution plan.
SELECT *
FROM Orders O
WHERE EXISTS (SELECT * FROM Accounts A WHERE A.ID = O.AccountID);
Execution plan output: the plan reads only table “Orders”; table “Accounts” does not appear in it.
Now we will remove the foreign key constraint:
ALTER TABLE Orders DROP CONSTRAINT FKAccountID;
Rerun the preceding SQL statement and check the execution plan again. It has changed: the plan now reads table “Accounts” as well.
So why is it different? The optimizer now has to execute the EXISTS part of the query, because nothing guarantees that every AccountID in “Orders” has a matching row in “Accounts”. With the foreign key in place the optimizer could trust it, and therefore did not have to check table “Accounts” when returning all rows from “Orders”. This is because a matching row in “Accounts” must exist for a row to be stored in “Orders”, and AccountID is declared NOT NULL.
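Another way to see the same difference, without reading the graphical plan, is to compare which tables each run touches. A minimal sketch, assuming the Accounts and Orders tables above; read counts depend on the environment, so none are shown here:

```sql
-- Report per-table I/O for each statement in the Messages tab.
SET STATISTICS IO ON;

SELECT *
FROM Orders O
WHERE EXISTS (SELECT * FROM Accounts A WHERE A.ID = O.AccountID);

SET STATISTICS IO OFF;
```

With a trusted constraint in place, I would expect the Messages tab to list only table 'Orders'. Once the constraint has been dropped, a line for table 'Accounts' should appear as well.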
Could a foreign key constraint become untrusted?
The answer is yes.
For example, you might decide to disable a foreign key while loading large amounts of data, because it is easier to batch insert rows when the constraint is not being checked. Disabling the constraint marks it as untrusted, and it stays untrusted if it is later re-enabled without WITH CHECK, because SQL Server has not validated the rows that were loaded in the meantime.
With an untrusted foreign key, the second execution plan is used for the query, and it does more work than the first. On tables with a large number of rows, this could make a massive difference to performance.
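The constraint was dropped in the previous section, so it has to exist again before it can be disabled. A sketch of how it could be re-created, assuming the same tables and constraint name as before:

```sql
-- Re-create the foreign key that was dropped earlier.
-- WITH CHECK (the default for a new constraint) validates the existing
-- rows in Orders, so the constraint starts out trusted.
ALTER TABLE Orders WITH CHECK
    ADD CONSTRAINT FKAccountID FOREIGN KEY (AccountID) REFERENCES Accounts(ID);
```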
For the purposes of this explanation, I have re-added the FKAccountID foreign key constraint and then disabled it with this statement:
ALTER TABLE Orders NOCHECK CONSTRAINT FKAccountID;
So how do you tell whether your foreign key is trusted? Run this query:
SELECT Name, Is_Not_Trusted
FROM sys.foreign_keys
WHERE Name = 'FKAccountID';
The query outputs this information:
Name                 Is_Not_Trusted
-------------------- --------------
FKAccountID          1

(1 row(s) affected)
To correct this, run the following SQL, which re-enables the constraint and validates the existing rows:
ALTER TABLE Orders WITH CHECK CHECK CONSTRAINT FKAccountID;
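WITH CHECK makes SQL Server validate every existing row, so the statement fails with error 547 if any order points at an account that does not exist. A sketch of a pre-check that could be run first, assuming the tables above:

```sql
-- List orders whose AccountID has no matching row in Accounts.
-- These rows must be fixed or removed before the constraint
-- can be re-checked successfully.
SELECT O.ID, O.AccountID
FROM Orders AS O
WHERE NOT EXISTS (SELECT 1 FROM Accounts AS A WHERE A.ID = O.AccountID);
```

If this returns no rows, the re-check should succeed, and rerunning the sys.foreign_keys query should then show Is_Not_Trusted as 0.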
You could also look for all untrusted foreign keys in your database as part of a performance tuning exercise.
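A sketch of such a check, using the same sys.foreign_keys catalog view as above; the column aliases are my own choice:

```sql
-- List every untrusted foreign key in the current database,
-- along with the table it belongs to and whether it is also disabled.
SELECT OBJECT_SCHEMA_NAME(fk.parent_object_id) AS SchemaName,
       OBJECT_NAME(fk.parent_object_id)        AS TableName,
       fk.name                                 AS ConstraintName,
       fk.is_disabled
FROM sys.foreign_keys AS fk
WHERE fk.is_not_trusted = 1
ORDER BY SchemaName, TableName, ConstraintName;
```

Each row returned is a candidate for the WITH CHECK CHECK CONSTRAINT fix shown earlier.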
So foreign key constraints have clear advantages and should be part of your design, both to keep your data consistent and to help the database perform optimally.