Suppose we want to create a join table in a Ruby on Rails application between two models, Foo
and Bar
. There’s plenty of material available online describing join tables in detail (the Wikipedia page on associative entities is pretty good), but to begin, let’s go over a brief recap and an example:
Join tables implement many-to-many relationships between entities in a relational database. An entry in a join table will hold the primary keys of the entities it associates. For example, the linked Wikipedia article above (at time of writing) shows that we can use a Registration
entity to implement a join table, and this join table holds a relationship between a Student
and a Course
. This Registration
entity contains the primary keys of a Student
and a Course
.
Having seen an example, let’s a join table between a Foo
and a Bar
. A join table between Foo
and Bar
will hold the primary key of a Foo
and the primary key of a Bar
(ie. the IDs of these two models). Searching through the Ruby on Rails documentation, we may find that create_join_table
is a method that is available inside of a database migration. To use this method, we’d write a migration like so:
If we run the migration, we do get a working join table in our database. However, if we look at the underlying SQL this migration generates, it might not quite match what we expect to get. In this blog post, I’m using SQLite (the default option provided by Rails), but we would see similar results when using a different database like PostgreSQL. The generated SQL is as follows:
There are two interesting things to note about the above SQL.
Firstly, our join table doesn’t actually use a foreign key constraint. This means that our SQL database will let us insert any integer into the values of foo_id
and bar_id
, even if such an ID doesn’t exist in our tables for Foo
and Bar
.
Secondly, the generated SQL has NOT NULL
constraints, meaning our join table entries must always have an integer value set for foo_id
and bar_id
. This makes sense in some, but not all, use cases.
In the Student
and Course
example from the beginning of the article, NOT NULL
fits the use case – it doesn’t make sense to have a Registration
with a Course
but no Student
, or a Registration
with a Student
but no Course
.
However, let’s look at a slightly different use case – suppose we’re trying to keep track of floating software licensing, where we limit the number of users who have permission to use some software at a given time. In this case, we may have a join table named something like LicenseAssignment
between a User
and a SoftwareApplication
. In this case, however, it makes sense that User
could be NULL
– it would just mean that not all of our licenses are currently being used. In this case, we wouldn’t want a NOT NULL
constraint.
As of writing, most hits on the first page of Google for “rails join table tutorial” use database migrations that implement pretty different results from what we see above. I’ve excluded database indexes for brevity, but for the most part, they look like so:
This migration generates the following SQL:
If we use the above SQL, we can rely on our database to enforce some level of data integrity via its foreign key constraints, and we support the use case where a Foo
ID or a Bar
ID can be NULL
.
So, what’s the lesson behind all of this? Double-check your Rails migrations’ auto-generated SQL, and make sure it fits your assumptions and use case. Superficially, the two Rails migrations outlined in the article look similar, but their underlying data models are actually very different.
Start your journey towards writing better software, and watch this space for new content.