By default, all Rails models have an integer id that serves as the primary key. In Postgres, these ids are handed out by a sequence relation named things_id_seq, where things is the model's table name. Generally this sequence looks like this:

max_id_development=# \d things_id_seq    
        Sequence "public.things_id_seq"
    Column     |  Type   |        Value        
---------------+---------+---------------------
 sequence_name | name    | things_id_seq
 last_value    | bigint  | 1
 start_value   | bigint  | 1
 increment_by  | bigint  | 1
 max_value     | bigint  | 9223372036854775807
 min_value     | bigint  | 1
 cache_value   | bigint  | 1
 log_cnt       | bigint  | 0
 is_cycled     | boolean | f
 is_called     | boolean | f
Owned by: public.things.id

In this relation, last_value keeps track of the last id handed out when an object was created. Because this value is only ever incremented (without special intervention), creating new objects will not fill in gaps in IDs left by deleted entries. The min_value and max_value columns specify the range of IDs the sequence can produce. By default, everything here is a bigint (Postgres' 8-byte integer), giving it a max value of (2^63)-1. It's a very big number. For more information on how sequences work, I recommend this article and its associated links.
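As a quick illustration of the gap behavior, here is roughly what happens in psql if you delete a row and insert a new one (output abbreviated; the columns match the things table shown below):

max_id_development=# INSERT INTO things (created_at, updated_at) VALUES (now(), now()) RETURNING id;
 id
----
  1

max_id_development=# DELETE FROM things WHERE id = 1;
DELETE 1
max_id_development=# INSERT INTO things (created_at, updated_at) VALUES (now(), now()) RETURNING id;
 id
----
  2

The deleted id 1 is never reused - nextval only moves forward.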

So, given that this is a fixed sequence of IDs, what happens if we have more than that many entries? That question breaks down into two others. First, what happens if the next object's ID is greater than the maximum value allowed by its datatype? Second, what happens if that ID is greater than the sequence's maximum value?

Datatype overflow

It may seem that the two questions are the same - the max_value listed above is the maximum allowed by the datatype. However, if you take a look at the actual model's table, you will notice something wrong:

max_id_development=# \d things
                                     Table "public.things"
   Column   |            Type             |                      Modifiers                      
------------+-----------------------------+-----------------------------------------------------
 id         | integer                     | not null default nextval('things_id_seq'::regclass)
 active     | boolean                     | 
 created_at | timestamp without time zone | 
 updated_at | timestamp without time zone | 
Indexes:
    "things_pkey" PRIMARY KEY, btree (id)    

The id column is an integer, so even though the sequence allows a maximum ID of (2^63)-1, this 4-byte integer will overflow long before we reach that value. The max value for the integer type is 2,147,483,647. Admittedly, at just over 2 billion, that is really not that high of an ID. Now, to reach that value I could have created 2.1 billion objects, but that would take a long time and a fair amount of disk space. It's much easier to edit the sequence. Running SELECT setval('things_id_seq', 2147483644); sets the last_value of the ID sequence close to the highest value the datatype allows. The next two Thing.create calls worked fine. Then I did another one.

irb(main):021:0> Thing.create
   (0.3ms)  BEGIN
  SQL (3.4ms)  INSERT INTO "things" ("created_at", "updated_at") VALUES ($1, $2) RETURNING "id" 
  [["created_at", Sat, 18 Jan 2014 17:54:19 UTC +00:00], ["updated_at", Sat, 18 Jan 2014 17:54:19 UTC +00:00]]
PG::NumericValueOutOfRange: ERROR:  integer out of range
...

You can see the whole stacktrace here. Cool - this isn't unexpected, though if these errors started popping up it might be tricky to figure out which integer column was overflowing, since the error message doesn't name the problem column.
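One way to narrow it down is to ask Postgres which integer columns are backed by sequences, then check how far along each sequence is. A rough sketch using the standard information_schema views:

SELECT table_name, column_name
FROM information_schema.columns
WHERE data_type = 'integer'
  AND column_default LIKE 'nextval%';

For each table that turns up, SELECT last_value FROM its sequence and compare against 2,147,483,647.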

If you find yourself with this problem, one solution is to alter the id column to be a bigint. Luckily, this is possible in Postgres without losing data. Just run ALTER TABLE things ALTER COLUMN id TYPE bigint; and the table will be able to use the full range of the sequence. In my quick tests, this doesn't seem to confuse ActiveRecord, and everything continues to work. You can also specify the bigint datatype when you create new models if you anticipate this problem. Lastly, it is important to note that this ALTER command will block all reads and writes to the entire table until it finishes, which may take a while with over 2 billion rows.
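For the Rails side, here is a minimal migration sketch (the class name is hypothetical; on the PostgreSQL adapter, an integer with limit: 8 maps to bigint):

class ChangeThingsIdToBigint < ActiveRecord::Migration
  def up
    # limit: 8 produces a bigint column on Postgres
    change_column :things, :id, :integer, limit: 8
  end

  def down
    change_column :things, :id, :integer, limit: 4
  end
end

Keep in mind this migration issues the same blocking ALTER under the hood.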

Max ID Overflow

The other possible situation is the one in which you have exceeded the maximum ID in the sequence itself. To test this, we need to set the sequence's maximum value to something lower than the datatype's max. Doing that is as easy as running ALTER SEQUENCE things_id_seq MAXVALUE 10;. This changes the sequence to the following (I destroyed and recreated the database before doing this, so everything was reset):

max_id_development=# \d things_id_seq
     Sequence "public.things_id_seq"
    Column     |  Type   |     Value     
---------------+---------+---------------
 sequence_name | name    | things_id_seq
 last_value    | bigint  | 1
 start_value   | bigint  | 1
 increment_by  | bigint  | 1
 max_value     | bigint  | 10
 min_value     | bigint  | 1
 cache_value   | bigint  | 1
 log_cnt       | bigint  | 0
 is_cycled     | boolean | f
 is_called     | boolean | f
Owned by: public.things.id

Great - now to make some objects (I could also have used the setval command from above): 10.times { Thing.create }. Then, the next time we try to make an object:

irb(main):004:0> Thing.create
   (0.3ms)  BEGIN
  SQL (2.5ms)  INSERT INTO "things" ("created_at", "updated_at") VALUES ($1, $2) RETURNING "id"
  [["created_at", Sat, 18 Jan 2014 18:04:36 UTC +00:00], ["updated_at", Sat, 18 Jan 2014 18:04:36 UTC +00:00]]
PG::ObjectNotInPrerequisiteState: ERROR:  nextval: reached maximum value of sequence "things_id_seq" (10)
...

Again, the whole stacktrace is here. As expected, another error is raised. In my opinion, this is a much more descriptive error than the previous case: it says we reached the maximum value, names the sequence, and gives that value. If you see this error, your options are either to bump up the maximum value of the sequence (assuming it is not also the maximum allowed by the datatype) or to use a different datatype for that ID.
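If the datatype still has headroom, bumping the sequence cap is a one-line change. For example, to undo the artificial cap from the experiment above:

ALTER SEQUENCE things_id_seq NO MAXVALUE;

NO MAXVALUE resets max_value back to the default of (2^63)-1; you could also set an explicit MAXVALUE instead.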

Rails 4 added support for UUID primary keys, which would mitigate (or at least delay) the issues presented here. This post gives a good explanation of how to set up models with UUIDs.
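For reference, a minimal sketch of such a migration, assuming Rails 4 and the uuid-ossp extension (the widgets table is hypothetical):

class CreateWidgets < ActiveRecord::Migration
  def change
    # uuid-ossp provides uuid_generate_v4(), which Rails uses
    # as the column default for uuid primary keys
    enable_extension 'uuid-ossp'
    create_table :widgets, id: :uuid do |t|
      t.boolean :active
      t.timestamps
    end
  end
end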

These two situations are certainly edge cases, but they become less unlikely as Rails applications grow (especially hitting the 2.1 billion default integer limit). The last important takeaway I want to stress is that it is best to anticipate these issues before they happen - debugging them after the fact will be nightmarish and stressful.

Updated 1/21/14: Thank you to Cody Russell for pointing out that I incorrectly said "bit" instead of "byte" for the datatype sizes.
