I'm using Rails 3.1 with PostgreSQL 8.4. Let's assume I want/need to use GUID primary keys. One potential drawback is index fragmentation. In MS SQL, a recommended solution for that is to use special sequential GUIDs. One approach to sequential GUIDs is the COMBination GUID that substitutes a 6-byte timestamp for the MAC address portion at the end of the GUID. This has some mainstream adoption: COMBs are available natively in NHibernate (NHibernate/Id/GuidCombGenerator.cs).
I think I've figured out how to create COMB GUIDs in Rails (with the help of the UUIDTools 2.1.2 gem), but it leaves some unanswered questions:
- Does PostgreSQL suffer from index fragmentation when the PRIMARY KEY is type UUID?
- Is fragmentation avoided if the low-order 6 bytes of the GUID are sequential?
- Is the COMB GUID as implemented below an acceptable, reliable way to create sequential GUIDs in Rails?
Thanks for your thoughts.
create_contacts.rb
migration
class CreateContacts < ActiveRecord::Migration
def up
create_table :contacts, :id => false do |t|
t.column :id, :uuid, :null => false # manually create :id with underlying DB type UUID
t.string :first_name
t.string :last_name
t.string :email
t.timestamps
end
execute "ALTER TABLE contacts ADD PRIMARY KEY (id);"
end
# Can't use reversible migration because it will try to run 'execute' again
def down
drop_table :contacts # also drops primary key
end
end
/app/models/contact.rb
class Contact < ActiveRecord::Base
require 'uuid_helper' #rails 3 does not autoload from lib/*
include UUIDHelper
set_primary_key :id
end
/lib/uuid_tools.rb
require 'uuidtools'
module UUIDHelper
def self.included(base)
base.class_eval do
include InstanceMethods
attr_readonly :id # writable only on a new record
before_create :set_uuid
end
end
module InstanceMethods
private
def set_uuid
# MS SQL syntax: CAST(CAST(NEWID() AS BINARY(10)) + CAST(GETDATE() AS BINARY(6)) AS UNIQUEIDENTIFIER)
# Get current Time object
utc_timestamp = Time.now.utc
# Convert to integer with milliseconds: (Seconds since Epoch * 1000) + (6-digit microsecond fraction / 1000)
utc_timestamp_with_ms_int = (utc_timestamp.tv_sec * 1000) + (utc_timestamp.tv_usec / 1000)
# Format as hex, minimum of 12 digits, with leading zero. Note that 12 hex digits handles to year 10889 (*).
utc_timestamp_with_ms_hexstring = "%012x" % utc_timestamp_with_ms_int
# If we supply UUIDTOOLS with a MAC address, it will use that rather than retrieving from system.
# Use a regular expression to split into array, then insert ":" characters so it "looks" like a MAC address.
UUIDTools::UUID.mac_address = (utc_timestamp_with_ms_hexstring.scan /.{2}/).join(":")
# Generate Version 1 UUID (see RFC 4122).
comb_guid = UUIDTools::UUID.timestamp_create().to_s
# Assign generted COMBination GUID to .id
self.id = comb_guid
# (*) A note on maximum time handled by 6-byte timestamp that includes milliseconds:
# If utc_timestamp_with_ms_hexstring = "FFFFFFFFFFFF" (12 F's), then
# Time.at(Float(utc_timestamp_with_ms_hexstring.hex)/1000).utc.iso8601(10) = "10889-08-02T05:31:50.6550292968Z".
end
end
end
config.autoload_paths += %W(#{config.root}/lib)
. – qerub