20
votes

I just read a blog post about NHibernate's ability to create a GUID from the system time (Guid.Comb), thus avoiding a good amount of database fragmentation. You could call it the client-side equivalent to the SQL Server Sequential ID.

Is there a way I could use a similar strategy in my Linq-to-Sql project (by generating the Guid in code)?

6
I tried all of these COMB guid samples and they all generate high fragmentation with 10K rows. arul's version - 98% fragmentation.<br/> NHibernate - 53% fragmentation.<br/> rpcrt4.dll - 98% fragmentation<br/> bigint - 6% fragmentation<br/> How well are people expecting COMB goods to perform?Shaun
@Shaun this depends on how fast you insert, if you only have a few inserts per second there should be no fragmentation.Peter

6 Answers

50
votes

C# (safe) code (Compliments of the NHibernate Guid Comb Generator)

Guid GenerateComb()
{
    byte[] destinationArray = Guid.NewGuid().ToByteArray();
    DateTime time = new DateTime(0x76c, 1, 1);
    DateTime now = DateTime.Now;
    TimeSpan span = new TimeSpan(now.Ticks - time.Ticks);
    TimeSpan timeOfDay = now.TimeOfDay;
    byte[] bytes = BitConverter.GetBytes(span.Days);
    byte[] array = BitConverter.GetBytes((long) (timeOfDay.TotalMilliseconds / 3.333333));
    Array.Reverse(bytes);
    Array.Reverse(array);
    Array.Copy(bytes, bytes.Length - 2, destinationArray, destinationArray.Length - 6, 2);
    Array.Copy(array, array.Length - 4, destinationArray, destinationArray.Length - 4, 4);
    return new Guid(destinationArray);
}

A link to the source on github: https://github.com/nhibernate/nhibernate-core/blob/master/src/NHibernate/Id/GuidCombGenerator.cs

9
votes

COMBs are generated the following way:

DECLARE @aGuid UNIQUEIDENTIFIER

SET @aGuid = CAST(CAST(NEWID() AS BINARY(10)) + CAST(GETDATE() AS BINARY(6)) AS UNIQUEIDENTIFIER)

Which transcribed into C# would look like this:

    public static unsafe Guid CombGuid()
    {
        Guid guid = Guid.NewGuid();
        byte[] bytes = guid.ToByteArray();
        long ticks = DateTime.Now.Ticks;
        fixed( byte* pByte = bytes )
        {
            int*    pFirst  = (int *)(pByte + 10);
            short* pNext    = (short*)(pByte + 14);
            *pFirst = (int)(ticks & 0xFFFFFF00);
            *pNext  = (short)ticks;
        }

        return new Guid( bytes );
    }
3
votes

Well, you could generate the Guid by hand. However, one of the advantages of a Guid is that it isn't guessable - i.e. given record 0000-...-0005, there is usually little point (from an attacker) checking for record 0000-....-0004 etc.

Also - re fragmentation? As long as you have a non-clustered index on this data, I'm not sure that this is an issue. You wouldn't normally put a clustered index on a Guid, so the table will be a heap (unless you have a separate clustered index, such as an IDENTITY int). In which case you will be adding to the end, and inserting the new Guid into the non-clustered index. No real pain.

(edit) One problem of using the time directly is that you introduce a lot more risk of collisions; you would need to worry about tight-loop Guid creation (i.e. avoiding repetition when creating a few in sequence), which means synchronization, etc - and it gets even more troublesome if multiple machines are working intensively in parallel - chances are you'll get duplicates.

3
votes

You can always call UuidCreateSequential; this is the 'old' guid generator (pre-2000-ish when MSFT changed it to the more random style guids we are used to today). They renamed the old UuidCreate to UuidCreateSequential, and put their new guid generator in a new implementation of UuidCreate. UuidCreateSequential is also what SQL Server uses in NewSequentialID(), and it is as unique as normal guids but with the benefit that they are sequential if you create a pile of them in a row in the same process.

using System;
using System.Runtime.InteropServices;

namespace System
{
    public static class GuidEx
    {
        [DllImport("rpcrt4.dll", SetLastError = true)]
        private static extern int UuidCreateSequential(out Guid guid);
        private const int RPC_S_OK = 0;

        /// <summary>
        /// Generate a new sequential GUID. If UuidCreateSequential fails, it will fall back on standard random guids.
        /// </summary>
        /// <returns>A GUID</returns>
        public static Guid NewSeqGuid()
        {
            Guid sequentialGuid;
            int hResult = UuidCreateSequential(out sequentialGuid);
            if (hResult == RPC_S_OK)
            {
                return sequentialGuid;
            }
            else
            {
                //couldn't create sequential guid, fall back on random guid
                return Guid.NewGuid();
            }
        }
    }
}
2
votes

@arul, @Doug

Why did you put the time part at the end of the GUID?

I thought that the leading bytes are more significant for ordering, and ordering is why the time part was introduced in the first place to prevent index fragmentation.

Ok, I found the answer, and this answer from Bernhard Kircher and the site Comparing GUID and uniqueidentifier Values (ADO.NET) he references to.

GUIDs generated this way would therefore not work the same way on other databases than MS SQL-Server but this is not related to LINQ-to-SQL.

Sorry for the deformed URLs but I have not enough reputation to post more links.

0
votes

We used a similar method to what Doug has posted above in Entity Framework model first, so you must be able to do it using Linq to SQL too.

While doing this we needed a comb guid generator for testing, and ended up building this little tool to generate comb guids online

http://www.webdesigncompany.co.uk/comb-guid/

Hopefully it will help you too.