6
votes

I'm using QuickCheck to generate arbitrary functions, and I'd like to generate arbitrary injective functions (i.e. f a == f b if and only if a == b).

I thought I had it figured out:

newtype Injective = Injective (Fun Word Char) deriving Show

instance Arbitrary Injective where
  arbitrary = fmap Injective fun
    where
      fun :: Gen (Fun Word Char)
      fun = do
        a <- arbitrary
        b <- arbitrary
        arbitrary `suchThat` \(Fn f) ->
          (f a /= f b) || (a == b)

But I'm seeing cases where the generated function maps distinct inputs to the same output.

What I want:

  • f such that for all inputs a and b, either f a does not equal f b or a equals b.

What I think I have:

  • f such that there exist inputs a and b where either f a does not equal f b or a equals b.

How can I fix this?

1
A Fun Word Char can't be injective, since there's more Words than there are Chars.Joseph Sible-Reinstate Monica
Maybe you need to exploit MonadFix Gen to define f recursively, so that f n is a random number distinct from f 0 .. f (n-1). I'm not sure, however, if this will actually work (it looks unfeasible right now?). Another possibility is to use arbitrary once to first generate a seed of a RNG, and then define f recursively without messing with the monad Gen. It will have the same (or worse) statistical issues that split has on RNG seeds, but since we are only using it for QuickCheck it could be good enough.chi
Not sure how helpful this but if you have one injection f:A->B then any other injection g:A->B can be written p.f where p:B->B is a bijection. So if you have one injection you could generate others by composing with 'random' permutationsdmuir

1 Answers

2
votes

You've correctly identified the problem: what you're generating is functions with the property ∃ a≠b. f a≠f b (which is readily true for most random functions anyway), whereas what you want is ∀ a≠b. f a≠f b. That is a much more difficult property to ensure, because you need to know about all the other function values for generating each individual one.

I don't think this is possible to ensure for general input types, however for word specifically what you can do is “fake” a function by precomputing all the output values sequentially, making sure that you don't repeat one that has already been done, and then just reading off from that predetermined chart. It requires a bit of laziness fu to actually get this working:

import qualified Data.Set as Set

newtype Injective = Injective ([Char] {- simply a list without duplicates -})
 deriving Show

instance Arbitrary Injective where
  arbitrary = Injective . lazyNub <$> arbitrary

lazyNub :: Ord a => [a] -> [a]
lazyNub = go Set.empty
 where go _ [] = []
       go forbidden (x:xs)
        | x `Set.member` forbidden  = go forbidden xs
        | otherwise                 = x : go (Set.insert x forbidden) xs

This is not very efficient, and may well not be ok for your application, but it's probably the best you can do.

In practice, to actually use Injective as a function, you'll want to wrap the values in a suitable structure that has only O (log n) lookup time. Unfortunately, Data.Map.Lazy is not lazy enough, you may need to hand-bake something like a list of exponentially-growing maps.

There's also the concern that for some insufficiently big result types, it is just not possible to generate injective functions because there aren't enough values available. In fact as Joseph remarked, this is the case here. The lazyNub function will go into an infinite loop in this case. I'd say for a QuickCheck this is probably ok though.