27
votes

I work in a safety-critical industry, and our software projects generally have safety requirements imposed; things that we have to demonstrate that the software does to a high degree of certainty. Often these are negatives, such as " shall not corrupt more frequently than 1 in ". (I should add that these requirements come from statistical system safety requirements).

One source of corruption is clearly coding errors, and I would like to use the Haskell type system to exclude at least some classes of these errors. Something like this:

First, here is our critical data item that must not be corrupted.

newtype Critical = Critical String

Now I want to store this item in some other structures.

data Foo = Foo Integer Critical
data Bar = Bar String Critical

Now I want to write a conversion function from Foo to Bar which is guaranteed not to mess with the Critical data.

goodConvert, badConvert :: Foo -> Bar

goodConvert (Foo n c) = Bar (show n) c

badConvert (Foo n (Critical s)) = Bar (show n) (Critical $ "Bzzt - " ++ s)

I want "goodConvert" to type check, but "badConvert" to fail type checking.

Obviously I can carefully not import the Critical constructor into the module that does conversion. But it would be much better if I could express this property in the type, because then I can compose up functions that are guaranteed to preserve this property.

I've tried adding phantom types and "forall" in various places, but that doesn't help.

One thing that would work would be to not export the Critical constructor, and then have

mkCritical :: String -> IO Critical

Since the only place that these Critical data items get created is in the input functions, this makes some sense. But I'd prefer a more elegant and general solution.

Edit

In the comments FUZxxl suggested a look at Safe Haskell. This looks like the best solution. Rather than adding a "no corruption" modifier at the type level as I originally wanted, it looks like you can do it at the module level, like this:

1: Create a module "Critical" that exports all the features of the Critical data type, including its constructor. Mark this module as "unsafe" by putting "{-# LANGUAGE Unsafe #-}" in the header.

2: Create a module "SafeCritical" that re-exports everything except the constructor and any other functions that might be used to corrupt a critical value. Mark this module as "trustworthy".

3: Mark any modules that are required to handle Critical values without corruption as "safe". Then use this to demonstrate that any function imported as "safe" cannot cause corruption to a Critical value.

This will leave a smaller minority of code, such as input code that parses Critical values, requiring further verification. We can't eliminate this code, but reducing the amount that needs detailed verification is still a significant win.

The method is based on the fact that a function cannot invent a new value unless a function returns it. If a function only gets one Critical value (as in the "convert" function above) then that is the only one it can return.

A harder variation of the problem comes when a function has two or more Critical values of the same type; it has to guarantee not to mix them up. For instance,

swapFooBar :: (Foo, Bar) -> (Bar, Foo)
swapFooBar (Foo n c1, Bar s c2) = (Bar s c1, Foo n c2)

However this can be handled by giving the same treatment to the containing data structures.

3
I consider not exporting the data constructors as a very good method. You might also be interested in Safe Haskell.fuz
While the language/type-system do certainly help, I'm not sure how programs written by humans can be (without countless time spent on formal proofs and analysis, etc.) proven "no corruption"...user166390
@pst Safe Haskell is a step in the right direction.fuz
@FUZxxl Yes, types and "type safety" can be talked about all day. However, consider if the following function is foolishly written: add x y = x * y. Now, considering that I use said function (which has the same signature and static "type safety" guarantees of a similar mul) and expect to add the inputs then there is clearly a flaw in the code. But this is diverging from the question which is specific...user166390
Note that parametricity can also be used in your swapFooBar example, by giving it the type (Foo c1, Bar c2) -> (Bar c1, Foo c2).hammar

3 Answers

19
votes

You can use parametricity to get partway there

data Foo c = Foo Integer c
data Bar c = Bar String c

goodConvert :: Foo c -> Bar c
goodConvert (Foo n c) = Bar (show n) c

Since c is an unconstrained type variable, you know that the function goodConvert cannot know anything about c, and therefore cannot construct a different value of that type. It has to use the one provided in the input.

Well, almost. Bottom values allow you to break this guarantee. However, you at least know that if you try to use a "corrupted" value, it will result in an exception (or non-termination).

badConvert :: Foo c -> Bar c
badConvert (Foo n c) = Bar (show n) undefined
14
votes

While hammar's solution is excellent and I would normally suggest smart constructors / not exporting the constructor, today I decided to try solving this in the Coq proof assistant and extracting to Haskell.

Take note! I am not very well versed in Coq / extraction. Some people have done good work with proving and extracting Haskell code, so look to them for quality examples - I'm just toying!

First we want to define your data types. In Coq this looks much like Haskell GADTs:

Require Import String.
Require Import ZArith.

Inductive Critical :=
  Crit : string -> Critical.

Inductive FooT :=
  Foo : Z -> Critical -> FooT.

Inductive BarT :=
  Bar : string -> Critical -> BarT.

Think of those Inductive lines, such as Inductive FooT := Foo : ... ., as data type declarations: data FooT = Foo Integer Critical

For ease of use, lets get some field accessors:

Definition critF f := match f with Foo _ c => c end.
Definition critB b := match b with Bar _ c => c end.

Since Coq doesn't define many "show" style functions, I'll use a placeholder for showing integers.

Definition ascii_of_Z (z : Z) : string := EmptyString. (* FIXME *)

Now we've got the basics, lets define the goodConvert function!

Definition goodConvert (foo : FooT) : BarT :=
  match foo with
    Foo n c => Bar (ascii_of_Z n) c
  end.

That's all fairly obvious - it's your convert function but in Coq and using a case like statement instead of top-level pattern matching. But how do we know this function is actually going to maintain the invariant? We prove it!

Lemma convertIsGood : forall (f : FooT) (b : BarT),
  goodConvert f = b -> critF f = critB b.
Proof.
  intros. 
  destruct f. destruct b.
  unfold goodConvert in H. simpl.
  inversion H. reflexivity.
Qed.

That says that if converting f results in b then the critical field of f must be the same as the critical field of b (assuming some minor things, such as you not messing up the field accessor implementations).

Now lets extract this to Haskell!

Extraction Language Haskell.
Extract Constant ascii_of_Z => "Prelude.show".  (* obviously, all sorts of unsafe and incorrect behavior can be introduced by your extraction *)
Extract Inductive string => "Prelude.String" ["[]" ":"]. Print positive.
Extract Inductive positive => "Prelude.Integer" ["`Data.Bits.shiftL` 1 + 1" "`Data.Bits.shiftL` 1" "1"].
Extract Inductive Z => "Prelude.Integer" ["0" "" ""].

Extraction "so.hs" goodConvert critF critB.

Producing:

module So where

import qualified Prelude

data Bool =
   True
 | False

data Ascii0 =
   Ascii Bool Bool Bool Bool Bool Bool Bool Bool

type Critical =
  Prelude.String
  -- singleton inductive, whose constructor was crit

data FooT =
   Foo Prelude.Integer Critical

data BarT =
   Bar Prelude.String Critical

critF :: FooT -> Critical
critF f =
  case f of {
   Foo z c -> c}

critB :: BarT -> Critical
critB b =
  case b of {
   Bar s c -> c}

ascii_of_Z :: Prelude.Integer -> Prelude.String
ascii_of_Z z =
  []

goodConvert :: FooT -> BarT
goodConvert foo =
  case foo of {
   Foo n c -> Bar (ascii_of_Z n) c}

Can we run it?? Does it work?

> critB $ goodConvert (Foo 32 "hi")
"hi"

Great! If anyone has suggestions for me, even though this is an "answer", I'm all ears. I'm not sure how to drop the dead code of things like Ascii0 or Bool, not to mention make good show instances. If anyone's curious, I think the field names can be done automatically if I used a Record instead of an Inductive, but that might make this post syntactically uglier.

3
votes

I think the solution of hiding constructors is idiomatic. You can export two functions:

mkCritical :: String -> D Critical
extract :: Critical -> String

where D is the trivial monad, or any other. Any function that creates objects of type Critical at some point is marked with D. A function without that D can extract data from Critical objects, but not create new ones.

Alternatively:

data C a = C a Critical
modify :: (a -> String -> b) -> C a -> C b
modify f (C x (Critical y)) = C (f x y) (Critical y)

If you don't export constructor C, only modify, you can write:

goodConvert :: C Int -> C String
goodConvert = modify (\(a, _) -> show a)

but badConvert is impossible to write.