13
votes

The Python documentation for typing and type hints gives the following example:

from typing import List

Vector = List[float]

def scale(scalar: float, vector: Vector) -> Vector:
    return [scalar * num for num in vector]

The Vector type alias clearly shows that type aliases are useful for simplifying complex type signatures.

However, what about aliasing primitive data types?

Let's contrast two basic examples of function signatures:

URL = str    

def process_url(url: URL) -> URL:
    pass

vs.

def process_url(url: str) -> str:
    pass

The version with the type alias URL for the primitive type str is:

  • self-documenting (among other things, I can now skip documenting the returned value, as it should clearly be a URL),
  • resistant to changes in the type's implementation (I can switch URL to a Dict or namedtuple later on without changing function signatures).

The problem is I cannot find anyone else following such practice. I am simply afraid that I am unintentionally abusing type hints to implement my own ideas instead of following their intended purpose.


Note from 2020-10

Python 3.9 introduces "flexible function and variable annotations", which allow annotations like:

def speed_1(distance: "feet", time: "seconds") -> "miles per hour":
    pass

def speed_2(
    distance: Annotated[float, "feet"], time: Annotated[float, "seconds"]
) -> Annotated[float, "miles per hour"]:
    pass

This makes aliasing data types purely for documentation purposes rather redundant!
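As a rough illustration of how such metadata can be read back, the sketch below gives speed_2 a body and inspects its annotations with typing.get_type_hints. The conversion inside the function is an assumption added only so the example runs; it is not part of the original snippet.

```python
from typing import Annotated, get_type_hints


def speed_2(
    distance: Annotated[float, "feet"], time: Annotated[float, "seconds"]
) -> Annotated[float, "miles per hour"]:
    # Assumed body for illustration: convert feet and seconds to miles per hour.
    miles = distance / 5280
    hours = time / 3600
    return miles / hours


# include_extras=True keeps the Annotated wrapper, so the metadata is readable.
hints_with_units = get_type_hints(speed_2, include_extras=True)
distance_unit = hints_with_units["distance"].__metadata__[0]

# Without include_extras, the metadata is stripped and only float remains.
plain_hints = get_type_hints(speed_2)

one_mile_per_hour = speed_2(5280.0, 3600.0)
```

Note that the metadata is only a label here: a type checker still treats every parameter as a plain float, so it documents the unit without enforcing it.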


4
No, in some programming languages like Haskell, type aliases are even very common: if you later change your mind about the URL type, then you can simply change it, and all type checking will change. In Python, I occasionally see a class that is defined with two identifiers. - Willem Van Onsem
Note that the signature of process_url will be the same in either case! Only the source code shows the alias; any usage has no idea which name was used to refer to the basic type. Even in the first case, the actual signature is just process_url(url: str) -> str. - MisterMiyagi

4 Answers

17
votes

Using an alias to mark the meaning of a value can be misleading and dangerous. If only a subset of values are valid, a NewType should be used instead.

Recall that the use of a type alias declares two types to be equivalent to one another. Doing Alias = Original will make the static type checker treat Alias as being exactly equivalent to Original in all cases. This is useful when you want to simplify complex type signatures.

Simple aliasing works both ways: the alias URL = str means that any URL is a str, and also that any str is a URL, which is usually not correct. A URL is a special kind of str, and not every str can take its place. The alias URL = str is too strong a statement of equality, as it cannot express this distinction. In fact, any inspection that does not look at the source code cannot see the distinction:

In [1]: URL = str
In [2]: def foo(bar: URL):
   ...:     pass
   ...:
In [3]: foo?
Signature: foo(bar: str)
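The same check can be reproduced outside IPython. This is a minimal sketch using the standard inspect module, with foo and bar taken from the session above:

```python
import inspect

URL = str


def foo(bar: URL):
    pass


# The alias is just another name bound to the same class object.
alias_is_str = URL is str

# The rendered signature mentions str; the name URL is not recorded anywhere.
signature_text = str(inspect.signature(foo))
print(signature_text)
```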

Consider that you alias Celsius = float in one module, and Fahrenheit = float in another. This signals that it is valid to use Celsius as Fahrenheit, which is wrong.
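To make the Celsius/Fahrenheit point concrete, here is a rough sketch that uses NewType instead of plain aliases. The function to_fahrenheit is a hypothetical name for illustration. The rejected calls are left as comments because the rejection comes from a static checker such as mypy, not from the interpreter.

```python
from typing import NewType

# Distinct types for the checker; at runtime both are still plain floats.
Celsius = NewType("Celsius", float)
Fahrenheit = NewType("Fahrenheit", float)


def to_fahrenheit(temp: Celsius) -> Fahrenheit:
    # Hypothetical helper: the result must be wrapped explicitly.
    return Fahrenheit(temp * 9 / 5 + 32)


boiling = Celsius(100.0)
boiling_f = to_fahrenheit(boiling)

# A static type checker is expected to reject both of these calls:
# to_fahrenheit(boiling_f)  # Fahrenheit passed where Celsius is required
# to_fahrenheit(100.0)      # plain float passed where Celsius is required
```

With Celsius = float and Fahrenheit = float as aliases, both commented-out calls would type-check without complaint.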

Unless your types carry a genuinely distinct meaning, you should just take a url: str. The name signifies the meaning; the type signifies the valid values. That means your type should be suitable for separating valid from invalid values!

Use aliases to shorten your hints, but use NewType to refine them.

from typing import List, NewType

Vector = List[float]        # alias shortens
URL = NewType("URL", str)   # new type separates
2
votes

I am not sure whether this question is opinion-based, but I have a feeling the general consensus would be that this is a good idea. You state the benefits yourself, not to mention the ability to generalize code.

I would venture that this is not common practice in Python because the language itself is not very restrictive. In addition, the variable is already called url, which is pretty self-explanatory. You could argue that you might have something called json_response or the like that you expect to be a URL, and your method would certainly make that clear. But since Python encourages duck typing, the way the code is used often gives this hint anyway, and type aliasing is just extra safety against an inconsiderate user. It really comes down to common practice, with no good "do that!" explanation.

Final point: type aliasing is, in a sense, the most primitive version of object-oriented programming. You are making clear what properties you expect of this object; in this case, the string should be a valid URL.

1
votes

I guess the question to ask yourself is what the purpose is.

I strongly believe that in Python, readability is all that matters. With this in mind, type hinting, even for primitives, is quite OK. It is even better if the type is masked by a virtual "enum"-like type that does some self-documenting.

That being said, personally I'd go with the first:

URL = str

def process_url(url: URL) -> URL:
    pass

1
votes

I don't know what the general perception is, but I consider it good practice for things that repeat often, as it gives you a single place to define what is meant.

Regarding repetition, suppose you have a lot of functions like

def foo(url: str):
    """
    :param url: explaining url
    """

You'd end up documenting url in each of these functions, so instead you can do

def foo(x: Url):
    pass

The trouble with a type alias is that you can't document it, so I've settled on the following:

import typing

class _Url(str):
    """
    Here you can document the type
    """

Url = typing.Union[_Url, str]

This gets you

  1. the behavior of type alias from the call site point of view (no need to cast it)

  2. while allowing you to express the value meaning in type and

  3. being able to document the type itself

The only downside is that it's not immediately obvious what the union means, but it's technically correct, and I think it's the best that can be done at the moment.