In section 1.3 "Opening Handshake" of draft-ietf-hybi-thewebsocketprotocol-17, it describes Sec-WebSocket-Key
as follows:
To prove that the handshake was received, the server has to take two pieces of information and combine them to form a response. The first piece of information comes from the |Sec-WebSocket-Key| header field in the client handshake:
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
For this header field, the server has to take the value (as present in the header field, e.g. the base64-encoded [RFC4648] version minus any leading and trailing whitespace), and concatenate this with the Globally Unique Identifier (GUID, [RFC4122]) "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" in string form, which is unlikely to be used by network endpoints that do not understand the WebSocket protocol. A SHA-1 hash (160 bits), base64-encoded (see Section 4 of [RFC4648]), of this concatenation is then returned in the server's handshake [FIPS.180-2.2002].
Here's the thing I can't understand: why not simply return code 101? If the proper use of Sec-WebSocket-Key
is for security, or to prove they can handle websocket requests, then any server could return the expected key if they wanted to, and pretend they are a WebSocket server.