1
votes

I am trying to learn backend development by building a very basic REST API using the gorilla/mux library in Go (following this tutorial).

Here's the code that I have built so far:

package main

import (
   "encoding/json"
   "net/http"

   "github.com/gorilla/mux"
)

// Post represents a single post by a user
type Post struct {
   Title  string `json:"title"`
   Body   string `json:"body"`
   Author User   `json:"author"`
}

// User is a struct that represents a user
type User struct {
   FullName string `json:"fullName"`
   Username string `json:"username"`
   Email    string `json:"email"`
}

var posts []Post = []Post{}

func main() {
   router := mux.NewRouter()
   router.HandleFunc("/posts", addItem).Methods("POST")
   http.ListenAndServe(":5000", router)
}

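func addItem(w http.ResponseWriter, req *http.Request) {
   var newPost Post
   // Decode reads the JSON request body and fills in newPost.
   if err := json.NewDecoder(req.Body).Decode(&newPost); err != nil {
      http.Error(w, err.Error(), http.StatusBadRequest)
      return
   }

   posts = append(posts, newPost)

   // Encode converts posts to JSON and writes the bytes to the response.
   w.Header().Set("Content-Type", "application/json")
   json.NewEncoder(w).Encode(posts)
}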

However, I'm really confused about what exactly is happening in the json.NewDecoder and json.NewEncoder parts.

As far as I understand, data transfer over the internet in a REST API ultimately happens as bytes/binary (encoded in UTF-8, I guess?). So json.NewEncoder converts a Go data structure to a JSON string and json.NewDecoder does the opposite (correct me if I'm wrong).

  • So who is responsible here for converting this JSON string to UTF-8 encoding for data transfer? Is that also part of what json.NewDecoder and json.NewEncoder do?
  • Also, if these two functions are only serializing/deserializing to/from JSON, why the names encoder and decoder (isn't encoding always related to binary data conversion)? Honestly, I'm pretty confused by the terms encoding, serialization, and marshaling, and the differences between them.

Can someone explain how exactly the data transfer happens here at each conversion level (JSON, binary, in-memory data structure)?

2
The output JSON string is already UTF-8. UTF-8 is just a text encoding. All data, in memory, on disk, in a register, on the wire, anywhere, is bytes. It sounds like you may be overthinking the encoding a bit. As to the terms, you'd have to ask the authors why they chose them, but "encoding", "serialization", and "marshaling" can all be used to refer to the same thing, as can their opposites (plus other terms like "parsing"). – Adrian
@Adrian I guess my confusion comes not from anything Go-specific but from data transfer concepts in general. So json.NewEncoder is also converting the JSON string to its UTF-8 encoding? Because this is a JSON string: "{'name': 'John'}" and this is its UTF-8 encoding: \x20\x22\x7b\x27\x6e\x61\x6d\x65\x27\x3a\x20\x27\x4a\x6f\x68\x6e\x27\x7d\x22 They aren't the same, obviously. – D_S_X
No, "{'name': 'John'}" is its UTF-8 encoding. Bytes up to 127 are the same in ASCII and UTF-8. – Adrian
Also worth noting: strings in Go are presumed to be UTF-8 by default, the JSON spec also indicates that JSON documents are presumed UTF-8 by default, and the encoding/json documentation states that it follows the JSON spec (RFC 7159). – Adrian
Okay, but ultimately wouldn't all data be converted to/from binary? Going from high to low level and back, I think this would be the data flow in a GET request: Go data structure -> JSON -> UTF-8 -> binary ------- data transfer over wire ------- binary -> UTF-8 -> JSON -> JSON.parse in browser. Now this question might be far-fetched, but who exactly is responsible for converting to/from binary data here? Also, can you elaborate on what you mean by all data being bytes? – D_S_X

2 Answers

2
votes

First, we have to understand that encoding here doesn't simply mean translating a value and returning its JSON representation. The step that gives you the JSON representation is called marshaling, and you can do it directly by calling the json.Marshal function.

Encoding, on the other hand, means producing the JSON encoding of a value and writing it to a stream that implements the io.Writer interface. As the signature func NewEncoder(w io.Writer) *Encoder shows, NewEncoder takes an io.Writer and returns a *json.Encoder. When Encode() is called on that encoder, it marshals the value and then writes the result to the io.Writer that was passed to NewEncoder. You can see the implementation of json.Encoder.Encode() here.
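To make the difference concrete, here is a minimal sketch (not part of the original answer) that runs the same value through json.Marshal and through a json.Encoder writing into a bytes.Buffer. The helper names marshalUser and encodeUser and the sample data are made up for illustration; the User struct is copied from the question.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// User mirrors the struct from the question.
type User struct {
	FullName string `json:"fullName"`
	Username string `json:"username"`
	Email    string `json:"email"`
}

// marshalUser (hypothetical helper) returns the JSON for u as a byte
// slice held in memory.
func marshalUser(u User) ([]byte, error) {
	return json.Marshal(u)
}

// encodeUser (hypothetical helper) writes the JSON for u to an io.Writer,
// here a bytes.Buffer, and returns everything that was written.
func encodeUser(u User) ([]byte, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(u); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	u := User{FullName: "John Doe", Username: "john", Email: "john@example.com"}

	m, err := marshalUser(u)
	if err != nil {
		panic(err)
	}
	e, err := encodeUser(u)
	if err != nil {
		panic(err)
	}

	fmt.Printf("Marshal: %q\n", m)
	fmt.Printf("Encode:  %q\n", e)
	// Encode appends a newline after the JSON value; otherwise the bytes match.
	fmt.Println(string(m)+"\n" == string(e))
}
```

The only difference in the output is the trailing newline that Encode adds; the real distinction is where the bytes end up (a returned slice versus a writer).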

So, if you ask who writes the encoded data to the HTTP stream, the answer is the http.ResponseWriter. ResponseWriter implements the io.Writer interface, so when Encode() is called, the encoder marshals the object to its JSON representation and then calls Write([]byte) (int, error), the one method that io.Writer requires, which writes the bytes to the HTTP response stream.
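One way to watch this happen without starting a server is to pass the handler an httptest.ResponseRecorder, which is an http.ResponseWriter that keeps the written bytes in memory. The sketch below is an assumption-laden illustration, not the asker's code: echoPost and roundTrip are hypothetical names, and Post is cut down to two fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Post is a reduced version of the struct from the question.
type Post struct {
	Title string `json:"title"`
	Body  string `json:"body"`
}

// echoPost (hypothetical handler) decodes a Post from the request body
// and encodes it straight back into the response.
func echoPost(w http.ResponseWriter, req *http.Request) {
	var p Post
	if err := json.NewDecoder(req.Body).Decode(&p); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	// Encode marshals p and hands the resulting bytes to w.Write.
	json.NewEncoder(w).Encode(p)
}

// roundTrip (hypothetical helper) sends body to echoPost through an
// in-memory recorder and returns the status code and raw response body.
func roundTrip(body string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/posts", strings.NewReader(body))
	rec := httptest.NewRecorder()
	echoPost(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, out := roundTrip(`{"title":"Hello","body":"First post"}`)
	fmt.Println(code)
	fmt.Printf("%q\n", out)
}
```

The recorder's Body holds exactly the bytes the encoder passed to Write, which is what a real client would receive as the response body.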

In summary, Marshal and Unmarshal convert between a Go value and its JSON representation in memory. Encode marshals a value and then writes the result to a stream, and Decode reads a JSON value from a stream and then unmarshals it.

2
votes
  1. The json.Encoder produced by the call to json.NewEncoder directly produces its output in UTF-8. No conversion is necessary. (In fact, Go does not have a representation for textual data that is distinct from UTF-8 encoded sequences of bytes; even a string is just an immutable array of bytes under the hood.)

  2. Go uses the term encode for serialisation and decode for deserialisation, whether the serialised form is binary or textual. Do not think too much about the terminology; consider encode and serialise as synonyms.
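To see that the JSON output already is UTF-8 bytes, a small sketch like the following prints the same marshaled value once as text and once as hex. The jsonBytes helper and the sample names are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jsonBytes (hypothetical helper) marshals a one-field object and returns
// the raw bytes produced by encoding/json.
func jsonBytes(name string) ([]byte, error) {
	return json.Marshal(map[string]string{"name": name})
}

func main() {
	for _, name := range []string{"John", "Zoë"} {
		b, err := jsonBytes(name)
		if err != nil {
			panic(err)
		}
		// %s prints the bytes as text; % x prints the same bytes in hex.
		fmt.Printf("%s\n% x\n", b, b)
	}
}
```

For the ASCII-only name every character is one byte, while the ë in the second name shows up as the two bytes c3 ab, its UTF-8 encoding. In both cases nothing converts the JSON text to binary afterwards; the text and the bytes are the same data printed two ways.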