Any comments, contributions, or feedback? Ping me!
Follow @adlrocha Tweet to @adlrocha
@adlrocha - Never take Marshaling for granted
Marshaling interfaces in Go.
We sometimes take marshaling for granted, but** there is more than one occasion in which you have no choice but to write your own custom marshaller. Either because your compiler/marshal library is not able to infer how to automatically marshal your objects, because your marshaler is not generating the right output (we saw last week how a JSON document can be parsed with different values across microservices, leading to a variety of potential security risk), etc. The fact is that for one reason or the other, one day you may end up having to tinker with your marshaller, and you better be prepared if you don’t want to drain your time trying to make things work. This is exactly what happened to me this week. Let me walk you through the marvelous world of interface marshaling in Golang.
Vanilla Marshal
I guess everyone is aware about what marshaling means (at least in the field of computer science), but just in case, marshaling _“is the process of transforming the memory representation of an object into a data format suitable for storage or transmission”. _You can marshal an object to several different formats: from formats such as JSON, or XML, to binary representations like CBOR. Throughout this publication, we are going to focus on JSON marshaling, but all of the concepts presented are applicable when targeting other types of representation.
Until you face complex use cases, marshaling seems like a straightforward thing in Golang. You take the encoding/json
library (or the convenient want for your serialization format), annotate your objects, and let the library do the rest. Let’s look at a quick example:
package main
import (
"fmt"
"encoding/json"
)
type Pair struct {
Key string `json:"key"`
Value int `json:"value"`
}
type Pairs []Pair
func main() {
// Marshaling a struct
fmt.Println("== Marshalling struct ==")
p := Pair{Key: "someKey", Value: 1}
// Marshal
byteData, err := json.Marshal(p)
checkErr(err)
fmt.Println("Marshalled:", string(byteData))
// Unmarshal into Pair struct
pout := Pair{}
err = json.Unmarshal(byteData, &pout)
checkErr(err)
fmt.Println("Unmarshalled:", pout)
// Marshaling a list of Pairs
fmt.Println("== Marshalling pairs ==")
p1 := Pair{Key: "someKey", Value: 1}
p2 := Pair{Key: "otherKey", Value: 2}
pl := Pairs{p1, p2}
byteData, err = json.Marshal(pl)
checkErr(err)
fmt.Println("Marshalled:", string(byteData))
// Unmarshalling into the right type
plout := Pairs{}
err = json.Unmarshal(byteData, &plout)
checkErr(err)
fmt.Println("Unmarshalled:", plout)
}
func checkErr(err error){
if err != nil{
panic(err)
}
}
Playground link: https://play.golang.org/p/ZoKXcUuFa1d
We created a Pair
and Pairs
structs, annotated the Pair struct, and marshalled them without involving any kind of black magic. Everything works “out-of-the-box”. Marshaling doesn’t seem that hard right?
Introducing interfaces to the mix
But what happens when we start introducing interfaces to the mix? Things start getting a bit messier. Let Pair have now a key and a value of type Node. Node is an interface type, so it means that key and value can be of several different types. To see what happens when marshaling interface types, we create two new String and Int types which implement the Node interface. Let’s see what happens now when we try to marshal something using the straightforward and naïve approach from above.
package main
import (
"fmt"
"encoding/json"
)
// Object Structs
type Node interface{
Print()
}
type Pair struct {
Key Node `json:"key"`
Value Node `json:"value"`
}
type String struct {
Value string
}
type Int struct {
Value int
}
func (s String) Print(){
fmt.Println(s.Value)
}
func (i Int) Print(){
fmt.Println(i.Value)
}
type Pairs []Pair
func main() {
// Marshaling a struct
fmt.Println("== Marshalling struct ==")
p := Pair{Key: String{"someKey"}, Value: Int{1}}
// Marshal
byteData, err := json.Marshal(p)
checkErr(err)
fmt.Println("Marshalled:", string(byteData))
// Unmarshal into Pair struct
pout := Pair{}
err = json.Unmarshal(byteData, &pout)
checkErr(err)
fmt.Println("Unmarshalled:", pout)
}
func checkErr(err error){
if err != nil{
panic(err)
}
}
Playground link: https://play.golang.org/p/tbG9jrVA_u7
Oh, oh! Problems! We are doing exactly the same as before but now the types inside the struct are interfaces. Unmarshaling doesn’t go as smooth as expected, and now we are getting the following error:
panic: json: cannot unmarshal object into Go struct field Pair.key of type main.Node
Why is this happening? The package encoding/json
uses reflect
under the hood to infer the type in which it needs to unmarshal the serialization. **But interfaces are dynamic types, and our JSON marshaler is not able to infer by itself the right type to use for the unmarshaling. **What can we do? We will have to give some hints to our unmarshaller.
Building a custom unmarshaller
We can give our unmarshaller hints in several ways. The first thing we are going to try is to write a custom unmarshaller for Pair, so we can manually process the serialization and assign the right Node type. The encoding/json
package lets you overwrite its interface to implement your custom unmarshaller, and that is exactly what we are going to do. For this task, we are going to use the RawMessage capabilities of theencoding/json
package. _“RawMessage is a raw encoded JSON value. It implements Marshaler and Unmarshaler and can be used to delay JSON decoding or precompute a JSON encoding”. _This is how our custom unmarshaller for Pair looks like:
// Custom unmarshal for pairs
func (p *Pair) UnmarshalJSON(b []byte) error {
// Use RawMessage to get the key and value of the struct
var objMap map[string]*json.RawMessage
err := json.Unmarshal(b, &objMap)
if err != nil {
return err
}
// Let the compiler know they are of type String
var k, v String
// Unmarshal the key and value
err = json.Unmarshal(*objMap["key"], &k)
if err != nil {
return err
}
err = json.Unmarshal(*objMap["value"], &v)
if err != nil {
return err
}
p.Key = k
p.Value = v
return nil
}
We use RawMessage to get the raw bytes of the Key and Value fields of the struct, and we perform the independent unmarshalling of both letting our unmarshaller know that in this case both, key and value, are of type String. With this simple trick we managed to unmarshal Pairs whose key and value are of type String, but what happens if we create an object where Key or Value are of type Int? Things start breaking again, because our custom unmarshaller only understands String Nodes and not Int Nodes. But how can we tell our unmarshaller that the Key or the Value are of a certain type? We need to add this knowledge to our serialized format.
Building a custom unmarshaller
The same way we overwrite the unmarshaller for Pairs we are going to write a custom marshaller for our Int and String nodes so we can include information about the type in the serialized format that our unmarshaller can use to its convenience. This sounds simple, right? We create, for instance, a MarshalType with an enum of the different types implementing the Node interface, and wrap the current default marshaller for Int and String into a new marshaller that also includes the type. Easy peasy. Wait, don’t be so quick to claim victory. If we naïvely write our marshaller like this:
func (i Int) MarshalJSON() (bdata []byte, e error) {
c := struct {
Type MarshalType `json:"type"`
Value tmp `json:"value"`
}{Type: IntType, Value: ts}
return json.Marshal(&c)
}
We end up reaching an infinite loop. Every time the marshaller encounters (in the above case) an Int type, it calls this function, which already calls json.Marshal for Int (return json.Marshal(&c)
). The infinite loop is served. How can we then wrap the standard recursion in our overwritten marshaller? Using a temporal type to avoid recursion as follows:
func (i Int) MarshalJSON() (bdata []byte, e error) {
// Temporal type to avoid recursion
type tmp Int
ts := tmp(i)
c := struct {
Type MarshalType `json:"type"`
Value tmp `json:"value"`
}{Type: IntType, Value: ts}
return json.Marshal(&c)
}
// Custom Marshal Functions
func (s String) MarshalJSON() (bdata []byte, e error) {
// Temporal type to avoid recursion
type tmp String
ts := tmp(i)
c := struct {
Type MarshalType `json:"type"`
Value tmp `json:"value"`
}{Type: StringType, Value: ts}
return json.Marshal(&c)
}
The two first lines of the function add a temporal type so we avoid recursion when calling the json.Marshal
inside our custom marshaller.
Putting it all together
We are almost done! Now we just have to **modify our Pair custom unmarshaller to identify the type of the Node everytime it encounters one so it knows the right way to unmarshal it. **We create the following auxiliary function for this:
// Unmarshaling Pair types
func UnmarshalType(tp MarshalType, b []byte) (Node, error) {
switch tp {
case StringType:
var n String
err := json.Unmarshal(b, &n)
if err != nil {
return nil, err
}
return n, nil
case IntType:
var n Int
err := json.Unmarshal(b, &n)
if err != nil {
return nil, err
}
return n, nil
}
return nil, fmt.Errorf("Wrong type")
}
This simple function gives the unmarshaller the hint it needs to know how to unmarshal the right type (avoiding the errors we were facing above). Our custom Pair unmarshaller doesn’t change much, we just need to replace the json.Unmarshal by this UnmarshalType, things would run smoothly:
// Custom unmarshal for pairs
func (p *Pair) UnmarshalJSON(b []byte) error {
var objMap map[string]map[string]*json.RawMessage
err := json.Unmarshal(b, &objMap)
if err != nil {
return err
}
var kType, vType int
json.Unmarshal(*objMap["key"]["type"], &kType)
json.Unmarshal(*objMap["value"]["type"], &vType)
p.Key, err = UnmarshalType(MarshalType(kType), *objMap["key"]["value"])
if err != nil {
eturn err
}
p.Value, err = UnmarshalType(MarshalType(vType), *objMap["value"]["value"])
if err != nil {
return err
}
return nil
}
And voilá, we managed to build our custom marshaller and unmarshaller to serialize dynamic types. Cool right? Here is the full working code:
/* Copyright (c) 2021, Alfonso de la Rocha
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
package main
import (
"fmt"
"encoding/json"
)
// Object Structs
type Node interface{
Print()
}
type Pair struct {
Key Node `json:"key"`
Value Node `json:"value"`
}
type Pairs []Pair
type String struct {
Value string
}
type Int struct {
Value int
}
func (s String) Print(){
fmt.Println(s.Value)
}
// MarshalType to strongly type json
type MarshalType int
const (
StringType = iota
IntType
)
func (i Int) Print(){
fmt.Println(i.Value)
}
// Custom Marshal Functions
func (s String) MarshalJSON() (bdata []byte, e error) {
// Temporal type to avoid recursion
type tmp String
ts := tmp(s)
c := struct {
Type MarshalType `json:"type"`
Value tmp `json:"value"`
}{Type: StringType, Value: ts}
return json.Marshal(&c)
}
func (i Int) MarshalJSON() (bdata []byte, e error) {
// Temporal type to avoid recursion
type tmp Int
ts := tmp(i)
c := struct {
Type MarshalType `json:"type"`
Value tmp `json:"value"`
}{Type: IntType, Value: ts}
return json.Marshal(&c)
}
// Unmarshaling Pair types
func UnmarshalType(tp MarshalType, b []byte) (Node, error) {
switch tp {
case StringType:
var n String
err := json.Unmarshal(b, &n)
if err != nil {
return nil, err
}
return n, nil
case IntType:
var n Int
err := json.Unmarshal(b, &n)
if err != nil {
return nil, err
}
return n, nil
}
return nil, fmt.Errorf("Wrong type")
}
// Custom unmarshal for pairs
func (p *Pair) UnmarshalJSON(b []byte) error {
var objMap map[string]map[string]*json.RawMessage
err := json.Unmarshal(b, &objMap)
if err != nil {
return err
}
var kType, vType int
json.Unmarshal(*objMap["key"]["type"], &kType)
json.Unmarshal(*objMap["value"]["type"], &vType)
p.Key, err = UnmarshalType(MarshalType(kType), *objMap["key"]["value"])
if err != nil {
return err
}
p.Value, err = UnmarshalType(MarshalType(vType), *objMap["value"]["value"])
if err != nil {
return err
}
return nil
}
func main() {
// Marshaling a struct
fmt.Println("== Marshalling struct ==")
p := Pair{Key: String{"someKey"}, Value: Int{1}}
// Marshal
byteData, err := json.Marshal(p)
checkErr(err)
fmt.Println("Marshalled:", string(byteData))
// Unmarshal into Pair struct
pout := Pair{}
err = json.Unmarshal(byteData, &pout)
fmt.Println("Unmarshalled:", pout)
// Marshaling a list of Pairs
fmt.Println("== Marshalling pairs ==")
p1 := Pair{Key: String{"someKey"}, Value: Int{1}}
p2 := Pair{Key: String{"otherKey"}, Value: Int{2}}
pl := Pairs{p1, p2}
byteData, err = json.Marshal(pl)
checkErr(err)
fmt.Println("Marshalled:", string(byteData))
// Unmarshalling into the right type
plout := Pairs{}
err = json.Unmarshal(byteData, &plout)
fmt.Println("Unmarshalled:", plout)
}
func checkErr(err error){
if err != nil{
panic(err)
}
}
Playground link: https://play.golang.org/p/RNPwJkzZ6PU
Gits hosting the code: https://gist.github.com/adlrocha/28c522e0bb3e628d65531a84b5c7fb5e
Did you love this publication? Did you hate that you couldn’t read it directly from your email? Send me some love and good feedback here. And if you got to this post using your favorite search engine, do not forget to subscribe for more content here.
Any comments, contributions, or feedback? Ping me!