Wednesday, 14 September 2011

Serialization 101 - Part I: Binary Serialization


Foreword

It's been several weeks since I last posted on my blog, due to various other commitments over the summer. Now that I have a bit more time, it seemed a good moment to get going again.
When deciding what topic to kick-start with, I decided upon another 101 mini-series, this time looking at serialization using the .NET framework. For those of you new to .NET, hopefully this will serve as a good primer for getting into serialization; and for the more experienced developers amongst you, perhaps it will serve as a useful refresher (as indeed it did for me, whilst I was preparing the sample code!).

Introduction

Serialization is the process of persisting the state of an object to some form of permanent storage medium. This is achieved by converting the public and private fields of an object, as well as its class name and assembly name, to a stream of bytes which is then persisted onto the selected storage medium. De-serialization, as the name suggests, reverses the process. As an aside, serialization and de-serialization are sometimes referred to as 'dehydrating' and 'rehydrating' an object respectively.

There are three main mechanisms for serialization in .NET:

  • Binary serialization: Persistence of objects to a binary storage format. We will look at this in this article.
  • SOAP serialization: Persistence of objects in SOAP (Simple Object Access Protocol) format.
  • XML serialization: Persistence of objects as XML.

The Serialization Farmyard

Continuing our farmyard theme from previous articles, the Serialization Farmyard is a simple Windows application in which the user can create a new farmyard, populate it with animals and then save it to disk for later use. The user can also reload a previous farmyard from disk and edit it as they see fit. The application uses binary serialization/de-serialization for saving and loading farmyards, and the object model is shown below:

Making Objects Serializable

Now let's take a quick look at the Animal class:

// C#
[Serializable]
public abstract class Animal
{
    public virtual string Name { get; set; }
    public virtual int Arrived { get; set; }

    public abstract int SpaceRequired { get; }
}
' Visual Basic
<Serializable()>
Public MustInherit Class Animal

    Public Property Name As String
    Public Property Arrived As Integer

    Public MustOverride ReadOnly Property SpaceRequired As Integer

End Class

Notice how the class has been decorated with the Serializable attribute? This is how we mark a class as being binary serializable (and SOAP serializable too; XML serialization uses a totally different mechanism). It is important to note, however, that for an object to be fully serializable, not only must its class be decorated with the Serializable attribute, but every one of its fields must also either be of a serializable type or be explicitly marked as non-serialized (we will look at an example of this later). If not, an exception will be thrown at run-time. In the above example, the members are of type string and int, both of which are serializable.

Now you could quite easily assume that if a base class is marked as serializable, any sub-classes would also be serializable. This, however, is not the case, as the Serializable attribute is not inheritable. Any sub-classes of our abstract Animal class must therefore also be explicitly decorated with the Serializable attribute, for example:

// C#
[Serializable]
public class Sheep : Animal
{
    public override int SpaceRequired
    {
        get { return 3; }
    }
}
' Visual Basic
<Serializable()>
Public Class Sheep
    Inherits Animal

    Public Overrides ReadOnly Property SpaceRequired As Integer
        Get
            Return 3
        End Get
    End Property
End Class

Also, don't think you can get away with just decorating your sub-classes with the Serializable attribute! All classes up the inheritance hierarchy must also be suitably decorated, otherwise an exception will be thrown at run-time.
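As a quick way of checking this for yourself, here is a minimal sketch that uses the Type.IsSerializable property to inspect each class. The Goat class is a hypothetical addition (it is not part of the farmyard application) that has deliberately been left undecorated, and the sketch assumes the .NET Framework:

```csharp
// C#
using System;

[Serializable]
public abstract class Animal
{
    public abstract int SpaceRequired { get; }
}

// Hypothetical sub-class, deliberately NOT decorated with [Serializable].
public class Goat : Animal
{
    public override int SpaceRequired { get { return 2; } }
}

[Serializable]
public class Sheep : Animal
{
    public override int SpaceRequired { get { return 3; } }
}

public static class InheritanceCheck
{
    public static void Main()
    {
        // The attribute on Animal does not carry over to Goat.
        Console.WriteLine(typeof(Animal).IsSerializable); // True
        Console.WriteLine(typeof(Goat).IsSerializable);   // False
        Console.WriteLine(typeof(Sheep).IsSerializable);  // True
    }
}
```

A check like this can be handy in a unit test, so that a missing attribute is caught before it surfaces as a run-time exception.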

Now let's look at the Farmyard class. Firstly, the class definition:

// C#
[Serializable]
public class Farmyard
{
    private List<Animal> animals;
    
    public string Name { get; set; }
    public int Day { get; set; }
    public int Capacity { get; set; }
}
' Visual Basic
<Serializable()>
Public Class Farmyard

    Private animals As List(Of Animal)

    Public Property Name As String
    Public Property Day As Integer
    Public Property Capacity As Integer
End Class

As before, the class is decorated with the Serializable attribute. The properties Name, Day and Capacity are of type string and int, and can therefore be serialized. If you look at the MSDN documentation, you will see that the List<T> class is serializable, provided that the type of T is also serializable. In this case it will be, as our Animal class has been decorated with the Serializable attribute.

Now take a look at the Weather property:

// C#
[NonSerialized]
private Weather weather;

public Weather Weather
{
    get { return weather; }
    set { weather = value; }
}
' Visual Basic
<NonSerialized()>
Private _weather As Weather

Public Property Weather As Weather
    Get
        Return _weather
    End Get
    Set(value As Weather)
        _weather = value
    End Set
End Property

This shows how to prevent a member from being serialized at run-time: simply decorate it with the NonSerialized attribute. A common reason for doing this is that the member is of a type that is not serializable. In our example, however, we don't persist the Weather property because we will be assigning it a new value when the farmyard is loaded from disk (see below).

It's also worth noting that the NonSerialized attribute can only be applied to fields and not properties. Therefore we cannot use the "auto-properties" feature of C#/VB, and have to return to the pattern of declaring a private member variable and exposing it through a separate public property.
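To see the effect of the attribute, here is a minimal sketch that round-trips a cut-down Farmyard through a MemoryStream. The values of the Weather enum and the farmyard name are assumptions made purely for illustration, and the sketch assumes the .NET Framework (BinaryFormatter is no longer supported in recent .NET releases):

```csharp
// C#
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical enum values; the article does not list them.
public enum Weather { Sunny, Cloudy, Rainy, Snowy }

[Serializable]
public class Farmyard
{
    [NonSerialized]
    private Weather weather;   // skipped by the serializer

    public string Name { get; set; }

    public Weather Weather
    {
        get { return weather; }
        set { weather = value; }
    }
}

public static class NonSerializedDemo
{
    public static void Main()
    {
        var original = new Farmyard { Name = "Hilltop", Weather = Weather.Rainy };
        Farmyard copy;

        using (var stream = new MemoryStream())
        {
            var formatter = new BinaryFormatter();
            formatter.Serialize(stream, original);
            stream.Position = 0;   // rewind before reading back
            copy = (Farmyard)formatter.Deserialize(stream);
        }

        Console.WriteLine(copy.Name);    // Hilltop
        Console.WriteLine(copy.Weather); // Sunny (the default value), not Rainy
    }
}
```

The non-serialized field comes back with the default value for its type, which is why it needs to be assigned again after loading.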

Now for a real gotcha: As we know, when serializing an object, the .NET serializer will attempt to serialize all members of that object which are not explicitly marked as being non-serializable. This also includes any delegates or events, or more specifically, any handlers that are currently wired up to them. In an application such as this, the objects wired up to any events are often UI objects which are generally not serializable in the first place; and even if they were, we wouldn't want to persist them to disk. Therefore the delegates which handle any events must also be marked as non-serialized:

// C#
[NonSerialized]
private FarmyardEventHandler animalAdded;

public event FarmyardEventHandler AnimalAdded
{
    add { animalAdded = (FarmyardEventHandler)Delegate.Combine(animalAdded, value); }
    remove { animalAdded = (FarmyardEventHandler)Delegate.Remove(animalAdded, value); }
}
' Visual Basic
<NonSerialized()>
Private _animalAdded As FarmyardEventHandler

Public Custom Event AnimalAdded As FarmyardEventHandler
    AddHandler(value As FarmyardEventHandler)
        _animalAdded = DirectCast([Delegate].Combine(_animalAdded, value), FarmyardEventHandler)
    End AddHandler

    RemoveHandler(value As FarmyardEventHandler)
        _animalAdded = DirectCast([Delegate].Remove(_animalAdded, value), FarmyardEventHandler)
    End RemoveHandler

    RaiseEvent(sender As Object, eventArgs As System.EventArgs)
        ' Guard against there being no handlers wired up
        If _animalAdded IsNot Nothing Then
            _animalAdded(sender, eventArgs)
        End If
    End RaiseEvent
End Event

As with non-serialized properties, we have to declare a private delegate, decorated with the NonSerialized attribute, and expose it through a public event.
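In C# there is also a shorter alternative, which is not the approach used in the farmyard application: an ordinary field-like event can be decorated with the attribute using the "field:" target. The sketch below assumes the .NET Framework, uses the standard EventHandler delegate rather than FarmyardEventHandler (whose signature is not shown here), and uses a hypothetical FarmyardWindow class to stand in for a non-serializable UI object:

```csharp
// C#
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Farmyard
{
    // "field:" applies the attribute to the compiler-generated delegate
    // field behind the event, rather than to the event itself.
    [field: NonSerialized]
    public event EventHandler AnimalAdded;

    public void AddAnimal()
    {
        EventHandler handler = AnimalAdded;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

// Hypothetical stand-in for a UI object; note it is NOT serializable.
public class FarmyardWindow
{
    public int Notifications;

    public void OnAnimalAdded(object sender, EventArgs e) { Notifications++; }
}

public static class EventDemo
{
    public static void Main()
    {
        var window = new FarmyardWindow();
        var farmyard = new Farmyard();
        farmyard.AnimalAdded += window.OnAnimalAdded;

        using (var stream = new MemoryStream())
        {
            // Succeeds because the handler list is skipped. Without the
            // attribute, the serializer would try to persist FarmyardWindow
            // and throw a SerializationException.
            new BinaryFormatter().Serialize(stream, farmyard);
            Console.WriteLine(stream.Length > 0); // True
        }
    }
}
```

Either way, the handlers are not restored on de-serialization, so any listeners need to be wired up again after loading.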

Controlling and Customising Serialization

There are two main approaches to controlling and customising binary serialization in the .NET framework. One is to implement the ISerializable interface (which we won't look at in this article, but you can find more information in the MSDN documentation). The other, which is the Microsoft-recommended approach, is to decorate methods in your class with the following attributes, so that they are executed at specific points during the serialization/de-serialization process:

  • OnSerializing
  • OnSerialized
  • OnDeserializing
  • OnDeserialized

We can see an example of this in the Farmyard class of our application:

// C#
[OnDeserialized]
private void OnDeserialized(StreamingContext context)
{
    NewDay();
} 

public void NewDay()
{
    Day++;
    Random random = new Random();
    Weather = (Weather)random.Next(0, 4);
    OnNewDay(new FarmyardEventArgs());
}
' Visual Basic
<OnDeserialized()>
Private Sub OnDeserialized(ByVal context As StreamingContext)
    NewDay()
End Sub

Public Sub NewDay()
    Day = Day + 1
    Dim random As Random = New Random()
    Weather = DirectCast(random.Next(0, 4), Weather)
    OnNewDay(New FarmyardEventArgs())
End Sub

As you can see, the method is decorated with the OnDeserialized attribute which means it will be executed once de-serialization is complete. This method calls the NewDay() method, which increments the day and sets the Weather property to a random value. Remember, we are explicitly not serializing the weather when we save the farmyard to disk.

Note also, that the method takes an object of type StreamingContext, and although we don't use it in our example, it provides information about the source and destination of the current serialization stream, as well as any caller-defined data. As always, you can read more about this in the MSDN documentation.
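To illustrate when each of the four callbacks fires, here is a minimal sketch that logs them during a single round trip. The CallbackProbe class and its method names are hypothetical, and the sketch assumes the .NET Framework:

```csharp
// C#
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class CallbackProbe
{
    // Static, so the original and the de-serialized instance share one log.
    // Static fields are not part of an object's serialized state.
    public static readonly List<string> Log = new List<string>();

    [OnSerializing]
    private void BeforeSave(StreamingContext context) { Log.Add("OnSerializing"); }

    [OnSerialized]
    private void AfterSave(StreamingContext context) { Log.Add("OnSerialized"); }

    [OnDeserializing]
    private void BeforeLoad(StreamingContext context) { Log.Add("OnDeserializing"); }

    [OnDeserialized]
    private void AfterLoad(StreamingContext context) { Log.Add("OnDeserialized"); }
}

public static class CallbackOrderDemo
{
    public static void Main()
    {
        using (var stream = new MemoryStream())
        {
            var formatter = new BinaryFormatter();
            formatter.Serialize(stream, new CallbackProbe());
            stream.Position = 0;   // rewind before reading back
            formatter.Deserialize(stream);
        }

        Console.WriteLine(string.Join(", ", CallbackProbe.Log));
        // OnSerializing, OnSerialized, OnDeserializing, OnDeserialized
    }
}
```

Because constructors are not run during de-serialization, a method marked with OnDeserializing is a useful place to give non-serialized fields their default values, whilst OnDeserialized suits work that depends on the restored state, such as our call to NewDay().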

Performing Serialization

OK, so we've looked at how we declare which classes and members are to be serialized (or not, as the case may be!), but how do we actually go about serializing our data to storage and retrieving it later? Well, the answer is we need to use a BinaryFormatter object. The two methods of interest on this object are Serialize() and Deserialize(), for serialization and de-serialization respectively. Here is the code for saving a farmyard to disk:

// C#
using (Stream fileStream = new FileStream(file, FileMode.Create, FileAccess.Write, FileShare.None))
{
    IFormatter formatter = new BinaryFormatter();
    formatter.Serialize(fileStream, farmyard);
}
' Visual Basic
Using fileStream As Stream = New FileStream(file, FileMode.Create, FileAccess.Write, FileShare.None)
    Dim formatter As IFormatter = New BinaryFormatter
    formatter.Serialize(fileStream, farmyard)
End Using

And for loading a saved farmyard from disk:

// C#
using (Stream fileStream = new FileStream(file, FileMode.Open, FileAccess.Read, FileShare.Read))
{
    IFormatter formatter = new BinaryFormatter();
    farmyard = (Farmyard)formatter.Deserialize(fileStream);
}
' Visual Basic
Using fileStream As Stream = New FileStream(file, FileMode.Open, FileAccess.Read, FileShare.Read)
    Dim formatter As IFormatter = New BinaryFormatter
    farmyard = DirectCast(formatter.Deserialize(fileStream), Farmyard)
End Using

Note how the Serialize() and Deserialize() methods of the BinaryFormatter object accept an object of type Stream. In our example, we are using a FileStream object as we are serializing our farmyard to disk, but we could just as easily have used a MemoryStream or NetworkStream for serializing to memory or over a network respectively.
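As an illustration of swapping the stream type, here is a minimal sketch that serializes to a MemoryStream and immediately de-serializes again, which gives a simple way of making a deep copy of a serializable object. The CloneHelper class and its DeepClone method are hypothetical names, and the sketch assumes the .NET Framework:

```csharp
// C#
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical helper; not part of the farmyard application.
public static class CloneHelper
{
    public static T DeepClone<T>(T source)
    {
        using (var stream = new MemoryStream())
        {
            var formatter = new BinaryFormatter();
            formatter.Serialize(stream, source);
            stream.Position = 0;   // rewind before reading back
            return (T)formatter.Deserialize(stream);
        }
    }
}

public static class MemoryStreamDemo
{
    public static void Main()
    {
        var names = new List<string> { "Dolly", "Shaun" };
        List<string> copy = CloneHelper.DeepClone(names);

        copy.Add("Timmy");   // only the copy changes

        Console.WriteLine(names.Count); // 2
        Console.WriteLine(copy.Count);  // 3
    }
}
```

The only change from the file-based code is the type of stream that is handed to the formatter.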

Note also that, when de-serializing an object, a cast is required as the Deserialize() method simply returns an instance of type Object.

Summary

Binary serialization provides an easy mechanism for persisting objects to permanent storage. By the use of various attributes we can control and refine the serialization process to suit our particular needs.

The source code for the farmyard application can be downloaded here. Unfortunately, due to time constraints, the source code is in C# only.

Next: SOAP Serialization
