Binary I/O (a.k.a. Object Serialization)


Serialization is the process of converting an object into a stream of bytes so that it can be stored (in memory, a database, or a file) or transmitted across a network. Its main purpose is to save the state of an object so that it can be recreated when needed. The reverse process is called deserialization.

In C++, basic serialization and deserialization can be performed using the write() and read() member functions of the ofstream and ifstream classes.
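As a first illustration, here is a minimal sketch of a round trip using write() and read(). The Point struct, the helper names save_point() and load_point(), and the choice to open the streams with std::ios::binary are assumptions made for this example; they are not part of the class example.

```cpp
#include <fstream>
#include <string>

// Hypothetical plain record: fixed-size members only, no pointers.
struct Point
{
    int x;
    int y;
};

// Write the raw bytes of one Point to a file. Returns true on success.
bool save_point(const std::string& file_name, const Point& p)
{
    std::ofstream out_file(file_name, std::ios::binary);
    if (!out_file)
        return false;

    out_file.write(reinterpret_cast<const char*>(&p), sizeof(Point));
    out_file.close();                       // flushes; sets failbit on error
    return static_cast<bool>(out_file);
}

// Read the raw bytes back into a Point. Returns true on success.
bool load_point(const std::string& file_name, Point& p)
{
    std::ifstream in_file(file_name, std::ios::binary);
    if (!in_file)
        return false;

    in_file.read(reinterpret_cast<char*>(&p), sizeof(Point));
    return static_cast<bool>(in_file);
}
```

Calling save_point("point.bin", p) and then load_point("point.bin", q) in the same program should leave q with the same x and y values as p.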

Reading ASCII Input

Let's assume that we have Student and Course classes identical to the ones used as an example in class. In that example, we read the student data in the Course class from an input file that contained ASCII text:

void Course::read_student_data(const string& file_name)
{
    ifstream in_file;
    char first_name[11], last_name[21], name[31];
    double gpa;

    // Open the input file and test for failure
    in_file.open(file_name);
    if (!in_file)
    {
        cerr << "Error - unable to open input file " << file_name << endl;
        exit(1);
    }

    in_file >> first_name;
    while (in_file)
    {
        in_file >> last_name;
        in_file >> gpa;

        strcpy(name, last_name);
        strcat(name, ", ");
        strcat(name, first_name);

        // Create a Student object and copy it into
        // the array.
        class_list[num_students] = Student(name, gpa);

        num_students++;

        in_file >> first_name;
    }

    in_file.close();

    sort_class_list();
}

That obviously worked, and was a good option under the circumstances. But what if the data in the file was stored in a different format?

Binary Input

To read a file that was created in binary format, we'll need to perform the following steps:

  1. #include <fstream> at the top of the file since that header file contains the classes and member function prototypes that we'll need.
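  2. We'll be using the ifstream class to read the input. That class is part of the std namespace, so we'll need an appropriate using statement or we'll need to fully qualify the class name when we use it.
  3. Declare an ifstream file stream object and open it for input. Opening it in binary mode (by passing ios::binary as the second argument to open()) is recommended, since some platforms, such as Windows, translate line-ending bytes when a file is opened in text mode.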
  4. Call the read() member function of the ifstream object to read the input data.
  5. Close the ifstream object once we're done reading the input.
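The five steps above can be sketched on a very small scale as follows. The helper name read_first_double() is hypothetical, and the sketch assumes the file begins with the raw bytes of a double written by the same program on the same machine. It also passes std::ios::binary when opening the stream so that no line-ending translation takes place.

```cpp
#include <fstream>    // Step 1: std::ifstream is declared here
#include <string>

// Hypothetical helper: reads one double stored as raw bytes at the
// start of a binary file. Returns true if all of the bytes were read.
bool read_first_double(const std::string& file_name, double& value)
{
    // Steps 2 and 3: fully qualified class name, opened for binary input.
    std::ifstream in_file(file_name, std::ios::binary);
    if (!in_file)
        return false;

    // Step 4: read sizeof(double) bytes into the variable.
    in_file.read(reinterpret_cast<char*>(&value), sizeof(double));
    bool ok = static_cast<bool>(in_file);

    // Step 5: close the stream.
    in_file.close();
    return ok;
}
```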

The read() Member Function

Here's the prototype for the read() member function of the ifstream class:

istream& read(char* s, streamsize n);

This member function extracts n characters (bytes) from the input stream and stores them in the array pointed to by s. The data type streamsize is a signed integral type.

This member function returns a reference to the stream, which can be tested to see whether the read succeeded. The test fails if end of file was reached before n bytes could be read. The return type is istream& rather than ifstream& because the ifstream class inherits this member function from the class istream, so an ifstream is also an istream.

We can use this member function to read the student data in one of two ways, depending on how the binary file was created.

Method 1: Read a Series of Student Objects until End of File

If the file was created by writing out a series of Student objects, then we can read one Student object at a time into the elements of our array, starting with element 0.

void Course::read_student_data(const string& file_name)
{
    ifstream in_file;

    // Open the input file and test for failure
    in_file.open(file_name, ios::binary);
    if (!in_file)
    {
        cerr << "Error - unable to open input file " << file_name << endl;
        exit(1);
    }
    
    // Read an entire Student object into the first element of the array.
    in_file.read((char*) &class_list[num_students], sizeof(Student));
    while (in_file)
    {
        num_students++;
        
        // Read the next Student object in the file.
        in_file.read((char*) &class_list[num_students], sizeof(Student));
    }
    
    in_file.close();
    
    sort_class_list();
}

Note that this code assumes that the variable num_students has been initialized to 0 in the constructor for the Course class.

The read() member function expects a pointer to a character as its first argument. Since we are passing the function the address of (i.e., a pointer to) a Student object and not the address of a char, we need to type cast the pointer to data type char*.

For the function's second argument, we use the sizeof() operator to automatically compute the size of a Student object.

The read() member function returns the input stream, so we can test the return value directly in the while statement. That allows us to shorten the loop code like so:

// Read an entire Student object into an array element.
while (in_file.read((char*) &class_list[num_students], sizeof(Student)))
{
    num_students++;
}

Method 2: Read an Entire Course Object

Since we have a Course class that encapsulates a class list of Student objects and the number of students enrolled in the course, there's an easier way to serialize / deserialize our data. Rather than writing out (and then later reading in) a series of Student objects, we could simply write out (and then later read in) an entire Course object.

Assuming that the input file was created in that fashion, the code to read all of the student data (the entire class list and the number of students enrolled in the course) becomes very short indeed:

void Course::read_student_data(const string& file_name)
{
    ifstream in_file;

    // Open the input file and test for failure.
    in_file.open(file_name, ios::binary);
    if (!in_file)
    {
        cerr << "Error - unable to open input file " << file_name << endl;
        exit(1);
    }
    
    // Read an entire Course object worth of bytes into the Course object
    // that was used to call the read_student_data() member function.
    in_file.read((char*) this, sizeof(Course));
    
    in_file.close();
    
    sort_class_list();
}

For the function's first argument, we pass a special pointer named this. Every non-static member function of a class automatically has access to the pointer this, which contains the address of the object on which the function was called. Effectively, the Course object used to call the read_student_data() member function is reading data into itself!

For the function's second argument, we use the sizeof() operator to automatically compute the size of a Course object.

Notes

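As you can see, this technique for reading input can be very efficient, requiring much less code than is typically needed to read a text file. It's also quite fast, since the bytes are copied as-is with no conversion from text. It does come with some restrictions: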

  1. This technique requires the input file to have been created as a file of binary objects. Don't try to use it to read a file of ASCII text.
  2. The class definitions used when you read the objects must match the class definitions that were used when the objects were written. Any deviation in terms of the data types, lengths, or order of the data members will result in garbled input.
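  3. Binary files created this way are platform-specific. The size, byte order, and padding of data members can differ between compilers, operating systems, and processors, so a binary file created by a program running on Windows may not be readable by a program running on Unix, and vice versa.
  4. Not all of the C++ library classes can be serialized this way. An object of any class that uses dynamic storage (like the C++ string class) holds pointers to its data rather than the data itself, so writing the object's bytes saves only the pointer values, which are meaningless when they are read back in.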
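The last note can be checked at compile time. The standard header <type_traits> provides std::is_trivially_copyable, which reports whether objects of a type can safely be copied byte by byte. The sketch below applies it to two hypothetical structs, FixedRecord and StringRecord, that were made up for this illustration.

```cpp
#include <string>
#include <type_traits>

// Hypothetical record with fixed-size members: its bytes are its state.
struct FixedRecord
{
    char name[31];
    double gpa;
};

// Hypothetical record holding a std::string: the characters may live in
// dynamic storage, so the object's bytes can hold a pointer, not the text.
struct StringRecord
{
    std::string name;
    double gpa;
};

static_assert(std::is_trivially_copyable<FixedRecord>::value,
              "FixedRecord can be copied byte by byte");
static_assert(!std::is_trivially_copyable<StringRecord>::value,
              "StringRecord cannot be copied byte by byte");
```

Placing a static_assert like the first one next to a class that is read or written with read() and write() turns an accidental change (such as replacing a char array with a string) into a compile error rather than garbled data.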

Binary Output

To write objects to a file in binary format, we'll need to perform the following steps:

  1. We'll need to #include <fstream> at the top of the file since that header file contains the classes and member function prototypes we'll need.
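  2. We'll be using the ofstream class to write the output. That class is part of the std namespace, so we'll need an appropriate using statement or we'll need to fully qualify the class name when we use it.
  3. Declare an ofstream file stream object and open it for output. Opening it in binary mode (by passing ios::binary as the second argument to open()) is recommended, since some platforms, such as Windows, translate line-ending bytes when a file is opened in text mode.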
  4. Call the write() member function of the ofstream object to write the objects to the file.
  5. Close the ofstream object when we're done writing the output.

When you're writing output, you typically know exactly how many objects you want to write, so you can write them all with a single call to write().

The write() Member Function

Here's the prototype for the write() member function of the ofstream class:

ostream& write(const char* s, streamsize n);

We can use this member function to write student data in one of two ways.

Method 1: Write a Series of Student Objects

We can write only the Student objects that are filled with valid data:

void Course::write_student_data(const string& file_name)
{
    ofstream out_file;
        
    // Open the output file and test for failure.
    out_file.open(file_name, ios::binary);
    if (!out_file)
    {
        cerr << "Error - unable to open output file " << file_name << endl;
        exit(1);
    }
        
    // Write out just the Student objects in the array that contain valid data.
    out_file.write((const char*) class_list, sizeof(Student) * num_students);
        
    out_file.close();
}

When we pass the array name class_list to a function, it is automatically converted to a pointer to the first element of the array. We do have to type cast that pointer to the data type expected by the write() function (const char*).

The output file created via this member function would need to be read using Method 1 for reading input as previously described.

Method 2: Write an Entire Course Object

We could also simply write an entire Course object:

void Course::write_student_data(const string& file_name)
{
    ofstream out_file;

    // Open the output file and test for failure.
    out_file.open(file_name, ios::binary);
    if (!out_file)
    {
        cerr << "Error - unable to open output file " << file_name << endl;
        exit(1);
    }

    // Write an entire Course object worth of bytes to the output file.
    out_file.write((const char*) this, sizeof(Course));
    
    out_file.close();
}

The output file created via this member function would need to be read using Method 2 for reading input as previously described.