Passing Variables by Reference


In C, passing a non-array variable by address is the only way to allow a function to change it. C++ provides an easier alternative: passing the variable by reference.

The general syntax to declare a reference variable is

    data-type-to-refer-to& variable-name

For example:

int& x     // x is a reference variable that can refer to an int variable.

Here is an example program illustrating passing a variable by reference:

#include <iostream>

using std::cout;
using std::endl;

void add_to_int(int&);

int main()
{
   int num = 5;

   cout << "In main(), num is " << num << endl;
   cout << "In main(), address of num is " << (long int) &num << endl << endl;

   add_to_int(num);

   cout << "In main(), value of num is now " << num << endl;

   return 0;
}

void add_to_int(int& num_ref)
{
   cout << "In add_to_int(), value of num_ref is " << num_ref << endl;
   cout << "In add_to_int(), address of num_ref is " << (long int) &num_ref << endl;

   num_ref += 10;

   cout << "In add_to_int(), value of num_ref is now " << num_ref << endl << endl;
}

Once again, you can see from the following output that num was changed by calling the add_to_int() function.

In main(), num is 5
In main(), address of num is 140725741294572

In add_to_int(), value of num_ref is 5
In add_to_int(), address of num_ref is 140725741294572
In add_to_int(), value of num_ref is now 15

In main(), value of num is now 15
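
For comparison, the three ways of passing an int can be put side by side. This is only a minimal sketch; the function names and the helper compare_passing_styles() are made up for illustration and are not part of the program above.

```cpp
#include <cassert>

// Pass by value: the function receives a copy, so the caller's variable is untouched.
void add_by_value(int n)
{
    n += 10;
}

// Pass by address: the caller passes &variable and the function dereferences the pointer.
void add_by_address(int* n_ptr)
{
    *n_ptr += 10;
}

// Pass by reference: n_ref is another name for the caller's variable.
void add_by_reference(int& n_ref)
{
    n_ref += 10;
}

// Hypothetical helper that calls all three and returns the final value of num.
int compare_passing_styles()
{
    int num = 5;

    add_by_value(num);       // only the copy changed
    assert(num == 5);

    add_by_address(&num);    // the caller must write &num
    assert(num == 15);

    add_by_reference(num);   // the call looks like pass by value, but num changes
    assert(num == 25);

    return num;
}
```

Note that the call to add_by_reference() looks exactly like the call to add_by_value(); only the function's parameter list tells you which one can change num.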

Passing Objects by Reference

Like other non-array variables, objects can be passed by reference. For example, the following program code:

#include <iostream>
#include <string>

using std::cout;
using std::endl;
using std::string;

void add_to_string(string&);

int main()
{
    string s = "dog";

    cout << "In main(), value of s is " << s << endl << endl;

    add_to_string(s);

    cout << "In main(), value of s is now " << s << endl;

    return 0;
}

void add_to_string(string& str)
{
    cout << "In add_to_string(), value of str is " << str << endl << endl;

    str = str + "fight";

    cout << "In add_to_string(), value of str is now " << str << endl << endl;
}

will produce this output:

In main(), value of s is dog
        
In add_to_string(), value of str is dog
        
In add_to_string(), value of str is now dogfight
        
In main(), value of s is now dogfight

It is extremely common to pass C++ objects by reference even when a function does not need to change the original object in the calling routine. Objects can often be quite large, so passing them by value can be expensive in terms of the amount of memory they occupy and the time required to make a copy of them. By contrast, a reference is typically implemented as a hidden pointer, which is small and fixed in size (usually 4 or 8 bytes). Passing objects by reference will usually save us both memory and time.
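
One way to see the copy being made is to count calls to a copy constructor. The sketch below assumes a made-up class named Tracked and two made-up functions; it is an illustration of the idea, not a measurement of real cost.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical class that counts how many times it has been copied.
class Tracked
{
public:
    static int copies;

    Tracked() = default;
    Tracked(const Tracked& other) : payload(other.payload) { ++copies; }

    std::string payload = "imagine a large amount of data here";
};

int Tracked::copies = 0;

// Receives its own copy of the argument.
std::size_t size_by_value(Tracked t)
{
    return t.payload.size();
}

// Receives a reference to the caller's object.
std::size_t size_by_reference(Tracked& t)
{
    return t.payload.size();
}

// Hypothetical helper that returns the number of copies made by the two calls.
int count_copies()
{
    Tracked original;
    Tracked::copies = 0;

    size_by_value(original);        // the copy constructor runs once
    assert(Tracked::copies == 1);

    size_by_reference(original);    // no copy is made
    assert(Tracked::copies == 1);

    return Tracked::copies;
}
```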

The main downside to passing an object by reference is that the function can change the original object that was passed to it, which is undesirable when the function has no reason to do so. If a function does not need to change an object, ideally we would like to make it impossible to do so.

Furthermore, if we try to pass a constant object to a function that takes an ordinary (non-constant) reference, the compiler will produce an error. For example, if you attempt to compile the following program code on turing as test.cpp

#include <iostream>
#include <string>
        
using std::cout;
using std::endl;
using std::string;

void print_string(string&);

int main()
{
    const string s = "dog";

    cout << "In main(), value of s is " << s << endl << endl;

    print_string(s);

    return 0;
}

void print_string(string& str)
{
    cout << "In print_string(), value of str is " << str << endl << endl;
}

the compiler will produce the following error:

test.cpp: In function 'int main()':
    test.cpp:16:18: error: binding 'const string {aka const std::__cxx11::basic_string<char>}' to reference of type
    'std::__cxx11::string& {aka std::__cxx11::basic_string<char>&}' discards qualifiers
    print_string(s);
    ^
test.cpp:8:6: note:   initializing argument 1 of 'void print_string(std::__cxx11::string&)'
void print_string(string&);
^~~~~~~~~~~

The "qualifiers" this error message refers to are the qualifiers on the declaration of the string variable s; in this case, the const qualifier. Essentially, the compiler is telling us that "this variable you are passing to the function print_string() is constant but you are passing it in a fashion that will allow the function to change it."

References to Constant Variables

To fix that problem, we can use a reference to a constant variable. This creates a reference that cannot be used to modify the variable to which it refers. For example, the following code will compile without any errors:

#include <iostream>
#include <string>

using std::cout;
using std::endl;
using std::string;

void print_string(const string&);

int main()
{
    const string s1 = "dog";
    string s2 = "cat";

    cout << "In main(), value of s1 is " << s1 << endl << endl;

    print_string(s1);

    cout << "In main(), value of s2 is " << s2 << endl << endl;
    
    print_string(s2);

    return 0;
}

void print_string(const string& str)
{
    cout << "In print_string(), value of str is " << str << endl << endl;
}

Note that the C++ compiler does not complain if you pass a non-constant object to a function using a reference to a constant object. That simply means the function will not be able to change the original object in the calling routine by using the reference, even though the original object isn't constant.
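
The point that only the reference is read-only, not the object itself, can be sketched as follows. The names length_of(), view, and const_reference_demo() are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The parameter is read-only inside this function.
std::size_t length_of(const std::string& str)
{
    // str = "changed";   // would not compile: str refers to a const string
    return str.size();
}

// Hypothetical helper that returns the length seen through the reference.
std::size_t const_reference_demo()
{
    std::string s2 = "cat";
    const std::string& view = s2;   // read-only reference to a non-constant object

    s2 += "fish";                   // allowed: s2 itself is not constant

    assert(&view == &s2);           // view and s2 are the same object
    assert(view == "catfish");      // so the change is visible through view

    return length_of(view);
}
```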

As a general rule:

If you want a function or member function to be able to change a C++ object, pass it by reference.

If you do not want a function or member function to be able to change a C++ object, pass it as a reference to a constant variable.

Most of the time, there's no real benefit to passing a variable of a built-in type like int or double using a reference to a constant variable. If you don't want a function to modify the original variable in the calling routine, just pass the variable by value.
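
The three guidelines above might look like this in practice. This is a sketch only; the function names are invented for the example.

```cpp
#include <cassert>
#include <string>

// Needs to change the caller's object: pass by reference.
void make_plural(std::string& word)
{
    word += "s";
}

// Only reads the object: pass as a reference to a constant.
bool starts_with_vowel(const std::string& word)
{
    return !word.empty()
        && std::string("aeiouAEIOU").find(word[0]) != std::string::npos;
}

// Built-in type that the caller does not want changed: pass by value.
int doubled(int n)
{
    n *= 2;      // changes only the local copy
    return n;
}

// Hypothetical helper that exercises all three functions.
int passing_rules_demo()
{
    std::string word = "owl";
    int count = 4;

    assert(starts_with_vowel(word));   // word is only read
    make_plural(word);                 // word is changed in place
    assert(word == "owls");

    assert(doubled(count) == 8);
    assert(count == 4);                // the caller's int is untouched

    return count;
}
```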

Returning References

Member functions may also sometimes return a variable or object by reference or return a reference to a constant variable. This is most commonly done if the member function either returns the object that called it, returns a function parameter that is itself a reference, or returns a data member that is an object.
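
As a rough illustration of those three situations, consider the sketch below. The class Label and the function longer_of() are hypothetical, and longer_of() is written as a free function rather than a member function to keep the example short.

```cpp
#include <cassert>
#include <string>

// Hypothetical class used only for illustration.
class Label
{
public:
    explicit Label(const std::string& text) : text_(text) {}

    // Returns the object that called it, so calls can be chained.
    Label& append(const std::string& more)
    {
        text_ += more;
        return *this;
    }

    // Returns a data member that is an object, read-only and without copying it.
    const std::string& text() const
    {
        return text_;
    }

private:
    std::string text_;
};

// Returns one of its parameters, each of which is itself a reference.
const std::string& longer_of(const std::string& a, const std::string& b)
{
    return (a.size() >= b.size()) ? a : b;
}

// Hypothetical helper that exercises the functions above.
bool returning_references_demo()
{
    Label label("dog");
    label.append("fight").append("!");   // each call returns label itself
    assert(label.text() == "dogfight!");

    std::string s1 = "cat";
    std::string s2 = "horse";
    const std::string& winner = longer_of(s1, s2);
    assert(&winner == &s2);              // winner is s2, not a copy of it

    return true;
}
```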

Returning a reference to a local automatic variable declared inside a function is normally a mistake. The local variable will be deallocated automatically when the function ends, and the calling code will be left with a reference to a variable that no longer exists.
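
The mistake and one possible fix are sketched below, using made-up function names. The incorrect version is left commented out because using its result would be undefined behavior.

```cpp
#include <cassert>
#include <string>

// WRONG (left commented out): greeting is destroyed when the function ends,
// so the caller would be left with a reference to an object that no longer exists.
//
// const std::string& make_greeting_bad(const std::string& name)
// {
//     std::string greeting = "Hello, " + name;
//     return greeting;
// }

// One fix: return the local object by value, so the caller gets its own object.
std::string make_greeting(const std::string& name)
{
    std::string greeting = "Hello, " + name;
    return greeting;
}

// Hypothetical helper that checks the corrected version.
bool greeting_demo()
{
    std::string result = make_greeting("Ann");
    assert(result == "Hello, Ann");
    return true;
}
```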

References versus Pointers

C++ references differ from pointers in several essential ways:

  • It is not possible to refer directly to a reference variable after it is defined; any occurrence of its name refers directly to the variable it references. This can be seen in the output of the add_to_int() program above. The variable num_ref is clearly a different variable than num, but based on the output they appear to both have the same address, 140725741294572. That should be impossible, since two different variables can't occupy the same address.

    In fact they do not. Once the reference variable num_ref is defined and associated with num, anything we do with it (such as trying to print its address) is actually done to the variable num that it refers to instead.

  • Once a reference is created, it cannot be later made to reference another variable. This is something that is often done with pointers.

  • References cannot be null, whereas pointers can; every reference refers to some variable, although it may or may not be valid.

  • References are not allowed to be uninitialized. Because it is impossible to reinitialize a reference, they must be initialized as soon as they are created. In particular, local and global reference variables must be initialized where they are defined, and references which are data members of a class must be initialized in the initializer list of the class's constructor. For example:

    int& k; // Compiler will complain: 'k' declared as reference but not initialized
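
The differences listed above can be checked with a short sketch. The class Watcher and the helper reference_rules_demo() are hypothetical names chosen for this example.

```cpp
#include <cassert>

// Hypothetical class with a reference data member.
class Watcher
{
public:
    // A reference member must be bound in the constructor's initializer list.
    explicit Watcher(int& target) : target_(target) {}

    int current() const { return target_; }

private:
    int& target_;
};

// Hypothetical helper that walks through the rules for references.
int reference_rules_demo()
{
    int a = 1;
    int b = 2;

    int& ref = a;          // must be initialized where it is defined
    assert(&ref == &a);    // taking the reference's address yields a's address

    ref = b;               // does NOT make ref refer to b; it copies b's value into a
    assert(a == 2);
    assert(&ref == &a);    // ref still refers to a

    int* ptr = &a;         // a pointer, by contrast, can be pointed elsewhere...
    ptr = &b;
    ptr = nullptr;         // ...and can be null
    assert(ptr == nullptr);

    Watcher watcher(a);    // watcher's reference member is bound to a
    a = 42;
    return watcher.current();
}
```

The function returns 42 because the Watcher object's reference member refers to a itself rather than to a copy made when the object was constructed.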
    

Typically, a C++ compiler will either treat a reference as a simple alias for the variable that it refers to (assuming that variable is declared in the same scope) or compile the reference into a pointer that is implicitly dereferenced every time the reference is used.

The syntax required to use pointers tends to make them stand out; this is often not the case with references. In a large block of C++ code, it may not always be obvious if a variable being accessed is defined as a local or global variable or whether it is a reference to a variable in some other location, especially if the code mixes references and pointers. That can make poorly written C++ code harder to read and debug.

However, because the operations allowed on references are so limited, they are much easier to understand than pointers and are more resistant to errors. While pointers can be made invalid through a variety of mechanisms, ranging from carrying a null value to out-of-bounds arithmetic to illegal casts to producing them from random integers, a previously-valid reference only becomes invalid in two cases:

  • If it refers to a variable with automatic allocation which goes out of scope,
  • If it refers to an object inside a block of dynamic memory which has been freed.

In general, if we need to change the value of a non-array argument that is passed to a function, we will pass it by reference. However, there are a few cases where you may have no choice but to pass an argument by address or to use pointers:

  1. Built-in arrays are passed by address by default: the array name is converted to a pointer to its first element. That includes C strings.
  2. Dynamic storage is allocated using pointers.
  3. Occasionally, you may want to use a library function from the old C standard library that requires an address argument.
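
Each of the three cases is sketched below. The function names are made up, and std::frexp() was chosen only as one familiar example of a C library function that takes an address argument.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Case 1: an array argument arrives as a pointer to its first element,
// so the number of elements has to be passed separately.
int sum_array(const int* values, std::size_t count)
{
    int total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}

// Hypothetical helper that exercises all three cases.
int pointer_cases_demo()
{
    int scores[] = {10, 20, 30};
    int total = sum_array(scores, 3);    // scores is passed by address
    assert(total == 60);

    // Case 2: dynamic storage is reached through a pointer.
    int* dynamic_value = new int(5);
    *dynamic_value += total;
    int result = *dynamic_value;
    delete dynamic_value;

    // Case 3: std::frexp() comes from the C library and reports the
    // exponent through an address argument.
    int exponent = 0;
    double fraction = std::frexp(8.0, &exponent);   // 8.0 == 0.5 * 2^4
    assert(fraction == 0.5 && exponent == 4);

    return result;
}
```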

To be a competent C++ programmer, you need to understand both the pass-by-address and pass-by-reference mechanisms.