Lambda Expressions in C++11 and C++14

Discussion in 'C++' started by BiplabKamal, Mar 16, 2016.

  1. In C++ we have two kinds of functions: standalone functions and (non-static) member functions. After compilation both are just functions. The difference is that a member function has the object's data bound to it, while no external data can be bound to a standalone function. The compiler does this binding for member functions by passing the this pointer. You can also take a function pointer to a normal function or a member function. What you cannot do before C++11 is create a small function on the fly that uses variables from the enclosing scope and pass it as an argument to another function. A C++11 lambda expression lets you create a function without a name and with data bound to it, store it, copy it, pass it as an argument, and return it from a function. All these properties make a lambda a first-class function, or closure (we will discuss closures later).

    When we declare a function we give it a name. But why do we need a name at all? We also give names to variables, classes and namespaces. Compilers use names to resolve addresses at compile time; at run time no names exist. So why must programmers provide names, when compilers could use internal names that the programmer never sees? The reason is that programmers also need to refer to the types, data and code they define. A compiler can work with memory addresses, but human beings are not comfortable with numeric addresses and need names they can use. These names are all about compile-time usage, but what about addresses that are generated dynamically? C++ uses pointers for dynamic addresses. In this article, assigning an address at run time is called dynamic binding. Programmers again use names for pointers, which are place holders for dynamic addresses. You can give names to place holders (pointers) for data (variables) and for code (functions), and reuse the same data and code in many places by referring to these pointer names.

    Data can be reused either by value or by address. By value, the data is copied to a new address; by address, the same address is used. Code can also be reused, but always by address, because multiple copies of the same code are unnecessary, which is not true for data. Also, the address of a piece of data or code means the starting address of the contiguous memory where it is stored. Run-time binding of a code block (a function) is therefore done with a function pointer. Look at the following code:

    Code:
    #include <iostream>
    void myfn(int* pInt)
    {
        std::cout << "the content of the address: " << *pInt << std::endl;
    }
    int main() 
    {
        // Static binding: both the function and the data are referred to by name
        int i = 10; // named data
        myfn(&i);   // function called by name, argument is the address of named data
    
        // Dynamic binding: the address of the data is only known at run time
        int* p = new int(); // unnamed data, reached only through the pointer p
        myfn(p);            // p is a named place holder, but the int it points to has no name
    
        int* intptr = &i; // a pointer can also hold the address of named data
        myfn(intptr);
    
        delete p; // release the dynamically allocated int
    }
    
    In the above code, observe that compile-time (static) binding requires a name for the data, but run-time binding does not. Also note that calling the function by its name (myfn) is static binding. Can we dynamically allocate memory for function code and use dynamic binding for a function? We cannot, because the address of a function is fixed when the program is built. With a function pointer, however, we can capture a function's address and invoke it later. The name of a function converts implicitly to a pointer to that function. The following code shows the use of a function pointer:
    Code:
    #include <algorithm>
    #include <vector>
    
    // Sort the numbers according to the comparison function passed in
    void MySort(std::vector<int>& ilist, bool(*pCompare)(int, int))
    {
        // std::sort stands in for the sorting code here
        std::sort(ilist.begin(), ilist.end(), pCompare);
    }
    // returns true if the 2nd value is larger than the 1st value
    bool Ascending(int val1, int val2)
    {
        return val1 < val2;
    }
    // returns true if the 2nd value is smaller than the 1st value
    bool Descending(int val1, int val2)
    {
        return val1 > val2;
    }
    int main() 
    {
        std::vector<int> v = {8,5, 9,4, 5,2,0,3};
        // Sort ascending
        bool(*CompareFn)(int, int) = Ascending;
        MySort(v, CompareFn);
    
        // Sort descending
        MySort(v, Descending);
    }
    
    In the earlier code we saw that data can be created on the fly, only when we want to use it. What if a programmer could also create a function on the fly, only when it is needed? This requirement does exist. In the above example, Ascending() and Descending() are simple callbacks, so there should be no need to define each function body first and then pass a function pointer. Writing the function body without a name, directly inside the call, is simpler. It can also improve readability, because a function name does not always make clear what the function does. C++11 added anonymous function definitions, which look like creating a function object. They are called lambda expressions. We will see the syntax in detail later, but first let us see how the above code looks with lambda expressions:
    Code:
    #include<vector>
    #include<iostream>
    // Sort the numbers according to the comparison function passed in
    void MySort(std::vector<int>& ilist, bool(*pCompare)(int, int))
    {
        // Sorting code here
        // The cast is needed because a function pointer streamed directly is converted to bool
        std::cout << "Address of the comparing function :" << reinterpret_cast<void*>(pCompare) << std::endl;
    
    }
    
    int main() 
    {
        std::vector<int> v = {8,5, 9,4, 5,2,0,3};
        // Sort ascending
        MySort(v, [](int val1, int val2) {return val1 < val2;});
        
        // Sort descending
        MySort(v, [](int val1, int val2) {return val1 > val2;});
    }
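
    A lambda with an empty capture clause converts implicitly to a plain function pointer, which is why the calls above compile. A lambda that captures anything has no such conversion. The sketch below assumes a hypothetical template variant, MySortWith (not part of the original code), that accepts any callable and forwards it to std::sort; SortByDistance is likewise a made-up helper:

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Hypothetical variant of MySort: the comparator type is a template
// parameter, so capturing lambdas are accepted as well as function pointers.
template <typename Compare>
void MySortWith(std::vector<int>& ilist, Compare comp)
{
    std::sort(ilist.begin(), ilist.end(), comp);
}

// Sorts by distance from 'pivot'. The lambda captures 'pivot' by value,
// so it could not be passed where bool(*)(int, int) is expected.
std::vector<int> SortByDistance(std::vector<int> v, int pivot)
{
    MySortWith(v, [pivot](int a, int b) {
        return std::abs(a - pivot) < std::abs(b - pivot);
    });
    return v;
}
```

    Calling SortByDistance with a vector and a pivot returns the elements ordered from nearest to farthest from the pivot.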
    
    A lambda expression is a function with no name that can be defined and assigned to a variable, passed as a function argument, copied, or returned from a function. These are all characteristics of an object, but the focus is on the function, much as with std::function. Such an object is called a ‘closure’. Understanding closures and closure classes helps put lambda expressions in context.
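
    As a rough illustration of those properties, the sketch below stores a lambda in a variable, copies it, passes it to another function, and returns one from a function. The names MakeAdder, ApplyTwice and FirstClassDemo are hypothetical, and returning a lambda through a deduced auto return type assumes a C++14 compiler:

```cpp
#include <cassert>

// Returns a lambda that adds 'n' to its argument (n is captured by value).
auto MakeAdder(int n)
{
    return [n](int x) { return x + n; };
}

// Accepts any callable that takes an int and applies it twice.
template <typename Fn>
int ApplyTwice(Fn fn, int x)
{
    return fn(fn(x));
}

void FirstClassDemo()
{
    auto addFive = MakeAdder(5);           // stored in a variable
    auto copy = addFive;                   // copied like any other object
    assert(addFive(1) == 6);
    assert(copy(1) == 6);
    assert(ApplyTwice(addFive, 0) == 10);  // passed as an argument
}
```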

    In object-oriented programming languages, ‘object’ and ‘closure’ are two sides of the same coin. The type of an object is a class and the type of a closure is a closure class. Classes define how functions are bound to data, and closure classes define how data is bound to a function. In a class object, member functions are bound to member variables, whereas in a closure, variables in the scope of the function are bound to the function. In other words, closures are functions that refer to independent (free) variables defined in their environment. Internally, a closure class is implemented like a normal class, but with a single member function (the function call operator) and data members for the variables it refers to in the environment.

    Lambdas also give C++ functions that are first-class objects. Normal functions are not first-class objects, but lambdas are: we can copy lambdas, pass them as function arguments and return them from functions. For every lambda expression the compiler creates a closure class that represents the type of the lambda, and creates an instance of that class. So internally lambdas are instances of C++ classes. The statements inside the lambda become the body of the class's member function (the function call operator) and the captured values become its member variables. Let us look at the lambda syntax:

    A lambda expression is a function definition, so it has a return type, a parameter list and a body. It can also have an exception specification and a mutable specifier. In addition it has a capture clause, which captures variables from the scope enclosing the lambda. The capture clause is the main thing that differentiates a lambda from a normal function: a normal function cannot access the local variables of the scope in which it is used, but a lambda can. The syntax looks like:

    [<capture clause>](<parameter list>) mutable throw(<exception list>) -> <return type> {<body>};

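    All parts of the syntax are optional except the capture clause and the body. (In C++11 and C++14 the parentheses of the parameter list may be dropped only when mutable, the exception specification and the return type are omitted as well.) So the minimal syntax for defining a lambda is: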

    []{}; // A lambda which does nothing
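
    To see which parts can be left out, here are a few small lambdas, each dropping a different optional piece. The variable names and the SyntaxDemo wrapper are made up for this illustration, and the asserts simply state the values one would expect:

```cpp
#include <cassert>

void SyntaxDemo()
{
    auto nothing = []{};                       // capture clause and body only
    auto answer = []{ return 42; };            // no parameter list, return type deduced
    auto twice = [](int x) { return 2 * x; };  // parameter list, return type deduced
    auto half = [](int x) -> double { return x / 2.0; }; // explicit return type

    nothing();
    assert(answer() == 42);
    assert(twice(4) == 8);
    assert(half(3) == 1.5);
}
```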

    In the following code the lambda captures a variable defined in the outer function:
    Code:
    #include <iostream>
    void Outerfunction()
    {
        int var1 =10;
        int var2 = 100; // not captured, so not usable inside the lambda
        // Creating a lambda that captures var1 by value
        auto mylamda= [var1]() 
        {
            //var1 = 20;// Error: a variable captured by value is read-only here
            std::cout << "Value of captured variable is :" << var1 << std::endl;// Will print 10
        };
        mylamda(); //Calling the lambda function
    }
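
    A by-value capture is copied when the lambda is created, not when it is called. A small sketch of that behaviour (CaptureSnapshot is a hypothetical name):

```cpp
// Returns what the lambda sees after the original variable has changed.
int CaptureSnapshot()
{
    int var1 = 10;
    auto mylamda = [var1]() { return var1; }; // var1 (10) is copied right here
    var1 = 99;                                // does not affect the copy
    return mylamda();                         // the lambda still sees 10
}
```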
    
    Notice that the captured variable ‘var1’ cannot be modified inside the lambda. This is because the function call operator of a lambda is const by default, so it cannot modify variables captured by value. To let the body of a lambda expression modify variables captured by value, use the mutable specifier. The following code shows how to modify a variable captured by value:
    Code:
    #include <iostream>
    void Outerfunction()
    {
        int var1 =10;
        int var2 = 100; // not captured
        // The lambda captures var1 by value; mutable allows the copy to be changed
        auto mylamda= [var1]() mutable 
        {
            var1 = 20;// Modifying the lambda's own copy of the captured variable
            std::cout << "Value of captured variable is :" << var1 << std::endl; // Will print 20
        };
    
        mylamda(); //Calling the lambda function
        std::cout << "Value of var1 after lamda is called :" << var1 << std::endl;// Will print 10
    }
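
    Because the modified copy lives inside the closure object, changes made by a mutable lambda persist from one call to the next. A minimal sketch, with hypothetical names:

```cpp
#include <cassert>

void MutableCounterDemo()
{
    int start = 0;
    // 'start' is copied into the closure; mutable lets the body change that copy.
    auto next = [start]() mutable { return ++start; };

    assert(next() == 1);
    assert(next() == 2);  // the closure remembers the previous increment
    assert(start == 0);   // the original variable is untouched
}
```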
    
    Note that var1 is captured by value, so the modification inside the lambda body does not affect the original variable. Variables can be captured by reference using &, as in [&var1]. Modifying a variable captured by reference modifies the original variable. The code below shows how a variable is captured by reference:
    Code:
    #include <iostream>
    void Outerfunction()
    {
        int var1 =10;
        int var2 = 100; // not captured
        // The lambda captures var1 by reference
        auto mylamda= [&var1]() 
        {
            var1 = 20;// Modifying the original variable through the reference
            std::cout << "Value of captured variable is :" << var1 << std::endl;// Will print 20
        };
    
        mylamda(); //Calling the lambda function
        std::cout << "Value of var1 after lamda is called :" << var1 << std::endl;// Will print 20
    }
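
    Capture by reference is commonly used to collect a result in a local variable while an algorithm walks over a container. A short sketch using std::for_each; SumAll is a hypothetical helper:

```cpp
#include <algorithm>
#include <vector>

// Sums the elements by letting the lambda update 'total' through a reference.
int SumAll(const std::vector<int>& values)
{
    int total = 0;
    std::for_each(values.begin(), values.end(),
                  [&total](int v) { total += v; });
    return total;
}
```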
    
    So variables from the environment can be captured by value or by reference. There is also a default capture mode, which says how variables from the environment that are used in the lambda body are captured. [&] means all variables used in the body are captured by reference, and [=] means they are all captured by value. Along with a default capture you can specify the opposite mode for individual variables, as in [&,var1] and [=,&var2]. You cannot write [&,&var1] or [=,var2], because they repeat the default. You can also use a parameter pack expansion in the capture clause, like [args...]. If you use a lambda expression in a class method, you can put the this pointer in the capture clause to access the data members and methods of the enclosing class. C++14 also allows you to introduce new variables with an initializer in the capture clause; the type of the new variable is deduced from the initializing expression. The following code block shows different ways of capturing variables in a lambda inside a class method:

    Code:
    #include<iostream>
    class MyClass
    {
    public:
        int member1;
        int member2;
    public:
        MyClass()
        {
            member1 = 100;
            member2 = 200;
        }
        void Memberfunction()
        {
            int var1 = 10;
            int var2 = 100;
            // The lambda captures local variables by reference and the this pointer explicitly;
            // p is a new variable, local to the lambda, whose type (int) is deduced (C++14)
            auto mylamda = [&, this,p = 10]()
            {
                this->member1 = 20;
                this->member2 = 200;
                var1 = 20;// Modifying the variable captured by reference
                //p = 30; // error: not modifiable; use the mutable specifier to make it modifiable
                std::cout << "Value of captured variable is :" << var1 << std::endl;
            };
    
            mylamda(); //Calling the lamda function
            std::cout << "Value of var1 after lamda is called :" << var1 << std::endl;
            std::cout << "Value of member variable after lamda is called : member1 =" << member1 << " member2 = " << member2 << std::endl;
        }
    };
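
    The C++14 initializer in the capture clause is also a way to move an object into a lambda, which a plain by-value capture cannot do for move-only types. A sketch assuming a C++14 compiler; MakeReader is a hypothetical name:

```cpp
#include <memory>
#include <utility>

// Moves a unique_ptr into the closure; the returned lambda then owns the int.
auto MakeReader(std::unique_ptr<int> value)
{
    return [owned = std::move(value)]() { return *owned; };
}
```

    The returned closure is itself move-only, because it holds a std::unique_ptr member.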
    
    Captured variables can cause serious problems if you are not aware of how their lifetime depends on the original variables. Variables captured by value have no lifetime dependency, but variables captured by reference do. The problem typically arises when the lambda is stored or called asynchronously: the lifetime of the lambda object may exceed the lifetime of the variables it captured by reference, so those references will be dangling when the lambda is invoked.
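
    The problem is easiest to see with a lambda that is returned from a function. In the sketch below (hypothetical names, C++14 for the deduced return type), the by-value version is safe; the by-reference version is shown only as a comment because calling its result would be undefined behaviour:

```cpp
// Safe: 'base' is copied into the closure, so the lambda does not depend
// on the parameter after MakeOffsetByValue returns.
auto MakeOffsetByValue(int base)
{
    return [base](int x) { return base + x; };
}

// Unsafe (do not use): 'base' is destroyed when the function returns,
// leaving the lambda with a dangling reference.
//
// auto MakeOffsetByReference(int base)
// {
//     return [&base](int x) { return base + x; };
// }
```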

    The parameter list, exception specification and return type are similar to those of normal functions. If no return type is specified, it is deduced automatically. In C++11 this deduction is guaranteed only when the body is a single return statement; C++14 allows several return statements as long as they all deduce the same type. If there is no return statement, the return type is deduced to be void. The following code shows the use of the different parts of a lambda:

    Code:
    #include<iostream>
    #include<vector>
    // The following function returns a lambda which sorts a vector
    // of int in either ascending or descending order
    // (a deduced auto return type on a function requires C++14)
    auto GetSortFunction(bool bAscending)
    {
        // flag used internally to choose the sorting algorithm
        bool bSimpleSorting = false;
        // The dynamic exception specification throw(int) is left out here:
        // it is deprecated in C++11 and rejected by C++17 compilers
        return [=](std::vector<int> v) mutable -> std::vector<int> {
            if (bAscending)
                bSimpleSorting = true;// Modifying the variable captured by value
            if (v.empty())
                throw -1; // throw for an empty vector
            if (bAscending)
            {
                if (bSimpleSorting)
                {
                    std::cout << "Doing Bubble Sort Ascending order" << std::endl;
                    //Do the bubble sort in ascending order
                }
                else
                {
                    std::cout << "Doing Quick Sort Ascending order" << std::endl;
                    //Do the quick sort in ascending order
                }
                v = std::vector<int>{ 1,2,3,4,5,6 }; // arbitrary numbers
            }
            else
            {
                if (bSimpleSorting)
                {
                    std::cout << "Doing Bubble Sort Descending order" << std::endl;
                    //Do the bubble sort in descending order
                }
                else
                {
                    std::cout << "Doing Quick Sort Descending order" << std::endl;
                    //Do the quick sort in descending order
                }
                v = std::vector<int>{ 6,5,4,3,2,1 };
            }
            return v;
        };
        
    }
    int main() 
    {
        auto SortFnA = GetSortFunction(true);
        std::vector<int> v = SortFnA(std::vector<int>{ 2,4,3,6,5,1 });
        
        auto SortFnD = GetSortFunction(false);
        SortFnD(std::vector<int>{ 2, 4, 3, 6, 5, 1 });
        
    }
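
    When the return statements of a lambda produce different types, deduction fails and the return type has to be written out. A small sketch; Ratio and safeRatio are hypothetical names:

```cpp
// Returns a / b as a double, or 0 when b is zero.
double Ratio(int a, int b)
{
    // Without "-> double" the two return statements would deduce int and
    // double respectively, which is an error.
    auto safeRatio = [](int x, int y) -> double {
        if (y == 0)
            return 0;                       // int literal, converted to double
        return static_cast<double>(x) / y;
    };
    return safeRatio(a, b);
}
```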
    
    If you want to see how a compiler implements lambdas, consider the following code, compiled with Visual Studio 2015:
    Code:
    #include<iostream>
    int main() 
    {
        int var1 = 100;
        int var2 = 200;
        auto lamda = [=](int factor) { return (var1 + var2)*factor;};
        std::cout << "The size of lambda = " << sizeof(lamda) << std::endl;
        std::cout<<"Output of the lambda call: "<<lamda(3)<<std::endl;
    }
    
    What will be the size of the lambda object? Typically it is the total size of the captured variables, plus any padding. The compiler generates a class with a name of its own choosing. The class has member variables for the captured variables and a function call operator as a member. The lambda's parameter list becomes the parameter list of the function call operator, and the lambda's return type becomes its return type. The compiler-generated code is roughly equivalent to the following:

    Code:
    #include<iostream>
    class lamda_
    {
    private:
        int var1_;
        int var2_;
    public:
        lamda_(int t1_, int t2_) :var1_(t1_), var2_(t2_) {}
        int operator() (int factor_) const
        {
            return (var1_ + var2_)*factor_;
        }
    };
    int main() 
    {
        int var1 = 100;
        int var2 = 200;
        lamda_ lamda(var1, var2);
        std::cout << "The size of lambda = " << sizeof(lamda) << std::endl;
        std::cout<<"Output of the lambda call: "<<lamda(3)<<std::endl;
        
    }
    
    Lambdas are handier to use than function pointers or std::function, especially when you need to pass callbacks to standard library algorithms. When a callback function's body is small, it is better to use a lambda expression. That way you avoid defining a separate function and then a function pointer to it. With a separate function you would also need to add extra parameters to pass in the variables it needs from the environment.
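
    For instance, a small predicate for a standard algorithm can be written right at the call site and can use a local variable without any extra parameter. A sketch using std::count_if; CountAbove is a hypothetical helper:

```cpp
#include <algorithm>
#include <vector>

// Counts the elements greater than 'threshold'. The lambda captures the
// threshold, so no extra callback parameter or global variable is needed.
int CountAbove(const std::vector<int>& values, int threshold)
{
    return static_cast<int>(std::count_if(values.begin(), values.end(),
        [threshold](int v) { return v > threshold; }));
}
```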
     
