Getting Started With LINQ in C# With Examples

Author: shabbir
LINQ (Language Integrated Query) is arguably one of the most astonishing features of the .NET Framework. If you have the slightest idea of database programming, you must have written SQL queries, something like “SELECT Name FROM Customers”. This line gets the names of all the customers from the Customers table. This method of querying a database is uniform and standard. However, such SQL queries are plain strings inside a program, so they do not benefit from powerful features of the .NET Framework such as type safety, IntelliSense and code readability.

In order to address this issue, LINQ was introduced in .NET Framework 3.5 together with C# 3.0. LINQ allows you to query almost any collection that implements the IEnumerable<T> interface, be it a list, an array, an XML DOM, or a table in a SQL Server database. Among the several advantages of LINQ, static type checking and dynamic query composition are two of the major ones. There are two namespaces that contain LINQ types: System.Linq and System.Linq.Expressions. LINQ is basically a set of features that belong partly to the C# language and partly to the .NET Framework.

Basics of LINQ



LINQ has two basic units: sequences and elements. A sequence is any type that implements the IEnumerable<T> interface and contains some elements. A simple array can be considered a sequence. Elements are the items contained in a sequence. For instance, have a look at the following array of strings:
Code:
string[] Cars = { "Toyota", "Honda", "Suzuki" };
Here ‘Cars’ is a sequence, whereas ‘Toyota’, ‘Honda’ & ‘Suzuki’ are its elements. Since Cars is an array object that resides in the memory of the program, this sequence is called a local sequence. However, if you have a SQL Server table as a sequence, it is called a remote sequence.

Inside the System.Linq namespace, you have a class named Enumerable; this class contains the methods that are used to query sequences. These methods are called query operators, and they are implemented as extension methods. The Enumerable class has around 40 query operators, which are referred to as the standard query operators. A query operator takes a sequence as input, applies a transformation, and returns the transformed sequence as output. We will explain this in our first example. Queries that access local sequences are called local queries, or more commonly LINQ to Objects queries.
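To make the idea of "query operators implemented as extension methods" concrete, here is a minimal sketch of what such an operator could look like. The class MyQueryOperators and the method MyWhere are hypothetical names made up for this illustration; this is a simplified stand-in, not the actual source of Enumerable.Where.

```csharp
using System;
using System.Collections.Generic;

namespace CSharpTutorial
{
    // Hypothetical helper class; the real operators live in System.Linq.Enumerable.
    public static class MyQueryOperators
    {
        // Simplified stand-in for a filtering operator (no argument checks).
        // The 'this' modifier on the first parameter makes it an extension method.
        public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
        {
            foreach (T item in source)
            {
                if (predicate(item))
                {
                    yield return item; // pass matching elements through, one at a time
                }
            }
        }
    }

    public class ExtensionDemo
    {
        public static void Main()
        {
            string[] Cars = { "Toyota", "Honda", "Suzuki", "BMW" };

            // Called like an instance method on the sequence.
            foreach (string car in Cars.MyWhere(c => c.Length > 5))
            {
                Console.WriteLine(car); // prints Toyota, then Suzuki
            }
        }
    }
}
```

The operator takes a sequence as input and hands back another sequence, which is the pattern the standard query operators follow.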

A query in LINQ is basically an expression that takes at least one input sequence and has at least one operator. The query transforms the sequence with the help of the operator and returns the result. For example, we can get all the car names from the ‘Cars’ array that are longer than 5 characters by writing a LINQ query. Have a look at the first example of this tutorial to understand this concept.

Example1

Code:
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {
            string[] Cars = { "Toyota", "Honda", "Suzuki", "BMW" };

            IEnumerable<string> carscoll = System.Linq.Enumerable.Where(Cars, c => c.Length > 5);
            foreach (string car in carscoll)
            {
                Console.WriteLine(car);
            }

            Console.WriteLine("\n*******************************\nOperator as extension method...\n");
            carscoll = Cars.Where(c => c.Length > 4);
            foreach (string car in carscoll)
            {
                Console.WriteLine(car);
            }
            Console.ReadLine();
        }
    }
}
Look carefully at our first example; we have a sequence that is basically an array of strings, which we have named ‘Cars’. This sequence has four items, the names of the cars. Since every sequence implements the IEnumerable<T> interface, we can store the result of the LINQ query in a variable of type IEnumerable<T>, which we have declared in Example1 with the name ‘carscoll’. On the right side of this declaration, we have a LINQ query, i.e.:
Code:
System.Linq.Enumerable.Where(Cars, c => c.Length > 5);
Because Example1 imports the System.Linq namespace, you could start right from the class name Enumerable; however, for the better understanding of the reader, I have written the complete namespace. In the above query, ‘Where’ is the query operator. Note that we have to pass a lambda expression to the query operator. This lambda expression drives the transformation of the sequence. Here it says: return all the elements of the sequence whose length is greater than 5. In our ‘Cars’ sequence, two elements satisfy this condition, i.e. ‘Toyota’ & ‘Suzuki’. These two elements are returned through the ‘carscoll’ object. We then enumerate this object and display its contents, so ‘Toyota’ & ‘Suzuki’ are displayed on the output screen.

We can also directly call query operators on the sequences because query operators are implemented as extension methods. This is what we have done next in our code in the following line:
Code:
carscoll = Cars.Where(c => c.Length > 4);
Here we have called the ‘Where’ operator directly on the ‘Cars’ sequence. But this time we have changed our query so that it fetches all the cars whose names are longer than 4 characters. This time ‘Honda’ is also included in the result along with ‘Toyota’ & ‘Suzuki’. You can again enumerate ‘carscoll’ to see the updated results, as we have done in Example1. The output of Example1 is as follows:

Output1

Toyota
Suzuki

*******************************
Operator as extension method...

Toyota
Honda
Suzuki



Example2
Code:
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {
            string[] Cars = { "Toyota", "Honda", "Suzuki", "BMW" };

            IEnumerable<string> carscoll = from c in Cars
                                           where c.Length > 5
                                           select c;

            foreach (string car in carscoll)
            {
                Console.WriteLine(car);
            }
            Console.ReadLine();
        }
    }
}
In Example2, everything is similar to Example1, except the LINQ query on the right side of the IEnumerable<string> carscoll object. Here we have the following syntax:
Code:
from c in Cars
where c.Length > 5
select c;
This is equivalent to the line Cars.Where(c => c.Length > 5); only the syntax is different. This style is called query syntax, whereas the method-call style used in Example1 is called fluent syntax. Most of the time, fluent syntax is preferable to query syntax for several reasons, which we will explore in the following section. The output of the code in Example2 is as follows:

Output2

Toyota
Suzuki



Complex Queries via Query Operator Chaining



In both our 1st and 2nd examples, we executed a simple query involving one operator to get the desired result. However, this is not always the case; most of the time you will need to query sequences under several conditions. In such cases, you can use more than one operator in a LINQ query. This process is called query operator chaining. Have a look at our next example to further understand this concept.

Example3

Code:
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {
            string[] Cars = { "Toyota", "Honda", "Suzuki", "BMW" };

            IEnumerable<string> carscoll = Cars.Where(c => c.Contains("a")).OrderBy(c => c.Length).Select(c => c.ToUpper());

            foreach (string car in carscoll)
            {
                Console.WriteLine(car);
            }

            Console.WriteLine("\n************************************\nLINQ query without extension methods\n");
            carscoll = Enumerable.Select(
                Enumerable.OrderBy(
                    Enumerable.Where(
                        Cars, c=>c.Contains("a")
                        ), c=>c.Length
                ), c=>c.ToUpper()
            );
            foreach (string car in carscoll)
            {
                Console.WriteLine(car);
            }
            Console.ReadLine();
        }
    }
}
Let us explore the code in Example3. Here we have the same Cars sequence, but now we want a LINQ query that gets all the elements that contain the string “a”. We want these elements to be sorted by the length of the string, and we want to convert them to upper case. Note that we need to perform three operations here: checking whether an element contains a certain string, sorting the data, and then projecting it into uppercase. We need three query operators for this purpose. Have a look at this line of code; it does the actual work.
Code:
Cars.Where(c => c.Contains("a")).OrderBy(c => c.Length).Select(c => c.ToUpper());
Here we are executing the query operator extension methods in the form of a chain. First we use the ‘Where’ operator to find all the elements that contain the string “a”. The output of this method is ‘Toyota’ and ‘Honda’. This result is an intermediate IEnumerable<string> object, which is passed as input to the next query operator in the chain, i.e. ‘OrderBy’. In the ‘OrderBy’ query operator we sort the data by the length of the elements. The returned sequence is another intermediate IEnumerable<string> object, containing ‘Honda’ as the first element and ‘Toyota’ as the second, sorted by length. This is passed to the ‘Select’ query operator. ‘Select’ simply applies the transformation specified within the brackets to each element; here, it converts the elements of the input sequence to uppercase. The final result of these chained query operators is ‘HONDA’ as the first element & ‘TOYOTA’ as the second element. This is what we required, i.e. all the elements containing the string “a”, sorted by length and converted into uppercase.

In order to show you the importance of extension methods and the benefits they bring to a program, we have also chained the query operators using ordinary static method calls. Have a look at this line of code:
Code:
carscoll = Enumerable.Select(
    Enumerable.OrderBy(
        Enumerable.Where(
            Cars, c=>c.Contains("a")
            ), c=>c.Length
    ), c=>c.ToUpper()
);
Here we are executing the same LINQ query, which gets all the elements containing the string “a”, sorts them by length and projects them to upper case. But in this case, we are calling the query operators as ordinary static methods rather than as extension methods. You can see that this LINQ query is less compact, less readable and harder to maintain than the query that uses extension methods, because the operators have to be read from the inside out. The output of the code in Example3 is as follows.

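Output3

HONDA
TOYOTA

************************************
LINQ query without extension methods

HONDA
TOYOTA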



An important thing to remember here is that, instead of a lambda expression, you can also pass a query operator a delegate that refers to a named method. This approach is particularly useful when you are querying an XML sequence via LINQ. However, if you are querying remote sequences such as a SQL Server database, you have to use lambda expressions: such sequences implement the IQueryable<T> interface, and the operators in the Queryable class accept expression trees (Expression<Func<...>>), which the compiler can build from a lambda expression but not from a method name.
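The following sketch shows the delegate form for a local query. The method HasLongName is a hypothetical helper introduced only for this illustration; any method whose signature matches Func<string, bool> would do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    public class DelegateDemo
    {
        // Hypothetical named method with the shape Func<string, bool>.
        public static bool HasLongName(string name)
        {
            return name.Length > 5;
        }

        public static void Main()
        {
            string[] Cars = { "Toyota", "Honda", "Suzuki", "BMW" };

            // The method name is converted to a delegate; no lambda is needed.
            IEnumerable<string> viaMethod = Cars.Where(HasLongName);

            // Equivalent lambda form.
            IEnumerable<string> viaLambda = Cars.Where(c => c.Length > 5);

            Console.WriteLine(string.Join(", ", viaMethod)); // Toyota, Suzuki
            Console.WriteLine(string.Join(", ", viaLambda)); // Toyota, Suzuki
        }
    }
}
```

Both forms produce the same result for local sequences; the named method is handy when the same condition is reused in several queries.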

Some important Query Operators



Though there are around 40 query operators in the Enumerable class and a similar number in the Queryable class, we will discuss only some of the most important ones.
  • Take - The Take operator returns the first x elements of a sequence, where x is a number specified by the user.
  • Skip - Skip is another very important query operator. It skips the first x elements of a sequence, where x is a number specified by the user, and returns the rest.
To see how the ‘Take’ and ‘Skip’ operators work, have a look at our fourth example.

Example4

Code:
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {
            string[] Cars = { "Toyota", "Honda", "Suzuki", "BMW" };

            Console.WriteLine("\n****************************\nUsing Take query operator...\n");
            IEnumerable<string> carscoll = Cars.Take(3);

            foreach (string car in carscoll)
            {
                Console.WriteLine(car);
            }
            Console.WriteLine("\n****************************\nUsing Skip query operator...\n");
            carscoll = Cars.Skip(2);
            foreach (string car in carscoll)
            {
                Console.WriteLine(car);
            }
            Console.ReadLine();
        }
    }
}
Have a look at the code in Example4; we have the same ‘Cars’ sequence that contains four strings as car names. First we call the ‘Take’ query operator on the Cars sequence and pass it the value 3. This LINQ query returns a sequence that contains the first three elements of the ‘Cars’ sequence.

Next, we call the ‘Skip’ query operator on the ‘Cars’ sequence and pass it 2; it skips the first two elements of ‘Cars’ and returns all the remaining elements. The result of the code in Example4 is as follows.

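Output4

****************************
Using Take query operator...

Toyota
Honda
Suzuki

****************************
Using Skip query operator...

Suzuki
BMW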



A very important thing to note here is that query operators do not alter the original sequence that is passed to them; instead, they return a new sequence, which you can store in a variable.
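As a small sketch of this point, ‘Skip’ and ‘Take’ can be combined to read a sequence page by page while the source stays untouched. The helper GetPage and its parameter names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    public class PagingDemo
    {
        // Hypothetical helper: returns one "page" of a sequence.
        // pageIndex is zero-based.
        public static IEnumerable<string> GetPage(IEnumerable<string> source, int pageIndex, int pageSize)
        {
            return source.Skip(pageIndex * pageSize).Take(pageSize);
        }

        public static void Main()
        {
            string[] Cars = { "Toyota", "Honda", "Suzuki", "BMW" };

            Console.WriteLine(string.Join(", ", GetPage(Cars, 0, 2))); // Toyota, Honda
            Console.WriteLine(string.Join(", ", GetPage(Cars, 1, 2))); // Suzuki, BMW

            // The source array still holds all four elements in their original order.
            Console.WriteLine(string.Join(", ", Cars)); // Toyota, Honda, Suzuki, BMW
        }
    }
}
```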
  • Reverse - The Reverse operator, as the name signifies, takes an input sequence and returns its elements in reverse order. The next example demonstrates this concept.
Example5

Code:
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {
            string[] Cars = { "Toyota", "Honda", "Suzuki", "BMW" };

            IEnumerable<string> carscoll = Cars.Reverse();
            foreach (string car in carscoll)
            {
                Console.WriteLine(car);
            }
            Console.ReadLine();
        }
    }
}
Output5

BMW
Suzuki
Honda
Toyota



In Example5 we have the Cars sequence. We call the ‘Reverse’ operator on this sequence, and when you enumerate the returned sequence, you will see that it contains the elements in reverse order.
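One caveat is worth a short sketch, assuming the sequence is a List<string> rather than an array: List<T> has its own instance method named Reverse, which reverses the list in place and returns nothing, and the compiler prefers it over the LINQ extension method. The helper Reversed below is a hypothetical name used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    public class ReverseDemo
    {
        // Hypothetical helper: LINQ-style reverse that leaves the list untouched.
        public static IEnumerable<string> Reversed(List<string> source)
        {
            // Calling the static method directly avoids List<T>.Reverse(),
            // which reverses in place and returns void.
            return Enumerable.Reverse(source);
        }

        public static void Main()
        {
            List<string> cars = new List<string> { "Toyota", "Honda", "Suzuki", "BMW" };

            Console.WriteLine(string.Join(", ", Reversed(cars))); // BMW, Suzuki, Honda, Toyota
            Console.WriteLine(string.Join(", ", cars));           // Toyota, Honda, Suzuki, BMW

            cars.Reverse(); // instance method: modifies the list itself
            Console.WriteLine(string.Join(", ", cars));           // BMW, Suzuki, Honda, Toyota
        }
    }
}
```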

Query Operators returning single value



Not all query operators return a sequence of elements. Some query operators process the sequence and return a single value. Some of the important query operators that return a single value are:
  • First - This operator returns the first element of a sequence.
  • Last - It returns the last element of the sequence.
  • ElementAt - Returns the element located at the specified index.
  • Count - Used to get the number of elements in a given sequence.
  • Min - It returns the minimum value in a sequence.
  • Contains - This operator returns ‘true’ if sequence contains specified element.
  • Any - This operator returns ‘true’ if the sequence contains any element that satisfies the lambda expression passed to it.
Note: Min & Count are called aggregate operators because they return aggregate values.
Note: Contains, Any and All, which return a bool value describing the elements of a sequence, are called quantifier operators.
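The quantifier and aggregate operators noted above can be sketched in a few lines. The helper HasEven is a hypothetical name used for illustration; ‘All’ is the third standard quantifier, which returns ‘true’ only if every element satisfies the condition.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    public class QuantifierDemo
    {
        // Hypothetical helper: true if at least one number is even.
        public static bool HasEven(IEnumerable<int> numbers)
        {
            return numbers.Any(n => n % 2 == 0);
        }

        public static void Main()
        {
            int[] integers = { 4, 2, 3, 8, 2, 7, 9 };
            int[] empty = { };

            // Quantifier operators return a bool.
            Console.WriteLine(integers.Contains(9));     // True
            Console.WriteLine(HasEven(integers));        // True
            Console.WriteLine(integers.All(n => n > 0)); // True: every element is positive
            Console.WriteLine(integers.Any());           // True: the sequence is not empty
            Console.WriteLine(empty.Any());              // False: no elements at all

            // Aggregate operators return a single computed value.
            Console.WriteLine(integers.Count());         // 7
            Console.WriteLine(integers.Min());           // 2
        }
    }
}
```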

Now, in our next example, I will show you how the operators that return single values are actually used in code. Have a look at our 6th example.

Example6

Code:
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {
            int[] integers = { 4, 2, 3, 8, 2, 7, 9 };

            int first = integers.First();
            int last = integers.Last();
            int element4 = integers.ElementAt(3);

            int thirdlowest = integers.OrderBy(i => i).Skip(2).First();

            int totalnumbers = integers.Count();
            int minimum = integers.Min();

            bool contains = integers.Contains(9);
            bool haseven = integers.Any(n => n % 2 == 0);

            Console.WriteLine("\n**************************\n");
            Console.WriteLine("First Element:" + first);
            Console.WriteLine("Last Element:" + last);
            Console.WriteLine("4th element:" + element4);
            Console.WriteLine("Third smallest element:" + thirdlowest);
            Console.WriteLine("Total Elements:" + totalnumbers);
            Console.WriteLine("Minimum Number:" + minimum);
            Console.WriteLine("Sequence contains 9?" + contains);
            Console.WriteLine("Sequence has any even number?" + haseven);

            Console.ReadLine();
        }
    }
}
In Example6, we have declared an integer array which we have named ‘integers’. This array contains some random integers in unsorted order. First we call the ‘First’ query operator to get the first element. Next, we call the ‘Last’ query operator to get the last element. We then call the ‘ElementAt’ operator, which returns the value at the specified zero-based index, so ElementAt(3) returns the 4th element. These are simple methods.

Now, consider a scenario where you want to get the third smallest element. You can do this by following query:
Code:
int thirdlowest = integers.OrderBy(i => i).Skip(2).First();
Here we are chaining the query operators. Chained query operators are applied from left to right. Therefore, we first order all the elements using the OrderBy query operator. The result is the sequence of elements arranged in ascending order. The next operator in the chain, i.e. Skip, takes this ordered sequence as its input and skips the first two elements because we have passed it 2. Now we have a sequence in ascending order from which the two smallest elements have been skipped, which means that the third smallest element is at the beginning. We can get this element by simply using the ‘First’ query operator, which is the next operator in the chain. The result of this query is 3, as there are two elements in the sequence smaller than 3, i.e. 2 twice.
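The same chain can be generalized to any position. The sketch below uses two hypothetical helpers, NthSmallest and NthSmallestDistinct, named only for this illustration; the second one first removes duplicates with the ‘Distinct’ operator, which matters here because 2 appears twice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    public class NthSmallestDemo
    {
        // Hypothetical helper. n is one-based: n = 1 returns the minimum.
        // Duplicates are counted separately.
        public static int NthSmallest(IEnumerable<int> source, int n)
        {
            return source.OrderBy(i => i).Skip(n - 1).First();
        }

        // Hypothetical variant that ignores duplicate values.
        public static int NthSmallestDistinct(IEnumerable<int> source, int n)
        {
            return source.Distinct().OrderBy(i => i).Skip(n - 1).First();
        }

        public static void Main()
        {
            int[] integers = { 4, 2, 3, 8, 2, 7, 9 };

            Console.WriteLine(NthSmallest(integers, 3));         // 3 (sorted: 2, 2, 3, ...)
            Console.WriteLine(NthSmallestDistinct(integers, 3)); // 4 (sorted distinct: 2, 3, 4, ...)
        }
    }
}
```

Note that ‘First’ throws an exception if fewer than n elements are available, so this sketch assumes n does not exceed the number of elements.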

We have then applied the ‘Count’ operator to get the total number of elements and the ‘Min’ operator to get the minimum value. Next we have called the ‘Contains’ operator and then the ‘Any’ operator to find out whether the sequence contains the element ‘9’ and whether it has an even element, respectively. Notice how we check whether a sequence contains an even number. Consider the following line:
Code:
bool haseven = integers.Any(n => n % 2 == 0);
The lambda expression tests whether an element ‘n’ of the sequence is divisible by 2. As soon as any element is found that is divisible by 2, which means that it is an even number, ‘Any’ returns true. If you do not pass a lambda expression to the ‘Any’ operator, it simply checks whether there is any element in the sequence; if there is, it returns true. If the sequence is empty, the ‘Any’ operator returns false. The output of the code in Example6 is as follows:

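Output6

**************************

First Element:4
Last Element:9
4th element:8
Third smallest element:3
Total Elements:7
Minimum Number:2
Sequence contains 9?True
Sequence has any even number?True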



Operators that take two sequences as input



Till now, we have seen query operators that take one sequence as input. However, there are query operators that take two sequences. Two of the most important such operators are as follows:
  • Concat - This query operator takes two sequences as input and appends the sequence that is passed to it to the end of the sequence on which it is called.
  • Union - This query operator takes the union of the two sequences, i.e. it returns the elements of both sequences with duplicates removed.
Our next example explains the working of both of these operators. Have a look at our 7th example.

Example7
Code:
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {
            int[] integers = { 4, 2, 3, 8, 2, 7, 9 };

            int[] integers2 = { 6, 1, 7, 2, 5, 8, 4 };

            Console.WriteLine("Concatenating two sequences ...\n");
            IEnumerable<int> result = integers.Concat(integers2);

            foreach (int num in result)
            {
                Console.Write(num + ", ");
            }

            Console.WriteLine("\n\nTaking union of two sequences ...\n");
            result = integers.Union(integers2);

            foreach (int num in result)
            {
                Console.Write(num + ", ");
            }
            Console.ReadLine();
        }
    }
}
In Example7, we have two arrays: ‘integers’ and ‘integers2’. Both of these arrays contain some random integers. We first call the ‘Concat’ operator and display its result using a foreach loop. You will see in the output that the elements of ‘integers2’ are appended after the elements of ‘integers’, and that duplicated elements are displayed as well.

In contrast, when you use the ‘Union’ operator on two sequences, it discards duplicates and returns an element only once, even if that element is repeated or is present in both sequences. The output of the code in Example7 is as follows:

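Output7

Concatenating two sequences ...

4, 2, 3, 8, 2, 7, 9, 6, 1, 7, 2, 5, 8, 4,

Taking union of two sequences ...

4, 2, 3, 8, 7, 9, 6, 1, 5,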



Difference between Query Expressions in LINQ & SQL



In our second example, I explained how query syntax, commonly known as query expressions, can be used instead of fluent syntax for executing LINQ queries over sequences. We shall explain the differences between query syntax and SQL syntax later; first, let us revisit the concept of query expressions with another example, in which we get all the even elements from a sequence via query syntax. Have a look at the 8th example of this article.

Example8

Code:
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {
            int[] integers = { 4, 2, 3, 8, 2, 7, 9 };

            IEnumerable<int> evens = from n in integers
                                     where n % 2 == 0
                                     select n;
            foreach (int even in evens)
            {
                Console.Write(even + ", ");
            }
            Console.ReadLine();
        }
    }
}
Pay attention to these lines of code:
Code:
IEnumerable<int> evens = from n in integers
                         where n % 2 == 0
                         select n;
Here we are using query syntax to get all the elements for which n % 2 == 0, which means that n is even. Note that all LINQ queries written in query syntax start with a ‘from’ clause and end with either a ‘select’ clause or a ‘group’ clause. Another extremely important thing to note is the concept of the range variable.

A range variable is basically a variable that represents each element in turn as the provided sequence is enumerated. In Example8, ‘n’ is the range variable, which enumerates through the sequence ‘integers’. Another thing to note here is that the scope of a range variable is local to the query. The output of the code in Example8 contains all the even elements of the ‘integers’ sequence. The output of Example8 is as follows:

Output8

4, 2, 8, 2,



If you are familiar with SQL queries, you might have noticed that query syntax is quite similar to SQL. However, there are many fundamental differences between the two. A striking difference is that in LINQ you can never use a variable before it is declared. In fact, LINQ queries follow all the standard rules of C# language syntax. In SQL, on the other hand, you can refer to a name before it is introduced; for example, the SELECT clause can use a table alias that is only defined later in the FROM clause.

Another major difference between LINQ and SQL is in terms of subqueries. In C#, a subquery is just another C# expression, whereas a subquery in SQL has to conform to special rules.

In LINQ, queries are well structured: data logically flows from left to right through the query. In SQL, the order of the clauses is less well structured with respect to data flow, which often makes queries less organized and less readable.
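The point about subqueries can be sketched with a local query in which the ‘where’ clause contains a second LINQ expression. The helper ShortestNames is a hypothetical name chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    public class SubqueryDemo
    {
        // Hypothetical helper: returns the names that have the minimum length.
        public static IEnumerable<string> ShortestNames(string[] names)
        {
            // names.Min(...) is the subquery; it is an ordinary C# expression
            // placed inside the where clause of the outer query.
            return from c in names
                   where c.Length == names.Min(n => n.Length)
                   select c;
        }

        public static void Main()
        {
            string[] Cars = { "Toyota", "Honda", "Suzuki", "BMW" };

            foreach (string car in ShortestNames(Cars))
            {
                Console.WriteLine(car); // prints BMW
            }
        }
    }
}
```

In a local query like this one, the subquery is evaluated again for every element of the outer sequence, so for large sequences it may be better to compute the minimum once and store it in a variable before the query.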

Combining Fluent Syntax & Query Expressions



You can choose whatever LINQ query style you like (I personally prefer fluent syntax). However, you can also integrate both of them. You can write LINQ queries that are partially written in query expression and partially written in fluent syntax. Have a look at our next example.

Example9
Code:
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {

            int[] integers = { 4, 2, 3, 8, 2, 7, 9 };

            int evennums = (from n in integers
                                where n % 2 == 0
                                select n).Count();
            Console.WriteLine("Total number of even numbers are: "+ evennums);
            Console.ReadLine();
        }
    }
}
In Example9, we are again finding all the elements that are even. But in this example we have enclosed our query expression in round brackets, and after the closing bracket we have appended the ‘Count’ operator using a dot. Here the ‘Count’ operator represents the fluent syntax. The query expression passes all the even elements to the ‘Count’ operator, which returns the number of even elements. This is 4, because we have 4 even elements in our ‘integers’ sequence (4, 2, 8 and 2). The output of the code in Example9 is as follows:

Output9

Total number of even numbers are: 4



Deferred Query Execution



Among the numerous benefits that LINQ brings, deferred query execution is one of the most exciting. The deferred execution principle states that a query is not executed when it is constructed, but only when it is enumerated. In other words, the query actually runs when the MoveNext method is called on the enumerator of the resulting sequence, which is what a foreach loop does behind the scenes. This concept is best explained with the help of an example. Have a look at our next example.

Example10
Code:
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {
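            List<int> integers = new List<int> { 4, 2 };

            // Build the query; nothing is executed yet.
            IEnumerable<int> subtracted = integers.Select(n => n - 2);

            // Added after the query was constructed.
            integers.Add(8);

            // Enumerating the query executes it now, over all three elements.
            foreach (int num in subtracted)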
            {
                Console.Write(num +" ");
            }
            Console.ReadLine();
        }
    }
}
In Example10, we have a List<int> collection which we have named ‘integers’. We have initialized this collection with two elements, i.e. ‘4’ & ‘2’. Next, we have written a LINQ query that subtracts 2 from each element of the ‘integers’ collection. So you might expect the result of this query to be a sequence that contains ‘2’ & ‘0’. After constructing the query, we add another element, ‘8’, to the ‘integers’ collection. Note that the LINQ query was constructed before the third element was added. After that, we enumerate the result of the query and display it. The output of the code in Example10 is as follows:

Output10

2 0 6



You can see that although the collection had two elements when you constructed the query, the output has three elements, all transformed according to the query. You added ‘8’ after constructing the query, but ‘8’ has also been transformed into ‘6’ (subtracting 2 according to the query). The reason has been mentioned earlier: the query was actually executed when we enumerated it using the foreach loop, and at that time the third element was already present in the collection.
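If you want the result as it was at the moment the query was written, one common approach is to force immediate execution with the ‘ToList’ operator, which copies the results into a new list. The sketch below contrasts the two behaviours; the helper Describe is a hypothetical name used only for printing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    public class DeferredDemo
    {
        // Hypothetical helper: enumerates a sequence and joins its elements with spaces.
        public static string Describe(IEnumerable<int> sequence)
        {
            return string.Join(" ", sequence);
        }

        public static void Main()
        {
            List<int> integers = new List<int> { 4, 2 };

            // Deferred: only the query is stored, nothing runs yet.
            IEnumerable<int> deferred = integers.Select(n => n - 2);

            // Immediate: ToList enumerates the query right away and keeps a snapshot.
            List<int> snapshot = integers.Select(n => n - 2).ToList();

            integers.Add(8);

            Console.WriteLine(Describe(deferred)); // 2 0 6 (sees the element added later)
            Console.WriteLine(Describe(snapshot)); // 2 0   (captured before 8 was added)
        }
    }
}
```

A deferred query is also executed again each time it is enumerated, so storing a snapshot can avoid repeating the work when the same result is needed several times.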

This article explains the most fundamental concepts of LINQ. LINQ is a very vast domain; in upcoming articles, I’ll show you some advanced features of LINQ.