Go4Expert

Go4Expert (http://www.go4expert.com/)
-   C# (http://www.go4expert.com/articles/c-sharp-tutorials/)
-   -   Subqueries & Interpreted Queries in LINQ (http://www.go4expert.com/articles/subqueries-interpreted-queries-linq-t30039/)

shabbir 31Mar2014 08:25

Subqueries & Interpreted Queries in LINQ
 
If you are beginner to LINQ, I will suggest you to first read Introduction to LINQ. However, if you have read that article or have some basic understanding of LINQ queries and lambda expressions, you can carry on with this article. In this article, I am going to explain you some more advanced LINQ concept. We will begin with sub-queries in LINQ and finally, we will discuss the concept of interpreted queries. While reading the tutorial, I will suggest you to test code sample yourself. I will also encourage you to modify them and see their behavior.

Subqueries in LINQ



Sub-query or commonly written as subquery is basically a query within a query. In order to write a subquery, you have to write a query after the lambda expression of the outer query. Subquery follows all the standard C# expression rules and in fact is a C# expression. Since, you can right any valid C# expression on the right side of a lambda expression, you can also write a subquery. The best way to learn this concept is via an example. Have a look at our first example.

Example1

Code:

using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {
            List<string> persons = new List<string> { "Bret Lee",
                                                      "Michael Hussey",
                                                      "Ricky Ponting",
                                                      "James Fauklkner",
                                                      "Stewart Law" };

            IEnumerable<string> orderedbylastname = persons.OrderBy(p => p.Split().Last());

            foreach (string name in orderedbylastname)
            {
                Console.WriteLine(name);
            }
            Console.ReadLine();
        }
    }
}

In Example1, we have a List<string> collection which we have named ‘persons’. This collection contains complete names (First & Last) of some random persons. What if we want to sort this list by last name? The first thing that comes to your mind after is perhaps the ‘OrderBy’ operator and that’s correct. You need an ‘OrderBy’ operator. But you will have to pass all the last names in the ‘persons’ collection to this ‘OrderBy’ operator so that the operator can return the sorted sequence based on last name. In order to fetch the last names, you can use another query. You can use ‘Split’ operator which would split the sequence of string into words. Then you can call ‘Last’ operator which would return the last element of the split sequence. Since, all the elements in the ‘persons’ collection only have two words, therefore, calling ‘Split’ operator would split the string into two words: First name and the last name. When you call ‘Last’ operator on this sequence, last word would be returned which is actually the last name in ‘persons’ sequence. This last name can be fed into the outer query that uses ‘OrderBy’ operator. The output of the code in Example1 is as follows:

http://imgs.g4estatic.com/c-sharp/li...ry/output1.png

It is evident from the output that the names have been arranged in alphabetical order of the last names of the elements in sequence ‘persons’.

Our first example was a very simple one where we used only one subquery; however, you can have multiple, in fact as many Subqueries as you want depending upon the requirement of the task you want to perform. In our next example, we will show you that how you can get all the elements of a string sequence, which are equal in length to the largest element of the sequence. Have a look at our next example.

Example2

Code:

using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {
            List<string> persons = new List<string> { "Bret",
                                                      "Hussey",
                                                      "Ponting",
                                                      "Faukner",
                                                      "Lewis" };

            IEnumerable<string> orderedbylastname = persons.Where(p => p.Length == persons.OrderBy(p2 =>p2.Length).Select(p2 =>p2.Length).Last());

            foreach (string name in orderedbylastname)
            {
                Console.WriteLine(name);
            }
            Console.ReadLine();
        }
    }
}

In this example we have again a List<string> collection named ‘persons’ which contain some random names. In order to get all those names that are equal in length to the largest name we have used following query:
Code:

persons.Where(p => p.Length == persons.OrderBy(p2 =>p2.Length).Select(p2 =>p2.Length).Last());
In the above line of code, we have an outer query with ‘Where’ operator, inside the lambda expression of the outer query, we are comparing the length of the elements in sequence ‘persons’ use a range variable ‘p’. In order to get the length of the largest element in the sequence, we have used a subquery. We first sorted all the elements based on their length by using OrderBy operator. The result of the sorted sequence is fed into Select operator that would select the length of all the sorted elements. And finally, the last operator returns the length of the last element in the sorted sequence. Since, we sorted the sequence based on length; therefore the last element would have the largest length. We again come back to our outer query. Here we will compare the length of each element with the largest length, and elements having same length as the largest element would be returned. Since, the largest element has length ‘7’, therefore all the elements with length ‘7’ would be returned. The output of the code in Example2 is as follows.

Output2

http://imgs.g4estatic.com/c-sharp/li...ry/output2.png

An important point to note in Exampl2 is that in the subquery we used ‘p2’ range variable instead of the range variable of the outer query i.e ‘p’. The reason is that the range variable of the outer query has a scope over the subquery, therefore we need another range variable in the subquery for comparison with the outer query’s range variable.

We can further, abridge the code in Example2, by using following query:
Code:

persons.Where(p => p.Length == persons.Max(n=>n.Length));
This would also return as all the elements with length equal to that of largest element in the sequence. Here we have simply used Max query operator and passed it the predicate that find the element with maximum length.

It is always better to avoid Subqueries unless required necessarily. This technique results in a more readable and organized code. For example, you can get all the elements with length equal to that of the maximum element, this way as well:
Code:

int maximum = persons.Max(n => n.Length);
IEnumerable<string> orderedbylastname = persons.Where(p => p.Length == maximum);

The above line of code eliminates the need for subquery. Also, you can use the ‘maximum’ variable, wherever you want without querying the ‘persons’ collection again and again.

Progressive Query Construction



Instead of building your LINQ queries in a single long statement, it is always advisable to write queries progressively. We know that a query operator basically returns a decorator sequence; this decorator sequence can be further queried via another operator and so on. Have a look at our 3rd example.

Example3

Code:

using System;
using System.Collections.Generic;
using System.Linq;

namespace CSharpTutorial
{
    class Program
    {
        public static void Main()
        {
            List<string> persons = new List<string> { "Bret",
                                                      "Hussey",
                                                      "Ponting",
                                                      "Faukner",
                                                      "Lewis" };

            var names = persons.Where(n=>n.Contains("t"));
            var sorted = names.OrderBy(n=>n.Length);
            var caps = sorted.Select(n=>n.ToUpper());

            foreach (string name in caps)
            {
                Console.WriteLine(name);
            }
            Console.ReadLine();
        }
    }
}

The code in Example3 is straight forward, here instead of writing one long query to get all the elements that contain “t”, sorted by length and converted to uppercase, we have three queries. The first query returns all the elements that contain “t”. This is a decorated sequence and we can store it any var type variable which we have named ‘names’. Next, we will fed this ‘name’ sequence to the OrderBy operator which will sort elements in the names ‘sequence’ based on their length. The resultant sequence will be stored in the ‘sorted’ variable and finally, Select operator would execute on this ‘sorted’ variable to return the elements in capitals. If you look at the output of the code in Example3, it will contain ‘BRET’ & ‘PONTING’ because these elements contain ’t’, are sorted by length and converted to uppercase.

Output3

http://imgs.g4estatic.com/c-sharp/li...ry/output3.png

Interpreted Queries



LINQ queries are divided into two broad categories: local & remote queries. Till now in this tutorial, we have implemented LINQ queries over local collection of objects. These are local queries and can be executed over all those collections that implement IEnumerable<T> interface. These queries use a query operator located in Enumerable class and these operator can execute queries using fluent syntax, query expression or simple delegates. The local queries are just other C #expressions as explained earlier.

Remote queries or what we call interpreted queries are those queries that are execute over remote data such as SQL Server tables located on a remote server. Interpreted queries are executed over any sequence of data which implements the IQueryable<T> interface. Interpreted queries resolve to query operators located in the Queryable class. Interpreted queries emit expression trees at runtime.

The .NET Framework contains two major implementations of the IQueryable<T> interface:
  • LINQ to SQL.
  • Entity Framework.
The ordinary collections can also be converted to implement the IQueryable<T> interface. This is done by calling AsQueryable method on the collection.

In order to explain interpreted queries, we will use LINQ to SQL technique because implementing an interpreted query over entity framework first requires us to write Entity Data Model which is out of the scope of this article. However, the interpreted queries over LINQ to SQL are still valid for the Entity Framework.

Now, you will be required doing some database work to proceed. If you are familiar with creating databases, well and good; if you have never created a database via visual studio, I will show you how to do this. Follow these steps.

Note: This tutorial has been written using visual studio 2010. The options that are explained might be located at different place on your version of VS, but the concept remains the same.

1. Right Click on the project name and click Add => New Item. See following figure:

http://imgs.g4estatic.com/c-sharp/li...query/fig1.png

2. A window will appear, find the option “Local Database” and name it “Bookstore”. See the figure below:

http://imgs.g4estatic.com/c-sharp/li...query/fig2.png

3. Click on the “Server Explorer” on the left side of the window if it is not already open and you should see your database in the server explorer. See following figure:

http://imgs.g4estatic.com/c-sharp/li...query/fig3.png

4. Right click on Tables. Click “Create New Table” and then add data as the following figure:

http://imgs.g4estatic.com/c-sharp/li...query/fig4.png

5. Again, Explore the Table option in the Bookstore database, You will find “Authors” table there; right click on authors table and choose the option “Show Table Data”.

http://imgs.g4estatic.com/c-sharp/li...query/fig5.png

6. Table will initially contain no data, manually enter the following data in the table and now your table should look like this.

http://imgs.g4estatic.com/c-sharp/li...query/fig6.png

Now, to see how interpreted queries actually work inside a code, have a look at our fourth example.

Example4
Code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Data.Linq;
using System.Data.Linq.Mapping;

namespace CSharpTutorial
{
    [Table]
    public class Authors
    {
        [Column(IsPrimaryKey = true)]
        public int ID;

        [Column]
        public string Name;
    }

    class Program
    {
        public static void Main()
        {
            DataContext datacontext = new DataContext("Bookstore.sdf");
            Table<Authors> authors = datacontext.GetTable<Authors>();
            IQueryable<string> names =  from a in authors
                                        where a.Name.Contains("i")
                                        orderby a.Name
                                        select a.Name.ToUpper();

            foreach (string name in names)
            {
                Console.WriteLine(name);
            }
            Console.ReadLine();
        }
    }
}

Carefully examine the code in Example34. First check the namespaces, first we have to add reference to the System.Data.Linq dll in order to use the System.Data.Linq namespace. We also have to add System.Data.Linq.Mapping namespace. You must be wondering about the next line of code i.e

Code:

    [Table]
    public class Authors
    {
        [Column(IsPrimaryKey = true)]
        public int ID;

        [Column]
        public string Name;
   
    }

Table here is basically a TableAttribute class that is used to create a class which is associated to some table in the database. Table is written inside square brackets. Inside this class we write Column in square bracket followed by simple class member variables. These member variables refer to the columns of the table. Remember that we had an ‘Authors’ table which has two columns: an ID and a Name. Therefore, in our Authors class in our code, we have two members ID and Name. Note that we can also specify database rules in our C# code. We have specified in our first ColumnAttribute class that the following member corresponds to primary key. These tables and there structures would be explained in detail in my next article on LINQ to SQL & Entity Framework. For now, just bear in mind that this Authors class corresponds to Authors table in our Bookstore database.

Now, come inside our main method. The DataContext type is the starting point for entering any LINQ to SQL framework. Whenever you want to query remote SQL database, you will have to start from here. The constructor of DataContext type takes a string called connection string and it is basically path to the database to which we want to connect. In our case, we want to connect to the Bookstore database and fortunately enough for us; it is located in the same directory as our code file. Therefore, we can just pass the name of the database “Bookstore.sdf” in the constructor of the DataContext object. You also specify the full path name to your database, it doesn’t make any difference. We have datacontext object of type DataContext in our code. Using this datacontext object, we can get all the tables in the Bookstore database.

In our next line of code, we have a Table<T> type. Table <T> basically implements IQueryable<T> interface and we can execute remote queries on Table<T> types which we shall see. Consider following line of code.
Code:

Table<Authors> authors = datacontext.GetTable<Authors>();
Basically here we getting the content of Authors table from the Bookstore database by calling GetTable method on the datacontext object. We are storing this table’s content to Table<Authors> authors which can hold Authors table from the Bookstore database because we have already declared an Author class in our code to hold such type of data.

Now, coming towards the query; we have query that contains three query operators but there is a huge difference between these query operators and the one we discussed in the case of remote queries. In remote queries, these query operators belonged to Enumerable class and resolved to ‘decorated sequence’, but in case of these interpreted queries, query operators in Queryable class would be used. Though, Table<T> implements both IEnumerable and IQueryable interface, yet ‘Where’ operator from Queryable class would be used because it is close in syntax. Another important thing to note here is that query operator in Queryable class emit expression tree which is basically an object model that can be inspected at run time. These expression tree are basically System.Linq.Expression types and are used for translating LINQ to SQL or entity framework to underlying SQL.

Like, local queries, you can also chain interpreted queries via operator. Our query will first fetch all the names in the column Names that contain ‘i’ in them. Next, we will sort these names into alphabetically order by calling OrderBy and finally we will use Select operator to convert all the names to uppercase. In our “Authors” table, we had two names that contained ‘i’. These are “Michael” & “Richards”. So if you look at the output of the code in Example4, it will contain these two names as shown below.

Output4

http://imgs.g4estatic.com/c-sharp/li...ry/output4.png

Using Logical Operators with Query Operators



In our last example, we queried the database based on one attribute that was author’s name. However, we can take any number of database columns into account while writing LINQ to SQL queries. In our 5th Example, I am going to show you that how you can write more complex queries on a database. If you have multiple columns in your table, you can use logical operators such as OR ‘|’|, AND ‘&&’ and NOT ’!’ to achieve the desired result. Have a look at our 5th example of this article.

Example5
Code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Data.Linq;
using System.Data.Linq.Mapping;

namespace CSharpTutorial
{

    [Table]
    public class Authors
    {
        [Column(IsPrimaryKey = true)]
        public int ID;

        [Column]
        public string Name;
   
    }

    class Program
    {
        public static void Main()
        {
            DataContext datacontext = new DataContext("Bookstore.sdf");
            Table<Authors> authors = datacontext.GetTable<Authors>();
            IQueryable<string> names =  from a in authors
                                        where a.Name.Contains("a") && a.ID<4
                                        orderby a.Name
                                        select a.Name.ToUpper();

            foreach (string name in names)
            {
                Console.WriteLine(name);
            }
            Console.ReadLine();
        }
    }
}

The code in Example5 is quite similar to that of Example4 but here we have a slightly complex query. This example shows that how LINQ queries can be used to interact with multiple columns of a database table. In our Bookstore database, we have one table which we named Authors and that table contained two columns, the ID which was a primary key and the second column was Names. In previous example we only interact with the Names column and fetched all those names which contain some substring. However, what if we want get the names of all those authors that contain a letter “a” in them and have an ID less than 4? We can do this by using logical && operators in the where statement as we have done in the Exampl5. Consider the following lines of code from Example5.
Code:

            IQueryable<string> names =  from a in authors
                                        where a.Name.Contains("a") && a.ID<4
                                        orderby a.Name
                                        select a.Name.ToUpper();

In the above lines of code we have used query expression syntax to achieve our desired objective. We will get an element from authors using a range variable which is ‘a’ in this case. In the next line, in the ‘Where’ query operator, we have first put a condition that from the range variable ‘a’, get the element if it contains string “a”. Then we used an AND operator followed by a new condition that return only those Authors containing ‘a’ in their names and with ID less than 4. After that we sort those names by alphabetical order using OrderBY and then will convert them to uppercase using Select operator. The output would contain three elements i.e. JAMES, MARK and MICHAEL, these elements contain a in their names, have ID less than 4 and are converted to capitals.

Output5

http://imgs.g4estatic.com/c-sharp/li...ry/output5.png

Like local queries, the execution of the interpreted queries, is also deferred until they are enumerated. Therefore, the query in Example3, will execute only when ‘names’ sequence is enumerated via foreach loop. Also, if an interpreted query is enumerated twice, the corresponding database is queried twice.


All times are GMT +5.5. The time now is 07:53.