Go4Expert

Go4Expert (http://www.go4expert.com/)
-   C (http://www.go4expert.com/articles/c-tutorials/)
-   -   Ultimate Guide To Searching (http://www.go4expert.com/articles/ultimate-guide-searching-t21355/)

techgeek.in 14Mar2010 13:01

Ultimate Guide To Searching
 

Introduction



In computer science a search algorithm is a technique of finding a particular item among a collection of items. Searching for data is the second most common activity (after sorting data) necessary for creating many types of programs.

Sequential Search



In this technique we scan the entire list from the start or beginning till we encounter our sought value. If we reach the end of the list and did not encounter our sought value then it means that the value is not present in the list.For small lists, a sequential search is simple to use and fast. But if you need to search large amounts of data, the sequential search is slow and cumbersome.

Code in C:-
Code:

int Find (int X, int A[], int n)
{
    int i;
   
    for (i=0; i<n && A[i]!=X; i++);
    if (i= =n) return -1;
    else return i;
}

In the above code the element X is searched in list A[]. If found then the position is returned otherwise -1 is returned.
Let us analyse this algorithm. Obviously this is an O(n) algorithm. But what is the number of comparisons? Check out, that every time in the for loop 2 comparisons are being made. Hence, the number of comparisons is 2(n+1). Can we decrease it and reduce the number of comparisons to 1 every time in the for loop? Yes, we can do this by using "sentinel".

Sentinel

Sentinels are bounds kept in a list in order to minimize runtime. Suppose, if before entering the array, we insert an X in the nth position, then it is sure that X is present in the array, and we need not check for the case that X is absent, and we have reached the array bound.

Code:

int Find (int X, int A[], int n)
{
    int i;
   
    A[n]=X;
        for (i=0; A[i]!=X; i++);
    if (i= =n)
        return -1;
        else return i;
}

This reduces the number of comparisons to n+1.

Binary Search



It is a technique of locating the position of an element in a sorted list. It looks at the middle element of the sorted list if it is equal to the value being sought then the position has been found. Otherwise if the sought value is lower than the middle element then the lower half is chosen for further searching and if it is greater then the upper half is chosen. Thus this method reduces the number of elements needed to be checked by a factor of two each time. A binary search involves continually dividing the data in half hence, binary and searching the lower or upper half as appropriate.
The steps involved in performing a binary search can be summarized as follows:

1. Start in the middle of the list.
2. Compare the search key with the item at the current location.
3. If the search key is at the current location, then you are done.
4. If the search key is less than the item at the current location, then it must be in the lower half of the data (if it exists at all) so divide the list in two and go to step 1, using the lower half.
5. Otherwise, it must be in the upper half of the list (again, if it exists at all), so go to step 1, using the upper half.

code in c:-

Code:

int binarySearch(int a[],int low,int high)
{
    if (low<=high)
    {
        mid=(low+high)/2;
        if (a[mid]<x)
            return    binarysearch(a,mid+1,high);
        else if (x<a[mid])
            return    binarysearch(a,low,mid-1);
        else return mid;
    }
    return -1;
}

Here, the runtime is equal to the runtime of the if part or the else part plus the time for test.
Runtime T(N)=T(N/2)(if or else part)
+ k(test)
which is the recurrence relation
T(N)=T(N/2)+k
T(1)=1;

T(1)=1
T(2)=1+k
T(4)=(1+k)+k=1+2k
T(8)=(1+2k)+k=1+3k
T(16)=(1+3k)+k=1+4k
| | | |
| | | |
| | | |
T(N)=1+k*log N=O(log N)
So, this is an O(log N) algorithm which is the best possible.

Binary Search Tree



Binary Search Tree is a Binary Tree in which a left descendant of a node is always lesser than the node and the right descendant is always greater than the node. This structure enables one to search for and find an element with an average running time f(N)=O(log N).
Example:-
http://www.go4expert.com/images/arti...ing/search.jpg

Algorithms:-

This is an algorithm to find the position or index of an element in a BST.
Code:

Position ind(int X, SearchTree T)
{
        if(T= =NULL)
                return NULL;
        if(X < T->Element)
                return Find(X,T->Left);
        else
        if(X > T->Element)
                return Find(X,T->Right);
        else
                return T;
}

Following is an algorithm to find the position of the minimum element in a BST.

Code:


Position FindMin(SearchTree T)
{
        if(T= =NULL)
                return NULL;
        else
        if(T->Left= =NULL)
                return T;
        else
                return FindMin(T->Left);
}        /*A recursive implementation*/

Following is an algorithm to find the position of the maximum element in a BST.

Code:

Position FindMax(SearchTree T)
{
      if(T!=NULL)
      while(T->Right!=NULL)
      T=T->Right;
      return T;
}/*A non-recursive implementation*/

Code in C For searching a particular item in a BST:-

Code:

//Structure of the binary search tree node is as shown below:-

struct treenode
{

int info;
struct treenode* left;
struct treenode* right;

}

typedef treenode TREENODE;

// Find an item

TREENODE* find(int item, TREENODE* t)
{
if(t==null) return null;
if(item< t->info) return find(item, t->left);
if(item> t->info) return find(item, t->right);
return t;
}

A Little About Breadth-First Search and Depth-First Search



Breadth-First Search:-

breadth-first search (BFS) is a graph search algorithm that begins at the root node and explores all the neighboring nodes. Then for each of those nearest nodes, it explores their unexplored neighbor nodes, and so on, until it finds the goal.The aim of BFS algorithm is to traverse the graph as close as possible to the root node. Queue is used in the implementation of the breadth first search. Lets see how BFS traversal works with respect to the following graph:

http://www.go4expert.com/images/arti...arching/s1.jpg

If we do the breadth first traversal of the above graph and print the visited node as the output, it will print the following output. A B C D E F. The BFS visits the nodes level by level, so it will start with level 0 which is the root node, and then it moves to the next levels which are B, C and D, then the last levels which are E and F.

Algorithm:-
  • Step 1: Push the root node in the Queue.
  • Step 2: Loop until the queue is empty.
  • Step 3: Remove the node from the Queue.
  • Step 4: If the removed node has unvisited child nodes, mark them as visited and insert the unvisited children in the queue.

Code in C:-
Code:

# include<stdio.h>
# define size 20
# define T 1
# define F 0

struct Edge
{
        int terminal;
        struct Edge *next;
};

struct Vertex
{
        int visit;
        int vertex_no;
        char info;
        int path_length;
        struct Edge *Edge_Ptr;
};
struct Q
{
        int info;
        struct Q *next;
};

void Table(int , int matrix [size][size], struct Vertex vert[size]);
struct Edge *Insert_Vertex (int , struct Edge *);
void BFS ( int , struct Vertex vert [size]);
void Input(int, int mat [size][size]);
void Output(int number, int mat [size][size]);
struct Q *Insert_Queue(int vertex_no, struct Q *first);
struct Q *Delete_Queue(int *vertex_no, struct Q *first);

/* Insert vertex into connectivity list */

struct Edge * Insert_Vertex (int vertex_no, struct Edge
*first)          {
        struct Edge *new1, *current;
        new1 = (struct Edge *) malloc(sizeof(struct Edge));
        new1->terminal = vertex_no;
        new1->next = NULL;
        if (!first)
                return (new1);
        for (current = first; current->next; current = current->next);
        current->next = new1;
        return (first);
}

/* Insert vertices into queue */

struct Q * Insert_Queue(int vertex_no, struct Q *first)
{
        struct Q *new1, *current;
        new1 =(struct Q *) malloc(sizeof(struct Q));
        new1->info = vertex_no;
        new1->next = NULL;
        if (!first)
                return (new1);
        for (current = first; current->next; current = current->next);
        current->next = new1;
        return (first);
}

struct  Q * Delete_Queue(int *vertex_no, struct Q *first)
{
        struct Q *previous;
        if (!first)
                return (NULL);
        *vertex_no = first->info;
        previous = first;
        first = first->next;
        free(previous);
        return (first);
}

/* Initializing entries */

void Table(int vertex_num, int matrix [size][size],
struct  Vertex vert[size])
{
        int i, j;
        for (i = 0; i < vertex_num; i++)
        {
                vert [i].visit = F;
                vert [i].vertex_no = i+1;
                vert [i].info = 'A'+ i;
                vert [i].path_length = 0;
                vert [i].Edge_Ptr = NULL;
        }

        for (i =0; i < vertex_num ; i++)
                for (j =0; j < vertex_num ; j++)
                        if (matrix [i][j] > 0 )
                                vert [i].Edge_Ptr = Insert_Vertex (j, vert [i].Edge_Ptr);
}

/* Computing path length */

void BFS ( int index, struct Vertex vert [size])
{
        struct Q  *queue = NULL;
        struct Edge *Link;
        vert [index].visit = T;
        queue = Insert_Queue(index, queue);
        while(queue)
        {
                queue = Delete_Queue(&index, queue);
                for ( Link = vert [index].Edge_Ptr; Link; Link = Link->next)
                {
                        if (vert [Link->terminal].visit == F)
                        {
                                vert[Link->terminal].visit = T;
                                vert[Link->terminal].path_length=vert[index].path_length+1;
                                queue = Insert_Queue(Link->terminal, queue);
                        }
                }
        }
}

/* Input function to read adjacency matrix */

void Input(int number, int mat [size][size])
{
        int i, j;
        printf("\n Input the adjacency matrix \n");
        for (i =0; i < number; i++)
        {
                for (j=0; j < number; j ++)
                {
                        scanf("%d", &mat [i][j]);
                }
                printf("\n");
        }
}

/* Output function to display adjacency matrix */

void Output(int number, int mat [size][size])
{
        int i, j;
        printf("\n Adjacency matrix \n");
        for (i =0; i < number; i++)
        {
                for (j=0; j < number; j ++)
                {
                        printf("  %d", mat [i][j]);
                }
                printf("\n");
        }
}

/* Function main */
void  main()
{
        int i, number, index;
        int mat [size][size];
        struct Vertex vert [size];
        struct Edge *List;
        printf("\n Input the number of vertices in the graph: ");
        scanf("%d", &number);
        Input(number, mat);
        Output(number, mat);
        Table(number, mat, vert);
        printf("\n Input the starting vertex 0- %d :",number-1);
        scanf("%d", &index);
        BFS (index, vert);
        printf("\n Path length of the vertex from %c", vert[index].info)
            ;
        printf("\n Vertex Length Vertex Connectivity \n ");
        for (i = 0; i < number; i++)
        {
                printf("\n  %c    %d  ",vert[i].info, vert[i].path_length);
                for (List= vert[i].Edge_Ptr; List; List = List->next)
                {
                        printf(" ");
                        putchar(List->terminal+'A');
                }
        }
}

Depth-First Search:-

Depth-First Search is also a graph search algorithm that progresses by expanding the first child node of the search tree that appears and thus going deeper and deeper until a goal node is found, or until it hits a node that has no children. Then the search backtracks, returning to the most recent node it hasn't finished exploring.The aim of DFS algorithm is to traverse the graph in such a way that it tries to go far from the root node. Stack is used in the implementation of the depth first search. Lets see how depth first search works with respect to the following graph:

http://www.go4expert.com/images/arti...arching/s1.jpg

As stated before, in DFS, nodes are visited by going through the depth of the tree from the starting node. If we do the depth first traversal of the above graph and print the visited node, it will be A B E F C D. DFS visits the root node and then its children nodes until it reaches the end node, i.e. E and F nodes, then moves up to the parent nodes.

Algorithm:-
  • Step 1: Push the root node in the Stack.
  • Step 2: Loop until stack is empty.
  • Step 3: Peek the node of the stack.
  • Step 4: If the node has unvisited child nodes, get the unvisited child node, mark it as traversed and push it on stack.
  • Step 5: If the node does not have any unvisited child nodes, pop the node from the stack.

Code in C:-


Code:

# include<stdio.h>
# define size 20
# define T 1
# define F 0

struct Edge
{
        int terminal;
        struct Edge *next;
};
struct Vertex
{
        int visit;
        int vertex_no;
        char info;
        int path_length;
        struct Edge *Edge_Ptr;
};
void Table(int , int matrix [size][size], struct Vertex vert[size]);
struct Edge *Insert_Vertex (int , struct Edge *);
void DFS ( int , int *dist, struct Vertex vert [size]);
void Input(int, int a [size][size]);
void Output(int, int a [size][size]);

struct Edge * Insert_Vertex (int vertex_no, struct Edge *first)
{
        struct Edge *new1, *current;
        new1 = (struct Edge *) malloc(sizeof(struct Edge));
        new1->terminal = vertex_no;
        new1->next = NULL;
        if (!first)
                return (new1);
        for (current = first; current->next; current = current->next);
        current->next = new1;
        return (first);
}

/* Initializing entries */

void Table(int vertex_num, int matrix [size][size],
struct Vertex vert[size])
{
        int i, j;
        for (i = 0; i < vertex_num; i++)
        {
                vert [i].visit = F;
                vert [i].vertex_no = i+1;
                vert [i].info = 'A'+ i;
                vert [i].path_length = 0;
                vert [i].Edge_Ptr = NULL;
        }

        for (i =0; i < vertex_num ; i++)
                for (j =0; j < vertex_num ; j++)
                        if (matrix [i][j] > 0)
                                vert [i].Edge_Ptr = Insert_Vertex (j, vert [i].Edge_Ptr);
}

/* Computing path length */
void DFS ( int index, int *dist,
struct Vertex vert [size])
{
        struct Edge *Link;
        vert [index].visit = T;
        vert [index].path_length = *dist;
        *dist += 1;
        for ( Link = vert [index].Edge_Ptr; Link; Link = Link->next)
                if (vert [Link->terminal].visit == F)
                        DFS (Link->terminal, dist, vert);
}

/* Input function to read adjacency matrix */

void Input(int number, int a [size][size])
{
        int i, j;
        printf("\n Input the adjacency matrix \n");

        for (i =0; i < number; i++)
        {
                for (j=0; j < number; j ++)
                {
                        scanf("%d", &a [i][j]);
                }
                printf("\n");
        }
}

/* Output function */
void Output(int number, int a [size][size])
{
        int i, j;
        printf("\n Adjacency matrix \n");
        for (i = 0; i < number; i++)
        {
                for (j = 0; j < number; j ++)
                {
                        printf("  %d", a [i][j]);
                }
                printf("\n");
        }
}

/* Function main */
void  main()
{
        int i;
        int number, index, dist;
        int a [size][size];
        struct Vertex vert [size];
        struct Edge *List;
        printf("\n Input the number of vertices in the graph: ");
        scanf("%d", &number);
        Input(number, a);
        Output(number, a);

        Table(number, a, vert);
        printf("\n Input the starting vertex 0- %d:", number-1);
        scanf("%d", &index);
        dist = 0;
        DFS (index, &dist, vert);
        printf("\n Path length of the vertex from %c", vert[index].info);
        printf("\n Vertex Length Vertex Connectivity \n ");
        for (i = 0; i < number; i++)
        {
                printf("\n  %c        %d  ", vert[i].info, vert[i].path_length);
                for (List= vert[i].Edge_Ptr; List; List = List->next)
                {
                        printf(" ");
                        putchar(List->terminal+'A');
                }
        }
}

(note:- For the BFS and DFS part readers are expected to know about Stack, Queue, Adjacency Matrix and other concepts related to graph. All this things are not explained in order to keep the article short).

shabbir 2Apr2010 09:11

Re: Ultimate Guide To Searching
 
If you like this this article nominate it for Article of the month - Mar 2010


All times are GMT +5.5. The time now is 17:07.