Go4Expert

Go4Expert (http://www.go4expert.com/)
-   C++ (http://www.go4expert.com/forums/cpp/)
-   -   Intersection of two strings (http://www.go4expert.com/forums/intersection-strings-t18656/)

CrazyGal 22Jul2009 19:12

Intersection of two strings
 
Hi,

Could someone tell me if there's a better way of doing this? I was asked this question in an interview and I wrote a piece of code of O(n2).

Code:

// This function returns a new string of the common alphabets found in a and b
char* intersection(char*a, char* b)
{
  static char temp[20];
  int i = 0;
  if ( (a==NULL) || (b==NULL))
        exit(0);
  else
    {
        while ( a != NULL)
        {
              if(strchr(b, *a))
              {
                    temp[i] = *a;
                    i++;
                    a++;
              }
        } 
    }

    return temp;
}

Thanks,
Tina

CrazyGal 22Jul2009 19:14

Re: Intersection of two strings
 
Resending the query with code tags.

Intersection of two strings
Hi,

Could someone tell me if there's a better way of doing this? I was asked this question in an interview and I wrote a piece of code of O(n2).

// This function returns a new string of the common alphabets found in a and b
Quote:

char* intersection(char*a, char* b)
{
static char temp[20];
int i = 0;
if ( (a==NULL) || (b==NULL))
exit(0);
else
{
while ( a != NULL)
{
if(strchr(b, *a))
{
temp[i] = *a;
i++;
a++;
}
}
}

return temp;
}
Thanks,
Tina

SaswatPadhi 22Jul2009 20:30

Re: Intersection of two strings
 
I think this will work too :
Code: C++

#include <algorithm>
#include <iostream>
#include <iterator>
#include <set>
#include <string>

using namespace std;

int main()
{
    string a = "abcdk57";
    string b = "abcdek5";
    string c;

    set_intersection(
        a.begin(), a.end(),
        b.begin(), b.end(),
        back_inserter(c));

    cout << c << '\n';
}


CrazyGal 23Jul2009 18:10

Re: Intersection of two strings
 
Yea, true. But could there be some way which doesn't use the standard calls?

Thanks,
Tina

SaswatPadhi 23Jul2009 20:26

Re: Intersection of two strings
 
Hi Tina,

So, you want it without using STL calls.
Now, I have quite a few things to discuss :

(1) What exactly do you want your func to return ?? Please answer the example :
Let StringA = "THIS IS COOL"
And String B = "THIS IS NOT COOL"
What do you want your func to return ??

I think if you want to return "Intersection", you should return : "CHILOST" (only the common characters). But, you code (though has a small error) returns "THIS IS COOL".

(2) The second thing I wanna say is, the error in your code :

Look carefully :

Code:

char* intersection(char*a, char* b)
{
    static char temp[20];
    int i = 0;
    if ( (a==NULL) || (b==NULL))
        exit(0);
    else
    {
        while ( a != NULL)
        {
            if(strchr(b, *a))
            {
                temp[i] = *a;
                i++;
                a++; // here is the error
            }
        }
    }
    return temp;
}

You see the error ? When a character in a is not found in b, the program goes into an infinite loop !! So, the correct code will be :


Code: C++

char* intersection(char*a, char* b)
{
    static char temp[20];
    int i = 0;
    if ( (a==NULL) || (b==NULL))
        exit(0);
    else
    {
        while ( a != NULL)
        {
            if(strchr(b, *a))
            {
                temp[i] = *a;
                i++;
            }
            a++;
        }
    }
    return temp;
}


(3) The third thing, I think I have a better code .. (probably O(m+n))

Code: C++

#include <cstring>
#include <stdio.h>
#include <ctime>

char* Intersection_1(char* a, char* b)
{
    static char temp[300];
    int i = 0;
    if ( (a==NULL) || (b==NULL))
        return NULL;
    else
    {
        while ( *a != '\0')
        {
            if (strchr(b, *a))
            {
                temp[i] = *a;
                i++;
            }
            a++;
        }
    }

    return temp;
}

char* Intersection_2(char* StringA, char* StringB)
{
    static char C[300];
    char* pC = C;
    static char Table[2][200];
    int EndOfA = strchr(StringA, '\0')-StringA;
    int EndOfB = strchr(StringB, '\0')-StringB;

    for (int i=0; i < EndOfA; ++i)   (*(*Table+StringA[i]))++;
    for (int i=0; i < EndOfB; ++i)   (*(*(Table+1)+StringB[i]))++;

    for (int i=32; i < 127; ++i)
    {
        if ( (*(*Table+i)) && (*(*(Table+1)+i)) )
            *(pC++) = i;
    }
    *pC = '\0';

    return C;
}


int main()
{
    clock_t t1, t2;

    char A[200] = "THIS PROGRAM IS WRITTEN BY SASWAT PADHI. THIS IS THE FIRST TEST STRING. *&%$*%#*$%#(%^)(UFSUDEIRTGITuffifiiiugiu length = 125";
    char B[200] = "SGSYEDBSDJDISYD#**EFHHWSKDLJD_))@#**** RANDOM DATA IN THIS STRING **** YEOYRHWEJQPWEI[QIWEQJWDBKDFB,SDBLHSYWERYWOYR8423RH LENGTH = 134";
    char* C;
    float diff;

    t1 = clock();
    for (int i = 0; i < 100000; ++i)       C = Intersection_1(A,B);
    t2 = clock();
    diff = ((float)t2 - (float)t1) / 1000.0F;
    printf("\nC = \"%s\"\nAlgo 1 --> Running Time for 100000 iterations : %lf sec\n", C, diff);


    t1 = clock();
    for (int i = 0; i < 100000; ++i)       C = Intersection_2(A,B);
    t2 = clock();
    diff = ((float)t2 - (float)t1) / 1000.0F;
    printf("\n\nC = \"%s\"\nAlgo 2 --> Running Time for 100000 iterations : %lf sec\n", C, diff);

    return 0;
}


The program compares two LONG strings 1 lakh (10^5) times. The output :

Code:

C = "THIS PROGRAM IS WRITTEN BY SASWAT PADHI THIS IS THE FIRST TEST STRING **#*#
)FSDEIRTGIT  = 12"
Algo 1 (Your Algo)--> Running Time for 100000 iterations : 1.218000 sec


C = ")*12=ABDFGHIMNOPRSTWY"
Algo 2 (My Algo)--> Running Time for 100000 iterations : 0.344000 sec


CrazyGal 23Jul2009 20:55

Re: Intersection of two strings
 
Hi,

Thanks for your response. Its a good one. I see you're trying to build a histogram. Yes, I need to print only the common characters. So, we need to have a huge array capable of holding all the ASCII characters. Doesn't it take more memory? Or it's something we need to sacrifice to getter better performance?

Please let me know.

Thanks for writing to me.

Tina

xpi0t0s 23Jul2009 21:16

Re: Intersection of two strings
 
It's not that huge, ASCII only constitutes 128 characters, or 256 if you count all 8-bit possibilities of extended ASCII.

There is often a space/performance tradeoff in optimising code. You can do stuff really quick and take up loads of memory, or you can do stuff in virtually no memory and take ages, and there are often points in between.

So you can reduce the histogram from Saswat's 400 chars to just 32 bytes as follows:
set all 32 bytes to zero
scan the first string and for each character present, take the character code and split it in two: the lower 3 bits and the upper 5 bits.
The upper 5 bits divided by 8 gives you the index into the 32-byte array.
The lower 3 bits identify one of the 8 bits in that byte.
Set the relevant bit (1 << that_3_bit_number) to 1.

Scan the second string and again split the character code into two parts (5 and 3 bits).
Check the relevant bit and if it's 1 then you have a duplicate character - output it.

This prints duplicates, of course, so if string 1 is THIS IS COOL and string 2 is XPIOTOS then the program output will be IOTOS - the duplicate O's may be a problem. If not then this is your solution.

If it is then you can have a second array of 32 bytes, and when you parse the 2nd string update the 2nd array in the same way as when you parsed the 1st string.
To pick out updates scan both sets of 256 bits and where there is a 1 in both, you have a common character. Output then would be IOST (in some order).

Or you could just have a single array of 256 ints, initially zero, scan the first string, set relevant flag to 1, scan the 2nd string and shift the relevant flag left by 1. Then scan the ints and where a flag is 2 you have a character that was in both strings.

mayjune 24Jul2009 02:47

Re: Intersection of two strings
 
@Crazygal
Quote:

I was asked this question in an interview and I wrote a piece of code of O(n2).
how's it n^2? As theres only one while loop, no nested loops nothing... how can it be n^2 then?
Please clarify....

@saswat
Quote:

for (int i=37; i < 130; ++i)
why is the iteration from 37 to 130? I didn't get it.... is it just for this particular string? or otherwise..?

SaswatPadhi 24Jul2009 05:52

Re: Intersection of two strings
 
2:47 AM !?? Don't you sleep mayjune ?? :p

CrazyGal is right, it's O(n^2), but it would be better to say it's O(n*m).
You see the use of strchr(), which searches for a character inside string b. It performs 'm' operations : matches the character with each char of string b.
As strchr() is executed in each iteration of the "while" loop, total number of operations is n*m.
Hence, the time complexity is O(n*m). :)

Oops, the loop should be from 32 to 127. (Edited my post)
The idea is to scan the range of "printable" ASCII chars only ;)

CrazyGal 24Jul2009 09:22

Re: Intersection of two strings
 
Hi,

Thanks for your response xpi0t0s. It seems little complicated for me to get what you're trying to say. I'll try to code your algorithm and figure out if I could understand the logic.

Thanks again.

Tina


All times are GMT +5.5. The time now is 21:44.