kens 18Jul2007 01:17

Similarity bet. text files
Find the similarity between two text files. The similarity index needs to b defined by yrself. The similarity shud b content based! for example two docs talki abt tennis n cricket will hv a lower similarity index than both talkin abt the sam sport.

Please suggest how to approach this problem.

DaWei 18Jul2007 03:08

Re: Similarity bet. text files
I realize that English might not be your first language. In that case, do not use AOL-speak guides as a suitable translator or dictionary. I would not give the sweat off my balls to a person so lazy as to use 'b' for 'be'. There are innumerable other examples in your post.

Work. Break a sweat. Expend some energy. Post some ideas or some code. Then you will receive help.

Don't post code without learning about code tags.

shabbir 18Jul2007 09:07

Re: Similarity bet. text files
You should compare the files using dynamic programming. That way you can find the similarity between two strings. Use the Needleman-Wunsch algorithm or the Smith-Waterman algorithm; both do sequence comparison with a dynamic programming approach.
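
A minimal sketch of the Needleman-Wunsch score computation in C follows, to show the shape of the dynamic programming recurrence. The function name nw_score and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration and are not from this thread. It compares characters and returns only the global alignment score, not the alignment itself, and it measures sequence similarity rather than topic.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative scoring values (assumptions, not taken from the thread). */
#define MATCH     1
#define MISMATCH -1
#define GAP      -1

static int max3(int a, int b, int c)
{
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/*
 * Needleman-Wunsch global alignment score of strings a and b.
 * Only two rows of the DP table are kept, so memory is O(strlen(b)).
 * Time is O(strlen(a) * strlen(b)).
 * Returns 0 and stores the score in *score, or -1 if allocation fails.
 */
int nw_score(const char *a, const char *b, int *score)
{
    assert(a != NULL && b != NULL && score != NULL);

    size_t n = strlen(a), m = strlen(b);
    int *prev = malloc((m + 1) * sizeof *prev);
    int *curr = malloc((m + 1) * sizeof *curr);
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return -1;
    }

    /* Row 0: aligning an empty prefix of a against j characters of b. */
    for (size_t j = 0; j <= m; j++)
        prev[j] = (int)j * GAP;

    for (size_t i = 1; i <= n; i++) {
        /* Column 0: i characters of a against an empty prefix of b. */
        curr[0] = (int)i * GAP;
        for (size_t j = 1; j <= m; j++) {
            int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? MATCH : MISMATCH);
            int up   = prev[j] + GAP;      /* gap inserted in b */
            int left = curr[j - 1] + GAP;  /* gap inserted in a */
            curr[j] = max3(diag, up, left);
        }
        /* The row just filled becomes the previous row. */
        int *tmp = prev;
        prev = curr;
        curr = tmp;
    }

    *score = prev[m];
    free(prev);
    free(curr);
    return 0;
}
```

With these values, nw_score("abc", "abd", &s) stores 1 in s: two matches and one mismatch. To compare text files, one option is to read each file into a buffer and pass the buffers in, or to adapt the comparison to work on word tokens instead of characters.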

kens 18Jul2007 13:19

Re: Similarity bet. text files
I will take care to post it properly next time.

The algorithms you suggested, as far as I know, are used for string matching and DNA matching, especially pairwise sequence matching. Can you also suggest some other algorithm of lower complexity? Thanks for your help.

shabbir 18Jul2007 14:04

Re: Similarity bet. text files
The Needleman-Wunsch algorithm is very simple and is a basic example of dynamic programming, but your requirement of matching what the documents talk about is not that simple to analyze.

DaWei 18Jul2007 17:37

Re: Similarity bet. text files
You might have a look at this discussion.

All times are GMT +5.5.