Similarity between text files

kens · Jul 17, 2007

Find the similarity between two text files. You need to define the similarity index yourself. The similarity should be content based! For example, two docs talking about tennis and cricket will have a lower similarity index than two docs both talking about the same sport.

Please suggest how to approach this problem.

DaWei · Jul 17, 2007

I realize that English might not be your first language. In that case, do not use AOL-speak guides as a suitable translator or dictionary. I would not give the sweat off my balls to a person so lazy as to use 'b' for 'be'. There are innumerable other examples in your post.

Work. Break a sweat. Expend some energy. Post some ideas or some code. Then you will receive help.

Don't post code without learning about code tags.

shabbir · Jul 18, 2007

You should compare them using the dynamic programming technique. That way you can find the similarity between two strings. Use the Needleman-Wunsch algorithm or the Smith-Waterman algorithm for sequence comparison using the dynamic programming approach.
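As a rough illustration of the dynamic programming approach, here is a minimal Python sketch of Needleman-Wunsch global alignment scoring. The scoring values (match +1, mismatch -1, gap -1), the function names, and the way the score is scaled into a 0..1 index are all assumptions made for this example, not something specified in the thread. It compares word sequences rather than characters so that it works on whole files.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of sequences a and b.

    Keeps only two rows of the DP table, so memory is O(len(b))
    while time stays O(len(a) * len(b)).
    """
    # Row 0: aligning an empty prefix of a against b costs one gap per item.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against empty b
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + step,   # align a[i-1] with b[j-1]
                            prev[j] + gap,        # gap in b
                            curr[j - 1] + gap))   # gap in a
        prev = curr
    return prev[-1]


def word_similarity(text_a, text_b):
    """Hypothetical 0..1 index: alignment score over the longer length.

    The normalization is an assumption for illustration only.
    """
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    longest = max(len(words_a), len(words_b))
    if longest == 0:
        return 1.0  # two empty texts are treated as identical
    return max(needleman_wunsch_score(words_a, words_b), 0) / longest
```

To compare two files, read each one into a string and pass both to `word_similarity`. Note that this measures how well the word order lines up, which is not the same as whether the two texts are about the same topic.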

kens · Jul 18, 2007

I will take care to post it properly next time.

The algorithms you suggested, as far as I know, are used for string matching and DNA matching, especially pairwise sequence matching. Can you also suggest some other algorithm of lesser complexity? Thanks for your help.

shabbir · Jul 18, 2007

The Needleman-Wunsch algorithm is a very simple algorithm and a basic example of dynamic programming, but your requirement, matching by content, is not that simple to analyze.
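For the content-based comparison asked about, one cheaper option that nobody in the thread names is to ignore word order and compare word-frequency vectors with cosine similarity. The sketch below is only a starting point under that assumption; the function names and the tokenizing pattern are made up for the example, and it does no stop-word removal or stemming.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its words (letters, digits, apostrophes)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the two word-count vectors, in 0..1."""
    a, b = word_counts(text_a), word_counts(text_b)
    # Only words present in both texts contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0  # an empty text matches nothing
```

This runs in time linear in the size of the two texts. Two documents about tennis will usually share more vocabulary than one about tennis and one about cricket, so they should score higher, although common words such as "the" inflate every score unless they are filtered out.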

DaWei · Jul 18, 2007

You might have a look at this discussion.
