
string search

Discussion in 'Perl' started by abhisheksainiabhishe, Jun 10, 2008.

  1. abhisheksainiabhishe

    abhisheksainiabhishe New Member

    Joined:
    Jun 10, 2008
    Messages:
    7
    Likes Received:
    0
    Trophy Points:
    0
    I am writing a Perl program that should do the following.

    For example, if I have an HTML file like this:

    <b>this is bold.</b>This is
    bold too</b>

    I have to write a program (without using any HTML parser functions) that prints it like this:

    <b>this is bold.This is bold too</b>

    Basically, it should remove unnecessary tags.

    I can only use regular expressions for it.

    My instructor advised me not to read the HTML file line by line, because that would not handle a tag whose opening tag is on one line and whose closing tag is on a later line (as in the file above). It was suggested that I put the whole HTML file into one scalar variable.
    My program now reads the whole HTML file into one scalar variable. My question is: how do I search for multiple instances of <b> and </b> tags in that scalar? Should I read it character by character? I am very confused about this part. Please advise. Thanks!
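
One possible way to find every `<b>` and `</b>` in a single scalar, without stepping through it character by character, is a regex match with the `/g` modifier. The sketch below is only an illustration: the sample string and the variable names (`$html`, `@tags`, `@offsets`) are assumptions, not taken from the original program.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the slurped file contents. In a real script the
# whole file could be read at once with: my $html = do { local $/; <$fh> };
my $html = "<b>this is bold.</b><b>This is\nbold too</b>";

# In list context, a /g match with no capture groups returns every match.
my @tags = $html =~ m{</?b>}gi;
print "found: @tags\n";

# In scalar context, /g steps through the matches one at a time, and
# pos() reports the offset just past the current match.
my @offsets;
while ($html =~ m{</?b>}gi) {
    push @offsets, pos($html);
}
print "match end offsets: @offsets\n";
```

Because the whole document is in one string, a tag pair that spans a line break is found the same way as one on a single line.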
     
  2. abhisheksainiabhishe

    Hi,

    So far I am able to merge the bold tags, turning

    <b>abcd</b>efgh<b>ijkl</b>

    to

    <b>abcdefghijkl</b>

    by using:
    $allHtmlDocument =~ s/$endBoldTag(\s*)$startBoldTag//gi;
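
For reference, here is a minimal runnable sketch of that substitution. The post does not show how `$startBoldTag` and `$endBoldTag` are set, so the values below are assumptions, as is the sample input. Note that this pattern only merges a `</b>` and `<b>` that have nothing but whitespace between them.

```perl
use strict;
use warnings;

# Assumed definitions; the post does not show how these are set.
my $startBoldTag = '<b>';
my $endBoldTag   = '</b>';

# Hypothetical sample input with adjacent bold runs, one split across lines.
my $allHtmlDocument = "<b>abcd</b>\n<b>efgh</b> <B>ijkl</B>";

# \Q...\E quotes any regex metacharacters in the interpolated tags.
# \s* also matches newlines, so pairs split across lines are merged.
$allHtmlDocument =~ s/\Q$endBoldTag\E\s*\Q$startBoldTag\E//gi;

print "$allHtmlDocument\n";   # prints "<b>abcdefghijkl</B>"
```

This version drops the whitespace between the two tags. Capturing it as `(\s*)` and using `$1` as the replacement would keep it instead.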

    Now the problem is this:

    If I have <b>abcd</b><i><b>efgh</i></b>

    and I want to turn it into

    <b>abcd<i>efgh</i></b>


    then I still need to remove the inner bold tags (since there are only tags between them), but I also need to keep the tags that sit between them. How would I capture those tags? I cannot figure out a way, since I am not reading the document line by line.
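
One way this might be approached is to capture everything between `</b>` and `<b>` in a group, allow only whitespace and non-bold tags inside that group, and put the captured text back as the replacement. This is a sketch under those assumptions, not the solution the poster eventually found; the variable name `$html` is hypothetical.

```perl
use strict;
use warnings;

my $html = '<b>abcd</b><i><b>efgh</i></b>';

# Between </b> and <b>, allow only whitespace and tags that are not
# <b> or </b>. The negative lookahead (?!/?b>) rejects plain bold tags.
# Group 1 holds the intervening tags, and $1 puts them back.
$html =~ s{</b>((?:\s*<(?!/?b>)[^>]*>)*\s*)<b>}{$1}gi;

print "$html\n";   # prints "<b>abcd<i>efgh</i></b>"
```

Any text between the two bold tags stops the match, so runs separated by non-bold text are left alone. A bold tag written with attributes, such as `<b class="x">`, is not recognized as a bold tag by this pattern.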

    Thanks!
     
  3. abhisheksainiabhishe

    I figured this one out too. I will post the solution sometime.

    Thanks anyway!

    This thread can be closed now.
     
