XML parsing in Perl

Discussion in 'Perl' started by pradeep, May 19, 2006.

  1. pradeep

    As the world is fast becoming aware of the benefits of XML, Perl developers will also want to use XML in their CGI-Perl scripts. XML parsing looks like one hell of a job when you first see the XML::Parser module, but XML::Simple comes to the rescue with its ease of use.

    Installing XML::Simple

    XML::Simple works by parsing an XML file and returning the data within it as a Perl hash reference. Within this hash, the element names from the original XML file play the role of keys, and the character data between the tags takes the role of values. Once XML::Simple has processed an XML file, its content can be retrieved using standard Perl hash-reference notation.

    It can be installed from the CPAN shell. Start the shell, then run the install command at its prompt:

    Code:
    shell> perl -MCPAN -e shell
    cpan> install XML::Simple
    Basic XML Parsing

    Once you've got the module installed, create the following XML file and call it "data.xml":

    Code:
    <?xml version='1.0'?>
    <employee>
        <name>Pradeep</name>
        <age>23</age>
        <sex>M</sex>
        <department>Programming</department>
    </employee>
    And then type out the following Perl script, which parses it using the XML::Simple module:

    Code:
    #!/usr/bin/perl
    use strict;
    use warnings;

    # use modules
    use XML::Simple;
    use Data::Dumper;

    # create object
    my $xml = XML::Simple->new;

    # read XML file
    my $data = $xml->XMLin("data.xml");

    # print output
    print Dumper($data);
    When you run this script, here's what you'll see:

    Code:
    $VAR1 = {
              'department' => 'Programming',
              'name' => 'Pradeep',
              'sex' => 'M',
              'age' => '23'
            };
    As you can see, each element and its associated content has been converted into a key-value pair of a Perl associative array. You can now access the XML data as in the following revision of the script above:

    Code:
    #!/usr/bin/perl
    use strict;
    use warnings;

    # use module
    use XML::Simple;

    # create object
    my $xml = XML::Simple->new;

    # read XML file
    my $data = $xml->XMLin("data.xml");

    # access XML data
    print "$data->{name} is $data->{age} years old and works in the $data->{department} section\n";
    Here's the output:

    Code:
    Pradeep is 23 years old and works in the Programming section
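Both scripts above assume that data.xml exists and is well-formed; XMLin dies otherwise. Here is a minimal sketch of trapping that failure with eval. The helper name load_employee and the inline XML strings are made up for illustration (XMLin accepts a string of XML as well as a filename).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical helper, not part of XML::Simple: returns the parsed
# hash reference, or undef if XMLin died while reading the input.
sub load_employee {
    my ($source) = @_;
    my $data = eval { XMLin($source) };
    return $data;
}

# A well-formed document and one with an unclosed <name> tag
my $good = load_employee('<employee><name>Pradeep</name><age>23</age></employee>');
my $bad  = load_employee('<employee><name>Pradeep</employee>');

print defined $good ? "Parsed: $good->{name}, $good->{age}\n" : "Parse failed\n";
print defined $bad  ? "Unexpectedly parsed bad input\n"       : "Malformed input rejected\n";
```

After a failed call, the parser's error message is available in $@ if you want to log it.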
    XML::Simple can help you achieve more complex parsing, which we'll look at some other day. Till then, happy parsing.
     
  2. tarunt

    Hi pradeep,

    Please provide an example of using XML::SAX.
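For reference, a rough sketch of what XML::SAX usage can look like, assuming the XML::SAX distribution is installed. The class name EmployeeHandler and the inline XML string are invented for this example; the handler collects the text of each element into a hash as the parser reports events.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical handler class; only the callback names come from XML::SAX
package EmployeeHandler;
use base 'XML::SAX::Base';

# Called for every opening tag: remember which element we are inside
sub start_element {
    my ($self, $element) = @_;
    $self->{current} = $element->{Name};
}

# Called for character data, possibly in several chunks per element
sub characters {
    my ($self, $chars) = @_;
    return unless defined $self->{current};
    $self->{fields}{ $self->{current} } .= $chars->{Data};
}

# Called for every closing tag: we are no longer inside that element
sub end_element {
    my ($self, $element) = @_;
    $self->{current} = undef;
}

package main;
use XML::SAX::ParserFactory;

my $handler = EmployeeHandler->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_string('<employee><name>Pradeep</name><age>23</age></employee>');

my $fields = $handler->{fields};
print "$fields->{name} is $fields->{age} years old\n";
```

To read from a file instead of a string, the parser also offers parse_uri.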
     
  3. bharatbsharma

    I am using this CPAN module and it is really useful, but I have a query.

    My XML file looks like this:
    --------------------------------------------
    <mac>
    <calls>
    <scallPs>83234</scallPs>
    <sreadPs>7462</sreadPs>
    <swritPs>7394</swritPs>

    </calls>

    <cpu>
    <usr>10</usr>
    <sys>3</sys>
    <wio>0</wio>
    <idle>87</idle>
    </cpu>
    </mac>
    ------------------------------------------
    I can print an individual value with $data->{cpu}->{idle}.

    But how can I find the number of elements in <calls> or <cpu>, which is 3 and 4 respectively?
    And how can I find the number of child elements of <mac>, which is 2, namely <calls> and <cpu>?

    Thanks in advance
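One possible approach, sketched under the assumption that the file is parsed with XMLin's default options: each element with children becomes a nested hash, so counting its child elements comes down to counting hash keys. The XML is inlined here only to keep the sketch self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Copy of the <mac> document from the post above
my $xml_text = <<'XML';
<mac>
<calls>
<scallPs>83234</scallPs>
<sreadPs>7462</sreadPs>
<swritPs>7394</swritPs>
</calls>
<cpu>
<usr>10</usr>
<sys>3</sys>
<wio>0</wio>
<idle>87</idle>
</cpu>
</mac>
XML

# By default the root element <mac> is dropped and its children
# become the top-level keys of the returned hash reference.
my $data = XMLin($xml_text);

# keys in scalar context gives the number of entries in a hash
my $mac_count   = scalar keys %$data;
my $calls_count = scalar keys %{ $data->{calls} };
my $cpu_count   = scalar keys %{ $data->{cpu} };

print "mac has $mac_count child elements\n";
print "calls has $calls_count child elements\n";
print "cpu has $cpu_count child elements\n";
```

This only counts distinct element names. If an element repeats, XML::Simple stores it as an array reference under a single key, and attributes also show up as keys, so the count would need adjusting for such documents.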
     
  4. amangupta14

    I need to parse a huge XML file; please let me know which module to use.

    The XML file looks something like this:
    Code:
    <datapoint><name>CMS</name><pid>2416</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><CPU>2.48415111588068</CPU><CPUTime>47500</CPUTime><VirtualBytes>338404</VirtualBytes><PrivateBytes>95096</PrivateBytes><HandleCount>2678</HandleCount><Threads>159</Threads> </datapoint>
    <datapoint><name>java</name><pid>420</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><CPU>0</CPU><CPUTime>19656.25</CPUTime><VirtualBytes>493860</VirtualBytes><PrivateBytes>115920</PrivateBytes><HandleCount>1691</HandleCount><Threads>93</Threads> </datapoint>
    <datapoint><name>java</name><pid>4880</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><CPU>4.96830223176136</CPU><CPUTime>333437.5</CPUTime><VirtualBytes>589440</VirtualBytes><PrivateBytes>206056</PrivateBytes><HandleCount>1934</HandleCount><Threads>93</Threads> </datapoint>
    <datapoint><name>java</name><pid>7280</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><CPU>1.55259444742543</CPU><CPUTime>305546.875</CPUTime><VirtualBytes>819052</VirtualBytes><PrivateBytes>476528</PrivateBytes><HandleCount>6101</HandleCount><Threads>72</Threads> </datapoint>
    <datapoint><name>java</name><pid>3048</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><CPU>0.931556668455255</CPU><CPUTime>1125</CPUTime><VirtualBytes>196352</VirtualBytes><PrivateBytes>19536</PrivateBytes><HandleCount>509</HandleCount><Threads>19</Threads> </datapoint>
    <datapoint><name>java</name><pid>2752</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><CPU>0.310518889485085</CPU><CPUTime>250</CPUTime><VirtualBytes>190936</VirtualBytes><PrivateBytes>14520</PrivateBytes><HandleCount>337</HandleCount><Threads>14</Threads> </datapoint>
    <datapoint><name>Disk (0 C:)</name><pid>0</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><DiskRead_KBPerSec>0.00</DiskRead_KBPerSec><DiskWrite_KBPerSec>28.05</DiskWrite_KBPerSec><DiskBusy_Percent>2.79</DiskBusy_Percent><DiskRead_Percent>0.00</DiskRead_Percent><DiskWrite_Percent>2.79</DiskWrite_Percent><DiskIdle_Percent>97.75</DiskIdle_Percent> </datapoint>
    <datapoint><name>Disk (_Total)</name><pid>0</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><DiskRead_KBPerSec>0.00</DiskRead_KBPerSec><DiskWrite_KBPerSec>28.05</DiskWrite_KBPerSec><DiskBusy_Percent>2.79</DiskBusy_Percent><DiskRead_Percent>0.00</DiskRead_Percent><DiskWrite_Percent>2.79</DiskWrite_Percent><DiskIdle_Percent>97.75</DiskIdle_Percent> </datapoint>
    <datapoint><name>Network (Broadcom NetXtreme Gigabit Fiber)</name><pid>0</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><NetworkReceived_KBPerSec>26.50</NetworkReceived_KBPerSec><NetworkSent_KBPerSec>103.43</NetworkSent_KBPerSec> </datapoint>
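XML::Simple builds the whole document in memory, which may not suit a huge file. One option is an event-driven parser such as XML::Parser (mentioned in the first post), which lets you handle one record at a time. The sketch below assumes the records are wrapped in a single root element; the name datapoints is invented here, since the sample as posted has several top-level elements and an XML parser would reject it. The sample data is trimmed to two records with three fields each.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Assumption: a single root element wraps the records.
# <datapoints> is a made-up name for this sketch.
my $xml_text = <<'XML';
<datapoints>
<datapoint><name>CMS</name><pid>2416</pid><time>5/10/2010 10:51:50</time> </datapoint>
<datapoint><name>java</name><pid>420</pid><time>5/10/2010 10:51:50</time> </datapoint>
</datapoints>
XML

my %record;        # fields of the record currently being read
my $field;         # name of the field we are inside, if any
my $count = 0;     # number of complete records seen

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $element) = @_;
            if ($element eq 'datapoint') {
                %record = ();                 # begin a fresh record
            }
            elsif ($element ne 'datapoints') {
                $field = $element;            # a field inside the record
                $record{$field} = '';
            }
        },
        Char => sub {
            my ($expat, $text) = @_;
            $record{$field} .= $text if defined $field;
        },
        End => sub {
            my ($expat, $element) = @_;
            if ($element eq 'datapoint') {
                $count++;
                print "$record{name} (pid $record{pid}) at $record{time}\n";
            }
            $field = undef;
        },
    },
);

$parser->parse($xml_text);
print "Processed $count records\n";
```

For a real file, calling parsefile with the file name in place of parse reads from disk, and only the current record is held in %record at any moment.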
     
    Last edited by a moderator: May 26, 2010
  5. rajseo

    Nice and great suggestion thanks for sharing....
     
