Go4Expert


pradeep 19May2006 18:19

XML parsing in Perl
 
As the world becomes aware of the benefits of XML, Perl developers will want to use XML in their CGI scripts too. XML parsing looks like a daunting job when you first see the XML::Parser module, but XML::Simple comes to the rescue with its ease of use.

Installing XML::Simple

XML::Simple works by parsing an XML file and returning the data within it as a Perl hash reference. Within this hash, element names from the original XML file play the role of keys, and the character data between the tags takes the role of values. Once XML::Simple has processed an XML file, its content can be retrieved using standard Perl hash notation.

It can be installed from the CPAN shell. Start the shell, then run the install command at its prompt:

Code:

shell> perl -MCPAN -e shell
cpan> install XML::Simple
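
To confirm that the installation worked, a small check along these lines can be run. This is only a sketch: the helper name module_available is made up for illustration, and it relies on nothing beyond Perl's built-in require.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Simple): returns 1 if the named
# module can be loaded by this Perl installation, 0 otherwise.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # XML::Simple -> XML/Simple.pm
    return eval { require $file; 1 } ? 1 : 0;
}

if (module_available('XML::Simple')) {
    print "XML::Simple is installed\n";
} else {
    print "XML::Simple is not installed\n";
}
```

If the script reports that the module is missing, repeat the install step and watch the CPAN shell output for errors.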

Basic XML Parsing

Once you've got the module installed, create the following XML file and call it "data.xml":

Code: XML

<?xml version='1.0'?>
<employee>
    <name>Pradeep</name>
    <age>23</age>
    <sex>M</sex>
    <department>Programming</department>
</employee>

And then type out the following Perl script, which parses it using the XML::Simple module:

Code: Perl

#!/usr/bin/perl
use strict;
use warnings;

# use modules
use XML::Simple;
use Data::Dumper;

# create object
my $xml = XML::Simple->new;

# read XML file
my $data = $xml->XMLin("data.xml");

# print output
print Dumper($data);


When you run this script, here's what you'll see (the order of the keys may differ, since Perl hashes are unordered):

Code:

$VAR1 = {
                    'department' => 'Programming',
                    'name' => 'Pradeep',
                    'sex' => 'M',
                    'age' => '23'
                  };

As you can see, each element and its associated content has been converted into a key-value pair in a Perl hash (associative array). You can now access the XML data through the hash reference, as in the following revision of the script above:

Code: Perl

#!/usr/bin/perl
use strict;
use warnings;

# use module
use XML::Simple;

# create object
my $xml = XML::Simple->new;

# read XML file
my $data = $xml->XMLin("data.xml");

# access XML data
print "$data->{name} is $data->{age} years old and works in the $data->{department} section\n";

Here's the output:

Code:

Pradeep is 23 years old and works in the Programming section

XML::Simple can handle more complex parsing as well, which we'll look at some other day. Until then, happy parsing.
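
As a small taste of what more complex parsing can look like, here is a hedged sketch with a repeated element. The <staff> document and the second employee are made up for illustration. ForceArray and KeyAttr are XML::Simple options; the XML is passed to XMLin as a string instead of a file name so the snippet stands alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Made-up sample document with a repeated <employee> element.
my $xml_text = <<'XML';
<staff>
    <employee>
        <name>Pradeep</name>
        <department>Programming</department>
    </employee>
    <employee>
        <name>Asha</name>
        <department>Testing</department>
    </employee>
</staff>
XML

# ForceArray => ['employee'] keeps <employee> as an array reference even when
# the document holds only one; KeyAttr => [] turns off folding on name/key/id.
my $data = XMLin($xml_text, ForceArray => ['employee'], KeyAttr => []);

for my $employee (@{ $data->{employee} }) {
    print "$employee->{name} works in $employee->{department}\n";
}
```

Without KeyAttr => [], XML::Simple would fold the employees into a hash keyed on the name value, which is handy for lookups but changes the shape of the data.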

tarunt 5Feb2010 11:06

Re: XML parsing in Perl
 
Hi Pradeep,

Please provide an example of using XML::SAX.

bharatbsharma 7May2010 11:54

Re: XML parsing in Perl
 
I am using this CPAN module and it is really useful, but I have a question.

My XML file looks like this:
--------------------------------------------
<mac>
<calls>
<scallPs>83234</scallPs>
<sreadPs>7462</sreadPs>
<swritPs>7394</swritPs>

</calls>

<cpu>
<usr>10</usr>
<sys>3</sys>
<wio>0</wio>
<idle>87</idle>
</cpu>
</mac>
------------------------------------------
I can print an individual value with $data->{cpu}->{idle}.

But how can I find the number of elements in <calls> or <cpu>, which is 3 and 4 respectively? And how can I find the number of elements in <mac>, which is 2, namely <cpu> and <calls>?

Thanks in advance
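
One possible approach to the counting question, as a sketch: in scalar context, keys returns the number of keys in a hash, so it can be applied to the hash reference at each level. The structure below is hard-coded to mirror what XMLin would be expected to return for the <mac> document above (an assumption, made so the snippet runs without the file); with real data, $data would come from XMLin.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hard-coded stand-in for the result of XMLin on the <mac> document.
my $data = {
    calls => { scallPs => 83234, sreadPs => 7462, swritPs => 7394 },
    cpu   => { usr => 10, sys => 3, wio => 0, idle => 87 },
};

# keys in scalar context yields the number of keys in the hash.
my $top_count   = scalar keys %$data;
my $calls_count = scalar keys %{ $data->{calls} };
my $cpu_count   = scalar keys %{ $data->{cpu} };

print "mac: $top_count, calls: $calls_count, cpu: $cpu_count\n";
```

This counts distinct child element names. If an element name repeats, XML::Simple stores the repeats as an array reference under a single key, so the array length would have to be counted as well.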

amangupta14 26May2010 18:43

Re: XML parsing in Perl
 
I need to parse a huge XML file. Please let me know which module to use for this.

The XML file looks something like this:
Code:

<datapoint><name>CMS</name><pid>2416</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><CPU>2.48415111588068</CPU><CPUTime>47500</CPUTime><VirtualBytes>338404</VirtualBytes><PrivateBytes>95096</PrivateBytes><HandleCount>2678</HandleCount><Threads>159</Threads> </datapoint>
<datapoint><name>java</name><pid>420</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><CPU>0</CPU><CPUTime>19656.25</CPUTime><VirtualBytes>493860</VirtualBytes><PrivateBytes>115920</PrivateBytes><HandleCount>1691</HandleCount><Threads>93</Threads> </datapoint>
<datapoint><name>java</name><pid>4880</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><CPU>4.96830223176136</CPU><CPUTime>333437.5</CPUTime><VirtualBytes>589440</VirtualBytes><PrivateBytes>206056</PrivateBytes><HandleCount>1934</HandleCount><Threads>93</Threads> </datapoint>
<datapoint><name>java</name><pid>7280</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><CPU>1.55259444742543</CPU><CPUTime>305546.875</CPUTime><VirtualBytes>819052</VirtualBytes><PrivateBytes>476528</PrivateBytes><HandleCount>6101</HandleCount><Threads>72</Threads> </datapoint>
<datapoint><name>java</name><pid>3048</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><CPU>0.931556668455255</CPU><CPUTime>1125</CPUTime><VirtualBytes>196352</VirtualBytes><PrivateBytes>19536</PrivateBytes><HandleCount>509</HandleCount><Threads>19</Threads> </datapoint>
<datapoint><name>java</name><pid>2752</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><CPU>0.310518889485085</CPU><CPUTime>250</CPUTime><VirtualBytes>190936</VirtualBytes><PrivateBytes>14520</PrivateBytes><HandleCount>337</HandleCount><Threads>14</Threads> </datapoint>
<datapoint><name>Disk (0 C:)</name><pid>0</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><DiskRead_KBPerSec>0.00</DiskRead_KBPerSec><DiskWrite_KBPerSec>28.05</DiskWrite_KBPerSec><DiskBusy_Percent>2.79</DiskBusy_Percent><DiskRead_Percent>0.00</DiskRead_Percent><DiskWrite_Percent>2.79</DiskWrite_Percent><DiskIdle_Percent>97.75</DiskIdle_Percent> </datapoint>
<datapoint><name>Disk (_Total)</name><pid>0</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><DiskRead_KBPerSec>0.00</DiskRead_KBPerSec><DiskWrite_KBPerSec>28.05</DiskWrite_KBPerSec><DiskBusy_Percent>2.79</DiskBusy_Percent><DiskRead_Percent>0.00</DiskRead_Percent><DiskWrite_Percent>2.79</DiskWrite_Percent><DiskIdle_Percent>97.75</DiskIdle_Percent> </datapoint>
<datapoint><name>Network (Broadcom NetXtreme Gigabit Fiber)</name><pid>0</pid><time>5/10/2010 10:51:50</time><machine>DEWDFTF11382S</machine><NetworkReceived_KBPerSec>26.50</NetworkReceived_KBPerSec><NetworkSent_KBPerSec>103.43</NetworkSent_KBPerSec> </datapoint>
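
For a file too large to hold in memory, one option is the event-based XML::Parser module mentioned in the first post, which hands over one element at a time instead of building the whole structure. The following is a rough sketch, not a tuned solution. It assumes the datapoint elements are wrapped in a single root element (called <data> here, a made-up name), since a well-formed XML document needs exactly one root, and it uses a shortened stand-in for the real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Shortened stand-in for the real file; the <data> root is an assumption.
my $xml_text = <<'XML';
<data>
<datapoint><name>CMS</name><pid>2416</pid><Threads>159</Threads></datapoint>
<datapoint><name>java</name><pid>420</pid><Threads>93</Threads></datapoint>
</data>
XML

my %current;      # fields of the datapoint being read
my $field;        # name of the child element currently open, if any
my $count = 0;    # number of datapoints handled so far

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $element) = @_;
            if ($element eq 'datapoint') { %current = (); $field = undef; }
            else                         { $field = $element; }
        },
        Char => sub {
            my ($expat, $text) = @_;
            # Text may arrive in several chunks, so append it.
            $current{$field} .= $text if defined $field;
        },
        End => sub {
            my ($expat, $element) = @_;
            if ($element eq 'datapoint') {
                # Handle one complete record here, then let it go.
                $count++;
                print "$current{name} (pid $current{pid}) has $current{Threads} threads\n";
            }
            $field = undef;
        },
    },
);

$parser->parse($xml_text);
print "Processed $count datapoints\n";
```

For a real file, the parse call would be replaced with $parser->parsefile("file name here"). Only one datapoint is kept in memory at a time, which is the point of the event-based style.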


rajseo 27May2010 17:45

Re: XML parsing in Perl
 
Nice and great suggestion, thanks for sharing.


All times are GMT +5.5.