XML stands for Extensible Markup Language. It is a markup language that uses tags to describe data, and web applications widely use it to store and exchange that data. The tags carry content-specific meaning, and the data is stored in a structured format. XML is a cross-platform markup language that is independent of hardware and software. It helps computers transfer data between heterogeneous systems, and it is used as a common data interchange format in a number of applications.

Advantages of XML

Some of the advantages of XML in web design are as mentioned below:

1. Domain specific vocabulary: HTML uses a set of predefined tags to present data in different formats, so while working with HTML you are restricted to those specific tags. XML, on the other hand, does not contain any predefined tags. The user can create new tags based on the requirements of the application.

2. Data interchange: Data interchange is essential for performing business transactions. To facilitate it, standard interfaces must be established among related enterprises that store their data in disparate forms. XML produces files that are unambiguous, easy to generate, and easy to read.

3. Smart searches: Because HTML provides only a set of predefined tags that describe presentation, it is difficult to implement a meaningful search on an HTML document. XML tags describe what the data means. Consider the following example.
Code:
The best picture award goes to <FILM> Titanic </FILM>
The film was based on the story of the ship called <SHIP> Titanic </SHIP>

In the above code, the tags make it clear that the first Titanic refers to a film and the second to a ship. This enables the user to search specifically for the film.

4. Granular updates: Document updates are slow when the entire document needs to be refreshed from the server. With an XML document, only the part of the document that has changed needs to be transferred.

5. User selected view of data: In HTML, the user needs to create separate HTML pages to display the same information in different formats. An XML document focuses on the data and not on its presentation, so the same document can be presented in several ways.

6. Message transformation: In XML, a message can be stored in the form of a document, object data, or data from a database. The messages are designed so that they reflect the information content and not the use of the messages.

XML Standard

XML was defined by the W3C (World Wide Web Consortium) to ensure that structured data is uniform and independent of vendors and applications. Because XML is so flexible and customizable, the W3C set out design goals for the language. Some of them are as mentioned below:

XML must be directly usable over the Internet
XML must support a wide variety of applications
XML must be compatible with SGML
XML should have an absolute minimum number of optional features
XML documents must be legible and clear
XML design must be formal and concise

Identifying the structure of XML documents

An XML application is considered to be well designed if it is robust and scalable. To design a robust and scalable XML application, the user needs to perform the following steps:

Create an information model to understand the structure and meaning of the information that will be stored in the documents
Identify the required components of the XML document
Create the document according to the set of predefined rules

A. Information Modeling

An information model is a description of the information used in an organization. In the absence of an information model, there is only data and no information. In XML, an information model is used to understand the structure and meaning of the information that will be stored in XML documents. It helps to identify the objects involved in an application, the properties of the objects, and the relationships among them. XML provides the following capabilities for information modeling:

Heterogeneity: Each record can contain different data fields. This can be used to express information without restrictions.
Extensibility: New data types can be added whenever required. This allows the user to accept rather than avoid change.
Flexibility: Data fields can vary in size and configuration between instances. XML imposes few restrictions on the data.

The user can create a static information model, a dynamic information model, or a combination of both.

Static Model

A static information model defines all the objects in an application and the relationships among them. The best approach to building the model is a step-by-step method, described as follows:

Naming: The user should start the information modeling exercise by naming the entities, objects, classes, and data elements.
Defining the object types: The user needs to define the object types.
Using the type hierarchy: The user needs to organize the listed and named object types into a hierarchical classification scheme.
Finding relationships: The user needs to determine the relationships between the objects.
Defining properties: Properties are the values associated with the objects. The user needs to define the properties of the objects.

Dynamic Model

In the dynamic model, data flow diagrams and process diagrams are used to determine the flow of information. The user can describe the information flow of an application in the form of messages. Some of the approaches to the dynamic model are as follows:

Process and workflow models: These models focus on the roles of the people and the organization.
Data flow models: These models are similar to the process and workflow models. They describe the data stores that hold the information, the processes that manipulate the data, and the data flows that transfer the data from one process to another.
Object models: These models have a dynamic component and a static component.
Object life histories: These focus on individual objects and describe what happens to an object throughout its lifetime.
Use cases: These analyze the tasks that users accomplish. They can be useful both for modeling the business and for describing the internal behavior of the system.
Object interaction diagrams: These diagrams analyze the exchange of messages between objects in order to detail the data flow model.

B. Components of XML document

XML makes it possible to store structured data so that different devices can recognize it. The user needs to organize the data before storing it in the XML document, which involves arranging the data in a hierarchy. The various components of the XML document are as follows:

1. Processing Instruction (PI): An XML document usually begins with the XML declaration. It is often described as a processing instruction (PI) because it uses the same <? ... ?> syntax, although the XML specification treats the declaration as a separate construct. It provides information on how the XML file should be processed. The statement can be written as follows:
Code:
<?xml version="1.0" encoding="UTF-8"?>

The above statement indicates that XML version 1.0 is used. The statement is optional, but when it is present it must be the first thing in the document, and the keywords xml, version, and encoding must be written in lowercase letters. The encoding attribute specifies the character encoding used to create the XML file. UTF-8 is the default encoding for XML and can represent every Unicode character, including the characters used in English. UTF stands for (Universal Character Set) Transformation Format. UTF-8 encodes each character as one to four 8-bit bytes; plain English (ASCII) characters take one byte each.

2. Tags: Tags are used to specify the name of a given piece of information. They are the means of identifying data. A tag consists of a name enclosed in opening and closing angular brackets. Tags occur in pairs. The start tag contains the name of the tag, and the end tag includes a forward slash (/) before the name of the tag. An example of a tag is shown below:
Code:
<P> My name is Nick </P>

In the above snippet, <P> is the start tag and </P> is the end tag.

3. Elements: Elements are the basic units used to identify and describe data in XML. They are the building blocks of an XML document. Elements are represented using tags. XML allows meaningful names to be given to the elements, which helps to improve the readability of the code. Consider the following example.
Code:
<AUTHORNAME> John Smith </AUTHORNAME>

The element name AUTHORNAME describes the content within the tags. An XML document must always have a root element. A root element contains all the other elements in a document. Consider the following example.
Code:
<?xml version="1.0"?>
<AUTHORS>
  <AUTHOR>
    <FIRSTNAME> John </FIRSTNAME>
    <LASTNAME> Smith </LASTNAME>
  </AUTHOR>
</AUTHORS>

In the above example, the AUTHORS element contains all the other elements in the XML document. An XML document can contain only one root element. All the other elements are embedded within the opening and closing tags of the root element.

4. Content: Content refers to the information represented by the elements of an XML document. Consider the following example.
Code:
<BOOKNAME> Harry Potter </BOOKNAME>

In the above example, the name of the book, Harry Potter, is the content of the BOOKNAME element. XML makes it possible to declare and use elements that contain different types of information. An element can contain:

a. Character or data content
An element can contain only textual information. Consider the following example.
Code:
<BOOKNAME>The Painted House</BOOKNAME>

In the above example, the BOOKNAME element contains only textual information and is said to have character or data content.

b. Element Content
An element can contain other elements. The elements contained in another element are called child elements, and the containing element is called the parent element. A parent element can contain many child elements. All the child elements of a parent element are siblings of each other. Consider the following example.
Code:
<AUTHOR>
  <FNAME> JOHN </FNAME>
  <LNAME> SMITH </LNAME>
</AUTHOR>

The AUTHOR element has two child elements, FNAME and LNAME.

c. Combination or Mixed Content
Elements can contain textual information as well as other elements, as shown in the following example.
Code:
<PRODUCTDESCRIPTION>
  The product is available in four colors.
  <COLOR> RED </COLOR>
  <COLOR> BLUE </COLOR>
  <COLOR> YELLOW </COLOR>
  <COLOR> GREEN </COLOR>
</PRODUCTDESCRIPTION>

In the above example, the PRODUCTDESCRIPTION element contains textual information as well as COLOR elements. Hence, it is said to have combination or mixed content.

5. Attributes: Attributes provide additional information about the elements for which they are declared. An attribute consists of a name-value pair. Consider the following example.
Code:
<PRODUCTNAME PRODID="P001"> Doll </PRODUCTNAME>

In the above example, the PRODUCTNAME element has an attribute called PRODID, whose value is set to P001. The attribute name and value are specified within the opening tag of the PRODUCTNAME element. An element can have more than one attribute. The user must decide whether a specific piece of information is represented as an element or as an attribute.

6. Entities: An entity is a name that is associated with a block of data. The data can be text or a reference to an external file that contains textual or binary information. It is a set of information that can be used by specifying a single name. XML provides predefined entities, referred to here as internal entities, that enable the user to express reserved characters in an XML document. Some of the predefined internal entities that form a part of the XML specification are as mentioned below:

Internal Entity    Description
&lt;               Used to display the less than ( < ) symbol
&gt;               Used to display the greater than ( > ) symbol
&amp;              Used to display the ampersand ( & ) symbol
&quot;             Used to display the double quote ( " ) symbol

Internal entities are replaced by the symbols that they represent when used in an XML document. Consider the following code snippet.
Code:
<DISPLAY> The price of the toy is &lt; 200 </DISPLAY>

In the above code snippet, when the XML file is opened in the browser, the internal entity &lt; is replaced with the less than ( < ) symbol.

7. Comments: Comments are statements used to explain the XML code. They are used to provide documentation about the XML file or the application to which the file belongs. The parser ignores comments when it processes the document. Comments are not essential in an XML file, but it is good programming practice to add them along with the code. A comment is created by using an opening angular bracket followed by an exclamation mark and two hyphens (<!--). The comment text follows, and the comment is closed with two hyphens and a closing angular bracket (-->). The following example demonstrates the use of comments.
Code:
<!-- PRODUCTDATA is the root element -->

C. Predefined XML Document rules

You have learnt about the basic components of the XML document. These components can be used to create a well-formed XML document. The rules to be followed while creating the XML document are as follows:

1. Every start tag must have an end tag. End tags cannot be inferred; they must be explicitly specified. In addition, every XML document must have a root element that contains the other elements used in the document.
Code:
<LI> This is an example of a bulleted item </LI>

In the above example, the start and end tags are clearly specified, so the example follows the rule.

2. Empty elements must be closed using a forward slash (/). They do not contain any content; they can carry only attributes, whose values are specified inside the angular brackets of the tag.
Code:
<PICTURE name="Flowers.jpg"/>

3. All attribute values must be enclosed in quotation marks. Double quotation marks are used throughout this article; single quotation marks are also permitted. Consider the following example.
Code:
<FONT size="12" />

The size attribute of the FONT element takes the value 12.

4. Tags must be properly nested. The closing tags must appear in the reverse order of the corresponding opening tags. Consider the following example.
Code:
<AUTHOR> John Smith <BOOKNAME> Client </BOOKNAME></AUTHOR>

Here BOOKNAME is opened after AUTHOR, so it is closed before AUTHOR.

5. XML tags are case sensitive. The names in the opening and closing tags must match exactly, including case. Consider the following code.
Code:
<PRICE> 250 </PRICE>

The code is well formed because both tags are written as PRICE. Writing the end tag as </price> would cause an error.
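
The rules above can be tried out with any XML parser. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module, neither of which this article prescribes. The helper name is_well_formed and the sample strings are hypothetical and exist only to illustrate the rules; each rejected sample breaks one of them.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: each pair shows a rule being followed and broken.
samples = {
    "start and end tag": "<LI>This is a bulleted item</LI>",
    "missing end tag": "<LI>This is a bulleted item",
    "empty element": '<PICTURE name="Flowers.jpg"/>',
    "unquoted attribute": "<FONT size=12/>",
    "proper nesting": "<AUTHOR>John Smith<BOOKNAME>Client</BOOKNAME></AUTHOR>",
    "improper nesting": "<AUTHOR>John Smith<BOOKNAME>Client</AUTHOR></BOOKNAME>",
    "matching case": "<PRICE>250</PRICE>",
    "case mismatch": "<PRICE>250</price>",
    "two root elements": "<A/><B/>",
}

# Map each label to the parser's verdict.
results = {label: is_well_formed(text) for label, text in samples.items()}

for label, ok in results.items():
    print(f"{label}: {'well formed' if ok else 'rejected'}")
```

A parser only checks that a document is well formed. It does not check whether the chosen element names suit the application, so the information model described earlier is still the user's responsibility.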