Online courses in ASP.NET MVC / Core, jQuery, Angular, and Design Patterns conducted by Bipin Joshi. Read more...

Introduction To XML

What is XML?

  • XML stands for eXtensible Markup Language
  • All of you must have used HTML tags and elements. HTML provides a fixed set of elements and we are bound to use only those elements.
  • XML on the other hand allows us to create our own tags and elements
  • Since we create our own tags they can be descriptive making the document more readable
  • HTML is designed to display data in a web browser
  • XML is designed to represent data rather than to display it. Display is handled by other means such as CSS, XSL, or custom applications
  • An HTML page is meant primarily for display in web browsers
  • XML data can be used by any application, including a web browser, that understands how to interpret it
  • Because the data is separated from its display, the data can change without touching the display mechanism
  • XML originated from SGML (Standard Generalized Markup Language), which provides specifications for creating markup languages
  • HTML is also an example of a markup language
  • XML is maintained by the World Wide Web Consortium (W3C); SGML is an ISO standard (ISO 8879)
  • XML made its first public appearance in 1996
  • The first official specification of XML was published in 1998

A Simple XML document

Consider the following file named myfirstxml.xml, which represents a simple XML document. Try comparing it with HTML. XML files are plain text files with the .xml extension.

myfirstxml.xml

<?xml version="1.0"?>
<!DOCTYPE catalog SYSTEM "mylibrary.dtd">
<catalog>
  <book book_no="100">
    <author>Author 1</author>
    <title>Title 1</title>
    <photo src="photo1.gif" />
  </book>
  <book book_no="200">
    <author>Author 2</author>
    <title>Title 2</title>
    <photo src="photo2.gif" />
  </book>
</catalog>
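
To see how an application might consume this document, here is a minimal sketch that reads the catalog with Python's standard xml.etree.ElementTree module. The names CATALOG_XML and read_books are made up for this illustration, and the DOCTYPE line is left out so the snippet does not depend on mylibrary.dtd, whose contents the article does not show.

```python
import xml.etree.ElementTree as ET

# The sample catalog, without the DOCTYPE line (assumption: no DTD is available).
CATALOG_XML = """<?xml version="1.0"?>
<catalog>
  <book book_no="100">
    <author>Author 1</author>
    <title>Title 1</title>
    <photo src="photo1.gif" />
  </book>
  <book book_no="200">
    <author>Author 2</author>
    <title>Title 2</title>
    <photo src="photo2.gif" />
  </book>
</catalog>"""


def read_books(xml_text):
    """Return one dict per <book> element found under the root element."""
    root = ET.fromstring(xml_text)
    books = []
    for book in root.findall("book"):
        books.append({
            "book_no": book.get("book_no"),             # attribute value
            "author": book.findtext("author").strip(),  # element text
            "title": book.findtext("title").strip(),
            "photo": book.find("photo").get("src"),     # attribute of an empty element
        })
    return books


books = read_books(CATALOG_XML)
for entry in books:
    print(entry["book_no"], entry["author"], entry["title"], entry["photo"])
```

The same data could just as well be rendered in a browser or loaded into a database, which is the point of keeping data separate from display.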

Common XML Terms

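  • Processing instructions

<?xml version="1.0"?>

Processing instructions are special instructions enclosed in a pair of <? and ?>. The XML declaration shown above uses the same syntax and, when present, must be the very first thing in the document; strictly speaking, the XML specification does not class it as a processing instruction. The name xml must be in lower case, with no space after <?. The full form is

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

  • version : specifies the XML version used for the document. This is normally 1.0
  • encoding : optional. Specifies the character encoding used
  • standalone : optional. Specifies whether the document depends on external markup declarations. If your document relies on an external DTD, set it to "no"

  • Document Type Declaration

<!DOCTYPE catalog SYSTEM "mylibrary.dtd">

If your XML document is based on a DTD, you declare that DTD here. The name after DOCTYPE must match the root element of the document (catalog in the sample) for the document to be valid. It is therefore not arbitrary, although it need not be the same as the DTD file name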
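
As a rough illustration of how a parser exposes these two constructs, the sketch below reads the XML declaration and the document type declaration with Python's standard xml.dom.minidom module. The one-element document in DOC is invented for this example; its DOCTYPE name is catalog so that it matches its own root element. The parser records the reference to mylibrary.dtd but does not fetch or apply that file.

```python
from xml.dom import minidom

# A tiny stand-in document (assumption: invented for this example).
DOC = (
    '<?xml version="1.0" encoding="UTF-8" standalone="no"?>\n'
    '<!DOCTYPE catalog SYSTEM "mylibrary.dtd">\n'
    '<catalog />'
)

doc = minidom.parseString(DOC)

# Values taken from the XML declaration.
declaration = {
    "version": doc.version,
    "encoding": doc.encoding,
    "standalone": doc.standalone,
}

# Values taken from the document type declaration.
doctype = {
    "name": doc.doctype.name,
    "system_id": doc.doctype.systemId,
}

print(declaration)
print(doctype)
```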

  • Tag

<author>

Tags identify a particular piece of data. They are enclosed between < and >. Generally a start tag (<author>) and an end tag (</author>) together form an element

  • Element

<author></author>

<photo src="photo1.gif" />

An element generally consists of a start tag, an end tag, and everything between them. An empty element can alternatively be written in the short form shown in the second example: instead of using a pair of <photo> and </photo> we have used the shortcut <photo --- />

  • Attribute

book_no

Attributes provide extra information about an element. They are written as name="value" pairs inside the start tag

  • Root

catalog

Every XML document must have exactly one element at the top of the hierarchy, called the root element

  • Tree

<catalog>

----

</catalog>

An XML document can be viewed as an inverted tree with the root element at the top and all other elements at various branch levels

  • Node

catalog

Each point that starts a branch or sits at a leaf level is called a node

  • Parent

catalog

Parent elements are elements that have sub-elements

  • Child

book

Child elements are the elements directly beneath a parent element
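
The root, tree, parent, and child terms can be made concrete by walking a document and recording how deep each element sits. The following is only a sketch using Python's standard xml.etree.ElementTree module; the helper name outline is hypothetical. ElementTree models element nodes only, whereas a DOM parser would also treat text and attributes as nodes.

```python
import xml.etree.ElementTree as ET

# A one-book version of the sample catalog.
CATALOG_XML = """<catalog>
  <book book_no="100">
    <author>Author 1</author>
    <title>Title 1</title>
    <photo src="photo1.gif" />
  </book>
</catalog>"""


def outline(element, depth=0):
    """Return (depth, tag, attributes) for the element and all its descendants, parents first."""
    rows = [(depth, element.tag, dict(element.attrib))]
    for child in element:  # iterating an element yields its direct children
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(CATALOG_XML)  # the root element: catalog
rows = outline(root)
for depth, tag, attrs in rows:
    print("  " * depth + tag, attrs)
```

Here catalog is the root and the parent of book, while book is in turn the parent of author, title, and photo, which are leaf nodes.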

Basic Rules of XML Grammar

  • XML is case sensitive, so the start and end tag names must appear in the same case
    e.g.
    <mytag> is not the same as <MYTAG> or <MyTag>
  • All start tags must have corresponding end tags
    e.g.
    <mytag>Some Data
    <my_other_tag>Some other data</my_other_tag>
    The above XML is wrong because <mytag> does not have a corresponding end tag </mytag>
  • Empty elements must also be closed, either with an end tag or in the abbreviated form
    e.g.
    <photo src="mypicture.gif" />
  • All tags must be nested properly
    e.g.
    <mytag>some data
    <my_other_tag>Some other data
    </mytag>
    </my_other_tag>
    The above XML is not well-formed because the nesting of tags is incorrect. The correct nesting would be
    <mytag>some data
    <my_other_tag>Some other data
    </my_other_tag>
    </mytag>
  • All attribute values must be enclosed in quotation marks (single or double)
    e.g.
    <book book_no=100> is not well-formed. Correct usage would be
    <book book_no="100">
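
These rules can be checked mechanically, because any XML parser rejects a document that breaks them. The sketch below feeds small snippets based on the examples above to Python's standard xml.etree.ElementTree parser; the helper is_well_formed and the SAMPLES table are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


SAMPLES = {
    "case mismatch": "<mytag>data</MYTAG>",
    "missing end tag": "<mytag>Some Data<my_other_tag>Some other data</my_other_tag>",
    "bad nesting": "<mytag>some data<my_other_tag>Some other data</mytag></my_other_tag>",
    "unquoted attribute": "<book book_no=100></book>",
    "proper nesting": "<mytag>some data<my_other_tag>Some other data</my_other_tag></mytag>",
    "quoted attribute": '<book book_no="100"></book>',
    "empty element, short form": '<photo src="mypicture.gif" />',
    "empty element, long form": '<photo src="mypicture.gif"></photo>',
}

results = {name: is_well_formed(text) for name, text in SAMPLES.items()}
for name, ok in results.items():
    print(name, "->", "well-formed" if ok else "rejected")
```

The last two samples show that both ways of writing an empty element are accepted.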

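What is a DTD?

  • DTD stands for Document Type Definition. The <!DOCTYPE ...> line that refers to it is called the document type declaration
  • It defines the structure, or rules, for XML documents that are based on it
  • A DTD is written in its own declaration syntax (<!ELEMENT ...>, <!ATTLIST ...>), whose content models resemble the Extended Backus-Naur Form (EBNF) notation used in the XML specification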
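
The article does not show the contents of mylibrary.dtd, so the rules below are only a guess at what a DTD for the catalog sample might look like. They are placed in the document's internal subset and held in a Python string, DTD_RULES, a name invented for this sketch. The example also shows a limitation worth knowing: the parser in Python's standard xml.etree.ElementTree module checks that a document is well-formed but does not validate it against the DTD, so a book without the required title is still accepted.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD rules for the catalog sample (assumption: not from the article).
DTD_RULES = """
<!ELEMENT catalog (book*)>
<!ELEMENT book (author, title, photo)>
<!ATTLIST book book_no CDATA #REQUIRED>
<!ELEMENT author (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT photo EMPTY>
<!ATTLIST photo src CDATA #REQUIRED>
"""

# The book below breaks the rules on purpose: it has no <title> and no <photo>.
DOCUMENT = (
    '<?xml version="1.0"?>\n'
    "<!DOCTYPE catalog [" + DTD_RULES + "]>\n"
    '<catalog><book book_no="100"><author>Author 1</author></book></catalog>'
)

root = ET.fromstring(DOCUMENT)  # succeeds: the document is well-formed
book = root.find("book")
missing_title = book.find("title") is None

print("root element:", root.tag)
print("title missing but document accepted:", missing_title)
```

Enforcing the DTD requires a validating parser, which is outside what this sketch covers.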



Bipin Joshi is a software consultant, trainer, author and a yogi having 21+ years of experience in software development. He conducts online courses in ASP.NET MVC / Core, jQuery, AngularJS, and Design Patterns. He is a published author and has authored or co-authored books for Apress and Wrox press. Having embraced Yoga way of life he also teaches Ajapa Meditation to interested individuals. To know more about him click here.


Posted On : 18 Dec 2000



Tags : XML