Basic Structure

Documents

What is an XML Document?

An XML document is a text document that follows XML syntax rules: it must be well-formed, and it may optionally be valid. It consists of elements, attributes, and text organized in a hierarchical structure. Every XML document must have exactly one root element that contains all other elements.


Structure of an XML Document

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <child>Content</child>
  <child>
    <subchild>More content</subchild>
  </child>
</root>

  • XML Declaration (optional but recommended) defines version and encoding.

  • Root Element wraps the entire document.

  • Elements can be nested to represent complex data structures.

  • The document must be well-formed:

    • Properly nested tags.

    • One unique root element.

    • Tags must be closed.

    • Case-sensitive tags.
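
The well-formedness rules above can be checked with any conforming parser. The following is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree module; the helper name is_well_formed and the broken sample strings are illustrative and not part of the original text.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The sample document from this section (declaration omitted for brevity).
SAMPLE = """<root>
  <child>Content</child>
  <child>
    <subchild>More content</subchild>
  </child>
</root>"""

# One hypothetical counter-example per rule in the list above.
BROKEN = {
    "improperly nested tags": "<root><a><b></a></b></root>",
    "two root elements": "<root/><root/>",
    "unclosed tag": "<root><child></root>",
    "case mismatch": "<root><Child></child></root>",
}

print("sample:", is_well_formed(SAMPLE))
for rule, text in BROKEN.items():
    print(f"{rule}:", is_well_formed(text))
```

The sample is accepted and each counter-example is rejected. Note that this only tests well-formedness; checking validity against a DTD or XML Schema needs a validating parser, which this sketch does not use.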


Valid vs Well-Formed

Type         Description
Well-Formed  XML syntax rules are followed (required)
Valid        Well-formed + conforms to a DTD or XML Schema

Declaration

XML Declaration

The XML declaration is an optional but recommended statement. When present, it must be the very first thing in the document, with nothing before it (not even whitespace). It specifies the XML version and, optionally, the character encoding and whether the document is standalone.


Syntax

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

Attributes

Attribute   Description                                                                        Required / Default
version     Specifies the XML version. Usually "1.0".                                          Required
encoding    Defines the character encoding (e.g., UTF-8, ISO-8859-1).                          Optional (UTF-8 or UTF-16 is assumed if omitted)
standalone  Indicates whether the document relies on an external DTD. Values: "yes" or "no".   Optional


Example
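
As an illustration, the sketch below reads the declaration fields back from a document. It assumes Python's standard-library xml.parsers.expat module and its XmlDeclHandler callback; the helper name read_declaration and the sample documents are hypothetical.

```python
import xml.parsers.expat


def read_declaration(document: bytes) -> dict:
    """Parse `document` and return the fields of its XML declaration."""
    fields = {}

    def on_declaration(version, encoding, standalone):
        # expat reports standalone as 1 ("yes"), 0 ("no"), or -1 (omitted).
        fields["version"] = version
        fields["encoding"] = encoding
        fields["standalone"] = {1: "yes", 0: "no"}.get(standalone)

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_declaration
    parser.Parse(document, True)
    return fields


FULL = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><note>Hi</note>'
MINIMAL = b'<?xml version="1.0"?><note>Hi</note>'

print(read_declaration(FULL))
print(read_declaration(MINIMAL))

# The declaration must come first: even a leading newline makes the parser
# reject the document.
try:
    read_declaration(b"\n" + FULL)
    misplaced_accepted = True
except xml.parsers.expat.ExpatError:
    misplaced_accepted = False
print("declaration after a newline accepted:", misplaced_accepted)
```

The first document declares all three attributes; the second declares only version, so the optional fields come back unset.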

Tags & Elements

Tags and Elements

  • An element is the basic building block of an XML document.

  • An element is defined by a start tag, content, and an end tag.

  • Elements can be empty, containing no content.


Syntax Examples

<tagName>content</tagName>
<tagName/>

Rules

  • Tags are enclosed in angle brackets < >.

  • Start tag: <tagName>

  • End tag: </tagName>

  • Empty element tag ends with />.

  • Tags are case-sensitive (<Name> ≠ <name>).

  • Elements can be nested inside other elements.

  • Elements can contain:

    • Text

    • Other elements

    • Attributes (inside start tag)


Example
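
The sketch below shows the three kinds of element described above (text content, nested elements, and an empty element) as a parser sees them. It assumes Python's standard-library xml.etree.ElementTree module; the <book> document and its element names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element names and content are illustrative only.
DOCUMENT = """<book>
  <title>XML Basics</title>
  <author><first>Ada</first><last>Example</last></author>
  <outOfPrint/>
</book>"""

book = ET.fromstring(DOCUMENT)

# Start tag + content + end tag: the text between the tags is the content.
title = book.find("title")
print(title.tag, "->", title.text)

# Nested elements appear as children of the enclosing element.
author = book.find("author")
print([child.tag for child in author])

# An empty element has no text and no children.
empty = book.find("outOfPrint")
print(empty.text, len(empty))

# <outOfPrint/> and <outOfPrint></outOfPrint> parse to the same thing.
long_form = ET.fromstring("<outOfPrint></outOfPrint>")
print(long_form.text, len(long_form))
```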

Attributes

What are Attributes?

Attributes provide additional information about elements. They appear inside the start tag and consist of name-value pairs.


Syntax

<element attributeName="value">content</element>

Rules

  • Attribute names are case-sensitive.

  • Attribute values must be enclosed in double quotes or single quotes.

  • Multiple attributes are separated by spaces.

  • An attribute value is plain text: it cannot contain elements, and an attribute name may appear only once per element. Use child elements instead for complex or repeated data.


Example
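
A short sketch of how attributes look to a parser, and of what happens when the rules are broken. It assumes Python's standard-library xml.etree.ElementTree module; the <book> element, its attribute names, and the helper parses are hypothetical.

```python
import xml.etree.ElementTree as ET

# Double and single quotes are both accepted; "lang" and "Lang" are two
# different attributes because names are case-sensitive.
ELEMENT = """<book id="b1" lang='en' Lang="EN">XML Basics</book>"""

book = ET.fromstring(ELEMENT)
print(book.attrib)
print(book.get("lang"), book.get("Lang"), book.get("missing"))


def parses(text: str) -> bool:
    """Return True if `text` is accepted by the parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print("unquoted value:", parses("<book id=b1/>"))
print("duplicate name:", parses('<book id="b1" id="b2"/>'))
print("markup in value:", parses('<book id="<b>1</b>"/>'))
```

All three malformed variants are rejected, which is why structured or repeated data belongs in child elements rather than attributes.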

Comments

Rules

  • Comments start with <!-- and end with -->.

  • Comments cannot contain the sequence -- inside them.

  • Comments can span multiple lines.
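
The comment rules above can be tried out with a parser. This is a minimal sketch, assuming Python 3.8 or later and the standard-library xml.etree.ElementTree module; the <config> document and the helper parses are illustrative.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<config>
  <!-- A single-line comment -->
  <!--
    A comment can span
    several lines.
  -->
  <option>on</option>
</config>"""

# ElementTree's default parser drops comments: only <option> remains.
config = ET.fromstring(DOCUMENT)
print([child.tag for child in config])

# Comments can be kept by asking the tree builder to insert them.
keeping = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
with_comments = ET.fromstring(DOCUMENT, parser=keeping)
comments = [node.text for node in with_comments if node.tag is ET.Comment]
print(len(comments), repr(comments[0]))


def parses(text: str) -> bool:
    """Return True if `text` is accepted by the parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A double hyphen inside a comment makes the document ill-formed.
print(parses("<config><!-- bad -- comment --></config>"))
```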

CDATA Sections

What is CDATA?

A CDATA (character data) section tells the XML parser to treat the enclosed text as literal character data: markup characters inside it are not interpreted. This is useful for embedding code, HTML, or other text containing characters that would otherwise be read as XML syntax.


Syntax

<![CDATA[ raw text, including <, > and & ]]>

Rules

  • CDATA sections start with <![CDATA[ and end with ]]>.

  • Inside CDATA, characters like <, >, and & are not treated as markup.

  • CDATA sections cannot contain the string ]]>.


Example
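
The sketch below compares a CDATA section with the equivalent escaped text, and shows one common way to handle text that itself contains ]]>. It assumes Python's standard-library xml.etree.ElementTree module; the <script> and <note> elements and their content are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The same content written with a CDATA section and with entity escapes.
WITH_CDATA = "<script><![CDATA[if (a < b && b > c) { run(); }]]></script>"
ESCAPED = "<script>if (a &lt; b &amp;&amp; b &gt; c) { run(); }</script>"

cdata_text = ET.fromstring(WITH_CDATA).text
escaped_text = ET.fromstring(ESCAPED).text
print(cdata_text)
print(cdata_text == escaped_text)


def parses(text: str) -> bool:
    """Return True if `text` is accepted by the parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The first "]]>" always ends the section, so the leftover "]]>" here is
# stray text and the document is rejected.
print(parses("<note><![CDATA[x ]]> y]]></note>"))

# A common workaround (not from the original text): split the sequence
# across two adjacent CDATA sections.
SPLIT = "<note><![CDATA[x ]]]]><![CDATA[> y]]></note>"
print(ET.fromstring(SPLIT).text)
```

To the parser, both forms of the first example yield the same text; CDATA only changes how the content is written in the file.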
