Plists, XML and XPATH – A Series

I’ve been doing some research into the various data storage methods on smart phones and found myself getting engrossed in plists. Though I’ve mentioned them in classes and we’ve talked about how they were constructed and various roadblocks to extracting information from them ,I’d never really done an in-depth module or exercise on them.

Well, I hope with this series to change that omission. At the end of the series, I’ll provide a link to all the posts gathered into one paper. Now without further ado, here is the first of the series on Plists, XML and XPATH.

What is XML?

Although the term XML is thrown around in forensic classes and seen as an option in for analysis output in many of the major forensic tools, how many examiners understand how XML is constructed and the rules that apply to it? As it turns out, XML isn’t all that hard to understand and having a grasp on how it’s used to store data is useful to understanding Plists and other files stored on digital devices – such as current.gpx on Garmin GPS units – which use the XML format.
Let’s look at the building blocks that make up XML and some of the rules that govern them.

Definition of XML

First, let’s define the term XML. XML stands for Extensible Markup Language. This language is an official recommendation of the World Wide Web Consortium (W3C). XML is a metalanguage that allows for the creation and formatting of documents; it is in common use on the Internet and the default of many office productivity suites including Microsoft Office and Apple iWork.

XML Terminology

Next, let’s discuss some terminology used in XML so we can understand when these terms are used in later discussions when we are looking at and reading plists. This is by no means all the terminology that is used in XML; rather these terms are covered here in order to give an examiner a working knowledge of items that may be encountered when working on XML formatted evidence.

Elements – XML is made up of one or more elements. Elements consist of two tags – an opening tag, which is the name of the tag delimited by a less-than sign (“”), and a closing tag, which is the same as the opening tag except that there is a forward slash (“/“) before the element name. An example of an element is MAC Address . The text inside the two tags is considered part of that element and is processed per the element’s rules.
Attribute – An element can have an attribute that serves to modify or refine the default meaning of the elements. Attributes can also be applied to empty elements which are used to provide non-texual content or give additional information to the application that is parsing through the XML. Here is an example of a picture element with the src attribute: . This could also be displayed as a short hand because the element is empty.
Declaration – Most XML documents begin by declaring information about themselves for a processing program as in the following example: . This would tell a parsing program that the XML document uses the Version 1.0 format and optimized for UTF-16 unicode encoding.
Document Type Definition (DTD) – This is an external file that specifies the rules for how all the elements, attributes and other data are defined and related. Below is Apple’s DTD for plists (also located at

Apple’s Plist DTD

Root Element – This is the outermost element to which the DTD applies and is usually the start and end points of the document. An example for a plist would be .
CDATA – This stands for “character data.” Anything that occurs after a CDATA section is not to be marked up and is treated as plain text.
PCDATA – This stands for “parsed character data” and means that any character data that is not an element can appear between the tags. In the above Apple DTD, means that any characters such as “WebbookmarkType” can show up between the key element tags but not another tag such as .

Keep checking back for regular updates to this series.