SQLite Records
I am fascinated by SQLite. But before you say “Mike, you really need to get out more and get a life!” consider this: the two most popular smartphone operating systems of today, iOS and Android, use SQLite databases to store important information such as contacts, SMS, and call records. That’s a lot of users, both victims and perpetrators. Have you ever wondered how SQLite structures its records? An understanding of the SQLite record architecture is crucial to validating the output of forensic tools and to knowing where to look for evidence, including that elusive brass ring: deleted information.
SQLite database images always begin with a well-known 16-byte signature, which in ASCII reads “SQLite format 3” followed by a null byte. The database image header is 100 bytes in length. A full discussion of the database image header can be found on the official SQLite page. Among other things, a well-formed serialized database image will have a database schema, or table layout. Knowing the schema of the tables is key, as it provides the guide to the record header and to the record contents that follow it.
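As a quick illustration of the signature check described above, here is a minimal Python sketch. The names `looks_like_sqlite` and `read_header` are hypothetical helpers made up for this example and are not part of any forensic tool or of SQLite itself.

```python
SQLITE_MAGIC = b"SQLite format 3\x00"  # 16-byte signature, including the trailing null
HEADER_SIZE = 100                      # length of the database image header in bytes


def looks_like_sqlite(header: bytes) -> bool:
    """Return True if the buffer starts with the SQLite 3 signature."""
    return header.startswith(SQLITE_MAGIC)


def read_header(path: str) -> bytes:
    """Read the first 100 bytes of a file (hypothetical helper)."""
    with open(path, "rb") as fh:
        return fh.read(HEADER_SIZE)
```

A carved file or a suspected database could be passed through `looks_like_sqlite(read_header(path))` before any further parsing is attempted.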
We will be using a record from the sms database of a first generation iPhone running iOS version 3.1.3 as an example for this post. Below is a screenshot of the schema of the message table, taken using my favorite Mac SQLite editor, Base.
Having noted the schema of the message table within the sms database, let’s take a look at a record in a hex viewer.
The record is structured as shown in the graphic below.
In addition, each byte in the record header uses the values in the table below to describe the type and length of the corresponding field.
So let’s take our example and work through the record header. The first byte is our record length; in this case the record is 110 bytes long. The second byte is our key. Both are stored as variable-length integers, so in larger databases each can occupy more than one byte.
The next byte is the length of our record header, including this byte itself, which is 19 bytes.
The next byte is 0x00 (NULL), which indicates that the value is not included in the record. This would have been the ROWID from the schema. The next byte corresponds to the address row in the schema. The byte 0x27 (decimal 39) is odd and greater than 13. According to our table this corresponds to text, and the length in bytes is derived by taking the decimal value, subtracting thirteen, and then dividing by two.
(39 - 13) / 2 = 26 / 2 = 13
We can see from the graphic below that the address, or telephone number, is indeed 13 bytes in length.
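The table lookup can be sketched in a few lines of Python. This assumes the table matches the serial type codes listed in the SQLite file format documentation; the function name `serial_type_size` is invented for this example.

```python
def serial_type_size(n: int) -> int:
    """Return the number of content bytes for SQLite serial type n."""
    # Fixed-size types: NULL, 1/2/3/4/6/8-byte integers, 8-byte float,
    # and the constants 0 and 1 (which take no content bytes).
    fixed = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 6, 6: 8, 7: 8, 8: 0, 9: 0}
    if n in fixed:
        return fixed[n]
    if n < 0 or n in (10, 11):
        raise ValueError("negative, or reserved serial type 10/11")
    if n % 2 == 0:
        return (n - 12) // 2  # even and >= 12: BLOB
    return (n - 13) // 2      # odd and >= 13: text


print(serial_type_size(0x27))  # 13, the length of the address field
```

Note that the text rule applies to 13 itself as well (an empty string), which is why the comparison is "greater than or equal to" rather than strictly "greater than".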
The next byte, 0x04, corresponds in the schema to the date of the SMS. This is a four-byte value stored as Unix epoch time. The value here is 1296980309, which translates to Sun, 06 Feb 2011 08:18:29 GMT.
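The date conversion is easy to check by hand. The sketch below assumes the four content bytes are a big-endian signed integer, which is how SQLite stores serial type 4; the `raw` bytes are rebuilt from the decimal value quoted above rather than copied from the hex dump, and `decode_epoch32` is a name made up for this example.

```python
import struct
from datetime import datetime, timezone


def decode_epoch32(raw: bytes) -> datetime:
    """Interpret 4 big-endian bytes as Unix epoch seconds, in UTC."""
    (seconds,) = struct.unpack(">i", raw)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


raw = (1296980309).to_bytes(4, "big")  # stand-in for the bytes in the record
print(decode_epoch32(raw).isoformat())  # 2011-02-06T08:18:29+00:00
```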
The next byte, 0x81, is, as the schema indicates, the text message, but it is a special case. SQLite stores values greater than 127 as multi-byte variable-length integers, a compression method based on Huffman coding. In this instance the byte in the record header for the text message is \x81, or 129, and therefore greater than 127. Because the high bit of this byte is set, and since the method uses 2 bytes for values up to 16,383 decimal, the next byte, \x0D, is also part of the length of the text message. The value is calculated as follows, where X = the first byte value and Y = the second byte value:
(X-128) x 128 + Y
Calculated out, this comes to:
(129-128) x 128 + 13
141 (\x8D)
To find the length of the text message we now refer to the table and do our calculation as normal:
(N-13)/2
(141-13)/2
64
This is indeed the length of the text message. The message itself converts straight from hex to ASCII.
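For values that span more than two bytes the same idea extends: each byte contributes its low seven bits, and a set high bit means another byte follows. The Python sketch below is one way to write this, following the varint description in the SQLite file format documentation; `decode_varint` is a name chosen for this example.

```python
from typing import Tuple


def decode_varint(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one SQLite varint; return (value, bytes_consumed)."""
    value = 0
    for i in range(9):
        byte = buf[offset + i]
        if i == 8:
            # A ninth byte, if reached, contributes all 8 of its bits.
            return (value << 8) | byte, 9
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:  # high bit clear: this is the last byte
            return value, i + 1
    raise AssertionError("unreachable")


serial, used = decode_varint(bytes([0x81, 0x0D]))
print(serial, used)        # 141 2
print((serial - 13) // 2)  # 64, the length of the text message
```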
The next byte in the header is 0x01, a one-byte integer, and in the schema it refers to the flags row. The value in this case is 0x02, which means that it is an incoming text.
The last byte of significance in this record occurs at offset 17 in the record and, as our table indicates, is a value greater than 13 (0x11 = decimal 17). Since the calculation is (N-13)/2, the value we have here is 2 bytes long, and this refers in the schema to the country. In our example this is 0x6A 0x6F, or “jo” for Jordan.
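Putting the pieces together, walking a record header amounts to reading the header length and then reading serial types until that many bytes have been consumed. The Python sketch below illustrates this on a synthetic, shortened header assembled only from the bytes discussed in this post; it is not the real 19-byte header from the example record, and the function names are invented for illustration.

```python
from typing import List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a SQLite varint at pos; return (value, next position)."""
    value = 0
    for i in range(8):
        byte = buf[pos + i]
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:  # high bit clear: last byte of the varint
            return value, pos + i + 1
    # A ninth byte contributes all 8 of its bits.
    return (value << 8) | buf[pos + 8], pos + 9


def header_serial_types(record: bytes) -> List[int]:
    """Return the serial types listed in a record header."""
    header_len, pos = read_varint(record, 0)  # the length counts itself
    types = []
    while pos < header_len:
        serial, pos = read_varint(record, pos)
        types.append(serial)
    return types


# Synthetic header: length 8, then NULL, text(13), 4-byte int,
# text(64) as a two-byte varint, 1-byte int, text(2).
sample = bytes([0x08, 0x00, 0x27, 0x04, 0x81, 0x0D, 0x01, 0x11])
print(header_serial_types(sample))  # [0, 39, 4, 141, 1, 17]
```

The field contents follow the header in the same order, so summing the sizes implied by each serial type gives the offset of every field in the record body.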
I hope that you find this post useful in your forensic endeavors. This post would not have been possible without the generous help and counsel of DC Shafik Punja of Calgary and Sheran Gunasekera of ZenConsult Pte Ltd.
Research regarding Huffman encoding in SQLite records was conducted using Murilo Tito Pereira’s article “Forensic analysis of the Firefox 3 Internet history and recovery of deleted SQLite records” published by Elsevier.

Also, many programs use SQLite databases to store information. Notably, Chrome stores much of its user information in SQLite database files. This makes the standard retrieval of “web”idence a little more interesting. From what I’ve seen it’s all plain text anyway and can be picked up by carvers and the like, but knowing where to look can help you pinpoint key evidence first instead of hitting it with a forensic carpet bomb (i.e., topping out your i7 for 5 hours straight).
Thanks for the post
Good information!
The greater-than in the header values table should be greater-than-or-equal (twice), see also http://www.sqlite.org/fileformat.html#database_header
Sorry to say, but you made a small mistake:
The first entry isn’t the length; in fact the first two bytes are the record index, calculated as follows:
0x6E & 0x7F = 0x6E // first mask each byte to its low 7 bits
0x2F & 0x7F = 0x2F
(0x6E << 7) + 0x2F = 0x372F
So this record has the index 0x372F (hex) = 14127 (dec)
WBR,
Bjoern
Hi Bjoern
Sorry to correct you.. But:
It’s not always the record index, it depends if the most significant bit is set or not. If set then it’s record index, otherwise – length.