Wednesday, April 20, 2011

iPhone Backups

A few weeks ago, I noticed a project on Github that extracts text messages from your iPhone and prints them out. More specifically, it accesses the backups that iTunes makes and queries a SQLite database to retrieve the messages. We are going to build something similar using Factor.

First, some vocabularies we will be using:

USING: accessors db db.sqlite db.tuples db.types io.directories
io.files.info io.files.unique io.pathnames kernel sequences
sorting ;

Database

We will need a word to choose the most recently modified file in a directory:

: last-modified ( path -- path' )
    [
        [ file-info modified>> ] sort-with last
    ] with-directory-files ;

The iPhone backups are stored (on Mac OS X) in the ~/Library/Application Support/MobileSync/Backup directory. We can choose the most recent backup:

: last-backup ( -- path )
    home "Library/Application Support/MobileSync/Backup"
    append-path dup last-modified append-path ;

The messages are stored in a SQLite database file with the name 3d0d7e5fb2ce288813306e4d4636395e047a3d28.

: last-messages ( -- path )
    last-backup "3d0d7e5fb2ce288813306e4d4636395e047a3d28"
    append-path ;

Since we will be using a backup file (that might be needed later), we will make make a helper word that copies the SQLite database to a temporary file (so we can't mess up the original) and then opens it:

: <copy-sqlite-db> ( path -- sqlite-db )
    temporary-file [ copy-file ] [ <sqlite-db> ] bi ;

Using this, we can make another helper word that allows us to run arbitrary queries on a copy of the most recent messages database:

: with-messages-db ( quot -- )
    [ last-messages <copy-sqlite-db> ] dip with-db ; inline

Schema

The SQLite database has a message table that contains all the text messages sent or received by your iPhone. Using .schema, we can see what it looks like:

CREATE TABLE message (
    ROWID INTEGER PRIMARY KEY AUTOINCREMENT, 
    address TEXT, 
    date INTEGER, 
    text TEXT, 
    flags INTEGER, 
    replace INTEGER, 
    svc_center TEXT, 
    group_id INTEGER, 
    association_id INTEGER, 
    height INTEGER, 
    UIFlags INTEGER, 
    version INTEGER, 
    subject TEXT, 
    country TEXT, 
    headers BLOB, 
    recipients BLOB, 
    read INTEGER
);

Queries

We could use raw SQL queries to retrieve all the messages:

( scratchpad ) [ "select * from message" sql-query ] with-messages-db

Or, we can build a word that performs a SQL query to retrieve just the number of messages, converting the response into a number.

: count-messages ( -- n )
    "select count(*) from message" sql-query
    first first string>number ;

Using it is pretty straightforward:

( scratchpad ) [ count-messages ] with-messages-db

Unfortunately, raw SQL queries are not able to automatically convert the columns from the result set based on the type of data stored. It would be nicer if numbers stored in the database could be retrieved as numbers from Factor. We can instead model the database tables, and then use high-level constructs to query and interact with the database. You can read a nice tutorial to get more information on how db.tuples works.

First, we create a tuple that matches the message table:

TUPLE: message rowid address date text flags replace svc-center
group-id association-id height ui-flags version subject country
headers recipients read? ;

Then, we define a persistence relationship between the message tuple and the message table:

message "message" {
    { "rowid" "ROWID" +db-assigned-id+ }
    { "address" "address" TEXT }
    { "date" "date" INTEGER }
    { "text" "text" TEXT }
    { "flags" "flags" INTEGER }
    { "replace" "replace" INTEGER }
    { "svc-center" "svc_center" TEXT }
    { "group-id" "group_id" INTEGER }
    { "association-id" "association_id" INTEGER }
    { "height" "height" INTEGER }
    { "ui-flags" "UIFlags" INTEGER }
    { "version" "version" INTEGER }
    { "subject" "subject" TEXT }
    { "country" "country" TEXT }
    { "headers" "headers" BLOB }
    { "recipients" "recipients" BLOB }
    { "read?" "read" INTEGER }
} define-persistent

Using this, we can count the number of messages very simply:

: count-messages ( -- n )
    T{ message } count-tuples ;

Or, retrieve all the messages:

: all-messages ( -- messages )
    T{ message } select-tuples ;

Or, retrieve all the messages we've sent:

: sent-messages ( -- messages )
    T{ message { flags 3 } } select-tuples ;

Or, retrieve only the unread messages we've received:

: unread-messages ( -- messages )
    T{ message { flags 2 } { read? 0 } } select-tuples ;

Or, retrieve the messages sent from a particular phone number:

: messages-from ( addr -- messages )
    message new swap >>address select-tuples ;

Or, using group-by, retrieve your messages as groups of conversations:

: all-conversations ( -- conversations )
    all-messages [ group-id>> ] group-by ;

Much more useful, and much cleaner, than using the raw SQL queries on the database. The results are returned as a sequence of message objects that can be manipulated directly within Factor.

Try It

You can use your new query tools, for example, to chart the number of messages you send or receive per day of the week:


The code for this is on my Github.

No comments: