Cache Management » History » Version 2

Eric Vieillevigne, 05/12/2015 01:55 PM

{{>toc}}

h1. SYNC API for devices

Mobile devices are heavyweight clients with intermittent and fairly slow wireless Internet connectivity.
In this situation, it is worthwhile to keep a local copy of the data, both to speed up operations (caching) and to enable offline access to the application.

The Mobile Client Application shall enable:
* Keeping a copy of the data retrieved for the currently logged-in user in a local repository (caching)
* Offline browsing and usage of the application with the latest data
* Browsing the old data while the application is connecting and retrieving the updated data

> For instance, I shall be able to open the application and see the old wall posts even if I am not connected. Whenever the Internet connection comes back, the new wall posts shall appear dynamically on the wall screen while I am browsing it.

* Caching data on the client side is not supported unless the SYNC API is used. *Older methods of caching are deprecated*.

> Caching has server-side effects. We may decide to change the value of object IDs, for instance, or change image URLs. Old data will become obsolete without warning. As a result, and to enforce backward compatibility across server API updates, caching must follow strict procedures.

What is not required from the Mobile Client Application is:
* To initialize and log in for the "first time" without connectivity.
* To execute any action that modifies data (i.e. any write API) without connectivity.

> For very specific scenarios, such behavior may be authorized (e.g. geo-location tracking).

A specific set of APIs has been designed to enable exactly that: the SYNC API.

> The sync process enables the client application to initialize its local repository and to update it while transmitting only minimal data (i.e. the changed information).

h2. Data Characteristics

We typically want to sync the family profile, the various family data (contacts, events, places, etc.), the family wall, the messages, and so on. As a result, we can summarize the characteristics of the data we want to sync as follows:
# *Data types*: the data is of several kinds (contacts, events, places, etc.), loosely related to one another.
# *Loose relationships*: relationships between data types are light. There may be a relationship between a contact and an event (the birthday), which means that there may be a contactId in the event's properties and an eventId in the contact's properties, but this is not a key idea in the data model.
# *Small hierarchy*: each piece of data may be composed of sub-elements (comments, for instance), but the nesting depth is low. Each element (each contact, event, etc.) may be considered atomic.
However, an extension mechanism is provided to stream comments/messages if they are too numerous.
# *Identity and modification date*: each element to be synchronized carries a unique ID and a modification date that enable its tracking.
# *List based*: for each data type, each user has a list that needs to be synchronized.
> *Full list*: for some elements, the client needs to obtain the complete list of existing elements (e.g. the contact list, the event list).

> *Paginated list*: for some other elements, the client only needs to obtain a partial list, because the total number of elements is too large. This is the case for messages and wall posts: there are too many wall posts to get all of them, so a pagination system must be put in place.

> *Singleton*: for instance the user's settings, the family details, etc. There is only one element of this type to sync.

> Note: no relationship can point to data types managed in an incomplete (paginated) list.

h2. Sync process: *Full list*

Because the types of data are roughly independent, we can sync each type of data independently.

The process of syncing one type of data is:
# The client is empty and does not contain any data.
# The client calls the sync API for the type of data it is syncing, e.g. "ctcsync" for contacts, "evtsync" for events, etc.
# The API returns the following: the *date of sync*, a *list of created/updated* items (since the last sync), a *list of deleted ids* (since the last sync), and a *sync* flag indicating whether the result is a delta (flag set) or a from-scratch result (flag not set).
# The client processes the result:
## If the sync flag is not set, it clears all its previous data of this type. In this case the list of deleted ids is empty and the list of created/updated items contains all the items.
## It deletes all items in the list of deleted ids (using the ID to identify them).
## It creates or updates all items in the list of created/updated items (using the ID to identify them).
## It stores the date of sync for the next round of sync.
# For any subsequent call to the sync API for the same type, the client sends back the *date of sync* *UNMODIFIED*. The server uses it to compute the list of created/updated/deleted items (deltas).
# The client can request all data instead of deltas by omitting the *date of sync* when calling the sync API. It may decide to do so if its local storage cannot be trusted (for instance it may have been cleared to reclaim storage, a different user may have logged in, etc.).
# The sync operation is idempotent: the same sync operation performed twice on the client's local storage yields the same result. The client shall therefore store the new *date of sync* only at the end of the process.

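The client-side steps above can be sketched as follows. This is only an illustration under stated assumptions: the class names, the field names (@sync_date@, @updated@, @deleted_ids@, @sync@) and the in-memory store are hypothetical and are not defined by this page.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SyncResult:
    """Hypothetical shape of one sync API response; field names are assumptions."""
    sync_date: str      # opaque date of sync, to be sent back unmodified
    updated: list       # created/updated items, each a dict with an "id" key
    deleted_ids: list   # ids deleted since the last sync
    sync: bool          # False means "from scratch": clear local data first


@dataclass
class LocalStore:
    """In-memory stand-in for the client's local repository of one data type."""
    items: dict = field(default_factory=dict)
    sync_date: Optional[str] = None


def apply_sync(store: LocalStore, result: SyncResult) -> None:
    """Apply one sync response to the local store."""
    if not result.sync:
        # From-scratch result: drop everything previously known for this type.
        store.items.clear()
    for item_id in result.deleted_ids:
        store.items.pop(item_id, None)      # deleting an unknown id is harmless
    for item in result.updated:
        store.items[item["id"]] = item      # create or overwrite by id
    # Stored last, so a crash before this line simply replays the same sync.
    store.sync_date = result.sync_date


store = LocalStore()
apply_sync(store, SyncResult("d1", [{"id": 1, "name": "bob"}], [], False))
apply_sync(store, SyncResult("d2", [{"id": 2, "name": "alice"}], [1], True))
```

Because deletions and upserts are both keyed by ID, applying the same response a second time leaves the store unchanged, which is the idempotency property the last step relies on.
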
>
> Sample of how the *date of sync* is computed, and why idempotency is important:
>
> * The client calls the server.
> * The server sets the current date of sync to 'now'.
> * If a database modification occurs here, it will be synced twice, but it does not matter.
> * The server gets the data from the database, computes the list of changes, etc.
> * The server returns the changes and the date of sync.
> * The client incorporates the changes into its local database.
> * If the client crashes here, the changes will be synced again next time, but it does not matter.
> * The client stores the new date of sync in its local storage.
>

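A minimal server-side sketch of this ordering is given below. It assumes numeric timestamps, a dictionary standing in for the database and a separate deletion log; the function name, its parameters and the inclusive comparison are all assumptions made for the illustration.

```python
def compute_sync(db, deletions, last_sync, now):
    """Compute one sync response.

    db:        dict id -> item dict with "id" and "modified" (numeric timestamps)
    deletions: dict id -> deletion timestamp
    last_sync: date of sync sent by the client, or None to request everything
    now:       server clock reading, taken BEFORE the database is read
    """
    sync_date = now  # fixed first, so nothing written during the read is lost
    if last_sync is None:
        return {"date": sync_date, "sync": False,
                "updated": list(db.values()), "deleted": []}
    # Inclusive comparison: an item may be sent twice, but never missed.
    updated = [item for item in db.values() if item["modified"] >= last_sync]
    deleted = [i for i, when in deletions.items() if when >= last_sync]
    return {"date": sync_date, "sync": True,
            "updated": updated, "deleted": deleted}


# Item 2 is written at t=101, after 'now' (100) was fixed but before the read.
db = {1: {"id": 1, "modified": 50}, 2: {"id": 2, "modified": 101}}
first = compute_sync(db, {}, None, 100)
second = compute_sync(db, {}, first["date"], 200)
```

In this run item 2 appears in both responses. That duplicate is harmless precisely because the client applies changes idempotently.
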
h3. currentClientDate and date (deprecated)

Optionally, the client may choose not to use the date of sync from the server and use its own date of sync instead: a date that it has recorded itself.
In this case, it must also send the "now date", i.e. the "currentClientDate", from an HTTP Date header.

However, this solution is deprecated and kept only for compatibility purposes.

h2. Sync Process: *Paginated list*

h3. New/updated elements pagination

Note that these lists are *ORDERED LISTS*, whereas the Full lists are not. Their sorting properties are:
* They are sorted by their sorting criterion, e.g. messages by most recently received, wall messages by appearance, etc.
* Their sorting criterion may be different from the modification date used for the sync process (e.g. setting the "read" flag of a message thread shall not change its position).
* The client must display the elements according to the sorting criterion.

In order to sustain the potentially heavy loads on those objects (e.g. hundreds of message threads), we need to implement a pagination mechanism:
* At first, the sync mechanism is the same as for a Full list: the client gives the date of the last sync, and the server computes the list of deleted ids and created/updated items.
* However, both the client and the server agree on a sorting criterion, usually a date (for message threads, the creation date of the most recent message in the thread). *TBD: input parameter of the API?*
* The client has an incomplete sorted list of elements: the top of the list is complete, but the bottom of the list is missing some elements. The last element of its list has an id of C (e.g. the client has the most recent message threads, the least recent one is id=123, and some older message threads may be missing; in this case C=123).
The client transmits the *lastId=C* parameter.
* Then the client asks the server to crop the list to a given number of items X, passed as an input parameter of the API (e.g. 10, because we only want the 10 last modified message threads, and holding or transmitting all the data may be too much). The client transmits the *nb=X* parameter.
* The server sorts the list of items according to the criterion and keeps only the items above C. ( *The deleted ids list is not paginated* ).
* The server only returns the first X items of the sorted created/updated list taken from the list above C.
If some items have been cropped (i.e. some modifications were not sent to the client), the *crop* flag is set to true.
* If the *crop* flag is set, the client knows it does not have all the updates. It cannot reconcile its cached elements with the new elements, so it must clear its cache and start fresh with a paginated list of X elements.
> The client can assume that other messages are not displayed and put a "get more" button in the UI, or something similar.
* If the *crop* flag is not set, the client can reconcile the updates with the existing elements in its cache and display the elements in its cache to the user.
* For this sync to work, the items must be displayed to the user using the same sorting criterion as the one requested from the server.
* The client shall remove older items and keep a reduced list depending on memory constraints.

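The server side of this mechanism might look like the sketch below. It is an illustration only: the function name, the item fields (@sort_key@, @modified@) and the numeric dates are assumptions, and "above C" is read as including C itself, which matches the first example on this page where the updated "alice" thread is returned for lastId=alice. The text does not say what happens when lastId is unknown to the server; here the whole list is kept in that case.

```python
def paginated_sync(threads, deleted_ids, last_sync, last_id, nb):
    """Sketch of a paginated sync response.

    threads:     list of dicts with "id", "sort_key" (higher = nearer the top)
                 and "modified" (numeric timestamp used for the sync itself)
    deleted_ids: ids deleted since last_sync (never paginated)
    last_id:     id of the last element the client holds (the C of the text)
    nb:          maximum number of created/updated items to return (X)
    """
    ordered = sorted(threads, key=lambda t: t["sort_key"], reverse=True)
    ids = [t["id"] for t in ordered]
    if last_id in ids:
        # Keep the window from the top of the list down to C, C included.
        ordered = ordered[: ids.index(last_id) + 1]
    changed = [t for t in ordered if t["modified"] > last_sync]
    return {
        "updated": changed[:nb],
        "deleted": list(deleted_ids),
        "crop": len(changed) > nb,  # True when some changes were left out
    }


# One new thread from joe and an updated alice thread, as in the first example.
threads = [
    {"id": "joe", "sort_key": 3, "modified": 5},
    {"id": "bob", "sort_key": 2, "modified": 1},
    {"id": "alice", "sort_key": 1, "modified": 5},
]
result = paginated_sync(threads, [], last_sync=2, last_id="alice", nb=10)
```

Note that the window is cut on the sorting criterion while the change filter uses the modification date, so the two stay independent as required above.
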
> The client must still honor all other sync mechanisms, such as the "date of sync" and the *sync* flag.

>
>Example with message threads:
>
>* On the client I have 2 message threads, from "bob" and "alice", both unread.
>* On the server I have some new updates to dispatch to the client: the "alice" message is read, and there is a new message from "joe".
>* The client calls the 'msgsyncthread' API with the last date of sync and asks for 10 more messages (the sorting date for threads cannot be set and is the date of the last message).
>  Parameters are lastId=alice,nb=10,date=d
>* The server returns the new "joe" message and the updated message from "alice" (read flag set to true).
>
>Example 2 with message threads:
>
>* On the client I have 2 message threads, from "bob" and "alice", both unread.
>* On the server I have 12 new threads from joe.
>* The client calls the 'msgsyncthread' API with the last date of sync and asks for 10 more threads (lastId=alice,nb=10,date=d).
>* The server returns the 10 new threads from joe sorted by the thread date, and sets the flag 'crop=true'. 2 new threads from joe are left on the server.
>* The client receives the 'crop=true' flag, so it discards the existing threads from "bob" and "alice" and only displays the 10 new threads from joe.
>Failing to do so would result in a "hole" between the threads from joe and the thread from "bob", which could never be filled by a subsequent sync query.
>The client shall display a "get more" button to indicate that the list is incomplete.
>
>
>Example 3 with a non-transmitted change:
>
>* From the situation above (on the client we have the joe1 to joe10 threads), the alice thread changes on the server (e.g. its read flag goes to false).
>* The client calls 'msgsyncthread' with lastId=joe10,nb=10,date=d2
>* The server retrieves the sorted list of threads and only keeps the ones sorted above joe10.
>* There are no changes in this list, so the server returns an empty created/updated list.
>
>* If the server had transmitted the new state of the thread from "alice", the list on the client would have been joe1...joe10, alice. There would be a hole in the list, missing the elements between "joe10" and "alice" (such as "bob"). The client would have no way to know about it and retrieve the missing elements. Therefore a BUG!
>

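On the client, the handling of the *crop* flag illustrated by these examples could be sketched as follows. The response shape (@updated@, @deleted@, @crop@) and the @sort_key@ field are assumptions made for the illustration, not names defined by this page.

```python
def apply_paginated_sync(cache, result):
    """Return the new cache (dict id -> item) after one paginated sync response."""
    if result["crop"]:
        # Some updates were left on the server: the old cache cannot be
        # reconciled without risking a hole, so start fresh.
        merged = {}
    else:
        merged = dict(cache)
        for item_id in result["deleted"]:
            merged.pop(item_id, None)
    for item in result["updated"]:
        merged[item["id"]] = item
    return merged


def display_order(cache):
    """Ids in display order, using the same sorting criterion as the server."""
    ordered = sorted(cache.values(), key=lambda t: t["sort_key"], reverse=True)
    return [t["id"] for t in ordered]


cache = {
    "bob": {"id": "bob", "sort_key": 2, "read": False},
    "alice": {"id": "alice", "sort_key": 1, "read": False},
}
# First example: nothing cropped, so updates are merged into the cache.
merged = apply_paginated_sync(cache, {
    "crop": False, "deleted": [],
    "updated": [{"id": "joe", "sort_key": 3, "read": False},
                {"id": "alice", "sort_key": 1, "read": True}],
})
# Second example: crop is set, so "bob" and "alice" are discarded.
fresh = apply_paginated_sync(cache, {
    "crop": True, "deleted": [],
    "updated": [{"id": "joe%d" % i, "sort_key": 100 - i, "read": False}
                for i in range(1, 11)],
})
```

After a cropped response the client would also show its "get more" control, since the list it holds is known to be incomplete.
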
h3. Acquiring more pages

A new pagination API (which is not directly related to sync) enables the client to request missing/more elements. The process is the following:
* The client has an incomplete sorted list of elements: the top of the list is complete, but the bottom of the list is missing some elements. The last element of its list has an id of C (e.g. the client has the most recent message threads, the least recent one is id=123, and some older message threads may be missing; in this case lastId=123).
* The client calls the pagination API with lastId=C and the number nb=Y of next elements to retrieve (e.g. Y=10: I want 10 more elements).
* The server loads and sorts the elements according to the same criterion and computes the list of (at most) Y elements after C.
If the server knows that there are no more elements after these, it sets the *nomore* flag.
* The client completes its cache with the returned elements.
* If the client sees *nomore* = true, it knows that it does not need to retrieve more elements: its cache is complete.

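A possible server-side sketch of this pagination call is shown below. The function name, the @sort_key@ field and the response keys are hypothetical; the text only fixes the lastId and nb inputs and the *nomore* flag. An unknown lastId raises an error here, a case the text leaves open.

```python
def next_page(threads, last_id, nb):
    """Return at most nb elements sorted after last_id, plus the nomore flag."""
    ordered = sorted(threads, key=lambda t: t["sort_key"], reverse=True)
    ids = [t["id"] for t in ordered]
    start = ids.index(last_id) + 1   # raises ValueError if last_id is unknown
    page = ordered[start:start + nb]
    return {"items": page, "nomore": start + nb >= len(ordered)}


# Server-side list: joe1 (most recent) ... joe12, then bob, then alice.
threads = [{"id": "joe%d" % i, "sort_key": 100 - i} for i in range(1, 13)]
threads += [{"id": "bob", "sort_key": 2}, {"id": "alice", "sort_key": 1}]
page = next_page(threads, last_id="joe10", nb=10)
```

The client would append @page["items"]@ to its cache and hide its "get more" control once @nomore@ is true.
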
> This paging API does not replace the sync API. The client is encouraged to also perform a sync whenever it requests a page, using HTTP grouping, in order to display the most up-to-date information to the user. If the sync API returns a *crop* flag, the result of the pagination API shall be discarded (however, this is very unlikely).

> The *nomore* flag is also set by the sync API. It may improve the UX and prevent useless API calls.

>
>Continuation of Example 2 with message threads:
>
>* The user clicks on the "get more" button, or swipes, or something similar, and the client needs to display the next threads after the 10th thread from "joe".
>* The client calls the pagination API with the id of the "joe10" thread.
>* The server responds with the next 4 threads: "joe11", "joe12", "bob" and "alice". It sets the nomore flag to true.
>* The client displays a complete list of 14 threads.
>

h2. Sync process: *partial elements*

There are cases where the server shall not transmit the full content of an element. Imagine a wall message with hundreds of comments, or a message thread with hundreds of messages. Even though we said in the data characteristics that the data has a *Small hierarchy* ("each element may be considered atomic"), this is not practical in such cases.

As a result, for comments and messages (in a thread), we shall assign a sorting criterion and have the client and the server sort those items in this order (for comments and messages, it is the creation date).

> The client must display comments and messages according to this sorting criterion.

* The server only returns incomplete wall messages and threads, with only the first X comments and messages.
* The nomore flag is managed for those APIs.
* The client does not cache the comments and messages independently of the wall messages and message threads.
* The client may call a "next comment" or "next message" API to retrieve more messages or comments for a given wall message or thread.
* The client may cache those comments and messages only along with their parent elements.

>
> Example with joe's threads:
>
>* Let's say that one of joe's threads has 20 messages, whereas all of joe's other threads have 1 message.
>* The client calls the 'msgsyncthread' API with maxmessages=10.
>* In the result, the thread elements contain at most 10 messages. The one with 20 messages only carries 10 messages and nomore=false, whereas the other threads carry 1 message and nomore=true.
>* The client stores those threads in its cache.
>* Whenever the user clicks on a thread, the client displays only the messages in the cache. For the thread with nomore=false, the client also displays a "get more" button or something similar.
>* If the user clicks on the button, the client calls the pagination API for the child elements (messages in this case), with the lastId and the nb.
>

The pagination APIs for those messages and comments are similar to those for the other types of objects.

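The truncation of child elements could be sketched as below. The helper name, the @created@ field and the choice to keep the first elements in sort order are assumptions; the text does not state which end of the sorted list counts as "first".

```python
def truncate_children(element, children_key, max_children):
    """Return a copy of element carrying at most max_children child items.

    Children are sorted by creation date; the copy gets a "nomore" flag that
    is True when no child was left out.
    """
    children = sorted(element[children_key], key=lambda c: c["created"])
    partial = dict(element)                      # leave the original untouched
    partial[children_key] = children[:max_children]
    partial["nomore"] = len(children) <= max_children
    return partial


long_thread = {"id": "joe1",
               "messages": [{"id": n, "created": n} for n in range(20)]}
short_thread = {"id": "joe2", "messages": [{"id": 0, "created": 0}]}

partial_long = truncate_children(long_thread, "messages", 10)
partial_short = truncate_children(short_thread, "messages", 10)
```

The client caches each returned thread as a whole, messages included, and uses the thread's own @nomore@ flag to decide whether to offer a "get more" control for its messages.
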
h2. Additional behavior

* For Wall Messages, a list of *types* is sent to the sync API as an additional filter. This enables the client to handle fewer types than the server.

* ...

h2. Singleton

Singleton information is to be synchronized too, but it simply consists of:
* family name and picture
* family members, accounts and account identifiers
* account profiles
* account settings

For these types of data, a specific API shall be designed:
* The family modification date shall be updated whenever the list of family members changes (created/updated/deleted).
* A special API shall return all modified elements for family/family members/account profiles and settings since the last date of sync.

h2. Sync process: central API

The workflow of operations enabling the sync is typically the following:
# When the app is used for the first time, the user must enter their credentials, which requires the app to authenticate to the server and initialize its internal database.
# The client SHALL call all SYNC APIs at once using JSON HTTP Grouping. See [[RequestResponseProtocol]]
# The client may use only one *date of sync* in its local storage, as long as it is updated with the oldest one in the result of the grouped call to all sync APIs.
# Relationships between elements are resolved only after all items are synchronized. If a relationship cannot be resolved, the local database is cleared.

> Inconsistencies may arise whenever relationships are managed. E.g. Wall Messages may still be present and reference events that are already deleted. In order to resolve those inconsistencies, the Client MUST check for them and, when one is detected, rerun a full sync.

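The last two steps of the workflow might be sketched as follows. The response layout, the @eventId@ field on wall messages and both helper names are hypothetical. The wire format of the grouped call is described in [[RequestResponseProtocol]] and is not reproduced here.

```python
from datetime import datetime, timezone


def oldest_sync_date(grouped_results):
    """Pick the single date of sync to store after a grouped call.

    grouped_results: dict api name -> response dict carrying a "date" value.
    Keeping the oldest date means no API can miss a change on the next call;
    some changes may be received twice, which idempotency makes harmless.
    """
    return min(result["date"] for result in grouped_results.values())


def dangling_event_refs(wall_messages, events):
    """Ids of wall messages that reference an event absent from the local data."""
    return [m["id"] for m in wall_messages
            if m.get("eventId") is not None and m["eventId"] not in events]


results = {
    "ctcsync": {"date": datetime(2015, 5, 12, 13, 55, 2, tzinfo=timezone.utc)},
    "evtsync": {"date": datetime(2015, 5, 12, 13, 55, 0, tzinfo=timezone.utc)},
}
stored_date = oldest_sync_date(results)

events = {"e1": {"id": "e1"}}
wall = [{"id": "w1", "eventId": "e1"},
        {"id": "w2", "eventId": "e9"},
        {"id": "w3"}]
needs_full_sync = bool(dangling_event_refs(wall, events))
```

When @needs_full_sync@ is true, the client would clear its local database and call the sync APIs again without a date of sync.
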
h2. Multimedia caching

Standard HTTP caching MUST take place on Media URIs. Media URIs are immutable (i.e. the content behind a given URI never changes; whenever the content changes, the URI changes). See [[MediaManagement]]