ChronoShare Design Documentation
December 23, 2012
Abstract
This is the design document for ChronoShare. It documents the thinking behind the design and the
(semi-)details of how to implement it.
1 Introduction
The goal of ChronoShare is to provide a user-friendly, completely distributed, and bandwidth-efficient file
sharing and collaboration tool that runs on top of Named Data Networking (NDN). Quite a few
products already provide similar services, such as Dropbox, Box, SparkleShare, Google Drive, SkyDrive, and
SugarSync, driven by a tremendous demand for file sharing among multiple personal devices and among
people who work on the same project (among other reasons). Unlike most of these products, which are based on a
centralized design, ChronoShare is completely distributed: the service continues to work without
requiring any particular device to be present. Security is provided by NDN’s built-in signatures (which
guarantee provenance) and per-file-per-version encryption (which provides access control).
The main idea of ChronoShare is to use ChronoSync [2] to track the sequentially named actions
generated by each participant. Each participant applies the actions to the folder in the same order, resulting
in the same file state for the folder. Given the file state, a participant can decide whether or not to
actually fetch the files in the shared folder.
The terms participant, file state, and action are defined as follows:
Participant: A participant is usually a user account on a device. We assume users have different
keys for different devices.
File State: The state of the files maintained in the shared folder, including the file path, file version, owner,
timestamp, etc. of each file.
Action: An operation that changes the file state, e.g. adding a file or deleting a file.
2 Requirements
This section gives a list of requirements that ChronoShare must satisfy.
1. Completely server-less, while making it convenient to add permanent storage.
2. Users can use different topology-dependent name prefixes as they move without causing trouble
for previously shared files.
3. Files shared by a user can be served by any other peer that possesses the original content objects of
the files.
4. A single copy of a file should be published regardless of how many sharing groups the user
decides to share the file with.
5. Access control should be provided. (Footnote 1: Given the short deadline, this perhaps has lower priority until we get the system running.)
3 Naming Rules
A unique ID (UID) is required for each participant when constructing the name prefixes. We use the
hash of a participant’s key as the UID.
The naming rule for Sync Interests and Data is the same as in ChronoSync [2]. An example is:
/ndn/broadcast/chronoshare/shared-folder-name/root-digest.
To make shared-folder-name unique in the broadcast namespace, it is composed by concatenating the
UID of the creator and the name the creator chose for the folder. For example, if user John Smart created a
shared folder “/Users/jsmart/achievements” on his 15.4” retina-display MacBook Pro, the shared-folder-name
in the broadcast name would be Hash(John’s-key-on-Macbook-pro)-achievements.
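The construction of the UID and the shared-folder-name can be sketched as follows. This is only an illustration: the document does not say which hash function is used, so SHA-256 (hex-encoded) is an assumption, and the helper names `participant_uid`, `shared_folder_name`, and `sync_interest_name` are hypothetical.

```python
import hashlib


def participant_uid(key: bytes) -> str:
    """UID = hash of the participant's key (SHA-256 hex is an assumption)."""
    return hashlib.sha256(key).hexdigest()


def shared_folder_name(creator_key: bytes, folder_label: str) -> str:
    """Concatenate the creator's UID and the name chosen for the folder."""
    return f"{participant_uid(creator_key)}-{folder_label}"


def sync_interest_name(creator_key: bytes, folder_label: str, root_digest: str) -> str:
    """Build a Sync Interest name in the broadcast namespace."""
    folder = shared_folder_name(creator_key, folder_label)
    return f"/ndn/broadcast/chronoshare/{folder}/{root_digest}"


# Placeholder key bytes; a real key would come from the device's key store.
johns_key = b"johns-key-on-macbook-pro"
folder_name = shared_folder_name(johns_key, "achievements")
interest = sync_interest_name(johns_key, "achievements", "00a12")
```

Because the UID depends only on the key, the same folder label created on two different devices yields two distinct shared-folder-names.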
The naming rules for the action data and the file data are different. This is because in ChronoSync [2]
we did not consider a participant moving and using a different name prefix during a synchronization
session. If that happened, the participant would be treated as a different participant. However,
in ChronoShare, a device is expected to keep participating in the synchronization session until the
user explicitly “deregisters” it, and the device should be treated as the same one regardless of
what name prefix the user is using for it.
Thus, we introduce the concept of a topology-independent name prefix (TIP), or “virtual (or permanent, or private) name prefix” if you will, which users can use for the action data and file data
regardless of where they get NDN connectivity. As it is “private” (routers do not need to know how
to forward Interests with such prefixes), we have complete control over what it should look like.
For convenience, we assign all participants in a sharing group TIPs with the same prefix for their action data: /action/shared-folder-name. The full form of the TIP is /action/shared-folder-name/participant’s-UID.
This gives us a unique TIP for action data per participant per sharing group. For instance,
if user Michael Smith is invited to share John’s achievements folder, then the TIP for Michael’s Galaxy III
Android phone would be /action/Hash(John’s-key-on-Macbook-pro)-achievements/Hash(Michael’s-key-on-Galaxy-III).
However, the TIP for a participant’s file data does not have such a per-shared-folder prefix; its form
is simply /file. This is because we want to keep only one copy of a file even if it is shared in multiple folders.
Hence, its name has to be independent of the shared-folder-name. We also do not need to include the
publisher’s UID in the file name prefix, thanks to the naming scheme we choose for the files. More details
about the reasoning behind this can be found in Section 6.
The benefit of introducing TIPs is a cleaner design. Otherwise, the user may have published
some data under one name prefix and other data under another, and it becomes a brain-twisting
task to fetch the user’s previously published data. Some data also needs to be published
under the same prefix. For example, the action data tracked by ChronoSync must be published under
the same name prefix, as that is a design assumption adopted by ChronoSync.
However, in order for Interests to reach the producer (or some entity that stores the content), a
topology-dependent name prefix (TDP) must be used. Hence, although TIP names can be used
internally by ChronoShare, they must be translated to TDP names before entering the NDN network.
The translation is defined as follows:
TIP                          ↔   TDP
/action/shared-folder/UID    ↔   /a-participant’s-current-prefix/action/shared-folder/UID
/file                        ↔   /a-participant’s-current-prefix/file
Table 1: Mapping between TIPs and TDPs
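The mapping in Table 1 amounts to prepending or stripping a participant’s current prefix. A minimal sketch, assuming names are plain slash-separated strings (the function names `tip_to_tdp` and `tdp_to_tip` are hypothetical, and real NDN names are component-wise rather than strings):

```python
TIP_TYPES = ("action", "file")  # the "type" field: first component of a TIP


def _check_tip(tip: str) -> None:
    """Reject names whose first component is not a known TIP type."""
    first = tip.strip("/").split("/")[0]
    if first not in TIP_TYPES:
        raise ValueError(f"not a TIP name: {tip!r}")


def tip_to_tdp(tip: str, current_prefix: str) -> str:
    """Prepend a participant's current (topology-dependent) prefix."""
    _check_tip(tip)
    return current_prefix.rstrip("/") + tip


def tdp_to_tip(tdp: str, current_prefix: str) -> str:
    """Strip the current prefix to recover the TIP name."""
    prefix = current_prefix.rstrip("/")
    if not tdp.startswith(prefix + "/"):
        raise ValueError(f"{tdp!r} is not under prefix {prefix!r}")
    tip = tdp[len(prefix):]
    _check_tip(tip)
    return tip


tdp = tip_to_tdp("/action/shared-folder/UID", "/ndn/ucla.edu")
tip = tdp_to_tip(tdp, "/ndn/ucla.edu")
```

Checking the type component on both directions is one way the “type” field could help the mapping: anything under the local prefix that does not start with /action or /file can be rejected early.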
Note that the first component of both the action and file data TIPs is a “type” field that helps
the mapping to and from TDPs. (We are not sure whether this type field will benefit us, i.e.
whether distinguishing TIPs for action and file data will help much.)
Ideally, the resulting TDP should use the UID owner’s current prefix (ChronoSync provides a mapping
between the UID and the current prefix of a participant) to direct the TIP Interest to the original
producer. However, the original producer may sometimes be unreachable (e.g. powered off, no Internet
access, etc.). Hence, it is also legitimate to use a different participant’s current prefix to construct
the TDP, given the knowledge that this participant possesses the TIP Data packet produced by the
original producer.
When receiving a TDP Interest, the participant extracts the TIP name and, if the TIP Data packet
exists in storage, creates a Link object between the TDP name and the TIP name and replies to the TDP
Interest with the Link object. An example of a Link object is shown in Figure 1. This preserves the
signature of the original producer, so the requester can always be assured of the provenance.
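[Figure 1 shows an outer Data packet named /ndn/ucla.edu/shared-folder/UID that carries “Link: /shared-folder/UID” and a signature for the Link, together with the original Data packet named /shared-folder/UID with its content and its own signature.]
Figure 1: A Link object that links TDP name and TIP name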
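The reply to a TDP Interest can be sketched as below. This is an illustration under assumptions: the classes `DataPacket` and `LinkObject`, the function `serve_tdp_interest`, and the `sign` callback are all hypothetical and do not correspond to an actual NDN or ccnx API. The point shown is that the original packet, including the producer’s signature, is carried unmodified.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class DataPacket:
    name: str          # TIP name
    content: bytes
    signature: bytes   # signature of the original producer


@dataclass(frozen=True)
class LinkObject:
    name: str              # TDP name that the Interest asked for
    link: str              # TIP name the Link points to
    link_signature: bytes  # signature of the replying participant
    target: DataPacket     # original packet, carried unmodified


def serve_tdp_interest(tdp_name: str, my_prefix: str,
                       storage: Dict[str, DataPacket],
                       sign: Callable[[bytes], bytes]) -> Optional[LinkObject]:
    """Extract the TIP name and reply with a Link if the TIP data is stored."""
    prefix = my_prefix.rstrip("/")
    if not tdp_name.startswith(prefix + "/"):
        return None
    tip_name = tdp_name[len(prefix):]
    packet = storage.get(tip_name)
    if packet is None:
        return None  # no corresponding data: the Interest is not answered
    link_sig = sign((tdp_name + tip_name).encode())
    return LinkObject(tdp_name, tip_name, link_sig, packet)


# Stand-in for a real signing operation; a hash is NOT a signature.
fake_sign = lambda data: hashlib.sha256(data).digest()

original = DataPacket("/shared-folder/UID", b"...", b"producer-signature")
store = {original.name: original}
reply = serve_tdp_interest("/ndn/ucla.edu/shared-folder/UID", "/ndn/ucla.edu",
                           store, fake_sign)
```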
4 Overview
The ChronoShare system can be divided into four main components with a database as the storage, as
shown in Figure 2.
[Figure 2 shows four components around a Database: User Interface, File State Manager, ChronoSync, and Topology Aware Data Daemon. Labels on the arrows include: file change by local user; request for file history; response for file history; file update from remote; instruction to fetch / publish files / actions with TIP; files / actions by remote; actions with TIP; mapping between UID and prefix; sync update {User_ID, New_Seq}; local user action Seq++; Sync Log; direct DB access for old version.]
Figure 2: The ChronoShare components and interactions among components
ChronoSync tracks the actions of all participants; File State Manager maintains the file state of a
participant by applying the actions; Topology Aware Data Daemon translates between TIP names and TDP
names, and also serves TDP Interests. User Interface presents the files in the current file state to the user in
native file system form; it also provides a tool to inspect the file history and check out an old version of a
file. All four components interact with the database storage.
The interactions among the components and with the storage are explained in detail in the following
sections.
The most common work flows, publishing a new version of a file and retrieving a new version of a file,
are illustrated in Figure 3.
[Figure 3 consists of two sequence diagrams among User Interface, File State Manager, ChronoSync, Topology Aware Data Daemon, Database, and CCND.]
(a) Work flow of publishing a file. Message labels: File updated; Publish Action; Publish action with TDP; Save action with TIP; Sync Seq++; Publish Sync Data; Publish File; Publish file with TDP; Save File with TDP, in batch or other optimization.
(b) Work flow of retrieving a file. Message labels: Sync Data (User_ID: Missing Seqs); Fetch Action; Fetch Action with TDP; Reply Action with TDP; Reply Action; Save Action with TIP; Notify change on file state; Fetch File; Fetch File with TDP; Reply File; Save File with TIP, in batch; Notify File ready.
Figure 3: Two common work flows
5 ChronoSync
There are mainly three changes to the ChronoSync library:
• Track the current prefix of the participant, but use UID as the identifier (rather than prefix as in
current implementation).
• Store Sync Log in database. Optionally, store Sync Tree in database.
• Change API for local update on Sync Tree.
The current ChronoSync uses the prefix as the participant identifier; this no longer works for ChronoShare,
where a participant’s prefix may change. The UID is a natural candidate for identification. However, the
current prefix of a participant still needs to be tracked.
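A minimal sketch of a Sync Tree keyed by UID, with the current prefix kept as a node property, is given below. The class and method names are hypothetical and are not the ChronoSync library API; the prefixes in the example are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SyncNode:
    uid: str                  # hash of the participant's key (the identifier)
    seq: int = 0              # latest known sequence number
    current_prefix: str = ""  # topology-dependent prefix, may change over time


class SyncTree:
    def __init__(self) -> None:
        self.nodes: Dict[str, SyncNode] = {}

    def update(self, uid: str, seq: int,
               current_prefix: Optional[str] = None) -> SyncNode:
        """Record a newer sequence number and, if given, a new prefix."""
        node = self.nodes.setdefault(uid, SyncNode(uid))
        node.seq = max(node.seq, seq)
        if current_prefix is not None:
            node.current_prefix = current_prefix
        return node

    def prefix_of(self, uid: str) -> str:
        """Mapping between UID and current prefix, as needed to build TDPs."""
        return self.nodes[uid].current_prefix


tree = SyncTree()
tree.update("001a", 1, "/ndn/ucla.edu")
tree.update("001a", 2, "/ndn/example/home")  # same device, new location
```

Because the node is looked up by UID, a participant that moves keeps a single node and a single sequence-number history.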
[Figure 4 relates four structures.
Sync Tree: the root carries the state hash (00a12...); each node has the properties key hash, sequence number, and current TDP, e.g. 001a… (Alex’s Mac key) and b18c… (Alex’s iPhone key), both with current TDP /ndn/ucla.edu.
Sync Log: state tree modifications, e.g. <update b18c… to seq number 1>, <update b18c… to seq number 2>, <update b18c… to seq number 3>, <update 001a… to seq number 2>.
Shared Folder Tree: /shared contains img.jpg (content hash b873...) and /subfolder, which contains secret.txt (content hash 093a...). File properties are name, content hash, and permissions (e.g. img.jpg, b873..., 0755); folder properties are name and permissions (e.g. /subfolder, 0755).
Action Log, keyed by key and sequence number: b18c... (iPhone) 2: <add /file/b873… as img.bmp with 0644>; b18c... (iPhone) 3: <rename /file/093a… from img.bmp to img.jpg>; 001a... (Mac) 2: <add /file/093a… as subfolder/secret.txt with 0644>.]
Figure 4: Relations between sync tree, sync log, shared folder tree, and action log
In the current ChronoSync library, the Sync Tree and Sync Log are not persistent: they are
maintained in memory only. This is because the library was designed for short-term synchronization
sessions such as a text chat session. For it to work with ChronoShare, the Sync Log should be stored in
the database. Optionally, the Sync Tree could also be stored in the database, which saves the work of
reconstructing the Sync Tree by walking through the entire Sync Log when ChronoShare restarts. Note that
if the Sync Log is going to be trimmed, then the Sync Tree MUST be stored in the database.
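The trade-off can be illustrated with a small sketch that stores the Sync Log in SQLite and reconstructs the per-participant sequence numbers by walking the whole log. The table layout (`SyncLog` with `uid` and `seq` columns) and the function names are assumptions made for this example, not the actual schema.

```python
import sqlite3
from typing import Dict


def open_sync_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a persistent Sync Log; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS SyncLog ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " uid TEXT NOT NULL,"
        " seq INTEGER NOT NULL)"
    )
    return conn


def log_update(conn: sqlite3.Connection, uid: str, seq: int) -> None:
    """Append one state tree modification: <update uid to seq number seq>."""
    conn.execute("INSERT INTO SyncLog (uid, seq) VALUES (?, ?)", (uid, seq))
    conn.commit()


def rebuild_sync_tree(conn: sqlite3.Connection) -> Dict[str, int]:
    """Walk the entire log in order to recover uid -> latest seq."""
    tree: Dict[str, int] = {}
    for uid, seq in conn.execute("SELECT uid, seq FROM SyncLog ORDER BY id"):
        tree[uid] = max(seq, tree.get(uid, 0))
    return tree


conn = open_sync_log()
for uid, seq in [("b18c", 1), ("b18c", 2), ("b18c", 3), ("001a", 2)]:
    log_update(conn, uid, seq)
rebuilt = rebuild_sync_tree(conn)
```

If old log rows were deleted, `rebuild_sync_tree` could no longer recover the state on its own, which is why a trimmed log requires the tree itself to be persisted.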
An additional challenge brought by this change is that, when an unrecognized Sync Interest is seen, it is harder
to walk the Sync Log and compute the difference.
The third change concerns the API for local updates. Previously, the API performed two tasks: publishing
the content and incrementing the sequence number of the Sync Tree node for the local participant. Now the
publishing task is handled by the Topology Aware Data Daemon. This change is small.
6 File State Manager (FSM)
FSM borrows basic concepts from Git [1].
6.1 Content Tracking
Like Git, FSM is a content tracking system. That is, FSM tracks blobs (e.g. the raw bytes inside
a file) based on their content, rather than on the filename or directory names from the user’s original file layout.
The meta data, such as filename or directory names, are associated with the blobs in secondary ways. To
enable this, FSM addresses the files in the shared folder by the hash of their contents. For simplicity, we
refer to the hash of a file’s content as its FID. The NDN name prefix for a file’s Data packets is /file/FID.
Thus, if two separate files located in two directories have exactly the same content, FSM stores only
one copy of the content in the database storage. FSM also publishes only one copy of the file to NDN in
such a case.
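Content addressing and the resulting de-duplication can be sketched as follows. The hash function (SHA-256) and the `BlobStore` class are assumptions made for illustration; the document does not fix either.

```python
import hashlib
from typing import Dict, List


def fid_of(content: bytes) -> str:
    """FID = hash of the file content (SHA-256 hex is an assumption)."""
    return hashlib.sha256(content).hexdigest()


class BlobStore:
    """Stores each distinct content once, keyed by FID."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}
        self.published: List[str] = []  # NDN prefixes announced so far

    def add(self, content: bytes) -> str:
        fid = fid_of(content)
        if fid not in self.blobs:
            # First time this content is seen: store and publish it once.
            self.blobs[fid] = content
            self.published.append(f"/file/{fid}")
        return fid


store = BlobStore()
fid_a = store.add(b"same bytes")   # e.g. /a.jpg
fid_c = store.add(b"same bytes")   # e.g. /subdir/c.jpg, identical content
fid_b = store.add(b"other bytes")  # e.g. /b.txt
```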
6.2 File State Table
FSM maintains a file state table, which tracks the latest mapping between FIDs and the file
system meta data (including the mapping for deleted files). The meta data that we care about is the file path.
Note that whereas a file path must be unique on a device, multiple file paths can map to the same FID.
We also have a status field to indicate whether or not the file has been deleted from the file system.
Based on this table, the User Interface should be able to construct the shared folder in the native file
system with the content stored in the database. We also need to know the owner of the FID so that
the NDN name prefix for the file Data packets can be obtained.
Table 2 shows an example of such a table. Note that /b.txt is a deleted file, while /a.jpg and /subdir/c.jpg
are actually the same file (same content).
File Path        FID        Status
/a.jpg           asf2321a   1
/b.txt           lkqwe282   0
/subdir/c.jpg    asf2321a   1
Table 2: An example of file state table
6.3 Action Table
The file state table is constructed by applying actions. Actions can originate either from the local
participant or from remote peers. The action table keeps all known actions performed on the
shared folders.
An action has several necessary fields: the shared folder on which the action is performed, the performer, the
timestamp, the action type, the file path to be operated on, and the owner and FID of the file content.
Additionally, we also keep the hashes of the file state table before and after performing the action.
(Alex has to explain how we are going to use these two hashes; currently they are not
shown in the table below.)
Table 3 shows an example of such a table. By applying the sequence of actions to an empty file state
table, we would get Table 2.
Shared-folder   UID       Timestamp    Type     File Path        FID
achievements    lqweidq   1356028370   Add      /a.jpg           asf2321a
achievements    lqweidq   1356028472   Add      /b.txt           zxcvklq1
achievements    iouqwer   1356028490   Update   /b.txt           lkqwe282
achievements    iouqwer   1356028490   Delete   /b.txt           lkqwe282
achievements    lqweidq   1356028555   Add      /subdir/c.jpg    asf2321a
Table 3: An example of action table
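Applying the actions of Table 3 in order to an empty file state table can be sketched as below. The semantics assumed here are that Add and Update set the FID of a path and mark it present (status 1), while Delete keeps the last FID and marks the path deleted (status 0); the function name `apply_actions` is hypothetical.

```python
from typing import Dict, List, Tuple

# Rows of Table 3: (shared folder, UID, timestamp, type, file path, FID)
ACTIONS: List[Tuple[str, str, int, str, str, str]] = [
    ("achievements", "lqweidq", 1356028370, "Add", "/a.jpg", "asf2321a"),
    ("achievements", "lqweidq", 1356028472, "Add", "/b.txt", "zxcvklq1"),
    ("achievements", "iouqwer", 1356028490, "Update", "/b.txt", "lkqwe282"),
    ("achievements", "iouqwer", 1356028490, "Delete", "/b.txt", "lkqwe282"),
    ("achievements", "lqweidq", 1356028555, "Add", "/subdir/c.jpg", "asf2321a"),
]


def apply_actions(actions) -> Dict[str, Tuple[str, int]]:
    """Return the file state table as {file path: (FID, status)}."""
    state: Dict[str, Tuple[str, int]] = {}
    for _folder, _uid, _timestamp, kind, path, fid in actions:
        if kind in ("Add", "Update"):
            state[path] = (fid, 1)   # file present with this content
        elif kind == "Delete":
            state[path] = (fid, 0)   # keep the mapping, mark as deleted
        else:
            raise ValueError(f"unknown action type: {kind}")
    return state


file_state = apply_actions(ACTIONS)
```

The result has the three rows of Table 2, with /a.jpg and /subdir/c.jpg sharing one FID and /b.txt marked as deleted.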
6.4 File History and Conflict Solving
When requested, FSM should provide the history of a file. The action table contains enough information to
construct the history. In the simplest case, we can query the database with “select * from ActionTable
where FilePath = file-to-be-queried” and do simple processing.
More sophisticated processing could also enable us to show continuous history even after a file is
renamed.
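The simple history query can be sketched with Python’s sqlite3 module. The column names of ActionTable below are assumptions derived from the fields listed in Section 6.3; only the table name and the FilePath column appear in the text.

```python
import sqlite3
from typing import List, Tuple


def file_history(conn: sqlite3.Connection, file_path: str) -> List[Tuple]:
    """Return (UID, timestamp, type, FID) for one path, oldest first."""
    cursor = conn.execute(
        "SELECT UID, Timestamp, Type, FID FROM ActionTable"
        " WHERE FilePath = ? ORDER BY Timestamp, rowid",
        (file_path,),  # bound parameter instead of string concatenation
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ActionTable (SharedFolder TEXT, UID TEXT,"
    " Timestamp INTEGER, Type TEXT, FilePath TEXT, FID TEXT)"
)
conn.executemany(
    "INSERT INTO ActionTable VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("achievements", "lqweidq", 1356028370, "Add", "/a.jpg", "asf2321a"),
        ("achievements", "lqweidq", 1356028472, "Add", "/b.txt", "zxcvklq1"),
        ("achievements", "iouqwer", 1356028490, "Update", "/b.txt", "lkqwe282"),
        ("achievements", "iouqwer", 1356028490, "Delete", "/b.txt", "lkqwe282"),
        ("achievements", "lqweidq", 1356028555, "Add", "/subdir/c.jpg", "asf2321a"),
    ],
)
history = file_history(conn, "/b.txt")
```

Following a file across renames would need more than a path filter, for example chaining on the FID, which is the “more sophisticated processing” mentioned above.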
Conflicts are resolved by picking the latest action on a file, based on timestamp.
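A minimal sketch of this rule is shown below. The document does not say what happens when two actions carry the same timestamp; breaking ties by UID is purely an assumption of this example, as are the function name and the sample UIDs and FIDs.

```python
from typing import Dict, List


def latest_action(actions: List[Dict]) -> Dict:
    """Pick the winning action among conflicting actions on one file path.

    The latest timestamp wins; ties are broken by UID (an assumption).
    """
    return max(actions, key=lambda a: (a["timestamp"], a["uid"]))


conflicting = [
    {"uid": "lqweidq", "timestamp": 1356028600, "type": "Update", "fid": "aaaa1111"},
    {"uid": "iouqwer", "timestamp": 1356028605, "type": "Update", "fid": "bbbb2222"},
]
winner = latest_action(conflicting)
```

Any deterministic tie-break would do, as long as every participant applies the same one and therefore reaches the same file state.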
6.5 Interaction with other components
When the User Interface detects file changes, it notifies FSM about the change. FSM updates the file state
table and creates an action for the change. It then notifies ChronoSync to propagate the action. If the action
results in a new /file/FID prefix, then FSM also notifies the Topology Aware Data Daemon (Section 7) to publish
the file.
When ChronoSync notifies FSM about a new action, FSM fetches the action and performs it.
It then notifies the User Interface about the changes in the shared folder. If there is a need to fetch the
file data, it fetches the data and notifies the User Interface when it is ready. (Or should the User Interface
be doing this task?)
7 Topology Aware Data Daemon (TADD)
This component serves as a gateway between the application, where TIPs are used, and the NDN network,
where TDPs are required. Other components call APIs provided by TADD to fetch or publish data.
The APIs should be asynchronous.
7.1 Fetch with TIPs
When receiving TIP Interests, TADD first checks whether the corresponding TIP data exists in the database.
(Or not? At least we should do some optimization so we don’t check every time for a large
sequence of TIP names.) If not, it queries the mapping between the participant’s UID and prefix and
converts the TIP Interests to TDP Interests, as illustrated in Table 1.
When the TDP Data arrives, TADD strips the Link and gets the TIP Data. After verifying the TIP Data,
it stores it in the database and returns it to the components that requested it.
Rather than relying on ccnd’s callback table for Interests, TADD will maintain its own callback
table. The benefit is that other components do not need to deal with ccnx directly, and there is no
need for ccnd to redirect TIP Interests from other components to TADD.
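The fetch path with TADD’s own callback table can be sketched as follows. Everything here is hypothetical: the class `Tadd`, its methods, and the `express_interest` callback stand in for whatever the real daemon and ccnx bindings provide, and signature verification is omitted.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Callback = Callable[[str, bytes], None]


class Tadd:
    def __init__(self, store: Dict[str, bytes], uid_to_prefix: Dict[str, str],
                 express_interest: Callable[[str], None]) -> None:
        self.store = store                    # stands in for the database
        self.uid_to_prefix = uid_to_prefix    # mapping tracked by ChronoSync
        self.express_interest = express_interest
        self.pending: Dict[str, List[Callback]] = defaultdict(list)

    def fetch(self, tip: str, owner_uid: str, callback: Callback) -> None:
        """Asynchronous fetch by TIP name; the callback fires when data is ready."""
        if tip in self.store:                 # already in local storage
            callback(tip, self.store[tip])
            return
        first_request = tip not in self.pending
        self.pending[tip].append(callback)
        if first_request:                     # send only one Interest per TIP
            tdp = self.uid_to_prefix[owner_uid].rstrip("/") + tip
            self.express_interest(tdp)

    def on_tip_data(self, tip: str, data: bytes) -> None:
        """Called after the Link is stripped (verification omitted here)."""
        self.store[tip] = data
        for callback in self.pending.pop(tip, []):
            callback(tip, data)


sent: List[str] = []
received: List[str] = []
tadd = Tadd({}, {"001a": "/ndn/ucla.edu"}, sent.append)
tadd.fetch("/file/b873", "001a", lambda name, data: received.append("fsm"))
tadd.fetch("/file/b873", "001a", lambda name, data: received.append("ui"))
tadd.on_tip_data("/file/b873", b"content")
```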
7.2 Serving TIP Data
When other components request TADD to publish data with TIP names, it does so and stores the resulting
Data packets in the database (as well as in the ccnd content store).
Storing a single packet per database transaction when publishing a large file is going
to perform poorly. There are two choices: 1) store multiple packets in a single transaction, or 2) combine the
Data packets for the same FID into a single blob, write it to the file system, and store the blob path in
the database. The first gives “random access” to any packet of a file. The latter, on the other hand,
is based on the assumption that when a user requests a file, he is likely to request the whole file. So
when seeing the first Interest, we could read the FID blob and serve the later Interests from memory
(this resembles how Git handles files).
Currently, we have decided to go with the latter method (Git-style storage) instead of storing the packets
in the database.
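The Git-style option can be sketched as below. This is a simplification under stated assumptions: segments are treated as fixed-size payloads rather than complete signed Data packets, the `paths` dictionary stands in for the blob-path column in the database, and `BlobStorage` and `SEGMENT_SIZE` are hypothetical names.

```python
import os
import tempfile
from typing import Dict, List

SEGMENT_SIZE = 4  # bytes per segment; tiny on purpose for the example


class BlobStorage:
    def __init__(self, root: str) -> None:
        self.root = root
        self.paths: Dict[str, str] = {}    # FID -> blob path (the "database")
        self.cache: Dict[str, bytes] = {}  # FID -> blob loaded into memory

    def publish(self, fid: str, segments: List[bytes]) -> None:
        """Combine all segments of one FID into a single blob on disk."""
        path = os.path.join(self.root, fid)
        with open(path, "wb") as blob:
            blob.write(b"".join(segments))
        self.paths[fid] = path

    def segment(self, fid: str, index: int) -> bytes:
        """Serve one segment; the blob is read from disk on first access only."""
        if fid not in self.cache:
            with open(self.paths[fid], "rb") as blob:
                self.cache[fid] = blob.read()
        start = index * SEGMENT_SIZE
        return self.cache[fid][start:start + SEGMENT_SIZE]


with tempfile.TemporaryDirectory() as root:
    storage = BlobStorage(root)
    storage.publish("asf2321a", [b"abcd", b"efgh", b"ij"])
    first = storage.segment("asf2321a", 0)  # triggers the single disk read
    last = storage.segment("asf2321a", 2)   # served from memory
```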
When TDP Interests with the local participant’s prefix arrive, TADD handles them. All such Interests
are considered to be TDP Interests as illustrated in Table 1. That is, they are assumed to request
TIP names. TADD extracts the TIP name and queries the database (or searches the Git-like storage
on the file system). If no corresponding data is found, the Interest is ignored.
7.3 Broadcast-Redirect
It is common for a participant to publish some file and then go offline after a while. In such a case, the
default TDP (pointing to the original publisher) constructed by TADD will not get a response. If such
Interests cannot get a response for a certain amount of time (e.g. 20 seconds), TADD resorts to broadcast
to query who has stored the TIP data.
The broadcast query Interest has the form
/ndn/broadcast/chronoshare/shared-folder-name/query/TIP-name
Whoever receives this query and has TIP-name in its database can reply with its current prefix.
After receiving the reply, TADD uses the prefix included in the reply to construct TDP Interests for
this TIP.
Hence, TADD should also maintain a table mapping each TIP to its redirect TDP. Entries in
the table should time out if not used for a certain period (e.g. 20 seconds).
If the broadcast query also gets no response, the query interval should be doubled, up to an upper bound
(e.g. 5 minutes).
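The redirect table and the doubling query interval can be sketched together. The 20-second entry lifetime and the 5-minute cap come from the examples above; starting the query interval at 20 seconds is an assumption, and `query_intervals` and `RedirectTable` are hypothetical names. Time is passed in explicitly so the logic is easy to test.

```python
import itertools
from typing import Dict, Iterator, Optional, Tuple


def query_intervals(initial: float = 20.0, cap: float = 300.0) -> Iterator[float]:
    """Yield broadcast query intervals, doubling up to an upper bound."""
    interval = initial
    while True:
        yield interval
        interval = min(interval * 2, cap)


class RedirectTable:
    """Maps a TIP name to the prefix learned from a broadcast reply."""

    def __init__(self, ttl: float = 20.0) -> None:
        self.ttl = ttl
        self.entries: Dict[str, Tuple[str, float]] = {}  # TIP -> (prefix, last used)

    def put(self, tip: str, prefix: str, now: float) -> None:
        self.entries[tip] = (prefix, now)

    def get(self, tip: str, now: float) -> Optional[str]:
        entry = self.entries.get(tip)
        if entry is None:
            return None
        prefix, last_used = entry
        if now - last_used > self.ttl:   # unused for too long: drop the entry
            del self.entries[tip]
            return None
        self.entries[tip] = (prefix, now)  # using an entry keeps it alive
        return prefix


schedule = list(itertools.islice(query_intervals(), 6))
table = RedirectTable()
table.put("/file/b873", "/ndn/ucla.edu", now=0.0)
```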
This reminds us that perhaps in FSM we should maintain another table for files to be
fetched, or partial files. If the user wants the files, TADD should do the above procedure until the
partial file table is empty.
8 User Interface
The User Interface should be mostly the same as what has already been implemented, except that we need a tool to inspect file history.
9 Bootstrap
9.1 Initial bootstrapping (cs init)
During initial bootstrapping of the shared folder /path/to/folder, ChronoShare (cs) creates a
/path/to/folder/.cs subfolder with the following content:
• logs.db: sqlite database file to permanently store action log, metadata for files, sync log, and sync
tree.
• HEAD: file storing the current state hash (sync tree root hash)
• config: configuration file (for now empty, but could be useful later)
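The layout created by cs init can be sketched as follows. The file names come from the list above; the table names and columns in `SCHEMA` are assumptions made only so that the example is complete, and `cs_init` is a hypothetical function, not the actual command implementation.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical schema: one table each for action log, file metadata,
# sync log, and sync tree.
SCHEMA = """
CREATE TABLE IF NOT EXISTS ActionLog (uid TEXT, seq INTEGER, action TEXT);
CREATE TABLE IF NOT EXISTS FileState (path TEXT PRIMARY KEY, fid TEXT, status INTEGER);
CREATE TABLE IF NOT EXISTS SyncLog (id INTEGER PRIMARY KEY, uid TEXT, seq INTEGER);
CREATE TABLE IF NOT EXISTS SyncTree (uid TEXT PRIMARY KEY, seq INTEGER, prefix TEXT);
"""


def cs_init(folder: str) -> Path:
    """Create the .cs subfolder with logs.db, HEAD, and config."""
    cs_dir = Path(folder) / ".cs"
    cs_dir.mkdir(parents=True, exist_ok=True)

    conn = sqlite3.connect(str(cs_dir / "logs.db"))
    try:
        conn.executescript(SCHEMA)
    finally:
        conn.close()

    head = cs_dir / "HEAD"
    if not head.exists():
        head.write_text("")       # no state hash yet for a fresh folder
    (cs_dir / "config").touch()   # empty for now
    return cs_dir


with tempfile.TemporaryDirectory() as tmp:
    created = sorted(p.name for p in cs_init(tmp).iterdir())
    cs_init(tmp)  # running init twice should be harmless
    created_again = sorted(p.name for p in (Path(tmp) / ".cs").iterdir())
```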
9.2 Getting access to shared folder (cs clone)
9.3 Updating state after a prolonged offline period (cs update)
10 Security
Per-FID encryption. TBD.
References
[1] Version control with git. https://www.x.com/devzone/articles/basic-git-concepts.
[2] Zhenkai Zhu, Chaoyi Bian, Alexander Afanasyev, Van Jacobson, and Lixia Zhang. Chronos: A
server-less multi-user chat over ndn. Technical Report 0008, NDN Project, October 2012.