Change Facebook to a Peer-to-Peer architecture

         Please send any feedback to me.

Motivation section
New Architecture section
Notes section




Motivation:


With revelations of NSA spying, many people feel any central-server service is bad: it provides a single place, out of the user's control, where all of a user's info can be grabbed (legally or illegally, by a hacker, a corporation, law enforcement, or the NSA) and all of the user's activity can be monitored in real time.

Distributed, peer-to-peer systems reduce this problem: a user's data can be stored in a place under that user's control, or spread across the system, so there is no single place where many users can be monitored at once.

The following is a proposed way Facebook Corp could change the Facebook system to a distributed, peer-to-peer architecture.



Anticipated criticisms:
  • "Why not just get rid of Facebook, and have everyone move to Diaspora or something ?"

    Facebook users won't move. Facebook works for them, it's good enough, they don't need flashy features, most of them don't care much about privacy or buy into the "Facebook Corp is evil" mantra. They have a big "investment" in Facebook (Friend relationships, photo albums, groups, Likes, conversations in progress, history, knowledge of how to use it, etc); it would be hard for them to move. And they'd have to get all of their Friends and Groups to move simultaneously. Not going to happen.

  • "This change would be a big effort for Facebook Corp; why should they do it ?"

    There are benefits for Facebook Corp, listed in the "Notes" section later. I don't know if the benefits are big enough that they'd be willing to pay the costs to do it.





New Architecture:




  1. Facebook Corp owns and runs a couple of central servers:
    • Ad-server (Facebook Corp needs to make money, after all),
    • ID-server (to manage user names and IDs, user encryption keys, public part of each user's profile, and stop users who are under-age, spammers, scammers, etc).

      Each user will own a public/private key pair, with the public key stored in the ID-server; anyone can ask for the public key of a user identified by name or UserID.

      Each user will also own 100 encrypted "credentials" stored in the ID-server; the only operation allowed on them is a Content Server submitting a copy of a credential C1 and asking "is credential C1 a valid Facebook credential ?", which gives a yes/no answer.
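
    As a minimal sketch of that yes/no check, the following assumes the ID-server stores only SHA-256 digests of the credentials and that a credential is a 32-byte random value. The class and method names (IDServer, issue_credentials, is_valid, revoke) are hypothetical, and the proposal does not say how credentials are actually encoded.

```python
import hashlib
import secrets

CREDENTIALS_PER_USER = 100  # figure taken from the proposal


class IDServer:
    """Toy ID-server that answers only 'is this credential valid?'"""

    def __init__(self) -> None:
        # Assumption: keep digests instead of the credentials themselves,
        # so a stolen table does not directly reveal usable credentials.
        self._valid_digests: set[bytes] = set()

    @staticmethod
    def _digest(credential: bytes) -> bytes:
        return hashlib.sha256(credential).digest()

    def issue_credentials(self) -> list[bytes]:
        """Create one user's credentials; the caller stores them locally."""
        credentials = [secrets.token_bytes(32) for _ in range(CREDENTIALS_PER_USER)]
        self._valid_digests.update(self._digest(c) for c in credentials)
        return credentials

    def is_valid(self, credential: bytes) -> bool:
        """The single query a Content Server may make: yes or no."""
        return self._digest(credential) in self._valid_digests

    def revoke(self, credentials: list[bytes]) -> None:
        """Invalidate credentials, for example when an account is deleted."""
        for credential in credentials:
            self._valid_digests.discard(self._digest(credential))
```

    Because the answer is a bare yes or no, a Content Server learns that a credential belongs to some Facebook user but not which one.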

  2. Any number of companies run Content Servers, storing encrypted Facebook data:
    • Thread: a Post followed by a list of Comments.
    • Group: a list of Members, a list of Posts, a list of Files/Documents, About item, etc.
    • File.
    • Album: a list of Photos.
    • Photo.
    • Page: contents of page.
    Each item has an encrypted owner-Credential / itemID pair associated with it. If someone wants to edit or delete the item, they have to present the encrypted pair to authorize the request. Anyone can read the encrypted items from the server by presenting the itemID, but can't read the encrypted Credential/itemID pairs.
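
    The read/edit/delete rules above could be sketched roughly as follows. This is an illustration under assumptions, not a specification: a SHA-256 digest of (credential, itemID) stands in for the "encrypted pair", the credential_is_valid callback stands in for the query to the ID-server, and all names are hypothetical.

```python
import hashlib
import hmac
import itertools
from typing import Callable


class ContentServer:
    """Toy Content Server: anyone may read, only the owner may edit or delete."""

    def __init__(self, credential_is_valid: Callable[[bytes], bool]) -> None:
        self._credential_is_valid = credential_is_valid  # asks the ID-server
        self._items: dict[int, bytes] = {}       # itemID -> encrypted content
        self._owner_tags: dict[int, bytes] = {}  # itemID -> digest of the pair
        self._next_id = itertools.count(1)

    @staticmethod
    def _owner_tag(credential: bytes, item_id: int) -> bytes:
        # Stand-in for the "encrypted owner-Credential / itemID pair".
        return hashlib.sha256(credential + item_id.to_bytes(8, "big")).digest()

    def create(self, credential: bytes, ciphertext: bytes) -> int:
        if not self._credential_is_valid(credential):
            raise PermissionError("credential rejected by ID-server")
        item_id = next(self._next_id)
        self._items[item_id] = ciphertext
        self._owner_tags[item_id] = self._owner_tag(credential, item_id)
        return item_id

    def read(self, item_id: int) -> bytes:
        # No credential needed: the content is useless without its key.
        return self._items[item_id]

    def _check_owner(self, item_id: int, credential: bytes) -> None:
        presented = self._owner_tag(credential, item_id)
        if not hmac.compare_digest(self._owner_tags[item_id], presented):
            raise PermissionError("not the owner of this item")

    def edit(self, item_id: int, credential: bytes, new_ciphertext: bytes) -> None:
        self._check_owner(item_id, credential)
        self._items[item_id] = new_ciphertext

    def delete(self, item_id: int, credential: bytes) -> None:
        self._check_owner(item_id, credential)
        del self._items[item_id]
        del self._owner_tags[item_id]
```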

  3. There is a layer of Anonymizers, which prevent the servers from knowing locations or IDs of user machines. Anyone can run an Anonymizer. They just pass requests and responses back and forth.

  4. Each user will run their own little Facebook service and Facebook database on their machine.

    The database contains the user's info:
    • profile (part accessible only to Friends),
    • list of Credentials,
    • list of Friends (UserIDs),
    • list of Group memberships,
    • list of Groups owned,
    • list of Pages owned,
    • list of Likes,
    • list of Albums,
    • list of Notifications,
    • lists of Messages (Inbox, Other, Spam),
    • list of Wall contents: list of Threads, Likes, etc owned by this user,
    • list of Activities: list of operations performed (Posts, Comments, Likes, etc) by this user,
    • list of NewsFeed contents: list of Threads, Likes, etc from other users.
    This data is not known to anyone else or stored anywhere else.
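
    One possible shape for that local database, shown only to make the list concrete. The field names and types are assumptions; the proposal does not define a schema.

```python
from dataclasses import dataclass, field


def _empty_mailboxes() -> dict:
    return {"Inbox": [], "Other": [], "Spam": []}


@dataclass
class LocalDatabase:
    """Per-user data that lives only on the user's own machine."""

    user_id: str
    private_profile: dict = field(default_factory=dict)  # Friends-only part
    credentials: list = field(default_factory=list)
    friends: list = field(default_factory=list)          # UserIDs
    group_memberships: list = field(default_factory=list)
    groups_owned: list = field(default_factory=list)
    pages_owned: list = field(default_factory=list)
    likes: list = field(default_factory=list)
    albums: list = field(default_factory=list)
    notifications: list = field(default_factory=list)
    messages: dict = field(default_factory=_empty_mailboxes)
    wall: list = field(default_factory=list)         # items owned by this user
    activities: list = field(default_factory=list)   # operations performed
    news_feed: list = field(default_factory=list)    # items from other users
```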

    A single machine could run the services and databases for multiple users (such as in a family) and for multiple Groups.

    How does a user use Facebook if they are not at their home machine, and their home machine is shut down ? They can't. (Unless they take their database with them, on a thumb-drive. And if the home machine IS running, they could access the database across the internet.)

    For people who don't want to run a local service on their machine, or leave their home machine running while they are traveling, hosting services could offer to run services for many users. But this eliminates most of the privacy gain. At least it would spread the total data across many companies, not have it solely in Facebook Corp.

  5. Keep the standard web-based Facebook UI. Just have it connect to the local service. It must not connect to the other servers. The local service will connect through Anonymizers to the other servers.

  6. When a user creates a new Post, Comment, or Photo, a notification is sent to every one of the user's Friends.

    Sending may fail, if the Friend is offline. Notifications may be batched together and sent after some delay, to avoid too much traffic.

    The notification includes a unique key T1 (single-key symmetric encryption) needed to decrypt the contents of the Post or Comment etc, as well as the Content Server ID and item ID needed to retrieve the item.
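
    A rough sketch of the notification and the batching behaviour described in this step. The names (Notification, Outbox, the send callback) are hypothetical; the only details taken from the proposal are the three fields carried by a notification and the fact that sending may fail and may be batched.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Notification:
    content_server_id: str  # which Content Server holds the item
    item_id: int            # which item to fetch from it
    item_key: bytes         # symmetric key T1 needed to decrypt the item


class Outbox:
    """Queues notifications per Friend and delivers them in batches."""

    def __init__(self, send: Callable[[str, list], bool]) -> None:
        # send(friend_id, notifications) returns False if the Friend is offline.
        self._send = send
        self._pending: dict[str, list] = {}

    def queue(self, friend_ids: list, notification: Notification) -> None:
        for friend_id in friend_ids:
            self._pending.setdefault(friend_id, []).append(notification)

    def flush(self) -> None:
        """Try one batched delivery per Friend; failures stay queued."""
        for friend_id, batch in list(self._pending.items()):
            if self._send(friend_id, batch):
                del self._pending[friend_id]

    def pending_friends(self) -> list:
        return sorted(self._pending)
```

    Calling flush() on a timer rather than after every Post gives the delayed batching, and anything left in the queue is retried on the next flush.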

  7. When a user boots up their machine and starts up their peer service, the service must ask all Friends and Groups if there are any new Posts or Comments, any new Notifications, any new Photos, etc.

  8. Any time a user reads a Thread from a Content Server, the CS contacts the Ad Server to fetch an ad to return with the Thread data.

    Some amount of user-specific and context-specific information will be sent with each ad request, to allow the ads to be targeted. The UserID or User Credential will not be sent; the information will be something like "user is white male 50-60 years old, location is ZIP code 94086, context keywords are Tea Party destroy government".
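
    For illustration, the targeting data might be built like this. The profile keys and the function name are assumptions; the point is only that the request is assembled from an explicit list of coarse fields, so identifiers cannot leak in by accident.

```python
COARSE_FIELDS = ("gender", "zip_code")  # hypothetical profile keys


def build_ad_request(profile: dict, context_keywords: list) -> dict:
    """Return coarse targeting data with no UserID or Credential in it."""
    decade = (profile["age"] // 10) * 10
    request = {name: profile[name] for name in COARSE_FIELDS}
    request["age_range"] = f"{decade}-{decade + 10}"  # e.g. 54 -> "50-60"
    request["keywords"] = list(context_keywords)
    return request
```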

    The ad must not contain any "active" content, anything that would make the user's browser contact another machine as it renders the ad. If it did, someone could use that to figure out which user machine is accessing which Threads or other items.

    It is okay for an ad to contain links which the user may choose to click on. The links should not contain any info identifying the Thread or the User, unless the user consents.
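
    One way the local service might enforce the "no active content" rule is to accept only ads built from a small allowlist of HTML tags. The sketch below uses Python's standard html.parser module; the allowlist itself and the function name ad_problems are assumptions, and a real deployment would need a far more thorough sanitizer.

```python
from html.parser import HTMLParser

# Assumption: plain text formatting and ordinary links are all an ad needs.
ALLOWED_TAGS = {"a", "b", "i", "em", "strong", "p", "br", "span", "div"}
ALLOWED_ATTRS = {"a": {"href"}}


class _AdChecker(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.problems: list = []

    def handle_starttag(self, tag, attrs):
        # Tags such as img, script or iframe make the browser fetch
        # something while rendering, which would reveal the reader.
        if tag not in ALLOWED_TAGS:
            self.problems.append(f"tag <{tag}>")
        allowed = ALLOWED_ATTRS.get(tag, set())
        for name, value in attrs:
            if name not in allowed:
                self.problems.append(f"attribute {name} on <{tag}>")
            elif name == "href" and not (value or "").startswith(("http://", "https://")):
                self.problems.append(f"non-web link {value!r}")


def ad_problems(html: str) -> list:
    """Return a list of reasons to reject the ad; empty means acceptable."""
    checker = _AdChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```

    Links pass the check because they contact another machine only when the user chooses to click them.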

  9. Applications could be run on central servers, with some user-specific or context-specific information sent to them. Or they could be run locally, on the user's machine ?

  10. Each User has a public key (stored in the ID Server), and a private key (stored in the database on the User machine).

    If a user U1 wants to create a new Thread, they pick one of their 100 credentials at random, generate a new Thread-specific key T1 (single-key symmetric encryption), save both, and send this to a Content Server:
    (Encrypted with CS's public key:)
    • "New Thread" command code,
    • Thread owner Credential,
    • Keywords related to post content (for ad targeting purposes),
    • Text/links of the post content (encrypted with T1),
    • List of Comments (empty at creation; each Comment added later will hold):
      • Comment owner Credential,
      • Keywords related to comment content (for ad targeting purposes),
      • Text/links of the comment content (encrypted with T1).
    Thread-specific key T1 is saved in the User's database, not sent to Content Server. Content Server returns a new Thread ID when the operation is complete. When User notifies Friends of the new Thread, the notification contains the Content Server ID, Thread ID and key T1, so the Friends can retrieve and decrypt the content.

    To read the Thread, user U2 generates a one-time key X1 and sends this command to the Content Server:
    (Encrypted with CS's public key:)
    • "Read Thread" command code,
    • Thread ID,
    • General data related to user U2 (for ad targeting purposes),
    • Key to use to encrypt the response: X1.
    Response from Content Server to user U2:
    (Encrypted with key X1)
    • Text/links of the post content (encrypted with T1).
    • List of Comments:
      • Text/links of the comment content (encrypted with T1).

    To edit a Comment owned by U3 in the Thread, user U3 sends this command to the Content Server:
    (Encrypted with CS's public key:)
    • "Edit Comment" command code,
    • Thread ID,
    • Comment number within Thread,
    • Comment owner's Credential,
    • Text/links of the new comment content (encrypted with T1).
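
    The two layers of keys in the exchanges above (T1 for the content, X1 for the response) can be sketched as a round trip. Everything here is a simplification under stated assumptions: the SHA-256 keystream cipher is a toy stand-in for a real symmetric cipher and must not be used for real data, the outer encryption with the Content Server's public key is omitted, Comments are left out, and all names are hypothetical.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (toy cipher, illustration only)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    return nonce + _keystream_xor(key, nonce, plaintext)


def decrypt(key: bytes, blob: bytes) -> bytes:
    return _keystream_xor(key, blob[:16], blob[16:])


class ToyContentServer:
    def __init__(self) -> None:
        self._threads: dict = {}

    def new_thread(self, owner_credential: bytes, keywords: list, post: bytes) -> int:
        thread_id = len(self._threads) + 1
        self._threads[thread_id] = {"owner": owner_credential,
                                    "keywords": keywords, "post": post}
        return thread_id

    def read_thread(self, thread_id: int, response_key: bytes) -> bytes:
        # The server wraps the stored ciphertext with the reader's key X1;
        # it never sees T1, so it cannot read the post itself.
        return encrypt(response_key, self._threads[thread_id]["post"])


def create_thread(server, credentials: list, text: str, keywords: list):
    credential = secrets.choice(credentials)  # one of the user's credentials
    t1 = secrets.token_bytes(32)              # Thread-specific key, kept locally
    thread_id = server.new_thread(credential, keywords, encrypt(t1, text.encode()))
    return thread_id, t1, credential


def read_thread(server, thread_id: int, t1: bytes) -> str:
    x1 = secrets.token_bytes(32)              # one-time response key
    wrapped = server.read_thread(thread_id, x1)
    return decrypt(t1, decrypt(x1, wrapped)).decode()
```

    A Friend who received the Thread ID and T1 in a notification can call read_thread; anyone without T1 gets only ciphertext back.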






Notes:


  • The change from old to new architecture could be phased in:

    1. Create the Ad Server, and the ID Server, and make the old Facebook central server (OFCS) use them.

    2. Make all users run a Facebook local service on their machines, and access Facebook through that.

    3. Have user local services create user encryption keys, storing public keys in ID Server, and private keys in the user local databases. Also have ID Server create credentials, storing them both in user local databases and in ID Server.

    4. Have OFCS create per-Thread encryption keys (T1), use them to encrypt the Thread data in OFCS, and store them in the user local databases. Any new Threads created after this date also get keys generated and data encrypted by OFCS.

    5. Change to do Thread key generation and encryption in the user local services, not in OFCS.

    6. Make the OFCS operate as a Content Server, using the user credentials for authorization, and sending its CS ID out to all users.

    7. Create a network of Content Servers, moving all of the data out of the OFCS to the new CS's, and sending new CS ID's to the Facebook local services. Eventually, the OFCS will store no content (Threads, Photos, etc).

    8. Move Friend lists and Group memberships and Thread lists etc from OFCS to user local databases. Machine addresses (IP addresses) of users have to be sent out as part of the lists. Start doing update notifications peer-to-peer instead of through OFCS.

    9. Shut down OFCS; there is no info left in it.

    10. Create a network of Anonymizers, and change the Facebook local service to use them.

    11. Users can change machine IP addresses (notifying their Friends), and generate new per-Thread keys and re-encrypt the data in the Threads (notifying their Friends). Now Facebook Corp and the CS's won't know the addresses or keys.


  • What has been achieved ?

    • User experience:

      • Facebook client UI and functionality haven't changed at all; no change for users.

      • Facebook existing user base and user content haven't changed at all.

    • Facebook Corp:

      • Facebook Corp still makes money, and controls ads and user accounts and logins.

      • Facebook Corp gets to trumpet itself as a "privacy champion".

      • Facebook Corp spends less effort responding to warrants and subpoenas, because they no longer have the data.

      • Oppressive governments could no longer pressure Facebook Corp into censorship, because they no longer control the data.

    • Privacy:

      • Data for Threads, Groups, Albums, etc is stored encrypted, spread across multiple companies, and not under Facebook Corp's control. The companies storing the data don't have the keys to decrypt the content, or any way to connect data to user.

      • Locations and IDs for user machines are not known to any part of the system except the Anonymizers (which destroy the data after each operation is completed) and the Friends of each user.

      • User's lists of Friends, Posts, Comments, Albums, etc are stored only in that user's machine.


  • Problems and issues:

    • The peer-to-peer nature of the system is weakened by the need to have the underlying Content Servers and ID Server with data in them. But those servers are necessary to make the content available even if user machines are shut down, and to let Facebook Corp control the system and keep making money.

    • Since admins at Facebook Corp can no longer read content directly, they can't root out systematic problems such as spammers and scammers as easily as they could before. They would have to rely on users reporting each bad post or comment (a report would include the Thread ID and Thread key needed to let Facebook Corp read the bad thread). Once a pattern of bad behavior has been established, Facebook Corp can delete the offending user, revoking their credentials.

    • Do the Anonymizers really add much value to the system ? Maybe if a user is in a country with an oppressive government, they do.

    • How do Content Servers get paid ? Get a cut of the ad money from Facebook Corp ?

    • How do Anonymizers get paid ? It's such a lightweight service that maybe people would run them for free ? Or maybe a user would have to pay to use a network of anonymizers.

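    • What kind of public-key asymmetric encryption should be used ? Some options are RSA and ElGamal (Diffie-Hellman is a key-agreement method rather than an encryption scheme, though it can be used to set up a shared key). Are they legal in all countries ?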

    • What kind of single-key symmetric encryption should be used to encrypt Threads and other content ? Some options are Blowfish and AES (DES, with its 56-bit key, is no longer considered secure). Are they legal in all countries ? Will the size of the data increase greatly when encrypted ?

    • What kind of encryption should be used to create user Credentials ? Some kind of one-way hash ? Or just a large random number from a cryptographically secure generator ?
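
    On the question of data size: for a block cipher the growth is a small constant, not a multiple. As a worked example, assuming AES in CBC mode with PKCS#7 padding and a 16-byte IV stored in front of the ciphertext (one common arrangement, not something the proposal specifies), the stored size can be computed like this:

```python
AES_BLOCK = 16  # AES block size in bytes


def cbc_ciphertext_size(plaintext_len: int) -> int:
    """Bytes stored for AES-CBC with PKCS#7 padding plus a prepended IV."""
    # PKCS#7 always adds 1 to 16 bytes, rounding up to the next full block.
    padded = (plaintext_len // AES_BLOCK + 1) * AES_BLOCK
    return AES_BLOCK + padded  # IV + padded ciphertext


for size in (0, 15, 16, 1000):
    print(size, "->", cbc_ciphertext_size(size))
# prints:
# 0 -> 32
# 15 -> 32
# 16 -> 48
# 1000 -> 1024
```

    Under these assumptions the overhead never exceeds 32 bytes per item, which is negligible for Photos and small even for short Comments.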






Last update: November 2013.


