CryptPad's new Secure Cross-Domain Iframe

CryptPad version 1.14 (Codename Ouroboros) has been released and the most exciting new feature is one you cannot even see. As you may remember from the Security Growing Pains post, Content Security Policy is a significant part of CryptPad’s security model, and it is unfortunately incompatible with CKEditor, the Open Source text editor used in CryptPad.

With this release, we have done a significant re-architecture of the CryptPad codebase. Starting with the /pad/ application, the CryptPad UI has begun moving into an iframe which is hosted on a different domain: sandbox.cryptpad.info. Moving the visual content to a different domain means that even in the event of a cross-site scripting vulnerability, most of your private information, such as the pads in your CryptDrive, will not be at risk.

In this version we updated only the /pad/ application to use the cross-domain iframe because it is the only app which requires inline script. That requirement prevented us from using Content Security Policy to block the most significant vector for cross-site scripting attacks, but now, with the cross-domain iframe, such attacks are mitigated.

Going forward, we plan to implement a standardized CryptPad application API so that new applications can be developed, installed and used in CryptPad. Today, the CryptPad API which is exposed to apps such as /pad/ and /code/ is not standardized and there is no clear line between the apps themselves and the CryptPad internals. As we move toward a standard app API, we will define a standard representation of a CryptPad application with such additional aspects as the app’s color-scheme and icons.

Fundamentally, this unexciting change to CryptPad begins a new phase in development: we plan to move from a set of integrated prepackaged applications to an ecosystem of applications for collaborating on different types of content with the same encryption under the hood.

But Wait, There’s more

The pictures in this blog post are not hosted on the blog, they are in fact Zero Knowledge files uploaded on CryptPad. They can be seen on this blog because it is using the Media Tag which was developed as part of the UCF Project with the support of Systematic, BPIFrance and the City of Paris.

Media Tag allows files on CryptPad to be included in any website (such as this blog). To include a file from CryptPad, include the Media Tag loader and then add a Media Tag to your document, like the following:

<!-- At the top of your HTML file -->
<script src="https://cryptpad.fr/common/media-tag-nacl.min.js"></script>
<!-- Where you'd like the image to be located -->
<media-tag src="https://files.cryptpad.fr/blob/3c/3c9b5b3fb00b7dc35e15851606132585e8b69b06a51556eb" data-crypto-key="cryptpad:VE4raHL5VFReAXxioTaFZwt6q2jpxX+bdFHAFeoivZQ="></media-tag>

With this you can embed files from your CryptDrive into any website you want.

CryptPad's New Direction

If you are hosting CryptPad, please make sure you are up to date. CryptPad 1.13.0 (Naiad) fixed a major security issue.

CryptPad was born on Halloween 2014. At that time it was a skunkworks project inside of XWiki SAS. The UI was a hideous green and white and the only feature was the CKEditor-based pad. We have come a long way.

Old CryptPad Main Page

As was mentioned in Building Mutually Beneficial Relationships, CryptPad cannot ever be great without people developing the software as their daily job. We have been able to develop this project with the generous support of BPI France and the OpenPaaS::NG but that support only finances a small team and it will not continue indefinitely.

Starting with this release, we are adopting a new look and a reinforced dedication to making a quality product for people whose time is valuable. We’re starting this by upgrading the logo and the informational pages.

New CryptPad Main Page

Our lovable fist logo was created one night by grabbing a screenshot of an ASCII generator. It was a time when I was racing to get something working to prove that CryptPad was an idea worth pursuing. Now times have changed. CryptPad is finally something that I’m starting to feel proud of, and the logo represented the last remnants of a time when everything was a rush and quality was an afterthought.

We also have introduced a lot of new features such as:

  • New front page which allows creating a pad in 1 click
  • Clickable links in pads when viewed in read-only mode
  • File-picker for embedding media in a pad in Markdown mode
  • A preference for tabs or spaces when editing in the code editor

Registered users can upload and view PDF files

But more than features, we have focused on making CryptPad easier to use. It’s now easier to paste text into the pad without breaking the formatting and we have additional tool-tips to help explain different features of CryptPad.

Going forward from today we plan to make CryptPad easier to use, more secure and more extensible. We are rewriting the pad logic in order to run in a cross-domain iframe which will make use of the browser’s Same Origin Policy as a sandbox to block most of the code from accessing the decryption keys.

This will open the door to 3rd party applications developed for CryptPad which can be protected by the same cryptography as CryptPad and which have limited access to the CryptPad system.

CryptPad Analytics & Privacy - What we can't know, what we must know, what we want to know

CryptPad is a Zero Knowledge cloud application. This means we have designed it such that we do not have any access to the content which is hosted on our server. However, there are other things which we do collect and it is important that privacy-minded users understand what we are collecting and why. There are four types of information:

  • What we can’t know: This is data that the CryptPad app encrypts so we will never have access to it
  • What we must see but don’t collect: This is information which we don’t bother to store but because of how the technology works, we necessarily have access to it.
  • What we must know: This is metadata which we cannot help but see because of the way the technology works
  • What we want to know: This is information which we really want to know in order to make CryptPad better every day

We want to know everything about people, we want to know how people use CryptPad, why people use CryptPad and how we can make their experience easier. However, we don’t want to know anything at all about you.

This poses a challenge because we want to collect as much aggregate information as we can in order to make a great web service, but we don’t want to collect data that can be linked in order to tell a story about you.

What we can’t know

There are a few things which the Zero Knowledge design of CryptPad does not allow us to know at all. These include (obviously) your password and the content of your pads, but less obviously, the titles of your pads, the names of the contributors and your username (you can even have the same username as someone else on the system and we won’t know). The types of your pads are also unknown to us, though we could make educated guesses by looking at the encrypted data.

It is our promise to you that we will never collect this information.

What we could know but don’t bother to collect

There are also some things which we don’t really want to know but cannot avoid seeing anyway. Most importantly, this includes the IP addresses of people who edited a specific pad. Technically we know your IP address because it’s how you communicate with our server, but most of the actual operations are done using commands sent down a WebSocket. Once the WebSocket is established, we assign you a random ID and this is how you are referenced. What appears in our server logs looks like this:

198.167.222.70 - - [06/Jul/2017:20:47:45 +0200] "GET /pad/ HTTP/1.1"
304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36
(KHTML, like Gecko) Chrome/59.0.3071.109 Safari/537.36" "-"

Notice there is no pad ID in there. The pad ID is not in the URL, so it doesn’t go in the server logs by default.

Compare this with EtherPad:

IP Address Pad ID
198.167.222.70 - - [06/Jul/2017:11:54:37 -0700] "GET /p/UNWnpczTkq HTTP/1.1"
200 8920 "https://pad.meshwith.me/" "Mozilla/5.0 (Macintosh; Intel Mac OS X
10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.109
Safari/537.36"

You cannot verify that we’re not collecting this so best assume that we are.
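The difference between the two log entries above can be illustrated with a short Python sketch. It assumes that a CryptPad link carries its pad identifier in the URL fragment (the part after `#`), which browsers do not send to the server; the fragment in the CryptPad link below is made up for illustration, and `path_seen_by_server` is a hypothetical helper name.

```python
from urllib.parse import urlsplit

def path_seen_by_server(url: str) -> str:
    """Return the part of a URL that ends up in the HTTP request line.

    Browsers strip the fragment (everything after '#') before sending
    the request, so it never reaches the access log.
    """
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")

# Hypothetical CryptPad link: the part after '#' is invented for this example.
cryptpad_url = "https://cryptpad.fr/pad/#/1/edit/EXAMPLE-PAD-ID/"
# EtherPad link matching the log entry quoted above.
etherpad_url = "https://pad.meshwith.me/p/UNWnpczTkq"

print(path_seen_by_server(cryptpad_url))
print(path_seen_by_server(etherpad_url))
```

Under that assumption, the first call yields only `/pad/`, while the EtherPad path includes the pad ID itself.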

What we must know

There are some things which we need to know in order for CryptPad to function properly. We need to know which pads are in your drive in order to impose storage limits on logged-in users and to expire pads which nobody cares about. However, we don’t know much about who you are. Since we don’t know your username, to us you are identified by a public signing key, something like this:

YIBzjPr3beuGgfHNglGfo3xq-dquxsj4Bst-ze7mL9A

We know that YIBzjPr3beuGgfHNglGfo3xq-dquxsj4Bst-ze7mL9A has 392 MB of data in their CryptDrive including a pad of some type which has the ID fe382219b10c0396de63d2bab7942390 and an uploaded file which we know as ff2fdf9bb99ecc89d29d780780de10efdac14ed15e93b235. One of these pads that they have is actually their drive itself, but we don’t strictly know which one (again, we can take guesses based on the size of the patches). You can find out what your signing key is by looking in your settings page.

We also know when each pad was last accessed so that we can know to delete pads which are not in anybody’s CryptDrive and have not been opened in a long time.

Why we can’t avoid collecting IP addresses

Being able to know how many different people are using CryptPad is very important to us. One rather rude person decided to try to crash our server by creating 647,533 pads. They didn’t put much thought into their attack because what they were doing was not actually creating pads, but it illustrates the problem that if we don’t know how many different people are using the server, we don’t have any idea whether we are popular or under attack. Worse, we don’t know what features have widespread support vs. which ones are only popular with a few prolific users.

One obvious thought is to simply run the IP addresses through a hash function the way we traditionally hash passwords. However, this sadly cannot work because there are only about 4.3 billion (2^32) IPv4 addresses and constructing a rainbow table to get back the original IP addresses would take only about 1 day of computer time. So in the end we simply log the IP addresses and don’t worry about it.
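To make the problem concrete, here is a minimal Python sketch of reversing a hashed IP address by brute force. It assumes an unsalted SHA-256 hash (the post does not name a hash function) and, so that it finishes instantly, it searches a single /16 block rather than the whole IPv4 space. The function names are hypothetical.

```python
import hashlib
import ipaddress

def hash_ip(ip: str) -> str:
    """Hash an IP address naively: unsalted SHA-256 (an assumption)."""
    return hashlib.sha256(ip.encode("ascii")).hexdigest()

def recover_ip(target_hash: str, network: str):
    """Try every address in `network` until one hashes to the target."""
    for addr in ipaddress.ip_network(network):
        candidate = str(addr)
        if hash_ip(candidate) == target_hash:
            return candidate
    return None  # the address is not in the searched block

# Pretend this hash is all that was stored in the log.
stored = hash_ip("198.167.222.70")

# 65,536 candidates; the full IPv4 space is only 65,536 times larger.
recovered = recover_ip(stored, "198.167.0.0/16")
print(recovered)
```

Because the input space is so small, the hash hides nothing: the same loop over every IPv4 address is slow but entirely feasible.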

What a pad looks like to us

A pad is stored as a file which represents a sequence of encrypted patches. These patches change the content of the pad from nothing to whatever it becomes in the end. A typical message looks something like this:

[0,"69d46337f826c0ecd881be59c119a527","MSG","fe382219b10c0396de63d2bab7942390","51Q...."]

It starts with a zero and then your temporary random ID, then it contains the word MSG and the ID of the pad which it is sent to. This format is exactly the same as what is sent on the wire. Finally it contains the encrypted patch, which tells us essentially nothing except that it gives us a rough idea of just how big the change was.

Occasionally the client will send a checkpoint. This is a special patch which removes all of the content and then puts it all back again. To us, a checkpoint looks the same as anything else: it is a big ball of encrypted data, except that in this case it is flagged as a checkpoint so the server knows it can send only part of the history of the pad instead of all of it. However, checkpoints do give us a good idea of how big the pad actually is at that time.
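As a rough illustration of how little the server can read from such a message, the Python sketch below splits the example into the fields described above. The field names are hypothetical labels chosen for this example, and the ciphertext is left truncated exactly as it appears in the post.

```python
import json

# The example message from the post; the encrypted patch is truncated there.
raw = ('[0,"69d46337f826c0ecd881be59c119a527","MSG",'
       '"fe382219b10c0396de63d2bab7942390","51Q...."]')

def describe_message(line: str) -> dict:
    """Split a stored message into the fields the server can see."""
    first_field, sender_id, kind, pad_id, ciphertext = json.loads(line)
    return {
        "first_field": first_field,           # the leading zero
        "sender_id": sender_id,               # temporary random ID
        "kind": kind,                         # the word MSG
        "pad_id": pad_id,                     # pad the message is sent to
        "ciphertext_length": len(ciphertext), # all we learn about the patch
    }

info = describe_message(raw)
print(info)
```

The only thing derived from the encrypted patch is its length, which is the "rough idea of how big the change was" mentioned above.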

What we collect because we want to know

What we really want to understand is your experience with CryptPad and how we can make that experience better. Therefore we collect quite a number of data points about where people click and what their browser supports. For example, we collect the dimensions of your browser, not because we want to know who you are but because we want to know what types of browsers we need to support.

198.167.222.70 - - [06/Jul/2017:21:26:15 +0200]
"HEAD /common/feedback.html?DIMENSIONS:752x1440=1499369175085 HTTP/1.1" 200 0
"https://cryptpad.fr/settings/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.109 Safari/537.36" "-"
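The feedback request in the log entry above can be decoded with a few lines of Python. This is a sketch under two assumptions drawn only from this one sample: that the query string has the shape `NAME:VALUE=NUMBER`, and that the trailing number is a millisecond Unix timestamp (it matches the time of the log entry). `parse_feedback` is a hypothetical helper, not part of CryptPad.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit

# Request path copied from the log entry above.
request_path = "/common/feedback.html?DIMENSIONS:752x1440=1499369175085"

def parse_feedback(path: str):
    """Split a feedback request into event name, value and timestamp."""
    query = urlsplit(path).query
    event, _, stamp = query.partition("=")
    name, _, value = event.partition(":")
    # Assumed to be milliseconds since the Unix epoch; drop the milliseconds.
    when = datetime.fromtimestamp(int(stamp) // 1000, tz=timezone.utc)
    return name, value, when

name, value, when = parse_feedback(request_path)
dimensions = tuple(int(n) for n in value.split("x"))

print(name, dimensions, when.isoformat())
```

Nothing in the request identifies a pad or a user beyond what any HTTP request already reveals, namely the IP address and user agent.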

You can see an exhaustive list of things that we collect by checking out the feedback functionality in the CryptPad source code but as of the time of this writing, we are collecting feedback about the following things (usually we just collect the fact that an event occurred, not more).

  • Clicking “upgrade account”
  • Clicking “support cryptpad”
  • Presentation: clicking on “print slides”
  • Registering and logging in
  • Opening your recent pads as an anonymous user
  • Clicking any CKEditor button such as “bold” or “italic”
  • Displaying the drive as icons or as a list
  • Creating and using templates
  • Showing and hiding the userlist or CKEditor menu bar
  • Whether your browser is missing certain important features like Proxy, isArray or localStorage
  • Which type of pad you are using
  • The dimensions of your browser window
  • When you have changed your display name
  • Whether you have migrated your CryptDrive from the legacy format

If you are worried about what we might do with this data, you can disable feedback collection in your settings page. But keep in mind that if you disable it we cannot help but know, because your IP address will be in the tiny minority of addresses which access the site but don’t send feedback messages.

What we can learn from the data

1. People mostly use CryptPad to make a plain old pad

But the code/markdown pad and the CryptDrive are catching up.

Unique IPs per pad type

2. Activity has been on a very slow rise but with a few spikes

This chart shows unique IPs per day hitting CryptPad. You can see that things are relatively flat over time except for a big day in June and then some increased activity in July after the UI improvements were rolled out.

Unique IPs per day
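A count like this can be derived from access logs alone. The Python sketch below is not the analysis pipeline used for the chart (that was done with Kibana, ElasticSearch and LogStash); it only illustrates the idea on a few log lines. The first line is taken from this post, the others are invented and use addresses from a documentation-only range.

```python
import re
from collections import defaultdict

# Matches the client IP and the date at the start of an access log line.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}):')

def unique_ips_per_day(lines):
    """Count distinct client IPs for each calendar day in the logs."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            ip, day = match.groups()
            seen[day].add(ip)
    return {day: len(ips) for day, ips in seen.items()}

sample = [
    '198.167.222.70 - - [06/Jul/2017:20:47:45 +0200] "GET /pad/ HTTP/1.1"',
    # The remaining lines are made up for illustration.
    '198.167.222.70 - - [06/Jul/2017:21:30:00 +0200] "GET /code/ HTTP/1.1"',
    '203.0.113.5 - - [06/Jul/2017:22:01:00 +0200] "GET /code/ HTTP/1.1"',
    '203.0.113.5 - - [07/Jul/2017:08:00:00 +0200] "GET /pad/ HTTP/1.1"',
]

daily = unique_ips_per_day(sample)
print(daily)
```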

3. Browser window dimensions are all over the map

This chart shows bubbles which are bigger depending on how many different IPs report the same browser window dimensions. Tragically it seems that there is no way to predict what aspect ratio a device using CryptPad is going to have.

Browser window dimensions

4. Lots of pads are made and then abandoned

The first chart shows in blue the number of pads created each day and the number of pads which become “abandoned” (have not been touched in 2 weeks). This says that perhaps pads are considered ephemeral and not to be used for the long term.

Created vs. abandoned pads

Here we can see the evolution of pads which have been accessed within the last day, the last week and the last month. There is slow but steady growth in the pads active in the past month.

Number of active pads
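The definitions above depend only on each pad's last access time, which the server already tracks. As a rough Python sketch (the pad names and dates are invented, and treating a month as 30 days is an assumption), the counts could be computed like this:

```python
from datetime import datetime, timedelta

ABANDONED_AFTER = timedelta(weeks=2)  # "not touched in 2 weeks"
WINDOWS = {
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),      # assumed length of a month
}

def summarize(last_access_by_pad, now):
    """Count pads active within each window and pads that are abandoned."""
    counts = {name: 0 for name in WINDOWS}
    counts["abandoned"] = 0
    for last_access in last_access_by_pad.values():
        age = now - last_access
        for name, window in WINDOWS.items():
            if age <= window:
                counts[name] += 1
        if age > ABANDONED_AFTER:
            counts["abandoned"] += 1
    return counts

now = datetime(2017, 7, 6, 12, 0)
pads = {  # hypothetical pads and last-access times
    "pad-a": now - timedelta(hours=2),
    "pad-b": now - timedelta(days=3),
    "pad-c": now - timedelta(days=20),
    "pad-d": now - timedelta(days=60),
}

summary = summarize(pads, now)
print(summary)
```

Note that a pad can be both "active in the past month" and "abandoned" under these definitions, as the third pad is here.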

5. People use CryptPad for a while, then leave

We measured 15,000 IP addresses which came to CryptPad just to look at one pad and then left, but of the 13,000 who stayed longer than that we analyzed the time when they first arrived and the time when they made their last visit. About 630 IP addresses have been continually using CryptPad for all 45 days.

Number of IPs continuing to access CryptPad

We want to make CryptPad a useful tool for helping people get organized and make their projects succeed. So whenever people decide that CryptPad is not the right answer for them, we care about what went wrong and how we can make it better.
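The first-visit and last-visit analysis described above can be sketched in Python as follows. The events are invented, and treating an IP address seen on only a single day as a one-time visitor is a simplification made for this example, not necessarily the criterion used for the numbers in the post.

```python
from datetime import date

def visit_spans(events):
    """Map each IP to the number of days between its first and last visit."""
    first, last = {}, {}
    for ip, day in events:
        first[ip] = min(first.get(ip, day), day)
        last[ip] = max(last.get(ip, day), day)
    return {ip: (last[ip] - first[ip]).days for ip in first}

events = [  # hypothetical (ip, day) pairs extracted from access logs
    ("198.167.222.70", date(2017, 5, 23)),
    ("198.167.222.70", date(2017, 7, 6)),
    ("203.0.113.5", date(2017, 6, 1)),
    ("203.0.113.9", date(2017, 6, 10)),
    ("203.0.113.9", date(2017, 6, 12)),
]

spans = visit_spans(events)
one_time = sorted(ip for ip, span in spans.items() if span == 0)

print(spans)
print(one_time)
```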

How we analyze this data

We do all of our analysis ourselves, and we don’t share any of this data with Google or other data companies. We’re thankful to Kibana/ElasticSearch and LogStash for making it possible to do in depth analysis on our own computers without resorting to a cloud service.

CryptPad Jackalope - File Upload, PDF and Pictures

Yesterday we released CryptPad v1.9.0 Jackalope, with some exciting new features which we’ve been working on for a long time. As part of the UCF project we have implemented a Zero Knowledge media-tag in CryptPad for displaying and downloading encrypted files stored in CryptPad. Starting now, you can upload files by clicking the upload button or dragging them into your CryptDrive. You can also view pictures and PDF files in CryptPad and you can drag-and-drop pictures directly into presentations. In the next release we will hopefully be adding drag-and-drop of pictures into the pad.

CryptDrive Upload

Filenames

We also made a significant but less visible improvement to the CryptDrive. When you make a new pad in CryptPad, it has a title, which anyone in the pad can change, and it has a filename, which is how the pad is shown in your CryptDrive. Because anyone at any time can change the title of a pad, the only way to know the titles of all the pads in your drive is to load each and every one of them, which would take a long time. But the filename is your unique way to refer to a pad: it lives only in your CryptDrive and it is the same no matter what title someone gives to the pad.

Now the CryptDrive UI shows only one name for a pad. This name is just the title of the pad at the last time you accessed it, unless you assign it your own filename.

Slide Preview

When you’re using the CryptPad slide app to make a quick presentation, now you can see your presentation in the righthand pane while you type. Since presentations are written in Markdown, this means you get a live action preview of what your presentation slides are going to look like.

Slide Preview and Drag & Drop

Try it now

Head over to cryptpad.fr and give CryptPad a try!

Building mutually beneficial relationships

People hosting instances of CryptPad should read at least the Changes in CryptPad section

Thanks to Scott Alexander for some of the ethical foundations of this post.


You ever wonder why Open Source software always seems to be slightly harder to use and slightly buggier and slightly less polished than proprietary competitors?

How about this: Why is it that good people who want to make good things somehow end up making evil things for evil corporations which sell them to other good people who would (presumably) rather buy good things?

It’s all about incentives

It’s hard to talk about incentives without sounding like a miserly tool, but if we’re going to hack ourselves out of a situation that nobody really wants to be in, we’re going to need to understand them pretty well.

  • Why is Open Source habitually 90% of the way there?
  • Why is Facebook more addictive than it is useful?
  • Why is it that when you get something for free, even from a well funded government program, it’s reliably worse than something you buy?

It’s all about incentives.

In a restaurant, you’re the customer

I love going to restaurants. I have no car and few possessions so restaurants are the way I spend my income. Not only do I love food but I love the relationship which I have with restaurateurs. When I walk into a restaurant, I want to be fed delicious food and they want to be paid. Not only that, they want me to be happy so I will return many times and bring my friends. I want them to be happy so they will give me bigger portions and maybe a little dessert on the house. Our incentives are aligned perfectly. We are practically a team.

In a soup kitchen, you’re just a user

It is hard to deny the importance of soup kitchens to the fabric of society. Part of what makes us able to claim to be civilized is the fact that we don’t let people simply die if they’re down on their luck. Soup kitchens, however, are not restaurants. When you walk into a soup kitchen, you are generally greeted kindly but there is a subtle distinction from a restaurant: at a restaurant you’re the customer and at a soup kitchen you’re just a user. Many soup kitchens are organized around religious groups and evangelizing their belief is a significant part of their motivation, but even secular organizations are motivated by some sort of a higher calling.

Open Source is a soup kitchen

I’ve been developing Open Source both professionally and personally for 7 years and I’m going to tell you something that many Open Source developers won’t admit. Open Source software is not made for you. Sometimes Open Source developers are motivated by the Free Software ideology and they imagine their code as transforming the world; sometimes they just want to solve some problem for themselves and they give away the resulting code. Open Source software is almost never developed for the simple purpose of making another person’s life a little easier.

If you aren’t the customer you’re the product

This aphorism has become popular with the rise of ad-tech and social network websites. The phrase invokes an image of free services coming like free grain because you are, in fact, the pig on his way to slaughter. In some ways this is true: Silicon Valley business models are becoming disturbingly like human farming.

However, the phrase also invokes an image of an evil entrepreneur plotting to enslave humanity by creating a slick social network. If 1 in 1000 companies is successful then logic implies there must be thousands of evil entrepreneurs running around everywhere. If this is true then where are all of the failed evil plotters? I’ve never met an entrepreneur who was anything less than an aspiring saint.

I think the real reason why social networks become human farms is that people don’t want to pay for development of web services and, stuck between a successful human farm and a failing soup kitchen, entrepreneurs begrudgingly choose to farm.

Breaking out

If we’re ever going to stop living in a world of farms and soup kitchens, we’re going to need to get serious about incentives. Part of my intention in starting the CryptPad project is to build something that is neither a farm nor a soup kitchen. I want to have a mutually beneficial relationship with every one of CryptPad’s users, including you. I don’t want to be a charity worker beholden to an NGO or a post office clerk drawing a paycheck from the state. I want you to be my boss, I want to obsess about making your life better, I want fair exchange of value and aligned incentives.

Changes in CryptPad

As you may already know, cryptpad.fr now limits your data storage and allows you to buy an account which will raise that limit. The code for limits and accounts is also in the CryptPad codebase and turned on by default. If you are installing CryptPad, you have three choices.

  1. Leave it exactly as it is: People will be limited to 50MB of storage and they will see a Support CryptPad button. In the development time this donated money buys, we will give special consideration to the needs of CryptPad admins like you.
  2. Share the revenue: If you specify some configuration parameters and send us an email, the donation button will become an Upgrade Account button, allowing people to take a plan with additional storage quota. When people upgrade their account on your server, we will credit you 50% of the revenue earned. This helps us pay the cost of development and helps you pay the cost of hosting.
  3. Disable the donate button: If you do this, we hope you will help CryptPad in some other way such as by taking an on-premises support contract.

If you run a public CryptPad instance, please don’t increase the 50MB per user storage limit. This limit is what makes people subscribe and what pays for CryptPad development. Running a CryptPad instance which offers a “better deal” is effectively using the project against itself.

Finally, new versions of CryptPad always check for new or expired accounts from our account server. We have added a parameter called adminEmail which will be sent along with the domain and version of CryptPad you’re running. This way we can notify you if we’re aware of any serious problems with your CryptPad instance. We take your privacy seriously and will never sell your email or send you marketing spam. If, however, you want to keep your CryptPad instance completely hidden from us, you can set this parameter to false and it will never query the account server.

Coming next

Our objective is to help you collaborate, stay organized and get things done faster and easier. We want to provide maximum value to you and we want you to provide value to us so that we can continue doing it. As was said in the previous post, the big issues which we are planning to tackle soon are:

  • File upload for PDF and image embedding
  • Text coloring based on the authors of the document
  • Workgroups for team collaboration
  • Zero Knowledge spreadsheets

As always, we will be continuing to put great effort into understanding your problems, how you go about solving them, and how we can make little changes to make CryptPad fit your needs better.

Caleb