Looking back, looking forward

It was four years ago when I first started working full-time on CryptPad. At that point fewer than 10 people used the service on a weekly basis. Our development team was included in that list, often multiple times since we visited from both our office and our homes.

In those early days the platform was much more of a toy than a tool. There was no CryptDrive for storing documents, no login, markdown rendering, file upload, kanban, or whiteboard. It was the first of four years of a research project in which we were responsible for building a variety of collaborative editors. We mostly used CryptPad to prototype new technologies before committing to a much more complex integration into the larger project. Nobody insisted that our editors include the extra privacy features we designed, yet within our small team we definitely hoped they would catch on.

We knew that as long as we produced viable editors and passed our project’s yearly reviews we didn’t have to worry about our jobs. It felt like we were supposed to take risks and we certainly did. The stakes were low. Sometimes if we wanted to test the platform together we’d just push our code to our production server. Occasionally we’d edit files directly on the server to cut out additional steps. We did our work as quickly as we could without having to worry about the consequences because nobody was relying on us for their safety.

It was an exciting time.

Two thousand and nineteen

Our situation today is drastically different. Privacy is very much in the public eye, although the news is more often bad than good. In any case, instead of ten weekly visitors CryptPad.fr now supports more than ten thousand.

Many of those that trust us to protect their information have no cause to use our service other than the very reasonable expectation that nobody will access their content without their consent. We’re pleased to be able to offer this peace of mind and we appreciate that we need this demographic and its expectations to become the norm if those with more extreme requirements are to blend in with the crowd. As the saying goes: privacy is a team sport.

As proud as I am of the project’s advancement since our humble beginnings, I still feel as though we’ve been playing this sport defensively in these last 365 days. We began the year with the knowledge that our stable funding was about to dry up and that our efforts to sustain the project via subscriptions and donations were not going to be enough. At the same time, increasingly more of our time was occupied just keeping up with regular issues: answering emails, fixing bugs, and managing a progressively more complex codebase. Meanwhile, we had to consider the effects of every change on those users whose physical safety occasionally depends on their privacy.

Fortunately for us and our community we received an enormous amount of support from Europe’s Next Generation Internet initiative, both in terms of publicity through the presentation of an NGI award and monetary contributions through the NLnet PET grant program. We’ve still had to cope with an endless stream of feature requests and correspondences, but the funding definitely addressed our existential worries for a time.

In the course of our CryptPad Teams project we struggled to balance all the responsibilities of our position and as a result it’s taken somewhat longer to complete the project than we planned. I now have a better appreciation of how much easier a project can appear in its planning compared to its execution. The opportunity to go slightly over budget on a small project has been a welcome learning experience that I hope not to repeat.

Looking forward

At this stage in our project it isn’t enough for our team to try to keep up with tickets on our issue tracker. Reactive decisions won’t make our project sustainable, nor will they effectively serve the community that has helped us get this far. That’s why in 2020 we’re going to focus on project governance and providing a cohesive vision with the hope of getting more of our stakeholders directly involved in its success.

I spent a large part of this holiday season making small changes to make it easier to correctly configure a CryptPad instance. Starting in January we’re going to continue this effort to support the 300 independent instance administrators with a radical overhaul of our documentation, along with simplified guides for users and more detailed guides for contributors.

Our immediate roadmap will also feature further development of our admin panel to ensure that community instances can be governed by team members lacking advanced technological expertise. Beyond that we’re looking forward to some big improvements to the tools that are most essential to effectively coordinate distributed groups of people, namely our rich text, spreadsheet, and kanban apps.

There’s still a lot of work we can do to improve the social integrations first proposed in our Teams project. We’ll continue to streamline the process of onboarding new team members and add in some even more advanced controls for very sensitive data.

I’ve been hesitant to commit to development time that doesn’t yet have a source of funding but in the coming year I hope to be able to deliver an improved experience for users of mobile and touch-enabled devices.

How you can help

Privacy should not be a luxury item. CryptPad has been built largely with public money and we’re committed to continuing its development as a public good. Continued monetary contributions via donations enable us to offer our services to users regardless of whether they can contribute themselves.

Along with subscriptions to our platform, our independent revenue helps to finance all the minor tasks that don’t easily fit into the narrative of a successful grant proposal. Every cent of these revenue streams goes back into development and we do our best to get the most value out of your contributions.

There are, of course, many other ways you can contribute. Any publicity you can generate will free us to spend less time marketing and more time improving the software and its documentation. Sharing our messages on social media with your followers helps a lot, so please follow us on the Fediverse and Twitter. We especially appreciate personal messages that tell the world exactly what it is you love about CryptPad.

We’re also happy to support and publicize offline events promoting the project. If you’re comfortable speaking in public and would like to represent us in your community, feel free to contact us about it and we’ll see how we can help.

As we produce more documentation we’ll also need help reviewing it and keeping it up to date. Every little bit helps, whether it’s a page or a line of documentation corrected. Finally, we welcome any efforts to translate CryptPad into a new language or to help those already working on our existing translations.

Wishes for 2020

I made a deliberate choice in naming the most recent cycle of releases after extinct animals. We are living through a major extinction event and a growing list of crises. More than ever we need a hopeful vision of the future.

I’m personally grateful for the opportunity to offer tools to support these endeavors.

Embrace private spaces.
Connect with those around you.
Organize and build a better future together.

See you in 2020!

Yesterday I made a mess

Normally when I write a blog post it’s because I have exciting news to share. This time it’s not a fun occasion because the only good news I have is that the bad news isn’t permanent.

The bad news is that during some database maintenance yesterday (June 13th) I accidentally removed some of the data from users’ encrypted drives. The good news is that these files were archived, not deleted, and that everything can be recovered.

Before I get into the details of why this happened I’d like to clarify which user data was archived and how to check if your account was one of those affected.

How to tell if you were affected

First off, everything is related to my actions administrating the database of CryptPad.fr. Users of other instances have nothing to worry about unless their administrator did the exact same thing as I did, which is unlikely.

Secondly, the issue is limited to shared folders and non-owned files contained within them. If you don’t use shared folders you won’t be affected.

Thirdly, as far as we can tell you need to have visited CryptPad.fr between May 28th and June 13th in order to have run some incorrect code.

Finally, nothing was archived unless it had not been active within the preceding 90 days. In the case of shared folders, this would mean any change to the content or structure of the shared folder, such as adding or removing a document or renaming or moving any of its contents. In the case of pads, if a user with the rights to edit the document loaded it without making any changes, that would classify it as active.

To summarise:

Some of your data could have been archived if you visited https://CryptPad.fr between May 28th and June 13th (2019) and have one or more shared folders in your CryptDrive which have not been modified within the last 90 days.
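The conditions above can be written down as a small check. This is only an illustrative sketch in Python, not code from CryptPad itself: the function name and its inputs are hypothetical, and the dates and the 90-day window are taken from this post.

```python
from datetime import date, timedelta

# Values taken from the post; everything else here is a hypothetical sketch.
BUG_RELEASED = date(2019, 5, 28)   # release containing the incorrect code
ARCHIVAL_RUN = date(2019, 6, 13)   # day the archival script was run
INACTIVITY_WINDOW = timedelta(days=90)


def could_be_affected(visited_on, shared_folder_last_modified):
    """Return True if an account matches every condition listed above.

    visited_on: date of a visit to the instance
    shared_folder_last_modified: list of dates, one per shared folder
    """
    visited_in_window = BUG_RELEASED <= visited_on <= ARCHIVAL_RUN
    has_stale_folder = any(
        ARCHIVAL_RUN - modified > INACTIVITY_WINDOW
        for modified in shared_folder_last_modified
    )
    return visited_in_window and has_stale_folder
```

An account with no shared folders, or whose shared folders were all modified recently, never satisfies the second condition regardless of when it visited.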

Checking if you were affected

It should be fairly easy to tell if your account was affected by opening your CryptDrive. Affected shared folders will be visible in the tree on the left of your drive because they’ll have lost their titles, as highlighted in red below:

[Image: archived shared folders]

How we’re going to handle this

As I said, none of the data was deleted, just archived. It’s still on the same server that hosts the rest of our database, it’s just been moved to a different location to make it inaccessible.

I’ve already restored all of those files which were archived except for 237 cases. Affected users that visited CryptPad.fr between the removal and restoration of the data would have automatically created a new folder in the same location as the old one, and that complicates things for us. Since we don’t know whether they might have decided to put new documents in that folder in the meantime, it’s dangerous for us to overwrite the new data with the old.

It’s going to take us a few days to figure out if we can use some fancier methodology to identify what data we can safely reinstate. In the meantime, we’ve already fixed the underlying issues that caused this data to be miscategorized, and developed some new tooling for safely diagnosing and restoring archived data.

Since we know that those affected by this error visited since our last release day and that they had content older than 90 days, we assume they’re going to come back to the platform. If you do come back and see something resembling the image above, please do let us know by emailing us at contact@cryptpad.fr. We can manually restore any files that haven’t already been restored.

I’m very sorry for any inconvenience this might have caused and I’m grateful that the damage wasn’t worse. I’ll take this as an opportunity to prove my commitment to protecting user data, whether it be from surveillance or from my own mistakes.

Post-mortem

With all the practical details addressed for those who only have the time to make sure their own data is safe, I’ll go further into the specifics for anyone who might be interested.

The pinning race condition (May 28th)

On May 28th we released CryptPad Xenops. It introduced notifications for users through the use of something we’ve been calling “encrypted mailboxes”. Each registered user now has a mailbox through which any other user can send messages, currently for friend requests, and soon for other features.

While we were implementing the function which loads new messages from this mailbox we introduced a bug which caused some other functions to be executed in the wrong order. I personally reviewed the code but didn’t see the bug.

Registered users are able to send instructions to the server not to delete data that is relevant to them. We call this process “pinning” and it’s done every time a user uses the service.

What should have happened is that users loaded their drive, then loaded their shared folders, and only then pinned all the contained files. Instead, they loaded their drive and then started loading their shared folders and the pinning process in parallel. This caused what’s called a race condition, which means that two things happen at the same time and the outcome depends on which one finishes first.

Race conditions are especially annoying because sometimes they only occur under certain circumstances, so these bugs tend to slip past basic testing unless you already know what you’re looking for. In our case, losing the race meant that files weren’t pinned and consequently the server didn’t have an accurate notion of which data was worth keeping.
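To make the ordering problem concrete, here is a minimal sketch of this kind of race. It is written in Python with asyncio purely for illustration; it is not CryptPad's client code, and the function names, file names, and delays are all made up.

```python
import asyncio


async def load_shared_folders(drive, delay):
    """Simulate fetching the contents of shared folders over the network."""
    await asyncio.sleep(delay)
    drive["files"].extend(["shared-doc-1", "shared-doc-2"])


async def pin_everything(drive, pinned, delay):
    """Simulate telling the server which files it should keep."""
    await asyncio.sleep(delay)
    pinned.update(drive["files"])


async def buggy_session(folder_delay, pin_delay):
    drive = {"files": ["own-doc"]}
    pinned = set()
    # Both steps start together, so the result depends on which finishes first.
    await asyncio.gather(
        load_shared_folders(drive, folder_delay),
        pin_everything(drive, pinned, pin_delay),
    )
    return pinned


async def fixed_session(folder_delay, pin_delay):
    drive = {"files": ["own-doc"]}
    pinned = set()
    # Pinning only starts once the shared folders have finished loading.
    await load_shared_folders(drive, folder_delay)
    await pin_everything(drive, pinned, pin_delay)
    return pinned
```

When the simulated folder load is slower than the pin step, `buggy_session` pins only `own-doc`; when it happens to be faster, everything is pinned and the bug stays hidden. `fixed_session` pins all three files regardless of timing.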

Running out of space (June 3rd)

Several months ago a user contacted us saying that data had disappeared from their drive. This was quite scary from our perspective as for every user that contacts us about something we can generally assume that there are many more that had the same issue, but didn’t say anything.

We spent several days debugging their problem and developing tools which would analyze the history of their drive without exposing any of their encrypted content to us. In the end, it turned out that the files didn’t ever exist in their history, so it wasn’t a matter of us losing that data. Nevertheless, the situation was stressful enough that we turned off all of our scripts for deleting inactive data until we could sort out a more reliable methodology for handling data.

With that regular process not in place, and with increasingly more users visiting our service, our database continued to grow at an accelerating pace. On June 3rd we started receiving automated emails from XWiki’s infrastructure services that we were down to 20% of our disk space. We had been meaning to handle this problem for some time but with 33 emails arriving in our inboxes each day we finally decided to prioritize it.

Replaced the race condition (June 6th)

After the Xenops release we noticed an error that was occurring in our browser consoles fairly regularly and decided to debug it. We tracked it down and fixed it, but since we weren’t looking for the other race condition described above, we managed to change the code in such a way that a functionally identical race condition was still present. We fixed one issue, but pads still weren’t being pinned reliably.

Incorrect data archival (June 13th)

Having proceeded with fixing a variety of other bugs, I turned my attention back to solving our storage issue. Deleting data hadn’t become any less scary than it had always been so I proceeded with caution, implementing an archival system that would move inactive data to what we termed “cold storage” for a set period before removing it permanently.

I implemented some code for iterating over our complete database and used that to create a script for checking the most recent modifications to user data. I read through it a number of times, tested it on my local database and had my colleague review it and test it on his machine. Before using it on our production database I made sure to also write and test a script that would restore archived files in case anything went wrong.
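The general idea of archiving rather than deleting can be sketched as follows. This is a simplified illustration in Python, not the script described above: it assumes one file per document on a local filesystem, uses file modification time as a stand-in for "most recent activity", and assumes that pinned documents are always left in place. The function and directory names are hypothetical.

```python
import os
import shutil
import time

RETENTION_SECONDS = 90 * 24 * 60 * 60  # the 90-day window mentioned in this post


def archive_inactive(data_dir, archive_dir, pinned, now=None):
    """Move unpinned files that have been inactive for 90 days into archive_dir.

    Nothing is deleted. Returns the names of the files that were moved.
    """
    now = time.time() if now is None else now
    os.makedirs(archive_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        # Skip directories and anything a user has asked the server to keep.
        if not os.path.isfile(path) or name in pinned:
            continue
        if now - os.path.getmtime(path) > RETENTION_SECONDS:
            shutil.move(path, os.path.join(archive_dir, name))
            moved.append(name)
    return moved
```

In a scheme like this the `pinned` set is the only thing protecting old documents, so if it is incomplete the script will move files that are still in use.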

I think I must have sat in front of my laptop and stared at my screen for between five and ten minutes before I hit enter on the command to run the script. I had the code for the script on another monitor, and I double-checked it before deciding to proceed. I reloaded my drive to make sure everything was still there once it finished running, and it was. After twenty minutes or so of testing everything seemed alright, so I went on with my day.

Later on we finally noticed that there was a problem with one of our user accounts, specifically with a shared folder having disappeared. We stayed at the office late into the evening to figure out what had happened, and ended up tracking the problem to the pinning logic before deciding to follow up on it in the morning.

Final debugging and restoration (June 14th)

With as restful a night as I could manage under the circumstances, I came back to the office this morning with a bit of perspective on the issue. I wrote up a pad which collected all the information we had into one place, identifying the circumstances under which we believed the problem could occur.

I reviewed the script which restored archived files, making sure that it would not overwrite any user data if utilized. My colleague implemented a fix for the race condition which contributed to the pinning issue, which I deployed as soon as I could review it.

After writing a few more scripts I was able to determine the number of shared folders which had been replaced with conflicting entries with the same identifiers (237). Knowing this number allowed me to determine how to handle the issue. Had the number been significantly smaller it might have been easier to handle, but the order of magnitude is such that we’ll have to figure out an automated way to deal with the issue or else spend the next few weeks responding to emails and manually recovering files.

With a better grasp on the situation and with some confidence that it wasn’t the database processing scripts which were incorrect, I restored the archived files with the exception of those which conflicted with the production database.
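A restore step that refuses to overwrite newer data might look roughly like the sketch below. It is an illustration in Python under simplifying assumptions (one file per entry, archived and live data in two local directories), not the actual restoration script, and the names are hypothetical.

```python
import os
import shutil


def restore_archived(archive_dir, data_dir):
    """Move archived files back unless a newer file has taken their place.

    Returns (restored, conflicts) as two lists of file names.
    """
    restored, conflicts = [], []
    for name in sorted(os.listdir(archive_dir)):
        target = os.path.join(data_dir, name)
        if os.path.exists(target):
            # A replacement was created in the meantime: leave both copies alone.
            conflicts.append(name)
            continue
        shutil.move(os.path.join(archive_dir, name), target)
        restored.append(name)
    return restored, conflicts
```

The conflicts list corresponds to the cases that need a closer look before anything is reinstated, since the entry in the live database may contain new content.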

Conclusion

If I’ve learned anything in my time working on CryptPad it’s that I should appreciate the reasons why the majority of the software industry doesn’t work with encrypted databases as we do. Even on a good day it can be a harder job than it would otherwise be. On a day like today we end up having to reason about what the clientside code would have done under various circumstances and think about what information we can access.

In any case, I’m very happy that we decided to turn off our deletion scripts months ago. Had they still been active, this relatively mild pinning and archival bug would have resulted in data loss.

While we can tell that 237 shared folders were affected, we still have to think about how the absence of that data would be handled by the code for users’ CryptDrives. To further complicate things, we have to think beyond what our code would do and into what users might have done in reaction to what they saw. If they saw and removed the now-empty shared folders in their drive, they no longer have the encryption keys to decrypt them even though we’ve now restored the underlying data. Because we’ve spent so much time trying to protect our users’ privacy we can’t actually ascertain if they’ve interacted with this part of their drive at all.

On one hand, it makes my life that much more stressful to have to figure out the answers to these problems. On the other, I’m hopeful that by doing this work now I’ll help pave the way for more developers to create services which offer similar protection for their users’ data.

As stated above, if this particular mistake affected you, don’t hesitate to contact us. Otherwise, I can only hope that the way we handle it ensures that you continue to trust us with your data.

Our future is collaborative

For anyone that doesn’t have the time or interest to read the rest of this article, the short version is that the CryptPad team has received a 50,000 euro grant from the NLnet Foundation. This funding will be directed towards the design and development of team-centric features in a project we’re calling CryptPad Teams.

If you’re still reading, I assume you want to know more about our plans and our relationship with NLnet.

Some backstory…

Up until the end of March 2019 our team’s work was funded by the OpenPaaS project, a four-year French research project in which CryptPad was only a minor component. Our role was to produce a set of collaborative editors for the open-source platform. It was never stated that our contributions should be delivered as a standalone platform, but having a self-contained code-base that we could easily update and deploy simplified our job.

CryptPad had already been prototyped as a part of a previous research project, though its scope was considerably smaller than what would be required by OpenPaaS. Since the platform was being developed with businesses and other large institutions in mind, confidentiality was a concern and a stated requirement of the project. Even so, I think it’s fair to say nobody expected us to make privacy such a central part of our design.

In many organizations these design choices might have been seen as digressions. We’ve been fortunate to have had a lot of support from our employer (XWiki SAS). Consequently, we were able to nurture a prototype such that it grew into a platform, a product, and a community. Still, we knew all along that our role in the OpenPaaS project would come to an end, and that without external funding it would be difficult to continue with the momentum we’d established.

Support from our community

As an active member of the community concerned about privacy issues, I know there are a lot of people that are suspicious of government money. While I understand that this distrust is justified by a lot of history, I’m very satisfied with what I consider the European software model of funding public work with public money, keeping in mind that I’m a Canadian that’s lived in France for the past few years.

Without the social investment we’ve received so far it would have been very difficult to create a product of sufficient quality that anyone would pay for it. I often hear people rebut this point by saying that a lot of free software is produced for free by volunteers. Personally I’m in the camp that believes that the people writing that software deserve the same financial stability that is enjoyed by those producing software with proprietary or extractive business models, but that’s a bit beyond the scope of this article.

In any case, before going on to talk about the very generous contribution we’ve received, I wanted to acknowledge the support up until now from individuals and organizations that use CryptPad. Since the end of our last project and the beginning of this new one, we’ve been sustained by a mix of the revenue generated by subscriptions to CryptPad.fr and donations to our OpenCollective campaign. These contributions help to keep CryptPad going in such brief periods when we haven’t secured larger sources of funding as well as providing alternatives should such opportunities cease to be available.

You can expect another post in the near future where we’ll go over the status of our crowdfunding campaign in more depth.

NLnet and the Next Generation Internet

You might recall that we recently visited Barcelona to receive an NGI award for privacy and trust-enhanced technologies. Those awards were organized as a part of the NGI initiative, funded by the European Union’s Horizon 2020 research program.

As a part of the initiative, NLnet has been made responsible for distributing a rather large sum of money to smaller projects through the administration of Search and Discovery and Privacy (and trust) Enhancing Technologies. By delegating these enormous tasks to NLnet the EU has recognized their excellent track record for supporting projects that actively contribute towards an open information society.

Naturally we’re very happy to receive the financial support, but beyond that the foundation has offered a variety of other resources which they have at their disposal by way of having played a strong role in the European free software community. They’ve offered expertise in accessibility, documentation, security auditing, internationalization, and legal matters surrounding software licensing, among other things.

Finally, it’s worth mentioning that for all of this support that we’ll receive, the amount of time we’ve spent writing the initial proposal and following up until the point of signing a contract has been remarkably brief. Whether considering the delay between submission and acceptance or the actual time spent on documents and correspondence, they’ve kept the bureaucracy to an absolute minimum. For a small team like ours, this makes a massive difference in our ability to access such funding and to put more of our time towards the activities the money is meant to support.

What CryptPad Teams will entail

The purpose of this grant is to develop technologies which enhance the public’s ability to preserve their privacy. Our contract defines the milestones which we must reach in order to get paid. I voluntarily included a stipulation that we would not consider a goal complete until its components were publicly accessible as source code and in our hosted platform. This was meant to ensure that the outcomes benefit our community of users and developers alike.

Starting with CryptPad 2.23.0 we’ll introduce support for personal encrypted mailboxes for registered users. We’re not looking to replace e-mail or the other platforms which are focused on encrypted messaging; this will just be a simple feature which will allow users to interact with each other more effectively whether or not they are online at the same time.

Our first use-case for this is an improved version of our “friend request” which currently requires that both users be online. You’ll be able to send friend requests from users’ profile pages and they’ll see a notification the next time they visit CryptPad. Going forward we’ll use the same system to offer friends access to documents directly through the sharing menu, instead of having to send URLs over potentially insecure mediums like unencrypted email or messengers. Similarly, friends will be able to request the ability to edit documents that they can view, as well as to request “ownership” over documents which they should be able to delete.

As minor as some of this functionality might sound, we believe it will make a positive and significant impact on users’ privacy. We want to minimize how often they have to directly handle the encryption keys which protect the contents of their documents.

After these initial steps we’ll begin offering first-class support for teams within CryptPad, allowing users to define groups of friends so that they can delegate access quickly and effectively. Teams will integrate with shared folders and will eventually offer features targeting various types of groups, whether hierarchical as is customary in many businesses or on a more ad-hoc basis as might be expected with friends or other self-organizing groups. Team members will benefit from better oversight as to who can access particular documents, reducing the likelihood that they’ll accidentally leak private information. We want to offer users better oversight into the activity of documents in their CryptDrives, both to make it easier to quickly join editing sessions with friends, as well as to make it noticeable when access to a document has leaked outside of its intended audience.

The hard part

Different groups have different levels of trust among their members. It’s difficult to build these features in a manner that’s fast to use with friends while still preventing your boss from spying on you. We’re committed to thinking through all of these cases to keep our users safe, and to acting on users’ concerns if we don’t get it right the first time.

We’re excited to begin this project and grateful to everyone supporting our efforts, financially or otherwise. Teams is the first grant we’ve received explicitly for the development of CryptPad, and we couldn’t have gotten here without help. As always, if you have ideas, concerns, or questions feel free to contact us.

Join the team

We’ve been making a big deal of our funding status for the last while, and for good reason. CryptPad has largely been funded by the OpenPaaS research and development project, funded by BPIFrance. We’re very happy with the results of the past four years of work, but this support will terminate at the end of March 2019.

While this change is a bit scary for us, it also means that we’ll be free to pursue new research projects. Europe is investing in technologies that promote human-centric values, so there are many opportunities that align with our goals. We have been actively seeking funding from a variety of sources, and though things are currently uncertain for us, it’s quite likely that our team will need to expand to prepare for upcoming obligations.

The skills we want

We’re looking for web technologists and product designers with experience in privacy engineering. If you already use CryptPad, encrypted messengers, or other similar communication systems to protect your personal data, that knowledge will be an asset. If you use unencrypted platforms and have a good understanding of the personal and societal trade-offs, that will count in your favour as well.

This field is fairly young, so we’re open to any experience you have, not just what you’ve learned in a professional or academic context.

In terms of technical skills, our daily work typically includes:

  • Clientside Javascript (ES5) and cross-platform browser APIs
  • Nodejs
  • CSS3 and LESS
  • HTML5
  • BASH
  • GIT
  • SSH, information security, and basic system administration

We’re interested in incorporating skills we don’t already have, so don’t panic if you’re unfamiliar with anything listed above.

Perhaps more important than the technical skills are the so-called soft skills:

  • Empathizing with users and prioritizing improvements based on their impact
  • Communicating well within a team (including asking for clarification if your goals are ever unclear)
  • Managing your time well (we avoid micro-managing and working overtime)
  • Reasoning about pragmatic security
  • Consideration of both immediate tasks and long-term goals

What we offer

XWiki SAS has been developing open-source software for the last 15 years, and we rely on open-source tooling internally. Joining our team means learning how to run a sustainable business while giving away our product for free (without selling user data).

Otherwise you can expect:

  • A relaxed work environment (in Paris, France or Iasi, Romania) with part-time remote work
    • or negotiable full-time if relocation is not possible or desirable
  • To develop portable skills using open-source software
  • International travel (at our expense) when promoting the company or our projects
  • Opportunities for advancement, training, and other benefits
  • The chance to shape the future of an exciting project with your personal view of responsible data handling
  • To become an expert in privacy-enhancing technologies (we’re literally an award-winning team)

A special note to researchers

We’re very interested in distributed systems, data science (as an adversary against privacy), and human-computer interaction. If you are knowledgeable about any of these, some intersection, or anything else that might be relevant, that’s great!

If you have recently attained a PhD from an institution recognized by the EU, there are subsidies which can help us pay your salary. We have authored two peer-reviewed papers to date, so we can offer continued involvement in the research community if you desire.

Caveats

Sorting through CVs can be a lot of work, though a little transparency on some issues might help lighten the burden on our side. Below are some things to consider before contacting us.

As stated above, our ability to hire will be based on the status of some pending proposals. We don’t currently know how many positions will be available, and our timeline on when we could hire is fuzzy at best. We’d like to have your profile ready so we can act quickly once we know more.

We can’t compete with the salaries offered by companies in Silicon Valley, though ours are comparable to those of other European businesses. As a consolation, you’ll be directly involved in determining how we move forward, and you’ll gain insight into the exciting European research ecosystem.

Our funding sources tend to place restrictions limiting those funds to residents of European member states. I moved to France from Canada to work on CryptPad several years ago, but things are generally simpler if you’re already here. Don’t let that stop you from contacting us, though!

We understand that talent comes in many forms, and we welcome new ideas. We’re willing to make exceptions for promising candidates, but we’d like to know that you care about the topic. There are probably better options available if you just want a job.

If you are interested…

Contact us at jobs@cryptpad.fr with a recent CV and a brief introduction explaining what you’d bring to the team.