Casual and blanket theoretical characterizations of OT and CRDT algorithms are exactly the kind of folly the article is trying to address.
For example, it's a logical contradiction to begin with the statement "OT's complexity is _generally_ governed by editing history (H), i.e. O(H)", conclude that "CRDTs have much better perf than OT", and then add the qualification that the argument applies only to the specific off-line editing scenario (which, btw, is also not true).
It's also inaccurate to base CRDT's complexity on N, the document size, i.e. the number of characters visible in the document. You need to include tombstones in the case of WOOT variants, or use garbage collection for RGA (which then requires vector clocks). These nuances are described in section 4.3.
OT's complexity is governed by O(c), where c is the number of concurrent operations, period.
For off-line edits, the short answer is that OT's complexity is also not O(H), if H is the number of character edits, because you can easily apply compression. Now, the longer answer: there is pretty strong spatial locality in the distribution of edits over a document -- we don't sporadically and randomly add or delete characters around a document; intuitively, most inserts are adjacent characters (i.e. strings), and many deletes are over newly inserted strings. OT uses string-wise operations, so it compresses n consecutive character insert ops down to a single string op. In addition, you can compress delete edits over newly inserted content, i.e. [insert 'To be or not yoo be' at 0, delete 'y' at 14, delete 'o' at 14, insert 't' at 14] => [insert 'To be or not to be']. This is the basic idea behind "operation compression", which is what OT would use to support off-line, non-real-time, asynchronous editing, however you want to call it. There is a better example here (http://www3.ntu.edu.sg/home/czsun/projects/otfaq/#_Toc321146...).
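That compression idea can be sketched in a few lines. This is a minimal illustration only, assuming a simple ("ins"/"del", position, text) op encoding that is not taken from any particular OT implementation:

```python
# Hedged sketch (illustrative op encoding, not from a real OT library):
# compress a log of edits by merging inserts that land inside the
# previous insert, and cancelling deletes that fall wholly inside it.

def compress(ops):
    out = []
    for kind, pos, text in ops:
        if out:
            pkind, ppos, ptext = out[-1]
            if pkind == "ins" and kind == "ins" and ppos <= pos <= ppos + len(ptext):
                # Merge an insert that lands inside the previous insert.
                k = pos - ppos
                out[-1] = ("ins", ppos, ptext[:k] + text + ptext[k:])
                continue
            if (pkind == "ins" and kind == "del"
                    and ppos <= pos and pos + len(text) <= ppos + len(ptext)):
                # A delete wholly inside newly inserted text cancels locally.
                k = pos - ppos
                out[-1] = ("ins", ppos, ptext[:k] + ptext[k + len(text):])
                continue
        out.append((kind, pos, text))
    return out

edits = [("ins", 0, "To be or not yoo be"),
         ("del", 13, "y"), ("del", 13, "o"), ("ins", 13, "t")]
compress(edits)  # collapses to [("ins", 0, "To be or not to be")]
```

A real engine would also handle deletes of pre-existing text, overlapping ranges, and so on; the point is only that an offline edit log shrinks dramatically before any transformation happens.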
These optimisations (whole-string operations, compression, etc.) apply equally to CRDTs. See for instance DOI 10.1145/2957276.2957300. See also the blanket optimisations studied by Carlos Baquero's group (which I doubt could carry over to OT because of the complexity of OT theory).
DOI 10.1145/2957276.2957300 is an attempt at using binary-tree variants to address CRDT's search-performance bottleneck when converting external index positions to internal object identifiers. We address these specific complexities in Section 4.2.1.
DOI 10.1145/2957276.2957300 doesn't deal with operation compression. If you believe it does, please point out where we might find it.
It is trivial to state opinions like 'OT is complex', but much harder work to back them up with concrete evidence. I'd think folks would find the discussion more informative without the constant hand-waving.
Have you submitted the paper to a peer-reviewed venue? I've read the arXiv version already, and I'd be more than happy to volunteer reviewing it if that's not too late.
It appears you have specific comments and/or feedback. Either post them on HN or send them to the primary contact's email. Either way, we'll make an earnest effort to respond.
This discussion is of interest to folks both inside and outside of academia and the developer community brings valuable hands-on experiences to the table. So, let's keep it inclusive.
Let me prioritize the usual academic processes first to make sure the outcomes are more persistent and visible, given we are talking about a publication under review or a draft. I will reach out to the paper's primary contact.
The time complexity of most OT systems is not related to H.
I've been working in OT systems for years (G Wave, ShareJS, ShareDB, some other stuff). I'm consistently surprised by how badly academic papers predict OT systems will perform. In reality, they perform great. My little C implementation of text OT can handle about 20M text operation transforms / second.
Part of the gap is that many academic papers model text operations as just single character edits. If you do that, copy+pasting some text into an editor can inject thousands of insert operations into the system all at once. But that design is really sloppy - nobody actually designs real world OT systems like that. A much better way to design text OT operations uses a list of skip, insert and delete parts. That way you can arbitrarily compose edits together. This way an entire pasted string just gets inserted in one operation and performance is fine. (Or if the user goes offline, you can merge all the edits they do into a single change, then transform & commit it all at once).
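A minimal sketch of that component model, with an assumed encoding (integers for skips, tagged tuples for insert/delete) that is illustrative rather than ShareJS's actual wire format:

```python
# Hedged sketch of a skip/insert/delete component model (encoding is
# illustrative, not ShareJS's actual format). An operation is a list of
# components applied left to right over the document.

def apply_op(doc, op):
    out, i = [], 0
    for c in op:
        if isinstance(c, int):      # skip: retain the next c characters
            out.append(doc[i:i + c])
            i += c
        elif c[0] == "i":           # ("i", text): insert text here
            out.append(c[1])
        elif c[0] == "d":           # ("d", n): delete the next n characters
            i += c[1]
    out.append(doc[i:])             # implicit trailing skip
    return "".join(out)

# A whole pasted string is a single insert component, not N char ops:
apply_op("hello world", [6, ("i", "brave new ")])  # "hello brave new world"
```

Because operations are whole edits rather than character streams, composing an offline session into one operation is natural in this model.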
I've still never seen the OT algorithms of anything I've worked on have a meaningful impact on performance. Actually thinking about it I'm not sure if I've ever even seen the OT code show up in profile traces.
My personal experience is different. OT tends to be quadratic in terms of computational complexity and codebase size.
As long as you consider a simple case, 1^2 is 1. Then, add more data types (not just text), faulty servers, faulty network, non trivial network architectures...
All these problems tend to interact. They easily go combinatorial, but again 1! is 1.
I worked with an OT codebase that had C++ templates within C defines and I edited them using regexes (x types of operations => x^2 types of potential transformations).
OT aside, data sync in general is not a solved problem.
Maybe I am especially unlucky, but I have yet to see a chat app that has no sync glitches. A chat is not the most difficult data structure to sync, but I saw bugs in Slack, Telegram, Instagram, Gchat, Skype, others... WhatsApp does not even try to sync...
A lot depends on the diversity of your use cases and the number of the nines you expect from the system...
<- This concurs with our experience.
When we profiled our string OT implementation on a single 2-3 GHz CPU, at 2-3 CPI (cycles per instruction), it was able to handle about 10M transformations per second. On the half dozen OT-based systems I've worked on, the OT implementation was never where we found the real engineering complexity or the performance bottleneck in working coeditors.
The significance of using a string-wise operation model in OT is underappreciated and can't be emphasized enough for real-world systems. The first string-wise OT transformation functions can be traced back to "Achieving Convergence, Causality Preservation, and Intention Preservation in Real-Time Cooperative Editing Systems", 1998. We also found that skip, insert, and delete can improve compression of string edits.
I fully agree; I don't understand the emphasis on performance, both in the literature and in comments on the subject.
Additionally, it is more common to see CRDT libraries that use character-wise operations and create an object for each character (e.g. y-js.org, github.com/google/ot-crdt-papers), which is not ideal.
Obviously there are also libraries such as Atom's Teletype that have string-wise operations.
The cynic in me feels like the CRDT vs. OT war misses the forest for the trees. What matters are the missing features, and the feature that is most needed and least described is a systematic way to offer a diff editor matching the normal experience. Indeed, after having been offline a while, one wants to see and select how their changes will integrate into the shared resource.
I think you'll find the article supports your sentiment to an extent.
One of the key messages we tried to get across is that academic research on co-editing ought to go beyond theoretical analysis and instead drive research by pushing the envelope in new areas of application and/or features. OT's evolution has been symbiotic with new applications, while we haven't seen this in CRDT (for coediting) research. The reality is that you'll find new CRDT papers every year that still (a) make fairly broad claims of benefits that are either theoretically dubious or not backed/validated by application/system implementations, or (b) dwell on technical issues that were resolved years ago. To move beyond this, we wanted to put these issues in context (a general transformation approach) and dispel the most common fallacies in theoretical analysis.
Related to the 'visual-merging' feature you mentioned: when I worked on making MS Word collaborative as a research project, one of the UX research questions was how to allow users to visualize the different combined effects of updates to objects (e.g. if you changed the background of object X to red and I changed it to yellow at the same time). We came up with a multi-versioning technique to display the potentially different combinations and help users select the one that's most desirable. It is definitely an interesting and challenging problem to consider for text documents.
My understanding of OT is that each change requires traveling backward through the edit history to find the first state that is consistent between sites, then traveling forward to transform the new change against past edits. In the worst case, an edit requires transformation against the document’s origin state.
Is that not the case?
Most OT systems only transform concurrent operations.
That's true. My performance claims are biased toward offline-capable editors, where the last consistent state between sites may be thousands or millions of edits ago. For online-only editors the set of edits between now and the most recent consistent state may be much smaller.
Here's the key sentence: "concrete implementations of CRDT in co-editors revealed key missing steps in CRDT literature." This paper may be correct for academic CRDTs but it is very wrong when looking at industry implementations.
My hunch is that because CRDTs are so much easier to grok than OT, engineers are empowered to make use case-specific improvements that aren't reflected in academic literature.
For example, the Logoot/LSEQ CRDTs have issues with concurrent insert interleaving; however, this can be solved by dedicating a bit-space in each ID solely for followup inserts. The "Inconsistent-Position-Integer-Ordering" problem is solved by comparing ids level-by-level instead of in one shot.
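A minimal sketch of the level-by-level comparison idea. The (digit, site) pair layout and the site tie-break are assumptions for illustration, not Logoot's or LSEQ's exact identifier format:

```python
# Hedged sketch: compare Logoot/LSEQ-style position identifiers level by
# level. An id is a list of (digit, site) pairs; the layout is an
# illustrative assumption, not the papers' exact definition.

def cmp_ids(a, b):
    for pa, pb in zip(a, b):
        if pa != pb:
            return -1 if pa < pb else 1
    # A shorter id that is a prefix of a longer one sorts first.
    return (len(a) > len(b)) - (len(a) < len(b))

# Flattening digit sequences into one number conflates e.g. [1, 0] with
# [10], and comparing them as strings orders "2" after "10". Comparing
# level by level keeps the ordering consistent:
cmp_ids([(2, 0)], [(10, 0)])  # -1: the id [2] correctly precedes [10]
```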
Fundamentally, CRDTs have strong strategic advantages over OT. Given a document size N and a document edit history H:
* CRDT size is O(N) where OT is O(H)
* CRDT updates are O(log N) where OT is O(H^2)
In any nontrivial document, H ≫ N. This means CRDTs have much better perf than OT. Additionally, the best CRDTs (like Logoot/LSEQ) don't require tombstones, garbage collection, or "quiescence." The complexity burden is far lower.
To top it off, CRDTs are offline-capable by default.
Thanks! I think a unification of OT and CRDT is a very promising space, both for practical implementation and for deeper academic understanding. To be honest, I only skimmed the posted paper, but it didn't grab my attention. It's very difficult these days for me to motivate spending time in this space.
TTF is a CRDT-like approach. They both maintain an extra data structure that includes tombstones. And TTF operations are only applied to that designated data structure. To be applied to the document visible to users, TTF operations need to be translated, like CRDT operations.
Your solution is to combine GOTO with TTF. Actually, it is a combination of OT and CRDT.
If you're interested in this stuff, you might also like Raph Levien's writing on it:
"Most OT algorithms have been proved incorrect": a better reference is https://doi.org/10.1016/j.tcs.2005.09.066
Representative CRDTs, at least Logoot and TreeDoc, were never truly proved. The correctness of Logoot and TreeDoc is claimed under the assumption that the required property of ids is preserved by the CRDT design. However, the existing id generation algorithms cannot achieve the desired property.
To be specific, I will enumerate two examples.
First, in Logoot (https://hal.inria.fr/inria-00432368/document), the function generateLinePositions in section 4.2 has a serious error, which could lead to failure in generating new identifiers. This flawed design can also be found in the follow-up research (Logoot-Undo, https://hal.inria.fr/hal-00450416/).
Second, in TreeDoc (https://hal.inria.fr/inria-00445975/document), the function newPosID described in section 3.2 may generate incorrect ids. This function is meant to generate an id r between any two ids p and q (p<q) such that p<r<q. However, in line 5, if pn=0, then we would get r<p. Readers can refer to Figure 3: in the tree, the character dY precedes the character dZ in infix order, but the id of dY (10(0:dY)) is greater than the id of dZ (100(1:dZ)), so their positions are inconsistent with their ids.
In the literature, CRDT researchers always claim their CRDT designs are formally proved.
However, there are various scenarios whereby CRDTs would have inconsistency issues as stated above.
In my opinion, there is not much validity to these claims.
Well, RGA has been proved formally [DOI 10.1145/2933057.2933090]. Regarding Figure 3 of the Treedoc paper, I believe the IDs of dY and dZ are in the correct order, according to the rules in the paper.
This paper follows the same reasoning as in http://hal.inria.fr/inria-00071213/, so it invariably arrives at the same bogus conclusion "Most OT algorithms have been proved incorrect". Please feel free to explain if you feel otherwise, but quoting yet another publication doesn't really help its case here.
What did catch my eye in this paper is that the authors claimed to have used CoWord as a case study (and of course, claimed that the theorem prover showed CoWord doesn't satisfy TP2). I worked on CoWord, and it was never open-sourced so how the authors got hands on CoWord's transformation functions (which were inside a compiled C++ binary OT-engine) for evaluation is truly troubling. Care to explain?
No points were missed - the paper just showed why the points are wrong. I'd be happy to discuss them if you care to read it: there are three concrete correctness problems with CRDT (Logoot) in sections 4.2.4 - 4.2.6.
Re: the articles you cited: Molli et al.'s http://hal.inria.fr/inria-00071213/ (A) and their earlier paper: https://www.lri.fr/~mbl/ENS/CSCW/2013/papers/Imine-ECSCW03.p... (B), are unfortunately wrong about OT correctness, demonstrably so.
The formalism can obscure the underlying argument, but both A and B use the same reasoning, which goes something like this:
1. Assumption: An OT system's transformation functions must always satisfy TP1 and TP2, independent of the OT algorithm or protocol, otherwise they are incorrect.
2. Proof: take any existing OT system, disregard the OT algorithm or protocol, just look at the OT transformation functions in isolation, and show that the functions do not satisfy TP1 and TP2.
3. Conclusion: All known OT systems are incorrect
The starting assumption of the proof is a false premise: OT's transformation functions do not always require TP2. The proof itself, which evaluates correctness on a subset of OT in isolation and then broadly generalizes to OT as a whole, is a sleight of hand (and sloppy theory work). Hence, the conclusion is plainly wrong.
TP2 is only required under a very specific pre-condition, meaning that avoiding this pre-condition removes the need to preserve TP2 at the level of transformation functions. The TP2/CP2-avoidance strategy is commonly used in many OT algorithms and protocols: GOT, JUPITER, G-WAVE/DOCS, COT, TIBOT etc., which are provably correct (see https://www.computer.org/csdl/trans/td/2016/03/07060680-abs.....). Transformation functions capable of preserving TP2/CP2 for string-wise text editing (with verified correctness and without tombstones or other meta-data) are also available (see https://dl.acm.org/citation.cfm?doid=2998181.2998252).
What I also find really dubious is the verification method used in A and B: paper B starts by using a theorem prover to definitively prove that all previously known OT systems are wrong. Then the paper proposes new transformation functions that the same theorem prover shows to be definitively correct. Paper A, which came after paper B by the same authors, used another theorem prover to show, again definitively, that all previously known OT functions are wrong, including the ones from B that were proposed as "definitively correct". So A basically contradicts the results in B, and both were definitively "proven correct".
The argument of Sun's paper seems to be that CRDTs have hidden performance costs. Perhaps this is true.
This completely misses the main point. OT is complex, the theory is weak, and most OT algorithms have been proven incorrect (see http://hal.inria.fr/inria-00071213/). AFAIK, the only OT algorithm proved correct is TTF, which is actually a CRDT in disguise.
In contrast, the logic of CRDTs is simple and obvious. We know exactly why CRDTs converge. There are several papers proving that specific CRDTs are indeed correct.
Furthermore, I'm somewhat doubtful of the performance discussion, since Attiya et al. proved that there is a lower bound on the complexity of concurrent editing, independently of the technology used. See http://dx.doi.org/10.1145/2933057.2933090.
Disclaimer: I did not read the paper in detail, just skimmed over it.
Except for WOOT, which CRDT for co-editor does not require causal ordering?
RON CT/RGA may consume inputs in arbitrary order.
Some background on the Causal Tree (CT) that gritzko is referring to: http://archagon.net/blog/2018/03/24/data-laced-with-history.
In the original RGA paper, RGA does require that operations are received in causal order, and RGA preserves causality with a state vector technique.
Are you the inventor of CT?
"most other OT solutions ... do not require a central server to do (any part of) the OT work, but only require the use of an external causal-order-preserving communication service (the same as most CRDT solutions). "
CRDTs don't require an external causal-order-preserving communication service. That's kind of the whole point of CRDTs. At the same time this is what imposes certain limitations on how they can be used. But so is automatic conflict resolution in collaborative editing.
The article above says sharedb is originally written by a Google Wave developer.
waves Y'all are talking about me. I was indeed on the wave team, although I joined Wave right near the end. (And then I stayed on to help opensource the whole thing.)
We wrote ShareDB at Lever, which was in the 2012 YC batch (iirc). We wrote it to allow realtime collaborative editing in our application of all our data fields by default. I'm still really proud of that work. ShareDB primarily uses JSON-OT, which lets you do realtime OT over arbitrary application data. The lever team has been maintaining & improving sharedb for the last few years, which is really lovely to see as I've moved on to other projects.
One nice thing about OT is that it's much easier to implement. The original quilljs author (Jason Chen) wrote the rich text OT implementation. ShareDB works with any OT code (so long as operations can be JSON stringified / parsed). And it's all hooked up in quill, which is super neat. And looking at the quill github, it looks like David Greenspan has been maintaining quill recently. David Greenspan is one of the original authors of Etherpad, which is one of the first web based text OT engines.
Thanks a lot for the background on sharedb! Great work, indeed.
Only tangentially related, but today I was reading about collaborative editing using Quill.js (a rich text editor) and found a really nice solution based on sharedb (using ottypes/rich-text):
Before I dive knee deep into this, does anyone have any opinion on sharedb or better alternatives? (It was listed in this post under its old name, sharejs, as one of the more successful OT implementations out there, so that's a good sign.)
Enjoyed reading your thinking around what "Intention preservation" means for CoCalc. I think "intention" was left open-ended since it was always meant to be defined in the context of a specific application; e.g. image vs. text editing would likely have different notions of user intentions and what it means to preserve them.
TP2 has been made into this monster that seems to scare anyone starting to look at OT. You don't need to worry about TP2 with a suitable protocol, whether with a server or fully peer-to-peer, and either solution is simple enough to implement. A fine point is that TP2 and intention violation are distinct properties: you can easily achieve convergence by serializing the operations through a total order at every site (but you won't have intentions preserved); on the other hand, if you insert 'a' and I insert 'b' at the same location simultaneously, it's reasonable for both results 'ab' or 'ba' to be intention-preserving, but without transformations we'd each see a different result (thus divergence).
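The 'a'/'b' scenario can be sketched as follows. The transform function and the site-id tie-break are illustrative assumptions, not any particular system's published functions:

```python
# Minimal sketch: transform two concurrent inserts at the same position
# so both sites converge. Breaking the positional tie by site id is an
# illustrative policy, not a specific system's rule.

def transform_ins(op, other):
    pos, ch, site = op
    opos, _, osite = other
    if opos < pos or (opos == pos and osite < site):
        return (pos + 1, ch, site)   # shift right past the other insert
    return op

def apply_ins(doc, op):
    pos, ch, _ = op
    return doc[:pos] + ch + doc[pos:]

a = (0, "a", 1)  # site 1 inserts 'a' at position 0
b = (0, "b", 2)  # site 2 concurrently inserts 'b' at position 0
site1 = apply_ins(apply_ins("", a), transform_ins(b, a))
site2 = apply_ins(apply_ins("", b), transform_ins(a, b))
# Both sites converge to "ab"
```

Without the transform step, site 1 would see "ba" and site 2 "ab", which is exactly the divergence described above.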
We've documented plenty of CRDT performance and correctness issues in that article. I think a bigger app-design question I would ask is whether you want to "lock in" your app's internal data model with a consistency-maintenance scheme; there is real value in separation of concerns (SoC).
I enjoyed reading this paper. I also recently talked with Chris Colbert about his new plans to use a CRDT approach to collaborative editing of Jupyter notebooks. This made me curious again about how CoCalc’s collaborative editing is related to OT and CRDT approaches, so I wrote up a blog post about that just now. http://blog.sagemath.com/2018/10/11/collaborative-editing.ht...
I agree. I don't understand why they are attacking CRDTs at all.
I'm not accusing the authors of this, but some people seem to have knee-jerk reaction against CRDTs because from far away they look like magic. "Use this magic data structure and all your conflicts shall disappear." Of course, once you look closely you see there are plenty of tradeoffs, just like anything else.
Be specific. Which statements are attacking CRDTs?
These algorithms with unresolved issues could mislead researchers and engineers. Detecting and revealing these issues is significant.
An article of disappointing quality from well-known OT authors.
Like, algo x has issue X, algo y has issue Y, z has Z, so CRDTs have issues X, Y and Z... and many things like that.
Recently one guy reverse engineered Apple Notes. It is a CRDT. I personally made a CRDT sync for Yandex. There is also a list on GitHub... https://github.com/ipfs/research-CRDT/issues/40
Good work. Which CRDT alg was adopted in your product?
Are there any CRDT-based industrial co-editing apps? Please, do not mention the toy implementations and applications unrelated to co-editing.
> I'm the leading industry and open source expert on CRDTs
And so humble too!
Why can't OT be applied on P2P networks? Does Google Docs using a transformation-based server mean that all OT needs a central server?
Yeah, I thought the whole point of OT was that it allowed any client to do resolution (because all operations transform other operations in a way that makes them all resolve the same way no matter their order). Which is what made it seem like such overkill for Google Docs, where there is a central authoritative server, so a much simpler architecture could have done the job...
Simple OT algorithms do work a lot better with a centralized server.
With a centralized server, you can consider every resolution as happening between 2 nodes (server and client). This means transform / catchup is as simple as a for loop.
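That catchup for loop can be sketched as follows, assuming a toy insert-only op format; none of this is taken from ShareDB or any other real implementation:

```python
# Hedged sketch of a server-side catchup loop: an op based on document
# version v is transformed against every op committed since v, then
# appended. The insert-only transform() is illustrative only.

def transform(op, against):
    pos, text = op
    apos, atext = against
    if apos <= pos:          # an earlier-committed insert shifts us right
        pos += len(atext)
    return (pos, text)

class Server:
    def __init__(self):
        self.history = []    # committed ops, oldest first

    def receive(self, op, base_version):
        # The whole "resolution" is this for loop:
        for committed in self.history[base_version:]:
            op = transform(op, committed)
        self.history.append(op)
        return op

s = Server()
s.receive((0, "hello "), 0)
s.receive((0, "say: "), 0)   # concurrent op, transformed to (6, "say: ")
```

Clients do the mirror image: transform their pending op against whatever the server broadcasts. That pairwise structure is what a central server buys you.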
In contrast, in a decentralized context OT code needs to support Transform Property 2 in order to converge correctly in all cases. TP2 dramatically complicates the OT implementation, and you need a much more complicated resolution algorithm to merge arbitrary changes between nodes.
For text, this means you need:
- Tombstones (or something like it) - eg https://github.com/josephg/TP2/blob/master/src/text.coffee
- Either an operation prune function (anti-transform) or full history, forever.
- A resolution algorithm that can flatten DAGs of operations into a transformed list. This stuff gets really hairy, and hard to implement efficiently. This is my attempt from a few years ago. It's correct, but super slow: https://github.com/josephg/tp2stuff/blob/master/node3.coffee
Implementing high performance OT with centralized servers is easy. Implementing OT in a decentralized network is hard to do correctly, and much harder to implement in a highly performant way. For decentralized systems, CRDTs are a much better approach imo.
Very insightful, thank you for the detailed comment. About the DAG flattening algorithm you mention - in what way does this differ from a topological sorting? (https://en.wikipedia.org/wiki/Topological_sorting ) E.g. does it need to know about expensive or cheap combinations of operations, and try to avoid the former?
Oh! I didn’t know that had a name. Yes - it is indeed a topological sort. At least the way I’ve implemented it, the performance problem is that moving an operation from position K to position N in the output requires O(|K-N|) steps. CRDTs can do this work in O(log(|K-N|)) steps.
Several decentralized OT systems can avoid the TP2 constraint on transformation functions.
There are a few things related to TP2, aka Convergence Property 2 (CP2), that need to be unpacked here.
Does your system always need to preserve TP2/CP2 to converge correctly? Contrary to what many folks and papers claim, the answer is categorically no.
There are two general approaches to arrive at this desired outcome: (1) avoiding it or (2) preserving it. Let's look at the basics of what you need for each:
(1) Avoiding it - this can be done with a centralized server, as well as in a fully distributed, decentralized, peer-to-peer context. So I'm in agreement with josephg that a central server will avoid TP2/CP2, but I'd like to amend that: a decentralized avoidance strategy is available and just as easy to adopt. To back this up, this paper looks at TP2/CP2-avoidance systematically in representative protocols and algorithms: JUPITER, NICE, Google Wave, COT, TIBOT/TIBOT2. TIBOT/TIBOT2 are two fully decentralized algorithms/protocols that avoid TP2/CP2.
In addition, it also answers the more interesting question of why these systems were able to avoid TP2/CP2, by giving the general conditions for TP2/CP2-avoidance. You can apply the conditions to design a new protocol or algorithm that avoids TP2/CP2.
(2) Preserving it - TP2/CP2 preservation usually refers to the task of writing transformation functions (e.g. between insert and delete, or delete and delete) that satisfy the aforementioned condition. This has been a solved problem for string-wise operations since 2016:
- You can take these String-wise transformation functions and use them with many of the existing OT algorithms in fully-decentralized context and satisfy all TP1 and TP2.
- The transformation functions are free of tombstones (or variants thereof) and other schemes (DAGs etc.) used to ensure TP2.
Happy to say more about it if someone has an interest.
You can implement correct and performant OT with either a centralized server or a fully decentralized topology, using TP2/CP2-avoidance and/or TP2/CP2-preservation. I've built multiple production systems with all the varieties. I generally do prefer OT with TP2/CP2-avoidance on a server, but it's not because OT can't avoid TP2/CP2 without a server or preserve TP2/CP2; it's that there is not much real-world evidence showing clear benefits from moving a collaborative editing system to a decentralized topology, esp. when most of the CRDT-based p2p proposals in papers I've come across involve a server one way or another.
TP2/CP2 in OT is often stood up as a straw man so folks can throw stones at its correctness, but it is really a solved problem. The unfortunate outcome is that you see a ton of papers adopting this strategy to get published, and over time the conversation becomes really convoluted.
We discuss these ideas and try to dispel many of these confusions in detail in the arxiv manuscript. A good starting point would be Section 4.1.3: OT Solutions to the CP2 Issue and the FT Puzzle
Thanks! That was super informative.
It's not clear which of the authors' claims you are reacting to here.
Hey, the authors focus on CRDT for co-editors.
Either everyone is wrong, or you are wrong. There is only 1 absolute truth.
Source: all your comments are downvoted to hell. Either you're an idiot or you're the next Galileo.
OT requires centralized servers to do intention resolution.
CRDTs are a set of operations that can work on any peer and resolve correctly, without extra coordination or central resolution.
The two things are about as opposite as they can get. Some CRDTs can be used to make Google Docs-style OT, but are not specific to OT use cases.
Source: Me, I'm the leading industry and open source expert on CRDTs which power the popular https://github.com/amark/gun decentralized database that the Internet Archive, D.tube (1M+ monthly visitors), notabug.io (1K/daily users) use. CRDTs have let us scale to 3K tx/sec on $99 hardware in P2P mesh networks.