Difference between revisions of "Coders/Collaborative real-time editor"

From FreekiWiki

Latest revision as of 12:42, 17 May 2013

deletion

This page has been requested to be deleted.
If you disagree, discuss on the talk page.
Whenever possible, could an Admin please remove this page?


Notes about collaborative real-time editors, a project I'd like to contribute to.

This could be on the topic of open standards and collaborative real-time editors.

Lots of information about collaborative real-time editors in Wikipedia.

Integration of MediaWiki and desktop environments is desirable because it lets users edit pages with powerful desktop tools instead of limited HTML text fields. Users can also use familiar command line tools like grep, sed and wc and avoid learning all the features of MediaWiki's interface. Using desktop tools is particularly advantageous to users without modern web browsers, since they can still participate using low-fi text editors. Using desktop editors also avoids losing changes if the web browser crashes or the user accidentally navigates away from the page. Desktop tools are often more reliable and have better data recovery features than web browsers.

A widely supported open standard for distributed authoring and versioning on the web is WebDAV. It's a set of extensions to HTTP/1.1 using XML. It defines new request methods, message headers and XML message bodies. Essentially it adds locking and metadata to HTTP. It maps well to file system operations and most modern operating systems support mounting WebDAV resources. A WebDAV interface to MediaWiki is particularly appropriate because MediaWiki supports page revisions and the WebDAV protocol defines an interface for versioned resources. It would be interesting to see how desktop applications interact with MediaWiki page revisions.
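As a rough illustration of the "new request methods, message headers and XML message bodies", the sketch below builds the pieces of a WebDAV LOCK request in Python. The `lock_request` helper, the page URL and the owner name are assumptions made up for this example; no existing MediaWiki WebDAV interface is implied. Nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"  # the XML namespace WebDAV uses for its own elements
ET.register_namespace("D", DAV)

def lock_request(page_url: str, owner: str, seconds: int = 600):
    """Build (method, url, headers, body) for an exclusive write lock.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    info = ET.Element(f"{{{DAV}}}lockinfo")
    scope = ET.SubElement(info, f"{{{DAV}}}lockscope")
    ET.SubElement(scope, f"{{{DAV}}}exclusive")
    kind = ET.SubElement(info, f"{{{DAV}}}locktype")
    ET.SubElement(kind, f"{{{DAV}}}write")
    ET.SubElement(info, f"{{{DAV}}}owner").text = owner
    headers = {
        "Content-Type": "application/xml; charset=utf-8",
        "Depth": "0",                    # lock this resource only
        "Timeout": f"Second-{seconds}",  # requested lock lifetime
    }
    body = ET.tostring(info, encoding="unicode")
    return "LOCK", page_url, headers, body

# Assumed URL layout for a wiki page exposed over WebDAV.
method, url, headers, body = lock_request("http://wiki.example/dav/Main_Page", "alice")
```

The same tuple could be handed to any HTTP client that accepts arbitrary method names, after encoding the body as UTF-8.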

An increasingly popular feature of editors is real-time collaborative editing. It allows multiple users to edit the same document at the same time and observe each other's changes in real time. The principal challenge of real-time collaborative editing is keeping the text consistent across all the editors when updates can arrive at each editor in a different order because of network delays. This isn't an issue if all updates must pass through a central server before being applied, since the server can impose an absolute ordering. However, the round trip imposes a delay which is generally intolerable to users, who expect the changes they make to be applied immediately.
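To make the ordering problem concrete, here is a minimal Python sketch. The helpers `insert` and `delete` and the sample text are assumptions for illustration only; they are not taken from any of the editors discussed here.

```python
def insert(text: str, pos: int, s: str) -> str:
    """Return text with s inserted before index pos."""
    return text[:pos] + s + text[pos:]

def delete(text: str, pos: int, n: int = 1) -> str:
    """Return text with n characters removed starting at index pos."""
    return text[:pos] + text[pos + n:]

base = "abc"

# Two users edit concurrently; each operation is expressed against "abc".
def op_a(t: str) -> str:
    return insert(t, 0, "X")   # user A inserts "X" at the start

def op_b(t: str) -> str:
    return delete(t, 2)        # user B deletes the "c"

# Editor A applies its own operation immediately, then B's as received.
at_a = op_b(op_a(base))   # "Xabc" -> "Xac": the wrong character is deleted
# Editor B applies its own operation immediately, then A's as received.
at_b = op_a(op_b(base))   # "ab" -> "Xab"
```

Both editors applied the same two operations, yet they end with different texts because the positions in each operation referred to a text that had already changed on the other side.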

Many papers and algorithms on the subject of maintaining consistency have been written since 1989, well summarized by Mark Bigler, Simon Räss and Lukas Zbinden for the ACE project. They conclude that the most suitable algorithm for implementation is the Jupiter algorithm. It runs on each of two editors editing the same text in real time and uses operation transformation to transform updates, possibly received out of order, so that the resulting text is consistent on both editors. It uses a two-dimensional history of operations to transform updates. Each editor may follow a different path through this history of operations, corresponding to different operation orders, but will arrive at the same consistent point. Jupiter is extended to support more than two collaborating editors using a tree topology. Each internal node acts as a bus, forwarding updates to its children and its parent. There exist at least two open source implementations of Jupiter: ACE and Gobby.
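The core of operation transformation can be sketched in a few lines of Python. This is only a sketch under stated assumptions: operations are restricted to single-character inserts and deletes, the names `Op`, `apply` and `xform` are invented for this example, and ties between two inserts at the same position are broken by a flag (for instance, one side of the pair always wins). It shows the transformation step only, not Jupiter's two-dimensional history or the actual ACE or Gobby code.

```python
from typing import NamedTuple

class Op(NamedTuple):
    kind: str      # "ins", "del" or "noop"
    pos: int = 0
    ch: str = ""   # the inserted character, unused for "del" and "noop"

def apply(text: str, op: Op) -> str:
    """Apply a single-character operation to text."""
    if op.kind == "ins":
        return text[:op.pos] + op.ch + text[op.pos:]
    if op.kind == "del":
        return text[:op.pos] + text[op.pos + 1:]
    return text

def xform(op: Op, other: Op, op_wins_ties: bool) -> Op:
    """Rewrite op so it can be applied after other.

    Both operations must have been created against the same text.
    """
    if op.kind == "noop" or other.kind == "noop":
        return op
    if other.kind == "ins":
        tie = op.kind == "ins" and op.pos == other.pos and op_wins_ties
        if op.pos < other.pos or tie:
            return op
        return op._replace(pos=op.pos + 1)   # shifted right by the insert
    # other is a delete
    if op.pos < other.pos or (op.kind == "ins" and op.pos == other.pos):
        return op
    if op.kind == "del" and op.pos == other.pos:
        return Op("noop")                    # both sides deleted the same character
    return op._replace(pos=op.pos - 1)       # shifted left by the delete

text = "abc"
a = Op("ins", 0, "X")   # editor A inserts "X" at the start
b = Op("del", 2)        # editor B concurrently deletes the "c"

at_a = apply(apply(text, a), xform(b, a, op_wins_ties=False))
at_b = apply(apply(text, b), xform(a, b, op_wins_ties=True))
# Both editors now hold "Xab".
```

Each editor applies its own operation at once and transforms the remote one before applying it, so neither has to wait for a server round trip.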

Because MediaWiki is intended to facilitate collaborative editing of pages, collaborative real-time editing would be a particularly useful feature for MediaWiki to support. Anecdotally, I participate in an open source project which maintains design documents in MediaWiki. We often discuss these documents in meetings on IRC. During these meetings, however, we need to nominate one person to edit the MediaWiki page with each of our contributions, since if we all edited it at the same time, all contributions would be replaced by those of the last person to save the page. Collaborative real-time editing support would enable us all to edit the page at the same time and see each other's changes in real time.

The open source project SynchroEdit adds collaborative real-time editing support to MediaWiki. It's a web app implemented in JavaScript which works on modern browsers across multiple platforms. It uses AJAX techniques to communicate updates to and from the server using HTTP. Because HTTP is one-way (requests are always initiated by the client), SynchroEdit regularly polls the server to be notified of update events from the server. It doesn't use Jupiter to maintain consistency; instead it locks regions while users are editing them. I'm not sure this is guaranteed to maintain consistency: I don't know what happens when two users start editing the same region at the same time and the messages to lock the region are not received in the same order.

There are currently no collaborative real-time editors which interoperate with other collaborative real-time editors. To participate in a collaborative real-time editing session, all users must be running the same software. Consequently SynchroEdit doesn't integrate with desktop collaborative editors and it's not possible to edit a MediaWiki page in real-time using tools like Emacs. For collaborative real-time editors to interoperate, they would need to use the same protocol for communicating updates and the same algorithm for maintaining consistency.

Jupiter has been systematically proven to maintain consistency if the only operations are the string operations insert and delete. I don't know how difficult it would be to extend Jupiter to support additional operations. Jupiter is particularly well suited to MediaWiki and editors like Emacs because MediaWiki pages are all represented as plain text and all Emacs operations can be modeled as inserts and deletes. Supporting editors with visual formatting might be difficult because one would either need to extend Jupiter to support visual formatting operations or model visual formatting operations as inserts and deletes (for example inserts and deletes of markup). The result of operation transformation on inserts and deletes of markup, however, might not be valid markup. An open standard could either specify the use of Jupiter by all participating editors, or specify a system for editors to advertise which consistency algorithms they support, much as HTTP clients advertise the MIME types and encodings they accept.
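The second option, advertising supported algorithms, might look something like the following Python sketch. The `negotiate` function and the identifiers `"jupiter"` and `"region-locking"` are hypothetical; no existing standard defines them.

```python
from typing import Iterable, Optional, Sequence

def negotiate(host_prefs: Sequence[str],
              peers: Iterable[Iterable[str]]) -> Optional[str]:
    """Return the host's most preferred algorithm that every peer supports.

    host_prefs is ordered from most to least preferred. Returns None when
    the editors have no algorithm in common and so cannot share a session.
    """
    common = set(host_prefs)
    for supported in peers:
        common &= set(supported)
    for name in host_prefs:
        if name in common:
            return name
    return None

# A session host that prefers Jupiter, joined by two other editors.
chosen = negotiate(
    ["jupiter", "region-locking"],
    [["region-locking", "jupiter"], ["jupiter"]],
)
```

Here the host's preference order decides, which is one possible design; a real specification would also have to say what happens when a later participant supports none of the algorithms already in use.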

Various protocols for communicating insert and delete operations have been used. [http://gobby.0x539.de/trac/ Gobby] uses straight sockets, [http://codingmonkeys.de/subethaedit/ SubEthaEdit] and [http://ace.iserver.ch/ ACE] use the IETF [http://beepcore.org/ BEEP] protocol, [http://synchroedit.com/ SynchroEdit] uses HTTP, and projects exist to use [http://docsynch.sourceforge.net/ IRC] and [http://xmppcollaborate.wordpress.com/ XMPP]. The advantage of using HTTP is that a collaborative real-time editor could be implemented as a CGI script in a variety of scripting languages and run on almost any web server. It wouldn't need its own daemon process or its own TCP port to listen on. It could be added to any existing MediaWiki installation without additional system requirements. It could also be used by collaborative real-time editor web apps, because modern browsers support JavaScript which can make HTTP requests, a technique called AJAX. For security reasons, JavaScript generally can't use other protocols or open straight socket connections. Finally, using HTTP it might be possible to harmonize collaborative real-time editing with WebDAV, which also uses HTTP. WebDAV's purpose is to enable distributed authoring and versioning of HTTP resources. Collaborative real-time editing is one mode of distributed authoring and versioning, so it might be possible to create a real-time WebDAV extension.

One reason it would be difficult to harmonize WebDAV and real-time collaborative editing is that WebDAV models HTTP resources as arbitrary streams of bytes, without additional semantics (except the MIME type). Consistency algorithms like Jupiter, however, rely on semantics such as the resource being plain text with well-defined character boundaries in order to perform operation transformation. Consequently real-time editing may not fit well with WebDAV's model.

Another challenge of implementing collaborative real-time editing using HTTP is notifying clients of update events that occur on the server. HTTP/1.1 is inherently one-way: requests can only be initiated by the client. Collaborative real-time editing requires bidirectional communication, so that the server can send updates to the client as well as the client to the server. Achieving bidirectional communication using HTTP requires either the client polling the server for updates, or the client opening long-lived connections to the server, [http://tech.groups.yahoo.com/group/ydn-javascript/msearch?query=connection+manager+and+chunked+transfer&submit=Search&charset=ISO-8859-1 allowing the server] to respond to the client as soon as server events occur, a technique popularly called [[Wikipedia:Comet (programming)|Comet]]. One advantage of Comet is that events are propagated to the client as soon as they occur, rather than on the next poll interval. Another is that it avoids the traffic incurred by polls which return no events. The main disadvantage is that web servers may only support a few concurrent requests. Applications using long-lived requests may not scale, because the number of clients is limited by the server's support for concurrent requests, and idle long-lived requests can prevent the server from accepting new short-lived requests for other content. In addition, some HTTP proxies don't start forwarding a response until the entire response has been received.
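The difference between polling and a long-lived request can be sketched on the server side in Python. The `UpdateLog` class and its method names are assumptions for illustration; a real deployment would call them from an HTTP request handler, which is left out here so that the example stays self-contained.

```python
import threading
from typing import List

class UpdateLog:
    """In-memory list of update events shared by all request handlers."""

    def __init__(self) -> None:
        self._events: List[str] = []
        self._cond = threading.Condition()

    def publish(self, event: str) -> None:
        """Record an event and wake every waiting long-lived request."""
        with self._cond:
            self._events.append(event)
            self._cond.notify_all()

    def poll(self, since: int) -> List[str]:
        """Plain polling: return at once, often with an empty list."""
        with self._cond:
            return self._events[since:]

    def long_poll(self, since: int, timeout: float) -> List[str]:
        """Comet style: hold the request until an event arrives or timeout expires."""
        with self._cond:
            self._cond.wait_for(lambda: len(self._events) > since, timeout=timeout)
            return self._events[since:]

log = UpdateLog()
empty = log.poll(0)   # nothing has happened yet, so this poll was wasted traffic

# Another client publishes an update shortly after the long poll starts.
timer = threading.Timer(0.05, log.publish, args=["ins 0 X"])
timer.start()
events = log.long_poll(0, timeout=2.0)   # returns as soon as the event is published
timer.join()
```

Each blocked `long_poll` call occupies one thread, or one server request slot, for as long as it waits, which is the scaling limit described above.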

Libraries for Comet.

XMPP XEP-0124 (BOSH, Bidirectional-streams Over Synchronous HTTP).

XEP-0124 implementations.

GNOME Telepathy.

Loudmouth library and link-local XMPP.

Loudmouth library and XEP-0124.


On the other hand, HTTP is not designed for real-time events from the server. You need to use some back channel, a technique called Comet. There's the draft RFC from 1999, the COMETd project and the WHATWG spec. The draft RFC is WebDAV-esque, but not very clean. The COMETd project is very gritty. If the WHATWG spec is ever implemented, user agents will have far more power, apparently including socket programming in JavaScript?

XMPP is a nice modern XML spec which is designed for messaging, so it wouldn't have HTTP's problems. Presumably there are already projects which do XMPP using AJAX, but I don't think you could implement the server side with a web script. A good project might be adding an XMPP backend to ACE.

TODO

  • File comet feature request against YUI.
  • Research XMPP AJAX tools.