Package groupthink :: Module stringtree :: Class SimpleStringTree
[hide private]
[frames] | no frames]

Class SimpleStringTree

source code

a class implementing an editable string, represented internally as a dependency tree of Changes.

Purpose

The logical structure of a SimpleStringTree is derived from common version control systems such as CVS. In a version control system, each edit is represented as a patch, also known as a diff, that can be applied to the previous version of a document to generate the next version. These patches are typically generated on a line-by-line basis, with the line to be edited described either by its absolute position, or by its position relative to some "context" (i.e. nearby lines).

When two people submit patches that modify the same line, the two patches are said to "conflict", because they cannot both be applied, at least not while retaining a semblance of their original meaning. These conflicts are reported to the user and must be resolved manually.

When editing computer source code, a conflict typically indicates that if both edits are applied, the program will crash. For human languages, however, the requirements are not so strict.

In real-time collaborative text editing, conflicts can most easily be resolved by applying both edits and then fixing the text within the editor. This procedure only works if all editors see the same text when a conflict occurs. Thus, the purpose of SimpleStringTree is to ensure that, no matter what sequence of edits is made, all users who have seen all edits are presented with the same string.

API

SimpleStringTree provides a "native" API based on insert() and delete(). To ease testing, it additionally provides a number of methods such as seek() and write() that mirror the API of python's StringIO module. These methods are in place principally to enable direct testing against python's StringIO module, which should produce identical results when these methods are used.

The name

SimpleStringTree is "Simple" in that it supports only Insertions and Deletions. Handling References, while valuable, has proven quite difficult, and so will not be addressed by this data structure.

Some code for handling Removals will be left in for the moment, since it is easy enough, even though it is not presently used.

Instance Methods [hide private]
 
__init__(self, initstring='') source code
 
__repr__(self) source code
 
getvalue(self, r=None) source code
 
next(self) source code
 
flush(self) source code
 
close(self) source code
 
read(self, size=inf) source code
 
readline(self, size=inf) source code
 
readlines(self, sizehint=None) source code
 
seek(self, offset, whence=0) source code
 
tell(self) source code
 
truncate(self, size=None) source code
 
write(self, text) source code
 
writelines(self, sequence) source code
 
insert(self, text, k=None)
Insert text into the string, without overwriting any existing text.
source code
 
delete(self, k, n)
Starting at a point k (0-indexed), delete n characters
source code
 
get_range(self, rec, point, m) source code
 
move(self, rempoint, n, inspoint)
Move a substring.
source code
 
add_change(self, c)
Apply a change to the string.
source code
 
_effective_parent(self, uid, parentid, position)
The effective parent of an insertion is defined as the (uid, loc) pair that causes it to be inserted in the right location in the string.
source code
 
_delete_listonly(self, position, length)
Given the position in _sourcelist, and the length of the deletion, perform the deletion in the lists
source code
 
_insert_listonly(self, uid, target, length)
Make a new insertion into the lists with uid and length at position in _poslist
source code
 
_add_change_treeonly(self, c) source code
 
get_insert(self, k, n=0, parent=None)
FIXME: This method could be useful, but it no longer works get_insert finds and returns the youngest insertion containing positions k to k + n in the total coordinates of parent.
source code
iterable(Change)
get_changes(self)
get_changes provides a depth-first topologically sorted list of all Changes in this Tree.
source code
bool
is_ready(self, c)
Determin whether a Change c may safely be added to the Tree.
source code
Method Details [hide private]

insert(self, text, k=None)

source code 

Insert text into the string, without overwriting any existing text.

Parameters:
  • text (str) - the text to insert
  • k (int >= 0) - the location at which to insert the text. If k is None, then the current seek position is used
Raises:
  • None - raises an error if k is greater than the length of the string.

delete(self, k, n)

source code 

Starting at a point k (0-indexed), delete n characters

Parameters:
  • k (int) - the position at which to begin deletion. The character at position k will be deleted.
  • n (int) - the number of characters to delete
Raises:
  • None - raises an error if the deletion extends beyond the end of the string.

move(self, rempoint, n, inspoint)

source code 

Move a substring. Equivalent to read(), delete(), and insert(). In StringTree, move() would ideally coherently copy a section of text, such that any subsequent edits appear in the new location, not the old. In SimpleStringTree, this is not possible, so move() just falls back to Deletion and Insertion.

Parameters:
  • rempoint (int) - the point at which to start deletion
  • n (int) - the number of characters to move
  • inspoint - the position at which to insert

add_change(self, c)

source code 

Apply a change to the string.

Parameters:
  • c (Change) - a Change to make to the string. c.parent MUST already have been added.

_effective_parent(self, uid, parentid, position)

source code 

The effective parent of an insertion is defined as the (uid, loc) pair that causes it to be inserted in the right location in the string. This is only relevant in the event of a conflict, in which case the conflict edits are required to be ordered from least uid to greatest uid. The effective parent of a conflicted edit, then, is either the original pair, or (u_next, 0), where u_next is the uid of the sibling directly to the right of the input uid. That is to say, it is the least u greater than uid that has the same position.

get_insert(self, k, n=0, parent=None)

source code 

FIXME: This method could be useful, but it no longer works get_insert finds and returns the youngest insertion containing positions k to k + n in the total coordinates of parent. If parent is unspecified, then it is taken to be the root, meaning the coordinates in question are the current document coordinates.

The return value is a tuple (rec, k_unmodified, k_modified), meaning the record containing k and k + n, the position of k in rec's unmodified coordinates, and the position of k in rec's modified coordinates.

"containing" is defined so that a record with n characters (labeled 0...n-1) can still be returned as containing k...k+n. In other words, each Insertion contains both endpoints. However, an Insertion that has been totally deleted is ignored entirely.

get_changes(self)

source code 

get_changes provides a depth-first topologically sorted list of all Changes in this Tree.

Returns: iterable(Change)
a list of changes to the tree, ordering according to a topological sort of the dependency tree. This ordering guarantees that no change appears until after all of its dependencies (i.e. ancestors).

is_ready(self, c)

source code 

Determin whether a Change c may safely be added to the Tree. Specifically, it checks whether c.parent is already known.

Returns: bool
c.parent is already present in the tree.