Class SimpleStringTree
source code
a class implementing an editable string, represented internally as a
dependency tree of Changes.
Purpose
The logical structure of a SimpleStringTree is derived from common
version control systems such as CVS. In a version control system, each
edit is represented as a patch, also known as a diff, that can be
applied to the previous version of a document to generate the next
version. These patches are typically generated on a line-by-line
basis, with the line to be edited described either by its absolute
position, or by its position relative to some "context" (i.e.
nearby lines).
When two people submit patches that modify the same line, the two
patches are said to "conflict", because they cannot both be
applied, at least not while retaining a semblance of their original
meaning. These conflicts are reported to the user and must be resolved
manually.
When editing computer source code, a conflict typically indicates
that if both edits are applied, the program will crash. For human
languages, however, the requirements are not so strict.
In real-time collaborative text editing, conflicts can most easily
be resolved by applying both edits and then fixing the text within the
editor. This procedure only works if all editors see the same text
when a conflict occurs. Thus, the purpose of SimpleStringTree is to
ensure that, no matter what sequence of edits is made, all users who
have seen all edits are presented with the same string.
API
SimpleStringTree provides a "native" API based on insert()
and delete(). To ease testing, it additionally provides a number of
methods such as seek() and write() that mirror the API of python's
StringIO module. These methods are in place principally to enable
direct testing against python's StringIO module, which should produce
identical results when these methods are used.
The name
SimpleStringTree is "Simple" in that it supports only
Insertions and Deletions. Handling References, while valuable, has
proven quite difficult, and so will not be addressed by this data
structure.
Some code for handling Removals will be left in for the moment,
since it is easy enough, even though it is not presently used.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
insert(self,
text,
k=None)
Insert text into the string, without overwriting any existing text. |
source code
|
|
|
|
delete(self,
k,
n)
Starting at a point k (0-indexed), delete n characters |
source code
|
|
|
|
|
|
|
|
|
|
|
|
|
_effective_parent(self,
uid,
parentid,
position)
The effective parent of an insertion is defined as the (uid, loc)
pair that causes it to be inserted in the right location in the
string. |
source code
|
|
|
|
_delete_listonly(self,
position,
length)
Given the position in _sourcelist, and the length of the deletion,
perform the deletion in the lists |
source code
|
|
|
|
_insert_listonly(self,
uid,
target,
length)
Make a new insertion into the lists with uid and length at position
in _poslist |
source code
|
|
|
|
|
|
|
get_insert(self,
k,
n=0,
parent=None)
FIXME: This method could be useful, but it no longer works get_insert
finds and returns the youngest insertion containing positions k to k
+ n in the total coordinates of parent. |
source code
|
|
|
iterable(Change)
|
get_changes(self)
get_changes provides a depth-first topologically sorted list of all
Changes in this Tree. |
source code
|
|
|
bool
|
|
|
Insert text into the string, without overwriting any existing
text.
- Parameters:
text (str) - the text to insert
k (int >= 0) - the location at which to insert the text. If k is None, then the
current seek position is used
- Raises:
None - raises an error if k is greater than the length of the string.
|
|
Starting at a point k (0-indexed), delete n characters
- Parameters:
k (int) - the position at which to begin deletion. The character at
position k will be deleted.
n (int) - the number of characters to delete
- Raises:
None - raises an error if the deletion extends beyond the end of the
string.
|
|
Move a substring. Equivalent to read(), delete(), and insert(). In
StringTree, move() would ideally coherently copy a section of text, such
that any subsequent edits appear in the new location, not the old. In
SimpleStringTree, this is not possible, so move() just falls back to
Deletion and Insertion.
- Parameters:
rempoint (int) - the point at which to start deletion
n (int) - the number of characters to move
inspoint - the position at which to insert
|
|
Apply a change to the string.
- Parameters:
c (Change) - a Change to make to the string. c.parent MUST
already have been added.
|
_effective_parent(self,
uid,
parentid,
position)
| source code
|
The effective parent of an insertion is defined as the (uid, loc) pair
that causes it to be inserted in the right location in the string. This
is only relevant in the event of a conflict, in which case the conflict
edits are required to be ordered from least uid to greatest uid. The
effective parent of a conflicted edit, then, is either the original pair,
or (u_next, 0), where u_next is the uid of the sibling directly to the
right of the input uid. That is to say, it is the least u greater than
uid that has the same position.
|
|
FIXME: This method could be useful, but it no longer works get_insert
finds and returns the youngest insertion containing positions k to k + n
in the total coordinates of parent. If parent is unspecified, then it is
taken to be the root, meaning the coordinates in question are the current
document coordinates.
The return value is a tuple (rec, k_unmodified, k_modified), meaning
the record containing k and k + n, the position of k in rec's unmodified
coordinates, and the position of k in rec's modified coordinates.
"containing" is defined so that a record with n characters
(labeled 0...n-1) can still be returned as containing k...k+n. In other
words, each Insertion contains both endpoints. However, an Insertion
that has been totally deleted is ignored entirely.
|
|
get_changes provides a depth-first topologically sorted list of all
Changes in this Tree.
- Returns: iterable(Change)
- a list of changes to the tree, ordering according to a
topological sort of the dependency tree. This ordering
guarantees that no change appears until after all of its
dependencies (i.e. ancestors).
|
|
Determin whether a Change c may safely be added to the Tree.
Specifically, it checks whether c.parent is already known.
- Returns: bool
- c.parent is already present in the tree.
|