ddaa bbloggs: /

Repositories, branches and trees

These are words you are sure to find in the documentation of any version control system. In CVS, Subversion or GNU Arch they are very distinct objects. In git and Mercurial, they are conceptually distinct but are represented by a single object on disk.

In Bazaar, you can pick and choose, as separate or as conflated as you wish. This allows it to support configurations familiar to CVS and Subversion users, as well as the convenience of all-in-one trees.

In the simplest case, bzr init creates a standalone tree: a tree that contains a repository and a single branch, like this:

my-tree
 | .bzr
 |  | repository
 |  | branch
 |  ` checkout
 | hello.c
 ` README

You can also create a shared repository so that multiple branches use the same storage. This saves disk space and data copying. This is done with bzr init-repository, and the result looks like this:

my-repo
 ` .bzr
    ` repository

Then you can start creating branches in the repository, using commands like bzr branch my-tree my-repo/my-branch:

my-repo
 | .bzr
 |  ` repository
 ` my-branch
    | .bzr
    ` branch

Note how my-branch contains neither a repository nor a working tree [1]. Other branches created in my-repo will use the same common storage.

[1] At least in bzr-0.13. In earlier versions, init-repository by default created a repository where the branches had trees. This behaviour was provided by init-repository --no-trees. At least up to bzr-0.13. Developers are considering changing the default of init-repository to create a repository where the branches have trees.

To extract a tree from the repository, use a command like bzr checkout --lightweight my-repo/my-branch my-light. That would give you a so-called lightweight checkout:

my-light
 | .bzr
 |  | branch (reference)
 |  ` checkout
 | hello.c
 ` README

The branch in a lightweight checkout is a special branch reference that contains no actual data, just the URL of the actual branch in the repository.

Note that, if you are so inclined, you can create a lightweight checkout from a standalone tree.

If you want to keep all your trees handy, and still benefit from the disk space and performance gains of using a repository, you can create the repository using bzr init-repository --trees. Creating a branch in this repository would yield this result:

my-repo
 | .bzr
 |  ` repository
 ` my-branch
    | .bzr
    |  | branch
    |  ` checkout
    | hello.c
    ` README

Note that you can also achieve this result by running bzr checkout (no further argument) in a branch that does not already have a tree. Conversely, you can remove the tree from a branch using bzr remove-tree.

So which combinations did we see?

repo  branch  tree  name
yes   yes     yes   standalone tree
yes   no      no    shared repository
no    yes     no    repository branch
no    no      yes   lightweight checkout
no    yes     yes   repository checkout

Are other combinations meaningful?

repo  branch  tree  name
yes   yes     no    standalone branch
no    no      no    pure branch reference
yes   no      yes   ???

Standalone branches are self-contained branches without a tree. They are created by bzr push when the push destination is a remote URL (like sftp, ftp, http). The bzr push command does not update working trees at remote URLs, to improve performance and to prevent the occurrence of merge conflicts. So it may as well not create a tree in the first place.

Pure branch references are exotic, but they are actually useful. Launchpad uses them to implement a redirection mechanism.

The last combination is a repository, without a branch, with a tree. I do not think it would be really useful, but I can imagine how it would work. That would be a lightweight checkout at the root of a shared repository.
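The two tables above can be transcribed into a small lookup, which makes it easy to check which combinations have a name. This is only an illustrative sketch: the names CONFIGURATIONS and describe are made up for this example and are not part of bzr, and a branch reference is counted as "no branch", as in the tables.

```python
# Lookup table transcribed from the two tables above.
# Keys are (has_repository, has_branch, has_tree).
# A branch reference counts as "no branch", as in the tables.
CONFIGURATIONS = {
    (True, True, True): "standalone tree",
    (True, False, False): "shared repository",
    (False, True, False): "repository branch",
    (False, False, True): "lightweight checkout",
    (False, True, True): "repository checkout",
    (True, True, False): "standalone branch",
    (False, False, False): "pure branch reference",
}


def describe(has_repository, has_branch, has_tree):
    """Return the name used in this article for a combination, or '???'."""
    key = (has_repository, has_branch, has_tree)
    return CONFIGURATIONS.get(key, "???")
```

For example, describe(True, False, True) falls through to "???", the one combination that has no name above.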

10 Dec. 2006 Repositories, branches and trees (2 comments)

Community Bazaar Hosting on Launchpad

You could already register bzr branches on Launchpad, provided you had some web space to put them on. Now, you can host your bzr branches directly on Launchpad.

You can register a bzr branch you are hosting elsewhere using your web browser. This is called an external branch, and will be automatically mirrored on a daily basis.

To ask Launchpad to host a branch for you, use the bzr tool to connect to the bazaar.launchpad.net SFTP server and upload your branch. This is called a hosted branch.

I have discussed external branches before. The rest of this article discusses how to set up and use a hosted branch.

For this recipe you will need:

  • A Launchpad account. If you do not have one yet, you can register.
  • A Bazaar branch. If you do not have that yet, there are tutorials to get you started.

Branches are uploaded to Launchpad using the SFTP protocol. Authentication is done using the SSH public key system. The Ubuntu wiki has a good SSH Howto covering public key authentication and ssh-agent.

Once you have an SSH public key, register it in Launchpad: go to your account page, click on SSH Keys in the top left corner, and enter your public key in the form.

To push a branch to Launchpad, you need three pieces of information.

  1. The name of your account: this is the last part of the URL of your account page, and you can modify it in the Personal details form accessible from your account's page.
  2. The name of the Launchpad product the branch belongs to. The name of the product is the last part of the URL of the product page.
  3. The name of the branch to publish. This name must be unique for an owner and product, but the same name can be used by different users in the same product and by the same user in different products. If you have no idea, "dev" is a good name for an all-purpose development branch.

In the following examples, substitute account with your account name, product with a product name, and branch with the branch name.

For convenience, you can record your Launchpad account name in your ~/.ssh/config file, by adding the following lines:

Host bazaar.launchpad.net
    User account

Then you can push a branch to Launchpad using the following command:

bzr push --create-prefix sftp://bazaar.launchpad.net/~account/product/branch

Update: As James Henstridge reported, it is no longer necessary to use the --create-prefix option when pushing to bazaar.launchpad.net.

bzr push sftp://bazaar.launchpad.net/~account/product/branch

After the push, the branch appears on the Launchpad web site and you can use it to set a title, description, and various other attributes of the branch. The Launchpad page for the branch will be found at this URL:

https://launchpad.net/+people/account/+branches/product/branch

Update: The URL structure of the web site changed: /+people/name was replaced by /~name.

https://launchpad.net/~account/+branches/product/branch

After pushing, the branch data will be published at this URL, advertised on the branch page:

http://bazaar.launchpad.net/~account/product/branch.
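The three URLs above all derive from the same account, product and branch names. Here is a minimal sketch that builds them, using the updated URL patterns quoted in this article. The function name launchpad_urls and the dictionary keys are made up for the example, and nothing here talks to Launchpad.

```python
def launchpad_urls(account, product, branch):
    """Build the URLs described above from the three names.

    Hypothetical helper: it only formats strings following the
    patterns quoted in this article.
    """
    path = "~%s/%s/%s" % (account, product, branch)
    return {
        # Where "bzr push" uploads the branch.
        "push": "sftp://bazaar.launchpad.net/" + path,
        # Where the branch data is published after the push.
        "published": "http://bazaar.launchpad.net/" + path,
        # The Launchpad web page for the branch (updated URL structure).
        "page": "https://launchpad.net/~%s/+branches/%s/%s"
                % (account, product, branch),
    }
```

Calling launchpad_urls("account", "product", "branch") reproduces the example URLs used in this article.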

There is some latency between the time a branch is pushed to SFTP and the time when the data published by HTTP is updated. This latency is currently one day, but new code will be deployed soon bringing this down to a few minutes.

Update: The latency reduction work has been deployed.

It can be a few minutes between the time a branch is pushed to SFTP and the time when the data published by HTTP is updated. Usually, this delay is two minutes.

Now, on a more negative note, the things you cannot currently do with branches on Launchpad:

  • Push to an external branch that was registered on the Launchpad web site.
  • Convert a branch between hosted and external. Once the branch is created, it will either be mirrored from an external URL, or published from the SFTP area.
  • Use the bzr repository feature. At the moment, each hosted branch must be self-contained.

A branch can also be owned by a team, allowing multiple users to commit. More on this next time!

Update: James Henstridge talked about shared branches and Andrew Bennetts emphasized how Bazaar checkouts are useful.

Thanks to Malcolm Cleaton, James Henstridge, and Stuart Bishop for proof-reading this article and suggesting numerous improvements.

15 Jun. 2006 Community Bazaar Hosting on Launchpad (5 comments)

Trying Xubuntu 6.06

These last few days I have given a serious try to Xfce, the lightweight and fast desktop environment. I used Xfce4 as provided by Xubuntu 6.06.

It all started with being annoyed with Evolution crashes and sluggishness. I decided to give Thunderbird a serious try, and while I was trying to get the most snappiness out of my desktop, I temporarily abandoned Gnome for Xfce.

First the good things: Xfce is really snappy. It's not much in absolute numbers, but having everything respond in tenths or hundredths of seconds has a real effect on the overall feeling. For example, when browsing the Gnome menu, there is a small delay (a few tenths of a second) between the selection of a sub-menu item and the appearance of the sub-menu. In Xfce, everything feels instantaneous. I also very much liked the way the Xfce panel behaves.

I also liked Thunar a bit more than Nautilus. It was one of those things that felt really snappy and no-nonsense in small ways that are hard to describe. There were also many things that just worked, or looked like they would just work if I needed them, but that I did not actually need during the trial period, for example CD burning.

One thing that bothered me at first was losing the calendar feature of the Gnome clock applet. Clicking on the clock applet displays a small calendar with information from the Evolution calendar. Happily, I was able to get the same feature using the "orage clock applet". Of course, it uses the calendaring information of Orage, the Xfce calendar application, and not Evolution. I will try to keep on using Orage in Gnome, even if only to keep Evolution at bay.

The system monitor applets of Xfce are not as pretty as the Gnome System Monitor applet. The network monitor applet does not provide a shortcut to the network configuration application. Xfce has a weather applet, but it is not smart enough to display the temperature on the side when in a horizontal panel, so enabling temperature display always significantly increases the width of the enclosing panel. There also appears to be no way to bind a key to the activation of the Xfce panel menu, or to operate the panels using the keyboard.

Xfce tends to force the user to deal a bit more with system internals: for example, there is no UI to configure the keyboard switching applet, so one has to use the infamous xkb directly. Also, except for the window manager, there are no canned desktop actions to bind to key commands; all desktop key bindings must call into shell commands (like with xbindkeys). Finally, the inability to activate the panel menu from the keyboard forced me to use the "run command" much more often than I do in Gnome.

More troublesome were the window manager key bindings. I love to have excellent control from the keyboard, and only use the mouse when I really have to. Metacity has excellent support to resize and move windows from the keyboard. I was not really satisfied by the equivalent functionality of the Xfce window manager. It is not possible to resize a window by extending it to the left or top, only by extending to the right or bottom. Also, keyboard resizing and moving of windows does not snap to other windows and screen borders like the mouse operations do.

Serious annoyances began when I was unable to use gnome-gpg. I tried enabling the "Run gnome services at startup" option of the Xfce session manager and re-logging in, but it did not help. So I tried using gpg-agent, but gpg reported "gpg: problem with the agent - disabling agent use" and asked for the pass-phrase every time. I knew I could have dealt with the problem using quintuple-agent, but I postponed it.

But the thing that really made me switch back to Gnome was the lack of keyboard control for the sound volume. A lot of keyboards nowadays have special keys to control or mute the speaker's volume, and Ubuntu's Gnome comes with complete support for those, and even integration with my ThinkPad's BIOS-based volume keys.

Overall, I had a very good impression of Xfce. I loved the very pleasant feeling of snappiness, and the no-nonsense panel behaviour. I could deal with the more "bare metal" aspects of the environment. I plan to give Xubuntu another try some day, and hope that they have found a way to improve their keyboard accessibility, and in particular the keyboard volume control.

11 Jun. 2006 Trying Xubuntu 6.06 (4 comments)

lsprof meets KCachegrind

You can now visualize the output of the lsprof Python profiler using KCachegrind, an excellent visualization tool for profiling data.

One useful feature of the hotshot Python profiler is the existence of the hotshot2calltree conversion filter, which produces output suitable for KCachegrind. So after the s/hotshot/lsprof thread occurred on the python-devel mailing list, and Bazaar-NG added support for lsprof, I was itching to use KCachegrind with lsprof data.

In good libre software style, I eventually scratched my itch and wrote a patch to add calltree support to lsprof. The lsprof maintainers are welcome to apply it.

With lsprof and KCachegrind I was able to quickly identify a performance bug in Bazaar-NG that should be easy to fix. The data generated by hotshot completely misses that issue, at least when displayed by KCachegrind. I am not sure why.

One thing I know for sure is that hotshot2calltree has a serious bug. It causes KCachegrind to confuse multiple functions with the same name and from the same Python file. That completely messes up the call graph and greatly reduces the pertinence of the data.

Specifically, the Bazaar-NG code uses two decorators, needs_read_lock and needs_write_lock, defined in the same module, which return closures of local functions called decorated. Each decorator uses a different decorated function, but hotshot2calltree produces data that causes KCachegrind to only see one decorated function. In the example I'm looking at, it's the one in the needs_write_lock decorator.

KCachegrind seems to make C/C++ assumptions, and seems not to expect two different functions with the same name and in the same file. My lsprof patch works around the issue by including the Python module names and line numbers into the function names in the calltree output. Incidentally, that also makes for more informative names.
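The name clash is easy to reproduce in a few lines. The sketch below uses simplified stand-ins for the two decorators (they do no locking and are not the Bazaar-NG code), and the qualified_label helper is hypothetical: it only illustrates the idea of including the module name and line number in the function name, not the exact format my patch emits.

```python
def needs_read_lock(func):
    # Simplified stand-in: the real decorator would take a read lock.
    def decorated(*args, **kwargs):
        return func(*args, **kwargs)
    return decorated


def needs_write_lock(func):
    # Simplified stand-in: the real decorator would take a write lock.
    def decorated(*args, **kwargs):
        return func(*args, **kwargs)
    return decorated


@needs_read_lock
def read_something():
    return "read"


@needs_write_lock
def write_something():
    return "write"


def qualified_label(function):
    """Hypothetical label combining module, first line number and name."""
    code = function.__code__
    return "%s:%d(%s)" % (function.__module__, code.co_firstlineno,
                          code.co_name)


# Both wrappers report the same bare name...
same_name = read_something.__name__ == write_something.__name__
# ...but the labels differ, because the line numbers differ.
distinct = qualified_label(read_something) != qualified_label(write_something)
```

A tool that keys functions on file and bare name sees only one decorated function here, while the labels keep the two apart.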

27 Jan. 2006 lsprof meets KCachegrind (5 comments)

Basic Bazaar Support in Launchpad

The initial bzr support in Launchpad finally got its last missing piece bolted on.

Those who had the chance to attend UBZ may remember the presentation titled Launchpad Branch Management that Gustavo and I performed, with the assistance of an innocent user. Okay, maybe you do not remember... The one with the cows! You remember now? Right, so the functionality we demonstrated there is finally live on Launchpad.

During that presentation, we demonstrated some prototype code that allowed you to do a few things in Launchpad:

  • Browse for branches by product or people.
  • Click the "Add Bazaar Branch" link, fill out a form with the URL of a bzr branch and some descriptive information.
  • Wait "some time" until all the daemons involved get around to the data. Right now it takes about one day, but we will eventually lower the latency to a few minutes.
  • Watch as the Launchpad page for the branch now displays a small table with the date, author and summary of the ten latest revisions.

That's not all that impressive when told that way, but the presentation was great, really, people loved the cows. And our innocent user managed pretty well.

What that really shows is that Launchpad knows bzr branches, and has some critical infrastructure to deliver a bunch of really sexy features. For example, it automatically mirrors your branch on bazaar.launchpad.net. As such, Launchpad can already behave as a directory and mirror for all the decentralised development branches out there for your project. [1]

Stay tuned, I will brag about the new and improved features as they roll out.

[1]Launchpad also has some other "small" features, like a bug tracker, an on-line software translation service, and two flavours of kitchen sinks.

27 Jan. 2006 Basic Bazaar Support in Launchpad (2 comments)

What it's like to work at Canonical

Paul Graham recently wrote an essay titled What Business Can Learn from Open Source. One of the things he talks about is how traditional "professionalism" is actually quite harmful, and he advocates decentralised work. As an employee of Canonical, the company that created Ubuntu and is developing Launchpad and Bazaar, I almost felt like this essay was talking about me. Working at Canonical is quite a lot like what Paul Graham describes.

The company has little in the way of actual office space. The development is done by people scattered over four continents, most of them working at home. Developers are recruited in the libre software community based on their current and past activities; the basic hiring philosophy of the company is to get people to work on what they would do for free. Of course, a paycheck always comes with some associated tedium, and sometimes one has to work on totally boring things.

Canonical is an interesting company to work for, if only because the way work actually gets done is quite similar to the way a community project would work. The main communication tools are (in no specific order) e-mail, IRC chat, wikis, and a decentralised version control system. Some people in the company like to say that we work "pants free". After all, when your office is next door to your bedroom, there is little use for suits and ties.

Even though "on the internet, nobody knows you are a dog" I can confidently assert that all the Canonical staff at least looks and sounds somewhat human. The whole company congregates at least three times a year, for the Ubuntu developers summit, and specific development groups attend additional "sprints" as needed. Every time, the interesting question is "where?". The whole-company meetings I have attended were held in Oxford, Mataró and Sydney and I have attended sprints in London (at Mark's flat) and Brazil (at Async offices), while others had sprints in Cape Town, Montreal, and probably a few other places.

Another fun thing about working for Canonical is that it tends to elicit "Wow!" reactions from some people, but the really nice thing is the people you get to work with. That is really an elite company, with a lot of very bright hackers and quite a few with excellent communication skills. Most people there also tend to have quite interesting personalities or backgrounds. And also, we get to go to night clubs with Mark Shuttleworth at the end of Ubuntu summits, and sometimes we fly on his plane on our way to one conference or another.

13 Aug. 2005 What it's like to work at Canonical (2 comments)

A Look at Gobby

For a couple of months now, I have been looking into implementing a collaborative text editor similar to SubEthaEdit, but libre. This personal project is probably going to take another direction now that I have discovered Gobby, and found out that it does work.

Some other libre collaborative text editors

The initial research taught me about some related projects in the free software world, in particular the Gnome Office Collab initiative that was discussed on the Abiword developers mailing list. This discussion included a very valuable post linking to the excellent computer science paper [sun98achieving] that was purportedly implemented by SubEthaEdit.

But as far as libre collaborative text editors went, I did not find much.

  • The Collaborate application seemed like a good candidate, but it just did not work correctly. Though the web page looks good, it's just an aborted student project.
  • The Network Text Editor [handley97network] appears more mature, but it's an MBone application and is all about making collaborative text editing work with such an unreliable transport. That might make for good academic publications, but that seems to me like the kind of contortions you would only perform to fit within an imposed research program.
  • At the time, I also noticed this intriguing Debian package, called libobby-1.0. However, it did not seem to be actually used by anything and it had the annoying property of enforcing a centralised model. It turns out that this is the core library used by Gobby.
  • I have also recently been pointed at ACE, a Java application.

Centralised or decentralised?

What do I mean when I say "centralised model"? In short, it means that all editing actions go through a central, fixed server instead of being transmitted among peers.

Centralisation in a real-time collaborative environment is problematic: the server network bandwidth can become a bottleneck, latency is increased as any two users have to communicate through the server, and the editing session cannot outlive the server. These problems are aggravated by the fact that the server often runs on the workstation of one of the participants, who may want to leave the editing session, and whose network connectivity may be less than carrier-grade.

When several people are collaborating on the same data over the internet, there are technical challenges in ensuring that all users see the same document, and that the changes appear to each user in a natural order. What makes the problem difficult is that, if you want to provide a smooth editing experience, you must accept that different users may have slightly diverging document states.

When you type a word in a collaborative text editor, you want the word to appear immediately on your screen and then be sent to other participants in the editing session. Then another user can type another word before receiving what you typed, and that should also be displayed instantly on the other user's screen. In that scenario, your document state and the other user's diverge for a short while. Among other things, the collaborative text editor must ensure that everybody ends up with the same document state after a short while. That is the convergence problem.

Other technical challenges of concurrent document editing include causality preservation, and intention preservation. Causality preservation means that if you type "Hello", and a second user types "World" after seeing "Hello", then a third user must always see "Hello" appear before "World". Intention preservation simply means that edits always have the desired effect, which is not as simple as it sounds because document states may diverge.
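The divergence scenario is easy to show with two concurrent insertions. The sketch below is a deliberately simplified, insert-only position transformation with a tie-break on a participant number. It is not the algorithm from the sun98achieving paper, it handles neither deletions nor causality, and the names Insert, apply_insert, transform and the site field are all made up for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # character offset where the text is inserted
    text: str
    site: int   # hypothetical participant number, used only to break ties


def apply_insert(doc, op):
    """Return the document with op applied."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op, other):
    """Adjust op so it can be applied after the concurrent insert other."""
    before = other.pos < op.pos
    tie = other.pos == op.pos and other.site < op.site
    if before or tie:
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


base = "Hello"
a = Insert(5, " World", site=1)   # typed by the first user
b = Insert(0, "Oh, ", site=2)     # typed concurrently by the second user

# Applying the raw operations in arrival order: the two sites diverge.
naive_site1 = apply_insert(apply_insert(base, a), b)
naive_site2 = apply_insert(apply_insert(base, b), a)

# Transforming the remote operation against the local one: they converge.
site1 = apply_insert(apply_insert(base, a), transform(b, a))
site2 = apply_insert(apply_insert(base, b), transform(a, b))
```

With the raw operations, the second site ends up with " World" in the middle of "Hello". With the transformation, both sites end with "Oh, Hello World".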

Convergence and causality preservation seem difficult to achieve without arbitration. The usual solution to arbitration problems is to centralise the system: a single server process on a single computer decides in which order things really happened and breaks ties. Special relativity explains that two causally independent events may appear to have occurred in different orders for different observers, so it often looks like there is no way around centralisation when arbitration is needed. But the sun98achieving paper shows that centralisation is not required for collaborative text editing, and explains how to do without.

The need for a decentralised messaging abstraction

Centralisation, even if it is problematic, avoids some other difficult problems. The sun98achieving paper does a good job of explaining how to handle editing operations, but it says nothing about session negotiation. In particular, if you want to prevent two users from using the same nickname, or the same text colour, or you want documents in the session to have unique names, you have a difficult distributed resource management problem.

The Spread Toolkit solves a related problem, providing a messaging abstraction for decentralised networks. Unfortunately, it is very oriented towards clustering applications and requires that network participants be declared in a configuration file. That makes it completely unsuitable for use in a user friendly collaborative text editor. Also, its license has an attribution clause that makes it GPL-incompatible. However the Spread Message Service Types look like they would be a good foundation for a generic distributed messaging system.

Another problem with decentralised collaborative text editing over the internet is that, to really reap the benefits of decentralisation in robustness and performance, you need to build a mesh of network connections. That touches another difficult research problem (pointers gladly accepted): the construction of a robust mesh without going for a fully-connected network. In addition to this theoretical problem, special care would be needed to transparently and reliably establish connections to other session participants in the presence of NAT.

My personal experiments were based on a decentralised model, and I got very tangled into the distributed resource management problem. I think that before we can have a usefully decentralised collaborative text editor, we will need a good decentralised messaging abstraction that provides a foundation for resource management and a transparent fault-resilient mesh network of peers.
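To give a rough idea of why a fully-connected network is unattractive, here is a back-of-the-envelope count of connections. It assumes the central server runs on the workstation of one of the participants, as described earlier, and the function names are made up for the example.

```python
def star_links(participants):
    """Connections when every other participant connects to one server
    that runs on the workstation of one of the participants."""
    return max(participants - 1, 0)


def full_mesh_links(participants):
    """Connections when every participant connects to every other one."""
    return participants * (participants - 1) // 2
```

With ten participants, that is 9 connections for the centralised session against 45 for the full mesh, and the gap grows quadratically. A robust mesh would have to sit somewhere in between.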

Gobby, it works!

A few days ago, a colleague pointed me at Gobby, the collaborative text editor from the 0x539.dev group. I downloaded and compiled the application, played a bit with it, read the (still quite short) mailing list archive, and chatted a lot on the #0x539 IRC channel. And I have to say I was quite positively impressed!

In short, Gobby is a multi-platform (Linux, Mac, Windows) collaborative text editor written in C++ and using the GTK toolkit. Despite the low version number (0.2.0 was recently released) it's already quite usable though it still has a few annoying limitations:

  • When a user loads a document, the whole document appears with the background colour of that user, making it impossible to see the text that user has typed during the session.
  • The Undo mechanism does not distinguish between local and remote edits. There is no way (yet) to undo your actions without first undoing more recent actions from other users.
  • The networking implementation is still less than ideal, in particular, the application freezes while trying to establish a connection.
  • The carets and selections of other users are not visible.

On the upside, most of these limitations are likely to be fixed in the near future, except multiple-caret and multiple-selection display that would require a non-trivial amount of GTK programming.

Most of the features you would expect are already there:

  • Lock-less collaborative text editing that actually works, providing all three of convergence, causality preservation and intention preservation.
  • Each user is associated to a background colour that marks the text the user has typed during the session.
  • A built-in chat service.
  • Syntax highlighting provided by the GtkSourceView widget.
  • Zeroconf networking using Howl.
  • A sane build system, based on the GNU Autotools suite.
  • Localisation support using gettext.

My main issues with this project are that it is written in C++, which is not a language I would be programming in for fun, and that it enforces a centralised model. However, centralisation appears to be a reasonable choice when you consider the additional complexity involved in useful decentralisation.

08 Aug. 2005 A Look at Gobby (4 comments)