Wednesday, 30 May 2007

Improved vCard parser

I'd like to focus on the vCard parser and export in libebook. Some things I've noticed:
- v3.0 import is quite well supported, with the major exception of charsets. #240756 tracks all the v3.0 bugs
- v2.1 import was well supported, but is gradually getting worse since there is a common codebase for 2.1 and 3.0, and most people care about 3.0
- v3.0 export is good, with minor exceptions (CRLF at end of card for example)
- v2.1 export is non-existent
- performance is important, as vCards are used in the file-backend. From a performance perspective, that's horrible, but it has its advantages too. For large vCards, for example one with a medium-sized photo, the poor performance easily brought the system to its knees. With #433782 it got much better, but there's still a lot of potential for improvement

I've created a patch (against svn trunk) that improves the performance of the parsing itself (only v3.0) and adds some other fixes, like CRLF at the end of the card. The patch is meant to be non-intrusive: it does not break public APIs, and mainly adds new internal methods that only kick in for vCards with VERSION:3.0 in the second line. Other vCards are parsed as before.
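The dispatch rule can be illustrated with a few lines of Python. This is only a sketch of the idea, not the code in the patch: the function name pick_parser and the labels "fast-3.0" and "legacy" are made up for illustration.

```python
def pick_parser(vcard_text):
    """Choose a parser by looking only at the second line of the card.

    Returns "fast-3.0" when the second line is exactly VERSION:3.0,
    and "legacy" for everything else (v2.1, or a 3.0 card that puts
    VERSION somewhere else). Both labels are hypothetical.
    """
    lines = vcard_text.splitlines()
    if len(lines) >= 2 and lines[1].strip() == "VERSION:3.0":
        return "fast-3.0"
    return "legacy"
```

A card with VERSION:3.0 on the third line would still go to the old parser here, which matches the "second line" rule described above.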

After doing the patch, I created a test suite to test both my own patch and the current implementation. I used a different approach than Ross in eds-dbus. Instead of creating classic hand-coded unit tests, I compare a parsed file with a file that has the expected format. That way, new tests can be added with much less effort, without writing any code. The downside is that not all aspects of parsing can be tested. For example, if a comma-separated list were read as one chunk, the suite probably wouldn't detect that it should have been split at the commas. Anyway, I think the test suite makes sense and can be supplemented by classic unit tests. Try it out and add more tests!
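The compare-against-expected-file idea looks roughly like this in Python. It is a sketch under assumptions, not the actual test-vcard-suite: normalise() is a toy stand-in for "parse, then export" (it only unfolds continuation lines and writes CRLF line endings), and the function names are hypothetical.

```python
import difflib
from pathlib import Path


def normalise(vcard_text):
    """Toy stand-in for parse + export: unfold lines, end each with CRLF."""
    out = []
    for line in vcard_text.splitlines():
        if line[:1] in (" ", "\t") and out:
            out[-1] += line[1:]  # continuation line: glue onto the previous one
        elif line:
            out.append(line)
    return "".join(l + "\r\n" for l in out)


def check_case(input_path, expected_path):
    """Return diff lines between expected and actual; empty list means pass."""
    actual = normalise(Path(input_path).read_bytes().decode("utf-8"))
    expected = Path(expected_path).read_bytes().decode("utf-8")
    if actual == expected:
        return []
    return list(difflib.unified_diff(
        expected.splitlines(keepends=True),
        actual.splitlines(keepends=True),
        "expected", "actual"))
```

Adding a test case then means dropping two files into a directory. It also shows the blind spot mentioned above: since only the exported text is compared, a value that was never split at its commas exports identically and passes.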

To run the v3.0 tests, for example, pass the .vcf files in vcard/valid-3.0 as parameters to src/test-vcard-suite. -e outputs detailed error messages and -r 100 repeats the parsing 100 times for benchmarking. A typical command I use is:
LD_LIBRARY_PATH=/opt/evolution-data-server/lib src/test-vcard-suite -e vcard/**/*.vcf

A major weakness in the vCard 3.0 specification is that vCards stored in files cannot be tagged with a charset. The only ideal solution, as I see it, is to ask the user which charset to use, and maybe also display a preview. For vCards in emails (anything MIME), the charset can already be specified. What the parser needs is support for converting from the specified charset. I added an extended patch that does this too. It breaks the API and will need extensions to the UI to be of any practical value.
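To make the charset problem concrete, here is a small Python sketch of the "ask the user, show a preview" idea. It has nothing to do with the actual patch or the libebook API; the function name preview is made up, and the candidate charsets are whatever the caller passes in.

```python
def preview(raw, charsets):
    """Decode the same raw vCard bytes under each candidate charset.

    Returns a dict mapping charset name to the decoded text, or to None
    when the bytes are not valid in that charset (or the charset name is
    unknown). A UI could show the non-None entries and let the user pick.
    """
    result = {}
    for name in charsets:
        try:
            result[name] = raw.decode(name)
        except (UnicodeDecodeError, LookupError):
            result[name] = None
    return result
```

For a card saved as ISO-8859-1 containing "Bjørn", the UTF-8 attempt fails while the ISO-8859-1 attempt gives readable text. Note that many single-byte charsets will decode any input without error, so a preview helps the user choose but cannot choose for them.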

For now, focus on the patch without charset support. Ross' patch at #312581 is related, and efforts should ideally be joined. What we need (Ross and Srini) is a strategy for where to end up. I see three possible roads:
- An optimised v3.0 parser with fallback to a "quirk-mode" parser for v2.1 and buggy v3.0 (my patch goes down this road)
- Two separate parsers, one for v2.1 and one for v3.0
- One parser to rule them all, which has to be very, very clever to maintain high performance and at the same time support quoted-printable (basically what we have now, minus some performance)

Independent of the choice of strategy, there are a couple of obvious spots to improve.
* Export is quite streamlined, but the method doing escaping can be improved
* Once e-d-s requires glib 2.12 anyway, maybe glib's base64 can be used for improved performance(?), reduced code complexity and better maintainability?
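For reference, the escaping rule for v3.0 text values (RFC 2426) is small: backslash, comma and semicolon get a backslash in front, and a newline becomes the two characters \n. A single-pass Python sketch of that rule follows. It is not libebook's escaping method, the name escape_value is made up, and dropping bare carriage returns is my own assumption so that a CRLF pair becomes one escaped newline.

```python
def escape_value(text):
    """Escape a text value for vCard 3.0 export in one pass over the input."""
    out = []
    for ch in text:
        if ch in "\\;,":
            out.append("\\" + ch)   # backslash-escape \ ; and ,
        elif ch == "\n":
            out.append("\\n")       # newline becomes backslash + 'n'
        elif ch == "\r":
            continue                # assumption: drop CR so CRLF gives one \n
        else:
            out.append(ch)
    return "".join(out)
```

Collecting the pieces in a list and joining once avoids repeated string reallocation; in C the same effect would come from writing into one growable buffer.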

That was a lot, but the gist is "everyone, give some attention to the vCard parser - improve it, test it or add test cases for whatever doesn't work for you"!

Nokia Customer Care (but do they really care?)

I got my shiny, new Nokia N800 some three months ago. I purchased it on the Nokia Shop web pages, and within 48h it was delivered from Belgium. Sweet.

Less than two months later, the charger started to malfunction. It seemed like the wire broke, right next to the little box you plug into the wall. No surprise really, since the whole thing seems very delicate and fragile. But I had been so careful, mostly keeping the charger at home and never winding the wire around the charger itself.

Customer care to the rescue. After a phone call to "Nokia Care", I was redirected to "Customer Care". To make a long story short, I called Customer Care 8 times over the next month. I also had to explain that this was not a mobile phone, and point a highly doubting customer service representative to the nokia.no page to prove that Nokia actually sells N800s. I'll also mention that I was directed to my reseller (which was shop.nokia.com) and to a local service center, which sent me back. All in all very frustrating, but after a month, I actually got a new charger without having to return the broken one.

Be careful with your chargers!

Now I'm only waiting for more reviews of the Navigation Kit..

Thursday, 5 April 2007

Evolution address book TODO

Some suggested improvements to the contacts component in Evolution (somewhat ordered by priority):
  • Synchronise with devices and other clients
  • Better looking display panes (integrated search and proper context menus)
  • Better search UI
  • Working LDAP support
  • Working GAL support (and maybe groupwise)
  • Do something about the slow vCard parsing
  • Replace CORBA with DBUS

Monday, 2 April 2007

libebook scalability

Ever had a monster address book in Evolution? During some performance testing of libebook this weekend, I found that it behaves very badly with large address books. A bit of digging revealed the cause: the way it uses GList. When fetching multiple contacts from the standard file backend, g_list_append is used instead of g_list_prepend, which scales way better. g_list_append has to walk the whole list to find the end on every call, so building a list of n contacts costs on the order of n² steps. A tiny patch fixes the problem.
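The effect is easy to reproduce outside GLib. Below is a Python sketch with a toy singly linked list (GList is doubly linked, but like this toy it keeps no tail pointer, so append has to walk to the end). All names are made up for illustration; the step counter shows the cost directly instead of relying on timings.

```python
class Node:
    __slots__ = ("value", "next")

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def build_by_append(values):
    """Append each value at the tail; return (head, nodes visited in total)."""
    head, steps = None, 0
    for v in values:
        node = Node(v)
        if head is None:
            head = node
            continue
        cur = head
        steps += 1
        while cur.next is not None:  # walk to the tail on every append
            cur = cur.next
            steps += 1
        cur.next = node
    return head, steps


def build_by_prepend(values):
    """Prepend each value in O(1), then reverse once to restore the order."""
    head = None
    for v in values:
        head = Node(v, head)
    prev = None
    while head is not None:          # single O(n) reversal pass
        nxt = head.next
        head.next = prev
        prev = head
        head = nxt
    return prev


def to_list(head):
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out
```

For 1000 items, the append version visits 499500 nodes (n(n-1)/2), while the prepend version does 1000 constant-time insertions plus one reversal pass and ends up with the same order.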



The "libdb direct" numbers stem from a routine that bypasses evolution-data-server altogether and connects directly to libdb. Consider that almost the ultimate "time to beat", given the current database format. There is no doubt room for improvement.

All tests are run with a number of identical contacts with only name and email filled in. The database was stored on a tmpfs drive (ramdisk). For those who are interested, creating 1 million contacts takes several hours. I don't think that anyone sane will have tens of thousands of contacts, but for testing purposes, for batch processing and for large enterprises, good scalability is a must. The address book has more potential for performance improvements, but sooner or later the verbose data structure will be the limit.

I'm sure g_list_append is used in a similar way in a lot of places around e-d-s and GNOME in general. A quick win is to replace it with g_list_prepend (plus a single g_list_reverse at the end where the order matters) wherever n is large.

April 1st

My first post was probably the worst April Fools' joke ever, but I couldn't help it since it happened to be April 1st.. ;)

I expect posts here to cover topics such as Evolution, as I am one of the Debian maintainers, and the Nokia N800, as I'm the happy owner of one.

Sunday, 1 April 2007

I got myself a new, shiny N800!

Maybe no surprise, as the title of this blog says it all, but I got the super-gadget directly from Belgium on February 7. It took only a day and a half from when I entered my VISA number until TNT tried to deliver it in Oslo. TRIED. For some stupid reason, TNT tries to deliver parcels to home addresses during office hours without prior notice. I'm never at home at that time, and had to call them to have it redelivered at work after the weekend. I'm not the first one to notice this madness.


The device looks good, but tastes better. Especially the shiny high-res screen is appealing, as many bloggers have pointed out. Size-wise, it's bigger than a Palm PDA and smaller than a normal tablet. It didn't occur to me before that the Palm was designed to fit in a shirt pocket. The N800 does not. It fits in a jacket pocket though. For heavy web browsing it's a bit small and as a music player and PDA, it's a bit big. As a remote control and for GPS navigation, it's just perfect.

It quickly came to me that this device is perfect as a remote control. Yes, you heard me right: remote control. A very expensive one, though. But if you think about it, not that much more expensive than a decent iPod. What you can't do with an iPod is sit comfortably on the sofa with no wires and control the music on the stereo. With mpd and gmpc, you have a jukebox that can be controlled with a rich interface from multiple sources. I'd die to have the latest version of gmpc ported to maemo, though. Anyone?

First post!

This is the first post of a new blog that will feature updates from tractor pulling shows, bunad-fashion, and a weekly review of Bavarian Weißwurst.

Enjoy!