
Today I've been fighting Planet, the well-known blog aggregator tool. After a while I found out how and why it was scrambling Atom feeds so horribly.
I'm not sure whether it actually is a Planet bug - maybe it works fine with older Python versions. The SGML parser of Python 2.4, however, fails on tags such as <br />, a very common case in blogs and thus in Atom feeds. Strange additional > brackets appeared in the output.
The reason is that the SGML parser in Python 2.4 treats <tag/foo/ as equivalent to <tag>foo</tag>, and thus reads <br/><br/> the same as <br>><br</br>>, i.e. a br element containing the text "><br" followed by a stray ">", with the inner characters somehow magically escaped...
The fix is quite simple: add
sgmllib.shorttag = re.compile('<([a-zA-Z][-.a-zA-Z0-9]*)/(/*)>')
to your feedparser.py file in the obvious place (next to sgmllib.tagfind).
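To see what the replacement pattern changes, here is a small sketch that uses only the re module, so it runs without sgmllib. The OLD_SHORTTAG pattern is the stock short-tag pattern as I recall it from Python 2.4's sgmllib, so treat it as an assumption; shorttag_split is a hypothetical helper written just for this illustration.

```python
import re

# Assumption: the stock short-tag pattern as recalled from Python 2.4's sgmllib.
OLD_SHORTTAG = re.compile('<([a-zA-Z][-.a-zA-Z0-9]*)/([^/]*)/')
# The replacement pattern from the fix above.
NEW_SHORTTAG = re.compile('<([a-zA-Z][-.a-zA-Z0-9]*)/(/*)>')


def shorttag_split(pattern, text):
    """Hypothetical helper: return (tag, data, rest) for a short-tag match.

    Returns None when the pattern does not match at the start of text.
    """
    match = pattern.match(text)
    if match is None:
        return None
    return match.group(1), match.group(2), text[match.end():]


sample = '<br/><br/>'
# Old pattern: everything up to the next slash becomes the "content" of br.
print(shorttag_split(OLD_SHORTTAG, sample))
# New pattern: <br/> is consumed on its own, with empty content.
print(shorttag_split(NEW_SHORTTAG, sample))
```

With the old pattern the text "><br" ends up as element content and a lone ">" is left over, which matches the stray brackets seen in the output. A true SGML short tag such as <b/bold/ no longer matches the new pattern at all.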
This will break support for true SGML short tags, but I've never heard of a blog feed using them anyway.
As I said, I'm not really sure whether this is a Planet bug: it might be a bug in Python's sgmllib, too. But maybe Planet should just use an XML parser for XML files, and fall back to an SGML parser (or maybe a robust XML parser) for other files (unfortunately, many blogs - including mine - do not ensure correct XML). And Planet could use some proper XML handling anyway... Right now, the code is so string-array-based, it makes me sick.
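The "XML parser first, lenient parser as fallback" idea could look roughly like the sketch below. This is not Planet's or feedparser's code: it uses the modern Python standard library (xml.etree.ElementTree and html.parser) as stand-ins, and parse_feed and TagCollector are hypothetical names made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient fallback: records tag names without requiring well-formed input."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <br/>.
        self.tags.append(tag)


def parse_feed(document):
    """Try a strict XML parse first; fall back to the tolerant parser.

    Returns (mode, tags), where mode is 'xml' or 'fallback'.
    """
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        collector = TagCollector()
        collector.feed(document)
        collector.close()
        return 'fallback', collector.tags
    return 'xml', [element.tag for element in root.iter()]


print(parse_feed('<entry><title>a</title><br/></entry>'))
print(parse_feed('<entry><title>a<br></entry>'))
```

Well-formed feeds never touch the lenient parser, so <br/> is handled by real XML rules; only broken documents take the fallback path.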
You might also want some extra magic to re-fold <br/> tags, so as not to confuse older browsers.
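One possible form of that re-folding magic, as a rough sketch: collapse an empty pair such as <br></br> back into <br />. The tag list VOID_TAGS and the function name refold are assumptions for illustration, and a regex like this is only meant for simple, machine-generated output.

```python
import re

# Hypothetical list of empty elements to re-fold; extend as needed.
VOID_TAGS = ('br', 'hr', 'img')

# Matches <tag ...></tag> for the listed tags, capturing any attributes.
_REFOLD = re.compile(
    r'<(%s)(\s[^<>]*?)?\s*>\s*</\1\s*>' % '|'.join(VOID_TAGS),
    re.IGNORECASE,
)


def refold(html):
    """Turn <br></br> (and friends) into <br />, keeping any attributes."""
    return _REFOLD.sub(
        lambda m: '<%s%s />' % (m.group(1), m.group(2) or ''),
        html,
    )


print(refold('line one<br></br>line two'))
print(refold('<img src="x.png"></img>'))
```

Other elements, such as <b></b>, are left alone because they are not in the tag list.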
Some time ago I tried Skype... now I wanted to use it to call someone and tried starting it... the main window came up, and immediately closed itself again.
Nothing I can do about it, no error message, nothing I could do differently.
So I went looking for a different/newer version, and found out that Skype itself provides downloads for Skype on Debian. Except they cannot be installed.
What crap! Use an open standard like SIP, which works much better - and where you can use different applications such as linphone or kphone.