<?xml version="1.0" encoding="iso-8859-1" ?>
<rss version="2.0" 
   xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule" 
   xmlns:html="http://www.w3.org/1999/html" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
<channel>
   <title>Blog of Erich Schubert</title>
   <link>http://blog.drinsama.de/erich</link>
   <description></description>
   <language>en</language>
   <copyright>Copyright 2007 by Erich Schubert</copyright>
   <ttl>60</ttl>
   <pubDate>Mon, 01 Mar 2010 18:11 GMT</pubDate>
   <managingEditor>n/a</managingEditor>
   <generator>PyBlosxom http://pyblosxom.sourceforge.net/ 1.3.2 2/13/2006</generator>
<item>
   <title>Geo-Temporal visualization</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2010030101-geotemporal-visualization</guid>
   <link>http://blog.drinsama.de/erich/en/web/2010030101-geotemporal-visualization.html</link>
   <description><![CDATA[
<p>I've been playing around a bit with Geo-Temporal visualization. Here's a
screenshot of an experimental visualization on Google Maps:</p><p><img src="http://www.mucl.de/~erich/blogdata/geotemporal.jpg" width="278" height="318" alt="Geo-Temporal visualization" /></p><p>The icons are placed on approximate coordinates; multiple events in a small
area are aggregated into a single marker. The red sectors correspond to
temporal information: to the right is the current day, a full turn corresponds
to a duration of 7 days. Typical events listed on this map cover 1 to 4 hours
in the evening of a day, resulting in rather small sectors at the typical angles
corresponding to the seven days of a week. There are three larger events: a
weekend workshop in Hamburg (covering the Saturday and Sunday sectors),
a Friday-to-Saturday event in Leipzig, and an event incorrectly set for all of
Tuesday in Dresden. M&uuml;nchen, on the other hand, seems to take a day off on
Saturday (in fact they have a full-week workshop on Lanzarote, on a part
of the map not shown ...).</p><p>While this visualization is quite fancy and can scale to arbitrary time windows,
I will not be able to add it to the public version of this map (which can
be tried out on
<a href="http://swing.vitavonni.de/">http://swing.vitavonni.de/</a>).</p><p>The rendering of so many polygons with Google Maps is just way too slow in
all the browsers I tried. Maybe I could use cached PNG images and
traditional overlays instead to improve performance.</p><p>For some visualizations, it would also make sense to turn the sectors into
a spiral, for example where the angle corresponds to the day of the month and
the distance from the center corresponds to the month.
</p>
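<p>The mapping from time to sector angle described above can be sketched roughly
as follows. This is only an illustration, not the code behind the map: the
function name <tt>sectorForEvent</tt>, the clockwise direction and the use of
millisecond timestamps are assumptions made for the example.</p>

```javascript
// Hypothetical helper: map an event's time span to a sector on a marker.
// Assumptions (not from the post): angle 0 points right and marks the start
// of the current day, angles grow clockwise, times are in milliseconds.
const HOUR_MS = 60 * 60 * 1000;
const WEEK_MS = 7 * 24 * HOUR_MS;

function sectorForEvent(startMs, endMs, dayStartMs) {
  // A full turn (360 degrees) corresponds to 7 days.
  const toAngle = (t) => ((t - dayStartMs) / WEEK_MS) * 360;
  const from = toAngle(startMs);
  // Clamp so that events longer than the window fill the circle only once.
  const sweep = Math.min(toAngle(endMs) - from, 360);
  return { from: ((from % 360) + 360) % 360, sweep: sweep };
}

// A 3-hour event starting at 19:00 on the second day of the window.
const evening = sectorForEvent(43 * HOUR_MS, 46 * HOUR_MS, 0);
console.log(evening.from.toFixed(1), evening.sweep.toFixed(1));
```

<p>With these assumptions a whole day covers about 51.4 degrees and a 3-hour
evening event only about 6.4 degrees, which is why most sectors on the map
are so thin.</p>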
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Mon, 01 Mar 2010 18:11 GMT</pubDate>
</item>
<item>
   <title>Database streaming API</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2010030101-database-stream-api</guid>
   <link>http://blog.drinsama.de/erich/en/2010030101-database-stream-api.html</link>
   <description><![CDATA[
<p>For my research at the university, I've become a lead developer for
<a href="http://www.dbs.ifi.lmu.de/research/KDD/ELKI/">ELKI</a>, a framework
for developing data mining algorithms along with index structures, to be
able to on the one hand quickly implement new algorithms (by being able to
reuse a lot of code, in particular index structures, parsers, ...),
but also on the other hand to evaluate the interaction of index structures
with different algorithms, distance functions, and so on.
The new version 0.3, which adds a lot of new outlier detection methods, will be
published at the beginning of April at
<a href="http://dasfaa2010.cs.tsukuba.ac.jp/">DASFAA 2010</a>.
(Note that this is designed for research and teaching use: it favors code extensibility,
readability etc. over maximized performance. You might want to do a
rewrite in C for maximum performance, since Java does give you quite a memory
and performance overhead in these setups.)</p><p>After this release, we will be doing a major redesign of the database and
index layer, to allow better comparison of different index structures in
parallel; right now it's hard to use more than one at a time, and building
e.g. a combined index structure is a larger effort than we'd like it to be.
During the process of redesigning the database layer, I'll also be improving
the database update or "streaming" API.</p><p>If you know of a nice API for streaming databases, please send me an email
to <em>erich -at- debian -dot- org</em>.</p><p>Note that I'm looking for a programming API, i.e. "interfaces". Not for random
data sources such as Twitter. Also I really need an API able to model
<em>data changes</em>, not just "<em>events</em>" aka "instances". So please
don't point me to what Twitter calls a "streaming API". What I'm looking for is
a nicely-designed API to allow programmers to react to <em>Database
changes</em>, such as insertions, deletions, updates, bulk operations etc. and
update their index structures and algorithm results accordingly. One example
would be the "Oracle Streams" API, I believe; MySQL probably has a log (used
in their replication hacks) which can also be seen as a database stream. But
these are designed around an RDBMS view, and not so much for data mining.
Weka/MOA Streams seem to be just event/instance streams, where there is no
such thing as a bulk insert or even a deletion. Of course there are many use
cases where you will be happy with working on just the previous n instances.
The more general case, however, handles arbitrary inserts and deletes, instead of
having just inserts (and implicit deletions by an expiry strategy). And yes,
of course you can in turn wrap database events as instances into an event
stream (with "insert", "update", "delete" events) ...</p><p>Yes, I'm aware that this kind of setup was abandoned for what is called a
"stream processing engine" (SPE) in many use cases, that do not care about
"old" data or deletions in general. We'd like to be able to support both
approaches, also to be able to do fair comparisons.</p><p>Of course you can also point me to badly done APIs, and explain to me where they
fall short for you. We're not so much interested in copying some API, but we'd
just like to design a good API for people to do research on stream processing
in a database context (e.g. index support for streaming data, online
algorithms, ...) Or you can write me a mock-up API that you would deem
useful.
</p>
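<p>To make the difference between instance streams and data changes a bit more
concrete, here is a very small mock-up in JavaScript. It is not an existing
API: the class name <tt>ObservableStore</tt>, the change format and the
running-sum listener are all invented for this sketch. It only shows the kind
of events (insert, update, delete, delivered in bulk) I have in mind.</p>

```javascript
// Hypothetical mock-up, not an existing API: a tiny in-memory store that
// reports data changes (not just new instances) to registered listeners.
class ObservableStore {
  constructor() {
    this.rows = new Map();
    this.listeners = [];
  }
  subscribe(listener) {
    this.listeners.push(listener);
  }
  // Apply a batch of changes and notify listeners once per batch, so that
  // index structures can process bulk operations in a single pass.
  apply(changes) {
    const applied = [];
    for (const c of changes) {
      const old = this.rows.get(c.id);
      if (c.type === "delete") {
        if (old === undefined) continue; // nothing to delete
        this.rows.delete(c.id);
        applied.push({ type: "delete", id: c.id, old: old });
      } else {
        this.rows.set(c.id, c.value);
        const type = old === undefined ? "insert" : "update";
        applied.push({ type: type, id: c.id, old: old, value: c.value });
      }
    }
    if (applied.length > 0) {
      for (const l of this.listeners) l(applied);
    }
  }
}

// Example listener: keeps a running sum correct under all change types.
const store = new ObservableStore();
let sum = 0;
store.subscribe((batch) => {
  for (const c of batch) {
    if (c.type !== "insert") sum -= c.old;
    if (c.type !== "delete") sum += c.value;
  }
});
store.apply([{ id: "a", value: 3 }, { id: "b", value: 4 }]); // bulk insert
store.apply([{ id: "a", value: 10 }]); // update
store.apply([{ type: "delete", id: "b" }]); // deletion
console.log(sum);
```

<p>The point of delivering the old value with updates and deletions is that a
listener can repair its result incrementally instead of recomputing it.</p>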
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Mon, 01 Mar 2010 17:08 GMT</pubDate>
</item>
<item>
   <title>Maps-Calendar Mashup</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2010020101-maps-calendar-mashup</guid>
   <link>http://blog.drinsama.de/erich/en/web/2010020101-maps-calendar-mashup.html</link>
   <description><![CDATA[
<p>Well, I'd not call it a Mashup - it's actually backed by a custom database,
a Xapian index for full text search and so on. To me, a true mashup would
work without any server-side code of its own.</p><p>Anyway, what it does is this:
<ul>
<li>It gathers data from two dozen Google Calendars for the next few weeks</li>
<li>Geocodes them and does full-text indexing (including the Geo information,
so you can search for that, too.)</li>
<li>Applies some magic formatting to the calendar data (making links clickable,
allowing some basic styling)</li>
<li>Pushes them as KML feed to a Google Maps application</li>
<li>Markers on the map are aggregated and "clustered" (which is a simple
proximity merging, which I wouldn't call clustering)</li>
<li>The map is pre-centered and panned using Google geo-location information on
the visitor to show the best region with hits. So if you are in a city I have
data for, it shows you your city. If you are in the US, it should try the
whole US next, then fall back to the whole world.</li>
</ul></p><p>It's using the Maps V3 API, currently in public testing, which seems to give
quite some extra speed compared to earlier versions. I've also added two extra
controls, a search box at the top center, and a "Go to" menu on the left,
which uses the visitor position from Google.</p><p>The data is coming from swing dancing calendars, so it's real world data, and
you should get different results every day. Most of the data is from Germany,
so that is where you can see the marker aggregation and these things in effect.</p><p><a href="http://swing.vitavonni.de/">Here's the prototype</a>.</p><p>There are still lots of things to do, but this is just my free-time project,
when I'm not at work, dancing or with my friends.
<ul>
<li>Timezones need to be handled right. So I don't give you any guarantee
that any event has the correct time shown. I believe it only works right for
events in CET and visitors in CET right now. It's okay: I haven't thought about how
to handle time zone differences yet, and the JavaScript Date API is worthless
anyway, so that requires quite some effort.</li>
<li>Events in info windows aren't sorted by time yet</li>
<li>Info window UI is bad, need to do pages there</li>
<li>There is no way of choosing a different time query window except the next
week. The backend does this already, it's just not in the UI.</li>
<li>There is no "temporal navigation" tool</li>
<li>No list view (well, actually there is one, but that is the old
version)</li>
</ul></p><p>I don't know yet if this will remain online, it's more of a toy project for me.
Still it's cool to see where there are swing dancing events, and it's cool to
be able to just zoom to another city and see where you could "hop by" for a
dancing event while you're there. But there are just a lot of UI issues to
solve to get this really usable, and I'm not much of a UI guy...</p><p>P.S. If it doesn't work, that probably means I'm currently working on it.
There is no staging, and no "production system".
</p>
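<p>The "simple proximity merging" of markers mentioned above could look roughly
like this. It is a sketch, not the code running on the site: the function
name <tt>mergeByProximity</tt>, the event format and the radius in degrees
are assumptions for the example.</p>

```javascript
// Greedy proximity merging: each event joins the first existing marker that
// lies within `radius` (in degrees); otherwise it starts a new marker.
// The event shape {lat, lng, title} and the radius are illustrative.
function mergeByProximity(events, radius) {
  const markers = [];
  for (const ev of events) {
    const near = markers.find(
      (m) => Math.hypot(m.lat - ev.lat, m.lng - ev.lng) <= radius
    );
    if (near) {
      near.events.push(ev);
    } else {
      markers.push({ lat: ev.lat, lng: ev.lng, events: [ev] });
    }
  }
  return markers;
}

const sample = [
  { lat: 48.137, lng: 11.575, title: "Munich A" },
  { lat: 48.139, lng: 11.58, title: "Munich B" },
  { lat: 53.551, lng: 9.993, title: "Hamburg" },
];
const merged = mergeByProximity(sample, 0.05);
// Two markers result: one holding both Munich events, one for Hamburg.
console.log(merged.map((m) => m.events.length));
```

<p>The result depends on the order of the events and the marker stays at the
position of its first event, which is one reason why I wouldn't call this
clustering.</p>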
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Mon, 01 Feb 2010 01:20 GMT</pubDate>
</item>
<item>
   <title>Sun Java - happy 9th birthday, user-affecting rendering bug.</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/linux/2010012201-java-user-support</guid>
   <link>http://blog.drinsama.de/erich/en/linux/2010012201-java-user-support.html</link>
   <description><![CDATA[
<p>It seems that Sun doesn't care much about getting bugs fixed in Java.</p><p><a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4404179">This
bug</a> for example causes rendering artifacts in Apache Batik, and is
very visible with many SVG files. It causes circles to be rendered as
approximated diamonds. It was first reported 9 years ago (and there are
duplicates).</p><p>I understand that there are more important bugs, and that one must avoid
introducing new bugs when fixing bugs. But there should be few dependencies
on a broken circle rendering routine, so please just fix this cosmetic bug,
too. One of the reports is even marked "Fix understood" ...</p><p>A more important issue with Sun Java (known since 2005) is
<a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6342561">this
bug</a>, which effectively breaks Java IPv4 networking on Debian unstable
now (which recently changed the IPv6-to-IPv4 fallback behaviour).
So far, Sun has rated this as "request for enhancement". WTF?</p><p>Sure, you can work around the bug easily - change
<tt>/etc/sysctl.d/bindv6only.conf</tt> to use the value of 0 instead to
re-enable IPv4 fallback - but after all, IPv4 networking is pretty much an
essential Java feature.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/linux</category>
   <pubDate>Fri, 22 Jan 2010 17:00 GMT</pubDate>
</item>
<item>
   <title>Facebook Scam Groups</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/20100109-facebook-scam-groups</guid>
   <link>http://blog.drinsama.de/erich/en/web/20100109-facebook-scam-groups.html</link>
   <description><![CDATA[
<p>Facebook seems to have little interest in protecting its users from a huge
flow of common scam/spam. Sure they do get active when accounts are mass
hacked, and I haven't seen a "Facebook virus" for some time. Their JavaScript
filtering is pretty neat, and they have implemented dereferrer pages they
can use to quickly stop URLs from spreading.</p><p>However, some of my friends keep on joining very dubious groups and installing
very dubious applications. No wonder "FarmVille" is sometimes nicknamed
"<a href="http://www.techcrunch.com/2009/10/31/scamville-the-social-gaming-ecosystem-of-hell/">ScamVille</a>". There still is a lot of money to be made in dubious ways.</p><p>The big problem with Facebook is that everyone can set up groups and
applications that look like they might be real. This is why people keep on
installing "Mafia wars gifts" applications that have nothing to do with the
actual game except the name, and sometimes do not even realize they don't actually
get these gifts in the real game.</p><p>Even worse are the "pimp" groups. It's a classic pyramid scheme. Invite all
your friends to the group, then you get extra Mafia points. Facebook really
needs to stop that.</p><p>A quick search for "invite proof" - these groups usually require you to post
"proof" of having invited all your friends - turns up <b>246</b> groups,
almost all of which promise you Mafia stuff.</p><p>Searching for "getElementsByTagName" in Facebook turns up "over 500" groups.
This string is a JavaScript command commonly used to auto-invite all your
friends to a group. A typical mass-spread group will use this in its
"join instructions".</p><p><b>Facebook needs to combat this kind of spam/scam</b>. And it's not too hard.
<b>Just actually check user complaints/reports</b>, do simple searches like
the ones I posted above, and have some employee go through them and <b>just
delete all these dubious mass-join groups</b>. Pyramid schemes likely violate
the Facebook TOS, and they definitely are illegal in at least Germany.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Sat, 09 Jan 2010 16:24 GMT</pubDate>
</item>
<item>
   <title>Enigma in Debian</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/linux/debian/2009122801-enigma-in-debian</guid>
   <link>http://blog.drinsama.de/erich/en/linux/debian/2009122801-enigma-in-debian.html</link>
   <description><![CDATA[
<p><a href="http://enigma.nongnu.org/">Enigma</a> is a great game, with a unique
mixture of puzzles with mouse skills and action. If you know the discontinued
game <a href="http://en.wikipedia.org/wiki/Oxyd">Oxyd</a>  originally on the
Atari ST in the 90s (also on Amiga and one version on DOS), then you know the
principle of Enigma. Except that it has tons more levels and is Open Source.</p><p>Some weeks ago, I uploaded a 1.10 pre-release (approximately milestone 5) to
Debian experimental. This is the soon-to-be-released new version, using a new
level file format (with a much extended API to make level development even
easier, ~50% less code per level now), new levels (of course), updated
graphics (including support for new graphics modes), ...</p><p>Unstable still contains version 1.01; the reason is simply that I knew there
would be another 1.01 maintenance release coming. However, I believe it
doesn't offer much over the current unstable version; it largely marks
an upstream release containing patches already in the Debian package (since
communication with upstream is really good).</p><p>So I now have two choices: refreshing the Debian unstable package to the
"probably last" 1.01 release upstream, or going straight for the 1.10
milestones to give enigma some extra testing.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/linux/debian</category>
   <pubDate>Mon, 28 Dec 2009 15:54 GMT</pubDate>
</item>
<item>
   <title>Duplex on HP OfficeJet Pro 8000 seriously messed up</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009122601-broken-duplex-in-hp-officejet</guid>
   <link>http://blog.drinsama.de/erich/en/2009122601-broken-duplex-in-hp-officejet.html</link>
   <description><![CDATA[
<p>My parents needed a new printer, and after some research I decided to
recommend them an <a href="http://www.amazon.de/gp/product/B001T9N4DQ?ie=UTF8&tag=vitavonni-21&linkCode=as2&camp=1638&creative=19454&creativeASIN=B001T9N4DQ">HP OfficeJet Pro 8000</a>. Today I gave it a try, by printing some CD covers
for a CD to give away for Christmas to some friends.</p><p>HP failed in a very subtle way: I had printed the covers, cut them, produced
the CDs for them. Then I wanted to put the printed covers into the CD cases.</p><p>Despite the graphics being 12cm x 12cm in size, HP managed to print them in
12cm x 11.4cm. Without any notice (or giving me a choice) it had decided to
scale them on the y axis. Which makes them completely unusable, since they
don't fit the 12cm height of the CD case now.</p><p>After some more experiments, I decided to retry <em>without duplex</em>, and
voila: 12cm x 12cm.</p><p><b>Duplex on HP OfficeJet Pro 8000 is only usable for draft printing,
since it will distort your pages!</b></p><p>(See also <a href="http://h30434.www3.hp.com/t5/Printer-All-in-One-Software-and/HP-OfficeJet-Pro-8000-messing-with-the-margins-lt-cause-found/m-p/124221">this evidence in the HP forums of people with the same issue</a>,
<a href="http://h30434.www3.hp.com/t5/Printer-All-in-One-Software-and/OfficeJet-Pro-8000-duplex-printing-problems/m-p/172509">an attempt to investigate the
margin mess-up</a>, and <a href="http://h30434.www3.hp.com/t5/Printer-All-in-One-Software-and/Printer-driver-for-Officejet-Pro-8000-definitely-broken/m-p/172508">a report that the DJ990c driver can print duplex on this printer without messing with the margins, but is slower and offers less print quality</a>. So it seems that this is an HP driver problem. And technically, it must be caused by the driver; at least it should be able to compensate for this!)</p><p>I also noticed another issue with the print. The bottom right corner of the
graphic didn't get enough ink; it looks like the printer stopped printing a
bit too early. I don't know if this also happens in non-duplex, since I worked
around this by adding a header and footer to the page.</p><p>Seriously, we should send back the printer. On my first try to use it, I
already encountered two bugs. I wonder how many bugs I would see if I used
it every day?
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Sat, 26 Dec 2009 18:01 GMT</pubDate>
</item>
<item>
   <title>Media Players</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/linux/2009122501-media-players</guid>
   <link>http://blog.drinsama.de/erich/en/linux/2009122501-media-players.html</link>
   <description><![CDATA[
<p>Somehow, I'm still lacking the optimal media player application. Many popular
ones are totally overloaded (e.g. amarok). Others like totem seem to be just
a minimalistic frontend for a particular backend.</p><p>My current choice:
<ul>
<li>Single-shot playback: to view a random song or video I usually open them
with Totem (the GNOME default) and that works okay</li>
<li>Library: I use <a href="http://www.musicpd.org/">MPD</a> as player because
it just seems to be rock stable. As UI I currently use Sonata, but I don't use
it for much more than choosing a song from the current playlist.</li>
<li>Editing: <a href="http://code.google.com/p/quodlibet/">ExFalso</a> seems
to have the best ID3v2.4 support; in particular it also allows multiple genre
fields. (Note that Vorbis even suggests you should use multiple artist fields
instead of the common "Artist A & Artist B" way of filling the fields.)</li>
</ul></p><p>However, there is one thing I'm really not satisfied with: when putting
together a CD compilation for friends (say, as Christmas present), they are
quite useless. A key issue here is the <em>total playlist length</em>.
Guess what, I want to make sure it fits on a single CD. So I really need to
know the total playlist length. Why do so many media players (e.g. totem,
alsa-player-gtk, xfmedia4, vlc, mplayer, ...) not show you the total playlist
length? They did read all the files to get artist and title. Many even have
the individual song lengths, just not the total sum.</p><p>In the past I've been using old XMMS1 to check for the total length, or a CD
burning application like K3B by repeatedly importing my current folder.</p><p>Right now, I'm using <a href="http://code.google.com/p/quodlibet/">Quod
Libet</a> (since I like the tag-editing component exfalso a lot) to arrange
the playlist. It also gives me the total length, although I believe I've had
incorrect song lengths in it before (broken VBR files?), and it's not perfect
either: being database-driven, it has really long startup times for occasional
users (because of updating the database) and is much more heavyweight.
I also believe I've lost some playlists because I had moved my files around
once ... so I'm a bit sceptical.</p><p>Anyway, there are still hundreds of media players I haven't looked at. Don't
bother sending me an email about one I haven't mentioned!</p><p>But if you are developing a media player, please consider the
<a href="http://en.wikipedia.org/wiki/Use_case">use case</a> of putting
together a music CD for your friends. In particular, for users that do not
use your player all day.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/linux</category>
   <pubDate>Fri, 25 Dec 2009 14:59 GMT</pubDate>
</item>
<item>
   <title>Highlighting links to your own site in Google search results</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009120801-google-search-highlight</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009120801-google-search-highlight.html</link>
   <description><![CDATA[
<p>The following
<a href="http://www.mozilla.org/unix/customizing.html#usercss">User stylesheet
snippet</a> can be used to highlight particular search results (such as your
own domain, if you want to quickly find it in Google search results):
<pre>
@-moz-document url-prefix(http://www.google.com/search)
{
a[href^='http://www.vitavonni.de/'] { background-color: yellow; }
}
</pre>
You might also want to add a copy for your localized Google domain:
<pre>
@-moz-document url-prefix(http://www.google.de/search)
{
a[href^='http://www.vitavonni.de/'] { background-color: yellow; }
}
</pre>
Or you could go the heavyweight way:
<pre>
a[href*='vitavonni.de'] { background-color: yellow !important; }
</pre>
to highlight any link to your domain, on any page.</p><p>This modification obviously only applies to your browser; it's meant to help
you find links to your own site more easily.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Tue, 08 Dec 2009 16:12 GMT</pubDate>
</item>
<item>
   <title>Eclipse TPTP on Debian unstable/AMD64</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009120801-eclipse-tptp-on-debian</guid>
   <link>http://blog.drinsama.de/erich/en/2009120801-eclipse-tptp-on-debian.html</link>
   <description><![CDATA[
<p>For a Java project, I wanted to give the Eclipse profiler a try. It didn't work,
because it was missing a library (open the "Error log" view to see such things).</p><p>The corresponding library - libstdc++5, an old C++ library - is no longer
available in Debian unstable, so you need to grab the package from
<a href="http://packages.debian.org/lenny/amd64/libstdc++5/download">lenny</a>.
It will install fine on unstable.</p><p>Things may or may not be different on other architectures.</p><p>[Update: But TPTP is far from stable for me. It freezes Eclipse pretty much
all the time.]
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Tue, 08 Dec 2009 10:11 GMT</pubDate>
</item>
<item>
   <title>Making pyroman IPv6 capable</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/linux/2009120601-making-pyroman-ipv6-capable</guid>
   <link>http://blog.drinsama.de/erich/en/linux/2009120601-making-pyroman-ipv6-capable.html</link>
   <description><![CDATA[
<p>I'd like to make <a href="http://pyroman.alioth.debian.org/">pyroman</a> IPv6
capable. That is actually the one big thing before calling it a version "1.0".</p><p>I must admit that I haven't been very active on Pyroman (or Debian in general)
in recent years. This even goes so far that "pyroman" was considered
"abandoned" by Fedora or so. It is not; I use it on all my servers. It's still
in use at the network I developed it for (after all there is not that much
benefit for a workstation setup, where a 10 line iptables script will do the
job just perfectly.).</p><p>Anyway, I'd like to get IPv6 support into pyroman, but there is one big issue
here: I don't have any machine using IPv6, so I haven't used ip6tables myself
yet, and I don't know about all the magic involved ...</p><p>So if you use IPv6, it would be very cool if you would jump in to get full
IPv6 support into pyroman. Madduck has already done some preliminary work,
but I haven't gotten around to looking at the integration or completeness yet.</p><p>The '--no-act' and '--print' modes of pyroman should even allow development
without any IPv6 support or root permissions in the system.</p><p>Other things remaining on my pyroman wishlist:
<ul>
<li>Fully automatic <a href="http://blog.drinsama.de/erich/en/linux/2007040901-visualizing-iptables.html">iptables firewall visualization</a></li>
<li>Keeping traffic counters over firewall reloads</li>
<li>Configuration UI</li>
<li>A fancy 'arsonist' icon and a web page design</li>
</ul>
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/linux</category>
   <pubDate>Sun, 06 Dec 2009 18:38 GMT</pubDate>
</item>
<item>
   <title>Tracking outgoing links with Google Analytics</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009120401-tracking-outgoing-links-with-analytics</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009120401-tracking-outgoing-links-with-analytics.html</link>
   <description><![CDATA[
<p>Here's a code fragment to track outgoing links with Google Analytics.
As usual, use it <em>at your own risk</em>. I cannot give you support for
Google products, for obvious reasons.</p><p>To use it, you <em>need</em> to at least understand where to put it (call it
in a try-catch in onLoad) and how to adjust the variable name of your page
tracker (I'm not using the default).</p><p><pre>
function trackLinks(){
  var as=document.getElementsByTagName("a");
  var ig=["mydomain.tld","google-analytics.com"];
  for(var i=0; i&lt;as.length; i++) {
    var ignore=false;
    var oc=as[i].getAttribute("onclick");
    if(oc!=null){
      oc=String(oc);
      if(oc.indexOf('urchinTracker')&gt;=0
      || oc.indexOf('_trackPageview')&gt;=0
      || oc.indexOf('javascript:')&gt;=0)
        continue;
    }
    if(as[i].href.indexOf("mailto:")&lt;0){
      for(var j=0;j&lt;ig.length;j++){
        if (as[i].href.indexOf(ig[j])&gt;=0)
          ignore=true;
      }
    }
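    if(!ignore){
      // wrap in a function so each link keeps its own previous handler
      (function(a,old){
        a.onclick = function(e){
          var o=this.href.replace(/:\/*/,"/");
          pt._trackPageview('/out/'+o);
          // run the handler the link had before, if any
          if(typeof old=="function") return old.call(this,e);
        };
      })(as[i],as[i].onclick);
    }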
  }
}
</pre></p><p>This code tries to attach an onclick handler to any outgoing link, ignoring
internal links or links that use JavaScript. If such a link is clicked,
it generates a virtual page access with an "/out/" URL that can be analyzed
in Google Analytics.</p><p>A side benefit (apart from knowing which links are interesting to your
visitors) is that you should get more accurate "time on page" statistics for
your pages.
</p>
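<p>To see what ends up in the Analytics reports, here is the URL rewrite from
the click handler on its own. The function name <tt>virtualOutPath</tt> and
the example addresses are made up for illustration.</p>

```javascript
// Same rewrite as in the click handler: the first ":" and any slashes
// directly after it are replaced by a single "/".
function virtualOutPath(href) {
  return "/out/" + href.replace(/:\/*/, "/");
}

console.log(virtualOutPath("http://www.example.org/page.html"));
// prints "/out/http/www.example.org/page.html"
console.log(virtualOutPath("mailto:someone@example.org"));
// prints "/out/mailto/someone@example.org"
```

<p>All outgoing clicks therefore show up below "/out/" and can be filtered or
grouped by protocol and host name.</p>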
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Fri, 04 Dec 2009 15:19 GMT</pubDate>
</item>
<item>
   <title>Tracking Google image search in Analytics</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009112801-tracking-image-search</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009112801-tracking-image-search.html</link>
   <description><![CDATA[
<p>I do not really understand why they don't support this themselves, but
Google Analytics will not track keywords for Google image search. Instead
it just shows up as "referrer". A site I'm webmaster for,
<a href="http://www.swingandthecity.com/">Swing and the City</a>, gets a lot
of image search exposure (funnily, for an image that has been gone since August;
Google needs to work on their index, too), so it was a bit odd to have
images.google.com show up as top referrer but not as "organic search".</p><p>Here's the code I use to fix this:
<pre>
var r=document.referrer;
if(r.search(/images.google/)!=-1 &amp;&amp; r.search(/prev/)!=-1){
 var e=new RegExp("images.google.([^\/]+).*&amp;prev=([^&amp;]+)");
 var m=e.exec(r);
 if(m){ // the referrer may contain "prev" without matching the pattern
  pt._addOrganic("images.google","q",true);
  pt._setReferrerOverride("http://images.google."+m[1]+unescape(m[2]));
 }
};
pt._addOrganic("maps.google","q",true);
pt._addOrganic("forestle.org","q",true);
pt._trackPageview();
</pre></p><p>Note that image search is more complicated than the maps and forestle search
engines I also add for keyword tracking. The original query is encoded in the
"prev" parameter, and the easiest (or only?) way to get working tracking is
to use the ReferrerOverride function of analytics.</p><p>Note: this is not a straight copy &amp; paste, since I use this code in a
compressed and encoded (for injection into the page via DOM ops) form.
So no guarantee of syntax completeness. You'll need to adjust it to your
variable naming anyway (I use "pt" instead of "pageTracker"). This is just
to show you the use of unescape on the "prev" parameter for this purpose.
</p>
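<p>The decoding step can be tried without Analytics. The following sketch
rebuilds the search URL from a referrer string; the function name
<tt>imageSearchReferrer</tt> and the sample referrer are invented for the
example, and it uses <tt>decodeURIComponent</tt> where the code above uses
<tt>unescape</tt>.</p>

```javascript
// Sketch: rebuild the original Google image search URL from a referrer.
// Returns null if the referrer does not look like an image search hit.
function imageSearchReferrer(referrer) {
  const m = /images\.google\.([^\/]+).*[?&]prev=([^&]+)/.exec(referrer);
  if (m === null) return null;
  // "prev" holds the URL-encoded path and query of the search page.
  return "http://images.google." + m[1] + decodeURIComponent(m[2]);
}

// Hypothetical referrer, shaped like the ones described above.
const ref =
  "http://images.google.de/imgres?imgurl=http://example.org/a.jpg" +
  "&prev=/images%3Fq%3Dswing%2Bdance%26hl%3Dde";
console.log(imageSearchReferrer(ref));
// prints "http://images.google.de/images?q=swing+dance&hl=de"
```

<p>The decoded URL carries a regular "q" parameter again, which is what the
organic search detection looks for.</p>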
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Sat, 28 Nov 2009 14:01 GMT</pubDate>
</item>
<item>
   <title>Identifying Link Spammers via nofollow links</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009112601-identifying-link-spammers</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009112601-identifying-link-spammers.html</link>
   <description><![CDATA[
<p>I wonder if it's possible to identify link spammers (you know, these bots that
mass-submit a link into as many blogs/etc they can find in order to boost their
page rank) by the simple measure of how many of the links to their site are
marked 'nofollow'.</p><p>Say, a regular page should have less than 5% (and less than 20) nofollow links;
a site that goes significantly above this value probably employs some spam bot.</p><p>The only really hard thing is how to avoid attacks on a site using this ...
say, I write a bot that spams links to Microsoft on as many sites as it can
find that DO use 'nofollow', in order to get that site above the limit, and
have Google penalize it.</p><p>So in general I don't think Google would automatically penalize such things;
still it could be used to e.g. have a human check the destination site for
useful content, and then only blacklist when it doesn't seem to be useful.</p><p>P.S. Which BTW is a reason why some of the SEO "do nots" are bullshit: it would
be too easy to deliberately use these to blacken a competitor. So a 'link farm'
will at most do nothing to raise your ranking; but Google must not allow you to
actually lower a competitor's ranking by setting up a link farm to him!</p><p>P.P.S. On another side note: Who guarantees that Google actually ignores
"nofollow" links? They could also just be assigned a lower weight or a penalty,
so that a "nofollow" link from a strong site such as Wikipedia would still be
worth a lot, while the average blog comments page link goes down to 0. Say a
"nofollow" link from a PR 6 site is worth as much as a regular link from a PR 4
site, and PR 2 becomes PR 0. That would already do much of the trick in discouraging
the use of blog spam bots. Because after all, ignoring the links on Wikipedia
for page rank would be quite stupid. In German Wikipedia, the page contents are
even "sighted" (aka: peer reviewed); this is a rather trustworthy source,
especially when you take time effects into account. A link being constantly in
Wikipedia on a popular page for more than a month very likely is good.
</p>
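<p>The threshold idea above can be sketched in a few lines of Java. This is
only an illustration: the class and method names are made up, the two limits
are the 5% and 20 links mentioned above, and it assumes one possible reading
of the rule, namely that a site is only suspicious when it exceeds
<em>both</em> limits. In line with the concern about attacks, the result only
queues a site for a human check; it does not blacklist anything.</p>

```java
// Hypothetical sketch of the nofollow threshold heuristic; not a real ranking API.
final class NofollowHeuristic {
    // Limits taken from the post: 5% of all inbound links, and 20 links in total.
    static final double MAX_NOFOLLOW_RATIO = 0.05;
    static final int MAX_NOFOLLOW_LINKS = 20;

    /**
     * Returns true when a site reaches both limits and should be looked at
     * by a human. Assumption: both limits must be reached, so that tiny
     * sites with a handful of links are not flagged by the ratio alone.
     */
    static boolean needsHumanReview(int totalInboundLinks, int nofollowLinks) {
        if (totalInboundLinks == 0) {
            return false; // no links, nothing to judge
        }
        double ratio = (double) nofollowLinks / totalInboundLinks;
        return ratio >= MAX_NOFOLLOW_RATIO && nofollowLinks >= MAX_NOFOLLOW_LINKS;
    }

    public static void main(String[] args) {
        System.out.println(needsHumanReview(1000, 10));  // few nofollow links
        System.out.println(needsHumanReview(1000, 400)); // many nofollow links
    }
}
```

<p>Requiring both limits is a design choice, not something the idea
dictates; the weighting scheme from the P.P.S. would need actual page rank
data and is not covered by this sketch.</p>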
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Wed, 25 Nov 2009 23:31 GMT</pubDate>
</item>
<item>
   <title>Lost an ext3 filesystem</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/linux/2009112502-lost-ext3-filesystem</guid>
   <link>http://blog.drinsama.de/erich/en/linux/2009112502-lost-ext3-filesystem.html</link>
   <description><![CDATA[
<p>Recently, something happened to one of my external USB drives that I had so far
only seen with ReiserFS (which I have since called ReisswolFS, a German word play on
"shredder" ...). But it's not ext3 that I blame.</p><p>Short story of what happened:
<ul>
<li>Resumed the system from 'suspend'.</li>
<li>I copied some files onto the first file system.</li>
<li>I copied the same files to a second external disk (dual backup...)</li>
<li>I copied some files from the first disk, which caused an
access-beyond-end-of-disk, mounting the filesystem read only</li>
<li>Unmounted the filesystem, started e2fsck</li>
<li>Started copying the files from the secondary filesystem</li>
<li>Got <em>the same error</em> on the second disk.</li>
<li>Cancelled e2fsck to keep it from doing more damage to the first disk.</li>
<li>Shutdown and reboot</li>
<li>Memcheck, three iterations. Nothing.</li>
<li>Checked second disk, no errors in filesystem (!), copied the files I had
issues accessing just fine.</li>
<li>Filesystem on disk #1 seriously trashed.</li>
<li>Had e2fsck try to recover filesystem on disk #1</li>
<li>Pretty much all data on disk #1 is now in lost+found, it seems as if all
major folders were corrupted. Lots of corrupted file entries (character devices
with random permissions and numbers) there, too.</li>
</ul>
What I will do now:
<ul>
<li>Reformat disk #1, and restore it from the <em>other backup</em> (Extra
backup for teh win! I also have a 3rd copy of about 2 months ago off-site)</li>
</ul></p><p>As you can see, something was wrong with the system, not with the file system.</p><p>I have a strong suspicion about what caused this. In case you wondered why I
included "resumed from suspend" above: I've been having system stability
issues with resume ever since upgrading to the Intel driver 2.9.0 and KMS
(Debian unstable+testing) with kernels up to 2.6.31. In about 1 out of 5
resumes, I get a Xorg or system lockup after anything from 1 to 60 minutes.
Sometimes I also experience video corruption after a few minutes, trashing some
terminal emulation until the next redraw. Just before writing this email I had
a typical lockup: when scrolling the terminal emulator. This has been a typical
trigger for lockups. In contrast, I haven't seen any such crashes (or screen
corruption) on a fresh boot.</p><p><a href="https://bugs.freedesktop.org/show_bug.cgi?id=22886">Freedesktop bug
reporting the same issue</a>, closed as "not our bug, blame it on the kernel".</p><p>Note that the 2.6.32 release candidate changelogs contain many changes for the
Intel DRI kernel driver. So the bug might already be fixed in the RC kernels.</p><p><a href="http://bugzilla.kernel.org/show_bug.cgi?id=13811">Same report in
Kernel Bugzilla</a> is still 'NEW' though.</p><p><a href="http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=534422">Related bug
report in Debian</a>, blaming it on KMS.</p><p>[Update: I've disabled KMS and upgraded to 2.6.32-rc8 and not had such a crash
since. But I can't pinpoint it to one or the other yet.]</p><p>[Update: just tried another external harddisk ...
<pre>
[305032.148616] EXT3-fs: mounted filesystem with ordered data mode.
[305066.061708] usb 1-8.3.3: reset high speed USB device using ehci_hcd and address 27
[305081.132471] usb 1-8.3.3: device descriptor read/64, error -110
...
[305147.468857] sd 4:0:0:0: Device offlined - not ready after error recovery
[305147.468880] sd 4:0:0:0: [sdb] Unhandled error code
[305147.468886] sd 4:0:0:0: [sdb] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK
...
[305147.473500] WARNING: at /build/buildd-linux-2.6_2.6.32~rc8-1~experimental.1-i386-g1b8iG/linux-2.6-2.6.32~rc8/debian/build/source_i386_none/fs/buffer.c:1159 mark_buffer_dirty+0x20/0x7a()
</pre>
It seems as if the USB disk stack still doesn't really survive suspends?
Let me try on a fresh boot later on.]
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/linux</category>
   <pubDate>Wed, 25 Nov 2009 21:39 GMT</pubDate>
</item>
<item>
   <title>Google Wave rolling out?</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009112501-google-wave-rollout</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009112501-google-wave-rollout.html</link>
   <description><![CDATA[
<p>When I got my Google Wave account, it took the invitation about a week to
arrive. A few days ago, I got my first own invites, and invited some colleagues
(in an attempt to actually find a <em>use</em> for Google Wave beyond "rich
media live messaging"). Within a few minutes they were "in". Now I just got
my second set of invites. So is Google Wave now getting ready for mass
opening, rocketing user numbers?</p><p>As you might have already guessed, I'm not convinced by Google Wave. It's
technically interesting and well-done. The demos are all nice. It's just that
the UI in the browser is a bit fragile and cumbersome, and the big question
so far is:
<blockquote>
What does Google Wave allow you to do that you couldn't do before?
</blockquote>
To me, there has been little actual use so far. Wave can do a bit of everything, but
isn't the best choice for any of it:
<ul>
<li>You can use it for <b>mail</b>, but it only works with
other users of Wave and lacks good offline operation.</li>
<li>It beats pretty much any <b>instant messaging</b> in functionality, but the
UI isn't well suited to running in the background. Most IM clients have a great UI for
"background" operation.</li>
<li><b>Collaborative editing</b> - I prefer having a real editor and real files
for that. Check out <a href="http://gobby.0x539.de/trac/">Gobby</a> for that.
I've heard Wave is good for remote <b>brainstorming</b>, though.</li>
<li><b>Social networking</b>, read "facebook". Wave doesn't have all the
filtering stuff that Facebook is still trying hard to get useful. Just wait
until someone releases "Mafia Wars" for Wave ...</li>
<li><b>Blogs</b>. Sure, I could do a 'Blog Wave' and invite my friends there.
Makes sense for small-audience private blogs; not for blogs like mine where
I mostly write to people that I do not know.</li>
<li><b>Games</b>. This probably is the current killer app on Wave: Sudoku.
Although the (widespread) implementation sucks somewhat. Magnetic Poetry is
a nice idea, but doesn't even work in Chrome for me properly ...</li>
<li>All the <b>web 2.0</b> stuff just gets on my nerves. I'm not going to use
it for my blog; I by design do not have comments on my blog, either. Being
able to web 2.0 everything doesn't make up for a lack in benefits.</li>
</ul></p><p>Yes, I'm aware that you should differentiate between the <em>protocol</em>
and the <em>ui</em>. Still pretty much everything is currently designed for
the web browser with full JavaScript and Flash capabilities.</p><p>Of course this isn't the end yet, Google Wave will evolve. Maybe into something
cool, maybe it will remain just a niche thing. Maybe some cool apps will just
use Wave as protocol. But I figure, I'll mostly wait for these things to
happen first before I become a frequent user of Wave.</p><p>The biggest thing I see is the "spam" (this especially includes 'Quiz', Mafia Wars and similar <a href="http://www.techcrunch.com/2009/10/31/scamville-the-social-gaming-ecosystem-of-hell/">Scamville</a> type of 'apps' that surely will show up in no time, once Wave is open to the public). What will Wave provide to me to handle this flood of worthless information that I'm getting more and more?</p><p>P.S. Please don't bother to ask for invitations to Wave.</p><p>P.P.S. <a href="http://linuxart.com/log/archives/2009/11/25/google-wave-native-scrollbars/">here's how to replace the odd scrollbars with the regular OS scrollbars</a> with a really simple user style (CSS).
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Wed, 25 Nov 2009 19:08 GMT</pubDate>
</item>
<item>
   <title>If you are in Bavaria, sign up for the smoking ban vote!</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/politics/2009111901-smoking-bans</guid>
   <link>http://blog.drinsama.de/erich/en/politics/2009111901-smoking-bans.html</link>
   <description><![CDATA[
<p>On 01/01/2008, Bavaria introduced a quite strict smoking ban, which
also included bars and restaurants. It however contained a backdoor by
excluding non-public locations, which led to the creation of 'smoker clubs'
where you had to become a member to be admitted. At some point, most clubs
were of this kind.</p><p>In August 2009, however, the law was changed to exclude beer tents
(Oktoberfest ...) and small bars. Many people believe that this was to get
votes in the elections in September 2009 (which ended up in a minus of 6-7%
compared to the previous election and a historical low for the biggest party).</p><p>This caused several organizations to call for a public vote on restoring the
smoking ban to the 2008 state (without the 'smokers club' backdoor). In order
to force a public vote on a law (without the government's support!), we need
10% of the voters to register as supporters for the vote. You have to register
in your registered home town. For Bavaria, this means about 940,000 supporters.</p><p>If you are a registered voter in Bavaria, <b>please drop by your municipality
and sign up. You need an ID and 5 minutes, that's all</b>. 940,000 supporters
is an awful lot of people to get to the offices, so take along your friends!</p><p>When we get enough supporters, the Bavarian government has two options:
accepting the changes as proposed (and thus making the initiative obsolete),
or conducting a public vote on it, offering an alternative (e.g. the current
law, no change) and have the voters decide (which is quite expensive, so if
many many people sign up, they might save that money and just pass the
proposed change themselves).</p><p>For more information (<em>German only</em>), check the
<a href="http://www.nichtraucherschutz-bayern.de/">Nichtraucherschutz
Bayern</a> Website, including the sign up office locations.</p><p>P.S. In other European countries, the introduction of a strong smoking ban has
led to a 10-15% decrease in heart attacks (20% for non-smokers). The German
constitutional court has also already ruled that the protection of non-smokers
and employees from passive smoke outweighs the individual's freedom
to smoke in enclosed spaces.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/politics</category>
   <pubDate>Thu, 19 Nov 2009 17:18 GMT</pubDate>
</item>
<item>
   <title>DebConf 2011 in Munich</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009111601-debconf-to-munich</guid>
   <link>http://blog.drinsama.de/erich/en/2009111601-debconf-to-munich.html</link>
   <description><![CDATA[
<p>We'd like to host DebConf 2011 in Munich, Germany.</p><p>However, this is a far from trivial challenge:</p><p>Rent in Munich, in particular for conference rooms, is far from cheap. In my
opinion, unless we get some really big sponsor (and I'd still prefer spending
sponsor money to fund developer trips to the DebConf instead!), the only chance
we have is to get some rooms at the university.</p><p>However, given the developments of recent years (budget etc.), it has become
a lot more difficult to actually get rooms at the university for such events.
Unless the event is considered to be fully a part of the university's "work",
we might have to pay rent to the university. Which again isn't that affordable.</p><p>Anyway, if you are in Munich, working at one of the universities, or in any way
interested in supporting DebConf 2011 in Munich, please join the
<a href="http://lists.debconf.net/mailman/listinfo/debconf11-germany">DebConf11
Germany mailing list</a>. Also check our meetings scheduled on the 
<a href="http://wiki.debian.org/LocalGroups/DebianMuc">DebianMuc Wiki page</a>,
currently every Monday, 18:00, at the new LiMux offices in Sonnenstr.</p><p>P.S. There will also be a Bug Squashing Party in Munich at the end of November:
<a href="http://wiki.debian.org/BSP2009/Munich">Munich BSP November 2009</a>
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Mon, 16 Nov 2009 15:42 GMT</pubDate>
</item>
<item>
   <title>Facebook tweaks</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009102501-facebook-tweaks</guid>
   <link>http://blog.drinsama.de/erich/en/2009102501-facebook-tweaks.html</link>
   <description><![CDATA[
<p>Every time Facebook changes anything, people complain. Most of the time just
because something has changed, without actually knowing what changed.</p><p>The October layout change, for example, isn't actually that big. As far as I can
tell it's not much more than turning the "hot" items that were in the
right sidebar into a special tab (and breaking the refresh for the live feed,
but I guess they'll fix that soon). The "live" tab is basically all
information (see below for getting rid of certain restrictions); the "News"
tab tries to reduce this amount of information by only showing you certain
posts Facebook magic considers to be "important". If you are a heavy user you
will probably prefer the "Live" feed, if you are a casual Facebook user, go
with the "News" feed to have less <strike>crap</strike> posts to read.</p><p>Still, there are some things you should be aware of when you are a Facebook
user (not all of these are new):
<ul>
<li><b>Privacy settings</b> in Facebook. You really should have a look at
them. Be aware that if you put them all to the maximum, it may be next to
impossible for not-yet-friends to add you. So I recommend leaving the
"search privacy setting" to "all" but only have it show your photo and
the option to add you as friend. This is enough for people to be able to
actually add you as friend. (Setting it to "Friend of a Friend" was not
sufficient for me.)</li>
<li>Friend lists. These are really useful when you use Facebook for promoting
events. Make lists of regional and topic grouping, for example I have a
"swing dancers in Munich" list. If you bother people with irrelevant
invitations they'll just ignore or unfriend you!</li>
<li>Friend lists can also be used as a privacy device. I have a friend list
called "privacy" with only those I consider to be really close. Photo markers
etc. are restricted to these friends.</li>
<li>The "live" feed will not show all your friends. By default it is restricted
to 250 friends. The "Options" button at the end of the live feed page will
allow you to increase this setting to 5000 and allow you to check which friends'
activities Facebook had been hiding from you. (And indeed, some people who I
thought had stopped using Facebook had just been hidden from the
feed by Facebook ...)</li>
<li>I wrote a <a href="http://userscripts.org/scripts/show/44687">Greasemonkey
script</a> (Greasemonkey is a Firefox extension, but by now also available for
many other browsers, apparently even IE) to move the filters from the left to
the right column, expanding the main feed column. Much more sensible layout
IMHO.</li>
<li>Avoid using applications where possible, especially all those Quiz
applications and games. Remember: you will give the application access to most
of your data <em>and</em> most of your friends' data as well. So make sure you
can trust that application! (See this <a href="http://blog.aclu.org/2009/06/11/quiz-what-do-facebook-quizzes-know-about-you/">Post by the American Civil Liberties Union for details</a>)</li>
</ul></p><p>Also you should never forget that all the data you put online is hard to get
rid of again. Just don't put anything there you don't want everyone to know.
Facebook can be really powerful when used right, for example as a promotion
channel. But the way you should be using it is to first consider what
impression you want people to have of you, then try to present yourself this
way. Don't just throw everything that comes to your mind there. (This even more
applies to blogs and web sites, obviously, that don't have any privacy control)
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Sun, 25 Oct 2009 13:15 GMT</pubDate>
</item>
<item>
   <title>Friends update - LiveDash, HoneyWish, Amiando</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009090401-friends-update</guid>
   <link>http://blog.drinsama.de/erich/en/2009090401-friends-update.html</link>
   <description><![CDATA[
<p>A short update on some friends of mine.</p><p>First of all, <a href="http://people.ischool.berkeley.edu/~patrick/">Patrick F.
Riley</a> - I worked with him on some projects when I was visiting the UC
Berkeley, one of which was a predecessor to his latest thing:
<a href="http://www.livedash.com/">LiveDash</a>. It's really cool: it allows
you to search almost in realtime in TV feeds. It also live-indexes Twitter,
blogs, news sources etc.</p><p>Secondly, <a href="http://www.honeywish.net/">HoneyWish</a> (currently
only available in German) is a service for a "honeymoon travel gift list"
thing. It works like the traditional gift lists, except that instead of putting
all kinds of household stuff on it, there are all the parts of the honeymoon
trip on the gift list. This makes much more sense these days: people tend to
get married later; they might even be sharing a house for some time before
getting married. So they don't need much silverware anymore, but they for sure
will enjoy their honeymoon trip - so what could be a better gift for them?</p><p>Third, <a href="http://www.amiando.com/">Amiando</a>, a web-based ticketing
and event management service. Founded some years ago by some friends,
it has been growing and coming along nicely. Every now and then, it won some
award, many of them in the "top startup" category.</p><p>There are of course many more projects of friends I'd like to point out,
but these three definitely are highlights.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Fri, 04 Sep 2009 01:46 GMT</pubDate>
</item>
<item>
   <title>Embedding Flash: don't forget wmode="transparent"</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009090301-embedding-flash</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009090301-embedding-flash.html</link>
   <description><![CDATA[
<p>If you are doing a complex web layout (such as my
<a href="http://www.swingandthecity.com/">Swing and the City</a> layout
which features alpha-transparent fixed layers), and want to embed Flash
(e.g. on the <a href="http://www.swingandthecity.com/swing/">Was ist Swing?</a>
page - German: What is Swing), make sure you add the attribute
<tt>wmode="transparent"</tt> to your <tt>embed</tt> tag, and
<tt>&lt;param name="wmode" value="transparent"&gt;&lt;/param&gt;</tt> to
your object. Otherwise, a layer - in particular popup menus - might end up
below the flash.</p><p>This includes you, YouTube. In HD view, the user popup menu only has the
top 3.5 entries out of 5 accessible for me.</p><p>The following XSLT stylesheet can be used to find such embeds in a bunch
of XHTML files using the command line
<tt>xsltproc findNoWmode.xslt $( find -iname '*.html' )</tt>
<pre>
&lt;?xml version="1.0"?&gt;
&lt;xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:html="http://www.w3.org/1999/xhtml"&gt;
&lt;xsl:output omit-xml-declaration="yes" indent="no"/&gt;
&lt;xsl:template match="/"&gt;
  &lt;xsl:call-template name="t"/&gt;
&lt;/xsl:template&gt;
&lt;xsl:template name="t"&gt;
  &lt;xsl:copy-of select="//html:embed[not(@wmode) and (count(../html:param[@name='wmode']) = 0)]"/&gt;
&lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;
</pre></p><p>You can of course also write an XSLT stylesheet to insert the wmode statements
whenever there is none, to make transparent your default.</p><p>[Update: I've received comments that this comes at quite a performance
cost for Flash, and that this might be the reason why YouTube doesn't use it
- in particular for the HD videos. Also, it isn't supported by WebKit-based
browsers so far (so not by Safari either?), nor does it seem to work
in Gnash, an open-source Flash plugin. So you have to choose between multiple
evils if you are using Flash...]
</p>
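<p>As a rough illustration of the "insert the wmode statements" idea, here is
a sketch that uses the Java DOM API instead of XSLT (a deliberate swap, not
the stylesheet mentioned above). The class <tt>WmodeFixer</tt> and its
methods are hypothetical names. It assumes well-formed XHTML in the XHTML
namespace, and it assumes that the matching <tt>param</tt> belongs in the
<tt>object</tt> element directly enclosing the <tt>embed</tt>. Real files
with a DOCTYPE may additionally make the parser try to load the DTD, which
this sketch does not handle.</p>

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original post.
final class WmodeFixer {
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    /** Parses a namespace-aware XHTML document from a string. */
    static Document parse(String xhtml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XHTML", e);
        }
    }

    /** True if the object element already has a child param named "wmode". */
    static boolean hasWmodeParam(Element object) {
        for (Node c = object.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element
                    && "param".equals(c.getLocalName())
                    && "wmode".equals(((Element) c).getAttribute("name"))) {
                return true;
            }
        }
        return false;
    }

    /**
     * Adds wmode="transparent" to every embed without a wmode attribute and,
     * if the embed sits in an object without a wmode param, adds that param.
     * Returns the number of embed elements that were changed.
     */
    static int addWmode(Document doc) {
        NodeList embeds = doc.getElementsByTagNameNS(XHTML, "embed");
        int changed = 0;
        for (int i = 0; i < embeds.getLength(); i++) {
            Element embed = (Element) embeds.item(i);
            if (embed.hasAttribute("wmode")) {
                continue; // an explicit setting is left alone
            }
            embed.setAttribute("wmode", "transparent");
            changed++;
            Node parent = embed.getParentNode();
            if (parent instanceof Element
                    && "object".equals(parent.getLocalName())
                    && !hasWmodeParam((Element) parent)) {
                Element param = doc.createElementNS(XHTML, "param");
                param.setAttribute("name", "wmode");
                param.setAttribute("value", "transparent");
                parent.insertBefore(param, embed);
            }
        }
        return changed;
    }
}
```

<p>Calling <tt>WmodeFixer.addWmode(WmodeFixer.parse(text))</tt> changes the
document in memory only; writing it back to disk is left out here. Embeds
that already carry any wmode value are skipped, so running it twice changes
nothing the second time.</p>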
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Thu, 03 Sep 2009 21:17 GMT</pubDate>
</item>
<item>
   <title>Swing and the City relaunched</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009080601-satc-relaunched</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009080601-satc-relaunched.html</link>
   <description><![CDATA[
<p>We've opened a completely redesigned
<a href="http://www.swingandthecity.com/">Swing and the City</a> web site
today. The layout was quite a pain to get working because of transparency
and non-scrolling parts. But on my last tests, it was working quite well
in all of the major browsers. But if you notice any issue, please tell me
(email: erich AT debian org)</p><p>I'm aware that the red-yellow border on the left doesn't line up right.
I'm waiting for fixed graphics from the designer for that. There is also a
glitch with clicking the logo when scrolled just a little bit down. These
are on my to-do. At some point I also want to increase the use of CSS spriting
to further reduce page load times. Oh, and Internet Explorer sucks, btw.</p><p>The web site is about Swing dancing in Munich (so no tech today), and at
this time only in German. At a later stage, we might add English, too.</p><p>During August we'll also be building our own studio, "Cats Corner", which will
actually be somewhat similarly decorated. :-) Congratulations to Christine
for doing all that for the Lindy Hop scene!</p><p>P.S. <a href="http://www.bringdownie6.com/" rel="nofollow">Bring Down
IE6.com</a>, <a href="http://www.ie6nomore.com/" rel="nofollow">IE6 No
more.com</a></p><p>P.P.S. See <a href="http://blog.drinsama.de/erich/en/web/2009042801-internet-exploder.html">this blog post</a> on how it is impossible to use the CSS "clip" property in a way that both IE7 and IE8 will understand. While only one is W3C standard, Firefox just accepts both ... but at least IE8 goes with the official standard now.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Thu, 06 Aug 2009 13:38 GMT</pubDate>
</item>
<item>
   <title>LoOP: Local Outlier Probabilities accepted at CIKM'09</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009072701-loop-accepted</guid>
   <link>http://blog.drinsama.de/erich/en/2009072701-loop-accepted.html</link>
   <description><![CDATA[
<p><blockquote>
Hans-Peter Kriegel, Peer Kröger, Erich Schubert and Arthur Zimek<br />
<b>LoOP: Local Outlier Probabilities</b><br />
</blockquote>
has been accepted at CIKM 2009 (The 18th ACM Conference on Information and
Knowledge Management), November 2-6, 2009, Hong Kong, and will appear in
the conference proceedings published by the ACM Press.</p><p>It's an outlier detection method based on LOF (Local Outlier Factor) but a
bit more statistically robust and with an easier to interpret score. Given
the statistical backing, it works reasonably well on samples such as data pages
of an appropriate index structure, reducing complexity to linear for the
approximative version.</p><p>This publication is a bit special to me: I suggested the approach to my
colleagues and they gave me the abstract and title for my birthday. :-)
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Mon, 27 Jul 2009 09:19 GMT</pubDate>
</item>
<item>
   <title>At the SSTD09 conference</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009070901-sstd09</guid>
   <link>http://blog.drinsama.de/erich/en/2009070901-sstd09.html</link>
   <description><![CDATA[
<p>This week I'm at the
<a href="http://sstd09.cs.aau.dk/">11th International Symposium on Spatial and
Temporal Databases</a> in Aalborg, Denmark.</p><p>My demo was yesterday, titled:
<blockquote>
Elke Achtert, Thomas Bernecker, Hans-Peter Kriegel, Erich Schubert, Arthur Zimek:<br/>
<b>ELKI in Time: ELKI 0.2 for the Performance Evaluation of Distance Measures for Time Series.</b>
</blockquote></p><p>While this release visibly only adds a small piece - some distance functions
for time series and some related visualization code - it still marks a major
milestone in <a href="http://www.dbs.ifi.lmu.de/research/KDD/ELKI/">ELKI</a>
development. Large parts of ELKI were reorganized and rewritten (such as all
the output handling code) and lots of stuff added, including a lot of
visualization related code that is not yet completely used in this release.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Thu, 09 Jul 2009 20:56 GMT</pubDate>
</item>
<item>
   <title>Swing music at Amazon</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/dancing/2009060901-amazon-swing-music</guid>
   <link>http://blog.drinsama.de/erich/en/dancing/2009060901-amazon-swing-music.html</link>
   <description><![CDATA[
<p>As many will know, Swing dancing and music have become my big hobby and love.
I'm co-teaching classes every week, and of course people ask me where to get
some music to dance to.</p><p>For this, I'm trying out the Amazon "aStore" functionality. Basically you
set up a few categories and add Amazon products to these categories for people
to choose from. Of course Amazon will also show other products it considers
relevant etc.</p><p><a href="http://astore.amazon.de/vitavonni-21">My Swing music on Amazon
"store"</a> (on Amazon.de, but I guess it will also somehow take you to other
Amazon sites?)</p><p>The (editor) UI is not very convincing yet. For example, it lacks an obvious
way of moving "products" from one category to another, and you can't see more
than 9 entries on a page in the editor, reordering is via entering sequence
numbers etc. - that definitely could use some improvements.</p><p>Anyway, some people might find this useful.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/dancing</category>
   <pubDate>Tue, 09 Jun 2009 16:56 GMT</pubDate>
</item>
<item>
   <title>Fun with Wolfram Alpha</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009053101-fun-with-wolfram-alpha</guid>
   <link>http://blog.drinsama.de/erich/en/2009053101-fun-with-wolfram-alpha.html</link>
   <description><![CDATA[
<p><a href="http://www.wolframalpha.com/">Wolfram Alpha</a> was often hyped as
the latest and greatest search engine.</p><p>I wouldn't call it so. It's just a very minimalistic search frontend to a
nice database with lots of numerical facts.</p><p>Yes, it can give you the height of the Eiffel Tower (because that's a fact
in its databases). It can even compute for you what Pi times the height of
the Eiffel Tower is. But that is about as far as you can go in combining.
In my tests, I wasn't able to compare the temperature in Munich with the
temperature in Berlin (both of which WA will visualize for you with a pretty
graph, so these are facts in WA) - their query parser just doesn't get my
question.</p><p>The funniest reply so far however was to the question:
<blockquote>
<a href="http://www.wolframalpha.com/input/?i=How+many+cars+in+germany%3F">How
many cars in Germany?</a>
</blockquote>
The answer of WA (which btw is copyrighted by WA):
<blockquote>
No
</blockquote></p><p>Seriously, I doubt that there are no cars in Germany.</p><p>At least it also offers an explanation why it comes to this conclusion:</p><p>Cars is a town in south-western France (which as you might guess currently is
not a part of Germany. :-) ) - so for WA, there are at least cars somewhere
in Europe, but not in Germany!
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Sun, 31 May 2009 22:26 GMT</pubDate>
</item>
<item>
   <title>If you're wondering why your circle is a diamond ...</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009052501-java-bug</guid>
   <link>http://blog.drinsama.de/erich/en/2009052501-java-bug.html</link>
   <description><![CDATA[
<p>... you might be bitten by this
<a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6294396">Java bug
rendering arcs as straight lines at large zoom levels</a>.</p><p>It looks like a classic to me: in order to improve rendering performance, you
approximate arcs with straight lines at small resolutions (if it's just 2
pixels big, nobody will be able to tell the difference). Except of course,
when you end up doing the same approximation at a large zoom value - of course
a 100-pixel circle looks different from a 100-pixel diamond.</p><p>Reported in 2005, still not fixed in current Java (we're in 2009 now).</p><p>Sun is <em>really slow</em> at fixing Java bugs.</p><p>See also a related
<a href="https://issues.apache.org/bugzilla/show_bug.cgi?id=41294">Apache Batik
bug report</a>. Fortunately, this only applies to Java rendered graphics -
SVG export, PDF, Postscript are all fine.
</p>
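<p>The suspected mechanism can be reproduced with the public
<tt>java.awt.geom</tt> API. This is only a sketch of the guess above, not
the actual code path inside the JDK renderer: it flattens a circle into
straight lines with a fixed tolerance ("flatness") of one unit and counts
the resulting segments. The class name <tt>ArcFlatteningDemo</tt> and the
chosen sizes are made up for the example.</p>

```java
import java.awt.geom.Ellipse2D;
import java.awt.geom.PathIterator;

// Hypothetical demo class; illustrates flattening, not the JDK bug itself.
final class ArcFlatteningDemo {
    /** Counts the straight line segments used to approximate a circle. */
    static int countLineSegments(double diameter, double flatness) {
        Ellipse2D circle = new Ellipse2D.Double(0, 0, diameter, diameter);
        // With a flatness argument, the iterator replaces each curve by lines
        // that stay within `flatness` units of the original curve.
        PathIterator it = circle.getPathIterator(null, flatness);
        double[] coords = new double[6];
        int lines = 0;
        for (; !it.isDone(); it.next()) {
            if (it.currentSegment(coords) == PathIterator.SEG_LINETO) {
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println("2-unit circle:   " + countLineSegments(2.0, 1.0) + " lines");
        System.out.println("100-unit circle: " + countLineSegments(100.0, 1.0) + " lines");
    }
}
```

<p>At a tolerance of one unit, the 2-unit circle collapses to four lines,
i.e. a diamond, which is fine at two pixels. If that same flattened path is
then scaled up by a zoom factor instead of being flattened again at the
final size, the diamond is what ends up on screen.</p>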
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Mon, 25 May 2009 15:16 GMT</pubDate>
</item>
<item>
   <title>Adobe GoLive question</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009051601-golive-question</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009051601-golive-question.html</link>
   <description><![CDATA[
<p>Is there any way to provide an alternate CSS stylesheet for GoLive CS2 only,
not for regular browsers? Because there are some things in that layout that
are too difficult for the GoLive renderer, it doesn't display them right.
The pages are still editable (just plain XHTML), it's just not looking
right in GoLive (advanced CSS).</p><p>The site already has alternate stylesheets for browsers such as the broken
Internet Explorers, so if I could convince GoLive to use their stylesheet it
might be looking a lot better in the editor, too ...</p><p>I am aware that GoLive CS2 has been abandoned in favor of DreamWeaver.
Still, it's going to be used in a project where I help with the web templates.</p><p>(Other options would be <a href="http://kompozer.net/">Kompozer</a> and
<a href="http://www.w3.org/Amaya/">Amaya</a>, but neither of them seems really
fit for production use: Amaya was just removed from Debian because it had
some security issues and the maintainers had the impression the code was such
a mess that there will be much more such issues. And Kompozer seemed to be a
mostly dead branch of a Gecko hack (although there has been a new alpha release
this year) ... is there some reliable open-source WYSIWYG (non-source) HTML editor that
I'm missing?)</p><p>P.S. Sorry, no comments in this blog. Use Email: erich AT debian ORG
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Sat, 16 May 2009 14:21 GMT</pubDate>
</item>
<item>
   <title>Java hacks: Generics and toArray</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009051201-java-hacks</guid>
   <link>http://blog.drinsama.de/erich/en/2009051201-java-hacks.html</link>
   <description><![CDATA[
<p>Arrays and Generics in Java do not mix very well. In order to create an
array, you need to know the object class the array is supposed to store.</p><p>Arrays in Java are special: they can efficiently store primitive data types.
The expected difference in efficiency between <tt>byte[]</tt> and
<tt>Byte[]</tt> is pretty big (of course some good VM might optimize) for
obvious reasons (think of: references, garbage collection, pointer sizes, ...).</p><p>This is probably why you need to know the type before creating an array
(because an array of primitive types such as <tt>byte</tt> will be different
from one that stores objects of some kind).</p><p>In particular, the following Java code
<pre>
  String[] foo = (String[]) new Object[0];
</pre>
results in a run time error ("[Ljava.lang.Object; cannot be cast to
[Ljava.lang.String;"). But it gets more confusing when you introduce generics:</p><p><pre>
public static &lt;T&gt; T[] test() {
  T[] te = (T[]) new Object[0];
  System.err.println(te.length);
  return te;
}

String[] foobar = test();
</pre>
will print "0", then throw the same runtime error in the <tt>foobar</tt> line.</p><p>What happens here is that in the <tt>test()</tt> method, <tt>T</tt> is actually
replaced with "<tt>Object</tt>" at compile time (type erasure). Thus the array type works just
fine, and so does the call to <tt>te.length</tt>.
Upon returning, it is then cast to <tt>String[]</tt>, which fails.</p><p>Now here comes a crazy Java hack:
<pre>
public static &lt;T&gt; T[] test(T... ts) {
  T[] te = (T[]) java.lang.reflect.Array.
      newInstance(ts.getClass().getComponentType(), 0);
  System.err.println(te.length);
  return te;
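}

String[] foobar = test();
</pre></p><p>The exception is gone, and <tt>foobar</tt> is of the proper type now! The compiler creates the (empty) varargs array <tt>ts</tt> with the type it inferred for <tt>T</tt> at the call site, here <tt>String[]</tt>, so its runtime component type can be read via reflection.</p><p>Discovering this hack led to these two methods: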
<pre>
public static &lt;T&gt; T[] newArrayOfNull(int len, T... ts) {
  // Varargs hack!
  return (T[]) java.lang.reflect.Array.
      newInstance(ts.getClass().getComponentType(), len);
}

public static &lt;T&gt; T[] toArray(Collection&lt;T&gt; coll, T... ts) {
  // Varargs hack!
  return coll.toArray(ts);
}
</pre></p><p>Notice how elegant the last method looks - and it finally allows you to do
<tt>toArray(collection)</tt> instead of
<tt>collection.toArray(new WhateverClassTheCollectionHas[0])</tt>.</p><p>Note that this is still a <em>hack</em>, and may or may not work with all Java
compilers, JREs and/or Java versions.</p><p>Update: Note that this 'hack' is also not transitive. The context calling
<tt>toArray</tt> needs to know the object type <em>at compile time</em>. So it
doesn't save you much more than writing "new KnownClass[0]" etc.</p><p>Update: So I'm actually not using this - it's just a hack, and often a
fragile one. The problem is that when you call e.g. toArray in a generic context,
it will actually create an array of "Object", so it makes much more sense
to verbosely specify the class you want to use for the arrays (and get some
reliability in use back).
</p>
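<p>A minimal sketch of what "verbosely specifying the class" could look like,
assuming an explicit <tt>Class&lt;T&gt;</tt> token is passed around. The class
name <tt>TypedArrays</tt> and the helper names are made up for illustration;
only <tt>java.lang.reflect.Array.newInstance</tt> and
<tt>Collection.toArray</tt> are existing API.</p>

```java
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class TypedArrays {
  /** Creates a new T[] of the given length from an explicit class token. */
  @SuppressWarnings("unchecked")
  public static <T> T[] newArray(Class<T> type, int len) {
    // Array.newInstance returns Object; the cast is safe because the
    // component type is exactly the (non-primitive) class we were given.
    return (T[]) Array.newInstance(type, len);
  }

  /** Copies a collection into an array whose component type is explicit. */
  public static <T> T[] toArray(Collection<? extends T> coll, Class<T> type) {
    return coll.toArray(newArray(type, coll.size()));
  }

  /** A generic caller: T is unknown here, so the token is passed along. */
  public static <T> T[] firstTwo(List<T> list, Class<T> type) {
    return toArray(list.subList(0, 2), type);
  }

  public static void main(String[] args) {
    String[] s = firstTwo(Arrays.asList("a", "b", "c"), String.class);
    // prints "String[] [a, b]"
    System.out.println(s.getClass().getSimpleName() + " " + Arrays.toString(s));
  }
}
```

<p>Unlike the varargs variant, this keeps working when the caller is itself
generic, because the token travels with the call instead of being inferred at
compile time.</p>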
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Tue, 12 May 2009 11:25 GMT</pubDate>
</item>
<item>
   <title>Dropbox experiences?</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009051201-dropbox</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009051201-dropbox.html</link>
   <description><![CDATA[
<p>Does anyone have experience with <a href="https://www.getdropbox.com/">Dropbox</a>?</p><p>It seems to be an interesting web storage service, with 2 GB of free storage.</p><p>However, the Linux client seems to be closed source
(which is understandable, it seems to have a lot of neat features) - so I
intend to use the web interface only (at least for now).</p><p>Update #2: There is a
<a href="http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=499874">RFP bug
for Debian</a>, <a href="http://www.getdropbox.com/downloading">some Source is
on the download site</a>. And while this source (except the images) is GPL, it's
just the nautilus integration part, not the daemon you also need.</p><p>Did you try Dropbox? Does it work well? I know some people (especially Windows
users) who could benefit a lot from a service like that, so I wonder if I
should recommend Dropbox to them. Or is there some better alternative (it should
allow sharing of files though - synchronization is not as essential, it is a
lot about exchanging files too large for usual email in small user groups;
still synchronization probably is a comfortable way of transferring the files
without having to think about it yourself)?</p><p>No comments in this blog - email me via erich AT debian ORG.</p><p>P.S.
<strike>I know there is some referral program to get more storage, feel free to
send me your referral link - I'll remove this PS once I've signed up.</strike></p><p>P.P.S. There also is <a href="https://ubuntuone.com/">Ubuntu One</a>, but as
far as I can tell it is Ubuntu-only so far. It looks very similar.</p><p>P.P.P.S. So far, I've received a lot of praise for Dropbox.</p><p>P^4.S.
<a href="https://www.getdropbox.com/referrals/NTExMjc1MDM5">My own referral
link</a>, feel free to use this to sign up (+256 MB for you, too!) and
"upgrade" my account.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Tue, 12 May 2009 08:18 GMT</pubDate>
</item>
<item>
   <title>Supporting Internet Exploder</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009042801-internet-exploder</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009042801-internet-exploder.html</link>
   <description><![CDATA[
<p>Quoting <a href="http://msdn.microsoft.com/en-us/library/ms530748(VS.85).aspx">MSDN</a> on the <a href="http://www.w3.org/TR/CSS2/visufx.html#clipping">CSS "clip"</a> property:
<blockquote>
As of Internet Explorer 8, the required syntax of the clip attribute is identical to that specified in the Cascading Style Sheets (CSS), Level 2 Revision 1 (CSS2.1) specification; that is, commas are now required between the parameters of the rect() value.<br/>
...<br/>
In Internet Explorer 7 and earlier (and in Internet Explorer 8 or later in IE7 mode, EmulateIE7 mode, or IE5 mode), the commas should be omitted.
</blockquote>
... and if you want to support both?</p><p>I see a few options:
<ul>
<li>Giving both rules, hoping that the browsers will just ignore one they can't
parse (and not revert to no clipping) - <em>untested</em></li>
<li>Use <a href="http://en.wikipedia.org/wiki/Conditional_comment">Conditional
Comments</a> to use a "ie7fixes.css" file (<em>works</em>)</li>
<li>Do not use clipping (the most commonly used solution)</li>
<li>Ignore the existence of Internet Exploder 7 and just follow the CSS 2.1
standard (that's the way it should be)</li>
</ul>
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Tue, 28 Apr 2009 11:09 GMT</pubDate>
</item>
<item>
   <title>phpBB3 Captcha cracked</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009042101-phpbb3-captcha-cracked</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009042101-phpbb3-captcha-cracked.html</link>
   <description><![CDATA[
<p><a href="http://www.darkseoprogramming.com/2008/07/21/phpbb3-code/">DarkSEO
has some code to attack phpBB3 captchas</a>. (Note: I didn't even look at the
code, it could be a virus or anything).</p><p>I do not find it very surprising that this has happened; most of the captchas
around are very naive, and I've seen multiple scientific articles detailing
how to attack various captchas. Many use colors and thin lines to make them
look hard, but after applying a naive energy function and doing some blurring
to remove the thin lines, they break down.</p><p><a href="http://recaptcha.net/">ReCaptcha</a> is quite interesting, because
it doesn't bother with some useless colorification that doesn't change
contrast. But I wonder whether it can be overrun by spammers and how well it will
scale. Still, I figure it is what I would pick right now, because they can
upgrade it if it actually is attacked by solvers.</p><p>It doesn't help much for the proxy attack on Captchas though (offer users to
view some pr0n in exchange for solving a Captcha that you actually were given
to solve by another site) - at least not when combined with some XSS and/or
bot net. (The 'obvious' proxy approach can be IP-filtered.)
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Tue, 21 Apr 2009 17:53 GMT</pubDate>
</item>
<item>
   <title>Web data time series</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009041701-timeseries-web-data</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009041701-timeseries-web-data.html</link>
   <description><![CDATA[
<p>For a research project, I'm looking for some real-world time series data.
Time series are interesting to study; however, it is hard to get access
to interesting non-trivial real-world data.</p><p>I was wondering if some people could contribute some summarized
web access data; no URLs or IP addresses.</p><p>The data I'd like to get can best be explained by the preprocessing step:
<pre>
... | perl -ne '/\[(\d+\/\w+\/\d{4}:\d\d):\d\d/
&& print $1."\n";' | sort | uniq -c
</pre>
(Sorry if you aren't fluent in regexp - it extracts the date and hour out of
an Apache default log file, nothing else. These lines are then summarized
by counting their unique appearances.)</p><p>That should produce 24 lines per day (one per hour), looking like this:
<pre>
  count day-of-month/month/year:hour
</pre></p><p>It would be cool if you could send me some series for a couple of sites, if you
happen to be in the position to provide this data. The data should cover at
least a few weeks, the longer the better, even up to a few years.</p><p>Sites that are too small are, however, not very useful, since they might be too noisy (so probably
not the personal home page of your mom). If you are providing a larger number
of series, you are of course free to include them.</p><p>I don't care much about what the site actually contains, I'd just ask you to
give a tiny amount of meta information:</p><p><ul>
<li>Server timezone if not UTC</li>
<li>Typical user timezone (if applicable, that mostly applies to 'regional'
sites)</li>
<li>Coarse classification of site (e.g. "product website", "web service",
"search engine", "company site", "OSS software site" or something like
that ...)</li>
<li>Redistribution permission (if possible; most likely I'll only use
data series where redistribution is allowed, since some conferences ask you
kindly to provide such material)</li>
<li>Your email address will be withheld, to increase anonymity of the data</li>
</ul></p><p><b>Data use:</b></p><p>The main project idea is to evaluate different distance metrics in their
capability of separating the different data sources, assuming that there is
some difference in the shape of these curves. A different problem can be
constructed by breaking the series into chunks covering approximately a day
and then trying to separate different days, starting hours of the series (or
offset server timezone vs. user timezone) and/or weekdays from weekends.</p><p>In our experiments, we've come to the conclusion that the experimental results
are most interesting when there is a sufficient number of classes; so I'd like
to get like 20 different interesting data series. At the same time, the series
should be long enough, so I can break them into multiple chunks to have a
reasonable number of 'sub-series' per class. If I have really long series, or
e.g. series covering the same site but from multiple servers, I could even
experiment with taking samples of different lengths from these sets.</p><p>(Say I have series covering 2 years, that is ~17k samples, from 3 servers, then
I can take 51 disjoint sub-series of length 1000, or 102 of length 500, ...)</p><p>But it's obviously not possible for me to collect this data myself - I don't
operate two dozen such sites myself ...</p><p>An extra project I've been considering for some time is some peak prediction for
web accesses. Say you're running some fast growing site, wouldn't it be useful
to have a prediction when the number of accesses will likely hit some magical
limit (and e.g. overload your server) so you can increase your capacity on
time? Of course it would be more sensible to apply this prediction e.g. onto
CPU usage, e.g. predicting when your system might hit 90% load average over
a 5 minute window in regular operation. Network bandwidth and disk IO also
come to mind. You get the idea.</p><p>Please send them via email to erich.schubert AT gmail com</p><p>Thank you.</p><p>[P.S. Already received the first series, thank you! I can take care of sorting
myself, no need to worry about that. And yes, I'm aware that the series will
probably all be quite similar - common computer usage patterns such as work
hours - but that is common in real world data and part of the challenge.
Separating apples from dinosaurs is not a challenge.]
</p>
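<p>For readers who prefer Java over Perl, the preprocessing step and the
sub-series arithmetic above can be sketched roughly as follows. This is only an
illustration: the class and method names are made up, the log lines are
invented sample data in the Apache default format, and "~17k" is rounded to
17000 samples per server.</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HourlyCounts {
  // Same pattern as the Perl one-liner: day/month/year:hour inside "[...]".
  private static final Pattern HOUR =
      Pattern.compile("\\[(\\d+/\\w+/\\d{4}:\\d\\d):\\d\\d");

  /** Counts log lines per hour; keys are sorted as strings, like sort | uniq -c. */
  public static Map<String, Integer> countPerHour(List<String> lines) {
    Map<String, Integer> counts = new TreeMap<String, Integer>();
    for (String line : lines) {
      Matcher m = HOUR.matcher(line);
      if (m.find()) {
        Integer old = counts.get(m.group(1));
        counts.put(m.group(1), old == null ? 1 : old + 1);
      }
    }
    return counts;
  }

  /** Disjoint chunks of chunkLength samples over several equally long series. */
  public static int disjointChunks(int samplesPerSeries, int seriesCount, int chunkLength) {
    // Integer division drops the incomplete chunk at the end of each series.
    return (samplesPerSeries / chunkLength) * seriesCount;
  }

  public static void main(String[] args) {
    // Invented sample lines, not real data.
    List<String> log = Arrays.asList(
        "192.0.2.1 - - [17/Apr/2009:16:01:22 +0000] \"GET / HTTP/1.0\" 200 512",
        "192.0.2.7 - - [17/Apr/2009:16:40:03 +0000] \"GET /a HTTP/1.0\" 200 90",
        "192.0.2.1 - - [17/Apr/2009:17:02:59 +0000] \"GET /b HTTP/1.0\" 404 0");
    for (Map.Entry<String, Integer> e : countPerHour(log).entrySet()) {
      System.out.println(e.getValue() + " " + e.getKey());
    }
    System.out.println(disjointChunks(17000, 3, 1000)); // 51
    System.out.println(disjointChunks(17000, 3, 500));  // 102
  }
}
```

<p>Note that the string-sorted keys are not in chronological order across
months, just as with the plain <tt>sort</tt> in the pipeline.</p>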
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Fri, 17 Apr 2009 16:01 GMT</pubDate>
</item>
<item>
   <title>Finding a web editor widget</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/web/2009041601-finding-an-edit-widget</guid>
   <link>http://blog.drinsama.de/erich/en/web/2009041601-finding-an-edit-widget.html</link>
   <description><![CDATA[
<p>I've <a href="http://blog.drinsama.de/erich/en/2009010701-blog-rewrite.html">previously mentioned my plans on redoing my blog</a>. Well, I've settled on some design issues already (posts will be stored as mini <a href="http://en.wikipedia.org/wiki/Atom_(standard)">Atom feeds</a>, which makes the generation of Atom feeds for the blog and categories trivial, and gives me maximum flexibility. I already have a working converter for my existing blog to Atom posts.)</p><p>Generating static HTML pages from that will be easily possible using an XSLT
transformation (for example), and I got the feedback that I could just use
<a href="http://www.pixelbeat.org/docs/web/comments/">Google AppEngine for
blog comments</a>, so my actual blog could remain static-only (and thus much
more secure and reliable). Any attack/spammer/bot can then only kill the
comment functionality, not my own site.</p><p>Which brings me to another design consideration: the editing widget. Either for
the blog comment application, for writing my own blog entries (via a https
protected script or whatever) or maybe for a small CMS I've been thinking about
- having a reliable HTML in-browser editing widget is something I could use
every now and then (well, I'm not doing much Web stuff anymore these days).</p><p><a href="http://geniisoft.com/showcase.nsf/WebEditors">Geniisoft has a good
overview over in-browser (aka: Through The Web, TTW) editors</a>. The top
candidates seem to be:
<ul>
<li><a href="http://www.fckeditor.net/">FCKeditor</a></li>
<li><a href="http://tinymce.moxiecode.com/">TinyMCE</a></li>
<li><a href="http://trac.xinha.org/">Xinha</a></li>
</ul></p><p>I've heard before of FCKeditor and TinyMCE; I think I've been on the Xinha
page before, too. However, comments on them have not always been good.</p><p>To some extent they all seem to suffer from
<a href="http://en.wikipedia.org/wiki/Creeping_featurism">feature creep</a>
which usually is a bad sign - most often this means that there are security
issues in one or another module or plugin.</p><p>TinyMCE for example has been described as "a bit of a pain" and "a tad
clumsy" on the GSoC mailing list. I have had fewer fights with it than
with the Debian wiki's (MoinMoin) markup language though.</p><p>I'm not looking for anything as big as these - I just need an editor that
allows for some basic formatting (bold etc., links) and that produces
reasonable XHTML output. I'll be feeding the output through some custom
cleanup script anyway, which will kill disallowed code. So I don't want
any editor which allows the user to create code that will then be killed
afterwards.</p><p>Any personal experiences with any of these, or an important alternative that
I might have missed (no PHP involved, please!) - email me (no comments on
blog) at erich AT debian org.</p><p>Update: I've received a couple of pointers. Please don't send me links to
projects that are not actively maintained anymore - I don't want to care about
having to fix bugs in the editor widget myself.</p><p>One link I've received twice is actually quite impressive:
<a href="http://www.wymeditor.org">WYMeditor</a>. It doesn't try to look like a
word processor, but actually is more of a semantic editor. Much more what I'm
looking for than any of the others. I've also received a link to the
<a href="http://developer.yahoo.com/yui/editor/">Yahoo! UI Library Editor</a>,
which is quite clutter-free, but in the default setup at least very
text-formatting oriented, not very semantic (that doesn't mean you couldn't
change it that way, I guess you can). I was also pointed to Dojo, but that
framework suffers from total feature creep (which also explains why it loads so slowly
I guess), and the last time I looked at its source code, I
<a href="http://blog.drinsama.de/erich/en/xml/2007022102-dojo-dailywtf.html">had
some WTF moments</a> - code quality at least outside of core doesn't seem to
be very high (Yes, that code implements the "mod 7" operation using list shifts
instead of a simple arithmetic operation).</p><p>Looks like I'll give the first try to WYMeditor. Update #2: the code seems to
be rather ... complex. I'm looking for something neat and clean; it doesn't
need to bring along yet another XHTML schema and validator ... Maybe I should
try one of the others first?
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/web</category>
   <pubDate>Thu, 16 Apr 2009 14:51 GMT</pubDate>
</item>
<item>
   <title>Java: 'base' classes and 'final' modifier</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009033002-java-baseclasses-and-final</guid>
   <link>http://blog.drinsama.de/erich/en/2009033002-java-baseclasses-and-final.html</link>
   <description><![CDATA[
<p>In a Java framework I'm working on, 'pairs' arise everywhere. Unfortunately in
contrast to e.g. C++, Java doesn't include a predefined 'pair' class.
C++ templates are really nice because of the way they are compiled and
optimized (in particular they also handle what Java calls 'primitive types');
Java generics aren't up to par with that. (But yes, Java offers other benefits,
such as being much easier to parse and thus refactor). Anyway, this is not
going to be a rant on Generics.</p><p>So I have this interface in Java called <tt>Pair&lt;FIRST,SECOND&gt;</tt>
along with two implementations <tt>SimplePair&lt;FIRST,SECOND&gt;</tt> and
<tt>ComparablePair&lt;FIRST extends Comparable&lt;FIRST&gt;,SECOND extends Comparable&lt;SECOND&gt;&gt;</tt>.</p><p>For performance reasons, <tt>SimplePair</tt> is declared '<tt>final</tt>', and
so is <tt>ComparablePair</tt>. It's written everywhere that making classes
final can make a large difference in Java, and since these objects will be used
in a lot of places, it seems reasonable to care about this here.</p><p>However, it would often be nice to have more readable code: assuming
I'm using <tt>SimplePair&lt;Monkey,Banana&gt;</tt>, it would be nice to
make a derived class <tt>BananaPreference extends
SimplePair&lt;Monkey,Banana&gt;</tt>, with added methods <tt>getMonkey()</tt>
and <tt>getPreferredBanana()</tt> to make the resulting code more readable. A <tt>final</tt> class cannot be extended, though.</p><p>Having readable code is often just as important as having performant
code, after all ...</p><p>If someone with solid experience in Java optimization has some ideas to share,
please do so! Email: erich AT debian DOT org - no comments in blog.</p><p>Right now, I have one idea on how it could be possible to achieve both
(seriously, I could use some feedback from Java Gurus on that): make
SimplePair and ComparablePair abstract, all methods there final, then derive
final classes as needed. Does that combine the benefits?</p><p>[Update: I received from Joachim Sauer the following helpful link:
<a href="http://gceclub.sun.com.cn/java_one_online/2005/TS-3268/ts-3268.pdf">JavaOne presentation on performance tuning and various VMs</a>. Basically this seems to indicate that in all these common situations, any modern Java VM should be able to figure out the inlining options automatically and optimize appropriately, so it won't benefit from any "final" hint by the developer. Note that a C++
compiler doesn't do runtime optimization, but allows compile time optimization at a much lower level, so this rule doesn't apply to C++.]
</p>
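<p>The "abstract base with final methods, final derived classes" idea could be
sketched like this. It is only an illustration of the structure, not a
performance claim: <tt>AbstractPair</tt>, <tt>PairDemo</tt> and the tiny
<tt>Monkey</tt> and <tt>Banana</tt> classes are made-up names for the
example.</p>

```java
public class PairDemo {
  /** Abstract base: state and accessors are final, only the leaf classes vary. */
  abstract static class AbstractPair<FIRST, SECOND> {
    private final FIRST first;
    private final SECOND second;

    protected AbstractPair(FIRST first, SECOND second) {
      this.first = first;
      this.second = second;
    }

    public final FIRST getFirst() { return first; }
    public final SECOND getSecond() { return second; }
  }

  // Hypothetical domain types, following the Monkey/Banana example.
  static final class Monkey {
    final String name;
    Monkey(String name) { this.name = name; }
  }

  static final class Banana {
    final String kind;
    Banana(String kind) { this.kind = kind; }
  }

  /** Final leaf class that only adds readable accessor names. */
  static final class BananaPreference extends AbstractPair<Monkey, Banana> {
    BananaPreference(Monkey m, Banana b) { super(m, b); }
    public Monkey getMonkey() { return getFirst(); }
    public Banana getPreferredBanana() { return getSecond(); }
  }

  public static void main(String[] args) {
    BananaPreference p =
        new BananaPreference(new Monkey("Bobo"), new Banana("plantain"));
    // prints "Bobo prefers plantain"
    System.out.println(p.getMonkey().name + " prefers " + p.getPreferredBanana().kind);
  }
}
```

<p>Whether the <tt>final</tt> modifiers buy anything at run time is exactly the
open question; the sketch only shows that readable names and non-overridable
accessors can coexist.</p>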
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Mon, 30 Mar 2009 13:43 GMT</pubDate>
</item>
<item>
   <title>Google Summer of Code 2009</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/linux/debian/2009033001-summer-of-code-2009</guid>
   <link>http://blog.drinsama.de/erich/en/linux/debian/2009033001-summer-of-code-2009.html</link>
   <description><![CDATA[
<p>Just a short reminder that the application phase for the
<a href="http://code.google.com/soc/">Google Summer of Code 2009</a> is running.</p><p><a href="http://code.google.com/soc/" class="img" ><img src="http://code.google.com/images/2009socwithlogo.gif" width="300" height="200" border="0" alt="GSoC 2009 logo" /></a></p><p>So far, we have quite <em>few</em> applications. Deadline is April 3rd, 19:00
UTC. Usually applications arrive rather late, but still I have the impression
that we have far fewer than in previous years. But less copy &amp; paste, too.</p><p>If you are interested in doing a GSoC project at Debian:
<ul>
<li><a href="http://wiki.debian.org/SummerOfCode2009">Check the Debian Wiki</a>
which has all kinds of relevant information.</li>
<li>Talk to Debian people</li>
<li>Make sure it's related to Debian (and not just "runs on Linux")</li>
<li>Talk to Debian people</li>
<li>Make sure your application shows your genuine interest and has some
original ideas, <em>copy &amp; paste will not be sufficient</em></li>
<li>Talk to Debian people</li>
</ul>
I hope to see more applications - and good luck that we get enough slots for
all of you!</p><p>P.S. as far as I can tell, current Debian Developers can be eligible as well,
although it has also always been a goal of the project to get new contributors
involved.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/linux/debian</category>
   <pubDate>Mon, 30 Mar 2009 12:43 GMT</pubDate>
</item>
<item>
   <title>On Facebook's new layout</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009032001-facebook-new-layout</guid>
   <link>http://blog.drinsama.de/erich/en/2009032001-facebook-new-layout.html</link>
   <description><![CDATA[
<p>Lots of people complain about the new layout, but I guess it's mostly
because it has actually changed. I'm okay with it, since it seems to feed me
less irrelevant information and more content deliberately generated by the
users.</p><p>Anyway, what I'm more interested in is the technical background. Over the last
months, I saw lots of instability in Facebook, and it often gave the
impression of being next to breaking down because of load. With the redesign,
it feels more stable to me. Given that the 'mobile' page still seems to be
pretty much the same, they still seem to be aggregating the same information,
and the difference is just in the front page.</p><p>Guess their servers just weren't able to handle all the 'live feed' access
for all their users. Maybe they'll get some database experts to redesign that
feature and bring it back; after all it seems to be one of the things people most
miss in the new layout.</p><p>P.S.
Add this to your user stylesheet to hide some ads:
<pre>
@-moz-document domain(facebook.com) {
.UIHotStory_Ad { display: none; }
#profile_sidebar_ads, #sidebar_ads { display: none; }
}
</pre>
And <a href="http://userscripts.org/scripts/show/44687">install this small
greasemonkey script I wrote</a> to turn the front page into a two-column
layout (filters are moved to the right column) that is more efficient in using
the screen's real estate.
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Fri, 20 Mar 2009 10:13 GMT</pubDate>
</item>
<item>
   <title>Weird Sansa Bug with 20+ playlists</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/2009021701-weird-sansa-bug</guid>
   <link>http://blog.drinsama.de/erich/en/2009021701-weird-sansa-bug.html</link>
   <description><![CDATA[
<p>On my Sansa Clip, a playlist that worked fine before suddenly stopped working.
I couldn't find anything wrong with the list: it had DOS line separators,
DOS directory separators and relative paths (all stuff that Sansa needs,
unfortunately). I had generated proper EXTINF meta data lines, the list was
called .m3u8 to signal that it's UTF-8. And after all: it had worked before.</p><p>After deleting a leftover playlist, suddenly I noticed with another playlist
that had worked the day before that it was now empty. Ouch. So I investigated
a bit more...</p><p>At some point, I ran across a post on a similar Sansa device titled
<blockquote>
Firmware Bug - Playlists numbers 43, 46, and 49 are always empty 
</blockquote></p><p>This is also what I'm seeing on my device. Certain 'slots' in the playlist
list don't work.</p><p>No, I'm not kidding.</p><p>The way I 'fixed' my non-working playlist was this: I duplicated the file.
Now the duplicate is on the broken slot, and the old copy is working again.
Unfortunately, this also has shifted the later playlists into other slots, so
I'll have to repeat this for the other affected playlists. I guess I'll just
write a script to keep duplicate copies of all my playlists, since apparently
no two consecutive slots are affected.</p><p>Broken slots on my Sansa Clip v2: 22, 25, 28, 34, 37, 40 (I have 59 lists
currently on the player, so there might be more broken 'blocks' somewhere)</p><p>[Update: Sansa Firmware 02.01.32 says in the 'bugs fixed': Playlists redirect
to GoList when there are many playlists on device, so this might be fixed now]
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich"></category>
   <pubDate>Tue, 17 Feb 2009 13:08 GMT</pubDate>
</item>
<item>
   <title>Congratulations, Debian</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/linux/debian/2009021601-congratulations-debian</guid>
   <link>http://blog.drinsama.de/erich/en/linux/debian/2009021601-congratulations-debian.html</link>
   <description><![CDATA[
<p><img src="http://debian.org/Pics/lennybanner_indexed.png" alt="Debian Lenny Banner" />
<b>Congratulations to all developers</b> (DDs or not, we have sponsored uploads,
Debian contributors and such, too!) who contributed to the release of Debian
GNU/Linux "lenny" 5.0. I must admit that I've been largely inactive recently,
I just managed to keep the bugs on my remaining packages low. Funnily, just
the day lenny was released I learned about a bug in Enigma on AMD64 that is
probably worth fixing through proposed updates ...
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/linux/debian</category>
   <pubDate>Mon, 16 Feb 2009 17:33 GMT</pubDate>
</item>
<item>
   <title>Flash on Linux</title>
   <guid isPermaLink="false">http://blog.drinsama.de/erich/en/linux/2009013001-flash-on-linux</guid>
   <link>http://blog.drinsama.de/erich/en/linux/2009013001-flash-on-linux.html</link>
   <description><![CDATA[
<p>... is almost as bad as ever before.</p><p>On my Core 1 Duo system (32 bit), official Adobe Flash crashes my browser
when I close a tab which had a Flash plugin running. Going to a blank page
then closing the tab usually helps, but it seems that sometimes Flash continues
to run in the background (sound doesn't stop either) and then will still crash.</p><p>On my AMD64 system, the official Adobe plugin crashes my browser. There are
reports at Adobe that link it to Gmail. So here, the Adobe flash is unusable.
The 32 bit version via nspluginwrapper did not have sound for me, probably
some issue with Pulseaudio.</p><p>So I'm now trying out Gnash. First thing I noticed: it has FlashBlock built in,
all those stupid Flash things won't auto-run but I'll always have the nice
play button to enable them when I want them to run. And while Gnash is working
pretty well on most sites, every now and then something just does not work.
Like some YouTube movies not playing (usually very short ones - maybe it will
only start playing a video when the buffer was completely filled, and a video
which is smaller than the buffer will thus not play?) etc.</p><p>Some wishlist items:
<ul>
<li>Add an 'auto run whitelist' to Gnash; when I go to YouTube I usually want to
actually run the video.</li>
<li>Provide some 'fall through' option, so if the Flash doesn't work right in
Gnash I can pass it on to the Adobe plugin if I really need to.</li>
</ul>
I know that the latter won't be easy, but isn't nspluginwrapper doing something
like that?
</p>
]]></description>
   <category domain="http://blog.drinsama.de/erich">/linux</category>
   <pubDate>Fri, 30 Jan 2009 12:12 GMT</pubDate>
</item>
</channel>
</rss>
