Vitavonni

Tue, 28 Apr 2009

Supporting Internet Exploder

Quoting MSDN on the CSS "clip" property:

As of Internet Explorer 8, the required syntax of the clip attribute is identical to that specified in the Cascading Style Sheets (CSS), Level 2 Revision 1 (CSS2.1) specification; that is, commas are now required between the parameters of the rect() value.
...
In Internet Explorer 7 and earlier (and in Internet Explorer 8 or later in IE7 mode, EmulateIE7 mode, or IE5 mode), the commas should be omitted.
... and if you want to support both?

I see a few options:

  • Give both rules, hoping that browsers will simply ignore the one they can't parse (and not fall back to no clipping) - untested
  • Use Conditional Comments to load an "ie7fixes.css" file (works)
  • Do not use clipping (the most commonly used solution)
  • Ignore the existence of Internet Exploder 7 and just follow the CSS 2.1 standard (that's the way it should be)

[category: /en/web | Permalink]

Tue, 21 Apr 2009

phpBB3 Captcha cracked

DarkSEO has some code to attack phpBB3 captchas. (Note: I didn't even look at the code; it could be a virus or anything.)

I do not find it very surprising that this has happened: most of the captchas around are very naive, and I've seen multiple scientific articles detailing how to attack various captchas. Many use colors and thin lines to make them look hard, but after applying a naive energy function and some blurring to remove the thin lines, they break down.

ReCaptcha is quite interesting, because it doesn't bother with useless coloring that doesn't change the contrast. But I wonder whether it can be overrun by spammers and how well it will scale. Still, I figure it is what I would pick right now, because they can upgrade it if it actually is attacked by solvers.

It doesn't help much against the proxy attack on Captchas though (offering users some pr0n in exchange for solving a Captcha that another site actually gave you to solve) - at least not when combined with some XSS and/or a botnet. (The 'obvious' proxy approach can be IP-filtered.)

[category: /en/web | Permalink]

Fri, 17 Apr 2009

Web data time series

For a research project, I'm looking for some real-world time series data. Time series are interesting to study, but it is hard to get access to interesting, non-trivial real-world data.

I was wondering if some people could contribute some summarized web access data; no URLs or IP addresses.

The data I'd like to get can best be explained by the preprocessing step:

... | perl -ne '/\[(\d+\/\w+\/\d{4}:\d\d):\d\d/
&& print $1."\n";' | sort | uniq -c
(Sorry if you aren't fluent in regexp - it extracts the date and hour out of an Apache default log file, nothing else. These lines are then summarized by counting their unique appearances.)

That should produce 24 lines per day (one per hour), looking like this:

  count day-of-month/month/year:hour
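
For anyone who would rather not use Perl, here is a minimal Python sketch of the same preprocessing step. It reuses the regular expression from the one-liner above; the function name and the sample log lines are made up for illustration, and the exact column spacing of `uniq -c` is not reproduced.

```python
import re
from collections import Counter

# Same pattern as the Perl one-liner: capture "dd/Mon/yyyy:hh" from the
# bracketed timestamp of an Apache default log line.
HOUR_RE = re.compile(r"\[(\d+/\w+/\d{4}:\d\d):\d\d")


def hourly_counts(lines):
    """Count log lines per date-and-hour bucket (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = HOUR_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.1 - - [17/Apr/2009:10:15:32 +0200] "GET / HTTP/1.1" 200 512',
    '192.0.2.2 - - [17/Apr/2009:10:59:01 +0200] "GET /a HTTP/1.1" 200 128',
    '192.0.2.3 - - [17/Apr/2009:11:00:07 +0200] "GET /b HTTP/1.1" 404 0',
]

# Print "count bucket" pairs, sorted by bucket string like `sort` would.
for bucket, count in sorted(hourly_counts(sample).items()):
    print(count, bucket)
```

For a real log, pass an open file object instead of the sample list, e.g. `hourly_counts(open("access.log"))`, where the file name is of course just an example.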

It would be cool if you could send me some series for a couple of sites, if you happen to be in the position to provide this data. The data should cover at least a few weeks, the longer the better even up to a few years.

Very small sites, however, are not very useful, since their data is likely to be too noisy (so probably not the personal home page of your mom). If you are providing a larger number of series anyway, you are of course free to include them.

I don't care much about what the site actually contains, I'd just ask you to give a tiny amount of meta information:

  • Server timezone if not UTC
  • Typical user timezone (if applicable, that mostly applies to 'regional' sites)
  • Coarse classification of site (e.g. "product website", "web service", "search engine", "company site", "OSS software site" or something like that ...)
  • Redistribution permission (if possible; most likely I'll only use data series where redistribution is allowed, since some conferences ask you kindly to provide such material)
  • Your email address will be withheld, to increase anonymity of the data

Data use:

The main project idea is to evaluate different distance metrics by their capability of separating the different data sources, assuming that there is some difference in the shape of these curves. A different problem can be constructed by breaking the series into chunks covering approximately a day and then trying to separate different days, starting hours of the series (or the offset between server timezone and user timezone) and/or weekdays from weekends.
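
To make "distance metrics" a bit more concrete, here is a small sketch comparing two textbook distances on toy curves. The choice of Euclidean and Manhattan distance, the normalization to sum 1, and all numbers are assumptions for illustration; they are not necessarily the metrics or preprocessing used in the project.

```python
import math


def normalize(series):
    """Scale a series to sum 1 so that shape, not volume, is compared."""
    total = sum(series)
    return [v / total for v in series]


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length series."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Made-up 4-sample "curves"; real chunks would hold about 24 hourly counts.
site_a = normalize([10, 40, 40, 10])
site_a_busier = normalize([20, 80, 80, 20])  # same shape, twice the traffic
site_b = normalize([40, 10, 10, 40])         # different shape

for name, dist in (("euclidean", euclidean), ("manhattan", manhattan)):
    print(name, dist(site_a, site_a_busier), dist(site_a, site_b))
```

A metric that separates sources well should give a small distance for the first pair (same shape) and a clearly larger one for the second.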

In our experiments, we've come to the conclusion that the experimental results are most interesting when there is a sufficient number of classes, so I'd like to get about 20 different interesting data series. At the same time, the series should be long enough that I can break them into multiple chunks and have a reasonable number of 'sub-series' per class. If I have really long series, or e.g. series covering the same site but from multiple servers, I could even experiment with taking samples of different lengths from these sets.

(Say I have series covering 2 years, that is ~17k samples, from 3 servers, then I can take 51 disjoint sub-series of length 1000, or 102 of length 500, ...)
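
The arithmetic in that example can be checked with a short sketch. The helper name is made up, and the rounded figure of 17,000 samples per server is taken from the "~17k" above; with the exact 2 × 365 × 24 = 17,520 hourly samples, the second count would come out as 105 rather than 102.

```python
def disjoint_chunks(series, length):
    """Split a series into disjoint chunks of `length`, dropping the remainder."""
    usable = len(series) - len(series) % length
    return [series[i:i + length] for i in range(0, usable, length)]


HOURS = 17_000  # "~17k" hourly samples per server, rounded as in the text
SERVERS = 3

for length in (1000, 500):
    # Chunks are cut per server, so a chunk never spans two servers.
    per_server = len(disjoint_chunks(list(range(HOURS)), length))
    print(length, per_server * SERVERS)
```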

But it's obviously not possible for me to collect this data myself - I don't operate two dozen such sites ...

An extra project I've been considering for some time is peak prediction for web accesses. Say you're running some fast-growing site: wouldn't it be useful to have a prediction of when the number of accesses will likely hit some magical limit (and e.g. overload your server), so you can increase your capacity in time? Of course it would be more sensible to apply this prediction to e.g. CPU usage, predicting when your system might hit a 90% load average over a 5-minute window in regular operation. Network bandwidth and disk IO also come to mind. You get the idea.
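
As a very rough illustration of the idea, here is a sketch that fits a straight line to past measurements and extrapolates to the point where it reaches a limit. The linear trend model, the function names and the numbers are all assumptions made for this example; real traffic has daily and weekly patterns that a straight line ignores.

```python
def fit_line(values):
    """Least-squares fit of values[i] ~ intercept + slope * i (needs >= 2 values)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def steps_until_limit(values, limit):
    """Index at which the fitted line reaches `limit`, or None if it never does."""
    intercept, slope = fit_line(values)
    if slope <= 0:
        return None  # flat or falling trend: the limit is never reached
    return (limit - intercept) / slope


# Made-up daily peak values (e.g. requests per second), growing steadily.
daily_peaks = [100, 110, 120, 130, 140]
print(steps_until_limit(daily_peaks, 200))
```

The returned index is counted from the first measurement, so the number of steps left is that index minus the index of the latest sample.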

Please send them via email to erich.schubert AT gmail com

Thank you.

[P.S. Already received the first series, thank you! I can take care of sorting myself, no need to worry about that. And yes, I'm aware that the series will probably all be quite similar - common computer usage patterns such as work hours - but that is common in real-world data and part of the challenge. Separating apples from dinosaurs is not a challenge.]

[category: /en/web | Permalink]

Thu, 16 Apr 2009

Finding a web editor widget

I've previously mentioned my plans for redoing my blog. Well, I've already settled some design issues: posts will be stored as mini Atom feeds, which makes generating the Atom feeds for the blog and its categories trivial and gives me maximum flexibility. I already have a working converter from my existing blog to Atom posts.

Generating static HTML pages from that will be easy using an XSLT transformation (for example), and I got the feedback that I could just use Google AppEngine for blog comments, so my actual blog could remain static-only (and thus much more secure and reliable). Any attack/spammer/bot can then only kill the comment functionality, not my own site.

Which brings me to another design consideration: the editing widget. Whether for the blog comment application, for writing my own blog entries (via an HTTPS-protected script or whatever) or maybe for a small CMS I've been thinking about - a reliable in-browser HTML editing widget is something I could use every now and then (well, I'm not doing much Web stuff anymore these days).

Geniisoft has a good overview over in-browser (aka: Through The Web, TTW) editors. The top candidates seem to be:

I've heard before of FCKeditor and TinyMCE; I think I've been on the Xinha page before, too. However, comments on them have not always been good.

To some extent they all seem to suffer from feature creep, which is usually a bad sign - most often it means that there are security issues in one module or plugin or another.

TinyMCE, for example, has been described as "a bit of a pain" and "a tad clumsy" on the GSoC mailing list. I have had fewer fights with it than with the Debian wiki's (MoinMoin) markup language, though.

I'm not looking for anything as big as these - I just need an editor that allows for some basic formatting (bold etc., links) and that produces reasonable XHTML output. I'll be feeding the output through some custom cleanup script anyway, which will kill disallowed code. So I don't want any editor which allows the user to create code that will then be killed afterwards.

Any personal experiences with any of these, or an important alternative that I might have missed (no PHP involved, please!) - email me (no comments on blog) at erich AT debian org.

Update: I've received a couple of pointers. Please don't send me links to projects that are not actively maintained anymore - I don't want to care about having to fix bugs in the editor widget myself.

One link I've received twice is actually quite impressive: WYMeditor. It doesn't try to look like a word processor, but is more of a semantic editor. That is much closer to what I'm looking for than any of the others. I've also received a link to the Yahoo! UI Library Editor, which is quite clutter-free, but at least in the default setup very text-formatting oriented and not very semantic (that doesn't mean you couldn't change that; I guess you can). I was also pointed to Dojo, but that framework is pure feature creep (which I guess also explains why it loads so slowly), and the last time I looked at its source code, I had some WTF moments - code quality, at least outside of core, doesn't seem to be very high (yes, that code implements the "mod 7" operation using list shifts instead of a simple arithmetic operation).

Looks like I'll give the first try to WYMeditor. Update #2: the code seems to be rather ... complex. I'm looking for something neat and clean; it doesn't need to bring along yet another XHTML schema and validator ... Maybe I should try one of the others first?

[category: /en/web | Permalink]

Wed, 15 Apr 2009

CSU polemicizes about killer games (once again)

Official press release 127/09 of the Bavarian State Ministry of the Interior:

Killer games contradict the consensus of values of our society, which is based on peaceful coexistence, and ought to be ostracized. In their harmful effects they are on a par with drugs and child pornography, whose prohibition nobody questions, and rightly so.
Hello? Killer games do not kill. People kill, with weapons, not with games.

Or, to put it in the words of comedian Vince Ebert:

Do braces cause puberty?
Statistically, this hypothesis is not easy to reject - the two events are correlated. But this is not the way to find the true cause-and-effect relationship.

And so I consider the following hypothesis about killer games to be rather more plausible:

People with a tendency towards violence prefer killer games (and are also more prone to outbursts of violence such as shooting rampages)

To come back to Vince Ebert once more:

A: Why did people used to think that the sun revolves around the earth?

B: Well, it just looks as if the sun were revolving around the earth.

A: And what would it look like if it were the other way round, and the earth revolved around the sun?!?

Conclusion: interpretations are always subjective, and one should make the effort to interpret the whole thing from a different point of view as well, because that view may yield a completely different but equally plausible explanation! And one such interpretation is precisely that killer games merely make visible a potential for violence that is caused by deeper underlying problems.

But treating the symptoms instead of the causes has always been easier and more popular in politics.

[category: /de/politik | Permalink]