Discussion:
Blocking evil bandwidth thieves
Brion Vibber
2004-05-21 06:49:26 UTC
On Tomk32's request I've blocked www.themensuche.de (217.172.181.98)
from access to our servers. It was "hosting" German Wikipedia pages by
sending every request on to de.wikipedia.org and slurping the content,
but using it solely for the purpose of bringing in search hits. The text
is never displayed; instead every hit uses a JavaScript redirect through
an intermediary or two to amazon.de.

Pages looked like this:
<html>
<head>
<title>Der Jäger aus Kurpfalz</title>

<script language="javascript">
what = 'site www themensuche de aus';
</script>
<script language="javascript" src="/daten/dat.js" type="text/javascript">
</script>
<link rel="stylesheet" type="text/css" href="/daten/lay.css">
</head>
<body bgcolor="white">
<div id="dat" name="dat">
Friedrich Wilhelm Utsch Friedrich Wilhelm Utsch wurde
[actual content from wiki snipped for brevity]
<center>
<a href="index.html">HOME</a> |
<a href="De.htm">INDEX</a> |
<a href="mailto:webmaster-3d/Iy/***@public.gmane.org">MAIL</a>
<hr>
<p><a href="http://www.google.de/search?q=Der Jäger aus Kurpfalz">SUCHE
BEI GOOGLE</a> | <a href="http://search.msn.com/results.aspx?q=Der Jäger
aus Kurpfalz">SUCHE BEI MSN</a></p>
<p>
History:<br>
Copyright (c) 2004
<D-E.W-I-K-I-P-E-D-I-A.O-R-G></wiki/Der_Jäger_aus_Kurpfalz><br>
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2
or any later version published by the Free Software Foundation.
A copy of the license is included in the section entitled<br>
<a href="gnu_fdl.txt">"GNU Free Documentation License"</a>.
</p>
</center>
</div>
</body>
</html>

Where the script brought up at the top contains only:
top.location.replace('http://www.steine.de/partnersys/index.html?swi=' +
what)

which if you go there, sends you on to the amazon.de search page. This
script is executed automatically before any content is shown (and the
markup is invalid and nothing at all shows in some browsers even if JS
is disabled). There is no way to read Wikipedia content at that site
short of turning off JavaScript and using 'view source'.

This is not even *vaguely* legitimate. I've cut them off in the IP
firewall on coronelli and browne, so they're no longer able to steal
bandwidth on every page hit just to promote their referrer links.

-- brion vibber (brion @ pobox.com)
Thomas R. Koll
2004-05-21 08:37:54 UTC
Post by Brion Vibber
On Tomk32's request I've blocked www.themensuche.de (217.172.181.98)
from access to our servers. It was "hosting" German Wikipedia pages by
sending every request on to de.wikipedia.org and slurping the content,
but using it solely for the purpose of bringing in search hits. The text
is never displayed; instead every hit uses a JavaScript redirect through
an intermediary or two to amazon.de.
I've emailed three other sites that are obviously running proxies
without a cache (and which have more than 500 MB of traffic per month).
There might be a few others. They are surely not as evil as
themensuche.de, but they still cost a lot of traffic that could be
avoided by using the database dumps instead.
Is anybody up for hunting them down and contacting them? We would need
more stats, because the webalizer stats by kbyte only show the top 10.
Many hits/files but only a few visits is also a good indicator.
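For anyone who wants to go hunting, here is a minimal sketch of how the missing stats could be pulled straight from the raw access logs instead of webalizer. It is only an illustration under stated assumptions: the logs are taken to be in Common Log Format and in chronological order, a "visit" is approximated as a run of hits from one host with no gap longer than 30 minutes, and the names `summarize` and `rank_by_bytes` are made up for this example.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed Common Log Format: host ident user [time] "request" status bytes
# Lines whose request contains an embedded double quote will not match
# and are skipped.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "[^"]*" \d{3} (?P<bytes>\d+|-)'
)

# Assumption: a gap longer than this between hits starts a new visit.
VISIT_TIMEOUT = timedelta(minutes=30)


def summarize(lines):
    """Aggregate hits, bytes and approximate visits per client host."""
    stats = defaultdict(
        lambda: {"hits": 0, "bytes": 0, "visits": 0, "last": None}
    )
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        entry = stats[match.group("host")]
        entry["hits"] += 1
        if match.group("bytes") != "-":  # "-" means no body was sent
            entry["bytes"] += int(match.group("bytes"))
        if entry["last"] is None or when - entry["last"] > VISIT_TIMEOUT:
            entry["visits"] += 1
        entry["last"] = when
    return stats


def rank_by_bytes(stats, limit=50):
    """Return (host, bytes, hits, visits) tuples, heaviest hosts first."""
    rows = [(h, e["bytes"], e["hits"], e["visits"]) for h, e in stats.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    import sys

    # Usage: python logstats.py < access_log
    for host, nbytes, hits, visits in rank_by_bytes(summarize(sys.stdin)):
        print(f"{host}\t{nbytes / 1024:.0f} kB\t{hits} hits\t{visits} visits")
```

Hosts near the top of this list with a very large hits-to-visits ratio would be the ones worth a closer look and a polite mail.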
Post by Brion Vibber
which if you go there, sends you on to the amazon.de search page. This
script is executed automatically before any content is shown (and the
markup is invalid and nothing at all shows in some browsers even if JS
is disabled). There is no way to read Wikipedia content at that site
short of turning off JavaScript and using 'view source'.
There are plugins for Firefox for turning off CSS, too.
--
== Weblinks ==
* http://www.tomk32.de - just a geek trying to change the world
* http://de.wikipedia.org/wiki/Benutzer:TomK32 - Free Knowledge
* http://tinyurl.com/27c88 - WikiReader Internet: bald im Druck
* http://tomk32.bookcrossing.com - Free Books
Jimmy Wales
2004-05-21 12:56:31 UTC
Post by Brion Vibber
This is not even *vaguely* legitimate. I've cut them off in the IP
firewall on coronelli and browne, so they're no longer able to steal
bandwidth on every page hit just to promote their referrer links.
Bravo!

As is well known, I advocate taking a fairly relaxed approach to
"bandwidth thieves" who are framing our content, etc. I don't think
it's good, but I think the better approach is usually going to be to
contact them first and ask them to stop it, and try to get them on
board with a technologically better (and more friendly to our
pocketbook) method of reuse!

But as Brion says, this particular case is not even *vaguely*
legitimate.

Good call.

--Jimbo
Thomas R. Koll
2004-05-21 13:59:31 UTC
Post by Jimmy Wales
As is well known, I advocate taking a fairly relaxed approach to
"bandwidth thieves" who are framing our content, etc. I don't think
it's good, but I think the better approach is usually going to be to
contact them first and ask them to stop it, and try to get them on
board with a technologically better (and more friendly to our
pocketbook) method of reuse!
But as Brion says, this particular case is not even *vaguely*
legitimate.
We haven't done much investigation yet, but it seems the website is
somehow connected to de:Benutzer:Herbye, who sometimes adds hidden
links (using divs) to pages like steine.de, which all belong to
Merkel Internet Service.
themensuche.de has a different Admin-C but uses steine.de as an
intermediary.

The admin of themensuche.de seems to have reacted and deactivated his
evil script, but I haven't received an answer to my mail yet.

ciao, tom