Discussion:
Protocol-relative URLs
Gregory Maxwell
2008-06-11 15:35:30 UTC
Anyone here have any experience with protocol-relative URLs, that is,
URLs of the form "//some.domain.org/file.ext"? URLs of this form are
uncommon but appear compliant with RFC 1808.
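For reference, RFC 1808 resolution makes a scheme-relative reference inherit the scheme of the base document it appears in. Python's standard library implements the same rules, which makes for a quick sanity check (the hostnames here are just examples):

```python
from urllib.parse import urljoin

# A "//host/path" reference inherits whatever scheme the page was served over.
print(urljoin('http://en.wikipedia.org/wiki/Foo',
              '//upload.wikimedia.org/img.png'))
# http://upload.wikimedia.org/img.png

print(urljoin('https://en.wikipedia.org/wiki/Foo',
              '//upload.wikimedia.org/img.png'))
# https://upload.wikimedia.org/img.png
```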

A possible application of protocol-relative URLs for MediaWiki is that
they could remove the need to parse pages containing external (and
cross-domain) links twice in order to support HTTPS. With that issue
out of the way, the only impediment to high-performance SSL is
connection setup, which can be addressed with dedicated crypto cards or
crypto-enhanced CPUs like the UltraSPARC T1/T2.
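As a sketch of the idea (not MediaWiki's actual code — the function name and domain list are made up for illustration), emitting cached HTML with the scheme stripped from links to known project domains would let a single cached copy serve both protocols:

```python
import re

def make_protocol_relative(html, domains):
    # Drop "http:"/"https:" from links to the listed domains so the same
    # cached HTML works regardless of the scheme the reader arrived on.
    pattern = re.compile(r'https?:(//(?:%s))' % '|'.join(map(re.escape, domains)))
    return pattern.sub(r'\1', html)

page = '<a href="http://commons.wikimedia.org/wiki/Foo">Foo</a>'
print(make_protocol_relative(page, ['commons.wikimedia.org']))
# <a href="//commons.wikimedia.org/wiki/Foo">Foo</a>
```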

I've confirmed that protocol-relative URLs work in the browsers I have
ready access to. Googling around, I found
http://nedbatchelder.com/blog/200710/httphttps_transitions_and_relative_urls.html#comments
which claims "The HTML 2 spec references RFC 1808 which describes this
behavior, and was written in 1995. I know this syntax works in IE6,
IE7, FF2, and Safari 2 and 3. I don't know of any browsers in which it
doesn't work."

Anyone here have practical experience with URLs of this form?
Brion Vibber
2008-06-11 18:04:28 UTC
Post by Gregory Maxwell
Anyone here have any experience with protocol-relative URLs, that is,
URLs of the form "//some.domain.org/file.ext"? URLs of this form are
uncommon but appear compliant with RFC 1808.
A possible application of protocol-relative URLs for MediaWiki is that
they could remove the need to parse pages containing external (and
cross-domain) links twice in order to support HTTPS. With that issue
out of the way, the only impediment to high-performance SSL is
connection setup, which can be addressed with dedicated crypto cards or
crypto-enhanced CPUs like the UltraSPARC T1/T2.
Duplicate parsing honestly isn't much of an impediment here; the primary
impediment is just configuring things properly for virtual hosts and SSL
proxies on the same IPs that we run non-SSL on.

e.g., we want https://en.wikipedia.org/wiki/Foobar to work, which requires:

* SSL proxies in each data center
* wildcard certs for each second-level domain
* appropriate connection setup for the certs to work; e.g. one public IP
per data center per second-level domain
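One wildcard cert per second-level domain is needed because a certificate wildcard only stands in for a single leftmost DNS label: *.wikipedia.org covers en.wikipedia.org but not en.m.wikipedia.org or the bare wikipedia.org. A toy check of that matching rule (my own simplification for illustration, not a TLS library):

```python
def wildcard_matches(pattern, hostname):
    # A certificate wildcard covers exactly one leftmost DNS label.
    if not pattern.startswith('*.'):
        return pattern == hostname
    labels = hostname.split('.')
    return len(labels) >= 3 and '.'.join(labels[1:]) == pattern[2:]

print(wildcard_matches('*.wikipedia.org', 'en.wikipedia.org'))    # True
print(wildcard_matches('*.wikipedia.org', 'en.m.wikipedia.org'))  # False
print(wildcard_matches('*.wikipedia.org', 'wikipedia.org'))       # False
```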

We did some experimentation in this direction last year, but haven't
really got the ball rolling yet.

-- brion
Gregory Maxwell
2008-06-11 23:02:15 UTC
Post by Brion Vibber
Duplicate parsing honestly isn't much of an impediment here; the primary
impediment is just configuring things properly for virtual hosts and SSL
proxies on the same IPs that we run non-SSL on.
I'd think that 2x the memory usage / disk usage in caches would be
nothing to sneeze at... or the CPU cost of holding one cached copy and
replacing the URLs internally.
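To make the 2x concrete: a scheme-specific rendering has to be cached once per scheme, while protocol-relative output needs only one entry. A toy model of the cache keys (not the actual Squid setup):

```python
# Scheme-specific rendering: one cached copy per scheme per page.
absolute = {
    (scheme, 'Foobar'): f'<a href="{scheme}://en.wikipedia.org/wiki/Foobar">Foobar</a>'
    for scheme in ('http', 'https')
}
print(len(absolute))  # 2

# Protocol-relative rendering: one copy serves both HTTP and HTTPS readers.
relative = {'Foobar': '<a href="//en.wikipedia.org/wiki/Foobar">Foobar</a>'}
print(len(relative))  # 1
```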

In any case, I've started testing protocol-relative URLs. If they turn
out to be reliable then it's just a further enhancement. I'll let
you know when I have some results.
Post by Brion Vibber
* SSL proxies in each data center
* wildcard certs for each second-level domain
* appropriate connection setup for the certs to work; e.g. one public IP
per data center per second-level domain
We did some experimentation in this direction last year, but haven't
really got the ball rolling yet.
Right, and the wildcard certs tend to be more expensive for who knows
what reason... :(
Cool enough.
Brion Vibber
2008-06-11 23:42:17 UTC
Post by Gregory Maxwell
Post by Brion Vibber
Duplicate parsing honestly isn't much of an impediment here; the primary
impediment is just configuring things properly for virtual hosts and SSL
proxies on the same IPs that we run non-SSL on.
I'd think that 2x the memory usage / disk usage in caches would be
nothing to sneeze at... or the CPU cost of holding one cached copy and
replacing the URLs internally.
Ehh, wouldn't hurt in theory but I'm always suspicious. :)

Consider also non-browser uses:

* search spiders
* RSS feed links
* screen-scraping goodies
* post-processing web tools such as online translators, kanji->furigana
converters, etc

Note also that the fully-qualified URL may be pulled by {{SERVERNAME}}
or {{FULLURL:}} in the middle of wikitext, and is used in the print
footer etc.
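The non-browser concern is easy to see directly: a protocol-relative URL is only resolvable against a base document, so a feed reader, spider, or scraper handed the bare string has no scheme to inherit:

```python
from urllib.parse import urlparse

# A scheme-relative reference taken out of context has no usable scheme.
parts = urlparse('//upload.wikimedia.org/img.png')
print(repr(parts.scheme))  # '' -- nothing to fetch it with
print(parts.netloc)        # upload.wikimedia.org
```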
Post by Gregory Maxwell
In any case, I've started testing protocol-relative URLs. If they turn
out to be reliable then it's just a further enhancement. I'll let
you know when I have some results.
Sweet... :D
Post by Gregory Maxwell
Post by Brion Vibber
* SSL proxies in each data center
* wildcard certs for each second-level domain
...
Right, and the wildcard certs tend to be more expensive for who knows
what reason... :(
Otherwise people would buy one wildcard cert instead of two or three
individual-host certs, and the CAs would make less money... :D

-- brion
