Redirecting RSS redirection
Sam Ruby suggests that the solution to redirecting RSS feeds is to use the rdf:about attribute of the channel item as a URL for the feed. If your feed changes URLs, just put the new URL in the old feed, and aggregators will pick it up and switch to the new URL.
A nice idea, I suppose, but completely doomed. The RSS 1.0 spec currently says that the channel rdf:about usually is “either the URL of the homepage being described or a URL where the RSS file can be found”, but that it “is a URI which identifies the channel”, nothing more. If you choose to use a URN or some other obscure form of URI, that’s perfectly legal. If you choose to use the URL of your HTML page, as I do, then you get some benefit from the RDF in RSS 1.0, since you make a few valid assertions about that page, and not just a bunch of temporary assertions about the RSS file itself.
To use the channel rdf:about as a redirect, you would first have to convince a significant percentage of the producers of RSS 1.0 that they should change to the URL of their RSS file. Given the rather large number of Movable Type users who are producing RSS 1.0 without even really realizing it, that’ll be tough. Unless you persuade most of them to switch, you’ll have to convince aggregator developers that they should try to GET the rdf:about URL whenever it isn’t the URL they are using for the feed, in case it’s a redirect, even though most of the time they’ll end up GETting (X)HTML instead of RSS. I doubt that many will be persuaded, especially since all that effort and dislocation is completely unnecessary. It’s also not very RDFish, since changing the channel’s rdf:about doesn’t say “this is in a new place now,” it says “here’s a completely different channel than the one you were expecting to find here: surprise!”
If you must reuse some existing thing, and must play nice with the keepers of the RDF flame, use mod_link, which is designed for exactly this sort of linking. Kevin designed it to be expanded with new link relationships, so I’m sure he’d be happy to incorporate l:rel="redirect". Add <l:link l:rel="redirect" l:type="application/rss+xml" rdf:resource="http://mynewurl.com/index.rss"/>, and once poor Jake gets done rewriting Radio’s namespace support to deal with a situation where a single empty element needs to be handled by different scripts based on the value of an attribute (since the handler for l:rel="redirect" can hardly be expected to also handle l:rel="service" and every arbitrary l:rel someone might add), everybody’s set.
Now that’s pot-stirring, not a simple little addition of the RDF namespace.
Jon Udell had a <redirect:newFeedLocation> element at the <channel> level in his feed for a short time. That’s something like what I had in mind. A namespace with one element in it. The value of the element would be the new URL of the feed. I certainly didn’t anticipate getting into RDF for this. I saw Sam’s post and couldn’t make heads or tails of it. What the heck does this have to do with RDF?
The only thing it has to do with RDF is that any module you propose will get wider support if it’s usable in both RSS 2.0 and 1.0. As an aggregator developer, I’m more likely to support mod_redirect if there’s just one thing to look for, in either format. People will support the same data in different tags, like dc:date and pubDate, if they want it badly enough, but if they’re looking at something like cloud versus cp:server, they’re more likely to just skip the whole thing.
A simple element with the URL as the content, which makes perfect sense in RSS 2.0, runs into trouble in 1.0. It’s perfectly valid RDF and RSS 1.0, but it’s not useful RDF, because the URL is just a string, not a resource. An RDF parser can understand <redirect:newFeedLocation> just fine, and say that “the newFeedLocation for this channel is the string http://newlocation.com/index.rss”, but it can’t do anything more with that. (Yeah, I know, steam is coming out of your ears, but bear with me a second.) Even if the RDF parser already knows all about the feed at http://newlocation.com/index.rss, it can’t put together the things it knows about that resource with the newFeedLocation, because it’s just a string. And since the RSS 1.0 folks are really big on reusing existing stuff rather than creating new elements, they’re more likely to accept a new relationship added to mod_link than a brand new element, even if it’s <redirect:newFeedLocation rdf:resource="http://newlocation.com/index.rss"/>.
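The string-versus-resource point can be made concrete with a toy triple store. This is only a sketch: the tuple layout, the "uri"/"literal" tags, the old-feed URL and the dc:title fact are all invented for illustration, and a real RDF parser does a good deal more than this.

```python
# Toy triple store: each object is tagged as a resource ("uri") or a literal.
# The layout, the OLD url and the dc:title fact are illustrative assumptions.
NEW = "http://newlocation.com/index.rss"
OLD = "http://oldlocation.com/index.rss"


def facts_reachable(triples, subject, predicate):
    """Follow `predicate` from `subject` and collect what is known about the target.

    Only resource-valued objects can be followed. A literal is a dead end,
    even when its text happens to look like a URL.
    """
    found = []
    for s, p, (kind, value) in triples:
        if s == subject and p == predicate and kind == "uri":
            found.extend(t for t in triples if t[0] == value)
    return found


# Something the parser already knows about the new feed.
known = [(NEW, "dc:title", ("literal", "My weblog"))]

as_literal = known + [(OLD, "redirect:newFeedLocation", ("literal", NEW))]
as_resource = known + [(OLD, "redirect:newFeedLocation", ("uri", NEW))]

literal_result = facts_reachable(as_literal, OLD, "redirect:newFeedLocation")
resource_result = facts_reachable(as_resource, OLD, "redirect:newFeedLocation")
```

With the literal form nothing is reachable from the old feed; with the resource form the title of the new feed comes along for the ride.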
Whether it’s worth the inevitable hassles to do either a mod_link mod or an RDFed mod_redirect is a hard question to answer, and if I’m going to do much more fence-sitting I’m going to need some jeans with a double-layer seat.
Phil - like you, I’ll sidestep whether or not the RDF folks would find the usage of the about attribute without all the other RDF machinery in place useful.
However, given that the about attribute specifies that it should be either a link to the site or to the rss, it is a rather simple if check to see whether or not it matches the value of the channel link.
To illustrate: in the following excerpt, both values match, so it is not a redirect:
In the following, the two values do NOT match:
In my case, I can change things so that both http://radio.weblogs.com/0101679/rss.xml and http://www.intertwingly.net/blog/index.rss2 contain these lines. When reading these lines in the latter feed, the attribute is merely a reaffirmation that you have reached the right feed. When reading these lines in the former feed, you are provided with the information that the preferred location for this feed is the latter one.
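The "simple if check" might look roughly like the following. It is a sketch under assumptions: the sample channel (its title and link values in particular) is made up, and the function name preferred_feed_url is hypothetical. It treats an about value equal to the channel link as "about the site", and anything else that differs from the URL actually fetched as a pointer to the preferred feed.

```python
import xml.etree.ElementTree as ET

RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"

# Made-up RSS 2.0 excerpt; the title and link values are assumptions.
SAMPLE = """<rss version="2.0"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <channel rdf:about="http://www.intertwingly.net/blog/index.rss2">
    <title>Example</title>
    <link>http://www.intertwingly.net/blog/</link>
  </channel>
</rss>"""


def preferred_feed_url(feed_xml, fetched_url):
    """Return the URL the feed says it prefers, or None if no move is signalled."""
    channel = ET.fromstring(feed_xml).find("channel")
    about = channel.get(RDF_ABOUT)
    link = channel.findtext("link")
    if about is None or about == link:
        # No attribute, or it names the site itself: not a redirect.
        return None
    if about == fetched_url:
        # The attribute just reaffirms that this is the right feed.
        return None
    return about


from_old = preferred_feed_url(SAMPLE, "http://radio.weblogs.com/0101679/rss.xml")
from_new = preferred_feed_url(SAMPLE, "http://www.intertwingly.net/blog/index.rss2")
```

Read from the Radio URL, the same lines yield the intertwingly.net feed as the preferred location; read from the intertwingly.net URL, they yield nothing to act on.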
I believe it is necessary to realize that the syntax requirements of RSS/RDF and RSS 2.0 are different. This leads to the realization that some processing is always needed to convert between the two, and then:
Why not allow for multiple syntaxes in the same module - one for RSS/RDF and one for RSS 2.0?
For a redirection module this could be: <redir:newFeedLocation>URL</redir:newFeedLocation> for RSS 2.0 and <redir:newFeedLocation rdf:resource="URL"/> for RSS/RDF.
As long as the content/object is the same literal value (not in different formats like the pubDate/dc:date problem) it should not be a problem.
Redirection - He went thadda way
Dave Winer has raised the redirection issue: ”Also, I realize there’s a need for a RSS module for redirection. I
You guys need to step back a few steps and look at all the discussion you’re having over a brain dead simple addition. Unbelievable. Just add the fcuking namespace and be done with it. Geez Louise.
”This could be a first experience at really working together, with no flames”
How about not reinventing the wheel?
Look at the dcterms module. Specifically a very nice little element called ”isReplacedBy”
There’s also a very handy one called “Replaces” that could be used in the new feed to indicate what other URL it’s replacing. There are, of course, issues with security here. You wouldn’t want your aggregator to pick up a ‘replaces’ element unless it was authoritative about it. At the very least you should pick up the ‘old’ URL and look to see that its “isReplacedBy” URL matches up.
Sheesh people, do some research here! All this reinventing of the wheel is a huge waste of effort. Go now and read the modules for RSS-1.0.
More specifically, the correct way to express this would look something like this.
The old feed, at http://example.com/oldrss.xml would need to contain a channel element like this:
<dcterms:isReplacedBy rdf:resource="http://example.com/newrss.xml"/>
And the new feed over at http://example.com/newrss.xml would need:
<dcterms:replaces rdf:resource="http://example.com/oldrss.xml"/>
Thus an aggregator reading the old URL could learn if something new existed and its new location. At the same time it could look at the new feed and match up the URLs. This would help avoid the risk of someone sticking a ‘replaces’ inside a new feed and hijacking the aggregator’s listing for the old one.
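The matching-up step could be sketched along these lines. The helper names (channel, linked_url, confirmed_move), the stripped-down feed wrappers and the hijacker URL are all assumptions made for the example; only the two dcterms elements and the example.com URLs come from the text above.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS = "http://purl.org/dc/terms/"


def channel(inner):
    """Wrap `inner` in a bare-bones RSS 1.0 document (illustration only)."""
    return (f'<rdf:RDF xmlns:rdf="{RDF}" xmlns:dcterms="{DCTERMS}" '
            f'xmlns="http://purl.org/rss/1.0/"><channel>{inner}</channel></rdf:RDF>')


def linked_url(feed_xml, term):
    """Return the rdf:resource of the first dcterms:<term> element, or None."""
    element = ET.fromstring(feed_xml).find(f".//{{{DCTERMS}}}{term}")
    return None if element is None else element.get(f"{{{RDF}}}resource")


def confirmed_move(old_url, old_xml, new_url, new_xml):
    """True only when each feed names the other."""
    return (linked_url(old_xml, "isReplacedBy") == new_url
            and linked_url(new_xml, "replaces") == old_url)


OLD_URL = "http://example.com/oldrss.xml"
NEW_URL = "http://example.com/newrss.xml"

old_feed = channel(f'<dcterms:isReplacedBy rdf:resource="{NEW_URL}"/>')
new_feed = channel(f'<dcterms:replaces rdf:resource="{OLD_URL}"/>')

# A third party claims to replace the old feed, but the old feed says nothing.
untouched_old = channel("")
hijacker = channel(f'<dcterms:replaces rdf:resource="{OLD_URL}"/>')

genuine = confirmed_move(OLD_URL, old_feed, NEW_URL, new_feed)
hijack = confirmed_move(OLD_URL, untouched_old, "http://evil.example/rss.xml", hijacker)
```

Only the pair where both feeds point at each other passes; a lone ‘replaces’ in somebody else's feed does not.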
There are also some nice elements in things like hasPart, isPartOf, references and such that would be interesting to see used in feeds. One could use them in a category feed to indicate its parent and vice versa.
The question of whether to use the rdf:resource vs the element value is simply a matter of style. Following the tremendous amount of work done by the RSS-1.0 group seems prudent.
I believe that these are the examples that Bill intended:
<dcterms:isReplacedBy>http://example.com/newrss.xml</dcterms:isReplacedBy>
<dcterms:replaces>http://example.com/oldrss.xml</dcterms:replaces>
Almost. I edited his comment to show the examples, which are skating around the literal versus resource question.
Bill, it’s not just a question of style, and you know it. To an RDF parser, <dcterms:replaces>http://example.com/oldrss.xml</dcterms:replaces> might just as well be <dcterms:replaces>The old feed at my old domain</dcterms:replaces>, since it’s not a resource, just a string that happens to look like one.
I’m pretty sure I’ve said this before, but reinventing the wheel can be useful, if you end up inventing rubber tires rather than iron clad wooden wheels. Just because something exists that you can shoehorn your current problem into doesn’t mean that it’s the best choice.
Given that anything parsing RSS as RDF and knowing how to deal with dcterms is likely to be schema-aware (and a bit farther off into the future, I’m afraid), how about this: define a new module, mod_redirect, with a schema that defines its one element, newFeedLocation, as a subPropertyOf dcterms:replaces. Syntax: your choice of <redirect:newFeedLocation>theURL</redirect:newFeedLocation> or <redirect:newFeedLocation rdf:resource="theURL"/>.
Sam, fcuking is not a flame, technically speaking. ;->
Bill, I think it’s an interesting idea to use the dcterms element. But I have to go with what Phil says.
Why not ask Jon Udell and/or Rogers Cadenhead what makes sense to them. They’re the users who tripped over the problem. And they’re both smart guys and quite technical. Both have written books on programming. Maybe what makes sense to them matters? Let them lead, I say.
Phil, forgive me, your suggestion of subclassing feels (to me, at least) like the approach that is more akin to putting iron cladding over wooden wheels.
What, exactly, is not road worthy in today’s RSS 2.0 context about dcterms:isReplacedBy as originally defined and documented?
I think everybody would be happy, if Phil’s suggestion of 9:46AM were to be used - with the addition of the ”syntax choice” being dependent on the RSS version, i.e. rdf:resource for 1.0 and literal content for 2.0
Comments?
FWIW, I liked Will Cox’s idea:
http://weblog.infoworld.com/udell/2002/09/25.html
Based on discussion here I gather that won’t fly. In that case, I’d be happy with the replaces/isReplacedBy idea. Or anything equivalent, really. I don’t care how it works, but I would like to avoid another disruption. It really messed me up last time.
Two things I don’t like about dcterms:isReplacedBy:
For RSS 2.0: it’s not especially human-readable. My design goal for an RSS 2.0 element is that it be instantly obvious just exactly what it is and does.
For RSS 1.0: as I’ve said, it’s not good RDF. A parser may know hundreds of things about the old URL and the new URL, but unless you give it a resource rather than a literal for the new URL, it’s not going to put them together.
Sam: ouch. I hate it when a perfectly good analogy turns around and bites me like that. Still, putting iron (or, with a bit of luck, rubber) cladding on a wooden wheel is exactly what RDF schema tells us to do, as near as I can tell.
So: here’s my 0.1 draft of a ”make as many people as happy as possible” redirect module. No embedded schema yet, since I’ve got to go looking for an example to steal. Suggestions and typo corrections welcome.
Phil: My design goal for an RSS 2.0 element is that it be instantly obvious just exactly what it is and does.
+1
Phil, re oldFeedLocation, what if I (for the sake of argument, I’d never do this) put an oldFeedLocation in my feed for a feed that I didn’t like, causing all people who subscribe to my feed to unsubscribe from the Evil Feed (the one I don’t like).
Right, I caught that in Bill’s earlier comment and then promptly forgot it. “On encountering a redirect:oldFeedLocation, an aggregator should inspect the oldFeedLocation, and if it includes a newFeedLocation pointing to the URL where the oldFeedLocation was found, it should unsubscribe from the oldFeedLocation” (with a bit of editing for readability) sound workable?
*sigh* No, Morten, I’m not particularly happy. I would have been had Phil identified something substantive wrong, however minor, with the existing element, but as it stands now:
==> isReplacedBy is eminently readable to this particular human. And it could conceivably be used in RSS, OPML, XHTML, etc., etc., etc.
==> proposing two syntaxes indicates that the new proposal doesn’t solve the second problem any more than the existing one did.
The original request stated ”This could be a first experience at really working together, with no flames.” It certainly has been an educational experience. Phil, do you really expect RSS 1.0 advocates to drop a perfectly good element that has been in place for years? If not, why not simplify your proposal and only have one syntax?
Sam, *sigh* — I’m not particularly happy either.
Phil gave you some counters. The most important of which (to me) was that someday someone in RDF-land or Dublin Core-Land is going to come along and tell us we have to change the way we’re using the elements they defined.
The DCTERMS element is not exactly what we want. What we’re looking for is an XML-level redirect, for times when a content provider doesn’t have control over the server that’s hosting the old version of his or her content.
Anyway — to me, the people we have to please are:
1. Users
2. Tool developers
3. Aggregator developers
In roughly that order. Let’s take some time on this, and let’s hear from lots of all the above.
From my perspective as a user, the only thing I want is to put something in the old RSS file that tells aggregators where the new RSS file can be found.
The ”replaces/isReplacedBy” solution is OK, but I wonder if the complexity of a ”replaces” safeguard really is worth the implementation cost for aggregators. I control the access to both old and new RSS files. I’m not worried that someone at my old host will put a phony ”isReplacedBy” item in my old feed and send traffic to the wrong place, because if that’s a problem, my old host could do all kinds of other things to make me unhappy.
My original thought was that RSS redirection wouldn’t take place within RSS at all. I figured that the best solution would be for news aggregators to receive an HTTP redirect and act on it.
However, I’m happy with any solution that automates redirection.
How odd. I’m delighted. I see a seam between the two formats, where it’s possible to define something that will work perfectly well with either one, satisfying the needs of both.
I think having two syntaxes perfectly solves the second problem. There’s no need to have the URL be an RDF resource in RSS 2.0, but there is in RSS 1.0. Having both in the spec makes it clear to aggregator developers that they need to look for both, and may cut down on the amount of semantic augmentation that RDF-savvy producers feel the need to do (since that’s one of the scariest threats to developers I’ve ever seen in a spec).
”Been in place for years”? As an RSS 1.0 module, it’s been in place for four months, and is so widely known that Ben didn’t even think of it ;).
If (a huge if) every aggregator out there supports mod_redirect, and none support dcterms:isReplacedBy, then people who don’t like my minor redefining as a subPropertyOf are certainly welcome to use dcterms:isReplacedBy, and they even have my permission to feel superior while they look at all the aggregators still hitting their old URL.
I’m not worried that someone at my old host will put a phony ”isReplacedBy” item in my old feed and send traffic to the wrong place, because if that’s a problem, my old host could do all kinds of other things to make me unhappy.
As someone who writes an aggregator, having the safeguard sounds like approximately twice the work I want to do for this feature.
I’m a user, I’d like to see redirection work.
I’m a developer, I’d like to use well thought out structures.
I’m an aggregator developer as well as a database developer. I’d like to see a single format. Coding around multiple standards is a pain in the ass.
Syndic8.com supports making web service calls into it asking for a feed’s status. One such status is ”Redirected”. A redirected feed can have the ID of the new location. This has been part of Syndic8 for a while as we’ve recognized things do move around a bit.
OK, I can see the point of using replaces/isReplacedBy (both are necessary) instead of the redirect module, but from a SW standpoint, it seems we first need to fix the meaning of the channel’s rdf:about - otherwise it might end up saying that an HTML page has moved!
I guess this is what the module adds to the current situation, since it states that it is the feed that has moved (even though it could be contradicting the rdf:about).
In any case, I see no problem with two syntaxes, even though the Dublin Core documents specify the RDF resource style.
<dcterms:isReplacedBy rdf:resource="http://url.of.new.rss" />
I haven’t seen any real arguments yet as to why this is not the ideal solution
Win-Win.
(Incidentally, using such a redirect property/element in the channel part of the feed is one more argument in favour of the channel’s rdf:about being the RSS URI, rather than the HTML URI, as I was talking about last week.)
Yeah, I guess Rogers is right: the only time an aggregator needs redirect:oldFeedLocation is when it’s subscribed to the old feed, and if it’s subscribed to the old feed it already got the redirect:newFeedLocation there. Unless someone sees a use for it that I don’t, I’ll take it out in a rev or two.
Bill, you may not believe this, but I totally agree with your statement, esp the third part.
DJ, if you haven’t heard objections, bluntly, try again. Both Phil and I have explained. Try to understand where we’re coming from. Thanks.
Dave, I read everything, carefully, that people said before posting my own comments, which I stand by. Standard netiquette.
Which of the points I made don’t you agree with? Maybe there’s a disconnect. I’d be more than glad to expound if needed.
Thanks
It doesn’t sound like the issue of dcterms:isReplacedBy/dcterms:replaces is actually an issue — nothing dictates that an aggregator needs to actually take dcterms:replaces and screw over the original feed because of it. Or, to put it another way, the same issue exists (existed?) with redirect:oldFeedLocation, unless the aggregator takes the precaution of also doing the same thing it would have to with dcterms:replaces, namely checking the old feed for the corresponding tag to give it authority to commence ignoring said old feed.
Another wrinkle in all this: what happens when the old feed finally falls off the face of the Earth (e.g, the site is taken down)? It sounds like the new feed would, by necessity, have to stop including the dcterms:replaces or redirect:oldFeedLocation subelements, since the old site would no longer be available for checking the veracity of that subelement.
In the end, as a user, I’d like to see redirection work. However it’s implemented doesn’t matter one whit to me; I’m just a user, of either an aggregator or a website content management system, and any good aggregator or CMS will make the details transparent to me.
As both a tool and aggregator developer, I want to try to minimize the number of possible constructions out there that try to do the same thing. In my read of the dcterms:isReplacedBy element, it seems to be exactly what is being asked for here.
As a user and (occasional) tool developer, using dcterms:isReplacedBy with the rdf resource syntax exclusively seems the way to go:
<dcterms:isReplacedBy rdf:resource="http://www.myweblog.com/newfeed.rss"/>
As a user:
* Adding this to my own feed is a matter of editing my template with a simple cut-n-paste.
* Only one way to do the redirect, in whatever flavor of feed I have, so I don’t have to figure out what format my feed is before choosing the appropriate code (or worry about choosing the wrong one). This is the biggest plus. The question isn’t ”how hard is this to figure out?” but ”do I have to figure this out?”.
* The readability argument is bogus. This is just as readable as a bgcolor attribute on an HTML body tag. If the tag attribute is unreadable to a particular human, I think that the tag contents will be too.
As a tool developer:
* If I add this to my tool, I don’t have to add any conditional code to switch between different representations of the same information.
* Alternatively, I don’t have to add the information in more than one place (attribute and tag contents), so there is one less thing to keep in sync if I refactor something.
In short, as a user and developer, I want *one* place to add this information, in *one* way, using *one* syntax, no matter what.
Hmm. A few more comments came in while I was composing this. Dave, I see Phil’s objection to conflating the rdf:resource attribute with a literal as the content of the tag, but I do not see an objection to using the attribute alone.
Why not dcterms:isReplacedBy?
The element name issue: I don’t know how much end-user support you all do, but I’ve spent much of the last year and a half supporting users of Blogger, and now Blogger and Movable Type, and based on that experience I would rather have redirect:newFeedLocation with a single element in a spec that talks about nothing but redirecting than dcterms:isReplacedBy in a spec that talks about world+dog.
It’s not like isReplacedBy is something that’s widely known and supported: neither Ben nor I remembered it until Bill pointed it out. If I was developing an aggregator and trolling for elements I should support, I wouldn’t necessarily go through all of dcterms and say ”oh, I better support redirects from isReplacedBy.” In fact, I didn’t, and I’m pretty sure nobody else has either. RSS is already full of elements that either slightly or not-at-all redefine Dublin Core elements, and one more isn’t going to hurt.
Ease of implementation: so far, we have one general-purpose desktop aggregator developer talking about supporting it, and it will be vastly easier for him to support a module that does just one thing in its namespace.
Why not force the rdf:resource syntax?
Mostly, because it’s a slap in Dave’s face, and I won’t do it. But, also…
It requires that people adding a redirect to their RSS 2.0 feed add two namespace declarations rather than one.
Anyone who has developed an RSS parser for 1.0 and 0.9x/2.0 beyond title/description/link already has a system in place for finding the same information expressed in different ways in different places: saying “if ( element == "redirect:newFeedLocation" && attr['rdf:resource'] ) {…} elseif ( element == "redirect:newFeedLocation" && data ) {…}” isn’t going to be a huge added burden, and it seems like a better place for the burden than on the producer. If you are writing a guide to RSS 2.0, then your section on redirecting says “use <redirect:newFeedLocation>http://newdomain.com/index.rss</redirect:newFeedLocation>”, and if you are writing a guide to RSS 1.0, it says “use <redirect:newFeedLocation rdf:resource="http://newdomain.com/index.rdf"/>”. Producers who are going to be confused by a spec with two different syntaxes shouldn’t have to be reading the spec anyway, they should be reading a Busy Producers Guide.
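Spelled out, that if/elseif is about this much work on the consuming side. A sketch only: the namespace URI for the redirect module is a placeholder (the draft's real URI is not given in this thread), the function name is hypothetical, and the two sample feeds are stripped to the bare minimum.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Placeholder namespace URI, assumed for this example only.
REDIRECT_NS = "http://example.org/ns/redirect#"


def new_feed_location(feed_xml):
    """Return the redirect target from either syntax, or None if absent."""
    element = ET.fromstring(feed_xml).find(f".//{{{REDIRECT_NS}}}newFeedLocation")
    if element is None:
        return None
    # RSS 1.0 style: the URL is an rdf:resource attribute.
    resource = element.get(f"{{{RDF}}}resource")
    if resource:
        return resource
    # RSS 2.0 style: the URL is the element's text content.
    text = (element.text or "").strip()
    return text or None


RSS2_FEED = f"""<rss version="2.0" xmlns:redirect="{REDIRECT_NS}">
  <channel>
    <redirect:newFeedLocation>http://newdomain.com/index.rss</redirect:newFeedLocation>
  </channel>
</rss>"""

RSS1_FEED = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns="http://purl.org/rss/1.0/"
         xmlns:redirect="{REDIRECT_NS}">
  <channel rdf:about="http://newdomain.com/">
    <redirect:newFeedLocation rdf:resource="http://newdomain.com/index.rdf"/>
  </channel>
</rdf:RDF>"""

from_rss2 = new_feed_location(RSS2_FEED)
from_rss1 = new_feed_location(RSS1_FEED)
```

The same function handles both feeds, which is the whole argument for leaving the burden with the parser.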
Dave, DJ,
I think I see the problem. Upon reading the Dublin Core module section on isReplacedBy, it seems that the module does not use an rdf:resource attribute, but a literal value.
So, the objection seems to be that using the module as defined (e.g. <dcterms:isReplacedBy>http://www.newsite.com/rss.xml</dcterms:isReplacedBy>) does not actually give any RDF benefits, so we can assume that at some point in the future someone will want to rewrite it to use an rdf:resource attribute instead of a literal contained by the element.
This is a valid objection.
Accordingly, in keeping with the benefits I earlier ascribed to Bill Kearney’s suggestion, I would like to throw my weight behind Phil’s specification, with the following caveat:
Please get rid of all the ”Recommended for RSS 2.0” sections, and remove the explicit ”Recommended for RSS 1.0” labels from the spec.
I don’t see a problem with including the rdf:resource attribute version in an RSS 2.0 feed. To a non-RDF parser it is just an attribute. Simplify the spec, please.
Hmm. Phil posted another comment, in part saying that he feels that forcing the attribute would be a slap in Dave’s face. Dave, do you feel that simplifying the spec to use only one variant with an rdf:resource attribute would be a slap in your face?
Regarding the confusion issue over having more than one way to do it, the busiest and most naive producers aren’t going to read *anything*, they are going to copy-and-paste an example from another feed, or maybe from the first result in google, or some other source. If there is more than one way to do it, a certain percentage of folks are going to copy the wrong version because they found it first.
A certain other percentage are going to get a qualified answer in the form “If you have a 1.0 feed do it like this, if you have a 2.0 feed, do it like that”, perhaps also from google. Then they have to figure out what version their feed is, and their feed may not be conformant to any particular version of RSS, depending on its copy-and-paste ancestry. At that point, they’ll probably decide to try one syntax or the other, and if it works with whatever aggregator they’re testing with, they’ll then stop.
Aggregators will have to support both syntaxes for both RSS variants (four combinations) in order to work with all feeds. Consequently, testing in an aggregator becomes useless in determining whether you made the correct choice.
Does any of this sound familiar? This is how we got the current mess that most HTML documents are, since most HTML authors stop testing if it ’looks good in their browser’, and browsers have to accept badly written HTML.
Phil, please reduce your spec to one syntax that works in both RSS variants. Mandating that you have to add another namespace declaration to a 2.0 feed (and having to deal with feeds that forget to add it) is (I think) preferable to having two syntaxes and having some producers choose wrong.
I agree it would be far more sensible to use the same syntax for all versions, like this:
The RSS 1.0 dcterms module suggests that the simple syntax illustrated above should be used and I have just removed a couple of examples from the dcterms module that used rdf:about and rdf:resource (the dcterms document might not be updated for an hour or two).
Michael, your question is argumentative. I really appreciate Phil’s statement. This problem can be solved without RDF.
Yes! Damn! Thanks for pointing this out, my last post was wrong :-( I’m going to fix the dcterms draft now.
Should we expect anything different?
1. Dave Winer asks for community help deciding on a way for someone’s website syndication feed to indicate that it
Based on the DC in XML guidelines the syntax for plain XML should be this:
Also if the plan is to use this in the <channel> element then the URI should be the address of the old channel not the address of the old RSS feed… Shouldn’t it?
In any case, this stuff is best done using HTTP redirects isn’t it?
Replaces: pretty much a non-issue for implementation, since nobody has a secure use-case for it. It’s just there for decoration, so I say do it any way you like.
Yes, a dcterms:replaces or dcterms:isReplacedBy is saying that the resource under discussion, in this case the channel’s rdf:about, is replacing or replaced by. So to use isReplacedBy for a redirect, you have to mandate that the channel be about the feed, and not everyone agrees that it should be. redirect:newFeedLocation is explicitly defined as being the new location of the RSS feed for the channel, so it doesn’t matter what your channel is about.
HTTP redirects are great for them as can do it, but what started this whole discussion was the plight of people who can’t. A busy producer’s guide should certainly say ”first, if you can, use a HTTP redirect, but in case you can’t or there are readers that don’t support HTTP redirects but do support newFeedLocation…”. Whether the mod_redirect spec should also mention that you should combine it with HTTP redirects if you can is another question.
At this time, do any news aggregators do anything useful with HTTP redirects of an RSS feed?
Dave,
If my question seemed argumentative, perhaps it was because I was trying to clarify an underlying assumption behind one of Phil’s arguments.
I understand that you appreciate Phil’s statement. If someone had expressed concern over my feelings, I would appreciate it too. However, what Phil has raised as an objection to forcing the use of an RDF attribute, ”I won’t do that because Dave would be offended” is an ad-hominem argument (albeit an unusual one).
Now, unlike most folks, I don’t pretend that ad-hominem arguments are necessarily invalid. We’re social creatures making decisions based on imperfect information. If some of the information we do have relates to one of the people involved in the discussion, we can’t usually afford to simply discard it.
In point of fact, because I respect your contributions and would want you to continue your involvement, I would give Phil’s argument considerable weight, if I knew that it was true. Accordingly, since you’re posting to this discussion, I asked you whether it was true.
I still don’t know the answer to that question.
Michael, I’m still not sure what the question is. Do I think Phil should change his draft spec? No, not unless he wants to. Do I want to go out of my way to support the RDF? Straight answer: It never occurred to me that we would have to do any extra work to support it. This is a very simple problem. There may be others like it down the road. I have very little interest in RDF. Hopefully that answers the question.
Phil, I’m getting Internal Server Errors when I post, and when I come back the post appears not to be here, so I post again. Oy. Next time I’ll assume there’s a short lag between posting and when it appears on the page. Sorry for the doubles.
Not your fault: it’s the reasonable reaction. I think the problem is that I’ve got too many templates (RSS 1 and 2, for posts and for comments, as well as RSS for individual entries, on and on) all getting rebuilt every time a comment is posted, and so when the server is busy it takes long enough to rebuild that my host’s reaper kills the process, after the comment is saved but before the page is rebuilt. You don’t see your comment, so you do the right thing and repost it. I’m not sure quite what I’m going to do about it, but it better be something, before it drives me crazy.
HTTP redirects: preliminary answers based on a PHP script that returns a 301 (I had it handy already):
RadioUserland, Aggie, and AmphetaDesk all get the content from the redirected location, but none of them seem to change the URL that they will use for the next access (unless I’m looking in the wrong place: Aggie and AmphetaDesk both store it in myChannels.opml, and don’t change it; Radio I’m not so sure). So unless there’s something wrong with my redirect (take a look), then I’d say that using a 301 isn’t currently enough to get aggregators to stop hitting your old URL.
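What an aggregator would have to do to honor a 301 is mostly bookkeeping. A minimal sketch, assuming the HTTP layer can hand back the chain of (status, Location) pairs it followed; the function name url_to_store and the example hosts are hypothetical. Status 308 is included alongside 301 because it is also defined as a permanent redirect.

```python
from urllib.parse import urljoin

# Status codes that mean "use the new URL from now on".
PERMANENT = {301, 308}


def url_to_store(subscribed_url, hops):
    """Return the URL to use for the next fetch.

    `hops` is the ordered list of (status_code, location_header) pairs seen
    while fetching `subscribed_url`. Only an unbroken run of permanent
    redirects moves the subscription; a temporary redirect (302, 307) stops
    the stored URL from advancing any further.
    """
    current = subscribed_url
    for status, location in hops:
        if status not in PERMANENT or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        current = urljoin(current, location)
    return current


moved = url_to_store("http://old.example/rss.xml",
                     [(301, "http://new.example/index.rss")])
temporary = url_to_store("http://old.example/rss.xml",
                         [(302, "http://new.example/index.rss")])
relative = url_to_store("http://old.example/rss.xml",
                        [(301, "/feeds/index.rss")])
```

The content can still be read from wherever the chain ends; the point is only that the stored subscription URL changes when, and only when, the server said the move was permanent.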
Radio does not store the new URL. That’s because the redirect is handled at a lower level than the aggregator. It probably wouldn’t be a huge deal to improve it, and clearly we’re going to have to do it (and at the higher level as well).
Phil, could you share the URL for your PHP script? I may do a little programming during the playoff game and see if I can get Radio to do the right thing with a redirect.
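A minimal Python sketch of the bookkeeping being discussed here: when a poll comes back as a 301, the aggregator stores the Location value as the URL for the next poll instead of only following it once. The function name and the example.org URLs are made up for illustration, and this is not how Radio, Aggie, or AmphetaDesk actually do it.

```python
from urllib.parse import urljoin


def next_subscription_url(stored_url, status, location):
    """Return the URL an aggregator should store for its next poll.

    stored_url: the URL currently saved for the subscription
    status:     the HTTP status code of the first response (before following)
    location:   the Location header of that response, or None
    """
    # Only a permanent redirect changes what we store; a temporary
    # redirect (302, for instance) leaves the subscription untouched.
    if status == 301 and location:
        # Location may be relative, so resolve it against the old URL.
        return urljoin(stored_url, location)
    return stored_url


if __name__ == "__main__":
    old = "http://example.org/scratchpad/redir.php"
    print(next_subscription_url(old, 301, "/index.xml"))
```

The point of keeping this separate from the fetch itself is the one Dave makes above: if the HTTP layer silently follows the redirect, the layer that owns the subscription list never learns that the URL changed.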
Recapping the biggest issue with isReplacedBy (especially for those of you following along in #rss-dev (and it creeps me out every time I get a referral from the logs of people talking about me in IRC, reasonable or not)):
You can only use dcterms:isReplacedBy as a redirect if your channel’s rdf:about is the URL of the feed, rather than the URL of the HTML page. Otherwise you are saying that the HTML page URL is replaced by the new URL for the feed. There are people who don’t feel that the channel’s rdf:about should be the URL of the feed, me among them, never mind the hassle involved in explaining to users that before they redirect they have to make sure to change their MT template from “<channel rdf:about="<$MTBlogURL$>">” to um, uh, I’m not actually sure how you get the URL for an arbitrary index template in MT. Hard code it, I guess. So, no, that wheel won’t roll without some reinventing.
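To make the objection concrete, here is a rough Python sketch that reads an RSS 1.0 channel and reports what a dcterms:isReplacedBy inside it actually asserts. The sample feed and its example.org URLs are invented; the namespace URIs are the standard RDF, RSS 1.0, and DC Terms ones.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"
DCTERMS = "http://purl.org/dc/terms/"

# Invented sample: the channel's rdf:about is the HTML home page,
# not the feed, which is the case being argued about.
SAMPLE_FEED = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <channel rdf:about="http://example.org/">
    <title>Example weblog</title>
    <dcterms:isReplacedBy rdf:resource="http://example.org/new/index.rdf"/>
  </channel>
</rdf:RDF>
"""


def replaced_by_statement(feed_xml):
    """Return (subject, replacement) for the channel, or None if absent."""
    root = ET.fromstring(feed_xml)
    channel = root.find("{%s}channel" % RSS)
    if channel is None:
        return None
    prop = channel.find("{%s}isReplacedBy" % DCTERMS)
    if prop is None:
        return None
    # The subject of the statement is whatever rdf:about names.
    return (channel.get("{%s}about" % RDF), prop.get("{%s}resource" % RDF))


if __name__ == "__main__":
    subject, replacement = replaced_by_statement(SAMPLE_FEED)
    print("%s isReplacedBy %s" % (subject, replacement))
```

With this sample the subject comes out as the home page URL, so the statement reads as "the HTML page is replaced by the new feed", which is not what the producer meant to say.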
Didn’t I? http://philringnalda.com/scratchpad/redir.php
And thanks for the reminder - I meant to watch some football today, but then I got up to dozens of comments, and never got around to turning on the tube.
Dave, here’s a recap:
I had asked:
Dave, I see Phil’s objection to conflating the rdf:resource attribute with a literal as the content of the tag, but I do not see an objection to using the attribute alone.
Phil said:
Why not force the rdf:resource syntax?
Mostly, because it’s a slap in Dave’s face, and I won’t do it.
So I asked:
Dave, do you feel that simplifying the spec to use only one variant with an rdf:resource attribute would be a slap in your face?
To which you replied:
Michael, your question is argumentative.
So I said:
If my question seemed argumentative, perhaps it was because I was trying to clarify an underlying assumption behind one of Phil’s arguments. [snip]…what Phil has raised as an objection to forcing the use of an RDF attribute, ”I won’t do that because Dave would be offended” [snip]…Accordingly, since you’re posting to this discussion, I asked you whether it was true.
Whoosh, that was a longer recap than I thought it would be.
To sum up and repeat the question unambiguously: Dave, would you be personally offended if Phil changed his spec to requiring the use of an rdf:resource attribute on the tag rather than allowing a literal value contained in the tag?
Hi guys,
I’m Rogers Cadenhead’s ”old host” ;)
There’s one thing I’d like to mention: the importance that the method used doesn’t require separate RSS feeds to be generated for old and new locations.
To cope with existing aggregators that understand HTTP redirects but not the new redirect semantics, we want to be able to use an HTTP 301 redirect to point from the old to the new location. However, this means that RSS fetched from both URLs will be *identical*.
That means you can’t use ’replaces’ and ’isReplacedBy’, because you can’t display two separate RSS feeds if one of them is just a redirect to the other one.
’redirect:newFeedLocation’ sort of works, but won’t really make sense when it is presented in the XML *at the new location*. How about ’redirect:authoritativeLocation’ or ’redirect:properLocation’? ’urlUpdate:rssLink’?
<troll>Personally, I’d recommend just dropping in an ’rssLink’ element inside ’channel’ (alongside ’link’), that always points to the most up-to-date URL, but I guess that doesn’t fit in with the whole the-core-is-frozen-now thing ;-)</troll>
i personally don’t see what the big deal is with using dcterms:isReplacedBy. it’s an appropriate application of a property from a well-defined vocabulary. the problem seems to be that RDF folk, myself included, want to see the replacement URL as an rdf:resource (because the value of rdf:about ought to be the URL of the RSS feed, not the HTML page it describes, but i digress…), and the non-RDF folks don’t seem to agree. what’s funny is the objection raised to adding the RDF namespace. if it were any other namespace, no matter how obscure, i can’t imagine it would be an issue.
Phillip, I don’t understand why you’d want both to be identical. Why not redirect through a 301 if you can? That’s the easiest and it’s the most universally understood way to redirect at this time.
Re your troll, yes indeed it would be easier to just throw an optional redirect attribute on the <rss> element, but we are frozen, and gotta stick to that. It would be even cooler to put such an optional attribute on the <xml> element, that way it would work for all kinds of stuff, not just RSS. Keep dreaming, right.
Justin: that’s not a digression, that’s the core of why I don’t think isReplacedBy is a good fit. If you’ve got the chops to change the RSS 1.0 spec to mandate that the channel rdf:about only point to the URL of the feed, and then persuade a significant percentage of the aggregator developers to support isReplacedBy, more power to you. I’ll stop producing RSS 1.0, since the only reason I’m producing it now is for the side benefit of getting some RDF that describes my HTML page, but you go right ahead.
Justin, you’d have to try it out to see if your imagination is right or wrong. Phil and I have both said we’re aiming for as simple as possible. I’ve even written up my design philosophy for RSS in Sept 2000, when RDF came into our world. Simplicity is important, even paramount, if you want to make this stuff go and grow.
Now that said, it would be great if you could get a group of pragmatic, non-adversarial developers to speak on behalf of RDF. So much of what we hear from its advocates begins with personal attacks. That’s another reason work seems to stop when RDF becomes the topic.
Dave, that a problem can be solved without RDF does not mean it should be solved without it.
Personally, I’m all for using the dc elements. Yes, they might not be supported now, but any aggregator which would spend the time implementing one element from a new namespace could spend the same amount of time implementing a single dc element instead, and then when dc is supported by a different program it will understand the constructs in these feeds. Why duplicate effort?
about the digression:
The channel element contains metadata describing the channel itself, including a title, brief description, and URL link to the described resource (the channel provider’s home page, for instance). The {resource} URL of the channel element’s rdf:about attribute must be unique with respect to any other rdf:about attributes in the RSS document and is a URI which identifies the channel. Most commonly, this is either the URL of the homepage being described or a URL where the RSS file can be found. emphasis mine.
all of the language, with the exception of the last sentence seems to indicate that it ought to be the channel itself. the statement that the metadata describes the channel also seems to indicate (perhaps even dictate) that the value of rdf:about ought to be the URI of the channel, since that is what we ”are talking about” in an RDF sense. (why that last sentence made it through is a mystery to me) the sample given in the spec uses the RSS URI as the value of rdf:about.
the problem i see with using the HTML URL is that you might generate conflicting metadata. for example, the dc:creator of the HTML page might be different than the dc:creator of the RSS feed. likewise, the dc:dates will probably differ.
DJ linked to his musings on this above.
…it would be great if you could get a group of pragmatic, non-adversarial developers to speak on behalf of RDF.
i’m certainly no expert, but i’m more than willing to help spread the RDF message. what is it that you’d like to know? i’ll admit that at a first glance RDF can seem awkward and strange. however, once you take the time to get to know it, the value becomes apparent.
i’ve been trying to come up with an analogy and here’s what i have so far: learning RDF is sort of like writing your first recursive function. at first it seems strange, like it won’t work. however, it soon becomes second nature and you can solve all sorts of problems. or, maybe learning RDF is like building a class (or struct) that contains an instance of itself. again, at first it seems like it won’t work, but you soon find yourself using the technique all the time. before you know it, you’re building all sorts of cool stuff.
That’s another reason work seems to stop when RDF becomes the topic.
we’ve stopped working on RDF!? how come i didn’t get the memo! :)
This is a problem which needs to be solved by both RDF and not-RDF. That’s the post-2.0 reality. That’s what we are talking about here.
There are plenty of ways that an individual can indicate in RDF that another file should be used instead: isReplacedBy is one, mod_link is another, there may well be others. That’s not the problem we are trying to solve.
For those of us who want to see things that are supported, rather than simply things that are elegant, a solution needs to fit with both RSS 1.0 and RSS 2.0, so that enough aggregator developers will support it to make it seem worth using to producers. I don’t think it’s a coincidence that there are lots of two year old RSS 1.0 modules that are full of RDF goodness, but aren’t supported by a single aggregator, but the simple, obvious-in-XML content:encoded is now quite widely supported.
Further, using isReplacedBy requires that Justin get his way, and the channel rdf:about be mandated to point to the RSS file. I think his analysis of the spec is confusing ”the channel”, a concept, with ”the channel element and its contents”, some RDF expressed in XML. I can’t stop the RSS WG from changing the spec, but if they do the RDF in RSS 1.0 will lose all its value to me, since it will no longer serve as RDF describing my web site, and I will stop producing RSS 1.0. If you read DJ’s analysis, you’ll see that he never says that it must point to the RSS feed, only that there are some situations where that’s the only thing that makes sense, and other situations where either way might make sense.
We haven’t even stopped working with RDF, but if you don’t quit being obtuse just for your own amusement, we’ll wish we could.
Further, using isReplacedBy requires that Justin get his way, and the channel rdf:about be mandated to point to the RSS file. I think his analysis of the spec is confusing ”the channel”, a concept, with ”the channel element and its contents”, some RDF expressed in XML.
you make an excellent point about the distinction between the two. i, personally, read channel-the-element to be defining channel-the-concept. the phrase ”The channel element contains metadata describing the channel itself” is what cements it for me. if channel-the-element is describing channel-the-concept, it seems like channel-the-element’s rdf:about value ought to be the URI of the channel-the-concept being described. others may read it differently, and that’s fine. someone from the RSS WG would have to say what it means officially.
also, any way you slice it, it’s going to go someone’s way. why not mine? :) (of course, it’s not really my way. there are people on both sides of the fence. even within the RDF camp.) and, i didn’t mean to imply that DJ’s position and mine were the same. i just thought he gave a good overview.
…but if you don’t quit being obtuse just for your own amusement, we’ll wish we could
perhaps i came across differently than i had intended. sorry.
Agreed: the channel element’s rdf:about should point to the URI for channel-the-concept. The URI for my concept of the channel is the URL for my HTML page. If you see the concept of your channel as being your RSS feed, then the rdf:about for your channel element should be the URL for the feed. I assume that’s why the spec leaves the choice to the producer: there isn’t one right answer.
And if there isn’t one right answer, then an element can’t assume that its subject is the URL for the feed, so an element that says ”this is the new URL for the feed” has to be one defined to say that, rather than one defined as ”this is the new URL for my subject, whatever it may be.” RSS 1.0 is sometimes weird RDF, so you can’t always say the things you would say in plain old RDF. I think this is one of those times, when we’ll just have to muddle through as best we can, and people who can’t do HTTP redirects won’t be able to tell RDF parsers that don’t know about the RSS redirect module where their feed has moved. I hope that’s a pretty small part of the problem set, but in that case an individual could certainly try changing their channel rdf:about and adding a isReplacedBy: just because I want another module, and want to see desktop RSS readers support it, doesn’t mean that isReplacedBy doesn’t exist, or can’t be used, by people with a need and a feed where it will work.
Reading back, I noticed that I missed replying to my share of Michael’s concerns. Oops.
Actually, there isn’t any ”wrong” way to do it. If you use plain XML in your RSS 1.0 feed, well, you’ve produced slightly less useful RDF than you could have, and a well written aggregator should still redirect. If you use the RDF syntax (and the namespace, since otherwise you haven’t actually said anything at all) in a 2.0 feed, fine, you’ve just gone to a little extra work for the same result.
But if you are writing an aggregator to support it, you need to remember to support both syntaxes in either format. What does that mean? When you see a newFeedLocation element in the redirect namespace, you need to check the attributes to see if there’s a resource attribute in the RDF namespace, and you need to check the character data to see if it’s a URL. That’s all. No need to write multiple converters for various formats and encodings, no need to fire up a schema-aware RDF parser to deal with semantic enhancement, just look for either a particular attribute or character data. And if whatever a producer did works in one aggregator, then it should work in yours and everyone else’s, too. This just isn’t that complicated.
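As a sketch of how little code that check takes, here is one way it might look in Python with the standard library’s ElementTree. The redirect namespace URI below is a placeholder, since this thread never states the real one; the function name is likewise made up. Only the RDF namespace URI is the real thing.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Placeholder only: the actual namespace URI of the redirect module
# is not given in this discussion.
REDIRECT_NS = "http://example.org/ns/redirect#"


def find_new_feed_location(feed_xml):
    """Return the value of redirect:newFeedLocation, or None.

    Works on RSS 1.0 and RSS 2.0 alike, because it only looks for the
    namespaced element and accepts either syntax:
      <redirect:newFeedLocation rdf:resource="..."/>
      <redirect:newFeedLocation>...</redirect:newFeedLocation>
    """
    root = ET.fromstring(feed_xml)
    for element in root.iter("{%s}newFeedLocation" % REDIRECT_NS):
        # RDF syntax: the URL is in an rdf:resource attribute.
        resource = element.get("{%s}resource" % RDF_NS)
        if resource and resource.strip():
            return resource.strip()
        # Plain XML syntax: the URL is the element's character data.
        text = (element.text or "").strip()
        if text:
            return text
    return None
```

This version accepts any non-empty value, including a relative one, and leaves it to the caller to resolve and validate it as a URL. A stricter aggregator might reject values it cannot parse.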
HTTP 301
Phillip Pearson: This makes perfect sense to me. HTTP 301 redirect is the RESTful solution. Blog hosting services should offer it. RSS aggregators should not only follow it, but follow the instructions specified in RFC1945:The requested resource has
Phillip Pearson: There’s one thing I’d like to mention: the importance that the method used doesn’t require separate RSS feeds to be generated for old and new locations. To cope with existing aggregators that understand HTTP redirects but not the new redirect semantics, we want to be able to use an HTTP 301 redirect to point from the old to the new location. However, this means that RSS fetched from both URLs will be *identical*.
+1
I expand on this notion a bit here.
There have been a few comments regarding the ”proper use of DC Terms in XML” and then into RDF, especially referencing the DC Guidelines for XML.
Please note that the DC Guidelines for XML specifically mention that they are not guidelines for DC in RDF/XML, that modelling DC for use in RDF (and then XML as a serialization) is a separate effort.
Using rdf:resource for a DC value is thus a possible interpretation of how DC would be modelled in RDF. More authoritative discussion or references would be appreciated!
Absolutely. By an odd coincidence, I was just fruitlessly searching the DCMI site for any mention of how to do DC in RDF, or even any mention of the precise meaning of ”should” in the XML guidelines. I know what ”SHOULD” means to the W3C, but I suspect that DCMI isn’t using quite the same formal meaning.
D’oh. If I’d started at the front page and gone to documents, I would have found Expressing Qualified Dublin Core in RDF/XML and Expressing Simple Dublin Core in RDF/XML. Sometimes it doesn’t pay to search.
And given the authors, no surprises about how it’s done.
I am pleased to report that I was just able to subscribe to Phil’s feed through the redirection link, and have the new URL recorded in the database, in Radio 8. Now I’ve got to let it burn in for a few hourly scans, see if the change broke anything in the rest of the aggregator. Basically, at least one part of the problem is almost solved in one aggregator.
Dave: BRAVO! I’ve updated my blog entry to reflect this good news.
Phil: The question I am trying to tee up is as follows: what information should be added to these feeds to allow less HTTP protocol aware aggregators to also update their databases? Should they put in a into the new feed?
Phil said:
Actually, there isn’t any ”wrong” way to do it.
[snip]
[I]f you are writing an aggregator to support it, you need to remember to support both syntaxes in either format. …[I]f whatever a producer did works in one aggregator, then it should work in yours and everyone else’s, too.
Well, that seems (to me) to be making unnecessary extra work for folks developing aggregators. But as a User and occasional Developer of Tools, I can’t complain.
Dave: Did your test work with attributes as well as literals? I can’t see Phil’s redirection feed, because there is an HTTP redirect on it (not the situation I thought was being tested).
Dave: kudos! Even if nothing else comes of this, having one 301-aware aggregator is a big step in the right direction.
Sam: I’ve got the ugly feeling that I’m missing something here. Hope you don’t have to draw me a picture, but you might.
If I write an utterly clueless app that does its own HTTP but ignores 301, I’m toast: all I get back is a header saying 301 and the new location, so as far as I’m concerned the file is suddenly blank.
If I’m using a decent HTTP library poorly, then when someone adds a .htaccess (as I just did) in /scratchpad/, saying
RewriteRule redirected.rss /index.xml [R=301,L], then my library gives me a status=301, location=http://philringnalda.com/index.xml, and the content of index.xml, but since I’m too lame to check the status code, I think I just retrieved /scratchpad/redirected.rss. I parse it, see redirect:newFeedLocation with a value of /index.xml, say ”wups, that’s not the URL I just GETted, better try again,” and when I GET /index.xml I see redirect:newFeedLocation with a value of /index.xml, and say ”yup, I already knew that.”
So unless I’m missing something, if you can do a 301, do, and add redirect:newFeedLocation in the new location just in case. If you can’t do a 301, put redirect:newFeedLocation in the old file, or if your situation requires it put it in both, since anything that sees it should only take action if $url_I_got != $newFeedLocation.
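The comparison in that last rule might be sketched in Python like this. The function name and the example.org URLs are invented, and resolving a relative value against the URL that was fetched is my assumption about how a value like /index.xml would be handled.

```python
from urllib.parse import urljoin


def url_to_refetch(fetched_url, new_feed_location):
    """Return the URL to switch to, or None if no action is needed.

    fetched_url:       the URL the aggregator believes it just retrieved
    new_feed_location: the value found in redirect:newFeedLocation, or None
    """
    if not new_feed_location:
        return None
    # Resolve values such as "/index.xml" against the URL we asked for.
    target = urljoin(fetched_url, new_feed_location)
    # A feed that names its own URL means "you are already in the
    # right place", so nothing needs to change.
    if target == fetched_url:
        return None
    return target


if __name__ == "__main__":
    print(url_to_refetch("http://example.org/scratchpad/redirected.rss", "/index.xml"))
    print(url_to_refetch("http://example.org/index.xml", "/index.xml"))
```

Because the same element can sit in the feed at both the old and the new location, the aggregator never loops: the second fetch returns a value equal to the URL it just used.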
Michael: no, we’re playing with HTTP 301 while we wait with fingers crossed to see whether anyone else will consider supporting redirect:newFeedLocation. As to the pain of implementing the syntax, Sam’s got a tagline for you: it’s just code. I’ll fight tooth and nail against that being a retroactive answer, where you say ”we’re changing the spec, everything previously rewritten needs to change, but it’s just code” because there’s no way of knowing if all the authors of that code are even still alive, but for something new, I don’t see a problem with a little extra code if it makes things simpler for the end user/producer.
So unless I’m missing something, if you can do a 301, do, and add redirect:newFeedLocation in the new location just in case. If you can’t do a 301, put redirect:newFeedLocation in the old file, or if your situation requires it put it in both, since anything that sees it should only take action if $url_I_got != $newFeedLocation.
Personally, I think that definition is not obvious from the name. I would think that an aggregator which sees a redirect would try to chase it. I’m pretty sure that a browser would.
I’d prefer a tag name which connotes the following concept: ”if you are reading this feed, the preferred URL for accessing me is http:xxxx”.
Remember: My design goal for an RSS 2.0 element is that it be instantly obvious just exactly what it is and does.