Even geekier portion of this post:
When I gathered all these feeds together there were a lot of duplicate feeds, and figuring those out was a pain in the ass because a lot of people had customized the names and descriptions and so forth. What to do?
It helps to know that an OPML file is just text, and that NetNewsWire will fetch descriptions for you. So here's a line of text representing the feed for Super Colossal (which is great but sadly not recently updated):
<outline text="Super Colossal" description="" title="Super Colossal" type="rss" version="RSS" htmlUrl="http://supercolossal.ch" xmlUrl="http://supercolossal.ch/feed/"/>
Pretty readable: there's text, an empty description, a title, and then some things that look less irrelevant. So, if we have a second line in the file that says
<outline text="Also Super Colossal" description="It's got stuff about building things" title="Also Super Colossal" type="rss" version="RSS" htmlUrl="http://supercolossal.ch" xmlUrl="http://supercolossal.ch/feed/"/>
it's going to appear out of order on the list and eliminating a duplicate is tough (time-consuming really). TextWrangler to the rescue, one of the ultimate Mac freebies for geeks. TextWrangler does grep, a very powerful and dangerous kind of search if you don't know what you're doing. I frequently don't know what I'm doing, but I just don't care, so on we went. Since we know NetNewsWire will fetch text and title and description, all we have to do is wipe those for the 2800 feeds in the file. Easy! So here's a search that more or less means look for anything that says 'title="anything"' and replace it with an emptied version:
And there's the result:
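In code terms, that search and replace looks roughly like the Python sketch below. It assumes the Find box holds `title=".*?"` and the Replace box holds `title=""`, which is a guess based on the description above. `wipe_title` is just an illustrative name, not anything from TextWrangler or NetNewsWire.

```python
import re

def wipe_title(line: str) -> str:
    # Assumed pattern: title=" followed by as few characters as possible,
    # up to the next double quote. The non-greedy .*? keeps the match
    # from running on into the attributes that follow.
    return re.sub(r'title=".*?"', 'title=""', line)

# The Super Colossal line from earlier in the post.
feed = ('<outline text="Super Colossal" description="" title="Super Colossal" '
        'type="rss" version="RSS" htmlUrl="http://supercolossal.ch" '
        'xmlUrl="http://supercolossal.ch/feed/"/>')

print(wipe_title(feed))
```

The title comes back empty and everything else on the line is left alone, which is the point: the URLs are the part worth keeping.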
So you just do that a couple more times for the other fields - I guess you could do
text=".*?" description=".*?" title=".*?" and get it all done at once but I wasn't smart enough to think of it at the time and grep and ambition don't match well with n00bs - and then the fields are wiped. TextWrangler can then put selected lines in order and then process them in various ways, like removing duplicate lines:
And there we go, a whole bunch of feeds eliminated. DKW would do it in Excel, but I swear TextWrangler is faster. Down to around 2100. Empty out NetNewsWire, reimport the cleaned-up OPML and it'll acquire the right text and description and title on its own. And if the feeds vary slightly - one's RSS and one's Atom - it's pretty likely that when NetNewsWire gathers information about the feed it's gonna show up next to its not-quite-duplicated partner anyway, and you can get rid of that pretty easily.
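The whole wipe, sort and dedupe routine can also be sketched in Python, for anyone without TextWrangler handy. This is a rough sketch under some assumptions, not what was actually done here: it assumes one feed per line, it only touches lines that start with `<outline`, and the names `wipe_fields` and `dedupe_outlines` are made up for illustration.

```python
import re

# The attributes NetNewsWire is described as re-fetching after import.
FIELDS = ("text", "description", "title")

def wipe_fields(line: str) -> str:
    # Empty each field in turn, same idea as running the grep three times.
    for name in FIELDS:
        line = re.sub(rf'\b{name}=".*?"', f'{name}=""', line)
    return line

def dedupe_outlines(lines):
    # Keep only feed lines, wipe them, then drop exact duplicates.
    # A set does the deduplication; sorted() puts the survivors in order.
    wiped = {
        wipe_fields(line.strip())
        for line in lines
        if line.strip().startswith("<outline")
    }
    return sorted(wiped)

# The two Super Colossal lines from earlier in the post.
feeds = [
    '<outline text="Super Colossal" description="" title="Super Colossal" '
    'type="rss" version="RSS" htmlUrl="http://supercolossal.ch" '
    'xmlUrl="http://supercolossal.ch/feed/"/>',
    '<outline text="Also Super Colossal" '
    'description="It\'s got stuff about building things" '
    'title="Also Super Colossal" type="rss" version="RSS" '
    'htmlUrl="http://supercolossal.ch" xmlUrl="http://supercolossal.ch/feed/"/>',
]

cleaned = dedupe_outlines(feeds)
print(len(feeds), "lines in,", len(cleaned), "lines out")
```

Once the customized names are gone the two lines are identical, so only one survives. Like the TextWrangler approach, this only catches exact matches: an RSS feed and an Atom feed for the same site have different URLs and will both stay in.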