How to Create 301 Redirects from a Sitemap Page - Quick & Easy Using Excel and Notepad
One of the things I keep hearing is how difficult it is to create all of the proper 301 redirects when you're moving a site from blog.domain.com to www.domain.com/blog.
But with this tutorial, you can create 10,000+ 301 redirect links in under 20 minutes. Starting with just the sitemap page, and using a few Excel spreadsheet techniques and a text editor.
Can't I Just Ignore 301s?
''Can't we just 301 redirect the whole blog.domain.com to www.domain.com/blog? Isn't that good enough?'' many people ask. ''I didn't budget for it, and I can't spend another 40 hours doing this by hand.''
The short answer is no. If you ignore 301 redirects, your website (or the Website of your client) will quickly loose ranking. Instead of directing users to proper URLs, you're giving them 404s. We had a client come to us after their previous company migrated them to a new eCommerce platform, and didn't bother to do 301s. (I don't think they knew what they were.) All of the URLs changed. They disappeared from rankings, and their sales tanked. Really sad, because a few hours of work would have likely prevented this from happening.
And nothing says, ''I don't care'' like an Error 404. Both to Google and the end user.
The good news: there is an easy way to automate the process, using nothing but a Web browser, Excel and a Text editor, like Notepad. It doesn't have to take you a week of labor to do this.
So ... here's how you can create a list of 301 redirects quick and easy, without any real programming and, just by using an existing sitemap on a website that you can then drop right into your .htaccess file, and everything will work properly.
Tools you'll need to build a list of 301 redirects:
Web browser (if you don't have this... you should move on to another project. And I'm not really sure how you're reading this. Perhaps your assistant printed out this page for you).
Excel (Google docs won't work. You need to have the real desktop version of Excel).
Notepad (on a PC) or Text Edit (on a Mac) for editing plain text files.
For this example, we'll be creating a bunch of 301 redirects from http://www.customerparadigm.com/ to redirect to: http://blog.customerparadigm.com/.
Usually you'd be moving content from ''blog.domain.com'' to ''domain.com/blog'', so that you can build up the domain authority for all of your efforts.
But I already have all of the content on our website at domain.com/blog, so we'll start there.
To do this the other way, you should be able to figure out how to change the necessary steps. This example also works if you are changing the domain name, too. For example, it's just as easy to use this example to take all of the pages on the www.CustomerParadigm.com site and create 301s to a new domain, such as CustomerParadigm.org.
Step 1. Visit the Sitemap on the Website:
The sitemap on a Website is usually a link listed in the bottom footer navigation on the page.
A little history lesson on Website Sitemaps.
When Websites were still really new (think 1995 or so), a ''Sitemap'' made it easy to figure out how to navigate around a Website. Because things like top navigations didn't really exist yet. Oh yeah. Dial up speeds of 56kbs was considered quick.
Then a couple of years later, navigation menus were more standard (think 1998 or so), and the prevailing theory of the day was that if you put a sitemap on your site, it meant that your navigation sucked, and users had to default back to the sitemap to find their way around. So many Web designers took them out.
But then SEO people discovered that a sitemap helped a search engine like Google to see and index all of the pages on a site, and so we started linking to the sitemap from the bottom of the page.
Why Use the Sitemap?
What's important for the 301 redirect lesson is that a sitemap is now a pretty standard feature on most sites. And if it's not turned on and linked, it's usually easy to generate.
For our purposes, the sitemap is going to give you a list of ALL of the pages on the site... in a way that we can then rip apart and create a 301 redirect file, quick and easy.
Okay... enough with the history lesson. Here's the sitemap for the Customer Paradigm site, generated auto-magically by WordPress. It displays a list of all of the pages on our site:
If you scroll down, we're just going to focus on the ''Photography'' section of the sitemap:
Perhaps the big strategists from the brilliant SEO team found out that all of the photography content on the site needed to be moved elsewhere, to a different blog.
But you could do this whole exercise with the entire sitemap if you needed to, and create 301 redirects for all of the pages on the site.
Step 2. View the HTML Source Code:
Depending on your browser, click on the necessary buttons to view the source code in a text editor. In many browsers, right click and select ''View Source''.
In this case, I've scrolled down to line 2038 of the HTML source code, to focus solely on the photography section:
You can copy and paste these lines of HTML into a new text editor, like Notepad on a PC or TextEditor on a Mac.
Step 3. Create A Tab Stop & Do A Find-And-Replace
In the text editor, find a place in the text, and place your cursor there. Hit the ''tab'' button.
Then highlight, select and copy this tab space.
It looks like you're just copying a space. But you're actually copying a tab.
On a PC, use CTRL+C to copy. On a Mac, use Command+C to copy. If you don't already know these keyboard shortcuts, you should learn them because it will speed up your work by 50 times or more. Really, you should know how to do this.
Next do a Find-And-Replace operation, using this tab stop to create a tab between the href=" and the http://www., such as in the example below:
In Notepad on a PC, Control-H will bring the find-replace function up; it's different in different programs.
For the ''Find What'' field, type (or paste) in: href="http://www.
For the "Replace with" field, type in: href=" http://www.
Where the space between the '' and the http://www. contains the Tab stop.
Here's a screenshot of how this looks in Notepad:
This is an important and critical step.
What we're doing is creating a tab stop before the link, so that we can isolate it later into it's own column in Excel. (Don't worry about that part yet. But that's why we're doing this.)
Hit Replace All, and let it do its magic.
If it works, the first line will look something like this:
href=" http://www.customerparadigm.com/category/photography-portfolio-jeff-finkelstein/" title="Photography"
You're halfway there. Next, we need to isolate the end of the link with a second tab stop.
Step 3B. Put a Tab at the End of the Link
We need to somehow get rid of the '' title=''Photography'' portion after the link. Again, our goal is to isolate the full link, in between two tab stops.
In this case, all of the HTML was formatted with the '' title=" after the end of the href=" " content. One of the reasons I like using the sitemap page, especially if it is auto-generated, is that all of the content and links are usually in a very consistent format. And consistent = easy to automate.
So, let's find and replace that "title="Photography" out of our lives, by inserting a tab stop before the first ":
Again, in the Find What field, type: " title="
In the Replace With field, type: " title="
... where the space in between the " and title=" is now a Tab stop.
Hit Replace All.
If it works, the first line will look something like this:
href=" http://www.customerparadigm.com/category/photography-portfolio-jeff-finkelstein/ " title="Photography"
Here's a screenshot of how you have a tab stop before and after the HTML link. We've now "wrapped" the link in between two tab stops.
The next part is sheer magic. It's like slicing the crust off of your bread, but just doing it through 10,000 pieces of bread at the same time.
Step 4. Paste the Code Directly Into A Blank Excel Spreadsheet.
Open up a blank Excel Spreadsheet. Google Docs probably won't cut it. You need the real desktop version, especially for Step 7.
If you copy and paste the raw HTML code into it, Excel interprets the Tab stops you've added as column separators. It's really handy. I use this all the time.
If you did this correctly, it should look something like this:
Perfect! Now you have the link that you need, isolated in it's own column in an Excel spreadsheet. We're golden.
Next, highlight Columns A and B.
Right click (or Control-Click on a Mac) and delete these columns.
Do the same thing with Columns D, E and F.
You should now have a clean version of the link:
Go through the links and make sure there are no blank lines. If you need to, you can always do a sort from the Data tab. This will avoid empty lines in your spreadsheet, which can mess up things later.
Yeah! The hard part is over. Now, it's a matter of taking this huge list of links, and formatting them properly for the 301 redirect.
Don't worry if Excel decides to "Hotlink" any of the URLs. We're going to be pasting into a text editor next, and that will remove any of this super-fancy Microsoft-inspired formatting.
Highlight Column A, and copy to the clipboard. (Don't use the Cut command. You still need the stuff in Column A for a later step).
Step 5. Create the "source" link for the 301 Redirect.
Next, paste the "clean" list of links into a text editor.
It should look something like this, with the full list of URLs, one per line.
Our goal is to take this list of links, and turn it into the "source" for the 301 redirect links.
A quick side note about how 301s work and are formatted:
A 301 Redirect is usually found in a .htaccess file (on Apache servers) and is formatted like this:
Redirect 301 /category/photography-portfolio-jeff-finkelstein/ http://blog.customerparadigm.com/category/photography-portfolio-jeff-finkelstein/
Where:
Redirect 301 (with a space after it) tells the .htaccess file to redirect this with a permanent 301 move.
/category/photography-portfolio-jeff-finkelstein/ - this is the "Source" file on the site that is being redirected
http://blog.customerparadigm.com/category/photography-portfolio-jeff-finkelstein/ - this is the destination of the 301 redirect
More information about 301 redirects are here
What we need to do is take the big long list of links that we've created, and remove the first part of it. In this case, we want to get rid of the http://www.customerparadigm.com
In your text editor, use the find-replace function to:
Find: http://blog.customerparadigm.com/
Replace: /
Important note: you really do need that first slash. If you don't include it, the 301 redirects probably won't work on the site. Computers are often picky that way.
Here's a screenshot of how this should look:
Hit Replace All.
If it works properly, your text file should look like this:
Copy and Paste this contents from the text file into Column B of the spreadsheet.
Make sure you start at the top and double check to make sure that the lines add up. Otherwise, you're going to redirect users to another page on the site, which will confuse your clients, confuse Google, and make you look bad.
Here's how this should look, after you paste the information back into Column B:
Okay ... we're almost there.
Step 6. Create the 301 Redirect Destination Links:
In the next step we need to create the destination link for the 301 redirects. (This is the final URL where you want the end user to go... the place where the page has been permanently moved.)
We're going to repeat much of what we did in Step 5. Take the list of links from Column A in the spreadsheet, and paste them into a text editor again.
It should look like this:
Next, use the Find-Replace option to change the URL from http://www.CustomerParadigm.com to the new 301 redirect destination, http://blog.customerparadigm.com.
If you wanted to use the same script to change the domain name (i.e. to CustomerParadigm.org), then use that URL instead. Normally you wouldn't want to change a blog from the www.domain.com/blog or a subdomain like blog.domain.com, as it's not good for SEO. But, this is an example... and my example site was set up properly :).
Find: http://www.customerparadigm.com/
Replace: http://blog.customerparadigm.com/
Hit Replace All.
If you did this correctly, your text file should look like this:
Select all the links, copy them, and then paste them in Column C in the Excel Spreadsheet:
Before you go to the next step, make sure your columns line up.
Again, I've been guilty of messing this up in the past. (Only in that case, I mixed up the first names and email addresses in an email list for a 5-star hotel, and then sent an email to all of their guests where their name was incorrect. Not so good. I'm much more careful now.)
Anyway, it should look like this, where the URLs all match up in Columns A, B and C:
Step 7. Building Your 301 Redirects via Concatenation:
Microsoft Excel has a tremendous amount of power. Believe it or not, I spent an entire semester in grad school learning about one single function in Excel. Excel can do some amazing things.
One of the most useful functions for SEO people is concatenate.
Concatenate basically allows you to string together columns of text, and combine that into one cell.
Yes, there are much more elegant ways to do this with object-oriented programming. But if you're not a programmer, and don't want to mess with database strings, this is a function that makes it really quick and fast to do. By the time your highly-paid programmer has read the requirements for what you want to do to string fields of text together, this Excel spreadsheet function allows you to do this in a matter of seconds.
Okay... to get started, click your mouse into Cell D1.
Then click on the fx button to insert a function:
Choose "Concatenate" from the drop-down. It's in the Text category:
Next, build your concatenation argument.
Here's how you should do this:
Text1: Type in "Redirect 301 " (make sure to have that space after the 301)
Text2: Click your mouse onto cell B1 (or type in B1)
Text3: Type in a space. (the system will show " ")
Text4: Click your mouse onto cell C1 (or type in C1)
Here's how this should look like:
Hit okay to have Excel build your 301 redirect. This will only do this for the first line.
Here's how the formula will read in the D1 cell:
CONCATENATE("Redirect 301 ",B1," ",C1)
(You can also just copy and paste this right into cell D1, too.)
Next, take the little square on the bottom right of the square, and drag it all the way to the bottom of your rows. You can also double-click on this square, and it should automatically fill to the bottom of the page, unless you have an empty line (in this case it might stop). That's why I recommend dragging it all the way down by hand.
It should look like this:
You're done! Your 301 redirects have been built auto-magically.
Copy and paste Column D and paste it into a text editor, or directly into your .htaccess file to complete the 301 redirects.
Step 8. Testing. (Make sure the 301 redirects work properly)
First, before you upload your .htaccess file with the freshly-minted 301 redirects, it's important to back up your .htaccess file first. I recommend backing it up in at least two locations - one on your computer, and the other on the server. That way, if you mess up anything, you can quickly revert back to the original version. It's the belt-and-suspenders method: if you wear both a belt and suspenders, chances are your pants won't fall down. Redundancy is key. Let me repeat that :).
Second, try pasting in ten rows, and testing:
Do each of those ten 301 redirects work properly? Visit the source URL, and make sure it goes to the correct place with a 301 redirect.
Then, make sure that all the other pages on the site still work properly.
Then... and only then, add in everything else. Do another round of testing. You might want to automate the process by putting in all of the source links into an automated program to make sure that they don't return error 404s.
If something in the .htaccess file breaks your site, quickly revert to the original file. Look for illegal characters, lines joined together, or other "funky" issues. Try adding back in the 301s one at a time, until you can isolate the line or lines causing the issue.
Third, don't do this at the end of your work day. Or even on a Friday afternoon. It's best to push the 301 redirects live sometime in the morning, when you've had a couple of cups of coffee. What you don't want to have happen is your .htaccess file breaking other parts of your site, and you're checked out for the weekend.
Summary
You shouldn't have to spend a week building 301 redirects by hand. Once you have these steps down, you can create a 301 redirect file for 10,000 pages in less than 20 minutes.
But Can't I Just Use a One-Line Rewrite Rule in .htaccess?
Yes, you could add a line like this to your .htaccess file and accomplish what we're doing here today:
RewriteRule ^/blog/(.*)$ http://blog.subdomain.com/$1 [R=301,L]
However:
You might not want to use mod rewrite, because you can't see each line and test it. For a small or medium-sized site, having 250 or 300 lines in an .htaccess file gives peace of mind, because you know that each URL will be redirected properly, and you can actually see each line and test it. You don't have to worry about coding rules that may or may not work, because you don't understand Apache. In this case, spending 20 minutes using Excel and Notepad might be a lot easier for SEO professionals to tackle, then trying to work with an expensive programmer.
If you only want to 301 redirect parts of a site, this is an ideal way to pick and choose how to create rewrites for a specific section. For example, if I only wanted to redirect the article section on a site, it's easy to create these 301s easy, and without programming.
Finally, in some cases, a mod rewrite rule won't work - especially if the URL has strange characters in it, or the file extension changes. Without knowing specific programming, it's easy to do a find-replace on the .html extension and replace it with a slash or .php fairly easily.
Note: The opinions expressed in this article are the views of the author, and not necessarily the views of Caphyon, its staff, or its partners.
Article by
Jeff Finkelstein
Finkelstein is the founder of Customer Paradigm, a Boulder, Colorado-based interactive marketing firm that specializes in eCommerce and interactive marketing. Finkelstein has been featured in The New York Times as a Web Guru, and his company has received numerous awards, including the prestigious 2008 Rocky Mountain Direct Marketing Association Supplier of the Year Award.
stay in the loop