root/trunk/CleanSweep/CLEANSWEEP-README.txt

Revision 631, 3.8 kB (checked in by breese, 19 months ago)

my hack for the day: adapting Clean Sweep to automatically guess the intended URL being requested and redirect the client browser accordingly. the benefit: fewer 404s!!

Clean Sweep Plugin For Movable Type
By: Byrne Reese <byrne at majordojo dot com>

Donated in whole to the Movable Type Open Source Project
Copyright 2007-2008 Six Apart Ltd.

OVERVIEW

Clean Sweep is a plugin that assists administrators in finding and fixing
broken inbound links to their website. It was built to support two use
cases:

* to help users get a clean start with their blog by allowing them to
  completely restructure their permalink URL structure and have a system
  that can automatically adapt by redirecting stale inbound links to
  the proper destination

* to help users in the process of migrating to Movable Type who are
  forced to modify their web site's URL and permalink structure

Both of these use cases are about preserving a site's page rank
through a major redesign.

HOW IT WORKS

Under the Blog Plugin Settings, select Clean Sweep and retrieve the
Apache configuration directive that will begin routing all 404s through
the Clean Sweep plugin.

Clean Sweep will then track all inbound links that result in a 404 and
will ultimately deduce the intended file and redirect the client to that
file.

Clean Sweep will also produce a set of Apache mod_rewrite rules to map
inbound links to their destination permanently.
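
As an illustration of what such rules could look like, here is a minimal
Python sketch that turns (old URL, new URL) pairs into permanent
mod_rewrite redirects. It is not the plugin's own code: the function name
rewrite_rules and the exact rule layout are assumptions. The patterns are
written for an .htaccess file, where the leading slash is not part of the
matched path.

```python
import re
from urllib.parse import urlparse


def rewrite_rules(mappings):
    """Build one permanent-redirect RewriteRule per (old_url, new_url) pair.

    Hypothetical helper for illustration only; the rules Clean Sweep
    itself produces may be laid out differently.
    """
    rules = ["RewriteEngine On"]
    for old_url, new_url in mappings:
        # .htaccess patterns match the path without its leading slash.
        old_path = urlparse(old_url).path.lstrip("/")
        new_path = urlparse(new_url).path
        # Escape regex metacharacters (such as '.') in the old path.
        pattern = "^%s$" % re.escape(old_path)
        # R=301 marks the redirect as permanent; L stops further rewriting.
        rules.append("RewriteRule %s %s [R=301,L]" % (pattern, new_path))
    return rules


if __name__ == "__main__":
    pairs = [
        ("http://www.majordojo.com/archives/000675.php",
         "http://www.majordojo.com/2005/07/goodbye-bookque.php"),
    ]
    print("\n".join(rewrite_rules(pairs)))
```

Query strings are ignored in this sketch; only the path portion of each
URL is used.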

REDIRECTION RULES

Clean Sweep will use the following ruleset in trying to guess the target
URL the client is requesting:

1) Is the target resource using the entry id as a URL?
   This is a prevalent URL pattern for older MT installations. This will:

   Map: http://www.majordojo.com/archives/000675.php
   To:  http://www.majordojo.com/2005/07/goodbye-bookque.php

2) Is the target resource using underscores when it should be using hyphens?
   Many users have switched to using hyphens for purported SEO benefits.
   This rule looks for a file in the system of the same name, but
   using '-' instead of '_'. This will:

   Map: http://www.majordojo.com/2005/07/goodbye_bookque.php
   To:  http://www.majordojo.com/2005/07/goodbye-bookque.php

3) Is there a target resource with the same basename somewhere?
   If a user switches their primary mapping to use a date-based URL as
   opposed to a category-based URL, then this rule will apply. This will:

   Map: http://www.majordojo.com/personal-projects/goodbye-bookque.php
   To:  http://www.majordojo.com/2005/07/goodbye-bookque.php

4) Let me know and I will add it!
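
The three rules above can be sketched in Python as follows. This is an
illustration under stated assumptions, not the plugin's implementation:
guess_target, the set of published paths, and the entry-id lookup table
are all hypothetical stand-ins for whatever Clean Sweep consults
internally. When rule 3 finds several files with the same basename, the
sketch simply takes the first one in sorted order.

```python
import posixpath


def guess_target(path, published, entry_urls):
    """Return the published path that `path` most likely meant, or None.

    path       -- requested path that produced a 404
    published  -- set of paths that currently exist (hypothetical index)
    entry_urls -- dict mapping entry id (int) to its current permalink path
    """
    base = posixpath.basename(path)
    stem, _ext = posixpath.splitext(base)

    # Rule 1: the old URL is a zero-padded entry id, e.g. /archives/000675.php
    if stem.isdecimal() and int(stem) in entry_urls:
        return entry_urls[int(stem)]

    # Rule 2: the same path exists with '-' in place of '_'
    hyphenated = path.replace("_", "-")
    if hyphenated != path and hyphenated in published:
        return hyphenated

    # Rule 3: a file with the same basename exists somewhere else
    matches = sorted(p for p in published if posixpath.basename(p) == base)
    if matches:
        return matches[0]

    # No rule applied; the caller would fall back to the 404 page.
    return None


if __name__ == "__main__":
    published = {"/2005/07/goodbye-bookque.php"}
    entry_urls = {675: "/2005/07/goodbye-bookque.php"}
    for requested in ("/archives/000675.php",
                      "/2005/07/goodbye_bookque.php",
                      "/personal-projects/goodbye-bookque.php"):
        print(requested, "->", guess_target(requested, published, entry_urls))
```

The rules are tried in the order listed, so a numeric filename is always
treated as an entry id first.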

SUPPORTED WEB SERVERS

Clean Sweep supports both Apache and Lighttpd. For now, you select which
web server you are using on a blog-by-blog basis. All documentation,
however, refers to Apache, as it is far more common. Lighttpd users
should simply follow the analogous instructions for their web server
when appropriate.
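
To show what "analogous" means in practice, the sketch below prints the
usual 404-handler directive for each server. ErrorDocument (Apache) and
server.error-handler-404 (Lighttpd) are standard directives of those
servers, but the function name and the handler path used here are
hypothetical. The directive to actually install is the one shown in the
Clean Sweep plugin settings.

```python
def not_found_directive(server, handler_path):
    """Return a config line routing 404s to handler_path.

    `server` is "apache" or "lighttpd"; `handler_path` is a placeholder
    for the handler location reported by the plugin settings.
    """
    if server == "apache":
        # Goes in httpd.conf or an .htaccess file.
        return "ErrorDocument 404 %s" % handler_path
    if server == "lighttpd":
        # Goes in lighttpd.conf.
        return 'server.error-handler-404 = "%s"' % handler_path
    raise ValueError("unsupported web server: %r" % (server,))


if __name__ == "__main__":
    # Hypothetical handler path, for illustration only.
    handler = "/path/to/cleansweep-handler.cgi"
    print(not_found_directive("apache", handler))
    print(not_found_directive("lighttpd", handler))
```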

INSTALL

1. Unpack the Clean Sweep archive.
2. Copy the contents of CleanSweep-1.x/plugins to:
   /path/to/mt/plugins/
3. Create a page in Movable Type called "URL Not Found". Give it a
   basename of "404". Add whatever personalized message you want
   displayed to your visitors when Clean Sweep is unsuccessful
   in mapping the request to the correct page or destination.
4. Publish the page and note its complete URL on your
   published blog.
5. Navigate to the Plugin Settings area for Clean Sweep.
6. Enter the full URL of the "URL Not Found" page you created in
   step #3 into the "404 URL" configuration parameter for Clean Sweep.
7. In your plugin settings area for Clean Sweep, make note of the
   Apache configuration directive that Clean Sweep asks you to place
   in your httpd.conf or in an .htaccess file.
8. Add the Apache configuration directive to your web server. This may
   be placed in your httpd.conf file or in an .htaccess file located
   in the DocumentRoot for your blog.
9. Restart Apache.

LICENSE

Clean Sweep is licensed under the GPL (v2).