root/trunk/CleanSweep/CLEANSWEEP-README.txt

Revision 631, 3.8 kB (checked in by breese, 7 months ago)

my hack for the day: adapting Clean Sweep to automatically guess the intended URL being requested and redirect the client browser accordingly. the benefit: fewer 404s!!

Clean Sweep Plugin For Movable Type
By: Byrne Reese <byrne at majordojo dot com>

Donated in whole to the Movable Type Open Source Project
Copyright 2007-2008 Six Apart Ltd.

OVERVIEW

Clean Sweep is a plugin that assists administrators in finding and fixing
broken inbound links to their website. It was built to support two use
cases:

* to help users get a clean start with their blog by allowing them to
  completely restructure their permalink URL structure and have a system
  that can automatically adapt by redirecting stale and inbound links to
  the proper destination

* to help users in the process of migrating to Movable Type who are
  forced to modify their web site's URL and permalink structure

Both of these use cases have to do with preserving a site's page rank
through a major redesign.

HOW IT WORKS

Under the Blog Plugin Settings, select Clean Sweep and retrieve the
Apache configuration directive that will begin routing all 404s through
the Clean Sweep plugin.

Clean Sweep will then track all inbound links that result in a 404,
try to deduce the intended file, and redirect the client to that file.

Clean Sweep will also produce a set of Apache mod_rewrite rules to map
inbound links to their destination permanently.
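
As a rough illustration of what such a permanent mapping could look
like, the Python sketch below turns one stale path and its destination
into a mod_rewrite rule. The function name rewrite_rule is hypothetical,
and this README does not show the exact rules Clean Sweep generates, so
the output format here is an assumption (written for an .htaccess file
in the DocumentRoot).

```python
import re


def rewrite_rule(old_path: str, new_path: str) -> str:
    """Build one permanent-redirect RewriteRule (hypothetical helper).

    Assumes .htaccess (per-directory) context, where the pattern is
    matched against the path without its leading slash.
    """
    # Escape regex metacharacters such as "." so the rule matches the
    # stale path literally.
    pattern = re.escape(old_path.lstrip("/"))
    # R=301 marks the redirect as permanent; L stops further rewriting.
    return f"RewriteRule ^{pattern}$ {new_path} [R=301,L]"


if __name__ == "__main__":
    print(rewrite_rule("/archives/000675.php",
                       "/2005/07/goodbye-bookque.php"))
```

A 301 status is used because the stated goal is a permanent mapping, so
that clients and search engines can update their stored links.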

REDIRECTION RULES

Clean Sweep will use the following ruleset in trying to guess the target
URL the client is requesting:

1) Is the target resource using the entry id as a URL?
   This is a prevalent URL pattern for older MT installations. This will:

   Map: http://www.majordojo.com/archives/000675.php
   To:  http://www.majordojo.com/2005/07/goodbye-bookque.php

2) Is the target resource using underscores when it should be using
   hyphens? Many users have switched to using hyphens for purported SEO
   benefits. This rule looks for a file in the system of the same name,
   but using '-' instead of '_'. This will:

   Map: http://www.majordojo.com/2005/07/goodbye_bookque.php
   To:  http://www.majordojo.com/2005/07/goodbye-bookque.php

3) Is there a target resource with the same basename somewhere?
   If a user switches their primary mapping to use a date-based URL as
   opposed to a category-based URL, then this rule will apply. This will:

   Map: http://www.majordojo.com/personal-projects/goodbye-bookque.php
   To:  http://www.majordojo.com/2005/07/goodbye-bookque.php

4) Let me know and I will add it!
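
The three rules above can be sketched in Python as follows. This is an
illustration only, not the plugin's actual code: the function
guess_target, the set of published paths, and the entry_paths lookup
(entry id to current path) are all hypothetical stand-ins for whatever
Clean Sweep consults inside Movable Type, and the rule order simply
follows the list above.

```python
import posixpath
import re
from typing import Dict, Iterable, Optional


def guess_target(path: str,
                 published: Iterable[str],
                 entry_paths: Dict[int, str]) -> Optional[str]:
    """Guess the intended path for a request that produced a 404."""
    known = set(published)

    # Rule 1: the file name is a zero-padded entry id,
    # e.g. /archives/000675.php.
    match = re.fullmatch(r".*/(\d+)\.\w+", path)
    if match:
        target = entry_paths.get(int(match.group(1)))
        if target:
            return target

    # Rule 2: same path, but with '-' in place of '_'.
    hyphenated = path.replace("_", "-")
    if hyphenated != path and hyphenated in known:
        return hyphenated

    # Rule 3: any published file with the same basename. Sorting makes
    # the choice deterministic when several files share a basename.
    base = posixpath.basename(path)
    candidates = sorted(p for p in known if posixpath.basename(p) == base)
    if base and candidates:
        return candidates[0]

    # No rule matched; the caller would show the "URL Not Found" page.
    return None


if __name__ == "__main__":
    published = ["/2005/07/goodbye-bookque.php"]
    entries = {675: "/2005/07/goodbye-bookque.php"}
    for stale in ("/archives/000675.php",
                  "/2005/07/goodbye_bookque.php",
                  "/personal-projects/goodbye-bookque.php"):
        print(stale, "->", guess_target(stale, published, entries))
```

Returning None when nothing matches mirrors the fallback described in
the INSTALL section, where visitors see the "URL Not Found" page.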

SUPPORTED WEB SERVERS

Clean Sweep supports both Apache and Lighttpd. For now, you choose which
web server you are using on a blog-by-blog basis. All documentation,
however, refers to Apache, as it is far more common. Lighttpd users
should simply follow the analogous instructions for their web server
where appropriate.

INSTALL

1. Unpack the Clean Sweep archive.
2. Copy the contents of CleanSweep-1.x/plugins to:
   /path/to/mt/plugins/
3. Create a page in Movable Type called "URL Not Found". Give it a
   basename of "404". Add whatever personalized message you want
   displayed to your visitors when Clean Sweep is unsuccessful in
   mapping the request to the correct page or destination.
4. Publish the page and note its complete URL on your published blog.
5. Navigate to the Plugin Settings area for Clean Sweep.
6. Enter the full URL of the "URL Not Found" page you created in
   step 3 into the "404 URL" configuration parameter for Clean Sweep.
7. In your plugin settings area for Clean Sweep, make note of the
   Apache configuration directive that Clean Sweep asks you to place
   in your httpd.conf or in an .htaccess file.
8. Add the Apache configuration directive to your web server. This may
   be placed in your httpd.conf file or in an .htaccess file located
   in the DocumentRoot for your blog.
9. Restart Apache.

LICENSE

Clean Sweep is licensed under the GPL (v2).