[Solved] Having trouble with .html at the end of pages

Postby AsherJac » Sat Aug 29, 2009 1:14 am

Hi,

I've just archived an old site to an area on a new website. I've tried to redirect things so that Google doesn't lose them, and I've managed to do the main pages manually using mod_rewrite, but I can't get a general rule to work. Here is the problem in detail.

1. Site has moved from www.XXX.com to www.YYY.com/archive/www.XXX.com
2. Google shows search results for www.XXX.com/mypage, which are redirected to www.YYY.com/archive/www.XXX.com/mypage, but this throws a 404 because the page is actually at www.YYY.com/archive/www.XXX.com/mypage.html
3. So I'm trying to set up a mod_rewrite rule that says any request for mypage should go to mypage.html

I tried using this from the FAQ (obviously replacing .php with .html):
Code:
# Redirect to remove .php
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{SCRIPT_FILENAME} -f
RewriteRule ^(.+)\.php$ /$1 [R=301,L]


But what happened was that links to www.XXX.com/mypage went to www.YYY.com/mypage.html; I need the /archive/www.XXX.com path to remain.

Any help much appreciated. Please let me know if I've not been clear or you need more info.

Thanks,

Asher
Last edited by AsherJac on Sun Aug 30, 2009 11:33 am, edited 1 time in total.
AsherJac
 
Posts: 5
Joined: Sat Aug 29, 2009 12:56 am

Postby richardk » Sat Aug 29, 2009 2:42 pm

Are there directories?
Where does xxx.com go to? (Where are you putting the mod_rewrite?)

Try
Code:
Options +FollowSymLinks

RewriteEngine On

RewriteCond %{HTTP_HOST} ^(www\.)?xxx\.com$ [NC]
RewriteRule ^$ http://www.YYY.com/archive/www.XXX.com [R=301,L]

RewriteCond %{HTTP_HOST} ^(www\.)?xxx\.com$ [NC]
RewriteRule ^(.+)$ http://www.YYY.com/archive/www.XXX.com/$1.html [R=301,L]
richardk
 
Posts: 8800
Joined: Wed Dec 21, 2005 7:50 am

Missed some vital info out

Postby AsherJac » Sun Aug 30, 2009 2:24 am

Hi Richard, thanks for the reply. I've tried your code but can't get it to work. That is probably my fault, though, as your questions made me realise I'd left out some vital info.

The initial redirect is done through Network Solutions (the registrar used for www.XXX.com). I told it to redirect all traffic from www.XXX.com to YYY.com/archive/www.XXX.com. It's set up through a GUI, so I don't know what method is being used for this.

I have created a .htaccess file in YYY.com/archive/www.XXX.com and added this code to redirect the most popular links from Google (these seem to work):
Code:
Options +FollowSymlinks
RewriteEngine On

RewriteRule ^/XXX_content/training_and_resources XXX_content/training_and_resources/XXX_tr.html
RewriteRule ^/XXX_content/for_free XXX_content/for_free/XXX_for_free.html
RewriteRule ^/mail_password_form index.html
RewriteRule ^/my_school index.html
RewriteRule ^/index_html index.html
RewriteRule ^/emailQuestion emailQuestion.html
RewriteRule ^/XXX_faq XXX_faq.html
RewriteRule ^/XXX_links XXX_links.html
RewriteRule ^/XXX_contact XXX_contact.html
RewriteRule ^/XXX_cpp XXX_cpp.html
RewriteRule ^/news front-page/news.html


So, for instance, someone clicking a link in Google to www.XXX.com/XXX_faq gets taken to YYY.com/archive/www.XXX.com/XXX_faq.html (YYY.com/archive/www.XXX.com//XXX_faq is shown in the address bar - don't know why the extra slash appears, but it works).

What I can't get to work, though, is a general rule so that any Google link to www.XXX.com/mydir/mypage is redirected to YYY.com/archive/www.XXX.com/mydir/mypage.html

Hope this is clearer.

Thanks again for your help.

Postby richardk » Sun Aug 30, 2009 11:17 am

Try
Code:
Options +FollowSymLinks

RewriteEngine On

# Check if the request exists with a .html extension.
RewriteCond %{SCRIPT_FILENAME}.html -f
RewriteRule . %{REQUEST_URI}.html [QSA,L]

in /archive/www.XXX.com/.htaccess.

Thanks so much

Postby AsherJac » Sun Aug 30, 2009 11:29 am

Richard,

Thanks so much, that's worked!

I had to put it in YYY.com/archive/www.XXX.com/.htaccess in the end but otherwise it seems to have worked perfectly.

Now I just need to study it to try to understand the regex so I can fix these problems myself next time.

Thanks again.

Postby richardk » Mon Aug 31, 2009 7:52 am

%{SCRIPT_FILENAME} is the filesystem path to the requested file/directory, e.g. /document/root/for/YYY.com/public_html/archive/www.XXX.com/mydir/mypage for a request to YYY.com/archive/www.XXX.com/mydir/mypage.

-f checks whether that path is a regular file.
http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewritecond wrote:
'-f' (is regular file)
Treats the TestString as a pathname and tests whether or not it exists, and is a regular file.
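
As a rough sketch of how -f can be combined with its negation (an illustrative variant, not the rule posted in this thread): the two extra conditions below assume you want to leave a request alone when it already points at a real file or directory.

```apache
# Sketch only: a variant of the rule in this thread with two extra guards.
# Skip requests that already map to an existing regular file...
RewriteCond %{SCRIPT_FILENAME} !-f
# ...or to an existing directory (-d is the directory counterpart of -f).
RewriteCond %{SCRIPT_FILENAME} !-d
# Only rewrite when "<requested path>.html" exists as a regular file.
RewriteCond %{SCRIPT_FILENAME}.html -f
RewriteRule . %{REQUEST_URI}.html [L]
```

Conditions are ANDed by default, so all three must hold before the rule fires.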


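Code:
.

(the only regular expression) makes sure there is at least one character to match (i.e. mypage). In .htaccess the pattern is tested against the path relative to that directory, so it does not match a request for /archive/www.XXX.com/ itself. A request for /archive/www.XXX.com/mydir/ does match the pattern but is stopped by the RewriteCond, because appending .html to a directory path does not normally give an existing file.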

%{REQUEST_URI} is /archive/www.XXX.com/mydir/mypage for YYY.com/archive/www.XXX.com/mydir/mypage.

QSA
http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewriterule wrote:
'qsappend|QSA' (query string append)
This flag forces the rewrite engine to append a query string part of the substitution string to the existing string, instead of replacing it. Use this when you want to add more data to the query string via a rewrite rule.
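
To illustrate what QSA changes, here is a small sketch with a made-up rule (item.php and the id parameter are hypothetical and have nothing to do with the archived site). Because the substitution carries its own query string, the original one would be replaced without QSA.

```apache
# Hypothetical example, assuming a script called item.php exists.
# Request:      /item/42?ref=mail
# With QSA:     item.php?id=42&ref=mail   (original query string appended)
# Without QSA:  item.php?id=42            (original query string replaced)
RewriteRule ^item/([0-9]+)$ item.php?id=$1 [QSA,L]
```

In the rule from this thread the substitution has no query string of its own, so the original query string is passed through either way and QSA is harmless there.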


L
http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewriterule wrote:
'last|L' (last rule)
Stop the rewriting process here and don't apply any more rewrite rules. This corresponds to the Perl last command or the break command in C. Use this flag to prevent the currently rewritten URL from being rewritten further by following rules. Remember, however, that if the RewriteRule generates an internal redirect (which frequently occurs when rewriting in a per-directory context), this will reinject the request and will cause processing to be repeated starting from the first RewriteRule.
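
The last sentence of that quote is why rules in .htaccess sometimes appear to run twice. A minimal sketch of one common guard follows. It reuses the REDIRECT_STATUS check from the FAQ snippet quoted in the first post, on the assumption that the variable is empty on the initial request and set once an internal redirect has happened.

```apache
# Sketch only: apply the rule on the initial request, not on the second pass
# that follows an internal redirect.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{SCRIPT_FILENAME}.html -f
RewriteRule . %{REQUEST_URI}.html [L]
```

The rule in this thread should not need the guard: on the second pass the condition looks for mypage.html.html, which does not exist, so the rule does not fire again.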


So if the request is for mypage (the RewriteRule pattern) and a .html file of the same name exists (the RewriteCond), the .html file is served instead (the RewriteRule substitution) and processing stops.
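
Putting the pieces together, here is the same rule again with the walk-through written as comments. The request path is the example used above, and <docroot> is a placeholder for the real document root, which is not given in this thread.

```apache
Options +FollowSymLinks
RewriteEngine On

# Example request: /archive/www.XXX.com/mydir/mypage
#   1. In .htaccess the pattern "." is tested against "mydir/mypage": match.
#   2. The condition tests <docroot>/archive/www.XXX.com/mydir/mypage.html
#      and passes only if that is a regular file.
#   3. The substitution becomes /archive/www.XXX.com/mydir/mypage.html.
#      There is no R flag, so this is an internal rewrite and the visitor's
#      address bar still shows the URL without .html.
RewriteCond %{SCRIPT_FILENAME}.html -f
RewriteRule . %{REQUEST_URI}.html [QSA,L]
```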

Postby AsherJac » Mon Aug 31, 2009 11:59 pm

Thanks again Richard, that's really helpful.

