If someone has the expertise to solve the issue we are having via mod_rewrite, I will gladly pay for your time or consulting charges.
OK, here we go!
We use Joomla, and we have duplicate URLs all pointing to the same page, so we are getting a duplicate-content penalty from the search engines.
From my robots.txt:
Disallow: /content/view/
Allow: /content/view/*/40/
Allow: /content/view/*/49/
Allow: /content/view/*/61/
Allow: /content/view/*/64/
Allow: /content/view/*/63/
Allow: /content/view/*/93/
Allow: /content/view/*/105/
However, search engines are still "finding" the duplicate URLs, and even with the above in robots.txt they still get added. I want to return a 410 error so that the URL will NOT be added in the first place. I cannot find the place where Joomla serves up the wrong/duplicate URLs.
I need to add a mod_rewrite rule that serves a 410 for all of these URLs except one.
All of these point to the same page.
www.domain.com/content/view/88/63/
www.domain.com/content/view/88/1/
www.domain.com/content/view/88/525/
www.domain.com/content/view/88/
Of these,
www.domain.com/content/view/88/63/
is the correct URL.
URLs that end in specific numbers are OK (40, 49, 61, 64, 63, 93, 105). The 88 in the above URL refers to an article ID.
The URL starts like:
/content/view/
Then there are one or two numbers in the URL, each followed by a slash.
If there are two sets of numbers:
/88/525/
and the last number (/525/) is not one of 40, 49, 61, 64, 63, 93, 105, we need to send a 410 error.
If there is only one set of numbers in the URL:
/content/view/88/
We need to send a 410 error. However, we cannot block the whole directory, because this is valid:
/content/view/88/63/
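
To make the rule concrete, here is the decision I am after, sketched in Python rather than mod_rewrite. This is only an illustration of the logic described above, not a working rewrite rule. The function name should_send_410 is made up for this example, and I am assuming that URLs outside /content/view/ are left alone and that a missing trailing slash is treated the same as a present one.

```python
import re

# Last numbers that are allowed, taken from the list above.
ALLOWED_LAST_NUMBERS = {"40", "49", "61", "64", "63", "93", "105"}

# /content/view/<article id>/ with an optional second number after it.
# Assumption: the trailing slash is optional.
VIEW_PATTERN = re.compile(r"^/content/view/(\d+)(?:/(\d+))?/?$")


def should_send_410(path):
    """Return True if the URL path should get a 410 under the rules above."""
    if not path.startswith("/content/view/"):
        # Assumption: anything outside /content/view/ is not touched.
        return False
    match = VIEW_PATTERN.match(path)
    if match and match.group(2) in ALLOWED_LAST_NUMBERS:
        # Two numbers, and the last one is on the allowed list.
        return False
    # Only one number, a disallowed last number, or anything else
    # under /content/view/.
    return True


if __name__ == "__main__":
    for path in (
        "/content/view/88/63/",
        "/content/view/88/1/",
        "/content/view/88/525/",
        "/content/view/88/",
    ):
        print(path, "-> 410" if should_send_410(path) else "-> serve normally")
```

Of the four example URLs, only /content/view/88/63/ should be served normally; the other three should get the 410.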