
All times are UTC - 7 hours





PostPosted: Sun Jun 06, 2010 6:16 pm 

Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
I've noticed that Google searches on the Wiki using site:wiki.nesdev.com often act dumb. For example, when I search for MMC3, none of the 21 hits is the main MMC3 page, even though it has MMC3 in the title. Using intitle:MMC3 gives zero hits (even if I add MMC3 again as a normal search string).

Is there something telling Google to skip documents? A week or two ago, when I was searching for things on nesdevwiki, Google still returned lots of hits pointing to the now-nonexistent site. Maybe that has something to do with it, like Google thinks the new site is a spam mirror or something, I dunno. I see no robots.txt, so a bad entry in one can't be the cause.


PostPosted: Sun Jun 06, 2010 6:32 pm 

Joined: Tue Jun 24, 2008 8:38 pm
Posts: 1517
Location: Fukuoka, Japan
I don't know enough about MediaWiki to give an answer, but could it be related to the recent essay spam links we received? Those must be a common kind of spam link, so they could affect how Google treats the site.

Maybe there is a way to check with Google how nesdev is affected by this, but I don't know about that either. I'll see if I can take a look.


PostPosted: Mon Jun 07, 2010 4:52 am 

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19084
Location: NE Indiana, USA (NTSC)
The robots.txt is a 404, and there don't appear to be any robots directives in meta elements. Nor are internal links using nofollow.
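
For anyone who wants to repeat the page-level part of that check, here is a minimal sketch in Python. It only inspects HTML that has already been fetched; the names RobotsHintParser and find_robots_hints are made up for this example and are not part of MediaWiki or any crawler tool.

```python
from html.parser import HTMLParser


class RobotsHintParser(HTMLParser):
    """Collects robots meta directives and nofollow links from one HTML page.

    Hypothetical helper for illustration; not part of any existing tool.
    """

    def __init__(self):
        super().__init__()
        self.meta_robots = []     # content values of <meta name="robots">
        self.nofollow_links = []  # href values of <a rel="... nofollow ...">

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() == "robots":
            self.meta_robots.append(attr.get("content", "").lower())
        elif tag == "a" and "nofollow" in attr.get("rel", "").lower().split():
            self.nofollow_links.append(attr.get("href", ""))


def find_robots_hints(html):
    """Return (meta_robots, nofollow_links) found in the given HTML text."""
    parser = RobotsHintParser()
    parser.feed(html)
    return parser.meta_robots, parser.nofollow_links
```

Two empty lists for a page mean it carries no robots meta directive and no nofollow links, which matches what was observed above. It says nothing about robots.txt or HTTP headers, which have to be checked separately.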


PostPosted: Mon Jun 07, 2010 6:55 pm 

Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
I just noticed that Google is also getting hits within the skins/ directory on the Wiki. I'm thinking that should have an empty index.html in it. Otherwise one gets useless hits, even when restricting via a site: as mentioned in an earlier message.


PostPosted: Tue Jun 08, 2010 2:06 am 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
blargg wrote:
I just noticed that Google is also getting hits within the skins/ directory on the Wiki. I'm thinking that should have an empty index.html in it. Otherwise one gets useless hits, even when restricting via a site: as mentioned in an earlier message.

This is because of the secure MPM we use for Apache. A more appropriate fix would be to place an .htaccess file in /home/ndwiki/www/w which contains:

Code:
Options -Indexes

...which disables automatically generated directory listings for any directory therein that lacks an index.php/index.html/etc. document. The end result is the web client receiving an HTTP 403 Forbidden.

I've put said .htaccess in place; verified as working. It may take a few weeks before Google picks up the changes, as their crawler sometimes takes a while to notice such things.
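
If someone wants to re-check this later without eyeballing the page, a rough sketch of the decision in Python follows. It assumes that Apache's generated listings carry a title starting with "Index of /", which is the mod_autoindex default; the function name classify_directory_response is made up for this example.

```python
import re

# Assumption: mod_autoindex titles its generated pages "Index of /path".
_LISTING_TITLE = re.compile(r"<title>\s*Index of\s+/", re.IGNORECASE)


def classify_directory_response(status, body):
    """Return 'listing', 'forbidden', or 'other' for a directory URL response.

    status is the HTTP status code, body is the response text.
    """
    if status == 200 and _LISTING_TITLE.search(body):
        return "listing"    # auto-generated index is still exposed
    if status == 403:
        return "forbidden"  # what Options -Indexes should produce
    return "other"          # e.g. a real index document or a 404
```

The status and body can come from any HTTP client. With urllib.request.urlopen, a 403 is raised as urllib.error.HTTPError, and the status is available as the exception's code attribute.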

