It is currently Mon Aug 20, 2018 5:51 pm

All times are UTC - 7 hours





 Post subject: SIDs in URLs?
Posted: Sat May 19, 2018 1:49 pm
Joined: Fri Nov 19, 2004 7:35 pm
Posts: 4068
A reality of the internet is that bots crawl message boards, but phpBB session IDs (SIDs) in URLs interfere with their ability to crawl the board. Is there any way to get phpBB to stop adding them?

_________________
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!


 Post subject: Re: SIDs in URLs?
Posted: Sat May 19, 2018 4:47 pm
Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3488
Location: Mountain View, CA
Not sure if this is still the case (it probably is), but phpBB used to add sid=XXX as a URL query parameter only if using cookies failed. In other words, it prefers cookies but falls back to a session ID in the URL when it can't use them. There are several posts on the phpBB support forum describing this mechanism.

If some random spider/bot is picking up sid=XXX in URLs, it's because the bot isn't accepting or sending back cookies.
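
To make the fallback concrete, here is a rough Python sketch of the decision described above. It is not phpBB's actual code (phpBB is written in PHP), and the function name `add_sid_if_needed`, the `client_sent_cookie` flag, and the example host are all made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def add_sid_if_needed(url: str, sid: str, client_sent_cookie: bool) -> str:
    """Hypothetical helper: leave the URL alone when the client returned the
    session cookie, otherwise carry the session ID in the query string."""
    if client_sent_cookie:
        # Cookies work, so the session travels in the cookie and the URL stays clean.
        return url
    parts = urlsplit(url)
    # Drop any stale sid parameter before appending the current one.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "sid"]
    query.append(("sid", sid))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    page = "https://forum.example/viewtopic.php?f=2&t=17"
    print(add_sid_if_needed(page, "abc123", client_sent_cookie=True))
    print(add_sid_if_needed(page, "abc123", client_sent_cookie=False))
```

A crawler that never sends cookies back always takes the second branch, which is why it sees a sid on every link.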

And yes, this is one of many problems when it comes to bots crawling phpBB forums. Another is that they often get stuck in an infinite loop downloading everything. Generally speaking, blocking bots from phpBB through robots.txt is the more common approach.
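
For what the robots.txt approach looks like, here is a small sketch that checks a sample rule set with Python's standard `urllib.robotparser`. The `/forum/` path, the host name, and the bot name are assumptions; substitute wherever the board actually lives. Keep in mind that robots.txt only affects bots that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Sample rules (assumed layout): keep every crawler out of the board directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules permit this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("SomeBot", "https://forum.example/forum/viewtopic.php?t=17"))
    print(is_allowed("SomeBot", "https://forum.example/index.html"))
```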


 Post subject: Re: SIDs in URLs?
Posted: Sat May 19, 2018 6:10 pm
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20428
Location: NE Indiana, USA (NTSC)
And for those bots deemed too useful to exclude, such as Google, Bing, Internet Archive, and whatever feeds into DuckDuckGo, try these in no particular order:

1. Make sure the board software is issuing a proper absolute URL in <link rel="canonical">.
2. Hardcode their user agents into the board software as not eligible to begin a session.
3. Try turning off session.use_trans_sid.
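
Items 1 and 2 can be sketched roughly as follows, in Python for illustration only since the real change would go into the board's PHP. The helper names, the example host, and the list of user-agent substrings are assumptions; check each crawler's own documentation for the tokens it currently sends.

```python
import html
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed user-agent substrings for the crawlers named above (compared in lowercase).
CRAWLER_TOKENS = ("googlebot", "bingbot", "archive.org_bot", "duckduckbot")


def is_known_crawler(user_agent: str) -> bool:
    """Item 2: decide whether a request should be served without starting a session."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def canonical_url(url: str) -> str:
    """Item 1: the same absolute URL with the sid parameter and fragment removed."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "sid"]
    return urlunsplit(parts._replace(query=urlencode(query), fragment=""))


def canonical_link_tag(url: str) -> str:
    """Render the <link rel="canonical"> element for a page's <head>."""
    return '<link rel="canonical" href="%s">' % html.escape(canonical_url(url), quote=True)


if __name__ == "__main__":
    print(canonical_link_tag("https://forum.example/viewtopic.php?f=2&t=17&sid=abc123"))
    print(is_known_crawler("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
```

Item 3 is a PHP ini setting rather than code, so it is not covered by the sketch.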

