All times are UTC - 7 hours





Post new topic Reply to topic  [ 21 posts ]  Go to page 1, 2  Next
Author Message
Posted: Tue Oct 16, 2012 8:31 am

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10066
Location: Rio de Janeiro - Brazil
Why are certain common words ignored in searches? Many times since the migration of the forums I have had trouble searching for threads/posts just because certain words are filtered out. Even though the words are common, they can be combined with other words (common or not) to form not-so-common combinations. For example, I can't search for anything related to sprites (sprite cycling, sprite corruption, sprite animation, etc.) because the word "sprite" is ignored.

Because of this, I often have to try hard to remember other things (not necessarily related to the information I'm after) that might have been mentioned in a thread I'm looking for, just so I can find it. I can only do this because I was around when the thread was made and have read it before, but what about new users? They can't do that; they need a search mechanism that doesn't ignore words that can form unique combinations with other words.

Can we reconsider some of these filtered words?


Posted: Tue Oct 16, 2012 8:55 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19115
Location: NE Indiana, USA (NTSC)
Does a Google search for "sprite site:forums.nesdev.com" also fail?


Posted: Tue Oct 16, 2012 9:33 am

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10066
Location: Rio de Janeiro - Brazil
I haven't tried, but it's kinda lame that we need an external tool for searches when our own software is able to do it. This solution is also very counter-intuitive.


Posted: Tue Oct 16, 2012 11:29 am

Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
Oh, the irony. I search for "hit" and I get 1688 matches. I search for "sprite hit" (with "search for all terms" checked) and get the same number of matches. The system's explanation is that it helpfully omitted "sprite" because it's such a common word and would therefore cause excessive matches. Except that, as you say, this is a narrowing search, so including the word would reduce the number of matches. Completely backwards.
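
The narrowing argument can be checked on a toy example. This is only a sketch in Python, assuming a simple inverted index (word to set of post IDs); the sample posts, the `search_all_terms` helper, and the stopword set are made up for illustration and are not phpBB's actual implementation.

```python
# Toy inverted index: word -> set of post IDs containing it.
# The posts below are invented for illustration only.
posts = {
    1: "sprite zero hit timing",
    2: "hit detection between two objects",
    3: "sprite cycling to reduce flicker",
    4: "sprite hit flag polling loop",
}

index = {}
for post_id, text in posts.items():
    for word in set(text.split()):
        index.setdefault(word, set()).add(post_id)


def search_all_terms(terms, stopwords=frozenset()):
    """Return IDs of posts containing every term that is not a stopword."""
    kept = [t for t in terms if t not in stopwords]
    if not kept:
        return set()
    result = set(index.get(kept[0], set()))
    for term in kept[1:]:
        result &= index.get(term, set())  # each extra term can only shrink the set
    return result


hit_only = search_all_terms(["hit"])
with_stopword = search_all_terms(["sprite", "hit"], stopwords={"sprite"})
without_stopword = search_all_terms(["sprite", "hit"])

print(sorted(hit_only))          # [1, 2, 4]
print(sorted(with_stopword))     # [1, 2, 4]  (same as searching "hit" alone)
print(sorted(without_stopword))  # [1, 4]     (the narrower result that was wanted)
```

With the common word dropped, the two-term query degenerates into the one-term query, which matches the identical match counts described above.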


Posted: Tue Oct 16, 2012 12:05 pm

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10066
Location: Rio de Janeiro - Brazil
Exactly. And your example is perfect: it's practically impossible to search for info on sprite hits.


Posted: Tue Oct 16, 2012 4:02 pm

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
I'm not sure what can be done about this, honestly. Expecting a piece of forum software to have the same logic and "smarts" as a major search engine is unreasonable, though I do understand where you're coming from (re: usability).

We're limited to what you see in the screenshot below. There's absolutely nothing else we can adjust or change. That's just how phpBB is.


Attachment: search.png [ 110.22 KiB ]
Posted: Tue Oct 16, 2012 4:07 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19115
Location: NE Indiana, USA (NTSC)
Near the bottom is a "Common word threshold". Perhaps a certain soft drink made by Coca-Cola is in the top 5 percent, and the value needs to be lowered and the index rebuilt.


Posted: Tue Oct 16, 2012 4:11 pm

Joined: Mon Jan 03, 2005 10:36 am
Posts: 2963
Location: Tampere, Finland
tepples wrote:
Near the bottom is a "Common word threshold". Perhaps a certain soft drink made by Coca-Cola is in the top 5 percent, and the value needs to be lowered and the index rebuilt.

Or set the value to 0 (which may or may not require rebuilding the index), and see if there's really a performance hit big enough to warrant changing it back to a value greater than 0.

_________________
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: kkfos.aspekt.fi


Posted: Tue Oct 16, 2012 6:13 pm

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
Tepples et al. are welcome to change this if they want. I have no idea what the repercussions would be performance-wise or database-wise (e.g. craploads of rows resulting in a gigantic table, SELECT queries taking a long time (no idea if they use INDEXes), etc.).
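
Whether a lookup uses an index is something the database can report directly. The following is a rough sketch using Python's built-in `sqlite3` module as a stand-in; the forum's real database, schema, and table names are not known from this thread, so the `wordmatch` table and `idx_wordmatch_word` index here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical word-match table: one row per (word, post) pair.
conn.execute("CREATE TABLE wordmatch (word_id INTEGER, post_id INTEGER)")
conn.executemany(
    "INSERT INTO wordmatch VALUES (?, ?)",
    [(word_id, post_id) for post_id in range(1000) for word_id in range(20)],
)

query = "SELECT post_id FROM wordmatch WHERE word_id = ?"


def plan(sql):
    """Return SQLite's query plan for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (7,)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_without_index = plan(query)  # full table scan expected
conn.execute("CREATE INDEX idx_wordmatch_word ON wordmatch (word_id)")
plan_with_index = plan(query)     # lookup through the new index expected

print(plan_without_index)
print(plan_with_index)
```

The exact plan wording differs between SQLite versions, and MySQL has its own `EXPLAIN` output, so treat this only as a way to see the difference between a scan and an indexed lookup.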


Posted: Tue Oct 16, 2012 6:59 pm

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10066
Location: Rio de Janeiro - Brazil
Ah, so the common words are automatically detected by the software... This is kinda stupid; some of these common words are crucial to making searches meaningful.


Posted: Tue Oct 16, 2012 9:48 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5732
Location: Canada
sprite sprite sprite sprite sprite!

Now you'll never find them! MUA HA HA HA HA HA! :twisted:


Posted: Wed Oct 17, 2012 8:08 am

Joined: Wed Nov 24, 2010 12:51 am
Posts: 44
Location: Finland
Quote:
No posts were found because the word ciclone is not contained in any post.
No posts were found because the word 10nes is not contained in any post.
No posts were found because the word 3195 is not contained in any post.

This can't be right either.


Posted: Wed Oct 17, 2012 8:25 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19115
Location: NE Indiana, USA (NTSC)
I cleared the index, set the stopword threshold to 1%, and rebuilt the index. It may take a few hours, or it may have been killed by a max_execution_time restriction on the web server. WhoaMan might be the best one to troubleshoot this.


Posted: Wed Oct 17, 2012 9:10 am

Joined: Mon Jan 03, 2005 10:36 am
Posts: 2963
Location: Tampere, Finland
tepples wrote:
I cleared the index, set the stopword threshold to 1%, and rebuilt the index. It may take a few hours, or it may have been killed by a max_execution_time restriction on the web server. WhoaMan might be the best one to troubleshoot this.

Hmm, shouldn't it actually be set to a higher value? If I understand the wording right, the meaning is "if the word is contained in over 5% of all posts, it will be regarded as common (and ignored)", so 1% would actually regard more words as common.


Posted: Wed Oct 17, 2012 9:46 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19115
Location: NE Indiana, USA (NTSC)
My bad, I thought it was the top 5 percent of words, not words in over 5 percent of posts. I was confused by "Set to zero to disable". I'll rebuild the table again.
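
The "words in over N percent of posts" reading can be sketched as follows. This is an illustration in Python of the setting as it is worded in this thread, not phpBB's actual code; the `common_words` function and the sample posts are assumptions made up for the example.

```python
def common_words(posts, threshold_percent):
    """Words contained in more than threshold_percent of posts; 0 disables the check."""
    if threshold_percent == 0:
        return set()
    doc_freq = {}
    for text in posts:
        for word in set(text.lower().split()):  # count each word once per post
            doc_freq[word] = doc_freq.get(word, 0) + 1
    cutoff = len(posts) * threshold_percent / 100
    return {word for word, count in doc_freq.items() if count > cutoff}


# 100 invented posts: "sprite" is in 12, "hit" in 8, "cycling" in 4,
# "mapper" and "irq" in 2 each, and every filler word in exactly 1.
posts = (
    ["sprite hit"] * 8
    + ["sprite cycling"] * 4
    + ["mapper irq"] * 2
    + [f"filler{i}" for i in range(86)]
)

print(sorted(common_words(posts, 5)))  # ['hit', 'sprite']
print(sorted(common_words(posts, 1)))  # ['cycling', 'hit', 'irq', 'mapper', 'sprite']
print(sorted(common_words(posts, 0)))  # []
```

Under this reading, lowering the threshold from 5 to 1 makes more words count as common, while raising it (or setting it to 0) lets words like "sprite" back into searches.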

