Search feature for more than 1 word

Found an issue with the phpBB system here at NESdev? Use this forum to report problems.

Bregalad
Posts: 8056
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Search feature for more than 1 word

Post by Bregalad »

It's really annoying. If I search for "foo bar", search gives me posts that contain either "foo" or "bar", but I want to look up posts that contain both "foo" and "bar". It does that no matter which position the radio button is in.
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Re: Search feature for more than 1 word

Post by koitsu »

The default model for the search box is AND, with the two terms space-delimited. An example would be to enter into the textbox the literal string graphics snake. The results you'll get back are individual posts that contain the word graphics and also contain the word snake. Proof:

search.php?keywords=graphics+snake&terms=all

The | (pipe) operator is used to change the logic from AND to OR; for example graphics | snake would look for posts that contain the word graphics or the word snake and return all those results. I believe this is the same thing as choosing the "Search for any terms" radio button.

Use of double quotes (ex. "graphics snake") does not cause the search function to look for the string graphics snake as a single phrase. phpBB does not support phrase-searching like this.
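The AND/OR behaviour described above can be sketched in a few lines of Python. This is only an illustration of word-based matching, not phpBB's actual code: the `tokenize` and `search` functions and the sample posts are hypothetical, and the `terms` argument merely mirrors the `terms=all` parameter in the URL above (with `"any"` assumed here as the OR counterpart).

```python
import re

def tokenize(text):
    """Lowercase the text and return the set of words it contains."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def search(posts, keywords, terms="all"):
    """Return ids of posts matching the space-delimited keywords.

    terms="all" requires every word (AND);
    terms="any" requires at least one word (OR).
    """
    wanted = tokenize(keywords)
    if not wanted:
        return []
    combine = all if terms == "all" else any
    return [post_id for post_id, body in posts.items()
            if combine(word in tokenize(body) for word in wanted)]

# Hypothetical sample data for illustration only.
posts = {
    1: "How do I draw graphics for a snake game?",
    2: "Graphics glitches on real hardware",
    3: "My snake clone crashes on reset",
}

and_hits = search(posts, "graphics snake", terms="all")
or_hits = search(posts, "graphics snake", terms="any")
quoted_hits = search(posts, '"graphics snake"', terms="all")

print(and_hits)     # [1]
print(or_hits)      # [1, 2, 3]
print(quoted_hits)  # [1]
```

In this sketch the tokenizer throws the double quotes away, so the quoted query behaves exactly like the unquoted two-word query. A search that only tracks which words occur in which posts has no word-order information to match a phrase against.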

There are also numerous settings/adjustments in phpBB itself. For example, when the search results are built (this happens behind the scenes), a minimum and a maximum number of characters per word limit the results; those are set to 3 and 14 right now, respectively. So if you're looking for a short word such as ok, you probably won't get accurate results. There is also a "common word threshold" percentage setting that causes words meeting certain criteria to be ignored in queries; we have this set to 20% right now. For example, the word if would probably match that condition, since it's an incredibly common word used in posts.
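As a rough sketch of how such limits could drop terms from a query, the Python below applies the three values quoted above (3, 14 and 20%). It assumes that "common" means "appears in more than 20% of all posts"; that reading, the `usable_terms` helper and the sample posts are assumptions made for illustration, not phpBB's implementation.

```python
import re
from collections import Counter

MIN_CHARS = 3            # minimum word length quoted in the post
MAX_CHARS = 14           # maximum word length quoted in the post
COMMON_THRESHOLD = 0.20  # "common word threshold" quoted in the post

def words(text):
    """Lowercase the text and return its words in order."""
    return re.findall(r"[a-z0-9]+", text.lower())

def usable_terms(query, posts):
    """Split the query's words into (kept, dropped) lists."""
    # Count in how many posts each word appears at least once.
    post_count = Counter()
    for body in posts:
        post_count.update(set(words(body)))
    total = max(len(posts), 1)  # avoid dividing by zero on an empty forum

    kept, dropped = [], []
    for term in words(query):
        bad_length = not (MIN_CHARS <= len(term) <= MAX_CHARS)
        # Assumption: a word is "common" if it occurs in more than 20% of posts.
        too_common = post_count[term] / total > COMMON_THRESHOLD
        (dropped if bad_length or too_common else kept).append(term)
    return kept, dropped

# Hypothetical sample data for illustration only.
posts = [
    "the snake graphics look ok",
    "the mapper works",
    "the sprite palette",
    "the scroll register",
    "reading the controller",
    "sound engine timing",
]

kept, dropped = usable_terms("the snake ok", posts)
print(kept)     # ['snake']
print(dropped)  # ['the', 'ok']
```

Here "ok" is rejected for being shorter than 3 characters and "the" for appearing in 5 of the 6 sample posts, so only "snake" is left to search with.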

Re: Search feature for more than 1 word

Post by Bregalad »

koitsu wrote:
phpBB does not support phrase-searching like this.
Oh, that sucks, because that's exactly what I'm looking for. This is weird, especially considering it's simpler to search for 1 phrase than 2 words (I mean, computationally).

PS: It's also weird that a search for "snake graphics" doesn't show your post, which contains both terms (and now this very post should be included in the list too, because of this very sentence).

Re: Search feature for more than 1 word

Post by koitsu »

Bregalad wrote:
phpBB does not support phrase-searching like this.
Oh, that sucks, because that's exactly what I'm looking for. This is weird, especially considering it's simpler to search for 1 phrase than 2 words (I mean, computationally).
I'd urge you to join the phpBB forums and complain, then. The general response given by the community there is "use Google" (I see the legitimacy of this but also roll my eyes). The colloquial term you're looking for, BTW, is "phrase search" or "phrase match". If you find an actively maintained phpBB mod(ule) that does what you want and works with our version of phpBB, then I'd be happy to install it and try it. What I found, though, is pretty abysmal (meaning I think this kind of functionality has to be added to the phpBB core itself, and not as an add-on module).
Bregalad wrote: PS: It's also weird that a search for "snake graphics" doesn't show your post, which contains both terms (and now this very post should be included in the list too, because of this very sentence).
It will eventually. The search index does not get rebuilt every single time someone posts something; search results are cached server-side (part of phpBB). The cache expires every 1800 seconds. Try again in 30-60 minutes and you'll get the result you expect. The caching is done to keep the server load down; there are literally 113108 posts, and the current number of indexed words is at 103264. I think the last time the index got rebuilt (induced by tepples) it took quite some time (hours). Large forums with lots of posts over many many years do not scale well.
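To illustrate why a brand-new post can be missing from results for a while, here is a minimal Python sketch of a result cache with the 1800-second expiry mentioned above. It models only the caching of results, not the search index itself; the `SearchCache` class, the `run_query` function and the injected fake clock are hypothetical and do not reflect how phpBB implements its cache.

```python
import time

CACHE_TTL = 1800  # seconds, the expiry quoted in the post

class SearchCache:
    """Remember the results of each query until the entry is CACHE_TTL old."""

    def __init__(self, run_query, ttl=CACHE_TTL, clock=time.monotonic):
        self._run_query = run_query
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # keywords -> (time stored, results)

    def search(self, keywords):
        now = self._clock()
        entry = self._entries.get(keywords)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: the real search is not run again
        results = self._run_query(keywords)
        self._entries[keywords] = (now, results)
        return results

# Hypothetical sample data and a fake clock so the demo needs no waiting.
posts = ["graphics and snake demo"]
fake_now = [0.0]

def run_query(keywords):
    """Return indexes of posts containing every keyword (AND)."""
    wanted = keywords.split()
    return [i for i, body in enumerate(posts)
            if all(word in body.split() for word in wanted)]

cache = SearchCache(run_query, clock=lambda: fake_now[0])

first = cache.search("snake graphics")   # runs the query and caches it
posts.append("more snake graphics")      # a new matching post arrives
fake_now[0] = 60.0
stale = cache.search("snake graphics")   # served from cache, new post missing
fake_now[0] = 1800.0
fresh = cache.search("snake graphics")   # entry expired, query runs again

print(first)  # [0]
print(stale)  # [0]
print(fresh)  # [0, 1]
```

The trade-off is the one described above: repeated queries cost the server almost nothing, at the price of results that can lag behind by up to the expiry time.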