Wordgraph - how to exclude words

Posted: 7. February 2010 20:26
by urednik
Your Portal Version: 1.0.5
Your phpBB Type: Standard phpBB3
MODs installed: Yes
Your knowledge: Basic Knowledge
Boardlink: http://www.borovnica.eu

What have you done before the problem was there?
Nothing

What have you already tried to solve the problem?
Edit "search_ignore_words.php"

Description and Message
Hello guys,

My Wordgraph shows the most common words, but those words are so common that the search engine won't accept them.
I would like to exclude words such as "we", "a", "they" and others, so that Wordgraph shows the really important common words, such as "borovnica".

The PHP search engine has a list of excluded words, which would be a good starting point for excluding these words.

Is there a similar list for Wordgraph?
Any other solution?

Re: Wordgraph - how to exclude words

Posted: 24. June 2010 13:45
by NR43
Hi urednik,

Did you by any chance manage to create a filter for wordgraph?
If so, could you please share the code?
It would be awesome, because lots of 3-letter words are indeed too common for wordgraph to be practical.

Re: Wordgraph - how to exclude words

Posted: 24. June 2010 14:02
by urednik
I just increased the minimum word length, so that words with fewer than 5 letters are excluded.

Re: Wordgraph - how to exclude words

Posted: 25. June 2010 08:57
by NR43
Ah ok thanks.
Will look into this tonight

Re: Wordgraph - how to exclude words

Posted: 28. June 2010 16:53
by Nekstati
To exclude common words (see ACP → Search settings → Common word threshold):
find in portal/block/wordgraph.php

Code:

    FROM ' . SEARCH_WORDLIST_TABLE . '
add after

Code:

		WHERE word_common <> 1
To exclude common words and words contained in search_ignore_words.php:
find in portal/block/wordgraph.php

Code:

$sql = 'SELECT word_text, word_count, word_id
    FROM ' . SEARCH_WORDLIST_TABLE . '
    GROUP BY word_id, word_text 
    ORDER BY word_count DESC'; 
replace with

Code:

include($phpbb_root_path . 'language/' . $config['default_lang'] . '/search_ignore_words.' . $phpEx);
$sql = "SELECT word_text, word_count, word_id
	FROM " . SEARCH_WORDLIST_TABLE . "
	WHERE word_common <> 1
		AND word_text NOT IN ('" . implode('\', \'', $words) . "')	
	GROUP BY word_id, word_text 
	ORDER BY word_count DESC";
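
The replacement query above places the ignore words directly into the SQL string, so a word containing an apostrophe would break the query. Below is a minimal sketch of building the same NOT IN list with each word escaped. build_not_in_list() is a hypothetical helper name, and doubling single quotes is a simplification for illustration (it does not handle backslashes on MySQL). Inside phpBB3 the DBAL call $db->sql_in_set('word_text', $words, true) would probably be the more natural choice.

```php
<?php
// Hypothetical helper (not part of Board3 Portal or phpBB):
// builds "column NOT IN ('a', 'b')" with single quotes escaped.
function build_not_in_list($column, array $words)
{
    if (empty($words)) {
        // Nothing to exclude: return a condition that is always true.
        return '1 = 1';
    }

    $quoted = array();
    foreach ($words as $word) {
        // Double every single quote (standard SQL escaping, simplified).
        $quoted[] = "'" . str_replace("'", "''", $word) . "'";
    }

    return $column . ' NOT IN (' . implode(', ', $quoted) . ')';
}

// Example list in the shape that search_ignore_words.php provides ($words).
$words = array('we', 'a', "don't");
echo build_not_in_list('word_text', $words), "\n";
```

The returned string could then take the place of the hand-built "word_text NOT IN (...)" part of the WHERE clause.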

Re: Wordgraph - how to exclude words

Posted: 4. July 2010 09:15
by urednik
THNX !!!

It works!

Maybe this would also be good to include in the next version of Board 3 Portal ...

THNX again !!

Re: Wordgraph - how to exclude words

Posted: 5. July 2010 19:30
by NR43
Thank you Nekstati!

Re: Wordgraph - how to exclude words

Posted: 14. July 2010 19:09
by Treverer
urednik wrote: I just increased the minimum word length, so that words with fewer than 5 letters are excluded.
First: hi all :)

Your advice did not work in my wordgraph. Perhaps the word list was not updated, but I tried a lot (not everything ;) )

So use this solution, without Nekstati's exclude list at first:

Find

Code:

FROM ' . SEARCH_WORDLIST_TABLE . '
add after, on a new line:

Code:

 WHERE LENGTH(word_text)>'.(int) $config['fulltext_native_min_chars'].'
Now your ACP preferences take effect immediately.
You need an SQL database that knows the SQL function LENGTH; MySQL does ;)

Now I will work on a solution that integrates Nekstati's exclusion word list...
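
For a database without a LENGTH function, the same filter could be applied in PHP after the rows are fetched. Below is a minimal sketch, assuming rows shaped like the wordgraph query result (word_text, word_count). filter_short_words() is a hypothetical helper, and $min_chars stands in for $config['fulltext_native_min_chars']. It keeps the strict > comparison from the WHERE clause above, so a word of exactly $min_chars letters is dropped as well.

```php
<?php
// Hypothetical helper (not part of Board3 Portal or phpBB):
// keeps only rows whose word is longer than $min_chars.
function filter_short_words(array $rows, $min_chars)
{
    $kept = array();
    foreach ($rows as $row) {
        // strlen() counts bytes, which matches MySQL's LENGTH().
        if (strlen($row['word_text']) > (int) $min_chars) {
            $kept[] = $row;
        }
    }
    return $kept;
}

// Made-up sample rows, for illustration only.
$rows = array(
    array('word_text' => 'the', 'word_count' => 120),
    array('word_text' => 'borovnica', 'word_count' => 45),
);
$min_chars = 3; // stand-in for $config['fulltext_native_min_chars']

foreach (filter_short_words($rows, $min_chars) as $row) {
    echo $row['word_text'], ' (', $row['word_count'], ")\n";
}
```

Filtering in PHP means the short words are still fetched from the database, so the SQL condition is likely the better choice wherever LENGTH is available.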

Re: Wordgraph - how to exclude words

Posted: 14. July 2010 19:41
by Treverer
Nekstati wrote: To exclude common words (see ACP → Search settings → Common word threshold):
find in portal/block/wordgraph.php

Code:

    FROM ' . SEARCH_WORDLIST_TABLE . '
add after

Code:

		WHERE word_common <> 1
To exclude common words and words contained in search_ignore_words.php:
find in portal/block/wordgraph.php

Code:

$sql = 'SELECT word_text, word_count, word_id
    FROM ' . SEARCH_WORDLIST_TABLE . '
    GROUP BY word_id, word_text 
    ORDER BY word_count DESC'; 
replace with

Code:

include($phpbb_root_path . 'language/' . $config['default_lang'] . '/search_ignore_words.' . $phpEx);
$sql = "SELECT word_text, word_count, word_id
	FROM " . SEARCH_WORDLIST_TABLE . "
	WHERE word_common <> 1
		AND word_text NOT IN ('" . implode('\', \'', $words) . "')	
	GROUP BY word_id, word_text 
	ORDER BY word_count DESC";
So, here is my solution combined with yours:

It's only the one extra entry "AND LENGTH(word_text)>".(int) $config['fulltext_native_min_chars'].", but I think it is better now...

Code:

$sql = "SELECT word_text, word_count, word_id
   FROM " . SEARCH_WORDLIST_TABLE . "
   WHERE word_common <> 1
   AND word_text NOT IN ('" . implode('\', \'', $words) . "')   
   AND LENGTH(word_text)>".(int) $config['fulltext_native_min_chars']."
   GROUP BY word_id, word_text
   ORDER BY word_count DESC";
A question: is it only possible to edit the word exclusion file with an editor? I did not find anything in the ACP, but the ACP is large :twisted:

Re: Wordgraph - how to exclude words

Posted: 12. October 2010 22:27
by ray
Hm, the wordgraph code has obviously changed in 1.0.6. Do you know which lines to change there?