[ TOC ]
This file contains a list of bugs reported or known in Swish-e. If you find a bug listed here you do not need to report it as a bug. But feel free to bug the developers about it on the Swish-e discussion list.
Wild card searching needs to be optimized.
Here's a three letter search:
    $ swish-e -w 'tra*' -m1
    # Number of hits: 99952
    # Search time: 5.424 seconds
Two letters:
    $ swish-e -w 'tr*' -m1
    # Number of hits: 100000
    # Search time: 10.563 seconds
Single letter search:
    $ swish-e -w 't*' -m1
    # Number of hits: 100000
    # Search time: 510.939 seconds
The single letter search also used about 280MB of RAM.
This is a potential DoS attack vector. If you have a large index, you may wish to filter out single character wild card searches before they reach swish-e.
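As a minimal sketch of the kind of filtering suggested above, a front end could reject queries whose wild card terms have too short a prefix before running swish-e. The function name `has_short_wildcard` and the `MIN_PREFIX` threshold are assumptions made for illustration; they are not part of swish-e.

```python
import re

# Assumption: a wild card term is "too broad" when fewer than MIN_PREFIX
# characters precede the '*'. A value of 2 rejects single character wild cards.
MIN_PREFIX = 2

# A run of word characters (possibly empty) followed by '*'.
_WILDCARD_TERM = re.compile(r"(\w*)\*")


def has_short_wildcard(query, min_prefix=MIN_PREFIX):
    """Return True if any wild card term in query has a prefix shorter than min_prefix."""
    return any(
        len(match.group(1)) < min_prefix
        for match in _WILDCARD_TERM.finditer(query)
    )


if __name__ == "__main__":
    for q in ("t*", "tr*", "tra*", "foo AND t*"):
        verdict = "reject" if has_short_wildcard(q) else "allow"
        print(q, "->", verdict)
```

A search script would call `has_short_wildcard()` on the user's query and return an error instead of passing the query to `swish-e -w`. Raising `min_prefix` to 3 would also block the two letter case shown above.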
The XML parser (Expat) returns UTF-8 data to swish-e. Therefore, the XML parser should only be used for parsing US-ASCII encoded text.
The XML2 and HTML2 parsers (Libxml2) convert characters from UTF-8 to ISO-8859-1 before indexing and writing properties. Indexing data that contains characters outside ISO-8859-1 may result in invalid character mappings.
These issues will be resolved soon.
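To see which characters in a document would be affected by the encoding limits described above, the text can be checked before indexing. The sketch below uses only Python's built-in codecs; it does not call swish-e, Expat, or Libxml2, and the helper name `unmappable_chars` is an assumption made for illustration.

```python
def unmappable_chars(text, encoding="iso-8859-1"):
    """Return the distinct characters in text that the given encoding cannot represent."""
    found = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in found:
                found.append(ch)
    return found


if __name__ == "__main__":
    sample = "price: 10\u20ac at the caf\u00e9"  # contains a euro sign and an e-acute
    # Characters that cannot be stored as ISO-8859-1 (relevant to XML2/HTML2).
    print(unmappable_chars(sample))
    # Characters that are not US-ASCII (relevant to the Expat-based XML parser).
    print(unmappable_chars(sample, encoding="ascii"))
```

An empty list for `iso-8859-1` suggests the text is representable in that encoding; an empty list for `ascii` suggests it is plain US-ASCII.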
Phrase search fails with DoubleMetaphone
DoubleMetaphone searching can produce two search words for a single query word. The words are expanded to (word1 OR word2), but that expansion fails inside a phrase query: ``some phrase (word1 OR word2) here''
The swish-e query parser is due for a rewrite, and this could be resolved then.
Reported: August 20, 2002 - moseley
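One way to picture the phrase problem above: a grouped alternative cannot sit inside a phrase, but the same intent can be written as several plain phrases joined by OR. The sketch below is only an illustration of that rewriting, not how swish-e behaves or how the parser rewrite would work. The function names, the `alternatives` mapping, and the double-quote phrase syntax in the output are all assumptions.

```python
from itertools import product


def expand_phrase(words, alternatives):
    """Return one plain phrase per combination of alternative words.

    alternatives is a hypothetical mapping from a query word to the list of
    search words it expands to; words without an entry are kept unchanged.
    """
    choices = [alternatives.get(word, [word]) for word in words]
    return [" ".join(combo) for combo in product(*choices)]


def to_query(phrases):
    """Join plain phrases with OR, quoting each one (illustrative syntax)."""
    return " OR ".join('"%s"' % phrase for phrase in phrases)


if __name__ == "__main__":
    words = ["some", "phrase", "name", "here"]
    # Hypothetical expansion: "name" yields two search words.
    alternatives = {"name": ["word1", "word2"]}
    print(to_query(expand_phrase(words, alternatives)))
```

The number of phrases grows with the product of the alternatives per word, so a long phrase in which every word has two expansions would produce many sub-queries.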
Merging
merge.c does not check for matching stopwords or buzzwords in each index.
History:
Reported: September 3, 2002 - moseley
ResultSortOrder
ResultSortOrder is not used (and is not documented). The problem is that the data passed to Compare_Properties() does not have access to the ResultSortOrder table.
History:
Reported: September 3, 2002 - moseley
$Id: SWISH-BUGS.pod,v 1.7 2003/08/14 20:43:35 whmoseley Exp $