Searching your databases with SFgate means filling out HTML forms to specify queries. Setting up an SFgate application therefore means setting up an HTML form that provides the user interface to SFgate.
Most of the examples used in the explanations below are taken from the demo (see section A Sample SFgate Application).
SFgate offers many configuration options for your searches. Most of them concern the presentation of search results, some concern the generation of the results, and some let you obtain debugging information.
In principle there are two ways to do the configuration. If you use forms, you can configure SFgate per form with FORM tags within your forms. In addition, almost every configuration option has a default value which is hardwired in `config.pl'. Editing `config.pl' is the second way to configure SFgate searches; the values in `config.pl' act as default settings.
For some configuration options it does not make sense to set them per form. Here the only way to change the setting is to edit `config.pl'. For other options it does not make sense to hardwire a value. Here you have to configure them per form.
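As a rough sketch of per-form configuration, the fragment below fixes two options with hidden fields, so that the form's values are used in place of the defaults from `config.pl'. The tag names `maxhits' and `language' are described later in this chapter; the values chosen are only examples.

```html
<!-- Per-form settings: used instead of the defaults from config.pl -->
<INPUT TYPE="hidden" NAME="maxhits" VALUE="20">
<INPUT TYPE="hidden" NAME="language" VALUE="german">
```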
Querying WAIS databases with SFgate is typically done by filling in input fields in forms. Let's have a short look at the query syntax to use within these input fields.
Generally speaking, you can use every feature of the freeWAIS-sf query syntax (see section `Query Syntax' in The freeWAIS-sf Manual). However, to support the different ways of building an SFgate query form, the SFgate query syntax has been extended. SFgate parses queries in its own syntax and maps them onto the freeWAIS-sf query syntax.
A query consists of a set of conditions. If a specific field is to be searched, a condition consists of a field specifier, a predicate and a comparison value. To search the global part of a WAIS database, omit the field specifier; the predicate can be one of `soundex' and `phonix'.
Query conditions can be combined using boolean operators and/or parentheses, and they can be nested to arbitrary depth.
See section C The Detailed Query Syntax for a more formal view on the query syntax.
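As an informal illustration (not a substitute for the formal syntax), the input field below is pre-filled with a query that combines two conditions with a boolean operator and parentheses. The field names `au' and `ti' are assumed to exist in the database, and it is assumed here that field specifiers may be typed into the global `text' input field.

```html
<!-- Pre-filled query: an author condition OR a title condition -->
<INPUT TYPE="text" NAME="text" SIZE=60
       VALUE="au=(norbert and fuhr) or ti=(retrieval)">
```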
Online HTML forms are the appropriate user interface to your databases. Forms make it easy for users to specify their information need.
We assume that you know how to set up HTML forms. Otherwise, learning by doing is the best way to get familiar with forms that contact SFgate. If you get to a point where you don't know how to continue, take a look at the demo files or at other SFgate applications.
SFgate supports the request methods `GET' and `POST'.
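A minimal skeleton of such a form might look like the sketch below. The ACTION URL `/cgi-bin/SFgate' is an assumption and depends on where SFgate is installed on your HTTP server; the tags inside the form are explained in the following sections.

```html
<!-- ACTION is hypothetical: use the URL of your own SFgate installation -->
<FORM METHOD="POST" ACTION="/cgi-bin/SFgate">
  <INPUT NAME="database" TYPE="hidden"
         VALUE="ls6-www.informatik.uni-dortmund.de:210/demo">
  Search for: <INPUT TYPE="text" NAME="text">
  <INPUT TYPE="submit" VALUE="Search">
</FORM>
```

Replacing `POST' with `GET' sends the same fields as part of the URL instead of the request body.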
To specify databases, name the FORM tags used `database'. You have to know the name of the server, the port if it differs from 210 (the default port number for the WAIS protocol) and the name of the database:
<INPUT NAME="database" TYPE="checkbox" VALUE="ls6-www.informatik.uni-dortmund.de:210/demo"> A demo database with scanned images
The following leads to the same database, since 210 is the default port and can be omitted:
<INPUT NAME="database" TYPE="checkbox" VALUE="ls6-www.informatik.uni-dortmund.de/demo"> A demo database with scanned images
In case you are using the Wais module and you want to search local databases (i.e. databases residing on the same host as your HTTP server) you can use the local search facility. Searches will be much faster than without local search.
Use the `local' keyword to specify local databases and provide the full path to the database:
<INPUT NAME="database" TYPE="checkbox" VALUE="local//home/wais/wais-sources/demo"> A demo database with scanned images
Note the double slash.
If you set the default database directory within the `Makefile.PL' run to `/home/wais/wais-sources' (see section 3.3.1.3 The Default Directory for WAIS Databases), the following would lead to the same database:
<INPUT NAME="database" TYPE="checkbox" VALUE="local/demo"> A demo database with scanned images
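Putting these variants together, a form might offer several databases at once. In the sketch below the host `wais.example.org', the port 8000 and the database name `reports' are made up for illustration; the other two entries use the demo database from above.

```html
<!-- Remote database on the default WAIS port (210), port omitted -->
<INPUT NAME="database" TYPE="checkbox" CHECKED
       VALUE="ls6-www.informatik.uni-dortmund.de/demo"> Demo database<BR>
<!-- Hypothetical remote database on a non-default port -->
<INPUT NAME="database" TYPE="checkbox"
       VALUE="wais.example.org:8000/reports"> Reports<BR>
<!-- Local database below the default database directory -->
<INPUT NAME="database" TYPE="checkbox"
       VALUE="local/demo"> Local copy of the demo database
```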
To specify freeWAIS-sf fields (see section `Definition of Field and Index Types' in The freeWAIS-sf Manual) in forms for use with SFgate, name the FORM tags after the fields in your WAIS database.
For example, if you have a field `au' for author in your freeWAIS-sf database, you can specify in your form:
Author: <INPUT TYPE="text" NAME="au">
Everything entered in this input field is searched for in the author index.
There's one special field within WAIS databases: the default or global field. To search this field you have to provide an input field named `text':
Global Field: <INPUT TYPE="text" NAME="text">
If you want more than one input field for one field in the database, you can enumerate the fields. For example, if you want two author input fields, specify:
Author: <INPUT TYPE="text" NAME="au_1">
Author: <INPUT TYPE="text" NAME="au_2">
Later in this section we'll show how to choose individual Boolean connectors for each input field.
Note that freeWAIS-sf makes it possible to index date fields. A date value in a document is represented in the form `yyyymmdd' in the corresponding index.
To query such a field you can set up three input fields for year, month and day. Suppose we have an entry date field in our database called `ed':
entry date:
year (yyyy, yy): <INPUT TYPE="text" NAME="ed_year" SIZE=4>
month (mm, m): <INPUT TYPE="text" NAME="ed_month" SIZE=2>
day (dd, d): <INPUT TYPE="text" NAME="ed_day" SIZE=2>
The second possibility is to set up just the year input field:
Entry date: <INPUT TYPE="text" NAME="ed_1_year" SIZE=10>
(Note that enumeration is also possible with date fields!) In this case the user has various possibilities to specify dates:
Another way to build the query form is to provide input fields for which the user decides which database field is to be searched.
The name of such a field consists of `fieldsel' and an optional enumeration. For one input field you have to provide several FORM tags:
One tag is for the input field itself; the field name suffix is `_content':
<INPUT TYPE="text" NAME="fieldsel_1_content">
The second tag is for field selection; the field name suffix is `_name':
<SELECT NAME="fieldsel_1_name">
  <OPTION> au
  <OPTION> ti
  <OPTION> text
</SELECT>
Because field names in WAIS databases are often mnemonic abbreviations, you can provide a more descriptive label to the user. The first step is to replace the `_name' suffix in the field selection tag with the suffix `_description':
<SELECT NAME="fieldsel_1_description">
  <OPTION> author name
  <OPTION> title
  <OPTION> text
</SELECT>
The second step is to tell SFgate which field each description belongs to. Use `_name_<field>' as suffix, where `<field>' is replaced with the name of the field in the WAIS database:
<INPUT TYPE="hidden" NAME="fieldsel_name_au" VALUE="author name">
<INPUT TYPE="hidden" NAME="fieldsel_name_ti" VALUE="title">
<INPUT TYPE="hidden" NAME="fieldsel_name_text" VALUE="text">
Note that enumeration does not make sense for the field-to-description mapping.
By the way, if you enumerate the other fields, make sure that the enumeration comes directly after `fieldsel_'. Right: `fieldsel_0_description'. Wrong: `fieldsel_description_0'.
Numeric database fields can be searched with the predicates <, <=, ==, >= and >.
You can let the user choose the predicate to be used within searches. Take the name of the FORM tag of the numeric field (including the enumeration if present) and append the suffix `_p':
<SELECT NAME="py_p">
  <OPTION> <
  <OPTION> <=
  <OPTION> ==
  <OPTION> >=
  <OPTION> >
</SELECT>
<INPUT TYPE="text" NAME="py" SIZE=4>
The same applies to date fields. If your form uses field selection, specify predicates as follows:
<SELECT NAME="fieldsel_1_description">
  <OPTION> volume number
  <OPTION> issue number
</SELECT>
<SELECT NAME="fieldsel_1_p">
  <OPTION> <
  <OPTION> <=
  <OPTION> ==
  <OPTION> >=
  <OPTION> >
</SELECT>
<INPUT TYPE=TEXT NAME="fieldsel_1_content">
<INPUT TYPE="hidden" NAME="fieldsel_name_vo" VALUE="volume number">
<INPUT TYPE="hidden" NAME="fieldsel_name_no" VALUE="issue number">
Analogous to specifying predicates, you can specify index types for text input fields. Just use the suffix `_i' instead of `_p'. Here is a simple example:
<INPUT TYPE="text" NAME="au" VALUE="fuhr">
<SELECT NAME="au_i">
  <OPTION> soundex and plain
  <OPTION> plain
  <OPTION> soundex
</SELECT>
Valid names for the different indextypes are listed in the table below.
If you set up a form with multiple input fields, you may want to choose the boolean operator that connects the contents of the individual input fields to form a query. To do this, set up a FORM tag named `tie' and set it to the desired boolean operator.
Connect fields with <INPUT TYPE="radio" NAME="tie" VALUE="and">AND <INPUT TYPE="radio" NAME="tie" CHECKED VALUE="or">OR.
Alternatively, set the variable `$tie' in `config.pl' to the desired operator. The default value of `tie' is OR.
The second way to set boolean connectors applies to connecting single query conditions with the rest of the query.
To specify that a single condition must be met by resulting documents, take the FORM tag name of the field and append `_and':
The following condition must be met: publication year <INPUT TYPE="text" NAME="py_and">
The same result is achieved with an additional FORM tag whose name consists of the input field name plus the extension `_tie':
The following condition must be met: publication year <INPUT TYPE="text" NAME="py"> <INPUT TYPE="hidden" NAME="py_tie" VALUE="and">
This also makes it possible to let the user of the form choose the connection for a single input field:
publication year <INPUT TYPE="text" NAME="py">
connect with:
<SELECT NAME="py_tie">
  <OPTION> and
  <OPTION> or
</SELECT>
This mechanism also works for field-selection input fields.
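For a field-selection input field this might look like the sketch below. The tag name `fieldsel_1_tie' is an assumption, formed by analogy with `py_tie' and with the `fieldsel_1_p' suffix shown earlier; check it against your SFgate version.

```html
<SELECT NAME="fieldsel_1_name">
  <OPTION> au
  <OPTION> ti
</SELECT>
<INPUT TYPE="text" NAME="fieldsel_1_content">
<!-- Assumed name: field-selection identifier plus the _tie suffix -->
connect with:
<SELECT NAME="fieldsel_1_tie">
  <OPTION> and
  <OPTION> or
</SELECT>
```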
To set the boolean operator for in-field query conditions, either use an input field tag named `tieinternal' or edit the variable `$tieinternal' in `config.pl':
Connect in-field query conditions with <INPUT TYPE="radio" NAME="tieinternal" VALUE="and">AND <INPUT TYPE="radio" NAME="tieinternal" CHECKED VALUE="or">OR.
With `tieinternal' set to AND, a query condition like `author=(norbert fuhr)' would result in `author=(norbert and fuhr)'.
The language modules (`$SFgate/lib/SFgate/Languages') contain translations of the boolean operators for the specific language which you might want to use instead of AND, OR and NOT (see section 5.6.1.2 Language of SFgate Output).
To achieve a two-dimensional boolean connection of query conditions, you can group several input fields and specify how query conditions specified within this group shall be connected to the rest of the query. Use `group' as the name to build up groups:
<INPUT TYPE="hidden" NAME="group_1" VALUE="ti_1,ti_2,ti_3">
<INPUT TYPE="hidden" NAME="group_2" VALUE="au_1,au_2,au_3">
title
<INPUT TYPE="text" NAME="ti_1">
<INPUT TYPE="text" NAME="ti_2">
<INPUT TYPE="text" NAME="ti_3">
Connect this group with:
<SELECT NAME="group_1_tie">
  <OPTION> or
  <OPTION> and
</SELECT>
author name
<INPUT TYPE="text" NAME="au_1">
<INPUT TYPE="text" NAME="au_2">
<INPUT TYPE="text" NAME="au_3">
Connect this group with:
<SELECT NAME="group_2_tie">
  <OPTION> or
  <OPTION> and
</SELECT>
Note that grouping of field-selection input fields is also possible. Use the `fieldsel' identifier with the optional enumeration to name a field-selection input field in a grouping element.
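A group made of field-selection input fields could be sketched as follows. The group value `fieldsel_1,fieldsel_2' follows the naming rule just described (identifier plus enumeration); the fields `au' and `ti' are assumed to exist in the database.

```html
<!-- Group two field-selection inputs by their fieldsel identifiers -->
<INPUT TYPE="hidden" NAME="group_1" VALUE="fieldsel_1,fieldsel_2">
<SELECT NAME="fieldsel_1_name"> <OPTION> au <OPTION> ti </SELECT>
<INPUT TYPE="text" NAME="fieldsel_1_content">
<SELECT NAME="fieldsel_2_name"> <OPTION> au <OPTION> ti </SELECT>
<INPUT TYPE="text" NAME="fieldsel_2_content">
Connect this group with:
<SELECT NAME="group_1_tie"> <OPTION> or <OPTION> and </SELECT>
```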
Groups can be used to specify an indextype for every member of the group. This can be done for group two in the example above as follows:
Select indextypes for the author group: <SELECT NAME="group_2_i"> <OPTION> text <OPTION> soundex </SELECT>
SFgate's output consists of HTML pages (what else?). You can configure the layout of these pages.
By default SFgate inserts a link to a short description of SFgate in the header and footer of every page.
You can change this behaviour by providing pieces of HTML code to be inserted on SFgate pages instead of the default header and footer.
To do this, specify a FORM tag named `application' and set it to the name of the application:
<INPUT TYPE="hidden" NAME="application" VALUE="demo">
If you set the application files directory (within the `Makefile.PL' run) to `/example/dir' (see section 3.3.1.10 Directory for Application Files), create a file `/example/dir/demo_header' and insert the HTML code for the header there. If you want to provide language-specific headers (see section 5.6.1.2 Language of SFgate Output), append the respective language, e.g. for English: `/example/dir/demo_header_English'. Note the capitalized first letter!
Define a file for the footer analogously: `/example/dir/demo_footer' (`/example/dir/demo_footer_English').
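The content of such a header file is an ordinary HTML fragment. A minimal sketch of what `/example/dir/demo_header' might contain (the wording and the link target `/demo.html' are invented for illustration):

```html
<!-- Hypothetical content of /example/dir/demo_header -->
<H1>Demo Database</H1>
<A HREF="/demo.html">Back to the query form</A>
<HR>
```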
SFgate can produce its output in several languages: English, Dutch, French, German, Italian, Portuguese, Spanish and Swedish.
The default language is English. To select another language, either edit the variable `$language' within `config.pl' or select it via a FORM tag named `language':
<SELECT NAME="language"> <OPTION> english <OPTION> dutch <OPTION> french <OPTION> german <OPTION> italian <OPTION> portuguese <OPTION> spanish <OPTION> swedish </SELECT>
If your search yields results, you are presented with a headline menu by default. Every headline points to its corresponding document.
To obtain verbose headlines, set a FORM tag named `verbose' or the variable `$verbose' in `config.pl' to `1'. With verbose headlines, every headline is accompanied by meta information about the related document.
The default is non-verbose headlines, i.e. `verbose' is set to `0'. You just get the text of the headline.
<INPUT TYPE="radio" NAME="verbose" CHECKED VALUE="1">verbose headlines <INPUT TYPE="radio" NAME="verbose" VALUE="0">short headlines.
By default you can select one document from the headline menu, which is fetched directly after selection. To get further documents you have to return to the headline menu and select the next one, and so on.
If you want to select more than one document from the headline menu and get the selected documents concatenated into a list, set a FORM tag named `multiple' to `1' or edit the variable `$multiple_choice' within `config.pl'.
<B>Multiple choice headlines?</B> <INPUT TYPE="radio" NAME="multiple" CHECKED VALUE="1">YES <INPUT TYPE="radio" NAME="multiple" VALUE="0">NO.
Setting `multiple' to `1' yields a headline menu where you can select documents by checking checkboxes. Pressing the `fetch document' button fetches the checked documents; pressing the `new choice' button clears the checkboxes.
You can choose whether you want your headlines to appear in a description list, in a preformatted style or in a table. Set a FORM tag named `listenv' or the variable `$listenv' in `config.pl' to `DL' for a description list (this is the default), to `PRE' for the preformatted style or to `TABLE' for a table.
<INPUT TYPE="radio" NAME="listenv" CHECKED VALUE="DL">description list <INPUT TYPE="radio" NAME="listenv" VALUE="PRE">preformatted <INPUT TYPE="radio" NAME="listenv" VALUE="TABLE">table.
If you want to customize the maximum number of hits transferred, use a FORM tag named `maxhits' to specify the desired value or edit the variable `$WAISmaxdoc' in `config.pl'. The default is `40'.
How many hits do you want at most? <INPUT NAME="maxhits" TYPE="text" VALUE="40" SIZE=2>
To skip the headline menu and get the documents found directly, set a FORM tag named `directget' to `1' or edit the variable `$direct_get' within `config.pl'. The default is to present a headline menu (`directget' set to `0').
If `directget' is set to `1', SFgate skips the headline menu and concatenates all documents found into the result HTML page.
<INPUT NAME="directget" TYPE="checkbox" VALUE="1">
If more documents match the query than are shown in a search result, the user has to resubmit the query with a higher maximum number of hits (see section 5.6.2.4 Maximum Number of Hits) to see these documents.
Another way is to provide a pointer to further matching documents. This can be done by setting the value of `range' either in the query form or within `config.pl'. If `range' has a value greater than zero, SFgate tries to fetch as many documents as specified by `maxhits', beginning with the `range'th document. So `range' should be set to `1' in your form if you want to provide pointers to further matching documents:
<B>Give pointer to further matching documents?</B> <INPUT TYPE="radio" NAME="range" CHECKED VALUE="1"><B>YES</B> <INPUT TYPE="radio" NAME="range" VALUE="0"><B>NO.</B>
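The presentation options described so far can be combined in a single form. The fragment below is only a sketch using the tag names introduced above with example values: verbose headlines in a table, at most 20 hits at a time, and pointers to further matches.

```html
<INPUT TYPE="hidden" NAME="verbose" VALUE="1">      <!-- verbose headlines -->
<INPUT TYPE="hidden" NAME="listenv" VALUE="TABLE">  <!-- headlines in a table -->
<INPUT TYPE="hidden" NAME="maxhits" VALUE="20">     <!-- at most 20 hits -->
<INPUT TYPE="hidden" NAME="range" VALUE="1">        <!-- start at the first hit -->
```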
SFgate knows three kinds of document presentation.
The first form is presenting a single document which results from following directly a link from the headline menu. The document is shown without the corresponding headline.
The second form results from bypassing the headline menu: the result is a list of documents with their headlines.
The third form results from a multiple choice headline menu. Here the documents are concatenated without their headlines.
Before showing documents SFgate can convert them. Each converter is encapsulated in its own Perl library. Look at the files in `SFgate/Converter/' below the Perl library directory you specified within the `Makefile.PL' run (see section 3.3.1.1 Installation Directory for Perl Libraries).
To make use of a converter, set up a FORM tag named `convert'. Use the basename of the library in which the converter resides as the value:
Conversion of documents:
<input type="radio" name="convert" value="Bibtex"> BibTeX
<input type="radio" CHECKED name="convert" value="Label"> pretty
With multiple choice headline menus it is possible to select the conversion from the search result page. This feature must be configured in the SFgate application form. You have to provide the names (and optionally descriptions) of one or more converters which should be selectable from the multiple choice headline menu. This is done via an input field tag named `convertm'. Its value is a semicolon-separated list of converter specifications. A converter specification consists either of the name of the converter (see the `convert' field above) or of the name, a comma and a converter description.
<INPUT TYPE="hidden" NAME="convertm" VALUE="Label,pretty;Bibtex,BibTeX">
These are the same converters as used in the example above.
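Since this feature only applies to multiple choice headline menus, a form using it would typically switch those on as well. A sketch, reusing the library basenames from the `convert' example:

```html
<!-- Enable the multiple choice headline menu -->
<INPUT TYPE="hidden" NAME="multiple" VALUE="1">
<!-- Converters offered on the search result page: name,description;... -->
<INPUT TYPE="hidden" NAME="convertm" VALUE="Label,pretty;Bibtex,BibTeX">
```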
The best way to learn how to write and install your own converters is to look at the converters that come with the distribution. These are the main key points on writing a converter:
* The converter must reside in a library below the Perl library directory you specified within the `Makefile.PL' run. It must be encapsulated in a package named exactly like the library.
* The library must provide a `convert' subroutine which is called by SFgate. It is called with two arguments, the document text itself and the headline. These two items should be converted and returned. If you don't want the WAIS headline to be displayed along with the converted document, just return an empty string instead.
* HTML special characters (&, <, >) in text that is not HTML code should be encoded. To do this you can use the function `&::encode_entities', e.g. `$ntext = &::encode_entities($ntext)' encodes the special chars in `$ntext'.
Here's an example of a converter doing exactly nothing:
package SFgate::Converter::Non;

## #####################################################################
## convert
## #####################################################################
## interface for SFgate to convert one document
##
## (string) $text:     document to convert
## (string) $headline: headline of document to convert
##
## By default every document is printed with its corresponding wais
## headline. If you want to derive another headline just change
## the $headline variable according to your needs.
##
## HTML special characters (&, <, >) should be encoded in text not
## being HTML code. To do this you can use the function
## &::encode_entities, e.g. $ntext = &::encode_entities($ntext)
## encodes the special chars in $ntext.
##
## returns:
## - string: the converted document ($ntext)
## - string: headline of converted document ($headline)
##
sub convert {
    my($text, $headline) = @_;

    return($text, $headline);
}

# a Perl library must return a true value when it is loaded
1;
Besides converting documents, you can convert headlines before they are displayed within the headline menu. To do so, you must provide a `convert_headline' subroutine within the converter module to be used. Analogous to specifying a converter module for documents (see section 5.6.3.1 Converting Documents), set up a FORM tag named `converthl'. Use the basename of the library in which the headline converter resides as the value:
<input type="hidden" name="converthl" value="Donothing">
And that's the way a headline converter works: your library holding the headline converter (in the example `SFgate/Converter/Donothing.pm') must provide the subroutine `convert_headline', which takes two arguments, `$headline' and `$url'.
The return value should be the new headline and the corresponding URL. Take a look at the `Donothing' example:
sub convert_headline {
    my($headline, $url) = @_;

    # here your headline conversions should go...

    return qq[$headline];
}
If some or all elements of a query result set consist of WAIS source descriptions (see section `Database Description' in The freeWAIS-sf Manual), these are converted to a query form on the fly. Within the result set you can select the corresponding databases by checking checkboxes and send queries to them by filling in a textarea.
If there are WAIS source descriptions within the query result, the multiple choice headline menu option is disabled.
If something does not run as expected, debugging information may lead to the solution of the problem.
There are two ways to get debugging information. The first is to dump the environment of the process running SFgate; the second is to get debugging information concerning the execution of SFgate.
To obtain the environment settings of your executing SFgate, you can set a FORM tag named `dumpenv' to `1' or `yes'. The environment is dumped to an HTML page and the script exits.
<SELECT NAME="dumpenv"> <OPTION> no <OPTION> yes </SELECT>
You can't set this option within `config.pl'.
To get (a lot of) debugging information, you can specify a FORM tag named `debug' and set its value either to `on' or to `1'. A value of `0' or `off' means that no debugging messages are printed.
If there are problems with SFgate you should try this option.
<SELECT NAME="debug"> <OPTION> off <OPTION> on </SELECT>
The default value for this option is `off'.
If you want to set this option in `config.pl', edit the variable `$debug'.
By default SFgate writes a logfile `SFgate.log' in the directory you specified within the `Makefile.PL' run (see section 3.3.1.11 Directory for SFgate.log). If you want to turn logging off, set the variable `$logging' within `config.pl' to 0.
Note that your HTTP server must have write permission for `SFgate.log'. Otherwise logging is silently disabled.
You can't set this option within a form.
On some terminals it is not easy to generate the German special characters (Umlaute) like ä, ö, ü, ß. To make this easier you can specify these characters in LaTeX notation, i.e. `"a' for ä, `"o' for ö, `"u' for ü, `"s' for ß. Just set the `detex' field or edit the `$detex' variable in `config.pl':
<INPUT NAME="detex" TYPE="hidden" VALUE="1">
With freeWAIS-sf comes the possibility to index HTML pages and put the URL of the documents in the document id instead of a conventional WAIS document id. Additionally, these documents get the document type `URL'. If SFgate has to show the headline of such a document, it extracts the URL and lets the headline point directly to the HTML document, so that no WAIS request is needed to fetch that document.
There are two ways to fetch the document instead. On the one hand, SFgate can give just the URL as the link, so that the HTTP client has to fetch the document if the user requests it. On the other hand, SFgate can encode the URL so that a user request for the document leads to a call of SFgate, which then has to fetch the document via its own HTTP client.
The first method is the default. If you want the second method, you have to edit `config.pl' and change the value of `directhttp' from `1' to `0' (or set up a FORM tag named `directhttp').
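In a form, the second method could be selected with a hidden field like the one below. This is a sketch based on the description above; the value `0' mirrors the `config.pl' setting.

```html
<!-- Let SFgate fetch HTTP documents itself instead of linking to them -->
<INPUT TYPE="hidden" NAME="directhttp" VALUE="0">
```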
Even if the first method is the default way to fetch HTTP documents, there are cases where it makes sense to let SFgate fetch such documents.
The first case is when a user selects documents via a multiple choice headline menu. The second case is when the headline menu is to be omitted (`directget' is set).
In both cases the result is a page with possibly more than one document concatenated.