INFORMATION TO DO WITH SEARCHING STRATEGIES AND
PROBLEMS


Subject: Searching Call Numbers

Using the Display1 Item command:
  If you type in a call number and put a $ at the end for truncation,
  you will get a list of call numbers similar to a browse.
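
The truncation behaviour described above can be pictured as a prefix lookup over a sorted list. This is only a sketch, not how Unicorn implements it: the function name and the sample call numbers are made up for illustration, and a plain string sort is not true shelf order.

```python
from bisect import bisect_left


def browse_truncated(call_numbers, query):
    """Return the call numbers that start with the text before a trailing '$'.

    Hypothetical helper: models '$' truncation as a simple prefix match.
    """
    prefix = query.rstrip("$").upper()
    ordered = sorted(cn.upper() for cn in call_numbers)
    # Jump to the first entry that could match, then walk forward
    # until the prefix no longer matches (a browse-like listing).
    start = bisect_left(ordered, prefix)
    matches = []
    for cn in ordered[start:]:
        if not cn.startswith(prefix):
            break
        matches.append(cn)
    return matches


# Made-up call numbers, for illustration only.
shelf = ["QA76.73 .P98", "QA76.9 .D3", "QH541 .O38", "QA276 .S5"]
print(browse_truncated(shelf, "QA76$"))
```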

Subject: Author Keyword search

If you try an AUTHOR KEYWORD search on their example, HILL C$, and set the options to PARTIAL:

  via webcat, you get 20 hits and no other information;
  via chui, you get a screen asking if you want the full/partial hitlist. Selecting full, you get 113 hits; selecting partial, you get the 20 hits webcat users see.

OUR DEFAULT IN THE OPTIONS FOR KEYWORD IS "KEYWORD" AND IF THE OPTION IS NOT CHANGED, YES, THE USER GETS THROWN INTO A BROWSE LIST.

> HILL C$ is a good example of an instance where even a high truncation
> threshold rolls over to a browse, presumably because C$ returns too
> many results--potentially a serious retrieval problem for users with
> citations in "scientific" form, where the author's first name is
> abbreviated to a single initial.
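
The difference between the two interfaces can be summarised in a small model. This is a sketch of the behaviour as reported above, not of Unicorn's code: the function name and the interface labels are hypothetical, and the cutoff of 20 is simply the hit count seen in the HILL C$ example.

```python
def hitlist(hits, interface, choice=None, threshold=20):
    """Model of the partial/full hitlist behaviour described above.

    Assumed logic: below the threshold everything is shown; above it,
    webcat silently shows a partial list, chui asks the user.
    """
    if len(hits) <= threshold:
        return hits
    if interface == "webcat":
        # No prompt is offered; the user only ever sees the partial list.
        return hits[:threshold]
    # Character interface (chui): the user chooses FULL or PARTIAL.
    if choice == "FULL":
        return hits
    if choice == "PARTIAL":
        return hits[:threshold]
    return []  # search cancelled, or no choice made


# 113 placeholder records, matching the hit count reported for HILL C$.
all_hits = [f"record {n}" for n in range(1, 114)]
print(len(hitlist(all_hits, "webcat")))        # 20
print(len(hitlist(all_hits, "chui", "FULL")))  # 113
```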

Subject: SERIAL hypertext links

This becomes somewhat longer and more complex than originally anticipated. Earlier this week I noticed a loose thread on my ankle, pulled and pulled, and discovered I had unraveled the entire hem of my trouser leg. I'm beginning to feel the same way about 78X added title entries for serials.

I was able to get on the system before anyone was active this morning, to make and test some changes related to "hypertext entries". These are handled slightly differently for terminal and web interfaces, but all affect the 780 (earlier title) and 785 (later title) fields in the SERIALS format.

For the character interface, a "related subfields" attribute governs what subfields are used to do a lookup from the LIKE button. I believe that as part of last summer's V7 to V8 migration, these got reset to ALL, which meant that a LIKE lookup from these fields would always fail. As of this morning, only subfield t is included in the LIKE lookup, and it works.

WebCat uses a different mechanism, since there is no LIKE button. Instead, in typical web fashion, any entry that can be used for a hypertext lookup appears as a link (usually represented as blue underlined text). The subfields displaying in WebCat are controlled by the "display subfields" attribute, which has also been changed to include only subfield t for 780 and 785. As of this morning, these lookups finally work in WebCat. See the British periodical HEART for a good example.

NOTES:

1. Both character and webcat interfaces are actually doing a PARTIAL title lookup (not a periodical title lookup) for the hypertext entry.

In the character user interface, a lookup that returns a large number of hits (I believe the threshold is 24 hits, but I haven't yet found out whether it is configurable) first returns an intermediate screen:

=======================================================================
Partial search results too long
To pick a new button, first return to buttons by pressing TAB(s).
Select one of the partial search result options, then press RETURN or
HELP GOBACK STARTOVER PRINT UTILITIES END CLEAR TYPE ACCESS REQUEST
-----------------------------------------------------------------------
PARTIAL SEARCH RESULT OPTIONS
Result of partial search too long.
Do you want to CANCEL the search, display PARTIAL results, or display
FULL results?
=======================================================================

In WebCat, the same thing happens, BUT WebCat doesn't offer the PARTIAL/FULL response, and defaults to PARTIAL. This hasn't caused any problems in the tests that I have done, but I am sure that there will be situations with words more common than HEART where the desired title is not included in the 24 records returned by WebCat. Sirsi is aware of this problem with PARTIAL searches in WebCat.

2. While researching this, I came upon a record with the title "Combined cumulative index to cardiology," which leads me to ask if similar changes are needed for the 787 (Other Title) field in the Serials format. Currently this field is defined as a hypertext field, which means it shows up under the LIKE button, BUT it is not included in the FULL entry list, so these titles do not show up in WebCat unless one changes the view option to ALL. Further, because related/display subfields are set to ALL, hypertext lookups fail for 787 titles as they did for 780/785 fields prior to today.

For the example title, the following 787 fields exist:

  Other title: American heart journal
  Other title: American journal of cardiology
  Other title: British heart journal
  Other title: Cardiology
  Other title: Cardiovascular research
  Other title: Circulation
  Other title: Circulation research
  Other title: Journal of molecular and cellular cardiology

If you feel that hypertext lookups for 787 would be useful, policy definitions for this field in the SERIALS format can be brought in line with those for 780/785.

3. BROWSE index policies remain unchanged, so ISSN and OCLC numbers for these linking fields still appear in the browse indexes. My recollection is that this was an express desire of OPAC and serials folks when we initially set up browse index policies. However, staff should be aware of what this actually means, which may not be immediately obvious. A Periodical Title browse on BRITISH HEART JOURNAL returns:

  BROWSING THE CATALOG
  1> BRITISH HEART JOURNAL [2]
  2) BRITISH HEART JOURNAL DLC SC 85001051 OCOLC 1537247 [1]
  3) BRITISH HEART JOURNAL OCOLC 1537247 [1]

Browse index entry 2 actually points to the journal HEART, because this added title entry (including the various control numbers) is encoded in the 780 of this record, and nowhere else. Browse index entry 3 actually points to the record for Combined cumulative index to cardiology journal, which has a 787 entry for this complete field.

Unicorn is behaving exactly as it should, given the way we have set things up. Whether the way we have set them up is what we actually want is a matter for consideration by ESC, and perhaps the OPAC working group.

4. Policy changes described in notes 1 and 2 are effective with only a halt/restart, and require no re-indexing. Browse index policies described in note 3 require a total system rebuild (tentatively planned for July 3-4, 1998).

There are other problems regarding the use of hypertext links in WebCat for library catalogues and certain Z39.50 destinations.

WebCat encloses the hypertext field in double quotes and sends it to the server. Quoted search strings tell BRS to do a literal search of everything. For example, given

  "John, Smith"

BRS searches for the comma.

I have asked SIRSI to get WebCat to strip out punctuation before sending the search string. Another option is NOT to send the string as quoted. This poses problems of its own when the string contains an operator like "and". The TY was programmed to deal with all of the above issues.

As Selden mentions, the WebCat hypertext link should not be equated with the LIKE operator. For our in-house workstations I have added Javascript to do some reformatting of hypertext linked fields before they are transmitted to the server. In the meantime, SIRSI has addressed most of these problems within the cgiopac.

Slavko
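
The kind of reformatting described above (stripping punctuation before the quoted string is sent) might look roughly like this. It is a Python stand-in for illustration only, not the Javascript actually used on the in-house workstations and not what cgiopac does; the function name and the choice to replace every ASCII punctuation character with a space are assumptions.

```python
import string


def clean_hypertext_term(field_text, quoted=True):
    """Strip punctuation from a hypertext field before it is searched.

    Hypothetical helper: every ASCII punctuation character becomes a
    space, runs of whitespace are collapsed, and the result is
    optionally wrapped in double quotes as WebCat does.
    """
    table = str.maketrans({ch: " " for ch in string.punctuation})
    cleaned = " ".join(field_text.translate(table).split())
    return f'"{cleaned}"' if quoted else cleaned


print(clean_hypertext_term("John, Smith"))  # "John Smith"
```

Note that the unquoted form still passes words such as "and" through untouched, so the operator problem mentioned above is not addressed by this sketch.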

Subject: KEYWORD SEARCHING

The default operator delivered by SIRSI for keyword searching is SAME. This operator acts just like the AND operator, only it narrows the search universe to a paragraph (tag). This helps prevent false hits. For example, a search for JOHN AND KENNEDY using SEARCH EVERYTHING could find JOHN WILSON as a personal author, and FRED KENNEDY as an added author. This would be perfectly legitimate when considering the definition of the AND operator, since John AND Kennedy were found. Submitting the search with the SAME operator forces the two words to be in the same paragraph, so it is much more likely that you will find Kennedy, John F., and only Kennedy, John F.

Which default operator to use is a local decision. Unicorn supports AND, OR, NOT, XOR, SAME, NEAR, WITH, ADJ.

Greg

Mack Lundy wrote:

We are on Unicorn98 and use WebCat as our public access method. Assuming that your 020 and 001 tags are keyword indexed, then you can do a search on ISBN and OCLC number by plugging the number into the search field and clicking on "search everything." I will grant you there is some difference in search results when doing a search with AND between the keywords in a general search and the same search without the boolean operator, but maybe we should just consider that a BI opportunity : )

This is the way we configure it to work. An exact search uses the browse (heading) indexes to start the search. In our standard configuration we create browse indexes for author, title, and subject. When the search screens are set up, general browsing and general exact searching are sent to the subject browse index. Using systemconfig, you can alter our standard configuration.

Please note that keyword searching works differently: our standard configuration provides for hundreds of indexes, usually for specific tags, as well as a number of grouped indexes such as author, title, subject and everything.

jim
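
The JOHN / KENNEDY example can be made concrete with a toy model in which a record is a mapping from tag to text. This is only a sketch of the AND versus SAME distinction as Greg describes it, not of how BRS evaluates a search; the function names and the two sample records are hypothetical.

```python
def words(text):
    """Split a field into upper-case words, ignoring commas and periods."""
    return set(text.upper().replace(",", " ").replace(".", " ").split())


def match_and(record, terms):
    """AND: every term appears somewhere in the record, in any tag."""
    everything = set()
    for text in record.values():
        everything |= words(text)
    return all(t.upper() in everything for t in terms)


def match_same(record, terms):
    """SAME: every term appears within one single tag (paragraph)."""
    return any(
        all(t.upper() in words(text) for t in terms)
        for text in record.values()
    )


# Hypothetical records: personal author in 100, added author in 700.
two_people = {"100": "Wilson, John", "700": "Kennedy, Fred"}
one_person = {"100": "Kennedy, John F."}

query = ["John", "Kennedy"]
print(match_and(two_people, query))   # True  (the false hit)
print(match_same(two_people, query))  # False
print(match_same(one_person, query))  # True
```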

Subject: Searching strategy

When you search for "Comedy of errors and rodgers" the system looks for that fully enclosed phrase in ONE bib. tag in any record. Your record does NOT have all these words in that order in one bib. tag. That is why I search for it by enclosing only the title in quotes and using the boolean operator to say this phrase PLUS this word (rodgers), as in the following search string:

  "COMEDY OF ERRORS" and rodgers

Now the system looks for the phrase in any tag AND rodgers in any tag, but not necessarily both in one tag. Does this make sense?

Original question:

I don't understand why, when I enter "Comedy of errors and rodgers", the system doesn't just ignore the "of". Instead it seems that this stop word "stops" the search and no results can be found. If I enter "comedy errors and rodgers" I get right to the record. This is an important search feature, as it could lead to a lot of false assumptions that we don't hold items which we do in fact have in our collection. It is critical, then, to inform our users that when doing a keyword search they should leave off "stop" words.
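
The advice in the reply (quote only the title phrase, then join any further words with the boolean operator) can be written as a small helper. This is an illustrative sketch with a made-up function name; it only builds the search string and says nothing about how the system handles stop words.

```python
def build_query(phrase, *keywords):
    """Quote the title phrase on its own, then AND in the extra words.

    Hypothetical helper: produces strings of the form
    "PHRASE" and word1 and word2
    """
    parts = [f'"{phrase.strip().upper()}"']
    parts.extend(k.strip() for k in keywords)
    return " and ".join(parts)


print(build_query("Comedy of errors", "rodgers"))
# "COMEDY OF ERRORS" and rodgers
```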