Welcome to the @internet -- Casting Your Net on the Web Page!

Home Articles STARK REALITIES About This Site My PGP Public Key

After Hours Reality Check Magazine A Season in Methven Our Host Send Me Mail

Fishing for information on the Internet from a shell account is to info-angling via the World Wide Web as using a rod and reel is to employing a gill net. There are some tools available to users of forms-based Web browsers like Netscape Navigator 1.x and Spyglass Mosaic 2.x that character-mode tools can't begin to match in power and ease of use.

These tools can be divided into catalogs and search engines. The oldest of the catalogs is the Whole Internet Catalog (http://gnn.com/wic/newrescat.toc.html) in GNN, which, as of June 2, 1995, belongs to America Online. GNN was started in August of 1993 as an experiment in advertiser-supported services on the Web, and it offers users both a hierarchical tree of its many categories and a Boolean keyword search of them.

Another such catalog is offered by the EINet Galaxy (http://galaxy.einet.net/). Like the Whole Internet Catalog, EINet's entries are categorized by subject area, and the whole catalog is searchable via a Wide Area Information Servers (WAIS) forms-based query engine.

One of the best-known catalogs on the Net is the so-called Yanoff List (http://www.uwm.edu/Mirror/inet.services.html). Scott Yanoff, now a teaching assistant at the University of Wisconsin at Milwaukee, began compiling his List in 1991 and it has grown since then from the original half-dozen "interesting items" to its very impressive current size. Although Yanoff's List offers no search capabilities, it does offer links to a very wide variety of Internet resources and is well worth a visit.

One of my favorite searchable catalogs on the Web is Yahoo (http://www.yahoo.com/), a project begun by David Filo and Jerry Chih-Yuan Yang as graduate students at Stanford University. The name stands for Yet Another Hierarchical Organized Oracle, although there are several suggested alternatives to the word "organized". Yahoo now lives at Netscape Communications, having become so popular that it outgrew the capacity of its original home at Stanford. As its name implies, it is arranged hierarchically by subject area, so it's browsable, or it can be searched for keywords using Boolean AND, OR and NOR operators.
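To make the Boolean keyword search concrete, here is a minimal sketch in Python. It is not Yahoo's code: the catalog entries, the `matches` and `search` names, and the whole-word matching rule are all assumptions made up for illustration. It only shows what AND, OR and NOR mean when applied to a list of keywords.

```python
# Hypothetical catalog: category path -> short description (illustrative only).
CATALOG = {
    "Recreation/Fishing": "fly fishing gear and tips",
    "Recreation/Boating": "sailing and fishing boats",
    "Computers/Internet": "web search tools",
}

def matches(text, keywords, mode="AND"):
    """Return True if the text satisfies the Boolean mode for the keywords."""
    words = set(text.lower().split())
    hits = [k.lower() in words for k in keywords]
    if mode == "AND":   # every keyword must appear
        return all(hits)
    if mode == "OR":    # at least one keyword must appear
        return any(hits)
    if mode == "NOR":   # none of the keywords may appear
        return not any(hits)
    raise ValueError(f"unknown mode: {mode}")

def search(keywords, mode="AND"):
    """Return the sorted category paths whose description matches."""
    return sorted(path for path, desc in CATALOG.items()
                  if matches(desc, keywords, mode))

print(search(["fishing", "boats"], "AND"))
print(search(["fishing", "boats"], "OR"))
print(search(["fishing", "boats"], "NOR"))
```

With this toy catalog, the AND search narrows to the one entry that mentions both words, the OR search widens to either, and the NOR search returns only what mentions neither.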

Another large, hierarchical catalog is the W3 Virtual Library (http://www.w3.org/hypertext/DataSources/bySubject/Overview.html), which offers several ways to view its listing of distributed links to specific content area home servers. Each subject area's links are independently maintained by individual content area administrators; thus, the "Beer & Brewing" topic's sublinks actually live at http://www.mindspring.com/~jlock/wwwbeer.html.

This year, America Online also purchased WebCrawler (http://webcrawler.com/), one of the best-known search engines on the Web. WebCrawler began in January 1994 as a project by Brian Pinkerton, a Ph.D. student at the University of Washington, and was one of the first of what Pinkerton characterizes as "Web robots". WebCrawler conducts searches of Webspace, building broad indexes of pages it is told about, then following hyperlinks embedded in those pages to discover new resources as it proceeds. It also has a client mode, in which it searches hitherto-unexplored links in documents whose indexed contents meet search criteria entered by a user with a forms-based Web browser.
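The "start from known pages, then follow their links" idea can be sketched in a few lines of Python. This is not WebCrawler's implementation: the breadth-first visiting order, the `crawl` function, and the toy in-memory "Web" below are assumptions chosen to keep the example self-contained, with no network access.

```python
from collections import deque

def crawl(seeds, fetch_links, limit=100):
    """Visit pages breadth-first, starting from the pages we were told about
    and following embedded links to pages we have not seen yet."""
    seen = set(seeds)        # every URL discovered so far
    queue = deque(seeds)     # URLs waiting to be visited
    visited = []             # URLs in the order they were visited
    while queue and len(visited) < limit:
        url = queue.popleft()
        visited.append(url)  # a real robot would index the page here
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited

# Hypothetical toy "Web": each page maps to the links embedded in it.
TOY_WEB = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html", "d.html"],
    "c.html": ["a.html"],
    "d.html": [],
}

def toy_links(url):
    """Stand-in for fetching a page and extracting its hyperlinks."""
    return TOY_WEB.get(url, [])

print(crawl(["a.html"], toy_links))
```

The `seen` set is what keeps the robot from looping forever when pages link back to each other, as `c.html` does to `a.html` here.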

Like America Online, Microsoft has seen the wisdom of buying into existing Internet search technology. Microsoft chose to license Carnegie Mellon University's Lycos search engine (http://lycos.cs.cmu.edu/), a fully indexed, searchable database of over 5 million Web documents. Lycos' automated exploration routine combs the Web to locate new or changed documents and then builds an abstract of each new entry. Each abstract consists of the page's title, headings and subheadings, its 100 most significant words, the first 20 lines of the document, and its size in both bytes and number of words. Lycos (named after the wolf spider) offers users a quasi-Boolean search engine and relevancy-ranked search results. Carnegie Mellon has also licensed Lycos' technology to The Library Corporation's NlightN online service (http://www.nlightn.com).
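An abstract with the fields listed above might be assembled roughly as follows. This Python sketch is a guess at the shape of the record, not Lycos' actual method: in particular, "most significant" is approximated here by raw word frequency after dropping a tiny illustrative stopword list, and the `build_abstract` name and its parameters are hypothetical.

```python
import re
from collections import Counter

# Illustrative stopword list only; a real system would use a better measure.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}

def build_abstract(title, headings, body, top_n=100, first_lines=20):
    """Summarize a document as title, headings, top words, opening lines, and size."""
    words = re.findall(r"[a-z0-9']+", body.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return {
        "title": title,
        "headings": list(headings),
        # Assumption: frequency stands in for "significance".
        "top_words": [w for w, _ in counts.most_common(top_n)],
        "first_lines": body.splitlines()[:first_lines],
        "size_bytes": len(body.encode("utf-8")),
        "size_words": len(words),
    }

sample = build_abstract(
    "Wolf Spiders",
    ["Habits"],
    "The wolf spider hunts.\nThe spider waits.",
    top_n=2,
)
print(sample["top_words"], sample["size_words"], sample["size_bytes"])
```

Storing a compact record like this, rather than the whole page, is what lets a search run against millions of documents without refetching any of them.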

Finally, I have to mention the Configurable Unified Search Interface, or CUSI (http://pubweb.nexor.co.uk/public/cusi/cusi.html). CUSI is the most amazing collection of forms-based search engines on a single Web page that I've ever seen. All of the tools I've described in both this column and in June's column can be accessed from this single source. There are local mirrors of CUSI all over the globe. Use this one and they'll call you Ishmael!

(Copyright© 1995 by Thom Stark--all rights reserved)