11. IRCS Improves Sorting of Search Engine Results
This exhibit was researched and written by Nicholas G. Heavens, University Archives Summer Research Fellow, 2000-2002, an undergraduate at the University of Chicago. Much of this exhibit is based on his research into the history of computing at Penn in the summer of 2000.
[Image: logo of the Institute for Research in Cognitive Science]
Those who frequent Internet search engines are often frustrated by the results their search terms return. Some know techniques for limiting extraneous and unrelated entries, but even those techniques cannot fully deal with entries in which the important reference is confined to a single paragraph. The United States Government shared this concern about inefficiency and ran a competition in 1995 to solve one aspect of the problem: a program that would find all paragraphs in a text that referred to a particular search term, whether by the term itself or by other words, such as pronouns, that stood for it. Penn graduate students working at the Institute for Research in Cognitive Science (IRCS) created a summarizer tool that returned the paragraph or paragraphs containing the search term. Their work did not win the competition, but with one modification to their code they produced better results than the winning entry.
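As a rough illustration of the task the competition set (not the IRCS team's actual method, which this exhibit does not describe), the short Python sketch below keeps every paragraph that names the search term, plus any paragraph that contains a pronoun and directly follows a kept paragraph. The pronoun list, the function names, and the "follows a kept paragraph" rule are all assumptions made for this example; real resolution of pronouns to the things they refer to is considerably harder.

```python
import re

# Hypothetical, deliberately small pronoun list chosen for this sketch.
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them",
            "his", "hers", "its", "their"}


def words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def contains_phrase(tokens, phrase):
    """Return True if the token list contains the phrase as a contiguous run."""
    n = len(phrase)
    return n > 0 and any(
        tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1)
    )


def select_paragraphs(text, term):
    """Return paragraphs that name the term or appear to continue referring to it.

    Assumed heuristic: a paragraph with a pronoun is kept only when the
    paragraph immediately before it was kept. This is a crude stand-in
    for genuine pronoun resolution.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    term_words = words(term)
    selected = []
    previous_kept = False
    for paragraph in paragraphs:
        tokens = words(paragraph)
        names_term = contains_phrase(tokens, term_words)
        has_pronoun = any(token in PRONOUNS for token in tokens)
        keep = names_term or (previous_kept and has_pronoun)
        if keep:
            selected.append(paragraph)
        previous_kept = keep
    return selected


sample = (
    "Breck Baldwin founded a company.\n\n"
    "He studied language.\n\n"
    "The weather was mild.\n\n"
    "It rained later."
)
result = select_paragraphs(sample, "Baldwin")
```

Here the first paragraph is kept because it names the term and the second because its pronoun follows a kept paragraph. The fourth also contains a pronoun but is dropped, since the paragraph before it was not kept.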
After the competition, the team refined the program, adapted it to search Internet sites, and gave an apparently impressive demonstration before the U.S. Congress. Unfortunately, their system took about twelve minutes to run. With a grant from the U.S. Department of Defense, the IRCS researchers continued their work, aiming to reduce the tool's operating time.
Recently, this research has been used to build new data mining systems that are now sold by Alias I Inc. of Philadelphia, a company founded by IRCS researcher Breck Baldwin. The company continues the University of Pennsylvania’s contributions to Philadelphia’s economy through computer research.