Comparing searching libraries and searching the Internet
Carl Gardner, Billy Guy, Bobby Nunn, and Leon Hughes formed the group called the “Coasters.” In 1957, when even I was pretty young, they recorded the famous song Searchin’, which was written for them. The group was inducted into the Rock and Roll Hall of Fame in 1987, at least partially based on this number-one hit.
Today I want to talk about searching—for technical material—and how the rise of the Internet has changed the search for information.
The theme of their song is their search for love:
Well, I’m searching
I’m gonna find her
The refrain is:
Gonna find her, yeah ah, gonna find her
Go here to see them be introduced by Steve Allen and then perform their song.
Since GLL has not really been purchased by Google, we are able to say whatever we want about that company. Google is a great search machine, but there are things that I liked back before the Internet that I think are missing today.
Before Internet Searchin’
One of the skills I had when I was a young researcher was the ability to find things in libraries. When I was an assistant professor at Yale the math library was conveniently right next door to the CS building. I often went over to the library and browsed for hours, then would check out a ton of books.
One of the coolest things about being a faculty member at Yale was the fine rule: there were no fines possible for faculty. We could not be charged anything at all for late return of books—this actually was in the Yale faculty handbook. I really liked this, since as an undergraduate and as a graduate student I had often run up serious fines for late books. Late fines were later used by the video rental company Blockbuster, then replaced by a 30-day grace period ended by a full-cost replacement charge, which caused other problems as their fortunes declined. (They have recently been bought out by the Dish Network.)
The Yale librarian did not like me taking books out at all, and especially did not like me taking out large numbers of them. I think there were two issues. First, I was not a real faculty member, that is, I was not on the faculty of the math department. Second, the library was in a perfect state when all the books were on the shelf in order, not when many books were sitting in my office or sitting somewhere else. Sometimes after I took out a bunch of books the librarian would call (there was no email then) and ask me to return book X. I would always comply immediately, since perhaps someone really needed to look something up. But I usually went back in a day or two and took the book out again. It was a simple game we played; I think after a few years we were able to co-exist in peace, even though we didn’t become friends.
Enough about librarians, let’s get back to searching for information. There were a few tricks that I used back then to help me search the library:
Location: I really could remember information based on its physical location in the Yale math library. If someone asked for a reference on something, I often could find the book by recalling that it was on a certain shelf. Of course now and then the library re-structured the whole place and I was disoriented for a while, but I would get back in synch pretty quickly.
Shape, Color, Size: I could also recall a book by its physical characteristics. Paul Erdős and Joel Spencer’s little blue book was the first book on the probabilistic method. I could find it or help you find it by recalling its size and color. It was a thin book too.
Linear Search: No matter how big the Yale library was, I sometimes spent hours just reading through all the books in an area of the library. When I was a graduate student at CMU I used to pick a journal, the journal on X, pull out all the volumes, and scan through all the articles. One by one: from the first to the latest. Of course I could not read them all, nor could I even do more than just scan them. But the ability to scan all of them in this way allowed me to remember that there was some article on some topic. This often was instrumental in finding information that I really needed later.
One concrete example of this type of brute force search is a long story that I will go over another time. It is about a result on the cover time of a graph, joint with Romas Aleliunas, Dick Karp, László Lovász, and Charles Rackoff. One day Dick Karp and I had an outline of a proof that the cover time was polynomial, but we needed a simple lemma. Karp had class, so while he was teaching, I went to the Berkeley library, and after about an hour of brute force search found the lemma we needed in some unrelated article.
Browse: I browsed through the library all the time. I would look up something that I thought I needed, but once there among the shelves I would look at all the books nearby. This branching search often uncovered great nuggets of information.
Tomography: If I needed to learn something about a topic, especially if I thought this topic could help me prove some theorem, then I did a kind of “tomography.” Suppose the topic were finite group theory. I would go to the library and take out 10-20 books on group theory; sometimes it would be all the books they had on the topic. I would not read them all, which would have been too hard. Instead I would look at the exact topic I needed and see many different views of it. These multiple views would give me a much better insight into the question I needed answered. Different authors had their own views, even of the same exact theorem. Some would give examples, some had different motivation, some different proofs, some different applications, but all together they would give a fuller picture than any single one.
All of these techniques seem to be harder to do on the Internet. The search engines, like Google, are of course terrific at finding things, but the techniques that I used to employ are much harder to do today.
Location: I think this means nothing anymore.
Shape, Color, Size: The Internet could allow you to look for a book by specifying: It’s yellow, thin, and on exponential sums. But I do not believe that it does.
Linear Search: This is very hard today because there is so much stuff. I believe that the volume of material, especially since much of it is repetitive, makes finding the golden nugget more difficult.
Browse: Hard to do today, because there is less locality. You can branch and search, but not quite the same as before.
Tomography: I think this is really difficult today for two reasons. We cannot get books—the Internet is best for articles. Moreover, getting all the books or articles on some subject is nearly impossible. The number would probably be close to infinite; in any event it would be overwhelming.
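The “Shape, Color, Size” wish is at least easy to state in code, if the physical attributes were ever recorded. Here is a minimal sketch, assuming a catalog that stores color and page count alongside the topic. The Book record, the find_books function, and every entry in the catalog are invented for illustration; I do not know of a real search engine that exposes this.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Book:
    title: str
    color: str
    pages: int
    topic: str


# Hypothetical catalog: every title, color, and page count is invented.
CATALOG = [
    Book("Placeholder Notes on Exponential Sums", "yellow", 120, "exponential sums"),
    Book("Placeholder Treatise on Exponential Sums", "green", 640, "exponential sums"),
    Book("Placeholder Primer on Group Theory", "yellow", 95, "finite group theory"),
]


def find_books(
    catalog: List[Book],
    color: Optional[str] = None,
    max_pages: Optional[int] = None,
    topic: Optional[str] = None,
) -> List[Book]:
    """Return the books matching every criterion given; None means "don't care"."""
    return [
        b
        for b in catalog
        if (color is None or b.color == color)
        and (max_pages is None or b.pages <= max_pages)
        and (topic is None or topic.lower() in b.topic.lower())
    ]


# "It's yellow, thin, and on exponential sums."
hits = find_books(CATALOG, color="yellow", max_pages=150, topic="exponential sums")
print([b.title for b in hits])
```

The filter itself is trivial; the hard part is that nobody records what a book looks like, so the memory cue that worked so well on the shelf has nothing to match against online.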
E-Search Not Research?
Ken notes the following: The main advantages of online search are that with minimal skill you can frame criteria to specify what you are looking for, and the results often give you a high-valence tree of links to follow. If you spot a desired association among the first 10 or 20 hits, you can often follow the link to find more associations and make a better search. Thus in place of a linear search you are following paths in a tree.
The issue comes up if you find something Y that resembles, but is not as good as, the X you need. There are basically three choices:
- Take Y and (facing down at a desk rather than forward to a terminal) try to work out how to make X out of it.
- Keep on searchin’ trying to find X.
- Be happy with Y, which you found so easily, and change what you’re doing, instead.
With books, the Y you find is usually at the same level of expertise as what you need, and this plus being at a desk or in a chair promotes the first kind of effort, which is the most valuable for research. Whereas online, there is more inducement not to think, or not to try harder. We have noticed this effect in our own research. Keep on searchin’ is good for not overlooking prior citations, but masks valuable thinking time with the feeling that you’re still being productive click-click-clicking.
Worse IMHO is the third case, whereby the Net can con you into “going with the flow” and thinking about something else rather than the problem at hand. This is a general issue faced first at the middle-school and high-school level. Is “e-search” (as opposed to research in libraries) being used to produce papers that are broader but shallower? Does a sense of entitlement brought on by having answers come easy keep us from aiming for more? Amid a general discussion of the kind, “Does Google Make Us Stupid?”, we at the above-PhD end can be a valuable test of the answer.
Can we make today’s search for technical material better? Does e-search depress research? Or are the things that I am discussing just old and silly? I guess we all will keep on searchin’.