The following quote is one portion of his wider argument and so a little less ambitious; here he merely makes a good point about doing research wisely amid an enormous amount of digitized metadata.
After going back and reading his entire essay, I must admit that I feel a bit silly posting only a portion of it here. Abbott's whole point is that the students and young scholars with whom he has worked lack many basic skills for reading and research, and the thrust of his thesis runs against the sort of discrete sound-bite quoting and linking that I'm doing here. I strongly recommend that you follow the above link and read his complete paper, at which point the following excerpt will become a rather minor detail in comparison. It was, however, the anecdote that interested me in the full paper, so I suppose there's not too much harm in re-posting it here:
[A student] decided to write on the ways the US government gives out money, in particular about block grants. And he had found interesting material in Lexis/Nexis. He told us the idea of block grants went back far further than we thought. It wasn't just a Reagan invention, it went back to the 1930s. The only problem was that there was plenty of material in the 1980s, the 1970s, the 1960s, the 1940s and the 1930s, but nothing in the 1950s. Try as he would, Brian couldn't find anything.
[...] First he found a 1950s congressional speech sure to contain the phrase "block grant." Then he read the actual document and - sure enough - found the phrase. So it OUGHT to be found by the keyword query "block grant." And he ascertained that the keyword "grant" would locate this document, but the keyword "block" would not. So he tried various misspellings: black, blank, plonk, prank, blink, and so on. He hit paydirt with "blook." It turned out "blook grant" returned dozens of documents from the 1950s, all of which had in them the phrase "block grant." "Blook grant" of course, was an optical character recognition error. And it turned out, Brian discovered triumphantly, that the Federal government had changed its font around 1950. So the OCR algorithms, which are AI based, trained themselves on the old font, and then couldn't read the new font when it showed up. [...]
But it was news to the students that the electronic tools are not perfection. They had no idea that however comprehensive they may be, they are generally less accurate than the print sources that preceded them. Even less did they suspect that those inaccuracies could be systematic. Somewhere out there, they now realized, some idiot is writing a paper on how the concept of block grants disappeared from American political discourse in the 1950s. It is the fear of becoming that idiot that baptized them as serious library researchers.
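For what it's worth, Brian's trial-and-error hunt for misspellings can be partly mechanized. What follows is only a minimal sketch of the idea, not anything from Abbott's paper: it takes a search phrase and generates the variants that differ by one plausibly misread character. The confusion table and the function names `ocr_variants` and `expand_query` are my own hypothetical inventions for illustration; a real table would have to be built from the errors actually observed in a given archive.

```python
# Hypothetical table of look-alike characters an OCR pass might confuse.
# These pairs are illustrative guesses, not taken from any real OCR engine.
OCR_CONFUSIONS = {
    "c": ["o", "e"],
    "o": ["c", "0"],
    "l": ["1", "i"],
    "k": ["h"],
    "b": ["h"],
}


def ocr_variants(word, confusions=OCR_CONFUSIONS):
    """Return the spellings reachable by swapping exactly one character."""
    variants = set()
    for i, ch in enumerate(word):
        for sub in confusions.get(ch, []):
            variants.add(word[:i] + sub + word[i + 1:])
    variants.discard(word)  # never report the original spelling as a variant
    return sorted(variants)


def expand_query(phrase, confusions=OCR_CONFUSIONS):
    """Return the original phrase followed by its one-error variants."""
    words = phrase.split()
    queries = [phrase]
    for i, word in enumerate(words):
        for variant in ocr_variants(word, confusions):
            queries.append(" ".join(words[:i] + [variant] + words[i + 1:]))
    return queries


if __name__ == "__main__":
    # Each printed line is a candidate query to try against the archive.
    for query in expand_query("block grant"):
        print(query)
```

With this particular table, "blook grant" is among the generated queries. The sketch only produces candidates, of course; someone still has to run them, read the documents that come back, and work out why a whole decade went missing.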
Again (and now with the benefit of a bold-faced warning about becoming that idiot), please read the whole paper.