April 13, 2004

Language education for the 20th 21st century

Since I seem to have more readers interested in language than I thought, I think I might just post more material on it. I should note, for those who don't read the comments, that my discovery of Fernando Pereira was also blogged by Mark Liberman at Language Log back in October. We're not exactly at the cutting edge of the literature here at Pedantry.

But, that brings me to today's topic, inspired by this post over at Language Log:

Grammar education: making up for a lost century

I see that Language Logger Geoff Pullum is giving a talk at Northwestern University on Friday 4/23/2004, entitled "What Happened to English Grammar?" I heard Geoff give a version of this talk last fall at Penn, and it's terrific. He's posted a few fragments of this material here in the past, and I hope we'll get a more complete version in future posts. Meanwhile, I've copied his abstract below -- and if you're in the Chicago area on April 23, go to Swift Hall, Room 107, at 3:30 p.m. and hear him.

Try to imagine biological education being in a state where students are taught that whales are fish because that is judged easier for them to grasp; where teachers disapprove of tomatoes and teach that they are poisonous (and evidence about their nutritional value is dismissed as irrelevant); where educated people accuse biologists of "lowering standards" if they don't go along with popular beliefs. This is a rough analog of where English grammar finds itself today. The state of relations between the subject as taught by the public and the subject as understood by specialists is nothing short of disastrous. The fact is that almost everything most educated Americans believe about English grammar is wrong. In part this is because of misconceptions concerning the facts. In part it is because hopeless descriptive classifications and antiquated theoretical assumptions doom all discussion to failure. Amazingly, almost nothing has changed in over a hundred years. The 20th century came and went without affecting the presentation of grammar in popular books or the teaching (what little there is of it) that goes on in schools. Today's grammar books differ in content only trivially from early 19th-century books. In this lecture I name and shame some of those on the long dishonor roll of myth-creators and fear-mongers (John Dryden, Henry Fowler, Ambrose Bierce, William Strunk, E. B. White, George Orwell, Louis Menand, Stanley Fish), and I sketch a view of what could and should be taught in a course on the grammar of Standard English in the 21st century.

I wish I was in Chicago this week, because I have had some similar questions. You see, I started out in linguistics in one of the few debatably prescriptivist areas left in academic linguistics: Terminology and Translation. One of the things that bothered me a great deal in those days was the disconnect between applied linguistics and theoretical linguistics. Theoretical linguistics has very few productive applications to its name that derive from work done after 1950. The basic elements of phonetics and phonology that serve speech pathologists date back to de Saussure. Modern theories of linguistics have not done much to inform language or literacy educators, and most of the informing they have done has proven worthless. The relationship between computer natural language processing and linguistics can best be summarised by this apocryphal quote attributed to Fred Jelinek at the IBM Speech Technology Group: "Every time I fire a linguist the performance of the recognizer goes up." As for my original field - terminology, lexicography, translation - practitioners are better off ignorant of modern linguistics.

I am known to be very harsh on linguistics as a whole, probably excessively so. I started out in high energy physics before going into language. The concerns of theoretical physics are not the same as the concerns of engineers, but physicists generally have no trouble showing how the one body of art is compatible with, and even informs, the other. Nothing of the sort exists in linguistics.

I do know people who have tried to address this problem, and the blame does not fall wholly on linguistics, although it mostly does. Pullum complains of a "lost century" in language education. There is a certain amount of entrenched conservatism in perilinguistic fields like language education and translation, but where were us linguists during that century?

Let me offer you an example. My Chinese prof felt the need to cite some linguistics in her Master's thesis, since she was talking about some aspects of Chinese translation and was expected to show that she knew something about language. As a result, there is in her thesis a purely perfunctory phrase-structure analysis of some phenomena and a citation of some early Chomsky. It is, as she readily admits, entirely superfluous. I pointed out to her how she could instead have gone to someone like Anna Wierzbicka or Igor Mel'cuk for a wholly different kind of linguistics much more oriented towards problems in translation. She had never heard of any sort of linguistics outside of phrase-structure grammar.

There is work in linguistics that is useful to people in perilinguistic fields, and there are a lot of people in perilinguistic fields. The global translation industry alone is huge and growing. Google - I don't know why people never come out and say this - Google is a language technology firm with a half-billion dollars a year in revenue built atop a language technology that was devised completely without the input of linguists. The social significance of natural language information retrieval technology is incredible, and who is the founding demigod of information retrieval? The late Gerard Salton, who had essentially no linguistics background. Language engineering is flourishing, and linguists are nowhere to be found.

Where were we when all this was going on? Why aren't linguistics programmes packed to the brim with future 'Net billionaires?

I've never been one to wait patiently for other people. I saw a gap in the field and I set out to fill it. My thesis started out as an effort to assemble a functional theory of linguistics designed for language engineers of various kinds. I didn't intend for it to be the alpha and omega of language theory, just a set of serviceable doctrines. I didn't think I was going to need any very original parts - most of what I thought belonged in such a theory already exists in other people's work. I wanted to put together a coherent body of ideas accessible to translators, terminologists, lexicographers, amenageurs linguistiques, computer scientists, cultural anthropologists and maybe even politicians and economists. I even had a way of latching it onto corpus linguistics, which was the hot new thing just then. I had hopes that I could even contribute to education theory, although that was not my focus. I wanted to show how neatly this syncretic framework could resolve a variety of everyday problems all these different people face. There is a big market for something like this, especially if you can figure out how to sell it to people who have already seen linguistics and want nothing to do with it.

I ran into certain problems, primarily running out of money, dropping out of school and moving to California. Later, I got involved in neural networks and evolutionary programming at Stanford and decided to go back to school to study them, with the intent to go on to a PhD. Then, one day - in Roger Vergauwen's class for those of you who know Leuven - I came up with an extended version of my idea, something really novel that I don't think anyone else has ever considered. And now I'm off in a somewhat different direction, one that involves some heavy math that is currently beyond my ken.

But, I'm still looking for someone to write a book titled: Linguistics for Language Engineers: Where it's at and what you need to know about it. It too is on my list of things to get around to one of these days, and it should start with something like a secondary school curriculum in language - one that doesn't engage in 19th-century grammar teacher myths.

Maybe I can adapt the first chapter from Pullum. I hope a transcript of his talk gets posted on the 'Net.

Posted 2004/04/13 14:27 (Tue)

That's quite a list of people to be shaming.

I'd love to read Linguistics for Language Engineers. I work for a rival of Google, and I sometimes marvel at the way brute force is used (relatively successfully) in the search industry to make up for our lack of linguistic sophistication. The funny thing is, programmers who create computer programming languages probably know more about linguistics (well, at least various classes of grammars) than do the programmers responsible for programs which process billions of natural-language documents every day (though there's overlap between the two groups).

Posted by: Jeremy Leader at April 16, 2004 0:13

Jeremy, one of the odd things about this business is that inspiration can go both ways. One of the two really fundamental, core techniques that I'm using, and the one that directly inspired my someday PhD project, I got directly from Gerard Salton. One of my little secrets that I'm going to reveal here is that the commercial project I'm working on uses an augmented version of Salton's cosine metric, which dates back to the IBM SMART project back in 1968. It's applied in a novel way and the augmentations are not trivial, but essentially I'm using information retrieval techniques to do linguistics.

Posted by: Scott Martens at April 16, 2004 15:27
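
For readers who haven't met the cosine metric mentioned in the comment above, here is a minimal sketch of the plain, unaugmented version as it is usually described in information retrieval: each text becomes a vector of term counts, and similarity is the cosine of the angle between two such vectors. The function names and the whitespace tokenizer are assumptions made for illustration; this is not the augmented variant the comment refers to, whose details aren't given.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a sparse term-count vector (hypothetical helper).

    Tokenization is deliberately naive: lowercase, split on whitespace.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors.

    Returns 0.0 when either vector is empty, since the angle is undefined.
    """
    # Dot product over the terms the two vectors share.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = term_vector("grammar of standard english")
    doc_1 = term_vector("a course on the grammar of standard english")
    doc_2 = term_vector("high energy physics for engineers")
    # doc_1 shares every query term; doc_2 shares none.
    print(cosine_similarity(query, doc_1))
    print(cosine_similarity(query, doc_2))
```

Identical texts score close to 1 and texts with no words in common score 0, which is what makes the measure usable for ranking documents against a query without any grammatical analysis at all.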