United States navy ditches ALL CAPS message format

Chris Dean's picture

United States navy ditches ALL CAPS message format. (From CNN, 2013, 06, 13)

Score one for Bouma.

oldnick's picture

So…internet “etiquette” trumps over a century of tradition. Marshall McLuhan was right…

Theunis de Jong's picture

"we have torpedos ready stop surrender your vessel stop so please stop stop lol stop"

But Morse code does not support lowercase. How are they going to do that thing with the flashing lights, then? Use Unicode?

"All your 🍕 are belong to us stop"

Chris Dean's picture

“In it, the Navy said it is ditching its in-house Defense Message System in favor of e-mail. One with a very apt acronym: NICE (Navy Interface for Command Email).”

Beautiful irony. Capital letters for an acronym to describe a system which no longer uses ALL CAPS settings.

russellm's picture

... a century of tradition.

First it was the rum rations. Then it was Morse Code... Now this.

JamesM's picture

My uncle died in WWII when his ship was torpedoed and sank, so anything to improve naval communications is fine with me. But with that said, given that orders are usually read very carefully, I'm not sure how big a difference this will make, but hopefully it'll help.

Thomas Phinney's picture

Chris, do you actually believe in this bouma nonsense? There is an awful lot of research in this area. The reasons that lowercase is more legible than uppercase have nothing to do with word shape and everything to do with making each letter more distinctive from others.

That's not to say there is no interplay between letters, or that we don't need to recognize words. But we don't do it by the shape of the word as a whole.

Chris Dean's picture

Casual joke.

hrant's picture

A bouma isn't always a whole word; sometimes it's a single letter (and maybe even less often than I suspect).

The "nonsense" is thinking that the brain would simply ignore the information-rich clustering of letters (and often their silhouettes).


quadibloc's picture

I was amused that the article noted that back in the 1850s, teletypewriters with upper-case only character sets and three-bank keyboards were in use by the Navy.

In fact, 5-level code was still very much alive for TELEX messages even in the 1960s (and, in fact, I've heard that it isn't quite dead even yet).

oldnick's picture


Ditto for Weather Service and AP wires in the 1980s, sent to television stations...

russellm's picture

Our work purchasing system works in all caps. I write my requisitions like this: "I would like to purchase 100 stop signs please. Thank you very much and have a good day." and they come out as: "HEY YOU! I NEED 100 STOP SIGNS! NOW!!!"

enne_son's picture

[Thomas] … do you actually believe in this bouma nonsense?

Thomas, for the record, the term Bouma Shape was introduced by Insup Taylor and M. Martin Taylor in their 1983 book The Psychology of Reading. From the information below, you will see that the construct includes a whole lot more than “the shape of the word as a whole.” The term is a result of Taylor and Taylor’s efforts to factor “interior features” into the “shape definition” of words.

Taylor & Taylor list “the seven groups of mutually confusable lowercase letters found by Bouma (1971)”. They mean the S1, S2, S3, S4, A1, A2 & D grouping of lowercase letters Bouma introduced in “Visual Recognition of Isolated Lower-case Letters”, which they number from 1 through 7, as follows [the descriptions are Bouma’s; the groupings are a result of confusion frequencies culled from recognition tests]:
1 = a s z x : [HB’s S1] inner parts and rectangular envelope
2 = e o c : [HB’s S2] almost round envelope
3 = r v w : [HB’s S4] oblique outer parts
4 = n m u : [HB’s S3] rectangular envelope with well-expressed vertical outer parts
5 = d h k b : [HB’s A1] ascending extensions protruding from a well-expressed body
6 = t i l f : [HB’s A2] slenderness
7 = g j p q y : [HB’s D] no description
1, 2, 3, & 4 are categorized by Bouma and the Taylors as short (S); 5 & 6 as tall (A); 7 as projecting (D)

Taylor and Taylor: “The “Bouma shape” of a word can be defined by listing the group numbers of its letters: “at” has the shape 16 (short-filled, tall-thin), and “dog” is 527 (tall-fat, short-round, projecting).”

Taylor and Taylor found that by their Bouma Shapes — defined in this way — something like 87% of the words in the English language corpus they used were unique. If I'm not mistaken, Taylor and Taylor used this to elucidate the contribution made to reading by parafoveal pre-processing and give an account of skipping.
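For anyone who wants to play with the definition, here is a minimal sketch of the digit-string coding quoted above. The group table is copied from the list in this post; the function names and the tiny word list are made up for illustration, and this is not Taylor and Taylor's own procedure or corpus.

```python
from collections import Counter

# The seven confusability groups as numbered by Taylor & Taylor (see list above).
BOUMA_GROUPS = {
    1: "aszx",
    2: "eoc",
    3: "rvw",
    4: "nmu",
    5: "dhkb",
    6: "tilf",
    7: "gjpqy",
}
LETTER_TO_GROUP = {
    letter: str(group)
    for group, letters in BOUMA_GROUPS.items()
    for letter in letters
}


def bouma_shape(word):
    """Return the Bouma shape of a word as a string of group digits.

    Assumes the word contains only the 26 basic Latin letters;
    any other character raises KeyError.
    """
    return "".join(LETTER_TO_GROUP[ch] for ch in word.lower())


def unique_fraction(words):
    """Fraction of words whose Bouma shape is shared by no other word.

    Assumes a non-empty list of distinct words.
    """
    counts = Counter(bouma_shape(w) for w in words)
    unique = sum(1 for w in words if counts[bouma_shape(w)] == 1)
    return unique / len(words)


print(bouma_shape("at"))   # 16
print(bouma_shape("dog"))  # 527
# Toy list only: "cat" and "eat" both code as 216, so half the list is unique.
print(unique_fraction(["at", "dog", "cat", "eat"]))  # 0.5
```

Run over a real word list, `unique_fraction` would give a figure comparable in kind to the one quoted, though the result depends entirely on the corpus used.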

hrant's picture

The term "Bouma shape" has also been used by P. Saenger. I just simplified it (being a big believer in two-syllable labels).

BTW, I would contend that:
- It's not about bouma uniqueness, but the degrees of bouma confusability.
- The silhouettes are much more significant than the interiors, so that grouping is over-simple; the 1/2/3/4 and 5/6 groupings need to be joined and put on a lower level.
- It's all processing; no pre-processing.
- I think "skipping" might be misleading, since no content is skipped (in "real" reading).


enne_son's picture

The term Bouma shape as used by Paul Saenger in Space Between Words: The Origins of Silent Reading is clearly taken from Taylor and Taylor. But here is how Saenger defines the term in his glossary: “The shape of a word when written in upper- or lower case letters and delimited by space, as defined by the Dutch psychologist Herman Bouma.”

Saenger’s argument in the book is: “While the paleographer’s principal focus has been on the classification of individual letter forms, the student of the history of reading in the medieval West is primarily concerned with the evolution of word shape, and letter forms are important only to the degree that they play a role in determining that shape. Thus the adoption of the minuscule, that is, lower case letters, as a book script is significant for the historian of reading insofar as it contributed, in conjunction with word separation, to giving each word a distinct image.”

I originally glommed on to the “each word a distinct image” idea because of the content the Taylors gave to it, and because of Gerrit Noordzij’s account of the ‘consolidation’ of the word [Chapter 6 of The stroke: theory of writing]. Now I believe that reading starts with a low-resolution indexing of word items in parafoveal preview and proceeds to a feature-analytically based processing routine that occurs at and upon fixation. The low resolution indexing provides informative “ensemble statistics” that go beyond the envelope shape of the word or the simple pattern of ascending, descending and neutral characters, and provides a reference frame for the feature-analytic processing. The feature-analytic processing yields information about words at the level of their role-units (prototypical structures, like stems and counters): the identity of the role units, their relative positions, local combinations (in letters), and their across-the-word distributions.

So where the Taylors’ notions might apply most directly, as far as I can see, is in relation to the kind of information the “ensemble statistics” contain. There is currently a growing body of research in this area.

But I just wanted to provide a little push-back on Thomas’s “bouma nonsense” comment.

William Berkson's picture

Thomas, these matters are by no means settled. The "word superiority effect"—letters within words are more identifiable in a short time than letters within non-words—prima facie supports the idea of whole word pattern. Word pattern does not necessarily mean the oversimplified version of word envelope, as Peter says. Peter's idea is that for skilled readers and familiar words, the visual pattern of the whole word, like a Chinese Character, is read, rather than by going the route of first identifying letters and then doing a look-up—which we can also do. (And yes, I know there are root characters in Chinese.)

Supposedly the word superiority effect is made compatible with the letter identification and look-up view by 'interactive activation', but that this operates routinely for skilled readers has not been actually tested, only simulated in computer models. It is a model that can simulate the process, but whether it actually works that way in the brain is not known. Peter and I are quite doubtful that it operates routinely as claimed, because we think it would take too much time in the brain for the rapidity of skilled, normal reading. A more one-way link between visual pattern and sound and semantics (rather than cycling up and down) would be quicker.

My understanding is that a difference between Peter's and Hrant's views is that Hrant thinks Boumas operate mainly in the parafovea, whereas Peter thinks this is an important but limited influence, and the main impact is on letters seen in the fovea.

William Berkson's picture

By the way, Thomas, I am not seeing that upper case letters are less differentiated from one another in terms of shape. Are H and N less different than h and n? Are B and D less different than b and d? What study are you alluding to? I thought the usual explanation for slower reading of all caps, for the letter ID and look-up view, is that lower case is more familiar—which I also don't buy as an explanation, but that is a different story.

quadibloc's picture

Although h and n in one way, and b and d in another, are good examples, a more typical case would be to compare a and t to A and T. The presence of ascenders and descenders does make both words and letters easier to differentiate; all-caps text is clearly significantly less readable than lower-case text. Many studies have confirmed that, although it should be obvious for almost any typeface.

William Berkson's picture

I agree that descenders and ascenders help. My comment was to rebut the explanation that Thomas reports: that they differentiate letters more. As far as topology goes, the caps don't seem any less differentiated. The eye may pick out ascenders more easily, but then you are getting to the added effects of letters being visually within words, which the letter ID and look-up theory denies.

Thomas Phinney's picture

Letter differentiation and even the parallel processing of letters is taking place within the context of overall size, as denoted by the other neighboring letters. I won't suggest for a moment that such does not matter. Without that there would be no meaning of x-height, ascender and descender. Topology alone isn't everything, clearly.

I had always heard tell of "bouma" as the "shape of the word" in terms of its outer silhouette. If the bouma includes the topology of individual letters, then... well, it seems kind of like a two-party political system where both parties are moving toward the middle. Instead of a clear dichotomy between two theories, it seems more like a spectrum. I will be curious to hear what folks think the practical differences are between a theory driven by bouma shape (that includes full topology of individual letters) and one driven by individual letters that allows for interactions and effects of neighboring letters. Really, "bouma shape" doesn't feel so top-down in this version....



Delete's picture

Rather than rely on theoretical arguments, there is experimental data from traffic signs that mixed case signs are better than all caps for comprehension.

John Hudson's picture

there is experimental data from traffic signs

Yes, but all this data confirms is that mixed case text is better for traffic signs. The data doesn't indicate why mixed case is better, which is what the theoretical arguments are about.

John Hudson's picture

Thomas: I will be curious to hear what folks think the practical differences are between a theory driven by bouma shape (that includes full topology of individual letters) and one driven by individual letters that allows for interactions and effects of neighboring letters.

I think part of the attraction of Peter's proposed feature role model is that it suggests ways in which we might bypass the problems of interactions and effects of neighbouring letters, specifically the problem of crowding, which demonstrably makes individual letter recognition difficult. If we don't need to recognise individual letters in order to make a first pass at word recognition, then that would explain why crowding doesn't seem to be a major impediment to rapid and accurate reading. If, instead of recognising individual letters as such we are taking in information from multiple letters in the foveal fixation (I'm with Peter on this, not Hrant) and recognising patterns that resolve to letter sequences, then the closeness and visual interaction of those letters that inhibit individual letter recognition actually become useful. This is how I understand boumas: a perceptual unit of recognition.

The latter point is important, I think, because it avoids having to commit to defining bouma as any specific graphical phenomenon -- such as the 'word shape' -- and, indeed, indicates why such definitions are unhelpful. A bouma is a thing in the perception, not a thing on the page. This also means, of course, that it is liable to individual variation, i.e. you and I might perceive different boumas when reading the same piece of typography.

enne_son's picture

[Thomas Phinney] I will be curious to hear what folks think the practical differences are between a theory driven by bouma shape (that includes full topology of individual letters) and one driven by individual letters that allows for interactions and effects of neighboring letters.

Thomas, for me it’s mainly a matter of compatibility with the express and defining attunements of professional type-involved practitioners. And it’s a matter of providing perceptual-processing touchstones for typography and type design.

If a feature-analytic processing of the S1 to S4 / A1 & A2 / D shapes (see above) in bounded maps of visual information is seen as a foundational dynamic at the “front end” of reading, factors like distinctive cue-value, clear delineation, proper salience, relative location and some kind of cohesive equilibrium at the elemental level of shape primitives (letter details) are easily seen to be matters of intrinsic and densely interacting importance. A rhythmic spacing, a consistent contrast-styling scheme and strategic construction here, mindful of these factors, produce a gestalt integrity at the level of the whole word. So this provides a natural fit with the express attunements of experienced type designers and typographers to matters of spacing or fit, consistent contrast styling, and strategic construction.

The feature analytic processing that leverages bouma-shape particulars isn’t the whole story in identifying words though. Feature-analytic processing is generally thought to lead to a kind of parallel letter recognition as a next step which then underpins the orthographic processing necessary to get to words. In current models the feature-analytic processing is often underspecified, but your sense of moving toward the middle is apt. The two processing routines are compatible: they can be seen as different phases or sub-routines of a single hierarchical process.

I’m exploring the equally compatible — and neuro-mechanically plausible — idea that in the normal reading of extended texts by skilled readers feature-analytic processing yields — as a result of “unsupervised” perceptual learning — a more elemental decomposition than the decomposition into individual letters. My more elemental decomposition is a decomposition into what I’ve been calling role units. This then underpins a higher level parsing of the word into what can be called role-unit string kernels.* By leveraging the overlap in these string kernels in familiar words the visual word-form resolution system gets to words and a word superiority effect.

I’m generally hesitant to use the thread-space of a topic with another focus to summarize my perspective, but I feel it might be necessary to show that what’s involved in addressing your follow-up Thomas is not just a moving to the middle.

In the context of this thread, and its bouma-shape by-way, I suspect that the lower case construction is a construction that more effectively leverages, for the purposes of individuating gestalt-level units, the dimensionality of the western alphabet’s cartesian feature-space, with its more diversified or informationally rich implementation of the compact base-line to x-height zone and more strategic use of the ascender and descender zones.

* string kernels in this scheme are units with the structure x+(y+z)+a (where x and y and z and a are role-units, the “+” sign inside the brackets indicates a local or contiguous combination implemented by a letter junction or a common edge, and the “+” sign outside the brackets indicates an “open” or non-contiguous combination constrained by immediate adjacency).
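To make the footnote's notation concrete, here is a minimal sketch of the x+(y+z)+a structure as a small record type. The class name, field names and the example role-unit labels are hypothetical, chosen only for illustration; the sketch represents the notation and makes no claim about how such units are detected.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StringKernel:
    """One unit of the form x+(y+z)+a, where every element is a role-unit."""

    left: str                 # x: openly (non-contiguously) adjacent on the left
    nucleus: Tuple[str, str]  # (y, z): joined locally, e.g. by a letter junction
    right: str                # a: openly (non-contiguously) adjacent on the right

    def notation(self) -> str:
        """Render the kernel in the footnote's x+(y+z)+a notation."""
        y, z = self.nucleus
        return f"{self.left}+({y}+{z})+{self.right}"


# Hypothetical role-unit labels, not taken from any published inventory.
kernel = StringKernel("stem", ("bowl", "stem"), "ascender")
print(kernel.notation())  # stem+(bowl+stem)+ascender
```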

Thomas Phinney's picture

Thanks to all who elaborated on the topic.

quadibloc's picture

Incidentally, on the subject of how old 5-level code is: just today, I came across an ad for a 3M Whisper Writer 1000, a teletypewriter that used a thermal printer, in a 1983 issue of Datamation.

Chris Dean's picture

@IsleofGough “…there is experimental data from traffic signs that mixed case signs are better than all caps for comprehension.”


russellm's picture

my eyes.

russellm's picture

If you are looking at a sign from such a distance that you can distinguish the general shape of the words but not the letters, you will be able to understand the message with less difficulty if it is set in mixed case than if it is set in all caps, especially on a traffic or regulatory sign, where there is a limited number of possible messages. This is hardly even worth discussing.


on an 18" square sign is easier to read than


from 100 meters. Try it.

John Hudson's picture

Russell, IoG referred to 'experimental data', not anecdotal evidence, so Chris is quite justified in asking for a reference.

russellm's picture

No doubt he is, but I did suggest trying it :o)

William Berkson's picture

Here are the experimental tests on Clearview Highway, which found that U & lc was more legible—readable quickly at a greater distance—within a given sized sign, than an all caps font. This was just one comparison, though.

Kevin Larson's picture

Peter, would you predict that a lowercase string kernel nucleus is easier to recognize in the presence of a string kernel than on its own: in your example, the a is easier to recognize between h and t than a lowercase a on its own? Is this also true when the letters aren’t part of a word: would it be easier to recognize the lowercase a when it is between j and t than a lowercase a on its own? Does this effect go away for uppercase letters, or is it just diminished?

dezcom's picture

I see a written word as an interaction of shapes as well as agreed upon letter forms. Some type faces do a better job of integrating both.

enne_son's picture

Kevin, good questions. I’m thinking it through — jat is a pronounceable non-word and might partially activate two string kernels, and a letter is a string-kernel too, but with a lower dimensionality, so it gets complicated. I'll respond more fully later, but it sounds like something worth testing.

You might be interested to know that one of Jonathan Grainger’s former students Thomas Hannagan adapted the notion of string kernels to build on Grainger’s “open-bigram” model of how the brain encodes orthographic information during reading. My scheme uses a somewhat different application of the string-kernel concept than Thomas Hannagan introduced by transposing it from the level of letters to the role-unit level. My application and extension of the idea can probably account better for the speed and automaticity of lexical decision, and the speed and automaticity of “visual word-form resolution” in the “immersive” reading of skilled readers (which are probably related), while Thomas’s can probably account better for transposition effects such as those that occur in that “Cambridge” scrambled letter text. These predictions might be able to be quantified, implemented in a model and lead to a test.

Thomas’s string kernel paper — [2012] titled “Protein analysis meets visual word recognition: a case for String kernels in the brain” — is listed here:

John Hudson's picture

Point of information: jat is a word, referring to a member of a Punjabi peasant caste.

Chris Dean's picture

@enne_son: Following Larson’s remarks, you may also wish to consider the difference between what happens with pseudowords, letter strings, and random letter combinations.

enne_son's picture

Chris, Kevin: in the scheme I'm proposing string kernels are “units” in a single “hidden” layer of a 3 to 4 deck hierarchical network. In its default operational mode the network doesn’t do “explicit labelling” at the letter level. So I’m not sure I can make direct predictions that presume explicit labeling of the elements in the string kernel nucleus when the string kernel is presented in isolation — that is, outside of the context of the whole word or the full orthographic sequence, and abstracted from the full hierarchical network.

My first impulse was, however, to say that, as a consequence of the scheme, the >a< probably is easier to recognize between >h< and >t< than when it is between >j< and >t<. Even with John Hudson’s caveat, because for most jat is unfamiliar. The other cases seem less straightforward. But now I'm not so sure there would be an effect.

In my scheme the detection of a string kernel by a string kernel detection neuron builds on activity at an earlier level, which involves what I've been calling role-unit level “quantization,” local combination detection, and across-the-word pair-wise distribution mapping, which occur simultaneously and in concert with each other at a single level. So another way of asking the question is: is local combination detection easier when it is in the context of a word than when it operates without flankers, even if the flankers are part of an actual word?

That’s as far as I’ve gotten so far with an answer.

Albert Jan Pool's picture

After I read the story about the US Navy ‘banning ALL CAPS’ on the CNN site, I commented that the next thing I’d see happen is the mixed writing of names on credit cards. Someone commented that this would be ‘impossible’ because of legal restrictions. Unfortunately I couldn’t comment on that anymore, because either Disqus or CNN prevents me from tracking and commenting on that discussion (is this what Snowden is talking about? :–). Being halfway able to track what I wrote on Typophile, my question for today is whether anyone here knows about such (US?) legislation?

Btw, in Germany, the mixed writing of names on bank cards does not seem to be a problem. On some of my German EC bank cards, my name is written mixed. Also, the new DIN 1450 on legibility states that text has to be written mixed. All caps is to be used for emphasis only. At some point DIN 1450 might collide with some US credit card legislation …

John Hudson's picture

Peter, the Hannagan and Grainger paper seems to associate string kernels strongly with the open bigram coding model. Is this also a factor of your proposal?

In a recent paper, Kinoshita and Norris present experimental results that question the open bigram model.


enne_son's picture

John, my default scheme doesn't have a bigram-coding “deck.” I suspect Thomas Hannagan will find a way to adapt his string kernels idea to accommodate the new data presented by Kinoshita and Norris. Whether what emerges will still be able to bear the name "open-bigram" model, I don’t know.

The assumption underlying most current computational models of orthographic processing is that “the game of visual word recognition is played in the orthographic court of constituent letter recovery.” This is a quotation from Ram Frost in a 2011 Behavioral and Brain Sciences target paper that tries to plot the path towards a “universal” model of reading. Thomas Hannagan applies the string kernel idea to orthography. I apply it to role-units.

John Hudson's picture

BTW, have you read the Norris and Kinoshita 'noisy channel' paper?

'Reading through a noisy channel: why there's nothing special about the perception of orthography.'

I've only seen the abstract, but it sounds lively and contentious:

The goal of research on how letter identity and order are perceived during reading is often characterized as one of “cracking the orthographic code.” Here, we suggest that there is no orthographic code to crack: Words are perceived and represented as sequences of letters, just as in a dictionary. Indeed, words are perceived and represented in exactly the same way as other visual objects. The phenomena that have been taken as evidence for specialized orthographic representations can be explained by assuming that perception involves recovering information that has passed through a noisy channel: the early stages of visual perception. The noisy channel introduces uncertainty into letter identity, letter order, and even whether letters are present or absent. We develop a computational model based on this simple principle and show that it can accurately simulate lexical decision data from the lexicon projects in English, French, and Dutch, along with masked priming data that have been taken as evidence for specialized orthographic representations.

William Berkson's picture

Kevin, John, I've been batting back and forth with Peter some ideas about a brain processing model that would give a more detailed causal account of how the brain would work to identify words in the manner Peter has hypothesized. This model building is with an eye to identifying a crucial experiment to test the usual 'slot processing' of letters against Peter's whole word model.

First, Peter warns me that the phrase "word specific visual pattern" is often used in the literature to refer to all aspects of the word—the type style, weight, the slant, etc. So above I should have more accurately said that Peter's theory claims that immersive reading of skilled readers involves identifying words by the relational structure of letter features across the whole word. The key features he thinks are 'role units' such as ascenders, bowls, stems.

The model I've developed does give a prediction in answer to your insightful questions, Kevin. I call the model "recursive relationship filtering," using "matrix resonance."

Before I describe it, let me give some background that will make it more clear where I am coming from. My impression—and I only know the literature second hand in discussing with Peter—is that most of the modeling is too heavily influenced by the Anglo-American tradition of associationism, beginning with Locke, Berkeley, and Hume. The idea is that the whole story of learning is variations on the theme of repeated association of events. Kant thought that instead we have an inborn framework that we use to interpret perceptual information, and without that framework we are "blind." So association plays a role, but inner frameworks are equally important.

A key development beyond Kant came with Darwin's survival of the fittest. Psychologists and philosophers took Darwin as a model for learning, and said that Kant's idea of inner frameworks was correct, but that we actively adapt and change them, using trial and error to better and more accurately capture meaningful information from the welter of perceptual data that comes into our brains. This was the idea of Külpe in Germany, and his followers in the Gestalt and Würzburg schools of psychology, and of the American pragmatist Charles Sanders Peirce. My teacher Popper was influenced by teachers from the Würzburg school, as I explained in my book with John Wettersten, Learning from Error (in the German edition Lernen aus dem Irrtum).

The relevance of this long-standing dispute to reading models is that I don't assume that skilled readers simply use the same process as those learning to read, but just do it faster. The processing patterns of learning and skilled readers are not isomorphic. Instead, I assume that in learning to read fluently, the brain sets up, by trial and error, a framework and pattern of connections that will very quickly and efficiently decode the structural pattern of words.

This is related to Elkhonon Goldberg's theory (See The Executive Brain) that one side of the brain deals more with novelty, and the other with learned, more efficient processing structures. In effect, as we learn, a much more efficient structure is set up on one side of the brain only. This idea is corroborated, Peter tells me, by the discovery that skilled readers have more 'lateralization'—processing only in one side—than learners, who use both sides more.

I'll post this prelim now, and then explain the model later today.

enne_son's picture

John, I've been following the Norris and Kinoshita work for several years and have read the paper you referenced. The contention that there is no orthographic code to crack is a bit of a red herring, because, in the stimulus sampling approach used by Norris and his associates, a representation of letter order is assumed. On their own account “the Bayesian Reader (Norris, 2006) assumes that word recognition involves a representation of input string consisting of sequentially ordered letter identities.” These form the “priors” in a process that decides if the pattern of letters in the stimulus can be fitted to the coded representation. The newer work proposes “an extension to the Bayesian Reader that incorporates letter position noise during sampling from perceptual input.”

My scheme doesn’t do everyday recognition on the basis of a representation of letter order, but on the basis of a relational filtering (see Bill’s post just above) at the level of role units, that detects string kernels before getting to words.

In 2012 Jeffrey S. Bowers and Colin J. Davis wrote a paper highly critical of the Bayesian approach with the provocative title “Bayesian Just-So Stories in Psychology and Neuroscience.”
See the downloadable pdf at the top of the list here:
Griffiths, Chater, Norris and Pouget replied to this here:
Bowers and Davis replied to the reply here:

So it gets complicated and contentious, and we might have to bring in the US Navy to settle accounts using lower case so we can read it comfortably.

William Berkson's picture

So here's the processing model of 'relational filtering' to identify words, using Peter's concepts.

First, it is a decoding model, encoding structural features, and then matching them to words. I don't know whether it operates at early visual processing levels, or only after the visual pattern is already processed to the point that we would recognize any script that we can't decipher. Second, this basic model doesn't assume that Peter is right about reading whole word structural patterns, though I think he is.

The motto is Michelangelo's "the more the marble wastes, the more the statue grows." Its central feature is a filtering model, rather than a pure building-up model. It starts with a huge amount of data, and ends up with a single word through relationship filtering which arrives at a brief code.

Each filter layer I will describe in terms of functionality, as I don't know exactly how this is executed by one or masses of neurons.

1. The bottom layer or deck is a layer in which we throw a framework of an inner matrix (inborn or learned) over the incoming data.

2. All the information on the bottom layer is sent to layer two, which establishes a 'set' as to what kind of pattern it is: for example, whether the pattern is (a) a face, (b) a three-dimensional object, or (c) writing to be decoded. If the criteria in this second layer in the 'face' area aren't met, the signal is stopped, and not sent on from this area. Since the input here is a word, the 'word' filter will detect a match, and all of the information will be passed on to the third level.

3. On the third level, letter features are coded. The cells in the matrix in each small area are sent forward to the fourth level, in which a group of say four vertical cells is coded to represent location.

4. The information on the third level is mapped to arrays of role unit detectors on the fourth level; each detector in an array can detect one of the possible combinations of the four states of the cells. All detectors receive the information of all four cells, but only where there is a match to the code of a particular role unit is the information sent on to the next level. All the alternative ID's for a particular role unit are passed on to the next level preserving information on what is adjacent.

5. The fifth level puts together the role units in Peter's 'string kernels', as described above, and passes them directly to 'word codes', which code words as an assembly of string kernels.

6. On the sixth and final level are all words represented in whole word structural relationship codes, codes of adjacent string kernels. Each complete word code assembled on level 5 is 'painted' or sent across the entire vocabulary, or at least common vocabulary, and whatever remembered word code matches that input, responds and links to a meaning and sound. The word is detected consciously at this point.

Now this sixth level is the final decoding level only for skilled readers reading familiar words. If it is an unfamiliar name, for example, or a confusion typo, the information will be sent on to an orthographic level, where letters are identified and look-ups done on higher levels again.

In other words, reading by visual pattern, without any orthographic processing, is the preferred route, as it is quickest. But we can also take the orthographic route quite quickly, though it takes a bit longer. All of these routes have been built up in the learning process, and unfamiliar names etc. regularly draw on these skills. The idea is that processing by whole word structural pattern is the last learned, but the first used.
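To make the two routes concrete, here is a toy sketch in Python. It is only an illustration, not a claim about how neurons do it. Every name in it (`pattern_code`, `VOCABULARY`, `read_word`) is made up for the example, and the 'code' is just a tuple of adjacent character pairs standing in for string kernels; the real model codes structural features, not abstract letters.

```python
# Toy sketch of "whole-word pattern first, orthographic route as fallback".
# All names and the coding scheme are hypothetical stand-ins.

def pattern_code(stimulus):
    """Crude stand-in for a whole-word structural code: adjacent pairs."""
    padded = f"_{stimulus}_"  # underscores mark the word boundaries
    return tuple(padded[i:i + 2] for i in range(len(padded) - 1))

# Stand-in for level 6: remembered word codes for familiar words.
VOCABULARY = {pattern_code(w): w for w in ("navy", "ship", "read", "word")}

def orthographic_route(stimulus):
    """Slower fallback: go through the stimulus one character at a time."""
    return "".join(ch for ch in stimulus if ch.isalpha())

def read_word(stimulus):
    """Return (result, route). The pattern match is tried first."""
    match = VOCABULARY.get(pattern_code(stimulus))
    if match is not None:
        return match, "whole-word pattern"
    # No remembered code responded, so pass the input on.
    return orthographic_route(stimulus), "orthographic"
```

Here `read_word("navy")` is settled by the remembered code, while a typo like `read_word("nvay")` matches nothing in the vocabulary and drops through to the slower route.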

When the words are well done visually, and the memory of the code is there, this is experienced as instant reading of words: we don't see letters, we experience meaningful words. In something like the scrambled letter tests, we are conscious of some time taken to stare at a word to decipher it.

The main advantage of a model like this is that it accounts for why we read AlTeRnAtiNg CaSe more slowly, which the interactive activation model just puts down to lack of familiarity. But this explanation is not plausible because the interactive activation model says we first identify separate letters as abstract entities, and we are in fact extremely familiar with both lower case and caps.

Next, I will post answering Kevin's questions with this model, and discussing all caps versus lower case. Sorry for the lack of an illustrative diagram. I don't have time to do one now.

John Hudson's picture

I'm sympathetic to both the matrix and role unit ideas but, other than a presumption of efficiency, is there a reason to favour a model that builds up from these to whole word structure recognition, rather than considering that they might contribute to letter recognition and hence to orthographic processing?

William Berkson's picture

I think it's only efficiency, in terms of being quicker. But this is a big deal in brain processing, I think. Matthew Luckiesh compared reading speed to blood pressure. The body will do almost anything to keep up blood pressure, because if it is too low you can pass out and die. Similarly, the brain wants to perceive meaning, and perceive it now. It is a matter of survival, which carries over to reading speed.

The brain is massively connected (more synapses than stars in the Milky Way!) and does parallel processing, but it is not that quick in cycling through a reflective process. (Reflection undoubtedly involves interaction of levels as envisaged in interactive activation.) Compare the speed of recognizing a face with that of adding 436 + 784. Also I don't see how going to a look-up of abstract letter identities isn't an additional step and time. You already have the letter features, and relationships in order to identify letters. If you just squeeze the ID from that, without going to abstract letter identities, you are leaving out a step, it seems to me, so that it is quicker. That's why I think we learn the whole word structural pattern unconsciously, just as we learn grammar in our mother tongue.

Whether we need to identify the flanking role units in the manner Peter suggests is less clear to me. We might have identified the structural pattern of the letters, and of the relational pattern of the role units in whole word in another way, maybe. But I don't see why a look-up of orthography of abstract letter units is needed routinely.

An important point to note is that the word superiority effect is a time-based effect. It is only when you flash words for very short times followed by a mask that you get it. If you look longer at words and non-words, it goes away. That to me is an indication of the fact that the race to word ID is won by whole word relational pattern, before it goes to orthography.

The race is 'won' when you get a meaningful word into working memory, as that is key to consciousness. To me you get the word superiority effect because word ID by the whole word structural pattern gets into working memory before the orthographic processing route. We are also very good at the orthographic route, but it isn't as fast.
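The race idea can be put as a toy calculation. This is a sketch only: the function name and both latency figures are arbitrary placeholders chosen for illustration, not measurements from any study.

```python
# Toy race between two routes, cut off by a mask after exposure_ms.
# The default latencies are arbitrary placeholders, not data.

def routes_finished(exposure_ms, whole_word_ms=150, orthographic_ms=250):
    """Return the routes that complete before the mask interrupts."""
    finished = []
    if whole_word_ms <= exposure_ms:
        finished.append("whole-word")
    if orthographic_ms <= exposure_ms:
        finished.append("orthographic")
    return finished
```

With these placeholder numbers, a brief flash such as `routes_finished(200)` lets only the whole-word route finish, which is where a word advantage would show up. A longer look such as `routes_finished(400)` lets both routes finish, so the advantage would disappear.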

John Hudson's picture

I'm not convinced by the efficiency argument, per se, because evolution favours good-enough solutions over optimal solutions. Your comments about the word superiority effect gave me pause, but that is, after all, a letter recognition test. What I have posited is that narrow matrix alignment of role units perceived as boumas aids letter recognition in a crowded context, and the so-called 'word superiority effect' is exactly what one would anticipate in that model, because real words are made up of role units arranged in ways that produce familiar bouma shapes, and non-words are not.

I'm really looking forward to your experiment design ideas!

quadibloc's picture

It sounds as though Norris and Kinoshita are presenting an interesting idea: that people read as a naive person would assume, by recognizing the letters one by one, but the fact that they do so under adverse conditions, and with the aid (or otherwise) of a visual pre-processing system developed for other purposes makes it appear that 'bouma' and word shape play a direct role in reading.

After all, it's only occasionally in reading that one has to guess which letter is in any given position. A lot of people even move their lips when they read. Those who read more quickly are more likely to be recognizing a word at a time, but even they consciously think of themselves as reading individual letters.

enne_son's picture

John asked: is there a reason to favour a model that builds up from role-units to whole word structure recognition, rather than considering that role units might contribute to letter recognition and hence to orthographic processing?

I think it can be argued that in the part of the visual cortex where letters are recognized, the neurons’ preferred receptive field is larger than the single letter, and includes parts of adjacent letters in text that is optimally spaced, typographically speaking. In a relational filtering environment, this mismatch makes the local combination detection required to get to letters subject to crowding, unless the role-units that are adjacent to the string-kernel nucleus are part of an already learned and synaptically supported code-item.

According to Denis Pelli and others crowding is negligible in foveal vision. However, in a poster presentation at this year’s Vision Sciences Society Annual Meeting last month, Mara Lev, Oren Yehezkel, and Uri Polat found that when foveal processing of letter targets is interrupted by backward masking, the spatial crowding is revealed and that a release from crowding in the fovea is achieved by allowing an increase in reaction time.

So there is a psychophysical reason to favour a model that goes from role-units to string kernels instead of letters. Getting to independent letters would require squelching of elements in the area surrounding the string kernel nucleus.

William Berkson's picture

John, following the above model, the word superiority effect comes from the word getting into working memory before the letters are identified as such. We don't identify the middle letter or whatever but deduce it from knowing the word. So it's not really 'letter recognition' in my model and Peter's theory, it's word recognition and letter deduction.

About what evolution favors, I think being able to read 'meaning' in scenes and situations, very rapidly, is highly rewarded both for survival and reproduction. Language in particular is essential to social interaction, which is the big thing that gives us our advantage. Also it is important to winning and keeping a mate, so good language skills give a competitive advantage, pushing beyond 'good enough.'

The guy who can understand ladies very well is way ahead of the game. Of course I've never met one who could :)
