Legibility/Aesthetics - Improving the reader experience.

Chris Allen's picture

Afternoon all,

I'm currently looking at how the study of the effects of aesthetics on legibility can help to improve the reader experience.

Now I know this is a much-debated area, the whole question of "how do you measure legibility" (of which http://typophile.com/node/41365 is a great thread), and the fact that aesthetics are subjective (as I believe legibility is, to a point - reader preference, familiarity, etc.).

What I am looking at is whether conducting legibility tests, followed by subjective aesthetics tests (in which various samples would be created with varying levels of creative elements, and the same tests applied as in the legibility studies, as well as preference tests, etc.), makes it possible to find a balance point between the two. The theory is that if you can find the point where legibility is maximised, and then find the point where the aesthetics don't negatively affect the legibility, then you can effectively improve the reader experience.

My thought behind the possible application of this is in uses such as study materials and required reading. For example, if you can increase the aesthetics to a point where they are maximised without decreasing the legibility of the text, can you in theory improve the reader experience, and would it have an additional effect on other aspects: retention, comprehension?

I know this possibly sounds a little vague, but I'm really interested in people's thoughts on this.

Thanks

gohebrew's picture

Aesthetics improves legibility.

Aesthetics draws the eyes' attention during reading.

Legibility then promotes its movement, while digesting meaning along the way.

This then is the secret of a very good text typeface. A text typeface is used to promote successful reading. When the text typeface incorporates subtle but attractive aesthetics without sacrificing legibility, the reading process advances to a successful conclusion.

William Berkson's picture

Karsten, it is possible to hurt readability both by tracking too tightly and by tracking too loosely. In both cases the "gestalt" of words is hurt, but in different ways. Peter's theory is a gestalt theory of reading, not an 'atomistic' one, if you take the atoms to be letters.

"Crowding" refers to close proximity that hurts quick identification of letters. It is something there is a whole literature on, Peter tells me. The connection of it to the ease of what Peter calls "word form resolution", or what you could call "word gestalt" is complex, and not yet studied.

enne_son's picture

Karsten, Eben hasn't said how he defines crowding, but in the psychological literature the following is representative: “Crowding is defined as impaired recognition of a suprathreshold target due to the presence of distractor elements in the neighborhood of that target.” [from: Christopher W. Tyler and Lora T. Likova, “Crowding: A neuroanalytic approach” Journal of Vision (2007) 7(2):16, 1–9 http://journalofvision.org/7/2/16/]. I’m not sure I'd say it underlies the seeing of gestalts as wholes.

You asked, “[w]hat riddle is there that needs to be solved?” I’m not sure it’s Chris Allen’s, so I probably shouldn’t press it, but my riddle is this: is the current way of schematizing reading, as, for instance, an automatic process mediated by the activation of abstract letter identities, correct? If it is, it might be difficult to understand why it is important that the cluster of letters form an optically integral or cohesive object-like unit / gestalt.

The current way of schematizing reading would put letter-form construction into the domain of legibility, or as I like to say, perceptual discrimination affordance, and gestalt-integrity into the domain of aesthetics. I want to believe, however, that there are “perceptual processing in reading” reasons for attending, in type design, to issues of gestalt integrity, or as I like to say, the optical-grammatical cohesiveness of the letter cluster, considered as a bounded map of black and white sub-letter forms.

To make progress with Chris’s question it might be important to resolve this fundamental disconnect between the priorities of type design and a central assumption of the science of reading.

k.l.'s picture

Indeed I read your remarks in the light of what I read about crowding before -- and placed your remarks in the same category. My mistake. Thanks for the clarification and also for an additional article.

I am aware that crowding is described as something negative, disturbance rather than formation, yet think one might reinterpret it as formation rather than disturbance. I see traces of such a reinterpretation in this introductory article but may err.
My problem with the conception of crowding is in its implicit premises. Bill, as you indicate, these studies deal with crowding only in so far as it disturbs identification of individual letters -- which presupposes an atomistic model of perception. Moreover, they try to explain perception as a mere technical process, as mechanics of the eye and brain. There is nothing wrong with describing the "technical" conditions for perception, but if the goal is to explain perception on this level alone, I fear they are on the wrong track.

Interestingly, the article which you recommend in your latest post refers to this one -- Korte: "Ueber die Gestaltauffassung im indirekten Sehen", 1923 -- and I am curious to read it. At least the title makes direct reference to gestalt. (It looks like crowding, as dealt with in articles today, is the result of transplanting an observation from a gestalt context into an atomistic context. Need to read this text.)

enne_son's picture

Karsten, there are many more papers where the Tyler / Likova paper comes from. See: http://journalofvision.org/7/2/

I think you’re quite right that “these studies deal with crowding only in so far as it disturbs identification of individual letters.” The Patrick Cavanagh review deals with peripheral vision. It helped me see that the important function of crowding in parafoveal preview is to constrain it to assembling ‘accurate ensemble statistics.’ Exactly what these ensemble statistics are, or what they have a bearing on, is unclear, but they go beyond just ‘envelope structure,’ or other markers of simple word-shape. Foveal vision is still required in visual word-form resolution, but what occurs there is contextualized and facilitated by this assembly process.

Most studies don't deal with crowding in foveal vision. I think crowding comes into play under various conditions in foveal vision. I'd love to explore this with you and others further, but maybe in a new thread.

k.l.'s picture

Many thanks! Reading and getting hold of the older article may take some time.

If Mr Allen doesn't object, maybe we'd better stay here. Otherwise we end up with two parallel discussions.

ebensorkin's picture

I would prefer that you stay here too - as long as it's okay with Chris.

Chris Allen's picture

Eben/Karsten,

More than happy for this to stay in the thread, it's a tangent, but still as far as I can see totally relevant to the discussion.

Just like to say I'm very appreciative of the in-depth discussion on this thread and would like to thank everyone, as it has given me a lot to think about. Sorry for not responding sooner, but as well as trying to digest all this new information, I've been working on a draft of material for my tests, which has kept me terribly busy these past few days!

Again thanks.

Chris.

dberlow's picture

"...if you take the atoms to be letters."
Yes, well, we've been there before. Only people who don't know enough about the parts that are smaller than letters consider the letter the atom. Sheedy et al. fell into this pit of uselessness when they concluded that the for-centuries-beloved-of-readers Garamond e is the least readable of all e's. lol.

"...it might be important to resolve this fundamental disconnect between the priorities of type design and a central assumption of the science of reading."

...might? ;) It should be clear that most "science of reading" is no more related to type design than the "practice of type design" is related to the creation of charts for eye exams. The convenience of common symbols (letters), and the words we have in common for the names of letters used on eye charts, does not seem to stop "scientists" from confusion over the difference between seeing letters and reading words, lines, paragraphs and pages and pages of text.

When one's definition and study of "crowding" is another's definition of "normal everyday composition for the purpose of reading" (as in Tyler & Likova), I say you have 20/20 proof!

Cheers!

enne_son's picture

David, my feeling about Tyler & Likova: transpose their discussion of “encoding, labels, conjunction and attention,” and especially “network relaxation minima” to the domain of my role units (your “parts that are smaller than letters”) and entire bounded maps (full typographic words), and we've made a start toward understanding how visual wordform resolution — a species of the genus: gestalt vision / gestaltauffassung in direct vision — works in a way that is sympathetic to expert-level type design and type-formatting attunements.

Nick Shinn's picture

...sympathetic to expert-level type design...

Philosophically that's not possible. In type design aesthetics *is* functionality, and aesthetics is impervious to scientific analysis.

In other kinds of design, it may be possible to separate function from visual style, but not type, because it only exists to be looked at.

enne_son's picture

If type only exists to be looked at, then it shouldn't be used for making distracting things like words. Words pull attention away from how the type looks to what is being said.

I can be categorical as well. In type design aesthetics and functionality are intertwined. Functionality is foundational; aesthetics a humanistic experience-improving virtue or plus value. Good spacing plays a role in rapid automatic visual word-form resolution. It also looks nice.

Philosophically type design is feature-manipulation. The perceptual processing impact of feature manipulation — to what degree it affords or promotes seamless or effortless network relaxation inside the visual cortex, for instance — is not, in principle, impervious to scientific analysis. Every feature-manipulative move has an aesthetic or gestural-atmospheric impact as well. That aesthetics resists quantitative formalization and technological control or automation doesn't mean the structures it puts in place, or the type of experience it promotes can't be rigorously explored.

billtroop's picture

'but not type, because it only exists to be looked at.'

These speaks the wise auteur of Fontesque.

Nick Shinn's picture

If type only exists to be looked at, then it shouldn’t be used for making distracting things like words.

What I mean is that other design objects have functions that are independent of how they appear to the eye.

Now, the aesthetics of, say, a chair, may be how it feels to sit in, or of a knife, how it feels to cut with, and in that sense the aesthetics is very functional and no different than with type. But don't people usually use the term aesthetics as part of a duality which splits "mere" visual styling from the "ugliness" of mere function? I don't believe you can do that with type. If it doesn't work it has poor aesthetics; it can't have good aesthetics and not work.

If you look at the words to appreciate the shape of their letters, that's not reading, is it?

Good spacing plays a role in rapid automatic visual word-form resolution. It also looks nice.

Exactly; there is no difference. Spacing is quite literally all about looking nice ("precise").

That aesthetics resists quantitative formalization and technological control or automation doesn’t mean the structures it puts in place, or the type of experience it promotes can’t be rigorously explored.

Agreed, but people expect that such exploration will lead to scientific principles which can be used to engineer predictable results, and that's not possible--at least, not at the "expert" level you alluded to, Peter.

Nick Shinn's picture

These speaks the wise auteur of Fontesque.

With somewhat better grammar than you, monsieur.

I think you would agree that Fontesque, even the Text version, is not an ideal face for extended reading, and in that sense, isn't its awkward aesthetics concomitant with its functionality?

enne_son's picture

What I mean is that other design objects have functions that are independent of how they appear to the eye.

I'm much more comfortable with that. Unlike in some other design domains, both the functionality and the aesthetics of typographic objects are tied up with how they look.

Agreed, but people expect that such exploration will lead to scientific principles which can be used to engineer predictable results, and that’s not possible — at least, not at the “expert” level you alluded to […]

Perhaps, but case-specific affordance ranges and thresholds can be fairly tightly scripted. And these thresholds and ranges impose practical benchmarks or constraints. At type-size y, typeface z will not be easily readable under nighttime driving conditions from a distance x, assuming a driving speed s, for people with 20/20 vision.
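A minimal sketch of what such a scripted benchmark might look like. The 18-arcminute figure below is an assumed placeholder for illustration, not a published threshold:

import math

# Assumed benchmark: text is "easily readable" only if its x-height
# subtends at least MIN_ARCMIN of visual angle at the reader's eye.
MIN_ARCMIN = 18.0

def x_height_arcmin(x_height_mm, distance_m):
    # visual angle subtended by the x-height, in arcminutes
    radians = 2 * math.atan((x_height_mm / 1000.0) / (2 * distance_m))
    return math.degrees(radians) * 60

def easily_readable(x_height_mm, distance_m):
    return x_height_arcmin(x_height_mm, distance_m) >= MIN_ARCMIN

print(easily_readable(100, 40))  # a 100 mm x-height sign viewed from 40 m

Driving speed would enter by converting the reading time available into a maximum viewing distance, tightening the constraint further.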

dberlow's picture

"Unlike in some other design domains both the functionality and the aesthetics of typographic objects are tied up with how they..."
disappear. And don't you forget it! :)

Cheers!

enne_son's picture

My riddle again, this time framed in Denis Pelli & Katharine Tillman’s terms.

Denis Pelli & Katharine Tillman write (The uncrowded window of object recognition, Nature Neuroscience, 2008): “Some objects are recognized through a single combining of features over the whole object, whereas other objects require separate combining over each of several regions of the object. The distinct regions define parts. In an object with multiple parts, each part must be recognized before they are all joined together.”

What is the case with words?

Is the combining metaphor right?

If reading involves the former, improving the experience might involve facilitating a single combining of features over the whole object; if reading involves the latter, improving the experience might involve facilitating separate combining over each of several regions of the object.

What should type do?

Cognitive and perceptual science seems biased toward the latter; typographers and type designers seem biased toward the former.

billtroop's picture

>Is the combining metaphor right?

Where's the metaphor?

>Cognitive and perceptual science seems biased toward the latter; typographers and type designers seem biased toward the former.

That's because what a good type designer/typographer knows how to do is:

facilitate separate combining over each of several regions of the object

by:

facilitating a single combining of features over the whole object.

dberlow's picture

Peter Quoting Study: "In an object with multiple parts, each part must be recognized before they are all joined together.”

So, reading some Arabic or most Chinese, carpet or carpentry appreciation, landscape, seascape or skyscape recognition, follows recognition of each part? What should type do? As it must do everything for all readers, it does. No one here should tell you there is no recognition by letter any more than anyone "out there" should tell us that there is no recognition by word.

Peter:"Explain to me in perceptual psycho-physical or perceptual-mechanical — perceptual science — terms why it is important — if it is — that the cluster of letters form an optically integral or cohesive object-like unit."

It is important — I know it is — that the cluster of letters form an optically integral or cohesive object-like unit, for if any of the thousands of clusters expected from a cluster-making mechanism fail to present clusters of sufficient optical integrity or cohesive object-like unitization, (or proper cluster separation or deunitization), then the mechanism responsible for the failure will not likely yield sufficient financial rewards to allow its maker future cluster-making mechanism occupation.

Cheers!

enne_son's picture

“No one here should tell you there is no recognition by letter any more than anyone “out there” should tell us that there is no recognition by word.” [David Berlow]

No one does. It’s just how these things are schematized — given what is known about human physiology, the structure of the visual cortex, information processing and neuro-mechanics — that gets me going.

But why should that matter? Typography and type design have gotten pretty far with it’s own type-tribal inuitions about these thiings.

“[…] facilitate separate combining over each of several regions of the object by: facilitating a single combining of features over the whole object.” [Bill T]

I’m not sure this can be done. I'm assuming wide spacing and closed forms would do the former, while rhythmic spacing and forms open to each other might do the latter. Also, closed forms might be more susceptible to crowding.

Instead of integration or combining, I think in terms of quantization. This involves resolving the information derived from the retinal and ganglion projections of stroke complexes into role-units, like stems and bowls and counters. As I see it now, quantization occurs at salient centres, but simultaneously and interactively all across the bounded map or uncrowded span. Combining isn't done, and parts (in the Pelli / Tillman sense) don’t emerge as labelled entities, but role-unit identities and direct associations are noted and calculated, along with positional information.

When a criterial amount of information at the role-unit level is resolved, lexical identity emerges and the reader moves on.

ebensorkin's picture

I started reading about crowding over the weekend (Pelli / Tillman etc). It's pretty interesting!

jupiterboy's picture

I worry the reading experience could become so efficient as to bypass comprehension, so that vast areas of white pages would need to be inserted to allow the brain to catch up with what the eyes had entered as data.

ebensorkin's picture

You worry about that happening as a consequence of what?

From what I have read so far (not enough), the crowding effect is many things, including a bottleneck to speed that isn't going to go away anytime soon.

jupiterboy's picture

Small joke Eben. As a consequence of the letters and setting being super-optimized through the application of scientific principles.

dezcom's picture

"...reading experience could become so efficient as to bypass comprehension"

That may have happened already, James. There is no sure way to tell if a regression eye movement is caused by difficulty reading or by a pause for comprehension of the text already read :-)

ChrisL

jupiterboy's picture

The world may never know. It would be difficult to call the possibility of the test subject becoming distracted negligible in any scenario I can conceive. Sometimes getting solid data depends on ignoring the right things I suppose.

dberlow's picture

"No one does."
Oh?

"Typography and type design have gotten pretty far with it’s own type-tribal inuitions about these thiings."

I'm going to assume you mean intuitions and things, and that you don't imagine us as natives of the polar regions meeting with Scandinavians in tribal conclaves. But corrected, it's a flawed statement nevertheless, because it ain't "our" intuitions we work towards, you know. You do know that, don't you? It's the tribe of readers that's gotten us this far thus far.

Cheers!

enne_son's picture

By tribal I meant intuitions largely specific to the type-design and typography elite. And it is your intuitions, not those of the tribe of readers. Of course it’s the everyday reading actions of ordinary human readers that either stumble over or bite effortlessly through what you provide. But it’s you who acts on your own ideas about the root causes of their stumbles, where your readers may only be able to express frustration and report symptoms.

The intuitions are probably pretty good, but it might be nice to know they're not out to lunch in evidence-based theory-of-reading terms.

William Berkson's picture

Peter, I think David's point is that the type designer's intuitions about what works are ruthlessly tested by the marketplace of publishers and readers. So most of type design lore, such as related in Tracy's Letters of Credit, does reflect the input of readers over the centuries. As they say, the shoemaker may be better at making the shoe, but it is the wearer who says whether the shoe pinches.

If we get to really good tests, I'm sure they will confirm much of existing lore. But there will be surprises, as there always are when breakthrough science is done. It hasn't happened yet, but as I have said I think it can and will.

enne_son's picture

Bill, there is a distinction between practical intuitions about what works and craft-based intuitions about the perceptual mechanics of reading. I think the higher service of science to typographers and type designers is in informing us about the latter.

Quincunx's picture

I think Unger also talks about some of the aspects discussed in this thread in his book While You're Reading. At the least it is an interesting read about reading in general and the influence type design/typography (thus aesthetics) have on the process of reading.

enne_son's picture

Yes, Jelmar’s right. Many aspects of reading are effectively discussed in Gerard Unger’s engaging While You’re Reading. See especially the chapters “Disappearing Types,” “The Process,” and “Reader’s Eyes.” The account is accessible, worthwhile and, as Unger himself says, “greatly simplified.”

To my mind Unger takes at face value too much of what is concluded about foveal vision and parafoveal preview in the studies his account relies on. In a section that relates to the specifics at hand, Unger writes: “In one model, each letter is read individually and words are similarly compiled one by one. Meanings are then allocated to each word and the words are then assigned to their places in the sentence. In this way a text is reproduced and comprehension built up in a step-by-step process. This may be how it works with children learning to read and at the same time learning to link sounds to letters. In the case of experienced readers the process may be different. Once the experienced reader has got going, he is thought to project ahead of him, as it were, expectations regarding the content. He takes in the text several words at a time, testing his expectations […]”

The “each letter is read individually” might seem innocent enough, but how might the process be different in the case of experienced readers, if it is? Other models of how reading might be different at the letter-assembly stage are not discussed.

Denis Pelli's picture

"No one here should tell you there is no recognition by letter any more than anyone 'out there' should tell us that there is no recognition by word."

Actually, we did a study to measure how much each of these processes contributes to reading. We went to a lot of trouble to use methods that would be convincing. What do you think?

Pelli, D. G., & Tillman, K. A. (2007) Parts, wholes, and context in reading: A triple dissociation. PLoS ONE 2(8): e680. http://www.plosone.org/doi/pone.0000680

Kevin Larson's picture

Prof. Pelli, thank you for pointing to your article; I had not seen that paper before. As one of the people who has been strongly advocating the view that letters are the primary route to word recognition, I think this is a nice demonstration that there are whole word and context components to word recognition.

I’m struggling with the context only condition (+S –L –W). Earlier studies that asked readers to predict the next word in a sentence found that people could only guess the next word ~20% of the time, and function words made up most of the correct guesses. How were you able to get the accuracy rate up to 80% in order to measure the reading speed of this sentence: “eE TbF FrD Qt fHe nQcM A”? Presumably there is accurate word length information here that wasn’t present in earlier studies, but 80% still seems high even with infinite time. Was some sort of adjustment made to get an RT for this condition?

Could you explain how Table 2 was created from Table 1? I haven’t figured out how you made that transformation.

Cheers, Kevin Larson

Denis Pelli's picture

dear kevin

thanks, good points.

"I’m struggling with the context only condition (+S –L-W). Earlier studies that asked readers to predict the next word in a sentence found that people could only guess the next word ~20% of the time, and function words made up most of the correct guesses. How were you able to get the accuracy rate up to 80% in order to measure the reading speed of this sentence: “eE TbF FrD Qt fHe nQcM A”? Presumably there is accurate word length information here that wasn’t present in earlier studies, but 80% still seems high even with infinite time. Was some sort of adjustment made to get a RT for this condition?"

good question. no, there's no adjustment, but bear in mind that a difference between our method and "guess the next word in the sentence" is that our observers, because they had unlimited response time, could also use context information from the FOLLOWING words in the sequence of 6 if it helped them. you are asking how people still managed to read well enough to measure an 80% threshold in the hardest cases. it was mostly a matter of getting some practice with the substitutes (overcoming the knockout), and partly that even letters that did have substitutes weren't always substituted (randomly, they were occasionally "substituted" by themselves). in the demo that you quote, we did all the substituting possible, to maximize the intensity and purity of the experience for the readers of our article. we talk about this in the methods: "The L knockout by substitution is quite effective, but not total. We attribute the residual reading rate (50 word/min) in the triple-knockout condition (see Table 1) to letter decoding, i.e. we think that L as reported here slightly underestimates the true value of the process. We tried making the knockout more severe (by not allowing a same-letter substitution when alternatives were available) but, as one would expect, this makes some conditions untestable because the observer cannot reach 80% correct at any presentation rate." thus, in the "S-only" case that you ask about, we think observers were still getting some help from L.
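to make the substitution idea concrete, here is a toy sketch in python. the shape-class map below is only a guess for illustration (chosen so that word shape survives while letter identity is destroyed); the actual substitution alphabet is specified in the methods of the paper.

import random

# rough shape classes: x-height letters swap with x-height letters,
# ascenders with ascenders, descenders with descenders (a guess for
# illustration, not the paper's actual substitution alphabet)
SHAPE_CLASSES = ["aceimnorsuvwxz", "bdfhklt", "gjpqy"]

def l_knockout(text, p=0.9):
    out = []
    for ch in text:
        cls = next((c for c in SHAPE_CLASSES if ch.lower() in c), None)
        if cls and random.random() < p:
            # random.choice may return the original letter, so some
            # letters are occasionally "substituted" by themselves
            sub = random.choice(cls)
            out.append(sub.upper() if ch.isupper() else sub)
        else:
            out.append(ch)  # spaces, punctuation, unmapped letters
    return "".join(out)

print(l_knockout("we went to see the movie"))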

"Could you explain how was Table 2 created from Table 1? I haven’t figured out how you made that transformation."

getting from table 1 to table 2 is a matter of fitting the LWS model (R = L + W + S + e) to the measured reading rates, minimizing the error across all 8 conditions. table 1 shows the measured rates and table 2 shows the model fits.
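for the curious, that fit can be scripted in a few lines. the reading rates below are made-up placeholders, not the values from table 1:

import numpy as np

# 8 conditions: every combination of the three knockouts.
# design[i] = [L intact, W intact, S intact] for condition i.
design = np.array([[l, w, s] for l in (1, 0) for w in (1, 0) for s in (1, 0)])

# measured reading rates (word/min) for the 8 conditions (placeholders)
rates = np.array([300.0, 250, 240, 190, 120, 70, 60, 10])

# least-squares fit of R = L + W + S + e, minimizing error across all 8
contrib, *_ = np.linalg.lstsq(design, rates, rcond=None)
print(dict(zip("LWS", np.round(contrib, 1))))  # word/min per process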

does that help?

best

denis and kat

Denis Pelli
Katharine Tillman

p.s.
all our papers are available at our web site as PDF files. we are very interested in the discussion here about the role of aesthetics in legibility and hope to address that in future work.
http://psych.nyu.edu/pelli/

ebensorkin's picture

Thanks Denis!

dberlow's picture

Fabulous! As one of the people who has been strongly advocating the view that letters are only among the routes to text recognition, I believe this is the best demonstration I've seen that there are whole word and context components to the process.

From the study: "We can be confident that each of these three manipulations affects only one of the three sources of word information in the text."

I can't be confident in this unfortunately. In the study's analogy to a store, this study uses lying customers (surrounding mangled words) to come 'round and "shout" at the sales people while they are trying to understand their chosen customer (word). This is done, e.g. by alternating case in words? I cannot help but think this is over-doing it (shouting), as opposed to using small caps instead of uppercase, and systematically replacing only ascending and descending letters (whispering the word shape away).

To put S&S's study into this study's store analogy, one'd be asking the store management if they thought the sales force would think it easier to serve a crowd of "expected customer types", or whether it'd be easier to serve a heavily accented crowd of unexpected tourists, who also (by mistake) turn out to be 8 inches tall and very hard to hear.

Cheers!

enne_son's picture

Denis, I share David’s sense that this moves the discussion forward, especially in the characterization of W.

However, I can’t be confident either that each of your three manipulations affects only one of the three sources of word information in the text.

My holistic process — let’s call holistic: H — involves more than your W. Your W is my holistic process’s parafoveal component, which I think of in ensemble-statistics terms rather than word-shape terms. My holistic process H also has a foveal component. The foveal component of my H is letter-feature based, but transgraphemic, or letter-boundary transgressing. In other words, it relies on location, identity and direct association information existing at the granularity of letter parts and evoked forms, but it doesn’t rely on channeling that information into independent letter slots. Instead of channeling information into slots my foveal H-process samples letter-parts and evoked form information simultaneously from all locations within the whole. The whole is considered as a bounded map of letter-parts and evoked forms, that is, as a bounded map of blacks and whites. (Channeling can however be successful in alternating case.)

This is like your “one-step assembly (features to word),” except that the “features” the assembly of ensemble statistics — your W — relies on are probably of a different order than those which are available to foveal vision.

My foveal H-process description uses ideas about feature-analytic processing developed by Frank Smith in his 1967 dissertation; ideas about pattern-unit processing developed by Neal Johnson prior to 1979, and ideas about identification matrices developed by Richard Golden in 1986.

In my scheme parafoveal pre-processing — your W — facilitates but doesn’t replace the foveal component of my H. In other words, I’m not convinced that your W process is actually “common to both central and peripheral vision.”

So here we have not one but two sources for H, and together they provide an alternative processing path, not an additive complement, to L — or so it seems to me.

When the foveal component of my H fails in the silent substitution paradigm, the perceptual processing system tries L. The letter-based identification mechanism isn't absent here, but the L-mechanism has to contend with misleading letter cues, so makes allowances for the possibility of substitutions, slowing sense-following down dramatically.

The only condition in which the foveal component of my H-process doesn’t encounter problems is with normal upper and lower case.

I wonder what this does to your analysis.

Interestingly, I discovered while trying to read your paper with partially fogged-up glasses, I could read the L-knockout text with remarkable ease and accuracy. When my glasses cleared up the L-knockout text again became very tough-going.

William Berkson's picture

Hello Denis,

I have been discussing Peter’s theory with him for some time, and have come up with some experiments to test it against the current received views which Peter calls ‘slot processing’. These are tests of ‘gestalt’ effects that are different from those you have so far identified, as Peter notes above. Kevin Larson has also expressed some interest in these experiments being carried out.

In order to understand the ‘gestalt’ effects that Peter writes about, it has helped me to put this in terms of a model that captures at least some features of his ideas, and makes it easier for me to think about the issues. Let me explain the model, and then the experiments.

The model I call a ‘matrix-resonance’ model. The idea here is that visual processing will identify sub-letter features, and place them into a structure similar to positions in a number matrix, as in matrix algebra. The position in such a word matrix would indicate spatial relations to other features in the word. Sub-letter features would be coded in some way like "left ascender" or "counter open on the right", and would have an indicator of spatial position in relation to other features.

Now sub-matrices can be ‘read’ out in matrix algebra by multiplying the matrix with another matrix that has a bunch of zeros in the appropriate places. Similarly, the features in a word matrix could be analyzed into letter matrices, and identified as letters. And I think that in fact happens.

But simultaneously, in parallel processing, and following Peter’s ideas, I think a whole word matrix ‘resonance’ process is going on. If a single pitch tone rings out in a room it will ‘paint’ vibrations over the whole room, but only a lampshade that is ‘tuned’ to the right pitch will resonate. Similarly, I think that the brain simultaneously ‘paints’ the whole visual memory bank of word matrices, and if one matrix pattern stored in memory ‘resonates’ then that is immediately identified as semantically the right word. And the process of resolution into letters is dropped, as the eye makes the next saccade.

Whether the matrix is read as a word or as letters depends on a number of factors. If the word is unfamiliar it will require the breakdown into letters. Also, if the reader has an attentional "set" to proofread, there will also be a breakdown into letters. But generally, in my view, there is a bias favoring reading the matrix as a whole as a word. That is to say, the whole word matrix is sent to the whole stored record of matrices of role units associated with semantic word meanings. And when it "resonates", to use my earlier metaphor, with a stored matrix, then the meaning is checked as sensible in context, and the eye moves on--or at least doesn't backtrack.

The basic a priori argument for such a scheme is efficiency. After all people instantly recognize Chinese characters, where even the ‘radicals’ that make up other characters often have as many strokes as a short English word. Why shouldn’t the brain take advantage of this pattern recognition capacity, and bypass an elaborate ‘look up’ process?

Furthermore, the matrix doesn't have to match memory exactly. There is probably some kind of partial resonance, which is good enough to pick out a word, so long as there are no competing words that are 'rung' by the matrix. Thus wrong spellings and broken letters are not even noticed.
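To make the two operations concrete, here is a toy numerical sketch. Every feature, matrix and threshold in it is invented for illustration; it is not a worked-out model:

import numpy as np

# toy word matrix: rows are hypothetical sub-letter features ("left
# ascender", "counter open on the right", ...), columns are positions
# across the word; 1 means the feature is present at that position
word = np.array([[1, 0, 0],
                 [0, 1, 0],
                 [0, 0, 1],
                 [0, 1, 1]])

# reading out one letter: multiply by a selector matrix that is all
# zeros except at the position of interest (here, position 0)
selector = np.zeros((3, 3))
selector[0, 0] = 1
first_letter_features = word @ selector  # only column 0 survives

# 'resonance': normalized correlation of the whole matrix against each
# stored word matrix, accepting a partial match above a threshold
def resonance(m, template):
    a, b = m.ravel().astype(float), template.ravel().astype(float)
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

lexicon = {"the": word, "cat": np.eye(4, 3, dtype=int)}  # invented templates
best = max(lexicon, key=lambda w: resonance(word, lexicon[w]))
print(best if resonance(word, lexicon[best]) > 0.8 else "fall back to letters")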

The point about this model, which is supposed to articulate Peter’s theory—though he may not agree with all of it—is that word gestalt may be defined DIFFERENTLY from external word shape or just what is seen in the parafovea. If this model has any validity, there is often not letter by letter processing, even though this kind of processing would fall in your “letter” category.

How would one test for the existence of such a sub-letter-feature word-pattern view, versus letter-by-letter parallel processing only? I’ll explain the experiments in my next post, and why they are in fact likely to be sensitive to the kinds of issues type designers are typically focused on in their work.

dberlow's picture

"I discovered while trying to read your paper with partially fogged-up glasses..."

You must luuuuuuv Cleartype! ;)

"After all people instantly recognize Chinese characters..."

Do all people instantly recognize Chinese characters? This is the parallel wheel upon which many of these experiments must also turn, whether the study likes it or not.

Cheers!

William Berkson's picture

So in the matrix-resonance model, we do parallel processing, but we are able to process and identify the word from patterns of sub-letter features across a whole word. We don’t always have to first resolve the sub-letter features into letters. That’s the basic point Peter has been arguing, and he convinced me of it after I was initially skeptical.

Being a former student of Popper, I knew the key issue in science is of devising tests, preferably crucial ones, for any theory; otherwise it’s not really part of science. The question I was wrestling with is: how in the world can you show that such processing is going on, given that if you can see the sub-letter features you can also see letters.

After reading the part of Kevin Larson’s Science of Word Recognition article on the word superiority effect (WSE), I got an idea. The word superiority effect shows up when sets of letters are flashed at a person, followed by flashing a “mask”, typically a line of xxxx, to the observer. The mask is supposed to stop visual processing, because it puts new information into the visual cortex, overwriting the old. The idea, of Reicher, was to study just the visual processing. When you do this kind of test, it turns out that people are better at identifying letters in words than in non-words.

So if the initial word flashed was ‘made’, people will be better at saying correctly that the correct missing letter in “ma_e” is a ‘d’ and not a ‘k’—better than if it were a non-word, such as “dmae”, and you showed them “_mae” and gave them the alternatives d and k. As Kevin notes, this word superiority effect has always been the best argument for the existence of some kind of word-gestalt effect, known on Typophile, thanks to Hrant and following Taylor & Taylor’s text on reading, as the ‘Bouma’ (rhymes with trauma :).

Following Peter’s view, and my experience designing a typeface family, it seemed to me that if you more widely spaced—tracked out—the letters, this should at some point destroy the word superiority effect. The idea is that the distances between the different letter parts in a word like “the” are important, and not only the relative distances between parts of individual letters. So increasing the spacing between letters would mess up the word-gestalt, or Bouma. Hence the WSE would vanish at greater spacing. And that would show that more than letter identification is important.

Now I told this to Kevin at TypeCon in Seattle, and he was interested. Then Peter did a literature search—and he has made an amazingly broad and deep study of the reading science literature—and found that indeed this test had actually been done—and I was right in my prediction, which had been based on Peter’s ideas. The WSE did go away with wide tracking. The original paper on spacing and the WSE was by Purcell, Stanovich, and Spector in 1978, with follow-up articles by Holender in 1979 and Purcell and Stanovich in 1982.

Now the fly in the ointment has been that mixed-case words liKE this ExAMpLe also show a word superiority effect. This led almost all researchers around 1980 to think that the WSE was some kind of memory and guessing effect, not a word gestalt effect, and to adopt the view that reading is totally by letter processing in parallel, with no trans-letter effects. However, as Peter pointed out to me, in these mixed-case tests there was a significant time between the showing of the group of letters or word and the showing of the ‘mask’. This time lapse is known by the awkward term “stimulus onset asynchrony” (SOA).

My conclusion has been that there are two WSE effects. One is influenced by orthography, and kicks in with longer exposure to the letters (longer SOAs). The other is more a word gestalt effect that is clear at short times, before the information goes further up the processing chain. This goes with my ‘matrix-resonance’ model, explained in the first post.

So based on this idea, I predict that with longer exposure to the letters, the word superiority effect will re-appear even with the widely spaced letters. Or, to make it sound much more impressive: the WSE will reappear at longer SOAs :) This will establish that there is a WSE that is due to word gestalt effects, as well as one due to memory effects. In other words, this will show that the typographer’s distinction between legibility of individual letters and readability of text is valid. Or in other words, the Bouma rides again :)

Now I should emphasize that the word gestalt effects here are not simply the word outline, or what can be seen blurrily in the parafovea, which you, Denis, have tested for. This, I suppose, is a gain in skipping words and in knowing where to plant a fixation, based on information gained in the parafovea. But there is much more to word-gestalt processing than this, if Peter is right, and if my model has any validity.

This is only one experiment, and I would propose a series involving the kinds of things that type designers are concerned with. These I’ll post next.

William Berkson's picture

Before I get to more experiments, a couple of responses:

David, I left out a comma. It should be: "After all, people instantly recognize Chinese characters." Also I shouldn't have written 'instantly', which is an exaggeration. I meant that proficient readers of Chinese can recognize them probably as quickly as we recognize letters. I don't know about tests of this though.

I am also aware that most Chinese characters are composed of two to four sub-characters. However, even the sub-characters (radicals) often have more strokes than a short English word. The point is that these basic characters are not modular in the way that alphabetic words are, and yet there is extremely rapid recognition. Of course we also very rapidly recognize faces, which are far more complex than letters. The basic point is that we humans have immense powers of pattern recognition. I don't see why our brain would forgo using them while reading. It seems to me it would prefer the seemingly quicker path of recognizing the word from the word pattern of sub-letter features, as Peter has argued and my model elaborates.

Denis, from the point of view of Peter's theory, and my model, with letter substitution you are not preserving 'whole word reading' and knocking out letter reading. You are just preserving whole word reading in the parafovea. In the fovea letter substitution knocks out the pattern of sub letter features. Thus by letter substitution you are completely defeating the kind of whole word reading that Peter thinks is going on routinely.

What you are doing with alternate case is quite different: forcing parallel processing of letters. You report that Frank Smith found that alternating case slows reading by 21%. That 21% may indicate in fact the difference between the efficiency of on one hand using word-pattern reading plus advance information from parafovea gains, and on the other hand using parallel processing. There could in normal reading conceivably be no word recognition by parallel processing of letters at all. The word patterns may always be winning out. Now I would not go that far, but the point here is that you are not testing for the kind of whole word reading that Peter theorizes exists and that my model represents.

Crucial tests are crucial in that they hit one theory and not another in their results. But these tests miss addressing Peter's view, and my model.

Incidentally, in Peter's view, historically the abandonment of word gestalt approaches around 1980 was based on the mistake of thinking that all there was to a gestalt view was the external outline of the word. According to Peter, people such as Frank Smith and Hermann Bouma were struggling with richer views, though these never got clearly developed.

I'll defer description of my further experiments testing Peter's view and my model until my next post.

Chris Dean's picture

track

Jongseong's picture

I'll just chime in regarding William's point about people recognizing Chinese characters on the character level.

Like many Koreans of my generation, I'm generally horrible with Chinese characters, as Chinese characters have been largely phased out of day-to-day usage in Korean. I can read a lot more than I can write, and I can vaguely recognize a whole lot more than I can recognize with certainty.

I can recognize the character 特 for instance, but wouldn't be able to recall how to write it. Also, I might vaguely recognize a character like 寶, but I wouldn't be able to tell if a few minor strokes were changed around as long as the overall pattern was not too different.

Clearly I'm not reading these characters by looking at their modular components—I don't even necessarily know what these are. I'm "reading" these characters by (vaguely) recognizing their overall shapes and pattern, not by looking at the details.

I get the sense that reading happens mostly on a level above that of the individual letters when I read texts in other scripts, but with Chinese characters it really is obvious.

William Berkson's picture

Ok, here are some additional tests showing the presence of word-gestalt reading, in the manner suggested by Peter and in accord with my model, using the Word Superiority Effect.

First, instead of simply spreading the letters out by tracking more widely, I would suggest doing irregular spacing of letters, accordion-like. In his book "While You're Reading" Gerard Unger has an example like that, and I think it will destroy the Word Superiority effect even more effectively than wider spacing.

Second, I would try whiting out portions of all the letters in a good text font, but leaving default spacing and everything else alone. I would predict that the WSE will be even more dramatic in these cases. People will be able to read words on the basis of the partial letters present, but have a much more difficult time with identifying letters in non-words. I think this would be a particularly dramatic demonstration of the way the brain uses sub-letter features across a word to get a word image.

Third, I would keep spacing the same, but mess up the rhythm of the type face by having some very wide letters, such as very circular bdeopq, along with very narrow ones. I think that will also mess up the WSE. In fact the (horrible for text) typeface "University Roman" already does that. But I would take one of those faces with a lot of widths--such as David Berlow has produced--and mix up widths.

Fourth, I would randomly alternate regular and bold weights within the same word. Again, I think this may disrupt the WSE.

Fifth, I would compare the effect of gradually wider tracking with very round faces, like Helvetica, with ones based on the oval, such as Meta. I think that in order to get a good word image with very wide and round faces, particularly sans, they have to be spaced too close, and this causes crowding that hurts readability. This may affect the WSE.

In fact I would also test the same spacing disruption with a good sans and serif of similar width (equalizing for x-height), and see if that affects the WSE. My theory is that it is hard or impossible to get the balance of word image and wide enough spacing for legibility with a sans compared to a serif. This might show up in the WSE.

These issues of even color and good rhythm are ones that good type designers are concerned with. David Berlow has talked about them as enabling type to "disappear" as the reader is reading. They may only make a difference of less than 5% to reading, but that 5% may be the difference between screen and print, or between a good text font and a bad one.

Finally, let me say that I do expect that such disruptions will show themselves in ease of reading, where by 'ease' I mean a combination of reading speed and comprehension over time.

These factors may not show up with easy materials for a short time, because the reader may be able to adjust for mildly adverse situations. However, if, e.g., you have a series of SAT reading questions for people to do over a 3-hour period, with the goal of getting as many right as possible, answering all, then I think you will get a measure of reading ease. The worse the face for text type, or the worse the combination of setting and type, the worse will be the performance. Also I would expect reports of fatigue.

Kevin has pointed to a 1948 book where there was no decline in reading performance over time, but this hasn't been repeated, and I suspect the methodology was flawed.

enne_son's picture

More from me. I stumble over the idea that Pelli and Tillman’s dissociations reveal independent reading processes, which each contribute a given number of words.

I wonder if it would be less troublesome to say that Pelli and Tillman’s manipulations measure three interacting factors rather than three independent processes. Reinterpreting Pelli and Tillman, could we say effective role-unit quantization is the largest factor in determining reading speed, grammaticality plays an important role, and interfacilitation comes in third? Ease of quantization into role-units accounts for about 61 percent, grammaticality accounts for about 22 percent, and the affordance of interfacilitation within the quantization process accounts for the rest, about 18 percent.

I doubt if it’s quite that simple though, because the knockouts Pelli and Tillman use introduce various bottlenecks. For example, alternating case probably introduces two: a componental abstraction — separate integrations over distinct areas — bottleneck (which probably affects attentional effort), and a sophisticated guessing bottleneck (which probably causes more reliance on grammaticality).

If they are interacting factors, changing levels of grammaticality or levels of interfacilitation affordance should change the amount of role-unit quantization required.

[Interfacilitation: when the visual cortex sees role a (for example a bowl or stem) at position x within a full bounded map, role b at position y, and role c at position z, this may facilitate a more accurate and quicker decision about what exists at position w.]

[Role units are things like strokes and counters. They are elements of intermediate complexity, existing between simple features (like curvature, or absence of closure) and letters (which in the word context are identifiable as parts). Some studies purport to show that intermediate-complexity features are optimal for the basic visual task of the classification of wholes.]

[Quantization is my name for the process that moves from detection of simple features to discrimination of distinctive role-units. I use quantization because this process is not purely bottom-up. It is driven by learned perceptual knowledge of these structures of intermediate complexity.]

Denis Pelli's picture

Sorry to be so slow responding. I was hounded by reminder letters for overdue reviews. They’re now done. Phew!

David “dberlow” (Nov. 15): “Fabulous!” Thanks!
“This is done by … alternating case in words. I cannot help but think this is over-doing it.” You are certainly right in noting that there are subtler ways of disturbing word shape. (I like your suggestion of alternating case using small instead of large caps.) However, your concern does not apply to the needs of our paper. We estimate the value of the process (word shape) by how much the reader slows down when we knock it out. The finding, surprising to some, is that knocking out word shape causes only a small reduction in reading rate. Thus, the fact that we used a really brutal method (alternating case) yet still had only a small effect makes our result all the more convincing. Thus, given the conclusion that our paper was arguing for, it is helpful to err on the side of “overdoing it”. (If we had obtained the opposite result then it would have been desirable to go back and try something subtler, such as you suggest.)

Sorry, I could not figure out what you’re referring to as “S&S’s study”.

Peter “enne_son” (Nov. 15, 17): “this moves the discussion forward”. Great. Thanks!

“However, I can’t be confident that each of your three manipulations affects only one of the three sources of word information”
Our paper gives reasoning and evidence to reach this conclusion. To challenge the conclusion you have to challenge the evidence or the reasoning. Simply describing another theory has no bearing. The reason that we give for concluding that the knockouts were selective is that the effect (in word/min) of each knockout on reading rate was the same, independent of whether the other knockouts were present or not.

“Interestingly, I discovered while trying to read your paper with partially fogged-up glasses, I could read the L-knockout text with remarkable ease and accuracy. When my glasses cleared up the L-knockout text again became very tough-going.”
Wow! That is very interesting. I’ll try that myself. Thanks!

“I wonder if it would be less troublesome to say that Pelli and Tillmans manipulations measure three interacting factors rather than three independent processes.”
Is your theory consistent with our data? That would be very interesting.

Bill “William Berkson” (Nov. 15-17): I like your matrix-resonance theory. It puts all the elements together in a novel way. And I also like, very much, that you propose experiments to test it. However, as you may already have noticed, suggesting a new experiment is a bit like suggesting a new typeface. It is very unusual for a type designer to create a new face that somebody else suggested (unless, of course, it’s a client paying for the privilege). Similarly, scientists usually do their own experiments. If you want to create a new typeface or do a new experiment, usually, you have to do it yourself. However, for pilot data, it can be quite informal, just printing a few sheets and taking simple measures that don’t require any fancy equipment.

As for the Word Superiority Effect, I agree with you and Kevin when you say, “this word superiority effect has always been the best argument for the existence of some kind of word-gestalt effect”. However, that was before my paper on this (Pelli et al. 2003). We replicated the effect, but presented a mathematical analysis showing that the effect is not big enough to be compelling evidence for a word-based process. The effect can be accounted for by a letter-based process. No one has contested our conclusion, so the word-superiority effect (of the typical size measured) is no longer good evidence for a word-based process. I’m afraid this reduces the impact of the elegant word-superiority-effect experiments you proposed here.

Pelli, D. G., Farell, B., & Moore, D. C. (2003) The remarkable inefficiency of word recognition. Nature, 423, 752-756.
Article:
http://www.psych.nyu.edu/pelli/pubs/pelli2003words.pdf
Supplementary demonstration:
http://www.nature.com/nature/journal/v423/n6941/extref/nature01516-s1.pdf

I think that the results presented in that paper are consistent with your matrix-resonance theory. Our results indicate that the word is not just one feature, nor is a letter; the features must be smaller than a letter. Which agrees with your theory.

However, we have another, later, paper showing that the word cannot be recognized unless the letters are separated (Martelli et al. 2005). This is the “crowding” effect that Peter has mentioned several times. I don’t think that your matrix-resonance theory can cope with these data, because, according to your theory, the break between letters should not matter.

Martelli, M., Majaj, N. J., & Pelli, D. G. (2005) Are faces processed like words? A diagnostic test for recognition by parts. Journal of Vision, 5(1), 58-70, http://journalofvision.org/5/1/6/
Pelli, D. G., & Tillman, K. A. (2008) The uncrowded window of object recognition. Nature Neuroscience, 11(10):1129 - 1135.
http://psych.nyu.edu/pelli/pubs/pelli2008uncrowded-complete.pdf

“Thus by letter substitution you are completely defeating the kind of whole word reading that Peter thinks is going on routinely. … you are not testing for the kind of whole word reading that Peter theorizes exists and that my model represents.”
Yes, that’s right. We were not aware of your theories at that time. But I think that the Martelli et al. paper above presents strong evidence against the matrix-resonance model. Letters do matter.

When you say, “a series of SAT reading questions”, what does “SAT’ stand for? Speed-accuracy tradeoff? Scholastic aptitude test?

I’ll respond to your fatigue theory in my next post.

Best

Denis

William Berkson's picture

Denis, thanks much for your reply.

Obviously, I will have to read your papers that you reference to see whether or not I find your tests and arguments convincing. That will probably take me some time, as I have a crush of other work to deal with right now.

I would just note at the outset that the matrix-resonance model I suggested does not dictate how much of reading is by letters and how much by word pattern of sub-letter features; it just allows that some of it is. I don't know how much of it is, but I do think that there is at least enough to be of importance to the issues dealt with by type designers.

As David Berlow mentioned, either in this thread or another recent one, whether the foundry can close the sale of a font to publications often depends on the type designer creating type that becomes 'invisible' when you read it. 'Invisible' means that you are unaware of the type for most of the time reading, and just absorb the meaning. There are also intervals when you orient yourself on the page, looking at headlines, first lines etc., and there you do 'see' the type, and aesthetics have a strong impact in these intervals.

The factors that contribute to 'invisibility' include even color, good spacing, and letter proportions that have good rhythm. These may affect reading speed or fatigue only by a few percent, but it is enough to make the difference between good and bad type, or between the relatively low resolution screen and high resolution printed matter.

There is a possible confounding factor here on the issue of word-gestalt vs letter reading, and that is that these factors may allow for more ready letter identification. But the word superiority effect may allow us to separate what is going on, which is why I suggested those experiments. Again, I'll have to read your papers to see whether I'm still optimistic about this being an indicator.

I would point out, as you reply to Peter, that you reference an alternative theory and experiments, but do not here attempt to respond to and account for the Stanovich et al. results that I "predicted", namely the decay of the word superiority effect with greater spacing. Stanovich himself dismissed it (in a rather ad hoc manner IMHO) as being somehow due to confounding effects of masking. Going to longer exposure times--long SOAs--would undermine this masking explanation, if I've got it right. Whatever the truth is about reading, it needs to account for the spacing effect. By the way, Kevin is on your side in being skeptical about the WSE indicating anything important; but he has been open-minded enough to think it's worth testing, and I'm grateful for that.

Oh, I see I also failed to mention another indicator of ease of reading, other than speed and comprehension: how easy it is to proof-read a text and catch errors. That's a noticeable difference between screen and print, as most people can testify.

Yes, by SAT I meant the Scholastic Aptitude Test. These reading tests are already standardized as regards difficulty, and are pretty tough, so they would be used to test whether speed or comprehension or both decay with time, and hence fatigue, using different type and formats.

I think such tests would be revealing, as I don't think there have been really good tests of the 'goodness' of text type for reading, to discriminate better and worse type, and better and worse layout. This is really a different testing issue than letter vs word processing, though it may be related.

As to what scientists do, your description is not quite accurate. A lot of scientists collaborate on tests that others conceive, including very often junior ones with a senior professor. Collaboration with colleagues on an equal level is also not that uncommon. Collaboration with someone outside academia is unusual, but there's actually no law against it :)

enne_son's picture

Denis, my first response questioned whether, under a different — and I think fuller or richer — account of reading, the knockouts you propose — straightforward though they seem — are properly selective, or at least selective in the way you say they are, and consequently whether they lead to pure additive-factor measures of the processes you say they do. My second reply was to suggest that your results might indeed be compatible with — or at least potentially informative about — components of such an alternative point of view.

Bill, assuming a purely sub-letter feature based process is routine for skilled readers reading ordinary extended blocks of text, and is the key to invisibility during immersive reading, it would be useful to know not only how much, but under what circumstances a letter-process (componental abstraction) occurs.

Denis, I'll try to respond to your comment about the Martelli et al. paper presenting strong evidence against the matrix-resonance model in a day or so.
