The great 'Cedilla vs Undercomma' debate...

Nick Job's picture

Just been thinking about cedilla options. Microsoft typography says this...

Under comma and cedilla

The under comma is the preferred form in the Romanian language for the uppercase characters S and T with under comma accent and lowercase s and t with under comma accent. Four new Unicode values have been defined to accommodate this preference: Scommaaccent U+0218 ; scommaaccent U+0219 ; Tcommaaccent U+021A ; tcommaaccent U+021B

The connecting cedilla is the preferred form in the Turkish language for the uppercase S with cedilla and lowercase s with cedilla: Scedilla U+015E ; scedilla U+015F

An under comma is an acceptable alternative to a connecting cedilla for the following characters: Ccedilla U+00C7 ; ccedilla U+00E7 ; Kcedilla U+0136 ; kcedilla U+0137 ; Lcedilla U+013B ; lcedilla U+013C ; Ncedilla U+0145 ; ncedilla U+0146 ; Rcedilla U+0156 ; rcedilla U+0157 ; Tcedilla U+0162 ; tcedilla U+0163

In the Portuguese and Catalan languages the traditional connecting style of a cedilla is more commonly preferred for the Ccedilla U+00C7 and ccedilla U+00E7.

It is common in modern designs and French typography to see a cedilla design with a stroke that is not connecting or as in common handwriting, a line that passes through the bottom or beneath the uppercase or lowercase c.

Are Microsoft right, should I do what they say?

Who is the authority on cedillas?

Does my cedilla have to be a traditional shape? Would an undercomma-style cedilla on my ccedilla and scedilla offend or upset French/Turkish/Portuguese/anyone else? Should I just knuckle down and design a traditional cedilla that connects with the c and s?

Michel Boyer's picture

> I have never seen a detached cedilla in a serif font (in French).

I am sure I have never seen a French text typeset in Goudy Old Style. Here is how that looks on my mac:

For me, that one looks quite weird; I can't see that thing under the c as a cedilla, I see it as a comma, even if the only possible diacritic under a c in French is a cedilla. As for Eurostile or Futura, I see no problem; their detached and unadorned cedilla allows a much larger freedom of interpretation.

By the way, if you are on a Mac and use "Show character palette", View: Roman, Accented Latin and click on the ç on the right, you will see all the ccedillas of all the fonts installed on your Mac in the bottom panel; I was surprised by Adobe's Rosewood ccedilla.


scannerlicker's picture

Michel: can you show us the comma?

Michel Boyer's picture

You mean Goudy Old Style's comma? Here it is with the cedillas.

At that size, things seem to look better.

By the way, in his book The palaeography of Gothic manuscript books: from the twelfth to the early sixteenth century, Albert Derolez describes on page 173 a script he calls "Iberian Hybrida" (a gothic script used in Spain and Portugal) saying

Iberian Hybrida is also marked by the form of the c caudata (ç, which was used as an alternative to z): the cedilla was placed well below the baseline and is unconnected to the c (26)

And here is figure 26:


scannerlicker's picture

Weird. Thank you, Michel.

Michel Boyer's picture

The last issue (March 2009) of Paulo Heitlinger's Cadernos de Tipografia e Design is devoted in large part to his font Escolar ("uma fonte contemporânea para aprender a escrever e ler"), to be used in primary school for learning to read and write. There are two versions of that font, one for Portugal, one for Brasil, and amongst the differences there is... guess what... the c cedilla. And here is a grab of his pdf:

Can someone tell me how they teach children to write cedillas in Brasil? :)


scannerlicker's picture

Can someone tell me how they teach children to write cedillas in Brasil?

Cedillas are taught the same way in both countries. What you just showed us is just wrong.


Michel Boyer's picture

Cedillas are taught the same way in both countries.

How do you know? There are slides from Florian Hardwig's manuscribe project that show that the way children are taught to write letters can vary a lot from country to country.

What you just showed us is just wrong.

The question here is not what is right and what is wrong but what is true and what is false. Is the statement "Some (many?) schools in Portugal teach children to draw cedillas as in the font Escolar Portugal" true or false? The above grab of "Escolar Portugal" corresponds to the glifos + exemplos (pdf 280 K) on the font site. The answer needs to be yes or no, not right or wrong.

Miguel Sousa's picture

> There are two versions of that font, one for Portugal, one for Brasil, and amongst the differences there is... guess what... the c cedilla.

I think this differentiation is artificial. I see no reason for it to exist. I personally have no preference over one or the other. They're both acceptable, legible, readable and intelligible as a 'ccedilla'.

In practice, the cedilla in people's day-to-day handwriting is a basic downward (slightly curved) stroke under the 'c', either attached or unattached. FWIW, I can't remember the last time I wrote a hook-like cedilla on my c's, but I certainly do it in my glyphs.

gomes's picture

The question here is not what is right and what is wrong but what is true and what is false. Is the statement “Some (many?) schools in Portugal teach children to draw cedillas as in the font Escolar Portugal” true or false? The above grab of “Escolar Portugal” corresponds to the glifos + exemplos (pdf 280 K) on the font site. The answer needs to be yes or no, not right or wrong.

There isn't much of a question here, really. lula_assasina's right too: the Portuguese language has not adopted undercommas, only cedillas. If, by definition, cedillas connect to the character, then it seems pretty clear that the cedilla should, well, actually connect to the character. The fact that typefaces might be designed without that concern hardly means they're not wrong.

Anyway, I'm familiar with the teaching methods for the early stages in Portuguese education and can say that we do teach our kids to write connected cedillas, so it's a yes to your question.

Michel Boyer's picture

we do teach our kids to write connected cedillas

Thanks for this clear and unambiguous answer.

FWIW, I can’t remember the last time I wrote a hook-like cedilla on my c’s, but I certainly do it in my glyphs.

Miguel, I often find that hooked cedillas on fat fonts look weird. Here is a cedilla that I find bold and honest without being disturbing:

It comes from the first of these sites

Enjoy. I like Chico Bento.


scannerlicker's picture

How do you know?

By asking a Brazilian primary school teacher who gave classes in both countries. Hook connected with a stroke to the bottom of the "c". At least, that's the way they teach future teachers. :)

scannerlicker's picture

Let me just be clear about one thing: cedillas and undercommas have different origins. Cedillas come from the Visigothic "z", which has a peculiar connected hook shape. Visigoths occupied the whole Iberian Peninsula.

Undercommas appeared in the Buda Lexicon, in Romania, 1825, in order to give Romanian's missing sounds a written form.

Michel Boyer's picture

Undercommas appeared in the Buda Lexicon, in Romania, 1825, in order to give Romanian's missing sounds a written form.

Let me also be clear on one thing. If undercommas appeared in 1825, then they could not exist in 1482. Now, here is a font that was used in the Ratdolt 1482 edition of Elementa Geometriae by Euclid and that corresponds to the link Typ.1:109R (from the preceding link).

Consequently the glyph that is at the right of the c in Ratdolt's font cannot be a "c undercomma".


Michel Boyer's picture

And have a look at this 1540 edition of BARROS, João de, 1496-1570 Dialogo da viçiosa vergonha from the Biblioteca Nacional de Portugal. Here is a grab (line 4, page 2v).


scannerlicker's picture

Ah, better archives you got there!
I re-checked, and the Buda Lexicon contains the proposal to add undercommas to the letters "s" and "t" in Romanian.

Then again, they still have different origins.

Consequently the glyph that is at the right of the c in Ratdolt’s font cannot be a “c undercomma”.

Mind your sarcasm; you don't need it, since you have found enough proof to be right.

Thanks for the great investigation again, Michel.

Michel Boyer's picture

Mind your sarcasm, you don’t need it

No sarcasm intended. My background is mathematics and I just used the style I use for mathematical proofs.


Michel Boyer's picture

Thanks for the great investigation again, Michel.

Welcome. I had other interesting things but I don't know where I put them. I have at least this link where can be found manuscripts of Dante's Divine Comedy dating from the fourteenth and fifteenth centuries. The first words "Nel mezzo" are sometimes written "Nel meçço", or "Nel meço" and the cedilla can take a wide variety of shapes. Here is a grab from the first page of this manuscript; look at the cedilla in red and the two in the word mezzo!

It seems that printing had a normalizing influence on Italian spelling. I could not find a single cedilla in incunabula of the Divine Comedy (but maybe I did not search hard enough...)


scannerlicker's picture

Michel, I took a closer look at the Dialogo da viçiosa vergonha and the cedillas are correct. Maybe the scan doesn't have enough resolution for that to be clear.

Here's a titling:

Michel Boyer's picture

Maybe it doesn’t have enough resolution to be clear.

It would be nice to have access to one of the high resolution versions. In the word bençã[o] above, it is hard to believe that there is ink between the c and the black diacritic. Is it possible the titling was attached and the regular detached? Else I would imagine an extremely fine line that would be beautiful in digital type but that would probably print with difficulty if the metal character is not melted properly.


Michel Boyer's picture

I think you are right about this font. Here are three cedillas (if I don't count the lower diacritic on the "e caudata") from page 13 of João de Barros' Grammatica da lingua portuguesa, 1540, which seems to have been printed in the same font.

Part of the problem seems to come from printing (if not from broken characters) but we would need a higher resolution digitization to really see what we need to guess in the first one. I like that font.

Michel Boyer's picture

I found the files I was looking for. The cedillas come from a hand written text (in Spanish). The source is "Documents of the Hispanic Southwest: The Expedition of Francisco Vásquez de Coronado 1540-1542", Jerry R. Craddock (pdf 9.8MB). The document is concerned with palaeography and displays some fantastic cedillas. Here are a few grabs.


Edit: I should have said the hand-written text that is studied in Craddock's article is "Archivo General de Indias, Sevilla. Justicia 267, fol. 814r".

dezcom's picture

Those top two are some beauties, Michel!


Michel Boyer's picture

I found a better scan of the grammar (pdf 5.8MB). Here is what I get for the above grab (this time on page 30 of 124 of the pdf file instead of page 13v).

The attachment I had imagined on the first c does not seem to be there; it is just the c that seems to have been more inked. On the other hand, the attachment on the e is clearly there and there seems to be no problem with any of the "e caudata". It is now my feeling the cedilla is not attached in that font.


Michel Boyer's picture

Cedillas come from the Visigothic “z”, which has a peculiar connected hook shape. [Fábio Martins, o Lula Assassina]

Here is the Visigothic "z" in Juan-José Marcos' (Professor of classical languages) Paleographic Fonts for Latin Script; this is a grab from page 27 (bottom) of his font sample file (pdf, 3.15 MB).

It does not look like a digit "3" fused at the bottom of a "c" that would have shrunk to give a cedilla like the svg Wiki figure.

Does anyone have a sample (from a digitized manuscript for example) of a Visigothic "z" that would look more like the Wiki?


Added: I took the picture from the Cédille article (in French). The Cedilla (English) Wiki entry has a slightly different picture.

scannerlicker's picture

That visigothic "z" appears to be taken from this manuscript:

I'm having a hard time finding one like the one in the wiki.

Michel Boyer's picture

On the other hand, it is in my opinion rather clear from the text of the grammar itself that de Barros did not know the origin of the letter "ç" (the print is dated M.D.XL., 1540). Indeed, on page 10 of the pdf file (pdf 5.8MB) (leaf 3 verso, only rectos were numbered), he writes

That I understand to mean

And likewise we have that letter "ç" that appears to be an invention of the Hebrew or Moorish pronunciation.

(correct me if I am not faithful to the author) and he repeats on leaf 46 recto (page 95 of the pdf file):

where I understand that "ouuęmos" (with a nice "e ogonek" that may cause display problems) stands for "houvemos" (It appears to us that we got those letters from the Moors).

I am aware that the "ç" that was used in Italian and Spanish is now written "z" [**], and the few books on palaeography that I have seen state that the letter "ç" derives from the Visigothic letter "z". I have never seen references though. May I thus ask the question: who was the first to state it? Where is it soundly argued?


[**] I know neither Spanish nor Italian. I think the old "ç" may now just be "c" in Spanish. However, the directives for transcribing Dante's Comedy were stating to always write "ç" and not "z" when a cedilla was used in the manuscript and I concluded that in Italian, a "ç" automatically became a "z" in the modern spelling.

Miguel Sousa's picture

Michel, in modern Portuguese those two passages can be written as,

E assim temos esta letra ç, que parece ser inventada para pronunciação Hebraica ou Mourisca


Nos parece que houvemos estas letras dos mouriscos que vencemos

which would be translated into something like,

And here we have this letter ç, that seems like it was invented for the Hebrew and Moorish pronunciation


It looks to us that we gained these letters from the Moors that we defeated

Michel Boyer's picture

Thanks Miguel for the correction. I thought that "pera" was "pela" in Modern Portuguese; I seem to have seen a few instances where the letter "r" in de Barros would now be "l" but your translation makes much (and more) sense.


Michel Boyer's picture

I was also wondering if the word "assy" could not correspond to the modern French "aussi" (also) instead of the modern Portuguese "assim". Unfortunately, the pdf is not searchable and I could not check if the word "também" figures elsewhere in the file.

Miguel Sousa's picture

There's an occurrence of "também" in leaf 45 recto

Miguel Sousa's picture

As I was perusing through the pages I noticed these few lines in leaf 25 recto

It was interesting to see the ſs ligature used along with the ſſ ligature.

Michel Boyer's picture

Better digitizations are accessible on the Library local network as can be seen by clicking on the [i] icon at the right of the Cópia interna links of

Grammatica da lingua portuguesa
Dialogo da viçiosa vergonha


Igor Freiberger's picture

Just now I saw this thread. As a Brazilian, let me add some information.

As said above, the cedilla is the only diacritic you find below a character in Portuguese. It is also used only with /c/, in the /ça/ço/çu/ combinations, and it never begins a word. Besides this, our contact with Romanian and Turkish cultures is almost nil, so Brazilians don't know there is such a thing as a commaccent.

Almost anything you put below a /c/ will be easily read as a cedilla in Brazilian Portuguese. It may be a cedilla, a commaccent, a straight line, an acute... all this will work because there is no other diacritic to be confused with the cedilla. And also because cedilla usage, although common, is very specific.

Students learn to write the cedilla as an upside-down hook.

Designing and using disconnected cedillas is not an issue for Brazilians. But this is not a trend in Portuguese. It's just a variation in cedilla shape made possible by the local language and culture (I believe the same applies to Portugal). On the other hand, disconnected cedillas in a traditional serif font may seem strange nowadays. Even if this was not unusual in the 15th and 16th centuries, later it seems to have changed to an always-connected shape. Remember that Portuguese is a relatively new language compared to the other Romance ones. Its consolidation came only in the late 1400s/early 1500s.

And, of course, many people may use wrong cedillas due to lack of knowledge about the commaccent existence. The book 'Ação de cobrança' cited above is a good example. I'm sure its editors have no idea they're using a commaccent-like cedilla, especially considering it's a Law book (Law books are usually badly designed here).

But if one designs a font with international support, both cedilla and commaccent must be clearly distinguished. The designer may even adopt a disconnected cedilla if this suits the general font style, but this cedilla still needs to be clearly different from a commaccent. A 'multi-purpose' diacritic does not seem acceptable.

Finally, for me it's quite obvious that if one must observe the correct position and shape of other diacritics, such as ogoneks, exactly the same applies to cedillas. Graphical similarity and some circumstantial wrong usage do not justify a font mixing up cedilla and commaccent.

scannerlicker's picture

Beautiful sum up.

hashiama's picture

Could anyone provide some insight into this lcommaaccent/lcedilla?
On this page: Comma Accent, I see an lcommaaccent in the first sentence: ļ

But when I copied it to Google search, I noticed it is rendered as an lcedilla. I realise this is because it is in Arial, and Arial's lcommaaccent glyph has an lcedilla inside it.

Checking further I saw that /Gcommaaccent /Kcommaaccent /kcommaaccent /Lcommaaccent /lcommaaccent /Ncommaaccent /ncommaaccent /Rcommaaccent /rcommaaccent
all have cedillas in Arial (though Arial Unicode shows comma-accent glyphs); as the page says, "It is easy to come across characters which have cedilla in place of a correct comma accent".

Checking to see other modern/recent fonts, I saw they all display a commaaccent inside the glyphs.

The exception is Segoe UI: all the *commaaccent glyphs show comma accents except Lcommaaccent (U+013B), which has a cedilla in it.

Is there a particular reason/compromise for this or just an error?

hrant's picture

BTW you know you're a type geek when you think the title of this thread is totally rational. :-)


Michel Boyer's picture

Certainly not rational but real:

quadibloc's picture

It seems to me that there should be no reason for confusion here.

An undercomma is one accent, a cedilla is another.

But typefaces have different stylistic characteristics. So a particular modernistic typeface might have a disconnected cedilla. That's fine - it's a characteristic of the typeface, and if people find the typeface unattractive because of this characteristic, as for any other reason, they can use another one.

The only thing that makes it complicated is that, thanks to OpenType, typeface designers have the option of offering more than one form for characters. Well, that's a plus. Which one is the default? The one that corresponds to the designer's vision for the typeface; again, I'm hard put to argue against that in principle. It is, of course, incumbent on the designer to have a vision that accords to the requirements of his target market, and so one might wish to discourage overly self-indulgent experimentation... but some typefaces are going to be more avant-garde than others. (Personally, I incline to interest in conventional and traditional typefaces, but I don't want to label the people who make the other kind of typeface as 'bad'; we need display faces too.)

But this thread has brought up an issue of cultural consciousness - the problem isn't cultural sensitivity; it is not will that is lacking, it is that there is so much information to research, often not readily available - that gives another side to this argument. A disconnected cedilla might be pleasingly modernistic to the French, but to the Portuguese it might be hopelessly confusing.

So one uses language sensitivity in OpenType. And that would also make the accents slope more for Polish.

One problem is that OpenType support isn't the greatest in some applications. Type designers can't be expected to rewrite those applications, but they certainly can make available one font for every set of alternates where that alternate is the baseline.

This has a benefit when it comes to another problem. Just because there are many people in Brazil who aren't used to the disconnected cedilla, it does not mean that there won't be someone there who wants to use it for its shocking visual effect.

Language sensitivity leads to the situation where people might say "But they don't let me use the real Eurostile Next because I'm Brazilian"!

So, while I think that if a Century Schoolbook font transparently does what is conventional for every language using the Latin script according to its own local rules, there are likely to be few complaints, one cannot say that the same approach is unequivocally the best one for a display face.

It is good for the type designer to be as aware of the language issues as possible, but the end user of the font has to be allowed to break the rules and use the original appearance of the face instead if desired.

dezcom's picture

"...but the end user of the font has to be allowed to break the rules and use the original appearance of the face instead if desired."

Yes, that is very true. But the language tag should make the first choice for what is normal for the particular language. The end user can always override that choice if he wants to.
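The order of precedence described here (explicit user choice first, then the language default, then the designer's default) can be sketched as a small lookup. This is only a toy model in Python, not how a layout engine or the OpenType `locl` feature is actually implemented; the glyph names `ccedilla.connected` and the table `LOCALIZED_FORMS` are hypothetical, and "PTG" is used as the language-system tag for Portuguese.

```python
# Toy model of "language tag chooses first, user can override".
# All glyph names and the table below are hypothetical, for illustration only.
LOCALIZED_FORMS = {
    # base glyph -> {language tag: localized alternate}
    "ccedilla": {"PTG": "ccedilla.connected"},
}

def pick_glyph(glyph, language=None, override=None):
    """Return the glyph form to draw.

    Precedence: explicit user override, then the language-specific
    alternate, then the designer's default form (the glyph itself).
    """
    if override is not None:
        return override
    return LOCALIZED_FORMS.get(glyph, {}).get(language, glyph)

print(pick_glyph("ccedilla"))                             # designer's default
print(pick_glyph("ccedilla", language="PTG"))             # language default
print(pick_glyph("ccedilla", "PTG", override="ccedilla")) # user insists on the original
```

The last call is the case quadibloc worries about: a Brazilian user who wants the face's original form can still get it, provided the application exposes the override.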

quadibloc's picture

But, the language tag should make the first choice for what is normal for the particular language. The end user can always over ride that choice if he wants to.

Initially, I agreed with that, but now I'm not so sure.

If the typeface is a text face, and not a display one, that seems to be good advice.

Also, there's a question about what "normal" for a given language means. It is basic and fundamental that an ogonek is not a cedilla is not an undercomma. So one is indeed on very safe ground in saying that a typeface should not do things that look like nonsense or like ignorant errors in any given language.

When type designers are getting their information about what is normative for different languages from good sources, they will be able to get this right.

But I can easily imagine cases where, because of the difficulty, as yet, in getting complete information of this sort for all languages, people will study existing practice, and come to wrong conclusions. So, if in one language, a particular natively-designed typeface has happened to be popular, and its accent marks differed in some way from other Roman typefaces of the same period, that does not necessarily mean that it is expected that accent marks in all typefaces should look the same way.

Also, to some extent, the Latin script community is a single typographical community, not many different isolated national communities any longer. So attempts to meet the needs of different countries may end up in part being based on a false premise.

Michel Boyer's picture

I am no font designer and I am happy not having to care for all the needs of potential users. What I think is that the end user (let's say the typographer, or the "learned" end user) should be given appropriate tools to adjust the fonts to his needs (and be allowed to do so); and I think that FontForge is a good choice for such a tool. Let's take for instance the case of this thread: all that is required is additional characters with dots or macrons above or under basic Latin letters to produce Pali text. A simple copy and paste may produce the desired characters, but then the kerning tables may need to be adjusted. Alternatively, some anchors may be added and a ccmp entry may decompose the glyph that is typed into a character and a combining diacritic, which is then properly positioned by a mark feature without having to care for the kerning (I think that is at least true for those Pali characters). Just stop taking end users for dummies and educate them instead of prohibiting them from exercising their “fundamental rights”.
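As a rough analogy for the ccmp step described above, Unicode canonical decomposition (NFD) performs the same kind of split at the character level: a precomposed letter becomes a base letter plus a combining mark. The sketch below uses Python's standard `unicodedata` module on three letters used in Pali romanization; it illustrates only the decomposition idea, not an actual font feature, and the helper name `decompose` is made up for this example.

```python
import unicodedata

def decompose(text):
    """For each character, list the code points of its NFD form
    (base letter followed by any combining marks)."""
    return [
        (ch, [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)])
        for ch in text
    ]

# a with macron, t with dot below, m with dot below
for ch, parts in decompose("\u0101\u1e6d\u1e43"):
    print(ch, parts)
```

In a font, the base glyph and the mark glyph produced by such a split would then be aligned through anchors, which is why the kerning of the base letter does not need to be redone.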

Thomas Phinney's picture

> It seems to me that there should be no reason for confusion here.

> An undercomma is one accent, a cedilla is another.

Yes, but... unfortunately, errors in early versions of Unicode caused problems which echo forward to this day.

The problem is nothing to do with OpenType as such, but OpenType offers some interesting options for how to deal with it.
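One place where this history is still visible is the Unicode character database itself. The short Python sketch below, which uses only the standard `unicodedata` module, prints the official name and canonical decomposition of a few of the code points from the Microsoft list quoted at the top of the thread. The selection of code points and the helper name `describe` are choices made for this example.

```python
import unicodedata

# A few code points from the Microsoft list quoted at the start of the thread.
CODE_POINTS = [0x015F, 0x0219, 0x0163, 0x021B, 0x00E7, 0x0137, 0x013C]

def describe(cp):
    """Return (official character name, canonical decomposition) for a code point."""
    ch = chr(cp)
    return unicodedata.name(ch), unicodedata.decomposition(ch)

for cp in CODE_POINTS:
    name, decomp = describe(cp)
    print(f"U+{cp:04X}  {name:<40}  {decomp}")
```

The Romanian letters U+0219 and U+021B decompose to a base letter plus U+0326 (combining comma below), while U+0137 and U+013C are named "WITH CEDILLA" and decompose to U+0327 (combining cedilla), even though the Microsoft text treats an under comma as acceptable for them. That may be part of why some fonts draw a cedilla in glyphs named *commaaccent.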

quadibloc's picture

Incidentally, I have been looking for information on old Portuguese accent marks, but so far I have not found out what term they used for the ogonek-like accent on the e.

Igor Freiberger's picture

Where did you find ogonek-like accent on the e?

Igor Freiberger's picture

As I said above, Brazilians would recognize diverse shapes under c as cedilla, but as somewhat stylized one. The traditional, correct, and widely used form is the connected, hook-like cedilla.

quadibloc's picture

Here is another image from that ancient book; not only is the unusual accent mark on the e shown (De quem e* esta), but another unusual thing appears... an eszett or two.

In two places, ss becomes an eszett, and in a third, it becomes two long s characters in a row instead.

hrant's picture

I wonder if they had sorts from Germany, Poland and who knows where else mixed in, and used them indiscriminately because they were close enough.


Igor Freiberger's picture

This is an e with acute (é). Ogonek was never used in Portuguese.

So why was it printed with an ogonek?

Portuguese is the youngest of the Latin-based languages, and Portugal was a peripheral country until the great navigations started in the mid-1400s. As a consequence, 15th-century Portuguese still shows inconsistencies in orthography that had already been resolved in other languages by that time. And up to the 17th century we find improvisations in many printed documents due to a lack of proper fonts.

During this period, cedillas and tildes were available in insufficient numbers or were simply absent. To produce them, printers used commas (under the c to get ç) and lowercase l (over a and o to get ã and õ). This explains the samples where the cedilla appears disconnected.

The sample above is also an adaptation using a font without Portuguese support. Besides the e with ogonek and the ss inconsistency, there are also different acutes over o and a. As the language had no direct connection with German or Polish, it seems they printed this document as best they could.

Igor Freiberger's picture

Se preguntássem. De quem é esta árte de grammática?
pódesse responder, do principe nósso Senhor.

Modern Portuguese:
Se perguntassem: de quem é esta arte de gramática?
Pode-se responder: do príncipe nosso Senhor.

If they ask – From whom is this grammatical art?
it could be answered – From our Lord, the Prince.

Besides the typographical issues, one can also note the inconsistent use of punctuation and the word "perguntar" (to ask) still written as in Spanish, "preguntar". This shows we are handling not only a piece in archaic Portuguese, but a language still in formation, with conventions and lexicography that were far from settled.

quadibloc's picture

Ah. I had been looking around in other Portuguese books of the time period, and found no other instances of this accent; here is another piece from the book in question that does have it, with multiple examples:

Seeing it in verbo, governa, rege, terra, republica, and Te made me suspect that something such as pied type from Poland was going on. Only rege and Te seemed like they might even get an accented e, though that was just wild guessing on my part.
