A linguistic diversion...

Challenge:
Find the longest (English) word with no ascending characters, and the highest proportion of descending ones. A word's score is its length in characters multiplied by the descender proportion.
For example, "gray" is 4 * 50% = 2.

("Normal" Roman structure is assumed - like no descending "z"s, please.)

In the case of a tie, the word with the highest frequency (according to Kucera) wins.

BTW, the one I myself have so far scores 5 * 80% = 4.

hhp

Sounds like fun. One question: are you considering 't' as an ascending character?

-- K.

Hmmm.
Since a font can have a "t" with a tiny head and few people would mind, I guess no. Same for "i" and "j", then.

hhp

Hold everything.

My formula is stupid - it always ends up equaling the number of descenders... So the Score is simply the number of descenders in the word (and no ascenders allowed). But the Kucera frequency is still the tie-breaker. OK? OK.
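
For anyone who'd rather let a machine do the counting, here is a minimal Python sketch of the rule. The letter sets are assumptions read off this thread ('t', 'i' and 'j' are not ascenders, 'z' does not descend), and the function names are made up for illustration.

```python
from fractions import Fraction

# Assumed letter classes, following the thread: 't', 'i' and 'j' are not
# treated as ascenders, and 'z' is not treated as a descender.
ASCENDERS = set("bdfhkl")
DESCENDERS = set("gjpqy")


def descender_count(word):
    """Return the number of descenders, or None if the word has an ascender."""
    letters = word.lower()
    if any(ch in ASCENDERS for ch in letters):
        return None  # disqualified: ascenders are not allowed
    return sum(ch in DESCENDERS for ch in letters)


def original_score(word):
    """Length times descender proportion, as first proposed."""
    d = descender_count(word)
    if d is None:
        return None
    # Exact arithmetic, so the comparison with the plain count is not
    # disturbed by floating-point rounding.
    return len(word) * Fraction(d, len(word))


for w in ("gray", "pygmy"):
    print(w, descender_count(w), original_score(w))
```

Since the length cancels out, original_score() always equals descender_count() for any eligible word, which is exactly the circularity admitted above.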

BTW, the Kucera frequency for "gray" is 80. That of my 4-point word is... zero. So practically any word with 4 descenders and no ascenders beats it. Come on, you can do it!

So, what the hell is the point? To find words that can imbalance multi-lingual settings where English is mixed directly into a non-Latin paragraph, specifically in a writing system that's "top heavy"... Just find me a juicy word and I'll show you, OK?! :-)

hhp

Heck. I'm curious.

Syzygy.

-Dy

4 descenders is the best I can come up with so far....

pygmy
pyogeny
zigzaggy
gyroscopy
gypsy
gunpapery
groggy
groggery
jiggy
jipijapa
mystagogy
pappy
peppy
poppy
puppy
puppetry
spriggy

David

Hrant --

Thinking about it last night, I realized the same thing about your formula: it's circular. A word like "gypsy," with 5 letters and 4 descenders, comes out equal in score to a word like "preapproving" with 12/4.

So, are you looking for merely a high proportion of descenders? Pygmy and gypsy are 90%. Or are you looking for a long ascenderless word? Perspicaciousness is 17, but only 2 descenders. Prepossessing is 13/3. Preapproving and preoccupying are both 12/4. That's the best combination I've managed to come up with so far.

But I've been ruling out 't'. I'll think on it some more and see what else I can come up with.

-- K.

Oops, make that 80% for pygmy and gypsy, not 90.

> Syzygy

I'll be darned - that's a word! And a rich one too, in both meaning and descender-space. But of course with a frequency of zero in Kucera.

So "gypsy" is the leader now (go David!), with a frequency of 4 ("poppy" and "puppy" are 2). And I like it best because it's not mostly just a bunch of "p"s. Mine was "pygmy", btw. I also came up with "ping-pong", but it's weak (in a number of ways).
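
The ranking rule (number of descenders first, Kucera frequency as tie-breaker) might be sketched in Python like this. The frequencies are only the ones quoted in this thread; treating a missing word as frequency 0 is an assumption, and so are the letter sets and the function name.

```python
ASCENDERS = set("bdfhkl")   # assumed: 't' is not counted, per the thread
DESCENDERS = set("gjpqy")   # assumed: no descending 'z'

# Kucera frequencies exactly as quoted in this thread, nothing more.
KUCERA = {
    "gray": 80,
    "gypsy": 4,
    "poppy": 2,
    "puppy": 2,
    "pygmy": 0,
    "syzygy": 0,
}


def rank(words, freq):
    """Sort ascender-free words by descender count, then by frequency."""
    eligible = [w for w in words if not ASCENDERS & set(w)]
    return sorted(
        eligible,
        # Words missing from the table are assumed to have frequency 0.
        key=lambda w: (sum(ch in DESCENDERS for ch in w), freq.get(w, 0)),
        reverse=True,
    )


print(rank(list(KUCERA), KUCERA))
```

With these numbers "gypsy" comes out on top and "gray" at the bottom: its frequency of 80 only matters among words with the same descender count.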

A frequency of 4 isn't great, but considering how rare descenders are in English*, it's not bad. Can anybody do a Score of 5?

* http://www.themicrofoundry.com/image/s_rome1-4.gif

> So, are you looking for merely a high proportion of descenders?

Originally I was looking for a word that has the most descenders both in number as well as proportion. So a good equation might have been d^2 * c (instead of c * d/c, the original one), where d is the number of descenders and c the number of characters. So "gray" would have scored 16 and "pygmy" would have scored 16*5 = 80.

But that's too math-y (in this context), so I figured to just settle for the number of descenders, especially since the original equation was that anyway.
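
For the record, the d^2 * c idea is a one-liner to try out. A small Python sketch, assuming the descender set g, j, p, q, y (no descending "z") and a made-up function name:

```python
DESCENDERS = set("gjpqy")  # assumed set; 'z' is not counted as descending


def weighted_score(word):
    """Return d^2 * c: d = number of descenders, c = number of characters."""
    d = sum(ch in DESCENDERS for ch in word.lower())
    c = len(word)
    return d * d * c


for w in ("gray", "pygmy"):
    print(w, weighted_score(w))
```

Unlike the plain descender count, this version also rewards length, so long words with a fair number of descenders can overtake short, dense ones.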

> Or are you looking for a long ascenderless word?

No, but it would definitely also be useful to find:
1. The longest word with no extenders at all: the word with the smallest height.
2. The longest word with the greatest proportion of extenders: the word with the greatest height.
(Non-exotic is a big plus.)

The current search is for the word with the lowest "center of gravity". The word with the highest CoG is less interesting, because it's more in line with general behavior of English. I'm looking for aberrations.

hhp

d^2*c ? Well, that would make a word like "presupposing" a winner with a score of 16*12=192.

That's presupposing, of course, that you really are interested in the overall highest score. ;-) Then again, there's Kucera. I don't know Kucera, but with a name like that, and being (I assume) a scientist, perhaps he/she had more occasion to encounter a gypsy than to do much presupposing.

-- K.

Hmmm.
Well, more reason to stick to the plain "number of descenders" thing.

> .... perhaps he/she had more occasion to encounter a gypsy than to do much presupposing.

:->

hhp

> 1. The longest word with no extenders at all: the word with the smallest height.

If you continue to accept 't', then there's instantaneousness with 17. But that suffix is a little forced. How about inattentiveness at 15; the suffix is a little less awkward here. But, okay skip that; there's uncommunicative (15) and incommensurate (14) and intermissions (13).

If you nix the 't', it gets a little trickier, but there's inconvenience (13) or recursiveness (13) or incisiveness (12). But if you're going to throw Kucera back in the mix, then all bets are off. Obviously, the longer the word, the lower it's likely to score on a frequency scale. And without access to this mysterious Kucera, I'm just stabbing in the dark.
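
The with-'t' and without-'t' searches can be sketched as a simple filter in Python. The letter sets are assumptions, the function names are hypothetical, and the sample list is just the words mentioned above, not a real lexicon.

```python
ASCENDERS = set("bdfhkl")   # assumed base set of ascenders
DESCENDERS = set("gjpqy")   # assumed set of descenders


def has_no_extenders(word, t_is_ascender=False):
    """True if every letter stays within the x-height."""
    extenders = ASCENDERS | DESCENDERS
    if t_is_ascender:
        extenders = extenders | {"t"}  # the stricter reading of 't'
    return not any(ch in extenders for ch in word.lower())


def longest_flat(words, t_is_ascender=False):
    """Longest extender-free word, or None if there is none."""
    flat = [w for w in words if has_no_extenders(w, t_is_ascender)]
    return max(flat, key=len, default=None)


SAMPLE = [
    "instantaneousness", "inattentiveness", "uncommunicative",
    "incommensurate", "intermissions",
    "inconvenience", "recursiveness", "incisiveness",
]

print(longest_flat(SAMPLE))
print(longest_flat(SAMPLE, t_is_ascender=True))
```

When several words tie on length, max() keeps the first one it meets; a frequency table would be needed to break such ties properly.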

Clearly I must have better things to do on a beautiful Saturday afternoon. It's been fun, but now you're on your own.

-- K.

Kent, great stuff! Very useful, thank you.

Kucera:
http://www.bookfinder.com/search/?st=sl&ac=sl&qi=fFZ.Lte4LiNMWb24egr592vfFZwsK4r5:2:7
It's the standard for English word frequencies. A good library should have it, or be able to get it.

hhp

BTW, one day I'm going to write some software that takes the entire English lexicon and performs searches like this instantaneously, as well as answering questions like: "What are all the words of a frequency greater than 1% that have boumas similar to 'burn'?" Or maybe even: "Show me the 'Tree of Conflict', where words of decreasing frequency are connected to all the words that have similar boumas, and so on."
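
A very rough Python sketch of what one piece of such a tool might look like. It is a big simplification: "similar bouma" is assumed here to mean "same sequence of ascender / descender / x-height letters", which is only one crude stand-in for word shape. The word list is a hypothetical sample, and no frequency data is included.

```python
ASCENDERS = set("bdfhkl")   # assumed; 't' left out, as elsewhere in the thread
DESCENDERS = set("gjpqy")   # assumed


def shape(word):
    """Crude outline: 'A' = ascender, 'D' = descender, 'x' = x-height."""
    out = []
    for ch in word.lower():
        if ch in ASCENDERS:
            out.append("A")
        elif ch in DESCENDERS:
            out.append("D")
        else:
            out.append("x")
    return "".join(out)


def similar_shapes(target, lexicon):
    """Words in the lexicon whose outline matches the target's."""
    wanted = shape(target)
    return [w for w in lexicon if w != target and shape(w) == wanted]


# Hypothetical sample words, standing in for a full lexicon.
LEXICON = ["burn", "born", "barn", "horn", "bunny", "gray"]

print(shape("burn"))
print(similar_shapes("burn", LEXICON))
```

A frequency cut-off could be added by passing in a word-to-frequency table and filtering on it before comparing shapes.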

I just need to become a cave hermit in Andorra for a couple of years; or find a sugar-mommy.

hhp

The link is dead. What's the author's full name and the title? I'll look for it.

-- K.

Strange.
Well, it's "Computational Analysis of Present-Day American English" by Henry Kucera* and W. Nelson Francis, 1967.

* There's a hacek on that "c".

BTW, there's another one for British English (which includes some interesting analytical comparisons to American English) - let me know if you want that too.

hhp

I found one with 5 descenders:

gypping

David

Here is a very strong one: popeye (kidding!)

Jacques

David "Descender" Thometz does it again!
I have a hunch there's not a good 6 (nor a more frequent -well, less infrequent- 5), so I think that's it. And "gyp" isn't bad either, since it contains only descenders!

hhp

"gyp" and "gypping" may well be good examples,
but you must remember they *are* bad words
-- slurs -- and not even of the sort which have
come back by reappropriation (yet).

I mean, if there are any part-Gypsies on typophile,
correct me if I'm wrong. If it were me, my heritage
being used as the basis of a verb for swindling would
be pretty damn offensive, no two ways about it.

-o__n

I agree. It's an offensive word. Don't use it -- I don't.

But it has 5 descenders.

David

Owen, I see what you mean, now that I know the origin of the verb "gyp"! BTW, my wife supposedly has a small amount of gypsy blood. Which of course explains a lot... KIDDING! :-)

In any case, "gypsy" is great, and it's a pretty normal word too.

hhp

quaggy & piggy