Google now has the reading abilities of a teenager and can read f-ligatures: “[T]he characters fi can... be represented as two characters (f and i) or a special display form ﬁ. A Google search for [financials] or [office] used to not see these as equivalent – to the software they would just look like *nancials and of*ce. There are thousands of characters like this, and they occur in surprisingly many pages on the Web, especially generated PDF documents.”
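To make the equivalence concrete, here is a minimal sketch of how the single-character ligature (U+FB01) can be folded back into a plain "f" followed by "i" using Unicode compatibility normalization (NFKC) from Python's standard library. This only illustrates the general technique; the quote does not say how Google implements it, and the helper name `fold_ligatures` is made up for this example.

```python
import unicodedata


def fold_ligatures(text: str) -> str:
    """Hypothetical helper: map compatibility characters such as the
    fi ligature (U+FB01) to their plain equivalents via NFKC."""
    return unicodedata.normalize("NFKC", text)


# "\ufb01" is the single-character fi ligature often found in generated PDFs.
with_ligature = "\ufb01nancials"
plain = "financials"

# Without normalization the two strings differ, so a naive search misses.
print(with_ligature == plain)

# After normalization they compare equal.
print(fold_ligatures(with_ligature) == plain)
print(fold_ligatures("of\ufb01ce"))
```

Note that NFKC folds many other compatibility characters as well (full-width letters, superscript digits and so on), which may or may not be desirable for a given search index.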
