Link

This article advocates using proper Unicode glyphs for quotes, like “” and ‘’, instead of the ASCII straight quotes "" and '', thus removing the need to escape them in code. Some word processors and text formatters (e.g. some implementations of Markdown) automatically translate straight quotes into their curved equivalents.

This got me checking which of the several apostrophe characters in Unicode is the right one to use in English for the possessive (Moe’s bar) or in French for elision (l’heure). Although the most used is the ASCII straight apostrophe U+0027 (or typewriter apostrophe), the recommended one is the punctuation apostrophe U+2019 ’, which also serves as the right (closing) single quote.

Note that there are also separate characters for the prime sign (U+2032 ′, e.g. for feet, arcminutes and minutes) and for the letter apostrophe (U+02BC ʼ), used when the apostrophe is considered a letter in some languages.
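To make the distinction concrete, here is a minimal sketch that names the code points mentioned above and converts a straight apostrophe sitting between two letters into U+2019. The helper name `curlApostrophes` and the letter range it uses (basic Latin plus a rough Latin-1/Latin Extended range) are my own assumptions for illustration, not something taken from the linked article.

```javascript
// Code points discussed above, written as escapes so they survive any file encoding.
var APOSTROPHES = {
  typewriter: "\u0027",  // ASCII straight apostrophe
  punctuation: "\u2019", // recommended apostrophe, also the right single quote
  prime: "\u2032",       // feet, arcminutes, minutes
  letter: "\u02BC"       // modifier letter apostrophe
};

// Hypothetical helper: only touches a straight apostrophe that has a letter
// on both sides, so opening and closing straight quotes are left alone.
// The letter range is a rough approximation, not a full Unicode letter test.
function curlApostrophes(text) {
  return text.replace(
    /([A-Za-z\u00C0-\u024F])\u0027(?=[A-Za-z\u00C0-\u024F])/g,
    "$1" + APOSTROPHES.punctuation
  );
}
```

For example, `curlApostrophes("Moe's bar")` turns the apostrophe into U+2019, while a string wrapped in straight single quotes is returned untouched.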

Text

Efficient accent folding in JavaScript

Accent folding, or diacritics removal, aims to replace accented letters with their English alphabet base (which is different from normalized Unicode equivalence). It has several common applications. For example:

  • sorting, from simple implementations to certain collating sequences
  • indexing and search, auto-complete, text expansion
  • URL slugs, code names, tags…

Accent folding has been well covered with some use cases like auto-complete by Carlos Bueno in this article for A List Apart, so I’ll skip further “why”s and concentrate on the “how”.

I needed to remove diacritics for sorting list items. And I wanted a fast implementation that could run often (and/or on the client side, possibly in IE) on lists with a few hundred strings.

A search led me to this implementation, which uses regexes. I took its character map and wrote my own version, which doesn’t, then ran some basic performance tests. As expected, avoiding regexes performs better.

It is pretty basic, see the code on Gist:

  1. split the string into an array of characters
  2. for each character, if there is a match in the map, then replace it with the corresponding value (and set a flag)
  3. if the flag has been set, join the array back into a string, else just return the input string
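The three steps above can be sketched as follows. This is not the Gist code itself: the function name `foldAccents` and the tiny `ACCENT_MAP` are placeholders I made up for the example, and a real map would cover far more characters.

```javascript
// Tiny illustrative map (assumption); keys are written as escapes to avoid
// any ambiguity between precomposed and decomposed characters.
var ACCENT_MAP = {
  "\u00E0": "a", // a with grave
  "\u00E7": "c", // c with cedilla
  "\u00E8": "e", // e with grave
  "\u00E9": "e", // e with acute
  "\u00F4": "o", // o with circumflex
  "\u00C9": "E"  // E with acute
};

function foldAccents(input) {
  // 1. split the string into an array of characters
  var chars = input.split("");
  var changed = false;

  // 2. replace each character found in the map and remember that we did
  for (var i = 0; i < chars.length; i++) {
    var c = chars[i];
    if (Object.prototype.hasOwnProperty.call(ACCENT_MAP, c)) {
      chars[i] = ACCENT_MAP[c];
      changed = true;
    }
  }

  // 3. only pay for the join when something was actually replaced
  return changed ? chars.join("") : input;
}
```

With this map, `foldAccents("\u00E9t\u00E9")` gives `"ete"`, and a string without accents is returned as-is.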

Afterwards, I found out about the ALA article and realized that I had taken almost the same per-character approach as its author, except that he builds a string by appending characters, while I use an array. For the sake of completeness, I added his version to the performance tests (hint: appending to a string is slower).
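To show the difference between the two variants, here is a rough side-by-side sketch. Neither function is the actual code from the ALA article or from the Gist; the names `foldWithString` and `foldWithArray` and the small `FOLD_MAP` are assumptions for illustration. Both return the same result, and which one is faster depends on the JavaScript engine, so measure before relying on either.

```javascript
// Small illustrative map (assumption).
var FOLD_MAP = { "\u00E0": "a", "\u00E7": "c", "\u00E8": "e", "\u00E9": "e" };

function hasFold(c) {
  return Object.prototype.hasOwnProperty.call(FOLD_MAP, c);
}

// Variant 1: build the result by appending to a string.
function foldWithString(input) {
  var out = "";
  for (var i = 0; i < input.length; i++) {
    var c = input.charAt(i);
    out += hasFold(c) ? FOLD_MAP[c] : c;
  }
  return out;
}

// Variant 2: replace in an array of characters, then join once at the end.
function foldWithArray(input) {
  var chars = input.split("");
  for (var i = 0; i < chars.length; i++) {
    if (hasFold(chars[i])) {
      chars[i] = FOLD_MAP[chars[i]];
    }
  }
  return chars.join("");
}
```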

Also, in my implementation, I’m using hasOwnProperty to check the presence of the character in the map, thus avoiding the side effect of (unlikely yet possible) third-party code messing with Object.prototype (for more on that, see An Object is not a Hash).
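A small sketch of the problem hasOwnProperty guards against. The `demo` function simulates third-party code by temporarily adding a property to Object.prototype, then removes it again; the names `naiveLookup`, `safeLookup` and `demo` are made up for this example.

```javascript
var ACCENTS = { "\u00E9": "e" };

// Naive lookup: also sees properties inherited from Object.prototype.
function naiveLookup(c) {
  return ACCENTS[c] !== undefined ? ACCENTS[c] : c;
}

// Safe lookup: only considers keys defined on the map itself.
function safeLookup(c) {
  return Object.prototype.hasOwnProperty.call(ACCENTS, c) ? ACCENTS[c] : c;
}

function demo() {
  // Simulate a third-party script polluting every object.
  Object.prototype["z"] = "!";
  try {
    return { naive: naiveLookup("z"), safe: safeLookup("z") };
  } finally {
    // Always clean up so the rest of the page is unaffected.
    delete Object.prototype["z"];
  }
}
```

Here `demo()` returns `{ naive: "!", safe: "z" }`: the naive version wrongly "folds" the letter z, while the hasOwnProperty version leaves it alone.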

Text

WebKit placeholder better than the W3C specs?

Update 2/15/2012: a previous version of this article was s/WebKit/Safari/g, but Chrome also implemented this behavior in v17.

Placeholder is a cool feature for HTML5 inputs and textareas that lets you specify a sample or advice text shown in the form field when the user has not typed anything in. Unlike a default value, it leaves the actual value (sent on submit) empty. The text is rendered with a distinctive look, usually a faded gray for black-on-white inputs, and it disappears when the field is focused or not empty.

Indeed, the W3C specifications explicitly state:

User agents should present this hint to the user, after having stripped line breaks from it, when the element’s value is the empty string and the control is not focused

Firefox, Opera, and the polyfill for older browsers implement this to the letter:
[Image: W3C implementation: placeholder text is displayed when empty + unfocused]
[Image: W3C implementation: placeholder text is hidden when focused]

WebKit thinks different. It waits for the user to actually type in something before removing the placeholder text:
[Image: WebKit implementation: placeholder text is still displayed when empty + focused]
[Image: WebKit implementation: placeholder text is hidden only when text is entered]

This behavior can be useful, because a focused field doesn’t always mean the user remembers the placeholder text. He may hit tab before reading the content of the next field. He may give focus to the field, switch to another tab or application (because of an unrelated event like an IM, or to copy/paste some information) and come back to the form. And if one would like to auto-focus the field in the first place, then the W3C-compliant implementation becomes useless.
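The two behaviors boil down to one small rule each. The sketch below models them as a pure function; the name `showsPlaceholder` and the policy labels `"w3c"` and `"webkit"` are my own shorthand, not anything defined by the specification or by the browsers.

```javascript
// Returns true when the placeholder text should be visible.
//   policy:  "w3c"    -> hidden as soon as the field is focused
//            "webkit" -> hidden only once the user has typed something
//   value:   current value of the field
//   focused: whether the field currently has focus
function showsPlaceholder(policy, value, focused) {
  if (value !== "") {
    return false; // both policies hide the hint once there is content
  }
  return policy === "webkit" ? true : !focused;
}
```

The only case where the two differ is an empty, focused field: `showsPlaceholder("w3c", "", true)` is `false`, while `showsPlaceholder("webkit", "", true)` is `true`. That is exactly the auto-focus situation described above.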

The only drawback I see to WebKit’s approach is that some users might be confused by text still being present in the field while it’s focused, thinking it is content they are unable to delete. But the rendering is different and WebKit also uses this technique in its search bar, so I guess this confusion rarely occurs.
[Image: WebKit search bar]

Also, placeholder is not to be confused with in-field labels, which serve a different purpose and shouldn’t use the same code; labels plus styling are preferable there. Yet in-field labels can (and should, even more than placeholders) use the same mechanism of waiting for user input before removing the label.

We can use the same polyfill to force this behavior, but we’re faced with a dilemma here: feature detection won’t work, since placeholder is supported by, say, Firefox. So we must either use the polyfill on all browsers (even those supporting placeholders) and waste CPU in WebKit, or do vendor detection, which we don’t like to rely upon.
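Here is a rough sketch of why feature detection alone cannot settle this. `supportsPlaceholder` uses the usual property test on a freshly created input; `shouldRunPolyfill` and the two stand-in documents are hypothetical, added only so the example runs outside a browser.

```javascript
// Usual feature test: does an <input> element expose a placeholder property?
// In a browser you would call supportsPlaceholder(document).
function supportsPlaceholder(doc) {
  return "placeholder" in doc.createElement("input");
}

// Hypothetical decision helper. The test above says nothing about *when*
// the hint is hidden, so forcing the "wait for input" behavior means
// running the polyfill everywhere, native support or not.
function shouldRunPolyfill(doc, forceOnAllBrowsers) {
  return forceOnAllBrowsers || !supportsPlaceholder(doc);
}

// Stand-ins for real documents (assumptions for the sake of the example).
var modernDoc = { createElement: function () { return { placeholder: "" }; } };
var legacyDoc = { createElement: function () { return {}; } };
```

Both a browser that hides the hint on focus and one that waits for input look identical to `supportsPlaceholder`, which is the dilemma: only the `forceOnAllBrowsers` flag (or vendor detection) changes the outcome for them.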

Link

Ever needed a quick web page to transform some text or process it with some JavaScript? Or run a regexp when you aren’t in front of your IDE? Or encode/decode that UTF-8 mangled paragraph?

Here you go: online text/javascript utilities Web page. Single input/output area, plain text only.

One-click:

  • Escape (URI-style): (un)escape / en(de)codeURI / en(de)codeURIComponent
  • UTF-8 encode/decode, Base64
  • Upper/lower/switched/camel/title case
  • Trim [lines]
  • Fix quotes
  • Word & char count

Hash

  • MD4, MD5, SHA-1
  • As hexadecimal, string or base64

Find & replace

  • Regular Expression, Multiline, Case-(in)sensitive
  • Replace (single, multiple)
  • Count matches
  • Apply JavaScript function to matches
  • Split using matches & apply JavaScript function to pieces
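As an idea of what the find & replace features can look like in plain JavaScript, here is a small sketch built on String.prototype.replace, match and split. It is not the page's actual code; the helper names are invented for the example, and the patterns passed to the first two helpers are expected to carry the g flag.

```javascript
// Apply a function to every match and substitute its return value.
function applyToMatches(text, pattern, fn) {
  return text.replace(pattern, function (match) {
    return fn(match);
  });
}

// Count matches; with the g flag, match() returns all matches or null.
function countMatches(text, pattern) {
  var found = text.match(pattern);
  return found ? found.length : 0;
}

// Split on the matches, transform each piece, and glue them back together.
function applyToPieces(text, pattern, fn, glue) {
  var pieces = text.split(pattern);
  var out = [];
  for (var i = 0; i < pieces.length; i++) {
    out.push(fn(pieces[i]));
  }
  return out.join(glue);
}
```

For instance, `applyToMatches("a1b22", /\d+/g, function (s) { return String(Number(s) * 2); })` doubles every number in the text.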

JavaScript

  • Evaluate
  • Make Bookmarklet
  • Load scripts to add features or libraries (the URL can be used to load scripts, useful when bookmarking)
  • Apply a JS function to the whole input

Base conversion

  • String / hexadecimal / binary (optional spaces)
  • Parse & convert numbers (between bases 2-36)
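The base conversions map closely onto built-ins: parseInt accepts a radix from 2 to 36 and Number.prototype.toString does the reverse. The sketch below is a guess at the general idea, not the page's code; `convertBase` and `stringToHex` are names I picked. Note that parseInt is lenient and stops at the first invalid digit instead of failing.

```javascript
// Parse a number written in fromBase and write it out in toBase (2 to 36).
function convertBase(digits, fromBase, toBase) {
  var n = parseInt(digits, fromBase);
  if (isNaN(n)) {
    throw new Error("Not a valid base-" + fromBase + " number: " + digits);
  }
  return n.toString(toBase);
}

// Hexadecimal dump of a string, one UTF-16 code unit at a time,
// with an optional separator (a single space by default).
function stringToHex(text, separator) {
  var out = [];
  for (var i = 0; i < text.length; i++) {
    var hex = text.charCodeAt(i).toString(16);
    out.push(hex.length < 2 ? "0" + hex : hex);
  }
  return out.join(separator === undefined ? " " : separator);
}
```

For example, `convertBase("ff", 16, 2)` gives `"11111111"` and `stringToHex("Hi")` gives `"48 69"`.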

The code is also available on GitHub under Creative Commons BY-fr.