How do you find a URL in a normal text and turn it into a HTML link using regular expressions? That was the challenge I faced recently. Oh - and of course not only well formed url’s (with the https:// in front of them, but any kind of url.
Why invent something, when there’s google to search? I found Ben Forta’s “How to match a URL” but unfortunately it was way to forgiving in what it parsed, so I had to expand it a bit. Here are the results of a couple of hours of labor: UPDATE: There was a nasty bug, that truncated all urls where the host started with the letters of one of the TLD’s to the reminder … www.invisible.ch got truncated to visible.ch. The code below is corrected and simplified a bit (removed a couple of unneeded groups) ([\s]|^|<p>) (https?://)? (([\w]+?[-\w\.]+?\.)+ (a[cdefgilmnoqrstuwz]|b[abdefghijmnorstvwyz]| c[acdfghiklmnoruvxyz]|d[ejkmnoz]|e[ceghrst]| f[ijkmnor]|g[adefghilmnpqrstuwy]|h[kmnrtu]| i[delmnoqrst]|j[emop]|k[eghimnprwyz]| l[abcikrstuvy]|m[acdghklmnopqrstuvwxyz]| n[acefgilopruz]|om|p[aefghklmnrstwy]|qa| r[eouw]|s[abcdeghijklmnortvyz]|t[cdfghjkmnoprtvwz]| u[ugkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]| com|edu|mil|gov|org|net|int|info|biz|name|pro |museum|aero|coop){1})+ (:\d+)? (/([\w/_\.]*(\?\S+)?)?)? ([\s]|</p>|<br />|$) and then replace it with: $1>a href="$2$3$6$7"<$2$3$6$7>/a<$10
...