Slugify in a shell script

When constructing readable file names or URLs, it’s often useful to “slugify” a string, so that it consists only of alphanumerics separated by dashes. For instance, you may have a string like this:

Linux clover 2.6.19-gentoo-r5 i686 Genuine Intel(R) CPU T2050 @ 1.60GHz

It has uppercase and lowercase letters, digits, brackets and other punctuation. You need to remove everything but the alphanumerics while keeping the result readable. For instance, you may want:

linux-clover-2-6-19-gentoo-r5-i686-genuine-intel-r-cpu-t2050-1-60ghz

If you append “.html” to it, it makes a very nice URL, doesn’t it?

Here’s a part of a pipe chain that slugifies strings:

sed -e 's/[^[:alnum:]]/-/g' | tr -s '-' | tr A-Z a-z
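To see what each stage of that pipeline contributes, it can be run one step at a time on the example string from above. This is only a sketch; the names `input`, `stage1`, `stage2` and `stage3` are made up for illustration:

```shell
input='Linux clover 2.6.19-gentoo-r5 i686 Genuine Intel(R) CPU T2050 @ 1.60GHz'

# Stage 1: replace every non-alphanumeric character with a dash.
stage1="$(printf '%s\n' "$input" | sed -e 's/[^[:alnum:]]/-/g')"

# Stage 2: squeeze each run of dashes into a single dash.
stage2="$(printf '%s\n' "$stage1" | tr -s '-')"

# Stage 3: lowercase the ASCII letters.
stage3="$(printf '%s\n' "$stage2" | tr A-Z a-z)"

# Print the three intermediate results, one per line.
printf '%s\n' "$stage1" "$stage2" "$stage3"
```

The first line of output still contains runs such as `R--CPU` and `T2050---1`; the second has them squeezed to single dashes; the third is the final slug.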

If you have a shell script and want to slugify the contents of a variable, you can do this:

SLUGIFIED="$(printf '%s' "${VARIABLE}" | sed -e 's/[^[:alnum:]]/-/g' \
| tr -s '-' | tr A-Z a-z)"

Note that WordPress likes to mess up quotes. The ones in the commands above are meant to be plain ASCII single and double quotes, not typographic ones.
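If the same thing is needed in several places, the pipeline could be wrapped in a function. The sketch below is one possible way to do it: the function name `slugify` is hypothetical, and the final `sed` step, which trims a leading or trailing dash, is an extra that the pipeline above does not do.

```shell
# slugify: hypothetical helper name, not part of the original one-liner.
# Prints the slug of its first argument on standard output.
slugify() {
    printf '%s\n' "$1" \
        | sed -e 's/[^[:alnum:]]/-/g' \
        | tr -s '-' \
        | tr A-Z a-z \
        | sed -e 's/^-//' -e 's/-$//'   # extra step: trim a leading/trailing dash
}

VARIABLE='  Hello, World!  '
SLUGIFIED="$(slugify "${VARIABLE}")"
printf '%s\n' "${SLUGIFIED}"
```

Without the trimming step, input that starts or ends with punctuation or whitespace would produce a slug with a dash at that end.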

Author: automatthias


2 thoughts on “Slugify in a shell script”

Fiddled a lot with my locale variables, but couldn’t get either coreutils’ own tr or Perl’s uc (and others) to correctly lowercase a string with Polish diacritics. However, Tcl’s puts [string tolower {STRING}] worked right out of the box. Guess their legendary Unicode support is a serious claim.
And while I’m writing this, I just thought of Perl’s Text::Unaccent… Let’s see:
echo 'ZAŻÓŁĆ' | perl -MText::Unaccent -ne 'print(lc(unac_string("utf-8", "$_")))'
…seems to work just right.

This requires installing the Text::Unaccent CPAN module directly from CPAN or via your package manager. Either way, this solution will most probably not work with a basic Perl installation from a default OS install.

