13
votes

In this context, a slug is a string that is safe to use as an identifier in URLs or CSS. For example, if you have this string:

I'd like to eat at McRunchies!

Its slug would be:

i-d-like-to-eat-at-mcrunchies

I want to know whether there's a standard way of building such strings in Drupal (or with PHP functions available from Drupal). More precisely, inside a Drupal theme.

Context: I'm modifying a Drupal theme so that the HTML of the nodes it generates includes their taxonomy terms as CSS classes on their containing div. Trouble is, some of those terms' names aren't valid CSS class names. I need to "slugify" them.

I've read that some people simply do this:

str_replace(" ", "-", $term->name)

This isn't really enough for me. It doesn't convert uppercase letters to lowercase and, more importantly, it doesn't replace non-ASCII characters (like à or é) with their ASCII equivalents. It also doesn't remove "separator strings" from the beginning and end.

Is there a function in Drupal 6 (or the PHP libs) that provides a way to slugify a string, and that can be used in the template.php file of a Drupal theme?

9 Answers

16
votes

You can use built-in Drupal functions to do this:

$string = drupal_clean_css_identifier($string);
$slug = drupal_html_class($string);

These functions (available in Drupal 7) will do the trick for you.

11
votes

I am a happy Zen theme user, so I came across this wonderful function that ships with it: zen_id_safe http://api.lullabot.com/zen_id_safe

It does not depend on any other theme function, so you can just copy it to your module or theme and use it. It is a pretty small and simple function, so I will just paste it here for convenience.

function zen_id_safe($string) {
  // Replace with a dash any run of characters that isn't a letter, a number, or a dash.
  return strtolower(preg_replace('/[^a-zA-Z0-9-]+/', '-', $string));
}
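
One thing the pasted function does not do is strip separators from the start and end of the result, which the question asks for. A minimal sketch of that extra step follows; the name example_id_safe_trimmed is hypothetical and not part of Zen, and accented letters are still replaced by dashes rather than transliterated:

```php
<?php
// Hypothetical helper (not part of Zen): same replacement as the function
// pasted above, followed by trimming of leading and trailing dashes.
function example_id_safe_trimmed($string) {
  // Replace each run of characters other than ASCII letters, digits, or dashes with one dash.
  $slug = strtolower(preg_replace('/[^a-zA-Z0-9-]+/', '-', $string));
  // Strip dashes left over at the start or end of the string.
  return trim($slug, '-');
}

echo example_id_safe_trimmed("I'd like to eat at McRunchies!"), "\n";
// i-d-like-to-eat-at-mcrunchies
```

Without the trim() call, the trailing "!" in the example would leave a dash at the end of the slug.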

11
votes

I ended up using the slug function explained here (at the end of the article, you have to click in order to see the source code).

This does what I need and a couple of things more, without needing to include external modules and the like.

Pasting the code below for easy future reference:

/**
 * Calculate a slug with a maximum length for a string.
 *
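 * @param $string
 *   The string you want to calculate a slug for.
 * @param $length
 *   The maximum length the slug can have, or -1 for no limit.
 * @param $separator
 *   The string that replaces runs of non-alphanumeric characters.
 * @return
 *   A string representing the slug.
 */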
function slug($string, $length = -1, $separator = '-') {
  // transliterate
  $string = transliterate($string);
 
  // lowercase
  $string = strtolower($string);
 
  // replace non-alphanumeric characters by separator
  $string = preg_replace('/[^a-z0-9]/i', $separator, $string);
 
  // replace multiple occurrences of separator by one instance
  $string = preg_replace('/(?:' . preg_quote($separator, '/') . ')+/', $separator, $string);
 
  // cut off to maximum length
  if ($length > -1 && strlen($string) > $length) {
    $string = substr($string, 0, $length);
  }
 
  // remove separator from start and end of string
  $string = preg_replace('/'. preg_quote($separator, '/') .'$/', '', $string);
  $string = preg_replace('/^'. preg_quote($separator, '/') .'/', '', $string);
 
  return $string;
}
 
/**
 * Transliterate a given string.
 *
 * @param $string
 *   The string you want to transliterate.
 * @return
 *   A string representing the transliterated version of the input string.
 */
function transliterate($string) {
  static $charmap;
  if (!$charmap) {
    $charmap = array(
      // Decompositions for Latin-1 Supplement
      chr(195) . chr(128) => 'A', chr(195) . chr(129) => 'A',
      chr(195) . chr(130) => 'A', chr(195) . chr(131) => 'A',
      chr(195) . chr(132) => 'A', chr(195) . chr(133) => 'A',
      chr(195) . chr(135) => 'C', chr(195) . chr(136) => 'E',
      chr(195) . chr(137) => 'E', chr(195) . chr(138) => 'E',
      chr(195) . chr(139) => 'E', chr(195) . chr(140) => 'I',
      chr(195) . chr(141) => 'I', chr(195) . chr(142) => 'I',
      chr(195) . chr(143) => 'I', chr(195) . chr(145) => 'N',
      chr(195) . chr(146) => 'O', chr(195) . chr(147) => 'O',
      chr(195) . chr(148) => 'O', chr(195) . chr(149) => 'O',
      chr(195) . chr(150) => 'O', chr(195) . chr(153) => 'U',
      chr(195) . chr(154) => 'U', chr(195) . chr(155) => 'U',
      chr(195) . chr(156) => 'U', chr(195) . chr(157) => 'Y',
      chr(195) . chr(159) => 's', chr(195) . chr(160) => 'a',
      chr(195) . chr(161) => 'a', chr(195) . chr(162) => 'a',
      chr(195) . chr(163) => 'a', chr(195) . chr(164) => 'a',
      chr(195) . chr(165) => 'a', chr(195) . chr(167) => 'c',
      chr(195) . chr(168) => 'e', chr(195) . chr(169) => 'e',
      chr(195) . chr(170) => 'e', chr(195) . chr(171) => 'e',
      chr(195) . chr(172) => 'i', chr(195) . chr(173) => 'i',
      chr(195) . chr(174) => 'i', chr(195) . chr(175) => 'i',
      chr(195) . chr(177) => 'n', chr(195) . chr(178) => 'o',
      chr(195) . chr(179) => 'o', chr(195) . chr(180) => 'o',
      chr(195) . chr(181) => 'o', chr(195) . chr(182) => 'o',
      chr(195) . chr(184) => 'o', chr(195) . chr(185) => 'u',
      chr(195) . chr(186) => 'u', chr(195) . chr(187) => 'u',
      chr(195) . chr(188) => 'u', chr(195) . chr(189) => 'y',
      chr(195) . chr(191) => 'y',
      // Decompositions for Latin Extended-A
      chr(196) . chr(128) => 'A', chr(196) . chr(129) => 'a',
      chr(196) . chr(130) => 'A', chr(196) . chr(131) => 'a',
      chr(196) . chr(132) => 'A', chr(196) . chr(133) => 'a',
      chr(196) . chr(134) => 'C', chr(196) . chr(135) => 'c',
      chr(196) . chr(136) => 'C', chr(196) . chr(137) => 'c',
      chr(196) . chr(138) => 'C', chr(196) . chr(139) => 'c',
      chr(196) . chr(140) => 'C', chr(196) . chr(141) => 'c',
      chr(196) . chr(142) => 'D', chr(196) . chr(143) => 'd',
      chr(196) . chr(144) => 'D', chr(196) . chr(145) => 'd',
      chr(196) . chr(146) => 'E', chr(196) . chr(147) => 'e',
      chr(196) . chr(148) => 'E', chr(196) . chr(149) => 'e',
      chr(196) . chr(150) => 'E', chr(196) . chr(151) => 'e',
      chr(196) . chr(152) => 'E', chr(196) . chr(153) => 'e',
      chr(196) . chr(154) => 'E', chr(196) . chr(155) => 'e',
      chr(196) . chr(156) => 'G', chr(196) . chr(157) => 'g',
      chr(196) . chr(158) => 'G', chr(196) . chr(159) => 'g',
      chr(196) . chr(160) => 'G', chr(196) . chr(161) => 'g',
      chr(196) . chr(162) => 'G', chr(196) . chr(163) => 'g',
      chr(196) . chr(164) => 'H', chr(196) . chr(165) => 'h',
      chr(196) . chr(166) => 'H', chr(196) . chr(167) => 'h',
      chr(196) . chr(168) => 'I', chr(196) . chr(169) => 'i',
      chr(196) . chr(170) => 'I', chr(196) . chr(171) => 'i',
      chr(196) . chr(172) => 'I', chr(196) . chr(173) => 'i',
      chr(196) . chr(174) => 'I', chr(196) . chr(175) => 'i',
      chr(196) . chr(176) => 'I', chr(196) . chr(177) => 'i',
      chr(196) . chr(178) => 'IJ', chr(196) . chr(179) => 'ij',
      chr(196) . chr(180) => 'J', chr(196) . chr(181) => 'j',
      chr(196) . chr(182) => 'K', chr(196) . chr(183) => 'k',
      chr(196) . chr(184) => 'k', chr(196) . chr(185) => 'L',
      chr(196) . chr(186) => 'l', chr(196) . chr(187) => 'L',
      chr(196) . chr(188) => 'l', chr(196) . chr(189) => 'L',
      chr(196) . chr(190) => 'l', chr(196) . chr(191) => 'L',
      chr(197) . chr(128) => 'l', chr(197) . chr(129) => 'L',
      chr(197) . chr(130) => 'l', chr(197) . chr(131) => 'N',
      chr(197) . chr(132) => 'n', chr(197) . chr(133) => 'N',
      chr(197) . chr(134) => 'n', chr(197) . chr(135) => 'N',
      chr(197) . chr(136) => 'n', chr(197) . chr(137) => 'N',
      chr(197) . chr(138) => 'n', chr(197) . chr(139) => 'N',
      chr(197) . chr(140) => 'O', chr(197) . chr(141) => 'o',
      chr(197) . chr(142) => 'O', chr(197) . chr(143) => 'o',
      chr(197) . chr(144) => 'O', chr(197) . chr(145) => 'o',
      chr(197) . chr(146) => 'OE', chr(197) . chr(147) => 'oe',
      chr(197) . chr(148) => 'R', chr(197) . chr(149) => 'r',
      chr(197) . chr(150) => 'R', chr(197) . chr(151) => 'r',
      chr(197) . chr(152) => 'R', chr(197) . chr(153) => 'r',
      chr(197) . chr(154) => 'S', chr(197) . chr(155) => 's',
      chr(197) . chr(156) => 'S', chr(197) . chr(157) => 's',
      chr(197) . chr(158) => 'S', chr(197) . chr(159) => 's',
      chr(197) . chr(160) => 'S', chr(197) . chr(161) => 's',
      chr(197) . chr(162) => 'T', chr(197) . chr(163) => 't',
      chr(197) . chr(164) => 'T', chr(197) . chr(165) => 't',
      chr(197) . chr(166) => 'T', chr(197) . chr(167) => 't',
      chr(197) . chr(168) => 'U', chr(197) . chr(169) => 'u',
      chr(197) . chr(170) => 'U', chr(197) . chr(171) => 'u',
      chr(197) . chr(172) => 'U', chr(197) . chr(173) => 'u',
      chr(197) . chr(174) => 'U', chr(197) . chr(175) => 'u',
      chr(197) . chr(176) => 'U', chr(197) . chr(177) => 'u',
      chr(197) . chr(178) => 'U', chr(197) . chr(179) => 'u',
      chr(197) . chr(180) => 'W', chr(197) . chr(181) => 'w',
      chr(197) . chr(182) => 'Y', chr(197) . chr(183) => 'y',
      chr(197) . chr(184) => 'Y', chr(197) . chr(185) => 'Z',
      chr(197) . chr(186) => 'z', chr(197) . chr(187) => 'Z',
      chr(197) . chr(188) => 'z', chr(197) . chr(189) => 'Z',
      chr(197) . chr(190) => 'z', chr(197) . chr(191) => 's',
      // Euro Sign
      chr(226) . chr(130) . chr(172) => 'E'
    );
  }
 
  // transliterate
  return strtr($string, $charmap);
}
 
function is_slug($str) {
  return $str == slug($str);
}

6
votes

There's also this function from Drupal 7, which you can copy to your project:

http://api.drupal.org/api/function/drupal_clean_css_identifier/7

2
votes

This might help. I find I am doing this slugging all the time now, rather than using ID numbers as unique keys in my tables.

/**
 * Class SlugMaker
 *
 * Methods to create text slugs for URLs.
 */
class SlugMaker {

    /**
     * Cleans up a string, such as a page title,
     * so it becomes a readable, valid URL segment.
     *
     * @param string $str The string to convert.
     * @return string A URL-friendly slug.
     */
    public function slugifyAlnum( $str ) {
        $str = preg_replace( '#[^0-9a-z ]#i', '', $str ); // keep letters, numbers and spaces only
        $str = preg_replace( '#( ){2,}#', ' ', $str );    // collapse adjacent spaces
        $str = trim( $str );

        return strtolower( str_replace( ' ', '-', $str ) ); // slugify
    }

    /**
     * Same as slugifyAlnum(), with the current month and year appended,
     * e.g. "-may-2010".
     *
     * @param string $str The string to convert.
     * @return string A URL-friendly slug ending in the month and year.
     */
    public function slugifyAlnumAppendMonth( $str ) {
        $val = $this->slugifyAlnum( $str );

        return $val . '-' . strtolower( date( 'M' ) ) . '-' . date( 'Y' );
    }

}

Using this and .htaccess rules means you go directly from a URL like:

/articles/my-pops-nuts-may-2010

straight through to the table lookup without having to map back from IDs (applying a suitable filter, naturally).

Append or prepend some kind of date optionally in order to enforce a degree of uniqueness as you wish.

HTH

2
votes

For Drupal 8/9 you can use Html::getClass:

$slugify = Html::getClass('A @ Stríng-that n+eeds cónvert');

Don't forget to import the class when you use it inside a module:

use Drupal\Component\Utility\Html;

1
votes

I would recommend the Transliteration module, which Pathauto uses. With it you can call the transliteration_get() function. It also handles Unicode transformation.
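
If installing a module is not an option, a comparable result can be sketched with plain PHP, assuming the optional intl extension is installed. The function name example_ascii_slug is hypothetical, and the transliteration step is skipped when intl is missing, in which case non-ASCII characters simply turn into dashes:

```php
<?php
// Hypothetical helper; relies on the optional PHP "intl" extension when present.
function example_ascii_slug($string) {
  if (function_exists('transliterator_transliterate')) {
    // Convert to Latin script, reduce to ASCII, then lowercase.
    $ascii = transliterator_transliterate('Any-Latin; Latin-ASCII; Lower()', $string);
    if ($ascii !== false) {
      $string = $ascii;
    }
  }
  // Replace each run of characters other than a-z or 0-9 with a single dash.
  $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($string));
  // Remove dashes from the start and end.
  return trim($slug, '-');
}

echo example_ascii_slug('Hello World'), "\n";
// hello-world
```

The exact output for accented input depends on whether intl is available and on the ICU version behind it, so treat this as a starting point rather than a drop-in replacement for transliteration_get().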

1
votes

This is what worked for me after a lot of trial and error, including for converting both French and German titles with special characters to a slug.

I created a custom Twig filter so I can use it like this:

{{ node.field_title.value|slug }}

It will convert:

Wärmeabgabe & Abmessungen
Typenübersicht
Montage- und Anschlussmaße

Into:

warmeabgabe--abmessungen
typenubersicht
montage--und-anschlussmasse

for example.

HOWTO: In a custom module, create a services.yml file: modules/custom/mymodule/mymodule.services.yml

services:
  mymodule.twig_extensions:
    class: Drupal\mymodule\HelperTwigExtensions
    tags:
      - { name: twig.extension }

Create the modules/custom/mymodule/src/HelperTwigExtensions.php file:

<?php

namespace Drupal\mymodule;

use Drupal\Component\Utility\Html;

/**
 * Extend Twig's Twig_Extension class to add a "slug" filter.
 */
class HelperTwigExtensions extends \Twig_Extension {

  /**
   * {@inheritdoc}
   */
  public function getName() {
    return 'mymodule.twig_extensions';
  }

  /**
   * {@inheritdoc}
   */
  public function getFilters() {
    return [
      new \Twig_SimpleFilter('slug', [$this, 'createSlug']),
    ];
  }

  /**
   * Create a slug from a string input.
   */
  public function createSlug($input) {
    // Convert most of the special characters.
    $slug = Html::getClass($input);
    $slug = strtolower($slug);
    // Convert accented text characters.
    $unwanted_array = [
      'Þ' => 'b',
      'ß' => 'ss',
      'à' => 'a',
      'á' => 'a',
      'â' => 'a',
      'ã' => 'a',
      'ä' => 'a',
      'å' => 'a',
      'æ' => 'a',
      'ç' => 'c',
      'è' => 'e',
      'é' => 'e',
      'ê' => 'e',
      'ë' => 'e',
      'ì' => 'i',
      'í' => 'i',
      'î' => 'i',
      'ï' => 'i',
      'ð' => 'o',
      'ñ' => 'n',
      'ò' => 'o',
      'ó' => 'o',
      'ô' => 'o',
      'õ' => 'o',
      'ö' => 'o',
      'ø' => 'o',
      'ù' => 'u',
      'ú' => 'u',
      'û' => 'u',
      'ü' => 'u',
      'ý' => 'y',
      'þ' => 'b',
      'ÿ' => 'y',
    ];
    $slug = strtr($slug, $unwanted_array);
    return $slug;
  }

}

0
votes

You can use preg_replace and strtolower:

preg_replace('/[^a-z]/','-', strtolower($term->name));