I want a regular expression in PHP that strips all attributes except: 'href', 'target', 'style', 'color', 'src', 'alt', 'border', 'cellpadding', 'cellspacing', 'width', 'height', 'title'.
So that these are valid attributes:
<a href=i.php>
<a href = "i.php">
<img alt= " " src ="img.png">
<p title='Desc' style=color:FFFFFF;>
but these aren't valid attributes:
<a onclick="alert('Hello');">
<div id="whatever">
<div id = "whatever">
<div id = whatever> ..etc
I tried this, but it didn't work well:
$cont = $_POST['mycontent'];
$keep = array('href', 'target', 'style', 'color', 'src', 'alt', 'border', 'cellpadding', 'cellspacing', 'width', 'height', 'title');
// Get an array of all the attributes and their values in the data string
preg_match_all('/[a-z]+\s*=/iU', $cont, $attributes);
// Loop through the attribute pairs, match them against the keep array and remove
// them from $cont if they don't exist in the array
foreach ($attributes[0] as $attribute) {
    $attributeName = stristr(trim($attribute), '=', true);
    if (!in_array($attributeName, $keep)) {
        $cont = str_replace(' ' . $attribute, '', $cont);
    }
}
Help?
DOM::removeAttribute() is the safest. – Wiktor Stribiżew
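For what it's worth, here is a minimal sketch of what the DOM-based approach from the comment might look like, using PHP's built-in DOMDocument and DOMElement::removeAttribute(). The function name stripAttributes and the wrapper div are my own assumptions for illustration, not something from the question. Note that the parser re-serializes the markup, so quoting and spacing in the output may differ from the input (for example, href=i.php comes back with double quotes).

```php
<?php
// Hypothetical helper (name is an assumption): removes every attribute
// whose name is not in the $keep whitelist, using the DOM instead of a regex.
function stripAttributes(string $html, array $keep): string
{
    $doc = new DOMDocument();

    // Real-world fragments are rarely perfectly valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration hints UTF-8 to the parser; the wrapper div gives us
    // one known element to read the result back from.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The first div in document order is the wrapper added above.
    $root = $doc->getElementsByTagName('div')->item(0);

    foreach ($root->getElementsByTagName('*') as $element) {
        // Copy the names first: removing from the live attribute map
        // while iterating over it can skip entries.
        $names = [];
        foreach ($element->attributes as $attr) {
            $names[] = $attr->nodeName;
        }
        foreach ($names as $name) {
            if (!in_array(strtolower($name), $keep, true)) {
                $element->removeAttribute($name);
            }
        }
    }

    // Serialize only the wrapper's children, i.e. the original fragment.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$keep = ['href', 'target', 'style', 'color', 'src', 'alt', 'border',
         'cellpadding', 'cellspacing', 'width', 'height', 'title'];

echo stripAttributes('<a href = "i.php" onclick="alert(\'Hello\');">link</a><div id = whatever>text</div>', $keep);
```

In the question's code you would call it as $cont = stripAttributes($_POST['mycontent'], $keep);. Because the parser handles quoted, single-quoted, and unquoted values itself, the spacing variants listed in the question need no special cases.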