I have a dict and would like to remove all the keys for which there are empty value strings.
metadata = {u'Composite:PreviewImage': u'(Binary data 101973 bytes)',
u'EXIF:CFAPattern2': u''}
What is the best way to do this?
Python 2.X
dict((k, v) for k, v in metadata.iteritems() if v)
Python 2.7 - 3.X
{k: v for k, v in metadata.items() if v}
Note that all of your keys have values. It's just that some of those values are the empty string. There's no such thing as a key in a dict without a value; if it didn't have a value, it wouldn't be in the dict.
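As a quick check of the comprehension above, here is a minimal Python 3 sketch run against data shaped like the question's. The `EXIF:Comment` entry and the `stripped` variant are assumptions added for illustration (in case whitespace-only strings should also count as empty); they are not part of the original answer.

```python
metadata = {
    'Composite:PreviewImage': '(Binary data 101973 bytes)',
    'EXIF:CFAPattern2': '',
    'EXIF:Comment': '   ',  # hypothetical whitespace-only entry
}

# Keep only entries whose value is truthy; '' is falsy and is dropped,
# but a whitespace-only string is truthy and survives.
cleaned = {k: v for k, v in metadata.items() if v}

# Variant: also treat whitespace-only strings as empty.
stripped = {k: v for k, v in metadata.items() if v.strip()}

print(cleaned)
print(stripped)
```

Both versions build a new dict and leave `metadata` untouched.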
It can get even shorter than BrenBarn's solution (and more readable I think)
{k: v for k, v in metadata.items() if v}
Tested with Python 2.7.3.
If you really need to modify the original dictionary:
empty_keys = [k for k, v in metadata.iteritems() if not v]
for k in empty_keys:
    del metadata[k]
Note that we have to make a list of the empty keys because we can't modify a dictionary while iterating through it (as you may have noticed). This is less expensive (memory-wise) than creating a brand-new dictionary, though, unless there are a lot of entries with empty values.
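A Python 3 sketch of the same in-place idea, with a contrast case showing what happens when the key list is not built first. The function name `drop_empty_in_place` is hypothetical.

```python
def drop_empty_in_place(d):
    """Delete every key whose value is falsy, mutating d itself."""
    # Collect the keys first so the dict is not resized mid-iteration.
    empty_keys = [k for k, v in d.items() if not v]
    for k in empty_keys:
        del d[k]

metadata = {'a': 'x', 'b': '', 'c': ''}
drop_empty_in_place(metadata)
print(metadata)  # {'a': 'x'}

# For contrast: deleting while iterating over the live view fails.
broken = {'a': 'x', 'b': '', 'c': ''}
try:
    for k, v in broken.items():
        if not v:
            del broken[k]
    failed = False
except RuntimeError:
    failed = True
print(failed)  # True
```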
If you want a full-featured, yet succinct approach to handling real-world data structures which are often nested, and can even contain cycles, I recommend looking at the remap utility from the boltons utility package.
After pip install boltons or copying iterutils.py into your project, just do:
from boltons.iterutils import remap
drop_falsey = lambda path, key, value: bool(value)
clean = remap(metadata, visit=drop_falsey)
This page has many more examples, including ones working with much larger objects from Github's API.
It's pure-Python, so it works everywhere, and is fully tested in Python 2.7 and 3.3+. Best of all, I wrote it for exactly cases like this, so if you find a case it doesn't handle, you can bug me to fix it right here.
BrenBarn's solution is ideal (and pythonic, I might add). Here is another (fp) solution, however:
from operator import itemgetter
dict(filter(itemgetter(1), metadata.items()))
Based on Ryan's solution, if you also have lists and nested dictionaries:
For Python 2:
def remove_empty_from_dict(d):
    if type(d) is dict:
        return dict((k, remove_empty_from_dict(v)) for k, v in d.iteritems() if v and remove_empty_from_dict(v))
    elif type(d) is list:
        return [remove_empty_from_dict(v) for v in d if v and remove_empty_from_dict(v)]
    else:
        return d
For Python 3:
def remove_empty_from_dict(d):
    if type(d) is dict:
        return dict((k, remove_empty_from_dict(v)) for k, v in d.items() if v and remove_empty_from_dict(v))
    elif type(d) is list:
        return [remove_empty_from_dict(v) for v in d if v and remove_empty_from_dict(v)]
    else:
        return d
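The recursive approach above calls itself twice per value (once in the filter, once to build the result). A possible variant that recurses once per value and then filters, written for Python 3, might look like the sketch below; the name `remove_empty` and the sample data are made up for illustration.

```python
def remove_empty(d):
    """Recursively drop falsy values from nested dicts and lists."""
    if isinstance(d, dict):
        # Clean each value once, then keep only the truthy results.
        cleaned = ((k, remove_empty(v)) for k, v in d.items())
        return {k: v for k, v in cleaned if v}
    if isinstance(d, list):
        return [v for v in map(remove_empty, d) if v]
    return d

sample = {
    'title': 'photo',
    'comment': '',
    'tags': ['', 'beach', []],
    'exif': {'pattern': '', 'nested': {'empty': ''}},
}
print(remove_empty(sample))  # {'title': 'photo', 'tags': ['beach']}
```

Like the original, this treats every falsy value as empty, so 0 and False are removed too.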
### example01 -------------------
mydict = { "alpha":0,
"bravo":"0",
"charlie":"three",
"delta":[],
"echo":False,
"foxy":"False",
"golf":"",
"hotel":" ",
}
newdict = dict([(vkey, vdata) for vkey, vdata in mydict.iteritems() if(vdata) ])
print newdict
### result01 -------------------
result01 ='''
{'foxy': 'False', 'charlie': 'three', 'bravo': '0', 'hotel': ' '}
'''
### example02 -------------------
mydict = { "alpha":0,
"bravo":"0",
"charlie":"three",
"delta":[],
"echo":False,
"foxy":"False",
"golf":"",
"hotel":" ",
}
newdict = dict([(vkey, vdata) for vkey, vdata in mydict.iteritems() if(str(vdata).strip()) ])
print newdict
### result02 -------------------
result02 ='''
{'alpha': 0,
'bravo': '0',
'charlie': 'three',
'delta': [],
'echo': False,
'foxy': 'False'
}
'''
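The two examples use Python 2 syntax (iteritems, print statement). A Python 3 sketch of the same two filters, assuming the same input dict, could look like this; the variable names are chosen here for illustration.

```python
mydict = {
    "alpha": 0, "bravo": "0", "charlie": "three", "delta": [],
    "echo": False, "foxy": "False", "golf": "", "hotel": " ",
}

# Filter 1: plain truthiness drops 0, [], False and '' but keeps ' ',
# because a whitespace-only string is still truthy.
by_truthiness = {k: v for k, v in mydict.items() if v}

# Filter 2: the string form must be non-blank, so only '' and ' ' go;
# 0, [] and False survive because str() of them is non-empty.
by_stripped_text = {k: v for k, v in mydict.items() if str(v).strip()}

print(sorted(by_truthiness))     # ['bravo', 'charlie', 'foxy', 'hotel']
print(sorted(by_stripped_text))  # ['alpha', 'bravo', 'charlie', 'delta', 'echo', 'foxy']
```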
Building on the answers from patriciasz and nneonneo, and accounting for the possibility that you might want to delete keys that have only certain falsy things (e.g. '') but not others (e.g. 0), or perhaps you even want to include some truthy things (e.g. 'SPAM'), then you could make a highly specific hitlist:
unwanted = ['', u'', None, False, [], 'SPAM']
Unfortunately, this doesn't quite work, because for example 0 in unwanted evaluates to True (0 == False in Python). We need to discriminate between 0 and other falsy things, so we have to use is:
any([0 is i for i in unwanted])
...evaluates to False.
Now use it to del the unwanted things:
unwanted_keys = [k for k, v in metadata.items() if any([v is i for i in unwanted])]
for k in unwanted_keys: del metadata[k]
If you want a new dictionary, instead of modifying metadata in place:
newdict = {k: v for k, v in metadata.items() if not any([v is i for i in unwanted])}
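One caveat with the identity test: is works reliably for singletons such as None and False, but two separately created empty lists are never the same object, and whether two equal strings are the same object depends on interning, which is an implementation detail. A hedged alternative is to require both the same type and equality. The helper name `is_unwanted` and the sample data below are hypothetical.

```python
unwanted = ['', None, False, [], 'SPAM']

def is_unwanted(value, hitlist=unwanted):
    """True if value equals a hitlist entry of exactly the same type."""
    # The type check keeps 0 (int) apart from False (bool), even though 0 == False.
    return any(type(value) is type(item) and value == item for item in hitlist)

metadata = {'a': 0, 'b': '', 'c': [], 'd': 'SPAM', 'e': None, 'f': False, 'g': 'ok'}
newdict = {k: v for k, v in metadata.items() if not is_unwanted(v)}
print(newdict)  # {'a': 0, 'g': 'ok'}
```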
I read all replies in this thread and some referred also to this thread: Remove empty dicts in nested dictionary with recursive function
I originally used the solution from there and it worked great:
Attempt 1: Too Hot (not performant or future-proof):
def scrub_dict(d):
    if type(d) is dict:
        return dict((k, scrub_dict(v)) for k, v in d.iteritems() if v and scrub_dict(v))
    else:
        return d
But some performance and compatibility concerns were raised in the Python 2.7 world:
- use isinstance instead of type
- use a for loop for efficiency
- use items instead of iteritems
Attempt 2: Too Cold (Lacks Memoization):
def scrub_dict(d):
    new_dict = {}
    for k, v in d.items():
        if isinstance(v, dict):
            v = scrub_dict(v)
        if v not in (u'', None, {}):
            new_dict[k] = v
    return new_dict
DOH! This is not recursive and not at all memoizant.
Attempt 3: Just Right (so far):
def scrub_dict(d):
    new_dict = {}
    for k, v in d.items():
        if isinstance(v, dict):
            v = scrub_dict(v)
        if v not in (u'', None, {}):
            new_dict[k] = v
    return new_dict
To handle lists as well, a check if isinstance(v, list): is added, which scrubs the list using the original scrub_dict(d) implementation.
# Written as plain module-level functions (no @staticmethod), so that
# scrub_dict and scrub_list can call each other by name.
def scrub_dict(d):
    new_dict = {}
    for k, v in d.items():
        if isinstance(v, dict):
            v = scrub_dict(v)
        if isinstance(v, list):
            v = scrub_list(v)
        if v not in (u'', None, {}):
            new_dict[k] = v
    return new_dict

def scrub_list(d):
    scrubbed_list = []
    for i in d:
        if isinstance(i, dict):
            i = scrub_dict(i)
        scrubbed_list.append(i)
    return scrubbed_list
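Note that the list scrubber above keeps every list item, including dicts that become empty after scrubbing. If those should be dropped as well, one possible single-function variant is sketched below. The names `scrub` and `EMPTY`, and the choice to count empty lists as empty, are assumptions for illustration rather than part of the answer.

```python
EMPTY = ('', None, {}, [])  # assumption: empty lists count as empty too

def scrub(obj):
    """Recursively rebuild dicts and lists without empty values."""
    if isinstance(obj, dict):
        scrubbed = {k: scrub(v) for k, v in obj.items()}
        return {k: v for k, v in scrubbed.items() if v not in EMPTY}
    if isinstance(obj, list):
        return [v for v in map(scrub, obj) if v not in EMPTY]
    return obj

record = {
    'name': 'x',
    'count': 0,          # kept: 0 is not in EMPTY
    'note': '',
    'extra': {'a': None},
    'items': [{'b': ''}, {'c': 1}],
}
print(scrub(record))  # {'name': 'x', 'count': 0, 'items': [{'c': 1}]}
```

Because the test is membership in an explicit tuple rather than truthiness, 0 and False are preserved.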
I am currently writing a desktop application in Python for work. In data-entry applications there are often many fields, some of which are optional, so the user can leave them blank. For validation it is easy to grab all the entries and then discard the keys whose values are empty. The code below shows how to do that with a dictionary comprehension, keeping only the entries whose value is not blank. I use Python 3.8.3.
data = {'':'', '20':'', '50':'', '100':'1.1', '200':'1.2'}
dic = {key:value for key,value in data.items() if value != ''}
print(dic)
{'100': '1.1', '200': '1.2'}
To preserve 0 and False values but get rid of empty values you could use:
{k: v for k, v in metadata.items() if v or v == 0 or v is False}
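A small sketch of what that condition keeps, using made-up sample data. Since False == 0 in Python, the v == 0 clause already covers False (and 0.0), so the v is False clause appears to be redundant; the `simpler` variant below is included only to check that observation.

```python
metadata = {'a': 0, 'b': False, 'c': 0.0, 'd': '', 'e': None, 'f': [], 'g': 'x'}

# The condition from the answer: zeros and False survive, other falsy values go.
kept = {k: v for k, v in metadata.items() if v or v == 0 or v is False}
print(kept)  # {'a': 0, 'b': False, 'c': 0.0, 'g': 'x'}

# Same result without the explicit False check, because False == 0.
simpler = {k: v for k, v in metadata.items() if v or v == 0}
print(simpler == kept)  # True
```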
For a nested dict with mixed types of values you could use:
def remove_empty_from_dict(d):
    if isinstance(d, dict):
        return dict((k, remove_empty_from_dict(v)) for k, v in d.items()
                    if v or v == 0 or v is False and remove_empty_from_dict(v) is not None)
    elif isinstance(d, list):
        return [remove_empty_from_dict(v) for v in d
                if v or v == 0 or v is False and remove_empty_from_dict(v) is not None]
    else:
        if d or d == 0 or d is False:
            return d
Here is an option if you are using pandas:
import pandas as pd
d = dict.fromkeys(['a', 'b', 'c', 'd'])
d['b'] = 'not null'
d['c'] = '' # empty string
print(d)
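# convert `dict` to `Series` and replace any blank strings with NaN
# (depending on the pandas version, `.replace('', None)` can be treated as a
# forward fill rather than "replace with None");
# use the `.dropna()` method and
# then convert back to a `dict`
d_ = pd.Series(d).replace('', float('nan')).dropna().to_dict()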
print(d_)
Some of the methods mentioned above also drop integers and floats whose value is 0 or 0.0.
If you want to avoid that, you can use the code below (it removes only empty strings and None values from nested dictionaries and nested lists):
def remove_empty_from_dict(d):
    if type(d) is dict:
        _temp = {}
        for k, v in d.items():
            if v is None or v == "":
                pass
            elif type(v) is int or type(v) is float:
                _temp[k] = remove_empty_from_dict(v)
            elif (v or remove_empty_from_dict(v)):
                _temp[k] = remove_empty_from_dict(v)
        return _temp
    elif type(d) is list:
        return [remove_empty_from_dict(v) for v in d if ((str(v).strip() or str(remove_empty_from_dict(v)).strip()) and (v is not None or remove_empty_from_dict(v) is not None))]
    else:
        return d
In [7]: %%timeit dic = {str(i):i for i in xrange(10)}; dic['10'] = None; dic['5'] = None
...: dic = {k: v for k, v in dic.items() if v is not None}
1000000 loops, best of 7: 375 ns per loop
In [8]: %%timeit dic = {str(i):i for i in xrange(10)}; dic['10'] = None; dic['5'] = None
...: dic = dict((k, v) for k, v in dic.items() if v is not None)
1000000 loops, best of 7: 681 ns per loop
In [10]: %%timeit dic = {str(i):i for i in xrange(10)}; dic['10'] = None; dic['5'] = None
...: for k, v in dic.items():
...:     if v is None:
...:         del dic[k]
...:
10000000 loops, best of 7: 160 ns per loop
So loop-and-delete is the fastest at 160 ns; the dict comprehension takes more than twice as long at ~375 ns, and the version with a call to dict() takes nearly twice as long again at ~680 ns.
Wrapping 3 into a function brings it back down again to about 275 ns. Also, for me PyPy was about twice as fast as plain Python.
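The timings above were taken under Python 2 (xrange, and deleting while looping over items()). For anyone wanting to repeat the comparison under Python 3, here is a rough sketch using the standard timeit module. It rebuilds the dict inside each timed call, so the result does not depend on how often a setup line is re-run, at the cost of including the construction time in every measurement. The function names are hypothetical and no particular timings are implied.

```python
import timeit

def make_dict():
    d = {str(i): i for i in range(10)}
    d['10'] = None
    d['5'] = None
    return d

def comprehension(d):
    return {k: v for k, v in d.items() if v is not None}

def dict_call(d):
    return dict((k, v) for k, v in d.items() if v is not None)

def loop_delete(d):
    # list(...) snapshots the items so deleting is safe in Python 3.
    for k, v in list(d.items()):
        if v is None:
            del d[k]
    return d

RUNS = 20_000
for fn in (comprehension, dict_call, loop_delete):
    # A fresh dict per call, so every run actually sees the None entries.
    seconds = timeit.timeit(lambda: fn(make_dict()), number=RUNS)
    print(f"{fn.__name__}: {seconds / RUNS * 1e9:.0f} ns per call (includes make_dict)")
```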