I have YAML data like the input below, and I need the output as key-value pairs.

Input

a="""
--- !ruby/hash:ActiveSupport::HashWithIndifferentAccess
code:
- '716'
- '718'
id:
- 488
- 499
"""

Output needed

{'code': ['716', '718'], 'id': [488, 499]}

The default constructor was giving me an error. I tried adding a new constructor, and now it no longer raises an error, but I am still not able to get key-value pairs. FYI, if I remove the !ruby/hash:ActiveSupport::HashWithIndifferentAccess line from my YAML, I get the desired output.

def new_constructor(loader, tag_suffix, node):

     if type(node.value)=='list':
         val=''.join(node.value)
     else:
         val=node.value
     val=node.value
     ret_val="""
     {0}
     """.format(val)
     return ret_val

yaml.add_multi_constructor('', new_constructor)
yaml.load(a)

Output

"\n     [(ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'code'), SequenceNode(tag=u'tag:yaml.org,2002:seq', value=[ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'716'), ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'718')])), (ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'id'), SequenceNode(tag=u'tag:yaml.org,2002:seq', value=[ScalarNode(tag=u'tag:yaml.org,2002:int', value=u'488'), ScalarNode(tag=u'tag:yaml.org,2002:int', value=u'499')]))]\n     "

Please suggest.

2 Answers

1
votes

This is not a solution using PyYAML; I recommend using ruamel.yaml instead. If for no other reason, it is more actively maintained than PyYAML. A quote from its overview:

Many of the bugs filed against PyYAML, but that were never acted upon, have been fixed in ruamel.yaml

To load that string, you can do:

import ruamel.yaml
parser = ruamel.yaml.YAML()

obj = parser.load(a)  # as defined above.
1
votes

I strongly recommend following @Andrew F's answer, but in case you wonder why your code did not produce the proper result: you do not correctly process the node under the tag in your tag handler.

Although the node's value is a list (of tuples with key value pairs), you should test for the type of the node itself (using isinstance) and then hand it over to the "normal" mapping processing routine as the tag is on a mapping:

import yaml
from yaml.loader import SafeLoader

a = """\
--- !ruby/hash:ActiveSupport::HashWithIndifferentAccess
code:
- '716'
- '718'
id:
- 488
- 499
"""

def new_constructor(loader, tag_suffix, node):
    if isinstance(node, yaml.nodes.MappingNode):
        return loader.construct_mapping(node, deep=True)
    raise NotImplementedError

yaml.add_multi_constructor('', new_constructor, Loader=SafeLoader)


data = yaml.load(a, Loader=SafeLoader)
print(data)

which gives:

{'code': ['716', '718'], 'id': [488, 499]}
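As for why the check in the question never fired: type(node.value)=='list' compares a type object with a string, which is always False, so the else branch always ran. A minimal stand-alone sketch in plain Python (no PyYAML involved; the sample value list is made up to stand in for node.value):

```python
# Illustrative stand-in for node.value of a mapping node: a list of pairs.
value = [("code", ["716", "718"]), ("id", [488, 499])]

# Comparing a type object to the string 'list' is always False.
string_check = type(value) == 'list'

# isinstance tests the actual type of the object.
instance_check = isinstance(value, list)

print(string_check, instance_check)
```

Even with the check fixed, joining or formatting node.value only yields the raw node objects, which is why handing the node to construct_mapping is the better route.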

You should not use PyYAML's yaml.load() with its default loader; it is documented to be potentially unsafe and, above all, it is not necessary. Just add the new constructor to the SafeLoader.
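If you would rather not modify the shared SafeLoader, or catch every unknown tag with the empty prefix, one option is to register the constructor on a subclass and only for the Ruby hash prefix. This is a sketch, assuming every tag you care about starts with !ruby/hash: and sits on a mapping; the names RubyHashLoader and construct_ruby_hash are made up for this example:

```python
import yaml
from yaml.constructor import ConstructorError
from yaml.nodes import MappingNode


class RubyHashLoader(yaml.SafeLoader):
    """SafeLoader subclass, so the stock SafeLoader stays unmodified."""


def construct_ruby_hash(loader, tag_suffix, node):
    # tag_suffix is whatever follows the registered prefix,
    # e.g. 'ActiveSupport::HashWithIndifferentAccess'.
    if not isinstance(node, MappingNode):
        raise ConstructorError(
            None, None,
            "expected a mapping node for !ruby/hash:%s" % tag_suffix,
            node.start_mark)
    # Build a plain dict, constructing nested sequences eagerly.
    return loader.construct_mapping(node, deep=True)


# Only tags starting with '!ruby/hash:' are routed to the handler.
RubyHashLoader.add_multi_constructor('!ruby/hash:', construct_ruby_hash)

a = """\
--- !ruby/hash:ActiveSupport::HashWithIndifferentAccess
code:
- '716'
- '718'
id:
- 488
- 499
"""

data = yaml.load(a, Loader=RubyHashLoader)
print(data)
```

Any other unknown tag still raises a ConstructorError with this loader, which is usually what you want when parsing input you do not fully control.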