How does DataWeave know what reader/writer to use? (Mule 4)

Question

I have a problem understanding a very basic topic with DW: I've read https://docs.mulesoft.com/mule-user-guide/v/4.1/dataweave-formats but to me it does not explain what exactly is meant by "DataWeave can read and write many types of data formats", that is:

1) At what point does DW decide that it is "reading", say, JSON input?

2) How exactly is that decision made, i.e., what in a Mule message determines that input should be read as JSON (payload type? attributes?)?

3) At what point does DW "write", say, JSON output?

4) What exactly does a Mule message that is created from, say, JSON output of a DW script look like (payload type? attributes?)?

Mariano de Achaval · Accepted Answer · 2018-03-23T15:16:07

I'm going to try to explain how DW works inside Mule:

Input Part

Mule has a special object called TypedValue<T>. This class represents a pair of a value and its DataType, where DataType is itself a pair of a Java class and a MIME type.
All variables and the payload are TypedValues, and TypedValues may also appear in more deeply nested places. For example, the list operation in File returns a List, so the payload is a TypedValue wrapping a list whose elements each carry their own DataType. This allows us to list files of different types (JSON, XML, etc.) in one operation and have DW read them all.

DW uses the DataType part to determine which reader to use, based on the MIME type, and how to configure that reader (encoding, reader properties), based on the properties of the MIME type.
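
As a minimal sketch of the input side (the fragment, values and variable name are made up for illustration): the mimeType attribute of set-payload sets the MIME type on the payload's DataType, and that is what DW consults when the payload is later used in an expression.

```xml
<!-- Hypothetical fragment: the payload is a plain string, tagged as JSON via mimeType. -->
<set-payload value='{"foo": "bar"}' mimeType="application/json"/>

<!-- When this expression runs, DW looks at the payload's DataType (application/json),
     picks the JSON reader, and can therefore select the "foo" field. -->
<set-variable variableName="foo" value="#[payload.foo]"/>
```

If the MIME type on the payload is missing or wrong, the reader can also be chosen explicitly inside the script with the read function, for example #[read(payload, 'application/json').foo].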

Output Part

DW always outputs a TypedValue. The interesting part is how DW infers the DataType part, which drives what writer to use.

1) If the user specifies it in the script with the output directive, then it's easy.
2) If the script being executed is assigned to a Message Processor field, the engine gives DW a hint of the expected type based on the metadata of that field. E.g. if it is a POJO, DW knows which class to instantiate and that it needs to use the Java writer, so the user doesn't need to know all that internal stuff.
3) The interesting part is when we don't know the expected type, for example with a set-payload. Then the logic is like this:
- DW looks at the script and sees which inputs are being used. If they are all of the same or compatible DataTypes, it uses that DataType. This means that if your script is <set-payload value="#[payload.foo]"/>, we look at the type of payload, and if payload is JSON we use the JSON writer. If more than one input is used and they have different DataTypes, an error is thrown, e.g. <set-payload value="#[payload.foo ++ vars.bar]"/> with vars.bar of type XML and payload of type JSON. Also, especially with XML, an expression in a set-payload may fail because the result ends up being invalid XML (e.g. multiple roots).
- If no input is used, the Java writer is used. So <set-payload value="#[{a: true}]"/> outputs a java.util.Map with an entry ("a", true).
4) For the Logger message processor we did something special to avoid errors when logging. We try the logic under #3, but if it fails because the writer cannot emit that data structure, we use the DataWeave writer, which can write any possible data structure, since the DW language is basically a superset of all the formats it handles (it contains all their features: objects, arrays, namespaces, numbers, strings, etc.).
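
A hedged sketch of the output side, with made-up field names: stating the output directive explicitly takes the first path described above, so DW does not have to infer the writer from the inputs. That should also avoid the error raised when inputs with different DataTypes are mixed.

```xml
<!-- Explicit output directive: the JSON writer is selected,
     even if payload and vars.bar carry different MIME types. -->
<set-payload value="#[output application/json --- {foo: payload.foo, bar: vars.bar}]"/>

<!-- The same idea in a Transform Message component
     (assumes the ee namespace is declared in the Mule configuration file). -->
<ee:transform>
    <ee:message>
        <ee:set-payload><![CDATA[%dw 2.0
output application/json
---
{ a: true }]]></ee:set-payload>
    </ee:message>
</ee:transform>
```

Without the output directive, the second script uses no inputs, so by the rules above it would produce a java.util.Map instead of JSON.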

Hope this explains it
