Well, I had a lot of fun writing this little parsing class using Regex
and named capture groups (?<Name>group)
.
My approach was that each 'type definition' string could be broken up as a set of the following: Type Name, optional Generic Type, and optional array marker '[ ]'.
So given the classic Dictionary<string, byte[]>
you would have Dictionary
as the type name and string, byte[]
as your inner generic type string.
We can split the inner generic type on the comma (',') character and recursively parse each type string using the same Regex
. Each successful parse should be added to the parent type information and you can build a tree hierarchy.
With the previous example, we would end up with an array of {string, byte[]}
to parse. Both of these are easily parsed and set to part of Dictionary
's inner types.
On ToString()
it's simply a matter of recursively outputting each type's friendly name, including inner types. So Dictionary
would output his type name, and iterate through all inner types, outputting their type names and so forth.
class TypeInformation
{
static readonly Regex TypeNameRegex = new Regex(@"^(?<TypeName>[a-zA-Z0-9_]+)(<(?<InnerTypeName>[a-zA-Z0-9_,\<\>\s\[\]]+)>)?(?<Array>(\[\]))?$", RegexOptions.Compiled);
readonly List<TypeInformation> innerTypes = new List<TypeInformation>();
public string TypeName
{
get;
private set;
}
public bool IsArray
{
get;
private set;
}
public bool IsGeneric
{
get { return innerTypes.Count > 0; }
}
public IEnumerable<TypeInformation> InnerTypes
{
get { return innerTypes; }
}
private void AddInnerType(TypeInformation type)
{
innerTypes.Add(type);
}
private static IEnumerable<string> SplitByComma(string value)
{
var strings = new List<string>();
var sb = new StringBuilder();
var level = 0;
foreach (var c in value)
{
if (c == ',' && level == 0)
{
strings.Add(sb.ToString());
sb.Clear();
}
else
{
sb.Append(c);
}
if (c == '<')
level++;
if(c == '>')
level--;
}
strings.Add(sb.ToString());
return strings;
}
public static bool TryParse(string friendlyTypeName, out TypeInformation typeInformation)
{
typeInformation = null;
// Try to match the type to our regular expression.
var match = TypeNameRegex.Match(friendlyTypeName);
// If that fails, the format is incorrect.
if (!match.Success)
return false;
// Scrub the type name, inner type name, and array '[]' marker (if present).
var typeName = match.Groups["TypeName"].Value;
var innerTypeFriendlyName = match.Groups["InnerTypeName"].Value;
var isArray = !string.IsNullOrWhiteSpace(match.Groups["Array"].Value);
// Create the root type information.
TypeInformation type = new TypeInformation
{
TypeName = typeName,
IsArray = isArray
};
// Check if we have an inner type name (in the case of generics).
if (!string.IsNullOrWhiteSpace(innerTypeFriendlyName))
{
// Split each type by the comma character.
var innerTypeNames = SplitByComma(innerTypeFriendlyName);
// Iterate through all inner type names and attempt to parse them recursively.
foreach (string innerTypeName in innerTypeNames)
{
TypeInformation innerType = null;
var trimmedInnerTypeName = innerTypeName.Trim();
var success = TypeInformation.TryParse(trimmedInnerTypeName, out innerType);
// If the inner type fails, so does the parent.
if (!success)
return false;
// Success! Add the inner type to the parent.
type.AddInnerType(innerType);
}
}
// Return the parsed type information.
typeInformation = type;
return true;
}
public override string ToString()
{
// Create a string builder with the type name prefilled.
var sb = new StringBuilder(this.TypeName);
// If this type is generic (has inner types), append each recursively.
if (this.IsGeneric)
{
sb.Append("<");
// Get the number of inner types.
int innerTypeCount = this.InnerTypes.Count();
// Append each inner type's friendly string recursively.
for (int i = 0; i < innerTypeCount; i++)
{
sb.Append(innerTypes[i].ToString());
// Check if we need to add a comma to separate from the next inner type name.
if (i + 1 < innerTypeCount)
sb.Append(", ");
}
sb.Append(">");
}
// If this type is an array, we append the array '[]' marker.
if (this.IsArray)
sb.Append("[]");
return sb.ToString();
}
}
I made a console app to test it, it seems to work with most cases I threw at it.
Here's the code:
class MainClass
{
static readonly int RootIndentLevel = 2;
static readonly string InputString = @"BogusClass<A,B,Vector<C>>";
public static void Main(string[] args)
{
TypeInformation type = null;
Console.WriteLine("Input = {0}", InputString);
var success = TypeInformation.TryParse(InputString, out type);
if (success)
{
Console.WriteLine("Output = {0}", type.ToString());
Console.WriteLine("Graph:");
OutputGraph(type, RootIndentLevel);
}
else
Console.WriteLine("Parsing error!");
}
static void OutputGraph(TypeInformation type, int indentLevel = 0)
{
Console.WriteLine("{0}{1}{2}", new string(' ', indentLevel), type.TypeName, type.IsArray ? "[]" : string.Empty);
foreach (var innerType in type.InnerTypes)
OutputGraph(innerType, indentLevel + 2);
}
}
And here's the output:
Input = BogusClass<A,B,Vector<C>>
Output = BogusClass<A, B, Vector<C>>
Graph:
BogusClass
A
B
Vector
C
There are some possible lingering issues, such as multidimensional arrays. It will more than likely fail on something like int[,]
or string[][]
.
"Vector.<T>"
=>"System.Collections.Generic.List<T>"
. It will function as follows. Given a literal string type from the client of"Vector.<int>"
, that will match the client-type pattern"Vector.<T>"
which is paired with server-type pattern"List<T>"
, so that it resolves to"List<int>"
. – Triynko