3
votes

I am creating a StructType from the schema of a custom Java class, from which I can extract each column's name and data type.

From what I know, there are two ways to construct a StructType:

  1. Use the add method
  2. Use the constructor, passing in an array of StructField

I can use either approach, since I loop through my custom schema class and extract the fields one by one. However, it seems that the add method creates a new StructType each time it is called, which looks like an unnecessarily complicated way of handling this. Does it really create a new object on every call? If not, I figure add is a better choice than building a new ArrayList of StructField.

1 Answer

6
votes

If you check the source code of the StructType class, you will see that the add method builds a new StructType from the existing fields plus a new StructField, so it does create a new StructType on every call.

def add(name: String, dataType: DataType): StructType = {
    StructType(fields :+ new StructField(name, dataType, nullable = true, Metadata.empty))
}

You can verify this with the sample program below.

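import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class QuickTest {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession
                .builder()
                .appName("QuickTest")
                .master("local[*]")
                .getOrCreate();

        // StructType with a single field
        StructType st1 = new StructType().add("name", DataTypes.StringType);
        System.out.println("hashCode " + st1.hashCode());
        System.out.println("structType " + st1.toString());

        // add without assigning: the returned StructType is discarded, so st1 is unchanged
        st1.add("age", DataTypes.IntegerType);
        System.out.println("hashCode " + st1.hashCode());
        System.out.println("structType " + st1.toString());

        // add and assign: st2 is a new StructType with both fields
        StructType st2 = st1.add("age", DataTypes.IntegerType);
        System.out.println("hashCode " + st2.hashCode());
        System.out.println("structType " + st2.toString());

        // constructor with an array of StructField
        // (Metadata.empty() rather than null, which is what add uses internally)
        StructType st3 = new StructType(new StructField[] {
                new StructField("name", DataTypes.StringType, true, Metadata.empty()),
                new StructField("age", DataTypes.IntegerType, true, Metadata.empty())
        });
        System.out.println("hashCode " + st3.hashCode());
        System.out.println("structType " + st3.toString());

        sparkSession.stop();
    }
}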
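
To see what this means for the loop described in the question, here is a minimal sketch that runs without a Spark dependency. `MiniSchema` is a hypothetical stand-in written for this illustration, not a Spark class; it only imitates the copy-on-add behavior of the Scala snippet above. The names `AddVersusArray`, `buildWithAdd` and `buildWithArray` are likewise made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AddVersusArray {

    // Hypothetical stand-in for StructType; NOT part of Spark.
    static final class MiniSchema {
        private final String[] fields;

        MiniSchema(String... fields) {
            this.fields = fields.clone();
        }

        // Imitates copy-on-add: copies the fields into a longer array
        // and returns a new instance, leaving this one untouched.
        MiniSchema add(String field) {
            String[] copy = Arrays.copyOf(fields, fields.length + 1);
            copy[fields.length] = field;
            return new MiniSchema(copy);
        }

        int size() {
            return fields.length;
        }

        @Override
        public String toString() {
            return Arrays.toString(fields);
        }
    }

    // Approach 1: call add in a loop. The result must be reassigned on
    // every iteration, and each call allocates a new schema object.
    static MiniSchema buildWithAdd(List<String> names) {
        MiniSchema schema = new MiniSchema();
        for (String name : names) {
            schema = schema.add(name);
        }
        return schema;
    }

    // Approach 2: collect the fields first, then construct the schema once.
    static MiniSchema buildWithArray(List<String> names) {
        List<String> collected = new ArrayList<>(names);
        return new MiniSchema(collected.toArray(new String[0]));
    }

    public static void main(String[] args) {
        MiniSchema base = new MiniSchema("name");
        MiniSchema extended = base.add("age");
        System.out.println(base);             // [name]
        System.out.println(extended);         // [name, age]
        System.out.println(base != extended); // true

        List<String> names = Arrays.asList("name", "age");
        System.out.println(buildWithAdd(names));   // [name, age]
        System.out.println(buildWithArray(names)); // [name, age]
    }
}
```

Both approaches end with the same fields. The difference is that the add loop creates one intermediate object per field and silently does nothing if you forget to reassign the result, whereas the collect-then-construct version allocates the schema once. For a schema with a modest number of columns, the extra allocations are unlikely to matter, so either style should be fine.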