Stata read numeric data as string using variable names

Question

I am reading a csv file into Stata using

import delimited "../data_clean/winter20.csv", encoding(UTF-8)

The raw data looks like:

y             id1
-.7709586   000000000020
-.4195721   000000003969
-.8932499   300000000021
-1.256116   200000007153
-.7858037   000000000000

The imported data become:

y             id1
-.7709586   20
-.4195721   000000003969
-.8932499   300000000021
-1.256116   200000007153
-.7858037   0

However, some ID columns are read as numeric. I would like to import them as strings, so that the data are read exactly as they appear in the raw file.

The way I found online is:

import delimited "/Users/tianwang/Dropbox/Construction/data_clean/winter20.csv", encoding(UTF-8) stringcols(74 97 116) clear

However, the raw data may be updated and column numbers may change. The following

import delimited "/Users/tianwang/Dropbox/Construction/data_clean/winter20.csv", encoding(UTF-8) stringcols(id1 id2 id3) clear

gives the error "id1: invalid numlist in stringcols() option". Is there a way to specify variable names rather than column numbers?

The reason is that leading zeros are lost if I read the IDs as numeric. The tostring command does not recover the leading zeros, and format id1 %09.0f only works if all values have the same number of digits.

Could you please also tell us your current Stata version and OS? — Álvaro A. Gutiérrez-Vargas
Sorry, I just saw the message. Mine is Stata/MP 16.1 for Mac (64-bit Intel) — Tian

Álvaro A. Gutiérrez-Vargas · Accepted Answer · 2020-11-13T17:25:11

I think this should do it.

import delimited "../data_clean/winter20.csv", stringcols(_all) encoding(UTF-8)  clear

PS: Tested in Stata16/Win10
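
Note that stringcols(_all) imports every column as a string, including genuinely numeric ones such as y. A possible follow-up, sketched below and not tested against the actual file, is to convert everything except the ID columns back to numeric by name. It assumes the ID variables are called id1, id2 and id3, as in the question; adjust the list to match the real file.

```stata
* Import every column as a string so leading zeros in the IDs survive
import delimited "../data_clean/winter20.csv", stringcols(_all) encoding(UTF-8) clear

* List all variables EXCEPT the ID columns
* (id1 id2 id3 are the names used in the question; an assumption here)
ds id1 id2 id3, not

* Convert the remaining variables back to numeric where possible.
* destring leaves a variable unchanged, with a note, if it contains
* nonnumeric characters, so true text columns stay as strings.
destring `r(varlist)', replace
```

Because the ID columns are selected by name rather than by position, this should keep working if the column order in the raw file changes.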
