
I have a CSV file that I want to use for deep learning with H2O. It contains Chinese text and datetime values, and when I finish training and save the output back to CSV, those columns no longer match the original data.

Here is a small example that shows the problem.

 In[1]: df = pd.DataFrame({'datetime':['2016-12-17 00:00:00'],'time':['00:00:30'],'month':['月'], 'weekend':['周六']})
        print(df.dtypes)
        df
out[1]: datetime    object
        time        object
        month       object
        weekend     object
        dtype: object
                      datetime      time month weekend
        0  2016-12-17 00:00:00  00:00:30     月      周六

In[2]: h2o_frame = h2o.H2OFrame(df);h2o_frame ;h2o_frame.types ;h2o_frame

C:\Users\thi\Anaconda3\lib\site-packages\h2o\utils\shared_utils.py:170: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead. data = _handle_python_lists(python_obj.as_matrix().tolist(), -1)[1]

out[2]: Parse progress: |█████████████████████████████████████████████████████████| 100%
                  datetime                time       month        weekend
        2016-12-17 00:00:00  1970-01-01 00:00:30    <0xA4EB>    <0xA9>P<0xA4BB>

I want the time column to stay as just 00:00:30. Is there a way to fix this?

I have not found a way to make the month and weekend columns display Chinese characters, although I can still run my deep learning model.

But when I convert the H2OFrame back to a DataFrame and save it to a CSV file, it writes <0xA4EB> instead of 月, and the datetime values are changed to integers.

 In[3]: dff = h2o_frame.as_data_frame();dff
out[3]:         datetime     time     month        weekend
        0   1481932800000   30000   <0xA4EB>    <0xA9>P<0xA4BB>
  • How can I get the correct characters back when converting an H2OFrame to a DataFrame?
  • How can I get the correct datetime back when converting an H2OFrame to a DataFrame?

Thanks in advance.


1 Answer


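The simplest way to solve this is to pass the column_types argument when you convert the pandas DataFrame to an H2OFrame, so that every column is kept as a categorical label instead of being parsed as a time value, as below: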

In [69]: col_types
Out[69]: ['categorical', 'categorical', 'categorical', 'categorical']

In [70]: h2o_frame = h2o.H2OFrame(df,column_types=col_types);h2o_frame ;h2o_frame.types ;h2o_frame
Parse progress: |█████████████████████████████████████████████████████████████████████████████| 100%
Out[70]: 
datetime             month    time      weekend
-------------------  -------  --------  ---------
2016-12-17 00:00:00  月       00:00:30  周六

[1 row x 4 columns]
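
The session above prints col_types without showing how it was created. A minimal sketch of one way to build it, assuming the four-column frame from the question. The H2OFrame call is left as a comment because it needs a running H2O cluster:

```python
import pandas as pd

# Same sample frame as in the question.
df = pd.DataFrame({
    'datetime': ['2016-12-17 00:00:00'],
    'time': ['00:00:30'],
    'month': ['月'],
    'weekend': ['周六'],
})

# One 'categorical' entry per column. Every column gets the same type,
# so the column order does not matter here.
col_types = ['categorical'] * len(df.columns)
print(col_types)

# With a cluster started via h2o.init(), the list is passed on conversion:
# h2o_frame = h2o.H2OFrame(df, column_types=col_types)
```

If only some columns should be forced to a type, the list would need one entry per column in the frame's column order.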


In [71]: dff = h2o_frame.as_data_frame();dff
Out[71]: 
              datetime month      time weekend
0  2016-12-17 00:00:00     月  00:00:30      周六
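
If a frame has already been converted back with the columns parsed as time, the integers in the question's output (1481932800000 and 30000) appear to be milliseconds since 1970-01-01. Under that assumption, a pandas-only sketch of turning them back into readable values could look like this:

```python
import pandas as pd

# Values copied from the as_data_frame() output in the question.
dff = pd.DataFrame({'datetime': [1481932800000], 'time': [30000]})

# Interpret the integers as milliseconds since the Unix epoch.
dff['datetime'] = pd.to_datetime(dff['datetime'], unit='ms')

# The time column is an offset from 1970-01-01, so convert it the same way
# and keep only the clock part as text.
dff['time'] = pd.to_datetime(dff['time'], unit='ms').dt.strftime('%H:%M:%S')

print(dff)
```

This only covers the datetime and time columns; it does not recover the garbled Chinese text, which is why forcing the columns to categorical before conversion is the simpler route.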

Hope this helps!

Cheers..!!