Experts, I have a simple requirement but am not able to find a function that achieves it.
I am using PySpark (Spark 1.6 and Python 2.7) and have a DataFrame column with values like:
1849adb0-gfhe6543-bduyre763ryi-hjdsgf87qwefdb-78a9f4811265_ABC
1849adb0-rdty4545y4-657u5h556-zsdcafdqwddqdas-78a9f4811265_1234
1849adb0-89o8iulk89o89-89876h5-432rebm787rrer-78a9f4811265_12345678
The common thing about these values is that each contains a single underscore followed by some characters (there can be any number of them). Those are the characters I want in the output. I want to use a substring or regex function that finds the position of the underscore in each column value and selects everything from that position + 1 to the end of the value. The output would be a DataFrame with these values:
ABC
1234
12345678
I tried using substring, but couldn't find anything that gives me the index of the underscore.
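To make the requirement concrete, here is a minimal sketch of the intended logic in plain Python, run on the three sample values above. The names `after_underscore` and `add_suffix_column`, and the column names `value` and `suffix`, are made up for illustration. The Spark wrapper is only an untested guess: it assumes `pyspark.sql.functions.substring_index` (available since Spark 1.5) is a suitable fit, where a negative count keeps what follows the last delimiter.

```python
def after_underscore(value):
    """Return the part of `value` that follows its last underscore."""
    idx = value.rfind("_")  # position of the underscore, -1 if there is none
    if idx == -1:
        return value  # assumption: values without an underscore pass through unchanged
    return value[idx + 1:]


def add_suffix_column(df, src="value", dst="suffix"):
    """Hypothetical DataFrame version of the same idea (not run here).

    The import is local so the plain-Python part works without Spark installed.
    """
    from pyspark.sql.functions import substring_index
    # count = -1 keeps everything to the right of the last "_"
    return df.withColumn(dst, substring_index(df[src], "_", -1))


samples = [
    "1849adb0-gfhe6543-bduyre763ryi-hjdsgf87qwefdb-78a9f4811265_ABC",
    "1849adb0-rdty4545y4-657u5h556-zsdcafdqwddqdas-78a9f4811265_1234",
    "1849adb0-89o8iulk89o89-89876h5-432rebm787rrer-78a9f4811265_12345678",
]
suffixes = [after_underscore(s) for s in samples]
print(suffixes)  # ['ABC', '1234', '12345678']
```

Since each value has exactly one underscore, searching from the right or from the left gives the same result; `regexp_extract` with a pattern such as `_(.*)$` might be another candidate.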
Thanks!