I have this PySpark DataFrame:
import numpy as np
import pandas as pd

# Build a small pandas DataFrame, then convert it to a Spark DataFrame
df = pd.DataFrame(np.array([
["[email protected]",2,3], ["[email protected]",5,5],
["[email protected]",8,2], ["[email protected]",9,3]
]), columns=['user','movie','rating'])
sparkdf = sqlContext.createDataFrame(df, samplingRatio=0.1)
user movie rating
[email protected] 2 3
[email protected] 5 5
[email protected] 8 2
[email protected] 9 3
I need to add a new column with a rank per user.
I want this output:
user movie rating Rank
[email protected] 2 3 1
[email protected] 5 5 1
[email protected] 8 2 2
[email protected] 9 3 3
How can I do that?