Welcome to Vigges Developer Community - Open, Learning, Share
Welcome To Ask or Share your Answers For Others

Python pandas - unique function returning unexpected output

I have a CSV file with 100 records and a column named Status, whose values are 'Active', 'Cancelled', and 'Cancelled - Gap in Status'. I want to get the unique values of this column. My code is below:

import pandas as pd

df = pd.read_pickle('data_frame.pickle')

status = df['Status']

print(pd.unique(status))

print(len(pd.unique(status)))

Output:
['Active', 'Cancelled']

2

What am I missing here?

Sample Data:

+--------------------+--------+--------------+-------+---------------------------+
| Stage              | ID     | Name         | State | Status                    |
+--------------------+--------+--------------+-------+---------------------------+
| 123456Peter Grunt  | 123456 | Peter Grunt  | DE    | Active                    |
| 123456Peter Grunt  | 123456 | Peter Grunt  | NY    | Cancelled                 |
| 123456Peter Grunt  | 123456 | Peter Grunt  | CA    | Cancelled                 |
| 123456Peter Grunt  | 123456 | Peter Grunt  | IA    | Cancelled                 |
| 123456Peter Grunt  | 123456 | Peter Grunt  | WA    | Cancelled                 |
| 123456Peter Grunt  | 123456 | Peter Grunt  | DE    | Cancelled                 |
| 123456Peter Grunt  | 123456 | Peter Grunt  | NC    | Active                    |
| 123457William Bert | 123457 | William Bert | NY    | Active                    |
| 123457William Bert | 123457 | William Bert | SD    | Active                    |
| 123457William Bert | 123457 | William Bert | WA    | Cancelled - Gap in Status |
| 123457William Bert | 123457 | William Bert | CA    | Active                    |
| 123457William Bert | 123457 | William Bert | IA    | Active                    |
| 123457William Bert | 123457 | William Bert | WA    | Active                    |
| 123457William Bert | 123457 | William Bert | DE    | Active                    |
| 123458John Grand   | 123458 | John Grand   | AL    | Active                    |
| 123458John Grand   | 123458 | John Grand   | AK    | Cancelled                 |
| 123458John Grand   | 123458 | John Grand   | MD    | Cancelled                 |
| 123458John Grand   | 123458 | John Grand   | MA    | Cancelled                 |
| 123458John Grand   | 123458 | John Grand   | AK    | Cancelled                 |
| 123458John Grand   | 123458 | John Grand   | NY    | Cancelled - Gap in Status |
| 123458John Grand   | 123458 | John Grand   | LA    | Cancelled                 |
| 123458John Grand   | 123458 | John Grand   | SD    | Cancelled                 |
| 123458John Grand   | 123458 | John Grand   | WA    | Cancelled                 |
| 123458John Grand   | 123458 | John Grand   | CA    | Cancelled                 |
| 123458John Grand   | 123458 | John Grand   | IA    | Active                    |
| 123458John Grand   | 123458 | John Grand   | WA    | Active                    |
| 123458John Grand   | 123458 | John Grand   | DE    | Active                    |
| 123458John Grand   | 123458 | John Grand   | AL    | Active                    |
+--------------------+--------+--------------+-------+---------------------------+



1 Answer


Judging by your code, you only want to print the unique values of a column. Note that the column name is case-sensitive, so use 'Status' as it appears in your data:

print(df['Status'].unique())  # prints the unique values

print(df['Status'].nunique())  # prints the number of unique values in the column
print(df['Status'].value_counts())  # prints each unique value with its frequency
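
The three calls in this answer can be tried on a small frame before running them against the real pickle. The sketch below is only illustrative: the DataFrame is a hypothetical five-row sample built from values in the question, not the asker's actual file, and the variable names are made up for the example.

```python
import pandas as pd

# Hypothetical sample mirroring a few rows of the question's data (not the real file).
df = pd.DataFrame({
    "State": ["DE", "NY", "WA", "CA", "NY"],
    "Status": [
        "Active",
        "Cancelled",
        "Cancelled - Gap in Status",
        "Active",
        "Cancelled - Gap in Status",
    ],
})

unique_values = df["Status"].unique()      # distinct values, in order of first appearance
unique_count = df["Status"].nunique()      # number of distinct non-null values
frequencies = df["Status"].value_counts()  # Series mapping each value to its row count

print(unique_values)
print(unique_count)
print(frequencies)
```

On this sample all three statuses show up. If the real data still yields only two, one possibility (an assumption, since the thread does not confirm it) is that the pickle was written from a different version of the data than the sample shown; comparing value_counts() against the source file would help check that.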
