I have a DataFrame imported from a CSV file with a column "animal_name" that shows different animals like:
Dog
Dog
Cat
Fish
Dog
Cat
I am trying to sort the DataFrame by how frequently each animal appears, most frequent first, like:
Dog
Dog
Dog
Cat
Cat
Fish
So far I have been able to find how often each of these items occurs using:
animal_data.groupby(["animal_name"]).value_counts()
animal_species_counts = pd.Series(animal_data["animal_name"].value_counts())
However, I have been unable to sort them as intended. I have tried creating an additional column for frequency and using assign(), transform(), insert(), and sort_values(), as suggested on other similar questions posted here, like:
animal_data.assign(animal_species_counts=tree_data_copy.groupby("tree_type_name")[:"tree_type_name"].transform("tree_type_counts")).sort_values(by=["tree_type_counts","tree_type_name"],ascending=[True, False])
I tried variations of this, but I either get errors saying the data is unhashable, or the DataFrame still looks exactly the same. I'm not sure where it's going wrong.
Once you have the frequencies in a Series (animal_species_counts), you can use it as a map to create a new column, then sort by that column.
import pandas as pd
animal_data = pd.DataFrame({'animal_name': ['Dog', 'Dog', 'Cat', 'Fish', 'Dog', 'Cat']})
print(animal_data)
# Count how often each animal appears (index = animal name, value = count)
animal_species_counts = animal_data["animal_name"].value_counts()
print(animal_species_counts)
# Look up each row's animal in the counts Series to build a frequency column
animal_data["animal_species_counts"] = animal_data["animal_name"].map(animal_species_counts)
print(animal_data)
# Sort the rows so the most frequent animals come first
animal_data.sort_values(by='animal_species_counts', ascending=False, inplace=True)
print(animal_data)
This puts the whole dataframe in order by the counts column.
Final output:
  animal_name  animal_species_counts
0         Dog                      3
1         Dog                      3
4         Dog                      3
2         Cat                      2
5         Cat                      2
3        Fish                      1
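The attempt in the question was close. As a minimal sketch of that same idea, assuming the goal is only the reordered rows, groupby(...).transform("count") returns a Series aligned with the original rows, so it can be passed straight to assign(). The name sorted_data, the tie-break on animal_name, and the final drop/reset_index steps are illustrative choices rather than part of the original answer:

```python
import pandas as pd

animal_data = pd.DataFrame({"animal_name": ["Dog", "Dog", "Cat", "Fish", "Dog", "Cat"]})

# transform("count") gives one count per row, aligned with the original index
counts = animal_data.groupby("animal_name")["animal_name"].transform("count")

sorted_data = (
    animal_data.assign(animal_species_counts=counts)
    # most frequent first; ties between animals broken alphabetically (assumed preference)
    .sort_values(by=["animal_species_counts", "animal_name"], ascending=[False, True])
    # drop the helper column and renumber the rows (optional)
    .drop(columns="animal_species_counts")
    .reset_index(drop=True)
)
print(sorted_data)
```

Unlike the inplace sort above, this leaves animal_data unchanged and returns a new DataFrame.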