Hello! I have a dataframe in Python with a column called "animal" whose rows each contain one of 4 labels: "bird", "dolphin", "dog" or "Others". I count the rows for each label with:

Code:
from collections import Counter
cnt = Counter(data.animal)
print(cnt)

and I obtain:

Code:
Counter({'Others': 1366, 'dog': 922, 'bird': 133, 'dolphin': 10})

I would like to reduce the size of the classes "Others" and "dog". How can I do this? I would like to randomly remove some rows so that I end up with, for example:

Code:
Counter({'Others': 140, 'dog': 100, 'bird': 133, 'dolphin': 10})

I know I could use drop like this:

Code:
# Use the "animal" column as the index of the DataFrame
data_with_index = data.set_index("animal")
# With that index, we can drop all the rows for a single animal by its name
data_with_index = data_with_index.drop("Others")

But that would delete all the rows with that name. Instead, I would like to delete only a certain number of them. How can I do this?
Just to make the question clearer: I start from a 2431x5 dataframe (2431 rows and 5 columns, one of which is named "animal") and I would like to end up with a 383x5 dataframe by randomly reducing the two largest classes, "Others" and "dog", to 140 and 100 rows respectively.
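
One possible way to get there, given as a minimal sketch rather than a definitive answer: group the rows by "animal" and randomly sample a fixed number of rows from the oversized groups only. It assumes `data` is a pandas DataFrame. The helper name `downsample`, the `targets` dictionary, the `seed` value and the two-column toy frame are all made up for illustration (the real dataframe has 5 columns).

```python
import pandas as pd


def downsample(data, column, targets, seed=0):
    """Randomly keep at most targets[label] rows for each label in targets.

    Labels that are not listed in targets are left untouched.
    (Hypothetical helper, not part of pandas.)
    """
    parts = []
    for label, group in data.groupby(column, sort=False):
        n = targets.get(label)
        if n is not None and len(group) > n:
            # Draw n rows without replacement; the seed makes the draw repeatable
            group = group.sample(n=n, random_state=seed)
        parts.append(group)
    # Reassemble the pieces and restore the original row order
    return pd.concat(parts).sort_index()


# Toy stand-in for the real dataframe, with the same class sizes as in the question
data = pd.DataFrame({
    "animal": ["Others"] * 1366 + ["dog"] * 922 + ["bird"] * 133 + ["dolphin"] * 10,
    "value": range(2431),
})

reduced = downsample(data, "animal", {"Others": 140, "dog": 100})
print(reduced["animal"].value_counts())
```

Passing a different `seed` gives a different random subset, and the original index labels are kept, so the removed rows can still be identified afterwards by comparing `data.index` with `reduced.index`.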