What is the difference between the pandas agg and apply functions?

David D

I can't figure out the difference between the pandas .aggregate and .apply functions.
Take the following as an example: I load a dataset, do a groupby, define a simple function, and then use either .agg or .apply.

As you can see, the print statements inside my function produce the same output with .agg and with .apply. The result, on the other hand, is different. Why is that?

import pandas as pd

iris = pd.read_csv('iris.csv')
by_species = iris.groupby('Species')

def f(x):
    # Show what the function receives on each call.
    print(type(x))
    print(x.head(3))
    return 1

Using apply:

by_species.apply(f)
#<class 'pandas.core.frame.DataFrame'>
#   Sepal.Length  Sepal.Width  Petal.Length  Petal.Width Species
#0           5.1          3.5           1.4          0.2  setosa
#1           4.9          3.0           1.4          0.2  setosa
#2           4.7          3.2           1.3          0.2  setosa
#<class 'pandas.core.frame.DataFrame'>
#   Sepal.Length  Sepal.Width  Petal.Length  Petal.Width Species
#0           5.1          3.5           1.4          0.2  setosa
#1           4.9          3.0           1.4          0.2  setosa
#2           4.7          3.2           1.3          0.2  setosa
#<class 'pandas.core.frame.DataFrame'>
#    Sepal.Length  Sepal.Width  Petal.Length  Petal.Width     Species
#50           7.0          3.2           4.7          1.4  versicolor
#51           6.4          3.2           4.5          1.5  versicolor
#52           6.9          3.1           4.9          1.5  versicolor
#<class 'pandas.core.frame.DataFrame'>
#     Sepal.Length  Sepal.Width  Petal.Length  Petal.Width    Species
#100           6.3          3.3           6.0          2.5  virginica
#101           5.8          2.7           5.1          1.9  virginica
#102           7.1          3.0           5.9          2.1  virginica
#Out[33]: 
#Species
#setosa        1
#versicolor    1
#virginica     1
#dtype: int64

Using agg:

by_species.agg(f)
#<class 'pandas.core.frame.DataFrame'>
#   Sepal.Length  Sepal.Width  Petal.Length  Petal.Width Species
#0           5.1          3.5           1.4          0.2  setosa
#1           4.9          3.0           1.4          0.2  setosa
#2           4.7          3.2           1.3          0.2  setosa
#<class 'pandas.core.frame.DataFrame'>
#    Sepal.Length  Sepal.Width  Petal.Length  Petal.Width     Species
#50           7.0          3.2           4.7          1.4  versicolor
#51           6.4          3.2           4.5          1.5  versicolor
#52           6.9          3.1           4.9          1.5  versicolor
#<class 'pandas.core.frame.DataFrame'>
#     Sepal.Length  Sepal.Width  Petal.Length  Petal.Width    Species
#100           6.3          3.3           6.0          2.5  virginica
#101           5.8          2.7           5.1          1.9  virginica
#102           7.1          3.0           5.9          2.1  virginica
#Out[34]: 
#           Sepal.Length  Sepal.Width  Petal.Length  Petal.Width
#Species                                                         
#setosa                 1            1             1            1
#versicolor             1            1             1            1
#virginica              1            1             1            1
TomAugspurger

apply applies the function to each group as a whole (here, each Species). Your function returns 1, so you end up with one value for each of the 3 groups.

agg aggregates each column (feature) within each group, so you end up with one value per column per group.
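
To make the distinction concrete, here is a minimal sketch, assuming a recent pandas release. The small DataFrame is a hypothetical stand-in for iris.csv, and n_values is an illustrative helper that is not part of the original question. It returns the number of elements it receives, so the result shows whether the function was handed a whole group or a single column.

```python
import pandas as pd

# Hypothetical stand-in for iris.csv: two species, two numeric columns.
df = pd.DataFrame({
    "Species": ["setosa", "setosa", "virginica", "virginica", "virginica"],
    "sepal_length": [5.1, 4.9, 6.3, 5.8, 7.1],
    "petal_length": [1.4, 1.4, 6.0, 5.1, 5.9],
})

# Select the numeric columns explicitly so the grouping key is not
# passed to the function.
grouped = df.groupby("Species")[["sepal_length", "petal_length"]]

def n_values(x):
    # Number of elements in whatever object pandas hands us.
    return x.size

# apply: one call per group, x is that group's DataFrame (rows * columns).
per_group = grouped.apply(n_values)

# agg: one call per column per group, x is a single column (rows only).
per_column = grouped.agg(n_values)

print(per_group)
print(per_column)
```

With this data, apply should give one number per species (rows times columns), while agg should give a table with one number per species per column (rows only).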

Do read the groupby docs; they're quite helpful. There are also plenty of tutorials around the web.
