8. Advanced Grouping with Groupby() ¶
8.1. Covered in this Chapter¶
What is the Purpose of Groupby()
What are Pandas Group Objects
How to Analyze Group Objects Quantitatively
How to work with Multiple Groupings at Once
8.2. Introduction¶
When working with large quantities of data, it can sometimes be a bit difficult to understand broad patterns within your data. Often, you will need to group your data into small subsections based on some parameter, such as age, name, or some other feature. You can do this in Pandas using groupby(), which will be the main subject of this chapter. Groupby is a feature of Pandas that returns a special groupby object. This object can be called to perform different types of analyses on data, especially when leveraging the built-in quantitative features of Pandas, such as count() and sum(). In this chapter, we will explore these features and see how they can be used on a real-world dataset, the Titanic dataset.
import pandas as pd
df = pd.read_csv("data/titanic.csv")
df
PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S |
1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S |
3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S |
4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
886 | 887 | 0 | 2 | Montvila, Rev. Juozas | male | 27.0 | 0 | 0 | 211536 | 13.0000 | NaN | S |
887 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | female | 19.0 | 0 | 0 | 112053 | 30.0000 | B42 | S |
888 | 889 | 0 | 3 | Johnston, Miss. Catherine Helen "Carrie" | female | NaN | 1 | 2 | W./C. 6607 | 23.4500 | NaN | S |
889 | 890 | 1 | 1 | Behr, Mr. Karl Howell | male | 26.0 | 0 | 0 | 111369 | 30.0000 | C148 | C |
890 | 891 | 0 | 3 | Dooley, Mr. Patrick | male | 32.0 | 0 | 0 | 370376 | 7.7500 | NaN | Q |
891 rows × 12 columns
8.3. Groupby()¶
The groupby() function allows us to easily group our data in the DataFrame. Once your data are grouped, there are a lot of quantitative questions you can begin to ask. Let’s start simple. Let’s group our DataFrame by Sex.
df.groupby("Sex")
<pandas.core.groupby.generic.DataFrameGroupBy object at 0x000001F436D18910>
This output may not be quite what you expect. This is an object to which we can now pose targeted questions. Let’s try and see a DataFrame that only has “male” in the Sex column. We can do that by using get_group(“male”)
df.groupby("Sex").get_group("male")
PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S |
4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S |
5 | 6 | 0 | 3 | Moran, Mr. James | male | NaN | 0 | 0 | 330877 | 8.4583 | NaN | Q |
6 | 7 | 0 | 1 | McCarthy, Mr. Timothy J | male | 54.0 | 0 | 0 | 17463 | 51.8625 | E46 | S |
7 | 8 | 0 | 3 | Palsson, Master. Gosta Leonard | male | 2.0 | 3 | 1 | 349909 | 21.0750 | NaN | S |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
883 | 884 | 0 | 2 | Banfield, Mr. Frederick James | male | 28.0 | 0 | 0 | C.A./SOTON 34068 | 10.5000 | NaN | S |
884 | 885 | 0 | 3 | Sutehall, Mr. Henry Jr | male | 25.0 | 0 | 0 | SOTON/OQ 392076 | 7.0500 | NaN | S |
886 | 887 | 0 | 2 | Montvila, Rev. Juozas | male | 27.0 | 0 | 0 | 211536 | 13.0000 | NaN | S |
889 | 890 | 1 | 1 | Behr, Mr. Karl Howell | male | 26.0 | 0 | 0 | 111369 | 30.0000 | C148 | C |
890 | 891 | 0 | 3 | Dooley, Mr. Patrick | male | 32.0 | 0 | 0 | 370376 | 7.7500 | NaN | Q |
577 rows × 12 columns
This argument does not have to be a string. Let’s say, we want to just get all the people who are aged 20. We can do the same thing by grouping the dataset by “Age” and then getting the group of 20 year olds.
df.groupby("Age").get_group(20)
PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
12 | 13 | 0 | 3 | Saundercock, Mr. William Henry | male | 20.0 | 0 | 0 | A/5. 2151 | 8.0500 | NaN | S |
91 | 92 | 0 | 3 | Andreasson, Mr. Paul Edvin | male | 20.0 | 0 | 0 | 347466 | 7.8542 | NaN | S |
113 | 114 | 0 | 3 | Jussila, Miss. Katriina | female | 20.0 | 1 | 0 | 4136 | 9.8250 | NaN | S |
131 | 132 | 0 | 3 | Coelho, Mr. Domingos Fernandeo | male | 20.0 | 0 | 0 | SOTON/O.Q. 3101307 | 7.0500 | NaN | S |
378 | 379 | 0 | 3 | Betros, Mr. Tannous | male | 20.0 | 0 | 0 | 2648 | 4.0125 | NaN | C |
404 | 405 | 0 | 3 | Oreskovic, Miss. Marija | female | 20.0 | 0 | 0 | 315096 | 8.6625 | NaN | S |
441 | 442 | 0 | 3 | Hampe, Mr. Leon | male | 20.0 | 0 | 0 | 345769 | 9.5000 | NaN | S |
622 | 623 | 1 | 3 | Nakid, Mr. Sahid | male | 20.0 | 1 | 1 | 2653 | 15.7417 | NaN | C |
640 | 641 | 0 | 3 | Jensen, Mr. Hans Peder | male | 20.0 | 0 | 0 | 350050 | 7.8542 | NaN | S |
664 | 665 | 1 | 3 | Lindqvist, Mr. Eino William | male | 20.0 | 1 | 0 | STON/O 2. 3101285 | 7.9250 | NaN | S |
682 | 683 | 0 | 3 | Olsvigen, Mr. Thor Anderson | male | 20.0 | 0 | 0 | 6563 | 9.2250 | NaN | S |
725 | 726 | 0 | 3 | Oreskovic, Mr. Luka | male | 20.0 | 0 | 0 | 315094 | 8.6625 | NaN | S |
762 | 763 | 1 | 3 | Barah, Mr. Hanna Assi | male | 20.0 | 0 | 0 | 2663 | 7.2292 | NaN | C |
840 | 841 | 0 | 3 | Alhomaki, Mr. Ilmari Rudolf | male | 20.0 | 0 | 0 | SOTON/O2 3101287 | 7.9250 | NaN | S |
876 | 877 | 0 | 3 | Gustafsson, Mr. Alfred Ossian | male | 20.0 | 0 | 0 | 7534 | 9.8458 | NaN | S |
8.4. Quantitative Analysis with Count() and Sum()¶
This is typically not how you would use the grouby function. It is far more powerful and often used for quantitative analysis on subsets of your data. Let’s say that I want to examine my dataset by sex and I am interested in known the quantity of column based solely on the metric of sex. I could use groupby() and .count(). When chained together, our question then becomes, how many PassengerId, Survived, Pclass, Name, etc. do we see for each column based on sex. While this question is particularly useful for the qualitative rows (such as Name) or numerical strings (such as PassengerId) because they display the total number of passengers because each person has a unique PassengerId and Name.
df.groupby("Sex").count()
PassengerId | Survived | Pclass | Name | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | |
---|---|---|---|---|---|---|---|---|---|---|---|
Sex | |||||||||||
female | 314 | 314 | 314 | 314 | 261 | 314 | 314 | 314 | 314 | 97 | 312 |
male | 577 | 577 | 577 | 577 | 453 | 577 | 577 | 577 | 577 | 107 | 577 |
For the quantitative rows, we can use sum() function. This will tell us the sum of all the columns that have floats or integers. Note that this is not a really good question to pose for the Age column. It is, however, very useful for the Fare column and the Survived Column. Remember, if a person survived, they have a 1; if they did not, they have a 0. We can use the sum to know how many male vs. female survivors there were.
df.groupby("Sex").sum()
PassengerId | Survived | Pclass | Age | SibSp | Parch | Fare | |
---|---|---|---|---|---|---|---|
Sex | |||||||
female | 135343 | 233 | 678 | 7286.00 | 218 | 204 | 13966.6628 |
male | 262043 | 109 | 1379 | 13919.17 | 248 | 136 | 14727.2865 |
Let’s say, though, that we are only interested in the Fare column. Before we add sum to our chain, we can specify that we want specifically the Fare column.
df.groupby("Sex").Fare.sum()
Sex
female 13966.6628
male 14727.2865
Name: Fare, dtype: float64
8.5. Working with Multiple Groups¶
Now, we have just the data on a single column. We can see that the combined fare of male passengers was greater than the combined sum of female passengers. Let’s say though that we are interested in how these sums divide over Pclass. We can pass a list to groupby, rather than just a string. This list will be a list of a strings that correspond to columns.
df.groupby(["Sex", "Pclass"]).Fare.sum()
Sex Pclass
female 1 9975.8250
2 1669.7292
3 2321.1086
male 1 8201.5875
2 2132.1125
3 4393.5865
Name: Fare, dtype: float64
The result of this new question is more nuanced. We are not looking at the sum of all fares, rather the sum of fares divided on a Pclass-by-Pclass basis. This means that we can now understand that these sums varied by class. For example, while the total fare for male passengers was greater, the total fare for first class female passengers was greater than their first class male counterparts. The male fare, however, is greater for both the 2nd Class and 3rd Class groups.
8.6. Groupings with Many Subsets¶
What if we were interested in something that would have more than just 6 neat subsections, such as 3 classes per sex. What if we also wanted to add another aspect to the groups, such as age. If we try and do that, our results are cutoff. We can try and use pd.set_option()
df.groupby(["Sex", "Pclass", "Age"]).Fare.sum()
Sex Pclass Age
female 1 2.0 151.5500
14.0 120.0000
15.0 211.3375
16.0 183.8792
17.0 165.9000
...
male 3 59.0 7.2500
61.0 6.2375
65.0 7.7500
70.5 7.7500
74.0 7.7750
Name: Fare, Length: 283, dtype: float64
In order to print off this many rows, we need to increase the amount of data Pandas will display. We can use the pd.set_options() command and pass an argument to display 1000 rows.
pd.set_option('display.max_rows', 1000)
df.groupby(["Sex", "Pclass", "Age"]).Fare.sum()
Sex Pclass Age
female 1 2.00 151.5500
14.00 120.0000
15.00 211.3375
16.00 183.8792
17.00 165.9000
18.00 569.5500
19.00 147.3625
21.00 340.3333
22.00 322.6500
23.00 376.2750
24.00 534.2625
25.00 151.5500
26.00 78.8500
29.00 211.3375
30.00 374.3542
31.00 278.1417
32.00 76.2917
33.00 229.6000
35.00 926.5375
36.00 326.6333
38.00 378.8083
39.00 329.5916
40.00 287.9625
41.00 134.5000
42.00 227.5250
43.00 211.3375
44.00 85.7000
45.00 164.8667
47.00 52.5542
48.00 65.5292
49.00 102.6584
50.00 276.2333
51.00 77.9583
52.00 171.7667
53.00 51.4792
54.00 137.6667
56.00 83.1583
58.00 326.5333
60.00 75.2500
62.00 80.0000
63.00 77.9583
2 2.00 26.0000
3.00 41.5792
4.00 62.0000
5.00 27.7500
6.00 33.0000
7.00 26.2500
8.00 26.2500
13.00 19.5000
14.00 30.0708
17.00 22.5000
18.00 36.0000
19.00 52.0000
21.00 10.5000
22.00 70.5792
23.00 13.7917
24.00 177.2500
25.00 56.0000
26.00 26.0000
27.00 45.3583
28.00 88.6500
29.00 62.5000
30.00 46.3500
31.00 26.2500
32.00 13.0000
32.50 13.0000
33.00 53.7500
34.00 79.0000
35.00 21.0000
36.00 52.0000
38.00 13.0000
40.00 67.7500
41.00 19.5000
42.00 39.0000
44.00 26.0000
45.00 39.7500
48.00 65.0000
50.00 47.0000
54.00 23.0000
55.00 16.0000
57.00 10.5000
3 0.75 38.5166
1.00 26.8750
2.00 81.9250
3.00 21.0750
4.00 52.1417
5.00 63.1208
6.00 31.2750
8.00 21.0750
9.00 108.7958
10.00 24.1500
11.00 31.2750
13.00 7.2292
14.00 19.0959
14.50 14.4542
15.00 29.7084
16.00 62.3833
17.00 22.3833
18.00 91.4667
19.00 15.7334
20.00 18.4875
21.00 59.6000
22.00 50.8792
23.00 15.4750
24.00 60.6583
25.00 15.7000
26.00 31.8792
27.00 31.5333
28.00 22.2958
29.00 46.7833
30.00 45.2875
30.50 7.7500
31.00 55.0625
32.00 15.5000
33.00 15.8500
35.00 20.2500
36.00 17.4000
37.00 9.5875
38.00 31.3875
39.00 60.4000
40.00 9.4750
41.00 59.9000
43.00 46.9000
45.00 50.1042
47.00 14.5000
48.00 34.3750
63.00 9.5875
male 1 0.92 151.5500
4.00 81.8583
11.00 120.0000
17.00 110.8833
18.00 108.9000
19.00 316.1000
21.00 77.2875
22.00 135.6333
23.00 63.3583
24.00 326.7208
25.00 146.5209
26.00 30.0000
27.00 371.8292
28.00 191.3208
29.00 96.6000
30.00 27.7500
31.00 159.4958
32.00 30.5000
33.00 5.0000
34.00 26.5500
35.00 565.1667
36.00 803.9792
37.00 135.3542
38.00 243.4625
39.00 0.0000
40.00 58.7208
42.00 130.8417
44.00 90.0000
45.00 172.0750
45.50 28.5000
46.00 140.3750
47.00 150.1083
48.00 155.2792
49.00 256.9167
50.00 295.9750
51.00 87.9292
52.00 110.1500
54.00 129.1500
55.00 30.5000
56.00 92.7458
58.00 142.9750
60.00 105.7500
61.00 65.8208
62.00 53.1000
64.00 289.0000
65.00 88.5292
70.00 71.0000
71.00 84.1584
80.00 30.0000
2 0.67 14.5000
0.83 47.7500
1.00 76.0042
2.00 26.0000
3.00 44.7500
8.00 36.7500
16.00 36.5000
18.00 109.5000
19.00 70.7500
21.00 158.5000
23.00 76.0458
24.00 97.0000
25.00 106.5792
26.00 10.5000
27.00 52.0000
28.00 70.0000
29.00 59.2208
30.00 73.5000
31.00 86.7542
32.00 110.0000
32.50 30.0708
33.00 12.2750
34.00 107.0000
35.00 36.5000
36.00 64.1250
36.50 26.0000
37.00 26.0000
39.00 52.0000
42.00 53.0000
43.00 26.2500
44.00 26.0000
46.00 26.0000
47.00 15.0000
48.00 13.0000
50.00 13.0000
51.00 12.5250
52.00 26.5000
54.00 66.0000
57.00 12.3500
59.00 13.5000
60.00 39.0000
62.00 10.5000
66.00 10.5000
70.00 10.5000
3 0.42 8.5167
1.00 107.1625
2.00 89.8875
3.00 47.2875
4.00 99.4333
6.00 12.4750
7.00 68.8125
8.00 29.1250
9.00 114.7125
10.00 27.9000
11.00 65.6875
12.00 11.2417
14.00 86.5875
15.00 7.2292
16.00 154.9042
17.00 47.3959
18.00 74.2333
19.00 94.7915
20.00 110.8751
20.50 7.2500
21.00 111.3541
22.00 108.8874
23.00 24.9750
23.50 7.2292
24.00 95.1791
24.50 8.0500
25.00 85.2125
26.00 166.3333
27.00 45.7833
28.00 153.2374
28.50 23.3292
29.00 65.3749
30.00 71.3000
30.50 8.0500
31.00 23.4500
32.00 192.5291
33.00 70.9083
34.00 36.9958
34.50 6.4375
35.00 38.1708
36.00 55.0916
37.00 7.9250
38.00 23.6083
39.00 71.2750
40.00 58.5208
40.50 22.2500
41.00 21.2333
42.00 32.2667
43.00 14.5000
44.00 40.1250
45.00 15.0250
45.50 7.2250
47.00 16.2500
48.00 7.8542
49.00 0.0000
50.00 8.0500
51.00 22.8542
55.50 8.0500
59.00 7.2500
61.00 6.2375
65.00 7.7500
70.50 7.7500
74.00 7.7750
Name: Fare, dtype: float64
What if we wanted to make this look a bit nicer, as a Pandas DataFrame? We can pass all our data back into a new DataFrame object.
df = pd.DataFrame(df.groupby(["Sex", "Pclass", "Age"]).Fare.sum())
df
Fare | |||
---|---|---|---|
Sex | Pclass | Age | |
female | 1 | 2.00 | 151.5500 |
14.00 | 120.0000 | ||
15.00 | 211.3375 | ||
16.00 | 183.8792 | ||
17.00 | 165.9000 | ||
18.00 | 569.5500 | ||
19.00 | 147.3625 | ||
21.00 | 340.3333 | ||
22.00 | 322.6500 | ||
23.00 | 376.2750 | ||
24.00 | 534.2625 | ||
25.00 | 151.5500 | ||
26.00 | 78.8500 | ||
29.00 | 211.3375 | ||
30.00 | 374.3542 | ||
31.00 | 278.1417 | ||
32.00 | 76.2917 | ||
33.00 | 229.6000 | ||
35.00 | 926.5375 | ||
36.00 | 326.6333 | ||
38.00 | 378.8083 | ||
39.00 | 329.5916 | ||
40.00 | 287.9625 | ||
41.00 | 134.5000 | ||
42.00 | 227.5250 | ||
43.00 | 211.3375 | ||
44.00 | 85.7000 | ||
45.00 | 164.8667 | ||
47.00 | 52.5542 | ||
48.00 | 65.5292 | ||
49.00 | 102.6584 | ||
50.00 | 276.2333 | ||
51.00 | 77.9583 | ||
52.00 | 171.7667 | ||
53.00 | 51.4792 | ||
54.00 | 137.6667 | ||
56.00 | 83.1583 | ||
58.00 | 326.5333 | ||
60.00 | 75.2500 | ||
62.00 | 80.0000 | ||
63.00 | 77.9583 | ||
2 | 2.00 | 26.0000 | |
3.00 | 41.5792 | ||
4.00 | 62.0000 | ||
5.00 | 27.7500 | ||
6.00 | 33.0000 | ||
7.00 | 26.2500 | ||
8.00 | 26.2500 | ||
13.00 | 19.5000 | ||
14.00 | 30.0708 | ||
17.00 | 22.5000 | ||
18.00 | 36.0000 | ||
19.00 | 52.0000 | ||
21.00 | 10.5000 | ||
22.00 | 70.5792 | ||
23.00 | 13.7917 | ||
24.00 | 177.2500 | ||
25.00 | 56.0000 | ||
26.00 | 26.0000 | ||
27.00 | 45.3583 | ||
28.00 | 88.6500 | ||
29.00 | 62.5000 | ||
30.00 | 46.3500 | ||
31.00 | 26.2500 | ||
32.00 | 13.0000 | ||
32.50 | 13.0000 | ||
33.00 | 53.7500 | ||
34.00 | 79.0000 | ||
35.00 | 21.0000 | ||
36.00 | 52.0000 | ||
38.00 | 13.0000 | ||
40.00 | 67.7500 | ||
41.00 | 19.5000 | ||
42.00 | 39.0000 | ||
44.00 | 26.0000 | ||
45.00 | 39.7500 | ||
48.00 | 65.0000 | ||
50.00 | 47.0000 | ||
54.00 | 23.0000 | ||
55.00 | 16.0000 | ||
57.00 | 10.5000 | ||
3 | 0.75 | 38.5166 | |
1.00 | 26.8750 | ||
2.00 | 81.9250 | ||
3.00 | 21.0750 | ||
4.00 | 52.1417 | ||
5.00 | 63.1208 | ||
6.00 | 31.2750 | ||
8.00 | 21.0750 | ||
9.00 | 108.7958 | ||
10.00 | 24.1500 | ||
11.00 | 31.2750 | ||
13.00 | 7.2292 | ||
14.00 | 19.0959 | ||
14.50 | 14.4542 | ||
15.00 | 29.7084 | ||
16.00 | 62.3833 | ||
17.00 | 22.3833 | ||
18.00 | 91.4667 | ||
19.00 | 15.7334 | ||
20.00 | 18.4875 | ||
21.00 | 59.6000 | ||
22.00 | 50.8792 | ||
23.00 | 15.4750 | ||
24.00 | 60.6583 | ||
25.00 | 15.7000 | ||
26.00 | 31.8792 | ||
27.00 | 31.5333 | ||
28.00 | 22.2958 | ||
29.00 | 46.7833 | ||
30.00 | 45.2875 | ||
30.50 | 7.7500 | ||
31.00 | 55.0625 | ||
32.00 | 15.5000 | ||
33.00 | 15.8500 | ||
35.00 | 20.2500 | ||
36.00 | 17.4000 | ||
37.00 | 9.5875 | ||
38.00 | 31.3875 | ||
39.00 | 60.4000 | ||
40.00 | 9.4750 | ||
41.00 | 59.9000 | ||
43.00 | 46.9000 | ||
45.00 | 50.1042 | ||
47.00 | 14.5000 | ||
48.00 | 34.3750 | ||
63.00 | 9.5875 | ||
male | 1 | 0.92 | 151.5500 |
4.00 | 81.8583 | ||
11.00 | 120.0000 | ||
17.00 | 110.8833 | ||
18.00 | 108.9000 | ||
19.00 | 316.1000 | ||
21.00 | 77.2875 | ||
22.00 | 135.6333 | ||
23.00 | 63.3583 | ||
24.00 | 326.7208 | ||
25.00 | 146.5209 | ||
26.00 | 30.0000 | ||
27.00 | 371.8292 | ||
28.00 | 191.3208 | ||
29.00 | 96.6000 | ||
30.00 | 27.7500 | ||
31.00 | 159.4958 | ||
32.00 | 30.5000 | ||
33.00 | 5.0000 | ||
34.00 | 26.5500 | ||
35.00 | 565.1667 | ||
36.00 | 803.9792 | ||
37.00 | 135.3542 | ||
38.00 | 243.4625 | ||
39.00 | 0.0000 | ||
40.00 | 58.7208 | ||
42.00 | 130.8417 | ||
44.00 | 90.0000 | ||
45.00 | 172.0750 | ||
45.50 | 28.5000 | ||
46.00 | 140.3750 | ||
47.00 | 150.1083 | ||
48.00 | 155.2792 | ||
49.00 | 256.9167 | ||
50.00 | 295.9750 | ||
51.00 | 87.9292 | ||
52.00 | 110.1500 | ||
54.00 | 129.1500 | ||
55.00 | 30.5000 | ||
56.00 | 92.7458 | ||
58.00 | 142.9750 | ||
60.00 | 105.7500 | ||
61.00 | 65.8208 | ||
62.00 | 53.1000 | ||
64.00 | 289.0000 | ||
65.00 | 88.5292 | ||
70.00 | 71.0000 | ||
71.00 | 84.1584 | ||
80.00 | 30.0000 | ||
2 | 0.67 | 14.5000 | |
0.83 | 47.7500 | ||
1.00 | 76.0042 | ||
2.00 | 26.0000 | ||
3.00 | 44.7500 | ||
8.00 | 36.7500 | ||
16.00 | 36.5000 | ||
18.00 | 109.5000 | ||
19.00 | 70.7500 | ||
21.00 | 158.5000 | ||
23.00 | 76.0458 | ||
24.00 | 97.0000 | ||
25.00 | 106.5792 | ||
26.00 | 10.5000 | ||
27.00 | 52.0000 | ||
28.00 | 70.0000 | ||
29.00 | 59.2208 | ||
30.00 | 73.5000 | ||
31.00 | 86.7542 | ||
32.00 | 110.0000 | ||
32.50 | 30.0708 | ||
33.00 | 12.2750 | ||
34.00 | 107.0000 | ||
35.00 | 36.5000 | ||
36.00 | 64.1250 | ||
36.50 | 26.0000 | ||
37.00 | 26.0000 | ||
39.00 | 52.0000 | ||
42.00 | 53.0000 | ||
43.00 | 26.2500 | ||
44.00 | 26.0000 | ||
46.00 | 26.0000 | ||
47.00 | 15.0000 | ||
48.00 | 13.0000 | ||
50.00 | 13.0000 | ||
51.00 | 12.5250 | ||
52.00 | 26.5000 | ||
54.00 | 66.0000 | ||
57.00 | 12.3500 | ||
59.00 | 13.5000 | ||
60.00 | 39.0000 | ||
62.00 | 10.5000 | ||
66.00 | 10.5000 | ||
70.00 | 10.5000 | ||
3 | 0.42 | 8.5167 | |
1.00 | 107.1625 | ||
2.00 | 89.8875 | ||
3.00 | 47.2875 | ||
4.00 | 99.4333 | ||
6.00 | 12.4750 | ||
7.00 | 68.8125 | ||
8.00 | 29.1250 | ||
9.00 | 114.7125 | ||
10.00 | 27.9000 | ||
11.00 | 65.6875 | ||
12.00 | 11.2417 | ||
14.00 | 86.5875 | ||
15.00 | 7.2292 | ||
16.00 | 154.9042 | ||
17.00 | 47.3959 | ||
18.00 | 74.2333 | ||
19.00 | 94.7915 | ||
20.00 | 110.8751 | ||
20.50 | 7.2500 | ||
21.00 | 111.3541 | ||
22.00 | 108.8874 | ||
23.00 | 24.9750 | ||
23.50 | 7.2292 | ||
24.00 | 95.1791 | ||
24.50 | 8.0500 | ||
25.00 | 85.2125 | ||
26.00 | 166.3333 | ||
27.00 | 45.7833 | ||
28.00 | 153.2374 | ||
28.50 | 23.3292 | ||
29.00 | 65.3749 | ||
30.00 | 71.3000 | ||
30.50 | 8.0500 | ||
31.00 | 23.4500 | ||
32.00 | 192.5291 | ||
33.00 | 70.9083 | ||
34.00 | 36.9958 | ||
34.50 | 6.4375 | ||
35.00 | 38.1708 | ||
36.00 | 55.0916 | ||
37.00 | 7.9250 | ||
38.00 | 23.6083 | ||
39.00 | 71.2750 | ||
40.00 | 58.5208 | ||
40.50 | 22.2500 | ||
41.00 | 21.2333 | ||
42.00 | 32.2667 | ||
43.00 | 14.5000 | ||
44.00 | 40.1250 | ||
45.00 | 15.0250 | ||
45.50 | 7.2250 | ||
47.00 | 16.2500 | ||
48.00 | 7.8542 | ||
49.00 | 0.0000 | ||
50.00 | 8.0500 | ||
51.00 | 22.8542 | ||
55.50 | 8.0500 | ||
59.00 | 7.2500 | ||
61.00 | 6.2375 | ||
65.00 | 7.7500 | ||
70.50 | 7.7500 | ||
74.00 | 7.7750 |
This is now a bit easier to read. You should now have a fairly good understanding of how to group data in Pandas using groupby() and some of the more powerful ways you can use groupby() to manipulate quantitative data.