Drafting for 3 and D
Predicting the best shooters from NCAA production and their wingspan to height ratio.
Watching the draft this year, I heard a lot of talk about wingspan and shooting ability. Wingspan and length have always been key attributes at any level of basketball. Long arms matter more than height alone because they help with rebounding, contesting shots, deflections, steals, and generally making life hard for the offense. Shooting has always been important, but 3 point shooting has become even more important in today’s NBA. Outside of the lottery picks, predicting who will be successful becomes a crapshoot.
A lot of players outside of the lottery won’t turn into All Stars, but they can turn into solid role players. Every team needs guys to play a role alongside their superstar. One of the most popular current roles is the “3 and D” guy: someone who can hit threes and play solid defense against opposing players. Historically, guys like Robert Horry, Doug Christie, and Shawn Marion, or more recently Danny Green, Draymond Green, Kawhi (superstar?), Trevor Ariza and others, are great examples.
So I decided to look at some basic NCAA stats (no foreign stats, since that data is hard to find), mainly NCAA 3PT% and NCAA FT%, to predict NBA 3PT%. I also wanted to analyze the wingspan to height ratio of players as a measure of their length.
Importing the Modules and Getting the Data
I used requests and Beautiful Soup to scrape and parse data from Basketball Reference, and seaborn to create the data visualizations. This is also where I define my helper functions: one to fetch a player’s page and find the HTML stats table, and one to loop through all of the players and retrieve their stats.
%matplotlib inline
from bs4 import BeautifulSoup
import requests
import pandas as pd
import numpy as np
import scipy.stats as ss
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns
def get_cbb_game_table(player):
    url = "http://www.sports-reference.com/cbb/players/%s-1.html" % (player)
    r = requests.get(url).text
    soup = BeautifulSoup(r, 'lxml')
    perGame = soup.find(id='players_per_game')
    return perGame

def get_stats(playersDF):
    shooters = playersDF["Player"].str.replace(' ','-').str.lower()
    playersDF['Player'] = shooters
    #get ncaa stats from url
    headerTable = get_cbb_game_table(shooters[0])
    #get column names
    header = headerTable.find('thead').findAll('th', text=True)
    colNames = ['Player']
    for name in header:
        colNames.append(name.get_text())
    stats = []
    #get totals for each player
    for p in shooters:
        perGame = get_cbb_game_table(p)
        #if player has ncaa stats available
        if perGame:
            perGameTotals = perGame.find('tfoot').findAll('td')
            statRow = [p]
            for stat in perGameTotals:
                statRow.append(stat.get_text())
            #handle ncaa transfer stats AKA toney douglas rule First 25 values only
            stats.append(statRow[0:25])
        #else remove from nba dataframe
        else:
            playersDF = playersDF[shooters != p]
    statsDF = pd.DataFrame(stats, columns = colNames)
    statsDF = statsDF.apply(pd.to_numeric, errors = 'ignore')
    return statsDF
#previously downloaded a csv with all NBA players from 2000-2016 with more than 300 3 point attempts
#thanks to http://www.basketball-reference.com
nba = pd.read_csv("3PAgreater300.csv")
#this will take awhile maybe grab a beer or some coffee while you wait.
#also basketball-reference.com doesn't like it when we crawl their site so crawl it once and save as a csv for later use.
ncaa = get_stats(nba)
ncaa.to_csv("ncaa.csv")
/home/scott/anaconda3/lib/python3.5/site-packages/ipykernel/__main__.py:43: UserWarning: Boolean Series key will be reindexed to match DataFrame index.
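The "crawl it once and save as a csv" advice above can be wrapped in a small helper so the scrape only runs when no cached file exists. This is a minimal sketch, not part of the original notebook: `load_or_scrape` and its parameters are hypothetical names, and it assumes the scraper returns a pandas DataFrame.

```python
import os

import pandas as pd


def load_or_scrape(path, scrape):
    """Return the cached CSV at `path` if it exists; otherwise call
    `scrape()` (assumed to return a DataFrame), save it, and return it."""
    if os.path.exists(path):
        # Cached copy found: skip the slow crawl entirely.
        return pd.read_csv(path)
    df = scrape()
    # index=False avoids writing an extra unnamed index column.
    df.to_csv(path, index=False)
    return df


# Hypothetical usage with the functions defined above:
# ncaa = load_or_scrape("ncaa.csv", lambda: get_stats(nba))
```

One design note: writing with `index=False` means a later `pd.read_csv` gives back the same columns that were scraped, rather than an additional `Unnamed: 0` column.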
Tidying Up the Data
I kept only players who averaged more than one three point attempt per game in college, then merged the ncaa data frame with the nba data frame into a single data frame called shot_data.
#only accepting players who took at least 1 3PA per game in college
ncaa = ncaa[ncaa['3PA']>1]
shot_data = ncaa[['Player',"3P%","FT%"]]
shot_data=shot_data.rename(columns = {'3P%':'ncaa_3P','FT%':'ncaa_FT'})
shot_data.Player = shot_data.Player.str.replace("-"," ")
shot_data.Player = shot_data.Player.str.title()
shot_data.Player = shot_data.Player.str.strip()
shot_data = pd.merge(pd.DataFrame(shot_data),pd.DataFrame(nba[["Player","3P%"]]),left_index=True,right_index=True)
shot_data=shot_data.rename(columns = {'3P%':'3P'})
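One thing to watch in the merge above: `left_index=True, right_index=True` pairs rows by position rather than by player. Because players without NCAA stats are skipped during scraping, the two frames can drift out of step, and the printed output further down does show `Player_x` and `Player_y` disagreeing from row 9 onward (Klay Thompson next to terry-porter). The sketch below reproduces that with three rows taken from the printed output and shows a key-based merge as a possible alternative. The miniature frames and the `_ncaa`/`_nba` suffixes are illustrative assumptions, not the author's code.

```python
import pandas as pd

# Three NBA rows in their original order (values from the printed output).
nba = pd.DataFrame({
    "Player": ["anthony-morrow", "terry-porter", "klay-thompson"],
    "3P%": [0.425, 0.421, 0.420],
})

# Scraped NCAA rows: terry-porter is assumed missing, so the index shifts.
ncaa = pd.DataFrame({
    "Player": ["anthony-morrow", "klay-thompson"],
    "3P%": [0.421, 0.390],
    "FT%": [0.867, 0.827],
})

# Positional merge: row 1 pairs klay-thompson with terry-porter's NBA 3P%.
by_position = pd.merge(ncaa, nba, left_index=True, right_index=True)

# Key-based merge: each player is matched to his own NBA 3P%.
by_name = pd.merge(ncaa, nba, on="Player", suffixes=("_ncaa", "_nba"))

print(by_position[["Player_x", "Player_y", "3P%_y"]])
print(by_name[["Player", "3P%_ncaa", "FT%", "3P%_nba"]])
```

If the positional pairing is indeed what happened in the full data, the regression below would be fitting some players' college numbers against other players' NBA percentages, so it may be worth re-running with a key-based merge.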
Checking for Normality of the Independent Variables
#shot_data.hist( alpha=0.5, bins=20, layout=(1,3), figsize=(20,10))
f, axs = plt.subplots(1,3,figsize=(15,10))
plt.subplot(1,3,1)
sns.distplot(shot_data['ncaa_3P'])
plt.subplot(1,3,2)
sns.distplot(shot_data['ncaa_FT'])
plt.subplot(1,3,3)
sns.distplot(shot_data['3P'])
Finding and Removing Outliers
Outliers can have a negative effect on linear regression, so I decided to remove them. I used the basic definition of an outlier: any value below the lower quartile minus 1.5 times the interquartile range (Q1 - 1.5*IQR) or above the upper quartile plus 1.5 times the interquartile range (Q3 + 1.5*IQR).
#basic outlier test: omit observation if it falls below q1 - 1.5*IQR or above q3 + 1.5*IQR
q1 = shot_data.quantile(.25)
q3= shot_data.quantile(.75)
iqr = pd.concat([q1, q3], axis=1, keys=['q1', 'q3'])
iqr['iqr'] = iqr['q3'] - iqr['q1']
iqr['max'] = iqr['q3'] + 1.5*iqr['iqr']
iqr['min'] = iqr['q1'] - 1.5*iqr['iqr']
iqr.head()
variable | q1 | q3 | iqr | max | min |
---|---|---|---|---|---|
ncaa_3P | 0.3360 | 0.3880 | 0.0520 | 0.46600 | 0.25800 |
ncaa_FT | 0.7135 | 0.7975 | 0.0840 | 0.92350 | 0.58750 |
3P | 0.3450 | 0.3785 | 0.0335 | 0.42875 | 0.29475 |
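As a quick sanity check, the `max` and `min` columns in the table above can be reproduced by hand from the quartiles. The helper name `tukey_fences` is a hypothetical one used only for this sketch; the quartile values are the `ncaa_3P` row of the table.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier cutoffs: q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# ncaa_3P quartiles from the table above.
low, high = tukey_fences(0.3360, 0.3880)
print(round(low, 3), round(high, 3))  # 0.258 0.466
```

So a college 3P% below .258 or above .466 is treated as an outlier, which matches the table.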
#exclude outliers (look up each cutoff by variable name rather than by position)
shot_data = shot_data[shot_data['ncaa_3P']<iqr.loc['ncaa_3P','max']]
shot_data = shot_data[shot_data['ncaa_FT']<iqr.loc['ncaa_FT','max']]
shot_data = shot_data[shot_data['ncaa_3P']>iqr.loc['ncaa_3P','min']]
shot_data = shot_data[shot_data['ncaa_FT']>iqr.loc['ncaa_FT','min']]
print(shot_data)
Player_x ncaa_3P ncaa_FT Player_y \
0 Stephen Curry 0.412 0.876 stephen-curry
1 Hubert Davis 0.435 0.819 hubert-davis
2 Jason Kapono 0.446 0.830 jason-kapono
3 Steve Nash 0.401 0.867 steve-nash
4 Wesley Person 0.441 0.747 wesley-person
6 Kyle Korver 0.453 0.891 kyle-korver
7 Danny Ferry 0.388 0.775 danny-ferry
8 Anthony Morrow 0.421 0.867 anthony-morrow
9 Klay Thompson 0.390 0.827 terry-porter
10 Brent Barry 0.345 0.794 klay-thompson
11 Matt Bonner 0.395 0.740 brent-barry
12 Jon Barry 0.371 0.717 matt-bonner
13 Doug Mcdermott 0.458 0.831 jon-barry
14 Eric Piatkowski 0.358 0.777 jose-calderon
15 Kelenna Azubuike 0.373 0.752 j.j.-redick
16 Fred Hoiberg 0.400 0.844 doug-mcdermott
17 Anthony Parker 0.379 0.750 eric-piatkowski
18 Anthony Peeler 0.393 0.779 kelenna-azubuike
19 Wally Szczerbiak 0.431 0.809 fred-hoiberg
20 Daniel Gibson 0.387 0.741 anthony-parker
21 Mike Miller 0.345 0.718 anthony-peeler
22 Raja Bell 0.350 0.731 wally-szczerbiak
23 Ray Allen 0.448 0.779 c.j.-mccollum
24 Luke Babbitt 0.421 0.893 daniel-gibson
25 Danny Green 0.375 0.845 mike-miller
26 Brandon Rush 0.435 0.733 raja-bell
27 Ben Gordon 0.423 0.795 peja-stojakovic
28 Khris Middleton 0.321 0.768 ray-allen
29 Jared Dudley 0.365 0.731 luke-babbitt
30 Walt Williams 0.359 0.762 danny-green
.. ... ... ... ...
343 Lamar Odom 0.330 0.687 landry-fields
345 Chris Johnson 0.371 0.826 josh-howard
346 Jordan Crawford 0.384 0.765 shawn-marion
347 Ronnie Price 0.345 0.788 dahntay-jones
348 Evan Turner 0.362 0.758 adam-morrison
349 Donte Greene 0.345 0.707 darrell-armstrong
350 Andrew Wiggins 0.341 0.775 brad-miller
351 Antoine Wright 0.376 0.648 antoine-walker
352 Will Barton 0.299 0.733 damien-wilkins
354 Derrick Rose 0.337 0.712 trey-burke
355 Russell Westbrook 0.354 0.685 josh-childress
356 Yakhouba Diawara 0.343 0.650 jamario-moon
358 George Mccloud 0.429 0.778 jimmy-butler
359 Jamaal Tinsley 0.327 0.685 brian-shaw
360 Derrick Williams 0.431 0.813 marcus-banks
361 Rodney Stuckey 0.317 0.806 kentavious-caldwell-pope
362 Marcus Smart 0.295 0.751 raymond-felton
363 Shandon Anderson 0.291 0.634 alonzo-gee
365 Tyreke Evans 0.274 0.711 dan-majerle
366 Corey Brewer 0.356 0.708 stephon-marbury
368 Dwyane Wade 0.333 0.745 tony-parker
370 Tony Allen 0.347 0.682 norris-cole
374 Moochie Norris 0.354 0.699 tobias-harris
375 Will Bynum 0.310 0.739 solomon-hill
376 James Johnson 0.370 0.792 corey-maggette
379 Anthony Carter 0.330 0.714 jae-crowder
380 Michael Carter Williams 0.307 0.679 devin-harris
381 Ronnie Brewer 0.340 0.671 shelvin-mack
382 Marquis Daniels 0.296 0.646 dennis-schroder
384 Andre Miller 0.294 0.676 baron-davis
3P
0 0.444
1 0.440
2 0.434
3 0.432
4 0.432
6 0.429
7 0.425
8 0.425
9 0.421
10 0.420
11 0.419
12 0.414
13 0.413
14 0.412
15 0.412
16 0.410
17 0.410
18 0.409
19 0.409
20 0.409
21 0.409
22 0.409
23 0.408
24 0.407
25 0.407
26 0.406
27 0.406
28 0.403
29 0.403
30 0.403
.. ...
343 0.332
345 0.332
346 0.332
347 0.331
348 0.331
349 0.330
350 0.330
351 0.330
352 0.330
354 0.329
355 0.329
356 0.329
358 0.328
359 0.328
360 0.327
361 0.327
362 0.327
363 0.327
365 0.327
366 0.327
368 0.327
370 0.326
374 0.325
375 0.325
376 0.325
379 0.324
380 0.324
381 0.324
382 0.324
384 0.323
[327 rows x 5 columns]
Regression Analysis
After running the regression I had the coefficients for my formula as well as the R-squared value. Notice that the R-squared is really low (0.198), meaning the model explains only about 20% of the variance in NBA 3P%, which can be problematic. To look at the model in more depth I examined the residuals.
#set up and run ordinary least squares regression
Y = shot_data["3P"]
X = shot_data[["ncaa_3P","ncaa_FT"]]
X = sm.add_constant(X)
result = sm.OLS(Y,X, missing = "drop").fit()
print(result.summary())
shot_data_pred = shot_data
shot_data_pred['pred'] = result.fittedvalues
shot_data_pred['resid'] = result.resid
print(shot_data_pred[["Player_x","3P",'pred']])
OLS Regression Results
==============================================================================
Dep. Variable: 3P R-squared: 0.198
Model: OLS Adj. R-squared: 0.193
Method: Least Squares F-statistic: 40.05
Date: Fri, 08 Jul 2016 Prob (F-statistic): 2.87e-16
Time: 07:27:38 Log-Likelihood: 779.47
No. Observations: 327 AIC: -1553.
Df Residuals: 324 BIC: -1542.
Df Model: 2
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [95.0% Conf. Int.]
------------------------------------------------------------------------------
const 0.2213 0.017 13.286 0.000 0.189 0.254
ncaa_3P 0.1596 0.035 4.524 0.000 0.090 0.229
ncaa_FT 0.1127 0.024 4.718 0.000 0.066 0.160
==============================================================================
Omnibus: 11.468 Durbin-Watson: 0.350
Prob(Omnibus): 0.003 Jarque-Bera (JB): 12.158
Skew: 0.465 Prob(JB): 0.00229
Kurtosis: 2.834 Cond. No. 39.8
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Player_x 3P pred
0 Stephen Curry 0.444 0.385740
1 Hubert Davis 0.440 0.382989
2 Jason Kapono 0.434 0.385983
3 Steve Nash 0.432 0.382970
4 Wesley Person 0.432 0.375835
6 Kyle Korver 0.429 0.393972
7 Danny Ferry 0.425 0.370532
8 Anthony Morrow 0.425 0.386162
9 Klay Thompson 0.421 0.376709
10 Brent Barry 0.420 0.365811
11 Matt Bonner 0.419 0.367706
12 Jon Barry 0.414 0.361286
13 Doug Mcdermott 0.413 0.388011
14 Eric Piatkowski 0.412 0.365970
15 Kelenna Azubuike 0.412 0.365548
16 Fred Hoiberg 0.410 0.380220
17 Anthony Parker 0.410 0.366280
18 Anthony Peeler 0.409 0.371781
19 Wally Szczerbiak 0.409 0.381224
20 Daniel Gibson 0.409 0.366542
21 Mike Miller 0.409 0.357249
22 Raja Bell 0.409 0.359512
23 Ray Allen 0.408 0.380557
24 Luke Babbitt 0.407 0.389091
25 Danny Green 0.407 0.376343
26 Brandon Rush 0.406 0.373300
27 Ben Gordon 0.406 0.378370
28 Khris Middleton 0.403 0.359052
29 Jared Dudley 0.403 0.361905
30 Walt Williams 0.403 0.364440
.. ... ... ...
343 Lamar Odom 0.332 0.351364
345 Chris Johnson 0.332 0.373565
346 Jordan Crawford 0.332 0.368767
347 Ronnie Price 0.331 0.365135
348 Evan Turner 0.331 0.364468
349 Donte Greene 0.330 0.356010
350 Andrew Wiggins 0.330 0.363032
351 Antoine Wright 0.330 0.354310
352 Will Barton 0.330 0.351599
354 Derrick Rose 0.329 0.355297
355 Russell Westbrook 0.329 0.354968
356 Yakhouba Diawara 0.329 0.349270
358 George Mccloud 0.328 0.377412
359 Jamaal Tinsley 0.328 0.350660
360 Derrick Williams 0.327 0.381674
361 Rodney Stuckey 0.327 0.362695
362 Marcus Smart 0.327 0.352988
363 Shandon Anderson 0.327 0.339170
365 Tyreke Evans 0.327 0.345131
366 Corey Brewer 0.327 0.357878
368 Dwyane Wade 0.327 0.358376
370 Tony Allen 0.326 0.353513
374 Moochie Norris 0.325 0.356545
375 Will Bynum 0.325 0.354030
376 James Johnson 0.325 0.369575
379 Anthony Carter 0.324 0.354405
380 Michael Carter Williams 0.324 0.346792
381 Ronnie Brewer 0.324 0.351157
382 Marquis Daniels 0.324 0.341320
384 Andre Miller 0.323 0.344380
[327 rows x 3 columns]
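The coefficients in the summary above define a simple linear formula: predicted NBA 3P% = 0.2213 + 0.1596 * NCAA 3P% + 0.1127 * NCAA FT%. The sketch below applies it by hand to Stephen Curry's college numbers from the output. The function name `predict_nba_3p` is hypothetical, and because the coefficients are the rounded values printed in the summary, the result only matches the `pred` column to about three or four decimal places.

```python
# Rounded coefficients copied from the OLS summary above.
CONST = 0.2213
COEF_NCAA_3P = 0.1596
COEF_NCAA_FT = 0.1127


def predict_nba_3p(ncaa_3p, ncaa_ft):
    """Apply the fitted linear formula to college 3P% and FT% (as fractions)."""
    return CONST + COEF_NCAA_3P * ncaa_3p + COEF_NCAA_FT * ncaa_ft


# Stephen Curry: NCAA 3P% = 0.412, NCAA FT% = 0.876 (from the output above).
curry_pred = predict_nba_3p(0.412, 0.876)
print(round(curry_pred, 3))  # 0.386
```

The model's own value for this row is 0.385740, so the hand calculation agrees. It also shows how compressed the predictions are: the prediction of about .386 sits well below the .444 listed in the 3P column for that row.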
Residuals
Taking a look at the residuals can help determine what might be wrong with this model.
#check residual plots
f, axs = plt.subplots(2,2,figsize=(10,6))
plt.subplot(2,2,1)
sns.residplot(shot_data['ncaa_3P'],shot_data['resid'])
low = shot_data['resid'].min()
high = shot_data['resid'].max()
plt.plot((.42, .42), (low-.01,high+.02), 'r--')
plt.title("NCAA 3P% vs Residuals")
plt.subplot(2,2,2)
sns.residplot(shot_data['ncaa_FT'],shot_data['resid'])
plt.title("NCAA FT% vs Residuals")
plt.subplot(2,2,3)
ax0 = sns.residplot(shot_data['3P'],shot_data['resid'])
plt.title("Actual 3P% vs Residuals")
plt.subplot(2,2,4)
ax = sns.distplot(shot_data['resid'])
#plt.hist(shot_data['resid'])
plt.title("Residuals Distribution")
plt.tight_layout(pad=0.4, w_pad=0.5, h_pad=1)
In the NCAA 3P% vs Residuals plot, the model is less accurate for players with NCAA 3P% greater than ~40%. The residuals suggest that this model might not be the best predictor, especially for extreme values such as 3P% > .40.
Predicting Drafted College 3 and D Players
I will use the same model from above to predict 3 point percentage for all drafted NCAA players and also display their ratio of wingspan to height.
#previously downloaded csv from http://www.sports-reference.com
draft = pd.read_csv("draftees.csv")
draft_ncaa = get_stats(draft)
print (draft)
/home/scott/anaconda3/lib/python3.5/site-packages/ipykernel/__main__.py:43: UserWarning: Boolean Series key will be reindexed to match DataFrame index.
Pk Tm Player College
0 1 PHI ben-simmons Louisiana State University
1 2 LAL brandon-ingram Duke University
2 3 BOS jaylen-brown University of California
3 5 MIN kris-dunn Providence College
4 6 NOP buddy-hield University of Oklahoma
5 7 DEN jamal-murray University of Kentucky
6 8 SAC marquesse-chriss University of Washington
7 9 TOR jakob-poeltl University of Utah
8 11 ORL domantas-sabonis Gonzaga University
9 12 UTA taurean-prince Baylor University
10 14 CHI denzel-valentine Michigan State University
11 17 MEM wade-baldwin Vanderbilt University
12 18 DET henry-ellenson Marquette University
13 19 DEN malik-beasley Florida State University
14 20 IND caris-levert University of Michigan
15 21 ATL deandre-bembry Saint Joseph's University
16 22 CHO malachi-richardson Syracuse University
17 25 LAC brice-johnson University of North Carolina
18 27 TOR pascal-siakam New Mexico State University
19 28 PHO skal-labissiere University of Kentucky
20 29 SAS dejounte-murray University of Washington
21 30 GSW damion-jones Vanderbilt University
22 31 BOS deyonta-davis Michigan State University
23 33 LAC cheick-diallo University of Kansas
24 34 PHO tyler-ulis University of Kentucky
25 36 MIL malcolm-brogdon University of Virginia
26 37 HOU chinanu-onuaku University of Louisville
27 38 MIL patrick-mccaw University of Nevada, Las Vegas
28 40 NOP diamond-stone University of Maryland
29 41 ORL stephen-zimmerman University of Nevada, Las Vegas
30 42 UTA isaiah-whitehead Seton Hall University
31 45 BOS demetrius-jackson University of Notre Dame
32 46 DAL a.j.-hammons Purdue University
33 47 ORL jake-layman University of Maryland
34 49 DET michael-gbinije Syracuse University
35 50 IND georges-niang Iowa State University
36 51 BOS ben-bentil Providence College
37 52 UTA joel-bolomboy Weber State University
38 54 ATL kay-felder Oakland University
39 55 BRK marcus-paige University of North Carolina
40 56 DEN daniel-hamilton University of Connecticut
41 58 BOS abdel-nader Iowa State University
42 59 SAC isaiah-cousins University of Oklahoma
43 60 UTA tyrone-wallace University of California
#save to csv for easy access later
#draft_ncaa.to_csv("draft_ncaa.csv")
#draft_ncaa = pd.read_csv('draft_ncaa.csv')
draft = draft.merge(draft_ncaa, on='Player')
print(draft)
Pk Tm Player College Season \
0 1 PHI ben-simmons Louisiana State University Career
1 2 LAL brandon-ingram Duke University Career
2 3 BOS jaylen-brown University of California Career
3 5 MIN kris-dunn Providence College Career
4 6 NOP buddy-hield University of Oklahoma Career
5 7 DEN jamal-murray University of Kentucky Career
6 9 TOR jakob-poeltl University of Utah Career
7 11 ORL domantas-sabonis Gonzaga University Career
8 12 UTA taurean-prince Baylor University Career
9 14 CHI denzel-valentine Michigan State University Career
10 17 MEM wade-baldwin Vanderbilt University Career
11 18 DET henry-ellenson Marquette University Career
12 19 DEN malik-beasley Florida State University Career
13 20 IND caris-levert University of Michigan Career
14 21 ATL deandre-bembry Saint Joseph's University Career
15 22 CHO malachi-richardson Syracuse University Career
16 25 LAC brice-johnson University of North Carolina Career
17 27 TOR pascal-siakam New Mexico State University Career
18 28 PHO skal-labissiere University of Kentucky Career
19 29 SAS dejounte-murray University of Washington Career
20 31 BOS deyonta-davis Michigan State University Career
21 33 LAC cheick-diallo University of Kansas Career
22 34 PHO tyler-ulis University of Kentucky Career
23 36 MIL malcolm-brogdon University of Virginia Career
24 37 HOU chinanu-onuaku University of Louisville Career
25 38 MIL patrick-mccaw University of Nevada, Las Vegas Career
26 40 NOP diamond-stone University of Maryland Career
27 42 UTA isaiah-whitehead Seton Hall University Career
28 45 BOS demetrius-jackson University of Notre Dame Career
29 47 ORL jake-layman University of Maryland Career
30 49 DET michael-gbinije Syracuse University Career
31 50 IND georges-niang Iowa State University Career
32 51 BOS ben-bentil Providence College Career
33 52 UTA joel-bolomboy Weber State University Career
34 55 BRK marcus-paige University of North Carolina Career
35 56 DEN daniel-hamilton University of Connecticut Career
36 58 BOS abdel-nader Iowa State University Career
37 59 SAC isaiah-cousins University of Oklahoma Career
38 60 UTA tyrone-wallace University of California Career
School Conf G MP FG ... FT FTA FT% \
0 LSU NaN 33 34.9 6.5 ... 6.0 9.0 0.670
1 Duke NaN 36 34.6 5.9 ... 3.2 4.7 0.682
2 University of California NaN 34 27.6 4.8 ... 4.2 6.4 0.654
3 Providence NaN 95 31.5 4.5 ... 3.1 4.5 0.693
4 Oklahoma NaN 132 31.7 5.9 ... 2.8 3.4 0.836
5 Kentucky NaN 36 35.2 6.8 ... 3.3 4.2 0.783
6 Utah NaN 70 26.9 5.1 ... 3.1 5.2 0.607
7 Gonzaga NaN 74 26.6 5.2 ... 3.1 4.2 0.729
8 Baylor NaN 129 20.2 3.6 ... 2.0 2.8 0.718
9 Michigan State NaN 144 29.0 4.0 ... 1.5 2.0 0.779
10 Vanderbilt NaN 68 29.6 3.4 ... 3.5 4.4 0.800
11 Marquette NaN 33 33.5 5.9 ... 4.3 5.8 0.749
12 Florida State NaN 34 29.8 5.4 ... 3.1 3.8 0.813
13 Michigan NaN 103 26.4 3.6 ... 2.0 2.6 0.770
14 St. Joseph's NaN 101 36.1 5.7 ... 3.1 4.9 0.628
15 Syracuse NaN 37 34.4 4.1 ... 3.1 4.2 0.720
16 UNC NaN 148 21.0 4.8 ... 1.9 2.8 0.708
17 New Mexico State NaN 68 32.7 6.5 ... 3.5 5.0 0.711
18 Kentucky NaN 36 15.8 2.7 ... 1.1 1.7 0.661
19 Washington NaN 34 33.5 5.9 ... 3.2 4.9 0.663
20 Michigan State NaN 35 18.6 3.4 ... 0.7 1.1 0.605
21 Kansas NaN 27 7.5 1.2 ... 0.6 1.0 0.556
22 Kentucky NaN 72 30.1 3.6 ... 2.9 3.4 0.846
23 Virginia NaN 136 30.6 4.4 ... 3.1 3.5 0.876
24 Louisville NaN 66 21.0 2.8 ... 0.7 1.3 0.547
25 UNLV NaN 65 31.7 4.2 ... 1.9 2.5 0.753
26 Maryland NaN 35 23.1 4.8 ... 2.9 3.8 0.761
27 Seton Hall NaN 56 30.5 5.1 ... 3.5 4.6 0.757
28 Pacific NaN 57 NaN 4.2 ... 3.3 4.0 0.837
29 Maryland NaN 141 27.9 3.4 ... 1.9 2.6 0.759
30 Overall NaN 120 25.5 3.4 ... 1.7 2.6 0.639
31 Iowa State NaN 138 29.8 6.2 ... 2.5 3.2 0.763
32 Providence NaN 69 27.9 4.7 ... 3.6 4.8 0.761
33 Weber State NaN 130 28.8 3.8 ... 3.6 5.0 0.713
34 UNC NaN 141 32.4 4.2 ... 2.5 2.9 0.844
35 UConn NaN 71 31.6 4.2 ... 2.0 2.6 0.772
36 Overall NaN 117 24.3 3.7 ... 1.8 2.5 0.731
37 Oklahoma NaN 137 27.7 3.5 ... 1.5 2.1 0.711
38 University of California NaN 129 31.6 4.6 ... 2.6 4.2 0.613
TRB AST STL BLK TOV PF PTS
0 11.8 4.8 2.0 0.8 3.4 2.8 19.2
1 6.8 2.0 1.1 1.4 2.0 2.1 17.3
2 5.4 2.0 0.8 0.6 3.1 3.2 14.6
3 5.1 5.8 2.2 0.4 3.3 2.9 12.8
4 5.0 1.9 1.3 0.3 2.2 2.0 17.4
5 5.2 2.2 1.0 0.3 2.3 2.1 20.0
6 8.0 1.3 0.5 1.7 1.8 2.4 13.3
7 9.4 1.3 0.5 0.6 2.0 3.0 13.5
8 4.2 1.1 0.9 0.5 1.7 2.1 10.2
9 5.9 4.4 0.9 0.3 2.2 2.1 11.4
10 4.1 4.8 1.3 0.2 2.3 2.5 11.6
11 9.7 1.8 0.8 1.5 2.4 2.5 17.0
12 5.3 1.5 0.9 0.2 1.7 2.2 15.6
13 3.5 2.7 0.9 0.2 1.3 1.5 10.4
14 6.7 3.6 1.4 0.8 2.5 2.5 15.7
15 4.3 2.1 1.1 0.3 2.1 2.5 13.4
16 7.0 0.9 0.8 1.1 1.3 2.5 11.6
17 9.7 1.5 0.9 2.0 1.9 2.6 16.5
18 3.1 0.3 0.3 1.6 0.9 3.0 6.6
19 6.0 4.4 1.8 0.3 3.2 2.6 16.1
20 5.5 0.7 0.3 1.8 0.9 2.3 7.5
21 2.5 0.0 0.3 0.9 0.6 1.4 3.0
22 2.4 5.3 1.2 0.1 1.5 1.7 11.3
23 4.1 2.5 0.9 0.2 1.5 1.8 13.3
24 6.4 1.1 0.8 1.6 1.7 2.7 6.2
25 4.2 3.3 2.0 0.4 1.9 1.9 12.2
26 5.4 0.4 0.5 1.6 1.5 2.3 12.5
27 3.7 4.5 1.3 1.1 3.4 2.3 15.8
28 2.8 3.3 1.1 0.1 NaN NaN 13.4
29 4.8 1.1 0.9 0.8 1.4 1.9 10.2
30 3.1 2.6 1.3 0.3 1.6 2.1 9.8
31 5.2 3.1 0.7 0.5 2.3 3.1 16.1
32 6.3 0.9 0.6 0.7 1.6 2.7 13.8
33 10.1 0.7 0.6 1.4 2.0 2.3 11.4
34 2.8 4.3 1.4 0.2 1.9 1.7 13.1
35 8.3 4.2 1.0 0.4 2.3 2.1 11.7
36 4.3 1.1 0.8 0.6 2.3 2.5 10.3
37 3.9 2.7 1.1 0.3 1.9 2.0 9.7
38 5.2 3.4 1.1 0.4 2.2 2.5 12.6
[39 rows x 28 columns]
#use results from nba analysis to predict 3P% for draftees
X_draft = draft[["3P%","FT%"]]
X_draft = sm.add_constant(X_draft)
#add the prediction to the draft dataframe
draft["pred"] = result.predict(X_draft)
Now I will bring in the wingspan measurements from http://www.nbadraft.net/2016-nba-draft-combine-measurements.
ws = pd.read_csv('wingspan.csv')
ws.head()
draft.head()
draft = draft.merge(ws, on='Player')
draft_sorted = draft[['Player','Tm','FT%','3P%','pred','Ratio']]
draft_sorted = draft_sorted.sort_values(by='pred', ascending = False)
print(draft_sorted)
Player Tm FT% 3P% pred Ratio
8 wade-baldwin MEM 0.800 0.422 0.378774 1.12
20 malcolm-brogdon MIL 0.876 0.365 0.378240 1.09
4 buddy-hield NOP 0.836 0.390 0.377723 1.07
31 marcus-paige BRK 0.844 0.375 0.376231 1.08
19 tyler-ulis PHO 0.846 0.371 0.375818 1.08
10 malik-beasley DEN 0.813 0.387 0.374653 1.05
7 denzel-valentine CHI 0.779 0.408 0.374174 1.09
11 caris-levert IND 0.770 0.401 0.372043 1.05
25 demetrius-jackson BOS 0.837 0.348 0.371134 1.08
28 georges-niang IND 0.763 0.375 0.367106 1.04
33 isaiah-cousins SAC 0.711 0.407 0.366354 1.04
22 patrick-mccaw MIL 0.753 0.367 0.364703 1.06
26 jake-layman ORL 0.759 0.362 0.364581 1.02
24 isaiah-whitehead UTA 0.757 0.359 0.363877 1.07
1 brandon-ingram LAL 0.682 0.410 0.363566 1.09
6 taurean-prince UTA 0.718 0.376 0.362196 1.06
32 daniel-hamilton DEN 0.772 0.337 0.362056 1.04
30 joel-bolomboy UTA 0.713 0.371 0.360835 1.08
13 malachi-richardson CHO 0.720 0.353 0.358751 1.09
29 ben-bentil BOS 0.761 0.324 0.358742 1.08
3 kris-dunn MIN 0.693 0.354 0.355869 1.09
27 michael-gbinije DET 0.639 0.388 0.355211 1.03
9 henry-ellenson DET 0.749 0.288 0.351646 1.05
0 ben-simmons PHI 0.670 0.333 0.349927 1.05
2 jaylen-brown BOS 0.654 0.294 0.341902 1.08
12 deandre-bembry ATL 0.628 0.312 0.341845 1.07
15 pascal-siakam TOR 0.711 0.176 0.329493 1.09
16 skal-labissiere PHO 0.661 0.000 0.295776 1.05
5 jakob-poeltl TOR 0.607 0.000 0.289693 1.03
21 chinanu-onuaku HOU 0.547 0.000 0.282934 1.07
14 brice-johnson LAC 0.708 NaN NaN 1.04
17 deyonta-davis BOS 0.605 NaN NaN 1.06
18 cheick-diallo LAC 0.556 NaN NaN 1.11
23 diamond-stone NOP 0.761 NaN NaN 1.07
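The `Ratio` column is not defined explicitly in the post; the sketch below assumes it is wingspan divided by height, with both in the same unit. The function name and the measurements are hypothetical and only illustrate the arithmetic.

```python
def wingspan_ratio(wingspan_in, height_in):
    """Wingspan divided by height, both in inches (assumed definition of Ratio)."""
    if height_in <= 0:
        raise ValueError("height must be positive")
    return wingspan_in / height_in


# Hypothetical player: 76 in tall (6'4") with an 82 in wingspan (6'10").
ratio = wingspan_ratio(82.0, 76.0)
is_long = ratio > 1.06  # 1.06 is the cutoff used in the conclusion
print(round(ratio, 2), is_long)
```

Under that assumption, a ratio above 1 means the player's arms span more than his height, and the 1.06 cutoff corresponds to roughly four and a half extra inches of reach for a 6'4" player.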
Conclusion
Anchor Down Wade Baldwin! Hmm… Wade Baldwin has both the highest predicted 3P% and the highest wingspan to height ratio. I didn’t get to see him play much, but I think Memphis has a solid young (20 years old) backup point guard to Conley and possibly a great 3 and D contributor. According to my model Baldwin is the cream of the crop, but other standouts include Malcolm Brogdon, Denzel Valentine, Brandon Ingram, and Malachi Richardson.
Again, there are some improvements that could be made to this model, since I assumed that college 3P% and college FT% correlate with NBA 3P%. Other stats might be better predictors, including TS% and which team drafts the player (the Spurs and Warriors seem to emphasize and improve their players’ 3P%).
With a league average 3P% of 35.4% last year and an average wingspan ratio of ~1.06, I would expect anyone with an above average predicted 3P% and an above average ratio to have the ability to be a solid 3 and D player. Below is the list of all players with pred > 35.4% and Ratio > 1.06.
threeAndD = draft_sorted[draft_sorted['pred'] > .354]
threeAndD = threeAndD[threeAndD['Ratio'] > 1.06]
threeAndD = threeAndD.dropna()
print(threeAndD)
Player Tm FT% 3P% pred Ratio
8 wade-baldwin MEM 0.800 0.422 0.378774 1.12
20 malcolm-brogdon MIL 0.876 0.365 0.378240 1.09
4 buddy-hield NOP 0.836 0.390 0.377723 1.07
31 marcus-paige BRK 0.844 0.375 0.376231 1.08
19 tyler-ulis PHO 0.846 0.371 0.375818 1.08
7 denzel-valentine CHI 0.779 0.408 0.374174 1.09
25 demetrius-jackson BOS 0.837 0.348 0.371134 1.08
24 isaiah-whitehead UTA 0.757 0.359 0.363877 1.07
1 brandon-ingram LAL 0.682 0.410 0.363566 1.09
30 joel-bolomboy UTA 0.713 0.371 0.360835 1.08
13 malachi-richardson CHO 0.720 0.353 0.358751 1.09
29 ben-bentil BOS 0.761 0.324 0.358742 1.08
3 kris-dunn MIN 0.693 0.354 0.355869 1.09
Typically, rookies and even second year players seem to struggle on the defensive end until they adjust to NBA competition. However, looking at the recent players coming out of college, I would expect these players to be possible contributors to floor spacing and defensive disruption in the near future.
threeAndD.head(len(threeAndD))
index | Player | Tm | FT% | 3P% | pred | Ratio |
---|---|---|---|---|---|---|
8 | wade-baldwin | MEM | 0.800 | 0.422 | 0.378774 | 1.12 |
20 | malcolm-brogdon | MIL | 0.876 | 0.365 | 0.378240 | 1.09 |
4 | buddy-hield | NOP | 0.836 | 0.390 | 0.377723 | 1.07 |
31 | marcus-paige | BRK | 0.844 | 0.375 | 0.376231 | 1.08 |
19 | tyler-ulis | PHO | 0.846 | 0.371 | 0.375818 | 1.08 |
7 | denzel-valentine | CHI | 0.779 | 0.408 | 0.374174 | 1.09 |
25 | demetrius-jackson | BOS | 0.837 | 0.348 | 0.371134 | 1.08 |
24 | isaiah-whitehead | UTA | 0.757 | 0.359 | 0.363877 | 1.07 |
1 | brandon-ingram | LAL | 0.682 | 0.410 | 0.363566 | 1.09 |
30 | joel-bolomboy | UTA | 0.713 | 0.371 | 0.360835 | 1.08 |
13 | malachi-richardson | CHO | 0.720 | 0.353 | 0.358751 | 1.09 |
29 | ben-bentil | BOS | 0.761 | 0.324 | 0.358742 | 1.08 |
3 | kris-dunn | MIN | 0.693 | 0.354 | 0.355869 | 1.09 |