About SharkWAR

Shark WAR is meant to be equivalent to the Wins Above Replacement (WAR) calculation found on baseball-reference.com and fangraphs.com. WAR combines offensive, defensive, and pitching statistics over a season into a single number that represents the number of wins the player contributed to their team, above the number of wins a "replacement" level player would be expected to contribute.

Shark WAR was created by Richard Franck, a data architect/analyst, baseball fan, and youth baseball umpire from Bethesda, Maryland. I created my own version because I wanted to perform calculations using WAR for all players and teams, and that data cannot be easily downloaded from other sites.

Definitions

The concept of Wins Above Replacement depends, first and foremost, on the concept of a Replacement Player. Essentially, if the player we are evaluating didn't play, what level of production would the team get from the player who took his place? We might think of this as being a good AAA player, but that doesn't help answer the question of how productive that player would be - how many runs they would generate on offense, or how many runs they would allow as a pitcher or fielder.

Instead, we define a replacement player based on team performance. A replacement player is the average player on a team with a .294 winning percentage, which works out to a record of about 48-114 in a 162-game season. This applies to all phases of the game: pitching, hitting, and defense. (For comparison, the 2024 Chicago White Sox finished 41-121.)

The following definitions are used in the data tables on this site.

Batting Definitions

Shark WAR    The total Wins Above Replacement, including hitting, fielding, and pitching
Batting WAR  The batting portion of the player's Wins Above Replacement
G            Games Played
PA           Plate Appearances, calculated as At Bats + Walks + Hit by Pitch + Sacrifice Flies + Sacrifice Hits
wRC          Weighted Runs Created, a calculation of the number of runs the hitter contributed to his team as a batter
OBP          On-Base Percentage: times reached base (Hits + Walks + Hit by Pitch) divided by plate appearances (excluding Sacrifice Hits)
SLG          Slugging Percentage: total bases (Hits + Doubles + (2 x Triples) + (3 x Home Runs)) divided by At Bats
HR           Home Runs
BB           Walks, including intentional walks
SB           Stolen Bases
CS           Caught Stealing
H            Hits
2B           Doubles
3B           Triples
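
The PA, OBP, and SLG definitions above translate directly into arithmetic. The following is a minimal Python sketch of those three formulas; the function and parameter names, and the sample stat line, are made up for illustration and are not taken from the Shark WAR data set.

```python
def plate_appearances(ab, bb, hbp, sf, sh):
    """PA = At Bats + Walks + Hit by Pitch + Sacrifice Flies + Sacrifice Hits."""
    return ab + bb + hbp + sf + sh


def on_base_pct(h, bb, hbp, ab, sf):
    """OBP = times on base / plate appearances excluding Sacrifice Hits."""
    denominator = ab + bb + hbp + sf  # PA with Sacrifice Hits left out
    return (h + bb + hbp) / denominator if denominator else 0.0


def slugging_pct(h, doubles, triples, hr, ab):
    """SLG = total bases / At Bats, with total bases built from hits."""
    total_bases = h + doubles + 2 * triples + 3 * hr
    return total_bases / ab if ab else 0.0


# Hypothetical season line, for illustration only.
pa = plate_appearances(ab=500, bb=60, hbp=5, sf=5, sh=0)
obp = on_base_pct(h=150, bb=60, hbp=5, ab=500, sf=5)
slg = slugging_pct(h=150, doubles=30, triples=5, hr=20, ab=500)
print(pa, round(obp, 3), round(slg, 3))
```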

Pitching Definitions

Shark WAR         The total Wins Above Replacement, including hitting, fielding, and pitching
Pitching WAR      The pitching portion of the player's Wins Above Replacement
Pitch Runs Saved  The number of runs the pitcher did not allow compared to the number of runs a Replacement Player would have allowed in the same number of innings pitched
ERA               Earned Run Average: earned runs allowed per 9 innings pitched
WHIP              Walks plus Hits per Inning Pitched
SOper9            Strikeouts per 9 innings pitched
G                 Games pitched
GS                Games pitched as the starting pitcher
IP                Innings Pitched
W                 Games Won
L                 Games Lost
CG                Complete Games
SV                Saves
R                 Runs allowed (earned and unearned)
BB                Walks allowed, including intentional walks
SO                Strikeouts
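
The three rate statistics in this table (ERA, WHIP, SOper9) are simple ratios over innings pitched. Here is a small Python sketch of them. It assumes innings are supplied as a true decimal (for example 180.333 for 180 and one-third innings) rather than box-score notation such as 180.1, and all names and sample numbers are illustrative.

```python
def era(earned_runs, innings_pitched):
    """Earned runs allowed per 9 innings pitched."""
    return 9 * earned_runs / innings_pitched


def whip(walks, hits, innings_pitched):
    """Walks plus hits per inning pitched."""
    return (walks + hits) / innings_pitched


def so_per_9(strikeouts, innings_pitched):
    """Strikeouts per 9 innings pitched."""
    return 9 * strikeouts / innings_pitched


# Hypothetical pitcher season, for illustration only.
print(era(60, 180.0), round(whip(50, 150, 180.0), 3), so_per_9(200, 180.0))
```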

Fielding Definitions

Fielding WAR    The fielding portion of the player's Wins Above Replacement
Def Runs Saved  The number of runs the fielder did not allow compared to the number of runs a Replacement Player would have allowed in the same number of defensive innings at that position
G               Games played defensively at the position
Inn             Innings played defensively at the position
PO              Putouts
A               Assists
E               Errors made
DP              Double Plays

Calculations

This section summarizes the calculations used in creating SharkWAR.

Principles

  1. Players are evaluated in the context of the season and league in which they played. A run is worth more in years of low scoring than in years of high scoring.
  2. Statistics are adjusted using park adjustment factors (from fangraphs.com) for number of runs above/below average for the player's home ballpark.
  3. The "Pythagorean Win Percentage" calculates a team's expected winning percentage from the number of runs scored and allowed:

    ExpectedWin% = RunsScored^1.83 / (RunsScored^1.83 + RunsAllowed^1.83)

  4. Assuming that our team of replacement players scored the same number of runs below the league average as they allowed above the league average, solve the Pythagorean Win Percentage formula to determine how many runs the replacement team scored and allowed.
  5. The number of "Runs per Win" for a league and season is the total number of runs scored in the league divided by the total number of games.
  6. For hitting, calculate the number of runs the hitter contributed above what a replacement player would contribute in the same number of plate appearances.
  7. For fielding, calculate the number of runs allowed below what a replacement player would allow in the same number of innings on defense, for each position the player played.
  8. For pitching, calculate the number of runs allowed below what a replacement player would allow in the same number of innings pitched.
  9. Do not calculate Fielding WAR for pitchers, on the assumption that the effect of pitcher fielding is captured by pitching statistics.
  10. Convert each of these values from Runs to Wins, based on the number of runs per win in that season and league.
  11. Conform the number of wins calculated to the total number of Wins Above Replacement in that league and season: the difference between a .500 winning percentage (the average for the entire league) and the .294 winning percentage of our replacement team, times the number of games. This total is allocated 50% to offense and 50% to defense (pitching plus fielding).
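
Principles 3, 4, and 11 can be worked through numerically. The sketch below is one way to do it in Python and is not necessarily how Shark WAR implements these steps. It assumes runs are expressed per game, that "the number of games" in principle 11 means team-games (teams times games per team), and it uses a made-up league average of 4.5 runs per game. With runs scored L - d and runs allowed L + d, the Pythagorean formula depends only on the ratio (L - d) / (L + d), so d can be solved in closed form.

```python
PYTHAG_EXPONENT = 1.83
REPLACEMENT_WIN_PCT = 0.294


def pythagorean_win_pct(runs_scored, runs_allowed, exponent=PYTHAG_EXPONENT):
    """Expected winning percentage from runs scored and allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def replacement_team_runs(league_avg_runs, win_pct=REPLACEMENT_WIN_PCT,
                          exponent=PYTHAG_EXPONENT):
    """Return (runs_scored, runs_allowed) for a team that scores d fewer and
    allows d more runs than league average and has the given expected win%."""
    # win% = r^x / (r^x + 1), where r = runs_scored / runs_allowed
    ratio = (win_pct / (1 - win_pct)) ** (1 / exponent)
    # (L - d) / (L + d) = r  =>  d = L * (1 - r) / (1 + r)
    delta = league_avg_runs * (1 - ratio) / (1 + ratio)
    return league_avg_runs - delta, league_avg_runs + delta


def total_war_available(team_games, win_pct=REPLACEMENT_WIN_PCT):
    """Wins above replacement available in a league season (principle 11)."""
    return (0.5 - win_pct) * team_games


# Hypothetical league: 4.5 runs per team per game, 30 teams, 162 games each.
scored, allowed = replacement_team_runs(4.5)
print(round(scored, 3), round(allowed, 3))
print(round(total_war_available(30 * 162), 1))  # split 50/50 offense/defense
```

Under these assumptions the replacement team scores and allows roughly 23% fewer and more runs than the league average, respectively.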

Hitting Calculations

WAR for batting is based on Weighted On-Base Average (wOBA), which uses coefficients for the average number of runs generated by each possible outcome of a plate appearance that is not an out. For example, in the National League in 2019, an average walk generated 0.693 runs, a single 0.872 runs, and a home run 1.949 runs. I use different coefficients for the AL and NL from 1947-2019 (from Baseball Reference). After 2019, when there are no longer any rule differences between the leagues, I use the value (from FanGraphs) for MLB as a whole.
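
To make the coefficient idea concrete, here is a minimal Python sketch that multiplies event counts by per-event run values and sums them. It uses only the three 2019 NL coefficients quoted above; the coefficients for doubles, triples, and hit by pitch are not given here, so they are left out rather than guessed. The function name and the sample counts are hypothetical, and the later steps (comparison against a replacement player, conversion to wins) are not shown.

```python
# Run values quoted in the text for the 2019 National League.
NL_2019_WEIGHTS = {"BB": 0.693, "1B": 0.872, "HR": 1.949}


def weighted_runs(event_counts, weights):
    """Sum of (run value x count) over the non-out events supplied.

    Raises KeyError if an event has no coefficient, so a missing
    weight is never silently treated as zero.
    """
    return sum(weights[event] * count for event, count in event_counts.items())


# Hypothetical hitter: 60 walks, 100 singles, 20 home runs.
print(round(weighted_runs({"BB": 60, "1B": 100, "HR": 20}, NL_2019_WEIGHTS), 2))
```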

Fielding Calculations

Fielding is the most involved of the three components of WAR, and the least precise. My approach is informed by Bill James and the Win Shares methodology described in his 2001 book of that name.

My approach has some key differences, though. James starts by calculating the Win Shares for each position on a team basis, and then allots those Win Shares to individual players. I follow similar logic for the calculation of defensive value, but I assign that at the player level. Instead of "balancing" the totals at the team/season level, my totals are balanced at the league/season level.

The general approach is that for each position, the basic defensive statistics (Putouts, Assists, Double Plays, measured as above/below average per defensive inning) are weighted based on how important that statistic is to being a good defensive player at that position. For example: for second basemen, putouts are weighted at 1, assists are weighted at 3, and double plays are weighted at 4 - turning double plays is the most important thing a 2B does on defense. For third basemen, putouts are weighted at 1, assists are weighted at 5, and double plays are weighted at 1 - range (measured by making assists) is the most important factor for a 3B. These weights were first developed by Bill James and I have adjusted them based on my own analysis.

For all positions, the number of Error Runs is calculated based on the league average number of "Unearned Runs per Error" (with an adjustment for year, since the number of errors assigned by official scorers has been steadily declining since about 1975).

For Infielders and Outfielders, a player's opportunity to make plays is dependent on the team's pitching staff. For each team, a Fly Ball/Ground Ball ratio is calculated, and the number of putouts and assists for fielders is adjusted by 50% of the difference between the team's ratio and the league's ratio. This is a rough attempt to split the difference between giving fielders credit for the plays they actually made, and accounting for the increased/decreased opportunities they had for making plays due to the tendencies of the pitching staff they played behind.

For Catcher and First Base, some team calculations are used to infer defensive performance by each player based on the percentage of defensive innings that player played.

A position weighting factor is used to assign defensive value among the positions to approximately:

This is only an approximation, however, since the total number of assists and putouts by position varies from season to season by an unexpectedly large amount.

For each position, the calculation of Defensive Runs Saved is some form of: SUM (Adjusted Putouts, Assists, Double Plays, times weighting factors) * Position Factor - Error Runs

Unadjusted Fielding WAR is calculated as Defensive Runs Saved divided by Runs per Win.
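
The general shape of the Defensive Runs Saved formula can be sketched in Python as follows. The 2B and 3B weights are the ones quoted earlier in this section; the position factor, error runs, runs per win, and the sample inputs are placeholder values chosen only to show the arithmetic, and the inputs are assumed to be already adjusted and expressed as plays above or below average.

```python
# Weights for putouts, assists, and double plays quoted in the text.
POSITION_WEIGHTS = {
    "2B": {"PO": 1, "A": 3, "DP": 4},
    "3B": {"PO": 1, "A": 5, "DP": 1},
}


def defensive_runs_saved(position, plays_above_avg, position_factor, error_runs):
    """SUM(adjusted stat x weight) * position factor - error runs."""
    weights = POSITION_WEIGHTS[position]
    weighted_sum = sum(weights[stat] * plays_above_avg[stat] for stat in weights)
    return weighted_sum * position_factor - error_runs


def unadjusted_fielding_war(runs_saved, runs_per_win):
    """Defensive Runs Saved divided by Runs per Win."""
    return runs_saved / runs_per_win


# Hypothetical second baseman: +10 putouts, +20 assists, +5 double plays.
# position_factor=0.1, error_runs=2.0, runs_per_win=9.0 are placeholders.
drs = defensive_runs_saved("2B", {"PO": 10, "A": 20, "DP": 5}, 0.1, 2.0)
print(round(drs, 2), round(unadjusted_fielding_war(drs, 9.0), 2))
```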

Pitching Calculations

The calculations for pitchers start by calculating Adjusted Runs Allowed as Earned Runs Allowed plus one-half of Unearned Runs Allowed, adjusted by:

The key calculation for pitchers is Pitching Runs Saved: (Replacement Pitcher Runs Allowed per Inning Pitched - Adjusted Runs Allowed per Inning Pitched) * Innings Pitched.

Pitching Runs Saved is then adjusted by a Leverage Factor. The leverage factor is an attempt to give pitchers more credit for pitching in high-leverage situations - when the game is on the line. For relief pitchers (fewer than 50% of games pitched as a starter), the leverage factor is 0.8 times the number of wins plus losses plus saves. For starting pitchers, the calculation is 0.35 times the number of wins plus losses, + 0.1 times the percentage of starts that were complete games. To account for the change in how pitchers have been used over the seasons, this number is added to a "base" ranging from 0.87 (prior to 1977) to 1.02 (after 2004).

Unadjusted Pitching WAR is then calculated as Pitching Runs Saved divided by Runs per Win.
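
The core of the pitching calculation can be illustrated with a short Python sketch. It covers only the pieces stated explicitly above: Adjusted Runs Allowed (before any further adjustments), Pitching Runs Saved, and the division by Runs per Win. The leverage factor is passed in as a number and applied as a multiplier, which is an assumption about how the adjustment works; the replacement rate and all sample values are placeholders.

```python
def adjusted_runs_allowed(earned_runs, unearned_runs):
    """Earned runs plus one-half of unearned runs (no further adjustments)."""
    return earned_runs + 0.5 * unearned_runs


def pitching_runs_saved(replacement_runs_per_inning, adj_runs_allowed,
                        innings_pitched):
    """(Replacement rate - pitcher's adjusted rate) x innings pitched."""
    pitcher_rate = adj_runs_allowed / innings_pitched
    return (replacement_runs_per_inning - pitcher_rate) * innings_pitched


def unadjusted_pitching_war(runs_saved, runs_per_win, leverage_factor=1.0):
    """Runs saved, scaled by an assumed multiplicative leverage factor,
    divided by runs per win."""
    return runs_saved * leverage_factor / runs_per_win


# Hypothetical starter: 70 ER, 10 unearned runs, 200 IP,
# against a placeholder replacement rate of 0.6 runs per inning.
ara = adjusted_runs_allowed(70, 10)
saved = pitching_runs_saved(0.6, ara, 200.0)
print(ara, round(saved, 1), round(unadjusted_pitching_war(saved, 9.0), 2))
```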

Adjusting Pitching and Fielding WAR

The final step is to adjust Pitching and Fielding WAR together to account for 50% of the total WAR in a season (the difference between a .500 winning percentage and the .294 winning percentage of a replacement team, times the number of games). The target is to assign pitching 85% of this amount, and fielding 15%; but this is adjusted by a "Defense Factor" that attempts to determine for each team how pitchers and fielders should split the credit (or blame) for preventing (or allowing) runs. The Defense Factor has 6 components, each based on the team's value compared to the league average:

  1. Walks factor (includes HBP): higher value penalizes pitchers
  2. Strikeout factor: higher value favors pitchers
  3. Home Run factor: higher value penalizes pitchers
  4. Defensive Efficiency Rating - the percentage of batted balls in play that result in outs: higher value favors fielders
  5. Error/Passed Ball factor: higher value penalizes fielders
  6. Double Play factor: higher value favors fielders

In the first step of the adjustment, these six values are used to adjust the split between pitching and fielding per team up to +/-10% from the 85%/15% target.

In the second step of the adjustment, the total Pitching WAR plus Fielding WAR for a league/season is adjusted to conform to the 50% of total WAR available for the season.
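
The two adjustment steps can be sketched in Python under some loudly stated assumptions. How the six Defense Factor components combine is not specified here, so the sketch takes a single pre-computed adjustment per team, treats "+/-10%" as percentage points around the 85%/15% split, and treats a positive value as favoring pitchers. The second step is shown as a simple proportional rescaling so that the league total matches the target. All names and numbers are illustrative.

```python
def team_pitching_share(defense_adjustment, base_share=0.85, max_shift=0.10):
    """Pitching's share of a team's run-prevention credit.

    defense_adjustment is a hypothetical pre-computed value; it is clamped
    so the share stays within max_shift of the 85% target.
    """
    shift = max(-max_shift, min(max_shift, defense_adjustment))
    return base_share + shift


def conform_to_total(values, target_total):
    """Rescale values proportionally so that they sum to target_total."""
    current_total = sum(values)
    if current_total == 0:
        raise ValueError("cannot rescale values that sum to zero")
    scale = target_total / current_total
    return [value * scale for value in values]


# Hypothetical team whose staff earns more credit than average.
share = team_pitching_share(0.25)  # clamped to 0.85 + 0.10
print(round(share, 2), round(1 - share, 2))

# Hypothetical unadjusted pitching-plus-fielding WAR values for a league,
# conformed to a made-up target of 20 wins.
print(conform_to_total([2.0, 3.0, 5.0], 20.0))
```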

Why Shark WAR?

The name "Shark WAR" comes from my time as an IT Architect, and a four-panel comic one of my co-workers found:

  1. Shark sits down at the dinner table and tells his wife, "Good news, I got the promotion!"
  2. Shark wife says "Are you the lead sharkitect now?"
  3. Shark slams his fin down and knocks the table over
  4. Shark says, "My career isn't a joke, Sharon"

From that time, my professional alter-ego was the "lead sharkitect".

Why 1947?

Restricting the Shark WAR data set to 1947 and later was done for three reasons:

  1. The early history of baseball has been analyzed to death and I am more interested in the periods I have lived through; baseball went through several large changes in how the game was played before 1945, and I chose not to worry about those.
  2. The impact of missed time from players who served in World War II is difficult to judge when analyzing questions about relative career performance of players. There are surprisingly few good players who played significant parts of their career both before and after the war (and are thus impacted by my decision to start the data in 1947). A few who are impacted are Ted Williams, Stan Musial, and Bob Feller.
  3. Although racial discrimination in baseball did not disappear on April 15, 1947, that's when it started to go away.

Areas for Improvement

There are a couple of areas I want to look into for improvements in the next version of the Shark WAR calculations.