About Shark WAR

Shark WAR is meant to be equivalent to the Wins Above Replacement (WAR) calculation found on baseball-reference.com and fangraphs.com. WAR combines offensive, defensive, and pitching statistics over a season into a single number that represents the number of wins the player contributed to their team, above the number of wins a "replacement" level player would be expected to contribute.

Shark WAR was created by Richard Franck, a data architect/analyst, baseball fan, and youth baseball umpire from Bethesda, Maryland. I created my own version because I wanted to perform calculations using WAR for all players and teams, and that data cannot be easily downloaded from other sites.

Definitions

The concept of Wins Above Replacement depends, first and foremost, on the concept of a Replacement Player. Essentially, if the player we are evaluating didn't play, what level of production would the team get from the player who took his place? We might think of this as being a good AAA player, but that doesn't help answer the question of how productive that player would be - how many runs they would generate on offense, or how many runs they would allow as a pitcher or fielder.

Instead, we define a replacement player based on team performance. A replacement player is the average player on a team with a .294 winning percentage, which works out to roughly 48-114 in a 162-game season. This applies to all phases of the game - pitching, hitting, and defense. (For comparison, the 2024 Chicago White Sox finished 41-121.)

The following definitions are used in the data tables on this site.

Batting Definitions

Shark WAR	The total Wins Above Replacement, including hitting, fielding, and pitching	HR	Home Runs
Batting WAR	The batting portion of the player's Wins Above Replacement	BB	Walks, including intentional walks
G	Games Played	SB	Stolen Bases
PA	Plate Appearances, calculated as At Bats + Walks + Hit by Pitch + Sacrifice Flies + Sacrifice Hits	CS	Caught Stealing
wRC	Weighted Runs Created, a calculation of the number of runs the hitter contributed to his team as a batter	H	Hits
OBP	Times reached base (Hits + Walks + Hit by Pitch) divided by plate appearances (excluding Sacrifice Hits)	2B	Doubles
SLG	Total bases (Hits + Doubles + (2 x Triples) + (3 x Home Runs)) divided by At Bats	3B	Triples
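
The rate statistics in this table can be sketched in a few lines of Python. This is only an illustration of the definitions above; the function and parameter names are my own placeholders and not part of the Shark WAR code, and the stat line in the usage note is made up.

```python
def plate_appearances(ab, bb, hbp, sf, sh):
    """PA = At Bats + Walks + Hit by Pitch + Sacrifice Flies + Sacrifice Hits."""
    return ab + bb + hbp + sf + sh


def on_base_pct(h, bb, hbp, ab, sf):
    """Times reached base divided by plate appearances excluding sacrifice hits."""
    denom = ab + bb + hbp + sf
    return (h + bb + hbp) / denom if denom else 0.0


def slugging_pct(h, doubles, triples, hr, ab):
    """Total bases divided by at bats.

    Hits already count one base each, so doubles add one extra base,
    triples two, and home runs three.
    """
    total_bases = h + doubles + 2 * triples + 3 * hr
    return total_bases / ab if ab else 0.0
```

For a hypothetical hitter with 500 at bats, 150 hits, 30 doubles, 5 triples, 20 home runs, 60 walks, 5 hit by pitch, 5 sacrifice flies, and no sacrifice hits, `plate_appearances(500, 60, 5, 5, 0)` gives 570 and `slugging_pct(150, 30, 5, 20, 500)` gives 0.5.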

Pitching Definitions

Shark WAR	The total Wins Above Replacement, including hitting, fielding, and pitching	IP	Innings Pitched
Pitching WAR	The pitching portion of the player's Wins Above Replacement	W	Games Won
Pitch Runs Saved	The number of runs the pitcher did not allow compared to the number of runs a Replacement Player would have allowed in the same number of innings pitched	L	Games Lost
ERA	Earned Run Average: earned runs allowed per 9 innings pitched	CG	Complete Games
WHIP	Walks plus Hits per Inning Pitched	SV	Saves
SOper9	Strikeouts per 9 innings pitched	R	Runs allowed (earned and unearned)
G	Games pitched	BB	Walks allowed, including intentional walks
GS	Games pitched as the starting pitcher	SO	Strikeouts

Fielding Definitions

Fielding WAR	The fielding portion of the player's Wins Above Replacement	PO	Putouts
Def Runs Saved	The number of runs the fielder did not allow compared to the number of runs a Replacement Player would have allowed in the same number of defensive innings at that position	A	Assists
G	Games played defensively at the position	E	Errors made
Inn	Innings played defensively at the position	DP	Double Plays

Calculations

This section summarizes the calculations used in creating Shark WAR.

Principles

Players are evaluated in the context of the season and league in which they played. A run is worth more in years of low scoring than in years of high scoring.
Statistics are adjusted using park adjustment factors (from fangraphs.com) for number of runs above/below average for the player's home ballpark.
The "Pythagorean Win Percentage" calculates a team's expected winning percentage from the number of runs scored and allowed:

ExpectedWin% = RunsScored^1.83 / (RunsScored^1.83 + RunsAllowed^1.83)
Assuming that our team of replacement players scored the same number of runs below the league average as they allowed above the league average, solve the Pythagorean Win Percentage formula to determine how many runs the replacement team scored and allowed.
The number of "Runs per Win" for a league and season is the total number of runs scored in the league divided by the total number of games.
For hitting, calculate the number of runs the hitter contributed above what a replacement player would contribute in the same number of plate appearances.
For fielding, calculate the number of runs allowed below what a replacement player would allow in the same number of innings on defense, for each position the player played.
For pitching, calculate the number of runs allowed below what a replacement player would allow in the same number of innings pitched.
Do not calculate Fielding WAR for pitchers, on the assumption that the effect of pitcher fielding is captured by pitching statistics.
Convert each of these values from Runs to Wins, based on the number of runs per win in that season and league.
Conform the number of wins calculated to the total number of Wins Above Replacement in that league and season: the difference between a .500 winning percentage (the average for the entire league) and the .294 winning percentage of our replacement team, times the number of games. This total is allocated 50% to offense and 50% to defense (pitching plus fielding).
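
As a rough sketch of the replacement-team step, the Pythagorean formula can be solved in closed form once we assume the replacement team scores d runs below the league average and allows d runs above it. The function names here are placeholders of my own, and I am assuming that "the number of games" in the last principle means total team-games in the league (teams times games per team).

```python
PYTH_EXP = 1.83              # exponent from the Pythagorean formula above
REPLACEMENT_WIN_PCT = 0.294  # winning percentage of the replacement team


def pythagorean_win_pct(runs_scored, runs_allowed, exp=PYTH_EXP):
    """Expected winning percentage from runs scored and runs allowed."""
    rs, ra = runs_scored ** exp, runs_allowed ** exp
    return rs / (rs + ra)


def replacement_team_runs(league_avg_runs, win_pct=REPLACEMENT_WIN_PCT, exp=PYTH_EXP):
    """Return (runs_scored, runs_allowed) for the replacement team.

    The team scores d runs below the league average and allows d runs above it.
    """
    # win_pct = RS^e / (RS^e + RA^e)  implies  (RA / RS)^e = (1 - win_pct) / win_pct
    ratio = ((1 - win_pct) / win_pct) ** (1 / exp)  # RA / RS
    # With RS = avg - d and RA = avg + d:  d = avg * (ratio - 1) / (ratio + 1)
    d = league_avg_runs * (ratio - 1) / (ratio + 1)
    return league_avg_runs - d, league_avg_runs + d


def league_war_pool(team_games, win_pct=REPLACEMENT_WIN_PCT):
    """Total wins above replacement available (assumes team_games = teams * games each)."""
    return (0.5 - win_pct) * team_games
```

With a hypothetical league average of 700 runs per team, `replacement_team_runs(700)` returns a team that scores roughly 536 runs and allows roughly 864, and feeding those two numbers back into `pythagorean_win_pct` returns .294.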

Hitting Calculations

WAR for batting is based on Weighted On-Base Average (wOBA), which uses coefficients for the average number of runs generated by each possible plate appearance outcome that is not an out. For example, in the National League in 2019, an average walk generated 0.693 runs, a single 0.872 runs, and a home run 1.949 runs. I use different coefficients for the AL and NL from 1947-2019 (from Baseball Reference). After 2019, when there are no longer any rule differences between the leagues, I use the value (from FanGraphs) for MLB as a whole.

Calculate wOBA using the coefficients
Calculate Weighted Runs Created (wRC) as: (((wOBA - League Average wOBA) / wOBA Scale) + league average runs per PA) * PA
Calculate "baserunning runs" using coefficients for stolen bases and caught stealing, plus sacrifice hits (which I give one-half the value of a stolen base)
Calculate the park adjustment runs based on the number of plate appearances for the player
Calculate Runs Above Replacement: (wRC + Baserunning Runs + Park Adjustment Runs) - (Replacement Player Runs per PA * PA)
Calculate Unadjusted Batting WAR: Runs Above Replacement / Runs Per Win
Multiply the Unadjusted Batting WAR by a factor so that the total Batting WAR for the league and season is one-half the difference between a .500 winning percentage and the .294 winning percentage of a replacement team, times the number of games.
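
The steps above can be sketched as a short chain of functions. This is a minimal illustration, not the actual Shark WAR code: the names are placeholders, the inputs (wOBA, league averages, baserunning and park runs) are assumed to be computed already, and the final scaling assumes a single multiplicative factor applied to every hitter in the league.

```python
def weighted_runs_created(woba, lg_woba, woba_scale, lg_runs_per_pa, pa):
    """wRC = (((wOBA - league wOBA) / wOBA scale) + league runs per PA) * PA."""
    return ((woba - lg_woba) / woba_scale + lg_runs_per_pa) * pa


def batting_runs_above_replacement(wrc, baserunning_runs, park_adj_runs,
                                   repl_runs_per_pa, pa):
    """Runs produced minus what a replacement player would produce in the same PA."""
    return (wrc + baserunning_runs + park_adj_runs) - repl_runs_per_pa * pa


def unadjusted_batting_war(runs_above_repl, runs_per_win):
    """Convert runs above replacement to wins."""
    return runs_above_repl / runs_per_win


def conform_to_pool(unadjusted_wars, pool):
    """Scale every player's WAR by one factor so the league total equals the pool."""
    factor = pool / sum(unadjusted_wars)
    return [w * factor for w in unadjusted_wars]
```

With made-up inputs (a .400 wOBA against a .320 league average, a wOBA scale of 1.2, 0.12 league runs per PA, and 600 PA), `weighted_runs_created` gives 112 runs. Adding 2 baserunning runs and -1 park runs, and subtracting a replacement level of 0.09 runs per PA, leaves 59 runs above replacement, or 5.9 unadjusted wins at 10 runs per win.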

Fielding Calculations

Fielding is the most involved of the three components of WAR, and the least precise. My approach is informed by Bill James and the Win Shares methodology described in his 2001 book of that name.

My approach has some key differences, though. James starts by calculating the Win Shares for each position on a team basis, and then allots those Win Shares to individual players. I follow similar logic for the calculation of defensive value, but I assign that at the player level. Instead of "balancing" the totals at the team/season level, my totals are balanced at the league/season level.

The general approach is that for each position, the basic defensive statistics (Putouts, Assists, Double Plays, measured as above/below average per defensive inning) are weighted based on how important that statistic is to being a good defensive player at that position. For example: for second basemen, putouts are weighted at 1, assists are weighted at 3, and double plays are weighted at 4 - turning double plays is the most important thing a 2B does on defense. For third basemen, putouts are weighted at 1, assists are weighted at 5, and double plays are weighted at 1 - range (measured by making assists) is the most important factor for a 3B. These weights were first developed by Bill James and I have adjusted them based on my own analysis.

For all positions, the number of Error Runs is calculated based on the league average number of "Unearned Runs per Error" (with an adjustment for year, since the number of errors assigned by official scorers has been steadily declining since about 1975).

For Infielders and Outfielders, a player's opportunity to make plays is dependent on the team's pitching staff. For each team, a Fly Ball/Ground Ball ratio is calculated, and the number of putouts and assists for fielders is adjusted by 50% of the difference between the team's ratio and the league's ratio. This is a rough attempt to split the difference between giving fielders credit for the plays they actually made, and accounting for the increased/decreased opportunities they had for making plays due to the tendencies of the pitching staff they played behind.

For Catcher and First Base, some team calculations are used to infer defensive performance by each player based on the percentage of defensive innings that player played.

For Catchers, the number of team strikeouts is subtracted from catcher putouts, leaving the number of "Catcher Independent Putouts". This number is further adjusted based on season, to account for the fact that newer ballparks have much less foul territory than older parks did. Catchers are also given a credit for the number of wins the team has, in a very rough attempt to give credit for a catcher's ability to manage the game and pitching staff defensively. Catcher caught stealing is not used since it has not been collected consistently; assists are used instead.
For First Basemen, we estimate the number of Unassisted Putouts at the team level by subtracting a portion of 2B/SS/3B/P assists from the team total of 1B putouts. Unassisted Putouts and Assists by 1B are weighted equally. In addition, we include errors by 3B and SS (times 30%) in the Error Runs calculation for first basemen, on the assumption that a good defensive 1B prevents throwing errors by other infielders.

A position weighting factor is used to assign defensive value among the positions to approximately:

Catcher: 20%
Shortstop: 20%
Second Base: 15%
Third Base: 13%
Center Field: 13%
First Base: 7%
Left Field: 6%
Right Field: 6%

This is only an approximation, however, since the total number of assists and putouts by position varies from season to season by an unexpectedly large amount.

For each position, the calculation of Defensive Runs Saved is some form of: SUM (Adjusted Putouts, Assists, Double Plays, times weighting factors) * Position Factor - Error Runs

Unadjusted Fielding WAR is calculated as Defensive Runs Saved divided by Runs per Win.
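
A minimal sketch of that general form, using the second base and third base weights quoted earlier. Everything else is an assumption for illustration: the function names, the dictionary layout, and the idea that the inputs are already adjusted plays above or below average. The position factor and error runs are passed in because their actual values are not given here.

```python
# Stat weights quoted in the text for two positions (putouts, assists, double plays).
WEIGHTS = {
    "2B": {"po": 1, "a": 3, "dp": 4},
    "3B": {"po": 1, "a": 5, "dp": 1},
}


def defensive_runs_saved(position, plays_above_avg, position_factor, error_runs):
    """SUM(adjusted PO, A, DP times weights) * position factor - error runs.

    plays_above_avg maps "po", "a", "dp" to the player's adjusted totals
    relative to league average (negative means below average).
    """
    weights = WEIGHTS[position]
    weighted_sum = sum(weights[stat] * plays_above_avg[stat] for stat in weights)
    return weighted_sum * position_factor - error_runs


def unadjusted_fielding_war(def_runs_saved, runs_per_win):
    """Convert Defensive Runs Saved to wins."""
    return def_runs_saved / runs_per_win
```

For a hypothetical second baseman who is 10 putouts, 20 assists, and 5 double plays above average, the weighted sum is 10 + 60 + 20 = 90. With a made-up position factor of 0.1 and 3 error runs, that works out to 6 Defensive Runs Saved.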

Pitching Calculations

The calculations for pitchers start by calculating Adjusted Runs Allowed as Earned Runs Allowed plus one-half Unearned Runs Allowed, adjusted by:

Strikeouts per inning minus league average strikeouts per inning
League average walks per inning minus walks per inning
Ballpark Adjustment

The key calculation for pitchers is Pitching Runs Saved: (Replacement Pitcher Runs Allowed per Inning Pitched - Adjusted Runs Allowed per Inning Pitched) * Innings Pitched.
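
As a sketch of this calculation: the starting point is earned runs plus half of unearned runs, and the result is compared with a replacement pitcher over the same innings. The function names are placeholders, the strikeout, walk, and ballpark adjustments are left out because their sizes are not spelled out here, and innings are assumed to be a true decimal (200 1/3 innings as 200.333, not the box-score notation 200.1).

```python
def base_adjusted_runs_allowed(earned_runs, unearned_runs):
    """Earned runs plus one-half of unearned runs, before any further adjustments."""
    return earned_runs + 0.5 * unearned_runs


def pitching_runs_saved(repl_runs_per_inning, adj_runs_allowed, innings_pitched):
    """(Replacement runs per inning - adjusted runs per inning) * innings pitched."""
    adj_runs_per_inning = adj_runs_allowed / innings_pitched
    return (repl_runs_per_inning - adj_runs_per_inning) * innings_pitched
```

For a hypothetical pitcher with 70 earned runs and 10 unearned runs in 200 innings, the base Adjusted Runs Allowed is 75. Against a made-up replacement level of 0.65 runs per inning (130 runs over 200 innings), that is 55 Pitching Runs Saved.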

Pitching Runs Saved is then adjusted by a Leverage Factor. The leverage factor is an attempt to give pitchers more credit for pitching in high-leverage situations - when the game is on the line. For relief pitchers (fewer than 50% of games pitched as a starter), the leverage factor is 0.8 times the number of wins plus losses plus saves. For starting pitchers, the calculation is 0.35 times the number of wins plus losses, plus 0.1 times the percentage of starts that were complete games. To account for the change in how pitchers have been used over the seasons, this number is added to a "base" ranging from 0.87 (prior to 1977) to 1.02 (after 2004).

Unadjusted Pitching WAR is then calculated as Pitching Runs Saved divided by Runs per Win.

Adjusting Pitching and Fielding WAR

The final step is to adjust Pitching and Fielding WAR together to account for 50% of the total WAR in a season (the difference between a .500 winning percentage and the .294 winning percentage of a replacement team, times the number of games). The target is to assign pitching 85% of this amount, and fielding 15%; but this is adjusted by a "Defense Factor" that attempts to determine for each team how pitchers and fielders should split the credit (or blame) for preventing (or allowing) runs. The Defense Factor has 6 components, each based on the team's value compared to the league average:

Walks factor (includes HBP): higher value penalizes pitchers
Strikeout factor: higher value favors pitchers
Home Run factor: higher value penalizes pitchers
Defensive Efficiency Rating - the percentage of batted balls in play that result in outs: higher value favors fielders
Error/Passed Ball factor: higher value penalizes fielders
Double Play factor: higher value favors fielders

In the first step of the adjustment, these six values are used to adjust the split between pitching and fielding per team up to +/-10% from the 85%/15% target.

In the second step of the adjustment, the total Pitching WAR plus Fielding WAR for a league/season is adjusted to conform to the 50% of total WAR available for the season.
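
The two steps might look something like the sketch below. It rests on several assumptions of mine: the six components are taken as already combined into a single `defense_shift` number (positive favors fielders), "+/-10%" is read as ten percentage points, and the second step is a single league-wide scaling factor. None of the names come from the actual Shark WAR code.

```python
def pitching_fielding_split(defense_shift, target_pitching=0.85, max_shift=0.10):
    """Step 1: move a team's pitching/fielding split away from the 85/15 target.

    defense_shift is a hypothetical combined Defense Factor; it is capped at
    max_shift in either direction. Returns (pitching_share, fielding_share).
    """
    shift = max(-max_shift, min(max_shift, defense_shift))
    pitching_share = target_pitching - shift
    return pitching_share, 1.0 - pitching_share


def conform_defense(pitching_wars, fielding_wars, defense_pool):
    """Step 2: scale league Pitching and Fielding WAR so together they equal the pool."""
    total = sum(pitching_wars) + sum(fielding_wars)
    factor = defense_pool / total
    return ([w * factor for w in pitching_wars],
            [w * factor for w in fielding_wars])
```

Under these assumptions, a team with a very large positive shift is capped at a 75%/25% split, and one with a very large negative shift at 95%/5%.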

Why Shark WAR?

The name "Shark WAR" comes from my time as an IT Architect, and a four-panel comic one of my co-workers found:

Shark sits down at the dinner table and tells his wife, "Good news, I got the promotion!"
Shark wife says "Are you the lead sharkitect now?"
Shark slams his fin down and knocks the table over
Shark says, "My career isn't a joke, Sharon"

From that time, my professional alter-ego was the "lead sharkitect".

Why 1947?

Restricting the Shark WAR data set to 1947 and later was done for three reasons:

The early history of baseball has been analyzed to death and I am more interested in the periods I have lived through; baseball went through several large changes in how the game was played before 1945, and I chose not to worry about those.
The impact of missed time from players who served in World War II is difficult to judge when analyzing questions about relative career performance of players. There are surprisingly few good players who played significant parts of their career both before and after the war (and are thus impacted by my decision to start the data in 1947). A few who are impacted are Ted Williams, Stan Musial, and Bob Feller.
Although racial discrimination in baseball did not disappear on April 15, 1947, that's when it started to go away.

Areas for Improvement

There are a couple of areas I want to look into for improvements in the next version of the Shark WAR calculations.

For offense, the numbers change in some unusual ways after 2019. I want to try to sort out the impact of MLB rule changes (pitch clock, banning shifts, universal DH) in these shifts; and also investigate the impact of my switch from Baseball Reference wOBA coefficients to FanGraphs coefficients. This switch was done because FanGraphs calculates MLB wide, and because Baseball Reference isn't very good about updating their documentation with the latest values. The purpose of the wOBA scale factor from the reference data is also a bit of a mystery to me. I may attempt to calculate the coefficients myself instead of using the ones from baseball-reference.
For fielding, I want to look at the distribution of Defensive Runs Saved across positions over time, and specifically, the level of "replacement player" defensive runs saved, which impacts the portion of fielders who are assigned negative fielding WAR vs positive fielding WAR. For example, first basemen as a total are assigned negative fielding WAR for most seasons prior to 2005 - this is especially dramatic for seasons prior to 1973. (I have a theory about the contribution of the DH to this - prior to 1973, you had to play your worst fielder somewhere, and 1B was often the choice.) There is an opportunity for me to adjust the weighting factors and position factors to get a more consistent distribution of Fielding WAR across positions and across seasons, and to eliminate some of the extreme values. For example, Bill Mazeroski's 39.4 career fielding WAR is almost twice the career total for any other 2B (Glenn Hubbard is second with 21.9). On the other hand, Bill Mazeroski is regarded as the best defensive 2B of all time and an absolute wizard of turning the double play, so maybe this is an accurate reflection of his worth.
For pitching, the use of walks and strikeouts as an adjustment factor may be "double-counting" the effects of these, since these already have an impact on the number of runs allowed. On the other hand, it gives some weight to these "fielding independent" pitching outcomes, which may be justified.