{"id":"23bd5727ae03206e","slug":"methods-of-comparison-compared","trashed":false,"description":"","likes":193,"publish_level":"live","forks":18,"fork_of":null,"has_importers":false,"update_time":"2020-07-14T17:00:07.083Z","first_public_version":744,"paused_version":null,"publish_time":"2023-04-10T02:24:13.820Z","publish_version":744,"latest_version":744,"thumbnail":"8615706549954cf3fa6fd285e9bf87595567d61ce274d1fac0955cca8624ecb8","default_thumbnail":"8615706549954cf3fa6fd285e9bf87595567d61ce274d1fac0955cca8624ecb8","roles":[],"sharing":null,"owner":{"id":"8c2e47252fd4995c","avatar_url":"https://avatars.observableusercontent.com/avatar/82811927da99f8938001b2ef1f552ad2c47083e46ebc55a3a146a5a5848c4519","login":"mbostock","name":"Mike Bostock","bio":"Visualization toolmaker. Founder @observablehq. Creator @d3. Former @nytgraphics. Pronounced BOSS-tock.","home_url":"https://bost.ocks.org/mike/","type":"team","tier":"starter_2024"},"creator":{"id":"074c414ad1d825f5","avatar_url":"https://avatars.observableusercontent.com/avatar/82811927da99f8938001b2ef1f552ad2c47083e46ebc55a3a146a5a5848c4519","login":"mbostock","name":"Mike Bostock","bio":"Visualization toolmaker. Founder @observablehq. Creator @d3. Former @nytgraphics. Pronounced BOSS-tock.","home_url":"https://bost.ocks.org/mike/","tier":"pro"},"authors":[{"id":"074c414ad1d825f5","avatar_url":"https://avatars.observableusercontent.com/avatar/82811927da99f8938001b2ef1f552ad2c47083e46ebc55a3a146a5a5848c4519","name":"Mike Bostock","login":"mbostock","bio":"Visualization toolmaker. Founder @observablehq. Creator @d3. Former @nytgraphics. 
Pronounced BOSS-tock.","home_url":"https://bost.ocks.org/mike/","tier":"pro","approved":true,"description":""}],"collections":[{"id":"d9df4fb5263ace62","type":"public","slug":"visualization","title":"Visualization","description":"Explore and explain patterns in quantitative data using D3, Vega, and Observable Plot","update_time":"2023-07-19T17:50:40.230Z","pinned":true,"ordered":true,"custom_thumbnail":"09e385d95ce7df7d392b0133d68e97dd5675378190775d038d21516ea62178ba","default_thumbnail":"32ad600cf556f7b9991b2a11f7b6b8e55abe7ada6062ef88a5a7de422bb261ab","thumbnail":"09e385d95ce7df7d392b0133d68e97dd5675378190775d038d21516ea62178ba","listing_count":84,"parent_collection_count":1,"owner":{"id":"f35c755083683fe5","avatar_url":"https://avatars.observableusercontent.com/avatar/5a51c3b908225a581d20577e488e2aba8cbc9541c52982c638638c370c3e5e8e","login":"observablehq","name":"Observable","bio":"The end-to-end solution for building and hosting better data apps, dashboards, and reports.","home_url":"https://observablehq.com","type":"team","tier":"enterprise_2024"}},{"id":"1efbaaa337070fe6","type":"public","slug":"data-analysis","title":"Data Analysis","description":"","update_time":"2023-07-17T21:43:17.395Z","pinned":false,"ordered":true,"custom_thumbnail":null,"default_thumbnail":"c02d1db05cf172800cdfd65e3dee2b8e6e8401e661e01b184c4dd6b090799a9a","thumbnail":"c02d1db05cf172800cdfd65e3dee2b8e6e8401e661e01b184c4dd6b090799a9a","listing_count":13,"parent_collection_count":0,"owner":{"id":"f35c755083683fe5","avatar_url":"https://avatars.observableusercontent.com/avatar/5a51c3b908225a581d20577e488e2aba8cbc9541c52982c638638c370c3e5e8e","login":"observablehq","name":"Observable","bio":"The end-to-end solution for building and hosting better data apps, dashboards, and reports.","home_url":"https://observablehq.com","type":"team","tier":"enterprise_2024"}},{"id":"98a964a057c43289","type":"public","slug":"county-maps","title":"County Maps","description":"A collection of examples of US 
County based maps for reference and inspiration","update_time":"2020-10-06T19:45:00.816Z","pinned":false,"ordered":false,"custom_thumbnail":null,"default_thumbnail":"deca7aa1610e893b52b1f96bf4e2377c50985fd74cb81732f4ba729697ed36c8","thumbnail":"deca7aa1610e893b52b1f96bf4e2377c50985fd74cb81732f4ba729697ed36c8","listing_count":26,"parent_collection_count":1,"owner":{"id":"f35c755083683fe5","avatar_url":"https://avatars.observableusercontent.com/avatar/5a51c3b908225a581d20577e488e2aba8cbc9541c52982c638638c370c3e5e8e","login":"observablehq","name":"Observable","bio":"The end-to-end solution for building and hosting better data apps, dashboards, and reports.","home_url":"https://observablehq.com","type":"team","tier":"enterprise_2024"}}],"files":[],"comments":[],"commenting_lock":null,"suggestion_from":null,"suggestions_to":[],"version":744,"title":"Methods of Comparison, Compared","license":"isc","copyright":"Copyright 2018–2020 Mike Bostock","nodes":[{"id":0,"value":"md`# Methods of Comparison, Compared\n\nWe often wish to understand something by comparing values.\n\nFor example, say I’m happily biking along 🚲😅 at ${tex`17\\,\\text{mph}`} when I get passed by a car 🚗💨 going ${tex`51\\,\\text{mph}`}. I might think: *Yikes! That car was going ${tex`\\tfrac{51}{17} = `} three times my speed!* Or: *That car was going ${tex`51 - 17 = +34\\,\\text{mph}`} faster than me!* (Cars are scary!) Or maybe I’m a cryptocurrency speculator, and I bought one Dinglecoin at the start of the year for ${tex`\\$14{,}741`} and then sold it yesterday for ${tex`\\$6{,}638`}. I’d say: *Oops, my return was ${tex`\\tfrac{6{,}638 - 14{,}741}{14{,}741} = -55\\%`}*. Maybe I should invest elsewhere.\n\nThere are many ways to compare values. Depending on what you seek to understand, one method may be better than another. 
In this post, we’ll walk through some common methods and consider their uses.\n\nTo make it concrete, we’ll use data from a study by [Dwyer-Lindgren *et al.*](https://jamanetwork.com/journals/jama/fullarticle/2674665) on substance use disorders and intentional injuries, and specifically how deaths from alcoholism vary across the United States between 1980 and 2014.`","pinned":false,"mode":"js","data":null,"name":null},{"id":563,"value":"side = md`## Side-by-Side\n\nLet’s start by looking separately at 1980 and 2014. Hover over any of the counties to see the underlying values.`","pinned":false,"mode":"js","data":null,"name":null},{"id":225,"value":"md`### Deaths from Alcohol Use Disorders, 1980`","pinned":false,"mode":"js","data":null,"name":null},{"id":229,"value":"map(\n  id => color(deaths.get(id)[0]),\n  id => `${names.get(id)}\n${format(deaths.get(id)[0])} deaths per 100,000 people`\n)","pinned":false,"mode":"js","data":null,"name":null},{"id":223,"value":"legend(color, \"Deaths per 100,000 people\")","pinned":false,"mode":"js","data":null,"name":null},{"id":227,"value":"md`### Deaths from Alcohol Use Disorders, 2014`","pinned":false,"mode":"js","data":null,"name":null},{"id":219,"value":"map(\n  id => color(deaths.get(id)[1]),\n  id => `${names.get(id)}\n${format(deaths.get(id)[1])} deaths per 100,000 people`,\n  visibility\n)","pinned":false,"mode":"js","data":null,"name":null},{"id":242,"value":"md`Some patterns are already visible: the high mortality rate from alcohol in low-density, predominantly [Native American counties](https://en.wikipedia.org/wiki/Alcohol_and_Native_Americans) (such as Apache County, Arizona with ${tex`31.5`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color(31.5)}\"></div>`} deaths per 100,000 people in 2014), and the decline in alcohol-related mortality across the <a href=\"https://en.wikipedia.org/wiki/Black_Belt_(U.S._region)\">Black Belt</a>. 
These trends should be considered within the broader context of substance use, such as the [recent epidemic of drug overdoses](https://www.nytimes.com/interactive/2016/01/07/us/drug-overdose-deaths-in-the-us.html).\n\nWe can compare by darting back and forth between the two maps, but can we see the temporal pattern more directly?`","pinned":false,"mode":"js","data":null,"name":null},{"id":53,"value":"difference = md`## Difference\n\nSurely the simplest method to compare two values is to subtract them.\n\n${tex.block`\\text{difference}(a, b) = b - a`}`","pinned":false,"mode":"js","data":null,"name":null},{"id":278,"value":"md`### Change in Deaths from Alcohol Use Disorders, 1980–2014`","pinned":false,"mode":"js","data":null,"name":null},{"id":271,"value":"map(\n  id => color2(deaths.get(id)[1] - deaths.get(id)[0]),\n  id => `${names.get(id)}\n${formatChange(deaths.get(id)[1] - deaths.get(id)[0])} deaths per 100,000 people\n${format(deaths.get(id)[0])} per 100,000 in 1980\n${format(deaths.get(id)[1])} per 100,000 in 2014`,\n  visibility\n)","pinned":false,"mode":"js","data":null,"name":null},{"id":275,"value":"legend(color2, \"Change in deaths per 100,000 people\", \"+d\")","pinned":false,"mode":"js","data":null,"name":null},{"id":282,"value":"md`In Apache County, the mortality rate increased from ${tex`18.6`} deaths per 100,000 people in 1980 to ${tex`31.5`} per 100,000 people in 2014, a difference of ${tex`31.5 - 18.6 = +13.0`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color2(13.0)}\"></div>`} deaths per 100,000 people. 
Meanwhile, Plymouth County, Massachusetts increased by ${tex`3.2 - 0.9 = +2.3`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color2(2.3)}\"></div>`} deaths per 100,000 people.\n\nThe difference here conveys the *marginal change* in mortality rates: where people are becoming more or less likely to die from alcoholism, weighted according to the absolute change in likelihood.`","pinned":false,"mode":"js","data":null,"name":null},{"id":11,"value":"relative = md`## Relative Change\n\nWe may wish to normalize the difference relative to the starting value.\n\n${tex.block`\\text{relative change}(a, b) = \\frac{\\text{b} - \\text{a}}{\\text{a}}`}\n\nOften this is multiplied by 100 to produce *percentage change*.`","pinned":false,"mode":"js","data":null,"name":null},{"id":328,"value":"md`### Change in Deaths from Alcohol Use Disorders, 1980–2014`","pinned":false,"mode":"js","data":null,"name":null},{"id":303,"value":"map(\n  id => color3a((deaths.get(id)[1] - deaths.get(id)[0]) / deaths.get(id)[0]),\n  id => `${names.get(id)}\n${formatPercentChange((deaths.get(id)[1] - deaths.get(id)[0]) / deaths.get(id)[0])}\n${format(deaths.get(id)[0])} per 100,000 in 1980\n${format(deaths.get(id)[1])} per 100,000 in 2014`,\n  visibility\n)","pinned":false,"mode":"js","data":null,"name":null},{"id":311,"value":"d3.select(legend(color3a, \"Percentage change of deaths per 100,000 people\"))\n  .call(g => g.selectAll(\".tick text\").text(d => (d > 0 ? \"+\" : \"\") + d * 100))\n  .node()","pinned":false,"mode":"js","data":null,"name":null},{"id":331,"value":"md`This map differs dramatically from the previous one! 
The decline in mortality in the Black Belt is much more pronounced, as is an increase in the Midwest, such as the ${tex`+254.9\\%`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color3a(2.55)}\"></div>`} increase in Randolph County, Indiana.\n\nBut is this a good way to look at this data?\n\nConsider Plymouth County again: it increased from ${tex`0.9`} to ${tex`3.2`} deaths per 100,000, which is ${tex`\\tfrac{3.2 - 0.9}{0.9} = +243.8\\%`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color3a(2.44)}\"></div>`}. Meanwhile, Apache County “only” increased by ${tex`\\tfrac{31.5 - 18.6}{18.6} = +69.7\\%`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color3a(0.70)}\"></div>`}. Is the change ${tex`3.5\\times`} more notable in Plymouth County than in Apache County? Probably not: counties with a low initial mortality rate—the denominator in our relative change formula—have a much larger relative change for the same absolute change. On the other hand, a tripling of the alcohol-related mortality rate could be a significant concern.\n\nAlso, note that the color scale above is not symmetric: its minimum is ${tex`-83\\%`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color3a(-0.83)}\"></div>`} while its maximum is ${tex`+265\\%`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color3a(2.65)}\"></div>`}. 
In other words, the scale treats ${tex`+50\\%`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color3a(0.5)}\"></div>`} as significant as ${tex`-16\\%`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color3a(-0.16)}\"></div>`}, and ${tex`-50\\%`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color3a(-0.5)}\"></div>`} as significant as ${tex`+160\\%`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color3a(+1.6)}\"></div>`}. If we force the scale to be symmetric, we see another dramatic change.`","pinned":false,"mode":"js","data":null,"name":null},{"id":435,"value":"md`### Change in Deaths from Alcohol Use Disorders, 1980–2014`","pinned":false,"mode":"js","data":null,"name":null},{"id":437,"value":"map(\n  id => color3b((deaths.get(id)[1] - deaths.get(id)[0]) / deaths.get(id)[0]),\n  id => `${names.get(id)}\n${formatPercentChange((deaths.get(id)[1] - deaths.get(id)[0]) / deaths.get(id)[0])}\n${format(deaths.get(id)[0])} per 100,000 in 1980\n${format(deaths.get(id)[1])} per 100,000 in 2014`,\n  visibility\n)","pinned":false,"mode":"js","data":null,"name":null},{"id":439,"value":"d3.select(legend(color3b, \"Percentage change of deaths per 100,000 people\"))\n  .call(g => g.selectAll(\".tick text\").text(d => (d > 0 ? \"+\" : \"\") + d * 100))\n  .node()","pinned":false,"mode":"js","data":null,"name":null},{"id":457,"value":"md`Yeesh, what’s going on here? A percentage change of less than ${tex`-100\\%`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color3b(-1)}\"></div>`} is nonsense here: it would imply a negative number of deaths. 
So if the scale is symmetric and fits the data, its domain is ${tex`\\pm265\\%`} and we can’t reach most of the darker blues.\n\nIs there a way to show relative change that is symmetric?`","pinned":false,"mode":"js","data":null,"name":null},{"id":7,"value":"ratio = md`## Ratio\n\nWe can also compare two values by asking: how big is ${tex`b`} relative to ${tex`a`}?\n\n${tex.block`\\text{ratio}(a, b) = \\frac{\\text{b}}{\\text{a}}`}\n\nThis is sometimes referred to as *multiplicative change* to distinguish it from *additive change*: the difference obtained by subtracting ${tex`b - a`}.`","pinned":false,"mode":"js","data":null,"name":null},{"id":408,"value":"md`### Change in Deaths from Alcohol Use Disorders, 1980–2014`","pinned":false,"mode":"js","data":null,"name":null},{"id":394,"value":"map(\n  id => color4(deaths.get(id)[1] / deaths.get(id)[0]),\n  id => `${names.get(id)}\n${formatRatio(deaths.get(id)[1] / deaths.get(id)[0])}\n${format(deaths.get(id)[0])} per 100,000 in 1980\n${format(deaths.get(id)[1])} per 100,000 in 2014`,\n  visibility\n)","pinned":false,"mode":"js","data":null,"name":null},{"id":403,"value":"d3.select(legend(color4, \"Relative likelihood of death\", formatRatio))\n  .node()","pinned":false,"mode":"js","data":null,"name":null},{"id":414,"value":"md`This map is much closer to the previous relative change map than the first absolute difference map, but with an important difference: we can now make it symmetric for both positive and negative change by taking the logarithm of the ratio.\n\n${tex.block`\\text{log ratio}(a, b) = \\log\\frac{\\text{b}}{\\text{a}}`}\n\nLet’s return to Apache County, Arizona. The 2014 mortality rate is ${tex`\\tfrac{31.5}{18.6} = 1.7\\times`} its 1980 mortality rate, so the log ratio is ${tex`+0.53`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color4(1.7)}\"></div>`}. 
The log scale is symmetric, so it treats this increase from ${tex`18.6`} to ${tex`31.5`} as significant as a decrease from ${tex`31.5`} to ${tex`18.6`}, which is a log ratio of ${tex`\\log(\\tfrac{18.6}{31.5} = 0.59\\times) = -0.53`} ${html`<div style=\"display:inline-block;width:16px;height:16px;background:${color4(0.59)}\"></div>`}. For example, that’s about the same as Talbot County, Georgia’s decline from ${tex`7.3`} to ${tex`4.2`} deaths per 100,000.\n\nLog ratios are often used when considering growth, as with investment returns. For example, if a stock doubles and then halves, you’re back where you started: ${tex`\\log(\\tfrac{2}{1}) + \\log(\\tfrac{1}{2}) = 0`}. On the other hand if a stock goes up by fifty percent then down by fifty percent, you’ve lost twenty-five percent of your investment: ${tex`(1 \\times 0.5) - (1.5 \\times 0.5) = -0.25`}. This is why log scales are commonly used in stock price charts, such as this [change line chart](/@mbostock/d3-change-line-chart) and [index chart](/@mbostock/d3-index-chart).`","pinned":false,"mode":"js","data":null,"name":null},{"id":557,"value":"md`## Which is Best?\n\nI know it’s disappointing, but: none of them. No method is better universally, and none of them is “the best” even in the context of the dataset. (There are also a number of methods I did not cover, such as the [relative difference](https://en.wikipedia.org/wiki/Relative_change_and_difference).) What’s best depends on what you are trying to show. I’d favor [absolute difference](#difference) here as the simplest option, but [log ratio](#ratio) might work if you want to show rate of growth.\n\nThere’s another important variable here which we’re ignoring, but which might influence our understanding of the data: population counts. This data is *per capita* (deaths per 100,000 people per year), which is helpful for understanding how likely any individual is to die, but *not* the number of people affected. 
Populations vary widely from county to county, and populations move over time. This makes it especially hard to understand trends that vary both geographically and temporally.\n\nIf anything, this essay demonstrates the importance of reading the legend: each map above uses the identical (valid!) title while showing something very different. If we, as readers, do not give the legend a critical eye, we can easily misunderstand, or worse, be actively misled.`","pinned":false,"mode":"js","data":null,"name":null},{"id":704,"value":"md`#### Acknowledgments\n\n*Thanks to [Lisa Charlotte Rost](https://twitter.com/lisacrost), whose [recent posts on log scales](https://blog.datawrapper.de/weeklychart-logscale/) ([1](https://blog.datawrapper.de/weeklychart-logscale/), [2](https://blog.datawrapper.de/weeklychart-logscale2/), [3](https://blog.datawrapper.de/weeklychart-logscale3/)) inspired this essay!*`","pinned":false,"mode":"js","data":null,"name":null},{"id":215,"value":"md`---\n\n## Appendix`","pinned":false,"mode":"js","data":null,"name":null},{"id":305,"value":"formatPercentChange = d3.format(\"+.1%\")","pinned":true,"mode":"js","data":null,"name":null},{"id":399,"value":"formatRatio = {\n  const format = d3.format(\".2~r\");\n  return x => format(x) + \"×\";\n}","pinned":true,"mode":"js","data":null,"name":null},{"id":299,"value":"color3a = {\n  const values = [...deaths.values()];\n  return d3.scaleLinear()\n      .domain([d3.min(values, ([a, b]) => (b - a) / a), 0, d3.max(values, ([a, b]) => (b - a) / a)])\n      .range([-1, 0, 1])\n      .interpolate((a, b) => a < 0 \n          ? 
t => d3.interpolateBlues(1 - t) \n          : t => d3.interpolateReds(t));\n}","pinned":true,"mode":"js","data":null,"name":null},{"id":426,"value":"color3b = {\n  const values = [...deaths.values()];\n  const max = Math.max(-d3.min(values, ([a, b]) => (b - a) / a), d3.max(values, ([a, b]) => (b - a) / a));\n  return d3.scaleLinear()\n      .domain([-max, 0, max])\n      .range([-1, 0, 1])\n      .interpolate((a, b) => a < 0 \n          ? t => d3.interpolateBlues(1 - t) \n          : t => d3.interpolateReds(t));\n}","pinned":true,"mode":"js","data":null,"name":null},{"id":392,"value":"color4 = {\n  const values = [...deaths.values()];\n  const max = Math.max(d3.max(values, ([a, b]) => a / b), d3.max(values, ([a, b]) => b / a));\n  return d3.scaleLog()\n      .domain([1 / max, 1, max])\n      .range([-1, 0, 1])\n      .interpolate((a, b) => a < 0 \n          ? t => d3.interpolateBlues(1 - t)\n          : t => d3.interpolateReds(t));\n}","pinned":true,"mode":"js","data":null,"name":null},{"id":217,"value":"import {\n  map,\n  legend,\n  names,\n  deaths,\n  format,\n  formatChange,\n  color,\n  color2,\n  d3\n} from \"@mbostock/mortality-due-to-alcohol-use-disorder\"","pinned":true,"mode":"js","data":null,"name":null}],"resolutions":[],"schedule":null,"last_view_time":null}