{"id":105,"date":"2016-06-10T11:32:25","date_gmt":"2016-06-10T09:32:25","guid":{"rendered":"http:\/\/benediktehinger.de\/blog\/science\/?p=105"},"modified":"2019-07-22T20:35:30","modified_gmt":"2019-07-22T18:35:30","slug":"dummy-coding-and-effects-coding","status":"publish","type":"post","link":"https:\/\/benediktehinger.de\/blog\/science\/dummy-coding-and-effects-coding\/","title":{"rendered":"Dummy coding and Effects coding"},"content":{"rendered":"<p>A small fact got me into trouble (spoiler: the intercept in effects coding represents the mean of conditions, not the data-mean).<\/p>\n<h3>Update 2016-08-11<\/h3>\n<p>I found a nice paper that remedies the last point: <a href=\"http:\/\/link.springer.com\/article\/10.1007\/s00038-016-0901-1\">weighted effects coding<\/a><\/p>\n<h3>Update 2018-11-30<\/h3>\n<p>If you enjoyed reading this post, check out my <a href=\"https:\/\/benediktehinger.de\/blog\/science\/interaction-and-effect-sum-coding\/\">successor post on Effect\/Sum Coding<\/a><\/p>\n<h3>Update 2019-07-22<\/h3>\n<p>I highly recommend this recent paper: <a href=\"https:\/\/arxiv.org\/abs\/1807.10451\">How to capitalize on a priori contrasts in linear (mixed) models: A tutorial (2019)<\/a> which covers everything in this blog post in more depth, with more examples, more code and better wording!<\/p>\n<h3>The goal<\/h3>\n<p>We try to model two factors with two levels each using a linear model. We therefore need a scheme to code categorical variables <em>as if<\/em> they were continuous variables.<\/p>\n<h3>2&#215;2 &#8220;ANOVA&#8221;<\/h3>\n<p>To start, we take a typical 2&#215;2 ANOVA design. 
It is balanced (each &#8216;cell&#8217; has an equal number of data points) and homoskedastic (equal variance in each cell).<br \/>\n<a href=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-07-1.png\" rel=\"attachment wp-att-112\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-112 size-medium aligncenter\" src=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-07-1-300x292.png\" alt=\"dummy_effects_coding-07\" width=\"300\" height=\"292\" srcset=\"https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-07-1-300x292.png 300w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-07-1-768x747.png 768w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-07-1.png 812w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>As you can see, there are two factors, A and B, with two levels each. For example, A could be &#8220;Drives a Car&#8221; and B could be &#8220;Owns a Suit&#8221;. The dependent variable, which we try to explain, could be &#8220;Total Money&#8221;. Alternatively, in cognitive neuroscience, A could be &#8220;Drank coffee before experiment&#8221; and B could be &#8220;Slept before experiment&#8221;; the dependent variable in that case could be &#8220;Alpha-Band-EEG-Amplitude&#8221;.<\/p>\n<p>The data are split up by A and color-coded by B. A is additionally shape-coded. 
If both A and B are &#8220;no&#8221;, the dependent variable is smallest; if both are &#8220;yes&#8221;, it is largest.<br \/>\n<a href=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-01.png\" rel=\"attachment wp-att-113\"><img loading=\"lazy\" decoding=\"async\" width=\"296\" height=\"300\" class=\"size-large wp-image-113 aligncenter\" src=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-01-296x300.png\" alt=\"dummy_effects_coding-01\" srcset=\"https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-01-296x300.png 296w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-01-768x778.png 768w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-01-1011x1024.png 1011w\" sizes=\"auto, (max-width: 296px) 100vw, 296px\" \/><\/a><br \/>\nThe image already includes the linear regression model we are going to use, 
as well as the respective means.<\/p>\n<h3>Main Effects<\/h3>\n<p>We are usually interested in the &#8220;main effects&#8221;, which are depicted in the next picture.<br \/>\n<a href=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-06.png\" rel=\"attachment wp-att-111\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-111\" src=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-06-296x300.png\" alt=\"dummy_effects_coding-06\" width=\"296\" height=\"300\" srcset=\"https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-06-296x300.png 296w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-06-768x778.png 768w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-06-1011x1024.png 1011w\" sizes=\"auto, (max-width: 296px) 100vw, 296px\" \/><\/a><br \/>\nA main effect answers the question: how much does the dependent variable change if we move from A &#8220;no&#8221; to A &#8220;yes&#8221;? In this case the main effect of A is 30, and that of B is 40.<\/p>\n<p>These categorical main effects can be estimated by linear regression. Because linear regression naively only works for continuous variables, we need a way to describe the categorical variables as continuous variables. In principle, we want to fit the red and cyan lines depicted in the plots. Two coding schemes are commonly used for this: <em>Dummy Coding<\/em> and <em>Effects Coding<\/em>.<\/p>\n<h3>Dummy Coding<\/h3>\n<p>Let&#8217;s start with <strong>Dummy Coding<\/strong>. We simply set the first level (&#8216;no&#8217;) to 0 and the second level (&#8216;yes&#8217;) to 1. 
That is, we treat the factor as a continuous variable that has data at only two distinct values, and code it as $X_A$.<br \/>\nThus for each factor we get one slope, which we call $\\beta_A$ and $\\beta_B$ respectively. In addition there could be an interaction, so we code this as well. The interaction regressor is simply the product of the two main-effect regressors, so it is coded as 1 only if A and B are both 1; its slope is $\\beta_{AB}$.<br \/>\nWe then estimate the betas (see the following image).<\/p>\n<p>How do we interpret the coefficients?<br \/>\n<a href=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-02.png\" rel=\"attachment wp-att-107\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-107\" src=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-02-284x300.png\" alt=\"dummy_effects_coding-02\" width=\"284\" height=\"300\" srcset=\"https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-02-284x300.png 284w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-02-768x813.png 768w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-02-968x1024.png 968w\" sizes=\"auto, (max-width: 284px) 100vw, 284px\" \/><\/a><br \/>\nFrom the image it should become clear that with dummy coding we are estimating the location of the cell-means. 
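<br \/>\nA sketch of what this means, assuming the model $\\hat{y} = \\beta_0 + \\beta_A X_A + \\beta_B X_B + \\beta_{AB} X_{AB}$ used later in this post and writing $\\mu_{nn}, \\mu_{yn}, \\mu_{ny}, \\mu_{yy}$ for the four cell means (a notation introduced here for illustration, with the first index for A, the second for B, n for &#8216;no&#8217; and y for &#8216;yes&#8217;): plugging in the 0\/1 codes gives<br \/>\n$$\\mu_{nn} = \\beta_0, \\qquad \\mu_{yn} = \\beta_0 + \\beta_A, \\qquad \\mu_{ny} = \\beta_0 + \\beta_B, \\qquad \\mu_{yy} = \\beta_0 + \\beta_A + \\beta_B + \\beta_{AB}$$<br \/>\nSo $\\beta_A$ and $\\beta_B$ are differences from the reference cell, not averages over the other factor.<br \/>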
In order to calculate the main effects we would need some additional calculations (for example, the main effect of A is $\\beta_A + \\beta_{AB}\/2$, the average of the A effect at B=&#8217;no&#8217; and at B=&#8217;yes&#8217;).<\/p>\n<h3>Effects Coding<\/h3>\n<p>For <strong>effects coding<\/strong> we set &#8216;no&#8217; to -1 (or -0.5 if you prefer) and &#8216;yes&#8217; to +1 (or +0.5).<br \/>\n<a href=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-03.png\" rel=\"attachment wp-att-108\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-108\" src=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-03-285x300.png\" alt=\"dummy_effects_coding-03\" width=\"285\" height=\"300\" srcset=\"https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-03-285x300.png 285w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-03-768x809.png 768w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-03-972x1024.png 972w\" sizes=\"auto, (max-width: 285px) 100vw, 285px\" \/><\/a><br \/>\nYou can clearly see that the parameter estimates now reflect the main effects, not the cell-means. $2\\cdot\\beta_A$ is the main effect of A. Why $\\cdot 2$? Because we coded with -1 \/ +1, so the jump from -1 to +1 is 2. 
If we used -0.5 \/ +0.5 instead (so the difference is 1, as in the dummy coding above), the parameter estimate would directly represent the main effect.<\/p>\n<p>How do we get the cell-means from effects coding?<br \/>\n<a href=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-05.png\" rel=\"attachment wp-att-110\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-110\" src=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-05-300x288.png\" alt=\"dummy_effects_coding-05\" width=\"300\" height=\"288\" srcset=\"https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-05-300x288.png 300w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-05-768x736.png 768w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-05-1024x982.png 1024w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><br \/>\nAn example helps to see what this means:<br \/>\n$$\\hat{y} = \\beta_0 + \\beta_A X_A + \\beta_B X_B + \\beta_{AB} X_{AB}$$<\/p>\n<p>We want to know the cell mean of B &#8216;yes&#8217; if A is &#8216;no&#8217;. Thus $X_B = +1$ and $X_A = -1$. As written before, $X_{AB}$ is the product of the two, thus $X_{AB} = X_A \\cdot X_B = (-1) \\cdot (+1) = -1$.<\/p>\n<p>$$\\hat{y} = \\beta_0 + \\beta_A \\cdot (-1) + \\beta_B \\cdot (+1) + \\beta_{AB} \\cdot (-1)$$<\/p>\n<h3>Intercepts<\/h3>\n<p>As visible in the graphs, the intercept of <em>dummy coding<\/em> represents the reference category, here the cell mean for A=&#8217;no&#8217; and B=&#8217;no&#8217;. The intercept of <em>effects coding<\/em> represents the mean of the condition means. 
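<br \/>\nA short derivation sketch of why, assuming the -1 \/ +1 coding and writing $\\mu_{nn}, \\mu_{yn}, \\mu_{ny}, \\mu_{yy}$ for the four cell means (a notation introduced here for illustration; first index A, second index B, n for &#8216;no&#8217;, y for &#8216;yes&#8217;): across the four cells each of $X_A$, $X_B$ and $X_{AB}$ sums to zero, so adding up the four cell predictions leaves only the intercept:<br \/>\n$$\\mu_{nn} + \\mu_{yn} + \\mu_{ny} + \\mu_{yy} = 4\\beta_0 \\quad\\Rightarrow\\quad \\beta_0 = \\frac{\\mu_{nn} + \\mu_{yn} + \\mu_{ny} + \\mu_{yy}}{4}$$<br \/>\nThe cell sizes do not enter this expression, so the intercept is the unweighted mean of the cell means.<br \/>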
This mean of condition means can be very different from the total mean of the data if the data are unbalanced:<br \/>\n<a href=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-04.png\" rel=\"attachment wp-att-109\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-109\" src=\"http:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-04-296x300.png\" alt=\"dummy_effects_coding-04\" width=\"296\" height=\"300\" srcset=\"https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-04-296x300.png 296w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-04-768x778.png 768w, https:\/\/benediktehinger.de\/blog\/science\/upload\/sites\/2\/2016\/04\/dummy_effects_coding-04-1011x1024.png 1011w\" sizes=\"auto, (max-width: 296px) 100vw, 296px\" \/><\/a><br \/>\nHere the A=&#8217;yes&#8217; and B=&#8217;yes&#8217; condition has 2.5 times more data points, which moves the total mean upwards. But the effects did not change, so the condition means and the mean of the condition means did not change either.<br \/>\nIt is actually quite useful that the <em>effects coding<\/em> intercept represents not the total mean but the mean of the condition means.<\/p>\n<p>And that&#8217;s it. When should you use which? I don&#8217;t think there is a clear-cut case for one or the other. It boils down to interpretation and personal preference, but in some cases one coding is more useful and in other cases the other. 
See, for example, <a href=\"http:\/\/www.ats.ucla.edu\/stat\/mult_pkg\/faq\/general\/effect.htm\">here under &#8220;why use effects coding&#8221;<\/a>.<\/p>\n<h3>Further reading<\/h3>\n<p><a href=\"https:\/\/methodology.psu.edu\/media\/techreports\/12-120.pdf\">Effect coding versus dummy coding in analysis of data from factorial experiments<\/a><br \/>\n<a href=\"http:\/\/www.ats.ucla.edu\/stat\/r\/library\/contrast_coding.htm\">A list of different coding schemes<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A small fact got me into trouble (spoiler: the intercept in effects coding represents the mean of conditions, not the data-mean). Update 2016-08-11 I found a nice paper that remedies the last point: weighted effects coding Update 2018-11-30 If you enjoyed reading this post, check out my successor post on Effect\/Sum Coding Update 2019-07-22 I highly recommend this recent paper: How to capitalize on a priori contrasts in linear (mixed) models: A tutorial (2019) which covers everything in this blog post in more depth, with more examples, more code and better wording! 
The goal We try to model two&#8230;<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[],"class_list":["post-105","post","type-post","status-publish","format-standard","hentry","category-blog"],"_links":{"self":[{"href":"https:\/\/benediktehinger.de\/blog\/science\/wp-json\/wp\/v2\/posts\/105","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/benediktehinger.de\/blog\/science\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/benediktehinger.de\/blog\/science\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/benediktehinger.de\/blog\/science\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/benediktehinger.de\/blog\/science\/wp-json\/wp\/v2\/comments?post=105"}],"version-history":[{"count":0,"href":"https:\/\/benediktehinger.de\/blog\/science\/wp-json\/wp\/v2\/posts\/105\/revisions"}],"wp:attachment":[{"href":"https:\/\/benediktehinger.de\/blog\/science\/wp-json\/wp\/v2\/media?parent=105"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/benediktehinger.de\/blog\/science\/wp-json\/wp\/v2\/categories?post=105"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/benediktehinger.de\/blog\/science\/wp-json\/wp\/v2\/tags?post=105"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}