# Test for curvilinear relationship

### Curvilinear Regression

Sometimes, when you analyze data with correlation and linear regression, the analysis suggests there is no relationship even though a real, curvilinear relationship exists. Curvilinear regression tests for this possibility: you add a squared term to the regression equation and examine the curvilinear effect by testing the null hypothesis β₂ = 0, where β₂ is the coefficient on the squared term.

Such an interaction would be symmetric. For people with little creativity, there would be little or no correlation between intelligence and productivity. For people with high creativity, there would be a strong correlation between intelligence and productivity. We could create three new graphs to show these relations. All we would have to do is take the graphs we already made and substitute the terms "creativity" and "cognitive ability."

In regression terms, an interaction means that the level of one variable influences the slope of the other variable. We model interaction terms by computing a product vector (that is, we multiply the two IVs together to get a third variable) and then including this variable along with the other two in the regression equation.
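A minimal numpy sketch of this idea, using simulated data (the variable names `x1`, `x2` and all coefficient values are illustrative, not from the text): we form the product vector and include it as a third predictor in an ordinary least squares fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)  # first IV
x2 = rng.normal(size=n)  # second IV
# Simulated truth includes an interaction: the slope of x2 depends on x1.
y = 1.0 + 0.5 * x1 + 0.8 * x2 + 0.6 * x1 * x2 + rng.normal(scale=0.5, size=n)

# Product vector: multiply the two IVs together to get a third variable.
x1x2 = x1 * x2

# Design matrix with intercept, both IVs, and the product term.
X = np.column_stack([np.ones(n), x1, x2, x1x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # b[3] estimates the interaction weight (0.6 in the simulation)
```

The coefficient on the product term is what carries the interaction; if it is near zero, the slopes are parallel.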

[Figure: a graph of the hypothesized response surface.] Note how the regression line of Y on X2 becomes steeper as we move up values of X1. Also note the curved contour lines on the floor of the figure.

This means that the regression surface is curved. Here we can clearly see how the slopes become steeper as we move up values of both X variables. When we model an interaction with 2 or more IVs with regression, the test we conduct is essentially for this shape. There are many other shapes that we might think of as representing the idea of interaction (one variable influences the importance of the other), but these other shapes are not tested by the product term in regression. (Things are different for categorical variables and product terms; there we can support many different shapes.)

### Pedhazur's Views of the Interaction

In Pedhazur's view, it only makes sense to speak of interactions when (1) the IVs are orthogonal, and (2) the IVs are manipulated, so that one cannot influence the other.

In other words, Pedhazur only wants to talk about interactions in the context of highly controlled research, essentially when data are collected in an ANOVA design. He acknowledges that we can have interactions in nonexperimental research, but he wants to call them something else, like multiplicative effects.

Nobody else seems to take this view.


The effect is modeled identically both mathematically and statistically in experimental and nonexperimental research. True, they often mean something different, but that is true of experimental and nonexperimental designs generally. If we follow his reasoning for independent variables that do not interact, we might as well adopt the term 'main effect' for experimental designs and 'additive effect' for nonexperimental designs.

I don't understand his point about not having interactions when the IVs are correlated. Clearly we lose power to detect interactions when the IVs are correlated, but in my view, if we find them, they are interpreted just the same as when the IVs are orthogonal. But I may have missed something important here.

### Conducting Significance Tests for Interactions

The product term is created by multiplying together the two vectors that contain the two IVs.

The product terms tend to be highly correlated with the original IVs. Most people recommend that we subtract the mean of the IV from the IV before we form the cross-product.

This will reduce the size of the correlation between the IV and the cross-product term, but leave the test for the increase in R-square intact. It will, however, affect the b weights. When you find a significant interaction, you must include the original variables and the interaction as a block, regardless of whether some of the IV terms are nonsignificant (unless all three are uncorrelated, an unlikely event).
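A short numpy sketch of the centering point, on simulated data (names and values are illustrative): mean-centering the IVs before forming the product sharply reduces the correlation between each IV and the product term, changes the b weights, but leaves the full model's R-square unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
# Nonzero means make the raw product highly correlated with the IVs.
x1 = rng.normal(loc=5.0, size=n)
x2 = rng.normal(loc=3.0, size=n)
y = 2.0 + 0.4 * x1 + 0.7 * x2 + 0.3 * x1 * x2 + rng.normal(size=n)

def r_squared(preds, y):
    """R-square from an OLS fit with intercept."""
    X = np.column_stack([np.ones(len(y))] + list(preds))
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return 1 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()

raw_prod = x1 * x2
c1, c2 = x1 - x1.mean(), x2 - x2.mean()
centered_prod = c1 * c2

r_raw = abs(np.corrcoef(x1, raw_prod)[0, 1])   # large
r_cen = abs(np.corrcoef(c1, centered_prod)[0, 1])  # near zero
print(r_raw, r_cen)

# Same R-square either way: centering is just a reparameterization.
r2_raw = r_squared([x1, x2, raw_prod], y)
r2_cen = r_squared([c1, c2, centered_prod], y)
print(r2_raw, r2_cen)
```

This is why centering is recommended: it eases collinearity and interpretation without altering the test for the increase in R-square.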

The test proceeds in steps:

1. Compute the product term from the two IVs.
2. Regress Y onto X1 and X2.
3. Regress Y onto X1, X2, and the product term.
4. Test whether the difference in R-square from steps 2 and 3 is significant.

Alternatively, skip step 2 and check whether the b weight for the product term is significant in step 3, that is, in a simultaneous regression with Type III sums of squares. If the b weight for the product term is significant, you have an interaction. Now you need to graph your regression equation to see how to interpret it. You may have to split your data to understand the interaction.

If the b weight for the product term is not significant, you do not have an interaction (bearing in mind the sorts of errors we make in statistical work). Drop the product term, go back to step 2, and interpret your b weights for the independent variables as you ordinarily would.

### Curvilinear regression - Handbook of Biological Statistics

### Moderators and Mediators

Some people talk about moderators and moderated regression. The moderator variable is one whose values influence the importance of another variable. An example would be that cognitive ability moderates the relation between creativity and productivity.

### How the test works

In polynomial regression, you add different powers of the X variable (X, X², X³…) to an equation to see whether they increase the R² significantly. The R² will always increase when you add a higher-order term, but the question is whether the increase in R² is significantly greater than expected due to chance.

You can keep doing this until adding another term does not increase R² significantly, although in most cases it is hard to imagine a biological meaning for exponents greater than 3. Even though the usual procedure is to test the linear regression first, then the quadratic, then the cubic, you don't need to stop if one of these is not significant. For example, if the graph looks U-shaped, the linear regression may not be significant, but the quadratic could be.
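The procedure above can be sketched in numpy/scipy on simulated U-shaped data (the data are synthetic, purely for illustration): fit polynomials of increasing degree and test whether each added term significantly increases R².

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 60
x = np.linspace(0, 10, n)
# U-shaped data: the linear term alone explains little, the quadratic a lot.
y = (x - 5) ** 2 + rng.normal(scale=2.0, size=n)

def poly_r2(x, y, degree):
    """R-square of a polynomial fit of the given degree (with intercept)."""
    X = np.column_stack([x ** k for k in range(degree + 1)])  # 1, x, x^2, ...
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return 1 - ((y - X @ b) ** 2).sum() / ((y - y.mean()) ** 2).sum()

prev_r2 = poly_r2(x, y, 1)
for degree in (2, 3, 4):
    r2 = poly_r2(x, y, degree)
    df_den = n - degree - 1
    F = (r2 - prev_r2) / ((1 - r2) / df_den)   # one added term
    p = stats.f.sf(F, 1, df_den)
    print(f"degree {degree}: R2={r2:.3f}, p for increase={p:.3g}")
    prev_r2 = r2
```

Here the linear fit is near-useless while the quadratic captures the curve, exactly the U-shaped situation described in the text; the cubic and quartic terms typically add little.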

### Examples

Fernandez-Juricic et al. counted breeding sparrows per hectare in 18 parks in Madrid, Spain, and also counted the number of people per minute walking through each park (both measurement variables).

[Figure: graph of sparrow abundance vs. pedestrian traffic.]

This seems biologically plausible; the data suggest that there is some intermediate level of human traffic that is best for house sparrows.

Perhaps areas with too many humans scare the sparrows away, while areas with too few humans favor other birds that outcompete the sparrows for nest sites or something. Even though the cubic equation fits significantly better than the quadratic, it's more difficult to imagine a plausible biological explanation for this. I'd want to see more samples from areas with more than 35 people per hectare per minute before I accepted that the sparrow abundance really starts to increase again above that level of pedestrian traffic.

A second example uses data on carapace length and clutch size in gopher tortoises (Ashton et al.); the data are shown below in the SAS example. Adding the cubic and quartic terms does not significantly increase the R². The first part of the graph is not surprising; it's easy to imagine why bigger tortoises would have more eggs. The decline in egg number at the largest carapace lengths is the interesting result; it suggests that egg production declines in these tortoises as they get old and big.

[Figure: X-ray of a tortoise, showing eggs. Graph of clutch size (number of eggs) vs. carapace length.]

### Graphing the results

As shown above, you graph a curvilinear regression the same way you would a linear regression: a scattergraph with the independent variable on the X axis and the dependent variable on the Y axis.

In general, you shouldn't show the regression line for values outside the range of observed X values, as extrapolation with polynomial regression is even more likely than linear regression to yield ridiculous results. For example, extrapolating the quadratic equation relating tortoise carapace length and number of eggs predicts that tortoises with carapace lengths outside the observed range would have negative numbers of eggs.

### Similar tests

Before performing a curvilinear regression, you should try different transformations when faced with an obviously curved relationship between an X and a Y variable. A linear equation relating transformed variables is simpler and more elegant than a curvilinear equation relating untransformed variables. You should also remind yourself of your reason for doing a regression.

If your purpose is prediction of unknown values of Y corresponding to known values of X, then you need an equation that fits the data points well, and a polynomial regression may be appropriate if transformations do not work.

However, if your purpose is testing the null hypothesis that there is no relationship between X and Y, and a linear regression gives a significant result, you may want to stick with the linear regression even if curvilinear regression gives a significantly better fit. Using a less-familiar technique that yields a more-complicated equation may cause your readers to be a bit suspicious of your results; they may feel you went fishing around for a statistical test that supported your hypothesis, especially if there's no obvious biological reason for an equation with terms containing exponents.

Spearman rank correlation is a nonparametric test of the association between two variables. It will work well if there is a steady increase or decrease in Y as X increases, but not if Y goes up and then goes down. Polynomial regression is a form of multiple regression.
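The contrast drawn above can be illustrated with `scipy.stats.spearmanr` on two synthetic data sets (both made up for illustration): a steady monotonic increase, which Spearman detects perfectly, and a rise-then-fall hump, which it misses.

```python
import numpy as np
from scipy import stats

x = np.linspace(0, 10, 50)
y_mono = np.log1p(x)       # steady increase in Y as X increases
y_hump = -(x - 5) ** 2     # Y goes up and then comes back down

rho_mono, p_mono = stats.spearmanr(x, y_mono)
rho_hump, p_hump = stats.spearmanr(x, y_hump)
print(rho_mono, rho_hump)  # near 1 for the monotone data, near 0 for the hump
```

A polynomial regression, by contrast, would detect the hump through its significant quadratic term.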


In multiple regression, there is one dependent Y variable and multiple independent X variables (X1, X2, X3…). In polynomial regression, the independent "variables" are just X, X², X³, etc.

### How to do the test

**Spreadsheet.** I have prepared a spreadsheet that will help you perform a polynomial regression.

It tests equations up to quartic, and it will handle up to a fixed maximum number of observations.

**Web pages.** There is a very powerful web page that will fit just about any equation you can think of to your data (not just polynomial).

**R.** Salvatore Mangiafico's R Companion has sample R programs for polynomial regression and for other forms of regression that I don't discuss here (B-spline regression and other forms of nonlinear regression).

**SAS.** To do polynomial regression in SAS, you create a data set containing the square of the independent variable, the cube, etc.

It's possible to do this as a multiple regression, but I think it's less confusing to use multiple model statements, adding one term to each model. There doesn't seem to be an easy way to test the significance of the increase in R² in SAS, so you'll have to do that by hand.

If R²ᵢ is the R² for the i-th order equation, R²ⱼ is the R² for the next higher order, and d.f.ⱼ = n − j − 1 is the residual degrees of freedom for the higher-order equation, the test statistic is

F = [(R²ⱼ − R²ᵢ) / (j − i)] / [(1 − R²ⱼ) / d.f.ⱼ]

It has j − i degrees of freedom in the numerator and d.f.ⱼ degrees of freedom in the denominator.

Here's an example, using the data on tortoise carapace length and clutch size from Ashton et al. So the quadratic equation fits the data significantly better than the linear equation.
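The by-hand F test described above is a few lines of arithmetic. The R² values below are hypothetical, chosen only to illustrate the computation (they are not the actual SAS output for the tortoise data):

```python
from scipy import stats

# Hypothetical inputs, for illustration only:
n = 18           # illustrative sample size
r2_linear = 0.25 # R-square for the 1st-order (linear) model
r2_quad = 0.55   # R-square for the 2nd-order (quadratic) model

i, j = 1, 2
df_num = j - i        # one added term
df_den = n - j - 1    # residual df for the higher-order model

F = ((r2_quad - r2_linear) / df_num) / ((1 - r2_quad) / df_den)
p = stats.f.sf(F, df_num, df_den)
print(F, p)  # F = 10.0; a small p means the quadratic fits significantly better
```

A small p-value here means the quadratic equation fits the data significantly better than the linear equation.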

Once you've figured out which equation is best (the quadratic, for our example, since the cubic and quartic equations do not significantly increase the R²), look for the parameters in the output.

### References

X-ray of a tortoise from The Tortoise Shop.

Ashton et al. Geographic variation in body and clutch size of gopher tortoises.

Fernandez-Juricic et al. Testing the risk-disturbance hypothesis in a fragmented landscape: