Statistical Analysis for Factors Influencing Pouring of Coffee in a Cup
Description of analysis
In this section, we will conduct an experiment to investigate the effects of the explanatory variables on the dependent variable. Our major interest is in the factors that affect the amount of coffee in the cup. The factors studied are the amount of sugar in the cup (grams), the walking speed of the individual (meters per second), the type of cup as measured by its width, and the level of the coffee (milliliters). We shall consider a sample of 20 measurements for each variable, recorded in the tables below (Montgomery, Runger, & Hubele, 2009).
All measurements in this data collection are continuous, which allows us to test for normality and so decide whether parametric tests can be carried out. In the analysis, descriptive statistics will be used to summarize every variable investigated in the experiment. These will be displayed in tabulated form with full details of each finding (Merletti & Parker, 2004).
Other analyses, such as ANOVA, will be applied to compare means within and between the treatments and to check whether the differences are significant. This will be assessed with the F statistic and the p-value. Inferential statistics such as correlation will be used to test the strength of association between the dependent variable and the explanatory variables (Tidd, Pavitt, & Bessant, 2001).
Regression analysis will also help to determine the effect of the independent variables on the dependent variable and the effect size of the coefficient parameters. Graphical presentation will be another important aspect of the analysis, since it describes the data visually. The graphical methods applied will include the box plot, scatter plot, histogram, residual plot and, as a check of normality, the QQ plot (Zhang, Wang, & Ye, 2008).
Table 1.0 Descriptive statistics for amount of coffee in cup (milliliters)
Descriptive Statistics: Column 5, Amount of coffee

Statistic                  Value
Sample Size, n             20
Sum                        4530
Sum Sq                     1.500000e+6
Mean                       226.5
Median                     160
Midrange                   275
RMS                        273.8613
Variance, s^2              24945
St. Dev., s                157.9399
Mean Abs Dev               140.15
Range                      450
Coeff. of Var.             69.73%
Minimum                    50
1st Quartile               92.5
2nd Quartile               160
3rd Quartile               355
Maximum                    500
95% CI for the Mean        152.5818 < mean < 300.4182
95% CI for the Variance    14426.8462 < VAR < 53214.5014
95% CI for the St. Dev.    120.1118 < SD < 230.6827
Table 1.0 shows that the mean amount of coffee across cups of different sizes was 226.5 milliliters with a standard deviation of 157.9 milliliters. The 95% confidence interval for the mean has lower and upper limits of 152.5818 and 300.4182 respectively.
The normality plot is shown below by the histogram and box plot. Normality is also shown in the QQ plot, which displays the linear trend of the observed values.
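The confidence intervals in Table 1.0 can be reproduced from the sample size, mean and variance alone. The following is a minimal sketch, assuming the usual t interval for the mean and chi-square interval for the variance. The critical values for 19 degrees of freedom are hard-coded from standard tables rather than computed, and the function names are illustrative only.

```python
import math

# Summary statistics copied from Table 1.0
n = 20
mean = 226.5
variance = 24945.0

# Critical values for n - 1 = 19 degrees of freedom, taken from standard
# tables (assumption: hard-coded here instead of calling a statistics library)
T_CRIT = 2.093024        # t, upper 2.5% point
CHI2_UPPER = 32.852327   # chi-square, upper 2.5% point
CHI2_LOWER = 8.906516    # chi-square, lower 2.5% point


def mean_ci(mean, variance, n, t_crit):
    """Two-sided confidence interval for the mean."""
    margin = t_crit * math.sqrt(variance / n)
    return mean - margin, mean + margin


def variance_ci(variance, n, chi2_lower, chi2_upper):
    """Two-sided confidence interval for the variance."""
    sum_of_squares = (n - 1) * variance
    return sum_of_squares / chi2_upper, sum_of_squares / chi2_lower


mean_low, mean_high = mean_ci(mean, variance, n, T_CRIT)
var_low, var_high = variance_ci(variance, n, CHI2_LOWER, CHI2_UPPER)
# The interval for the standard deviation is the square root of the variance interval
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(f"{mean_low:.4f} < mean < {mean_high:.4f}")
print(f"{var_low:.4f} < VAR < {var_high:.4f}")
print(f"{sd_low:.4f} < SD < {sd_high:.4f}")
```

The same functions apply to the other descriptive tables once their own mean and variance are substituted.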
Table 1.1 Descriptive statistics for type of cup in width (cm)
Descriptive Statistics: Column 4, Type of cup (width)

Statistic                  Value
Sample Size, n             20
Sum                        2258
Sum Sq                     348956
Mean                       112.9
Median                     84
Midrange                   142.5
RMS                        132.0901
Variance, s^2              4948.832
St. Dev., s                70.34793
Mean Abs Dev               60.97
Range                      215
Coeff. of Var.             62.31%
Minimum                    35
1st Quartile               59.5
2nd Quartile               84
3rd Quartile               175
Maximum                    250
95% CI for the Mean        79.98 < mean < 145.82
95% CI for the Variance    2862.1 < VAR < 10557.2
95% CI for the St. Dev.    53.50 < SD < 102.75

Table 1.1 shows that the width of the cups had a mean of 112.9 cm with a standard deviation of 70.3 cm. The 95% confidence interval for the mean, computed from the sample size, mean and standard deviation in the table, has lower and upper limits of 79.98 and 145.82 respectively.
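Most of the derived entries in Table 1.1 follow from only five reported quantities: the sample size, the sum, the sum of squares, the minimum and the maximum. The sketch below recomputes them with the standard textbook formulas. The variable names are illustrative and are not taken from the software that produced the table.

```python
import math

# Raw totals copied from Table 1.1 (type of cup, width)
n = 20
total = 2258.0        # "Sum"
total_sq = 348956.0   # "Sum Sq"
minimum, maximum = 35.0, 250.0

mean = total / n
# Sample variance via the computational formula:
# (sum of squares - n * mean^2) / (n - 1)
variance = (total_sq - n * mean ** 2) / (n - 1)
std_dev = math.sqrt(variance)
# Root mean square of the raw values
rms = math.sqrt(total_sq / n)
midrange = (minimum + maximum) / 2
value_range = maximum - minimum
# Coefficient of variation, expressed as a percentage
coeff_var_pct = 100 * std_dev / mean

print(f"mean={mean}, variance={variance:.3f}, sd={std_dev:.5f}")
print(f"rms={rms:.4f}, midrange={midrange}, range={value_range}")
print(f"coefficient of variation={coeff_var_pct:.2f}%")
```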
The normality plot is shown below by the box plot, histogram and QQ plot. The QQ plot displays the linear trend of the observed values.
Figure 1.0 showing normality plot of the amount of sugar in the cup
Figure 1.1 showing normality plot of the type of cup in terms of width
Table 1.3 Descriptive statistics for walking speed (meters/sec)
Explore Data - Walking speed (meters/sec)

Statistic                  Value
Sample Size, n             20
Sum                        156
Mean                       7.8
Median                     7.5
Midrange                   8
RMS                        8.485281
Variance, s^2              11.74737
St. Dev., s                3.427443
Mean Abs Dev               2.8
Range                      12
Coeff. of Var.             43.94%
Minimum                    2
1st Quartile               6
2nd Quartile               7.5
3rd Quartile               10
Maximum                    14
95% CI for the Mean        6.1959 < mean < 9.4041
95% CI for the St. Dev.    2.6065 < SD < 5.006
95% CI for the Variance    6.794 < VAR < 25.0603
Table 1.3 shows that the speed at which the individual moves while carrying the cup of coffee had a mean of 7.8 meters per second with a standard deviation of 3.4 meters per second. The 95% confidence interval for the mean has lower and upper limits of 6.1959 and 9.4041 respectively.
The normality plot is shown below by the box plot, histogram and QQ plot. The QQ plot displays the linear trend of the observed values.
Table 1.4 Descriptive statistics for amount of sugar in cup
Explore Data - Amount of sugar

Statistic                  Value
Sample Size, n             20
Sum                        308
Mean                       15.4
Median                     15.5
Midrange                   15
RMS                        16.49242
Variance, s^2              36.67368
St. Dev., s                6.05588
Mean Abs Dev               5
Range                      20
Coeff. of Var.             39.32%
Minimum                    5
1st Quartile               10
2nd Quartile               15.5
3rd Quartile               20
Maximum                    25
95% CI for the Mean        12.5658 < mean < 18.2342
95% CI for the St. Dev.    4.6054 < SD < 8.8451
95% CI for the Variance    21.2101 < VAR < 78.235
Table 1.4 gives the amount of sugar that was in the cups. The mean amount of sugar was 15.4 grams with a standard deviation of about 6 grams. The 95% confidence interval for the mean has lower and upper limits of 12.5658 and 18.2342. The test of normality is shown in Figure 1.4 by the histogram and QQ plot.
Figure 1.3 showing the normality plot of walking speed of individual with cup of coffee
Figure 1.4 showing normality tests for amount of sugar in the cup
The QQ plot shows an approximately linear trend in the observed values of sugar in the cup. Points that lie close to the reference line indicate that the observed values are consistent with a normal distribution.
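To make the reading of a QQ plot concrete, the sketch below pairs sorted observations with the quantiles expected under a standard normal distribution and measures how linear the pairing is. The raw measurements are not listed in this report, so the sample is hypothetical and invented for illustration only. The plotting positions (i - 0.5)/n are one common convention among several.

```python
import math
from statistics import NormalDist, mean


def normal_qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n, an assumed convention
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def pearson_r(pairs):
    """Pearson correlation between the two coordinates of the pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sugar amounts in grams, NOT the study data
sample = [5, 8, 10, 10, 12, 14, 15, 15, 16, 18, 20, 20, 22, 25]
points = normal_qq_points(sample)
r = pearson_r(points)
print(f"QQ correlation: {r:.3f}")
```

A correlation close to 1 corresponds to points that hug the reference line. It is a rough guide and not a formal normality test.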
Table 1.5 Descriptive statistics of level of coffee in cup in milliliters
Explore Data - Level of cup of coffee

Statistic                  Value
Sample Size, n             20
Sum                        2465
Mean                       123.25
Median                     116
Midrange                   137.5
RMS                        135.8678
Variance, s^2              3441.566
St. Dev., s                58.66486
Mean Abs Dev               42.9
Range                      225
Coeff. of Var.             47.60%
Minimum                    25
1st Quartile               85
2nd Quartile               116
3rd Quartile               142.5
Maximum                    250
95% CI for the Mean        95.794 < mean < 150.706
95% CI for the St. Dev.    44.6141 < SD < 85.6843
95% CI for the Variance    1990.4165 < VAR < 7341.8003
Table 1.5 shows that the mean level of coffee across cups of different sizes was 123.25 milliliters with a standard deviation of 58.7 milliliters. The 95% confidence interval for the mean has lower and upper limits of 95.794 and 150.706 respectively. The test of normality is depicted in the figures below. The QQ plot displays the linear trend of the observed values.
Figure 1.5 showing normality tests between the observed values.
Table 1.6 showing ANOVA table

Source       DF    SS             MS             Test Stat, F    Critical F    P-Value
Treatment    3     228387.3375    76129.1125     36.0852         2.724947      0.000
Error        76    160337.55      2109.704605
Total        79    388724.8875
Table 1.6 confirms that the difference in means between the dependent variable and the independent variables is significant, with a p-value of <0.0001. This follows from the F test in the table: the F statistic of 36.0852 exceeds the critical value of 2.724947. Analysis of variance compares the means across the different treatments. This tells us that the variables used to explain the pouring of coffee are strong explanatory variables and are closely related.
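The entries of Table 1.6 are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and the F statistic is the ratio of the two mean squares. A short sketch of that check, using only the figures reported in the table:

```python
# Sums of squares and degrees of freedom copied from Table 1.6
ss_treatment, df_treatment = 228387.3375, 3
ss_error, df_error = 160337.55, 76
critical_f = 2.724947  # critical value as reported in Table 1.6

# Mean square = sum of squares / degrees of freedom
ms_treatment = ss_treatment / df_treatment
ms_error = ss_error / df_error

# F statistic = between-treatment mean square / error mean square
f_stat = ms_treatment / ms_error

# The totals are the sums of the treatment and error rows
ss_total = ss_treatment + ss_error
df_total = df_treatment + df_error

# Equal treatment means are rejected when F exceeds the critical value
reject_null = f_stat > critical_f

print(f"MS treatment = {ms_treatment:.4f}, MS error = {ms_error:.6f}")
print(f"F = {f_stat:.4f}, reject equal means: {reject_null}")
```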
Regression Modelling
Linear regression of amount of coffee on each individual factor.
Amount of coffee and Type of cup (width)
The amount of coffee is related to the type of cup (width) by the linear regression equation
Y = 0.0021X_{1} + 213.8, so the intercept is 213.8 and the predicted amount of coffee changes by 0.0021 units for every unit increase in cup width. The coefficient of determination is R^{2} = 0.000007137, so the model explains essentially 0.00% of the variation in the amount of coffee. This is a bad model: there is no linear relationship between the amount of coffee and the type of cup (width), and cup width alone cannot be used to predict the amount of coffee.
Amount of coffee and Walking speed
The amount of coffee is related to the walking speed by the linear regression equation
Y = 0.5618X_{2} + 219.06, so the intercept is 219.06 and the predicted amount of coffee changes by 0.5618 units for every unit increase in walking speed. The coefficient of determination is R^{2} = 0.3153, so the model explains 31.53% of the variation in the amount of coffee. This is a bad model for these data, and walking speed alone cannot be used to predict the amount of coffee.
Amount of coffee and Amount of sugar
The amount of coffee is related to the amount of sugar by the linear regression equation
Y = 0.6236X_{3} + 222.23, so the intercept is 222.23 and the predicted amount of coffee changes by 0.6236 units for every unit increase in the amount of sugar. The coefficient of determination is R^{2} = 0.0094, so the model explains only 0.94% of the variation in the amount of coffee. This is a bad model for these data, and the amount of sugar alone cannot be used to predict the amount of coffee.
Amount of coffee and Level of coffee
The amount of coffee is related to the level of coffee by the linear regression equation
Y = 0.961X_{4} + 97.494, so the intercept is 97.494 and the predicted amount of coffee changes by 0.961 units for every unit increase in the level of coffee. The coefficient of determination is R^{2} = 0.0453, so the linear regression model explains only 4.53% of the variation in the amount of coffee. This is a bad model for these data, and the level of coffee alone cannot be used to predict the amount of coffee.
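Each of the four single-predictor fits above comes down to the same calculation: a least-squares slope and intercept, followed by R^{2} as the share of variation in Y that the line accounts for. The sketch below shows that calculation. The report does not list the raw measurements, so the two data lists are hypothetical values invented for illustration and will not reproduce the coefficients quoted above. The function name fit_line is likewise illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept, with R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual variation / total variation)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_resid / ss_total
    return slope, intercept, r_squared


# Hypothetical values, invented for illustration only (NOT the study data)
level_ml = [25, 60, 85, 110, 120, 140, 180, 250]
amount_ml = [90, 50, 300, 160, 420, 100, 355, 200]

slope, intercept, r_squared = fit_line(level_ml, amount_ml)
print(f"Y = {slope:.4f}X + {intercept:.3f}, R^2 = {r_squared:.4f}")
```

An R^{2} near 0 means the fitted line predicts Y hardly better than the mean of Y, which is the situation described for most of the single-factor models.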
Table 1.7 Multiple Regression Modelling

Regression Statistics
Multiple R            0.64236
R Square              0.412626
Adjusted R Square     0.244805
Standard Error        11.92799
Observations          19

                       Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept              361.0139        167.35            2.1572     0.0489     2.0784       719.949
Type of cup (width)    -0.17166        0.1959            -0.8764    0.3956     -0.5917      0.24843
Walking speed          -0.66925        0.2608            -2.5663    0.0224     -1.2286      -0.1099
Amount of sugar        -0.95972        1.68              -0.5713    0.5769     -4.563       2.64359
Level of coffee        -0.88043        1.3417            -0.6562    0.5223     -3.7581      1.99721
Number of columns used:
Dependent column: coffee in cup

Intercept, b0              390.2004
Walking speed, b1          3.158828
Amount of sugar, b2        11.33396
Level of the coffee, b3    0.3327429
Width of the cup, b4       0.0489733

Total Variation            473955
Explained Variation        112919
Standard Error             155.1421
Coeff of Det, R^2          0.2382484
Adjusted R^2               0.0351146
P Value                    0.003620063
The amount of coffee spilled (Y) is affected by all the factors present: the type of cup (width) X_{1}, the walking speed X_{2}, the amount of sugar X_{3}, and the level of the coffee X_{4}. Using the coefficients in Table 1.7, the regression equation that defines how the amount of coffee spilled (Y) is affected by each factor is written in the form:
Y = 361.0139 - 0.17166X_{1} - 0.66925X_{2} - 0.95972X_{3} - 0.88043X_{4}.
This is known as the line of best fit for the regression model, but remember that the coefficient of determination is only 0.412626, which is equivalent to 41.26%. The regression model therefore explains only 41.26% of the variation in the data, and it is not a good model.
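The t statistics and 95% limits in Table 1.7 can be checked from each coefficient and its standard error. The sketch below is illustrative: the signs of the slopes are taken from the regression equation, and the critical t value for 19 - 5 = 14 residual degrees of freedom is hard-coded from standard tables, which is an assumption of this example and not a value printed in the table.

```python
# Coefficients and standard errors from Table 1.7; slope signs follow the
# regression equation, where every slope is negative
coefficients = {
    "Intercept": (361.0139, 167.35),
    "Type of cup (width)": (-0.17166, 0.1959),
    "Walking speed": (-0.66925, 0.2608),
    "Amount of sugar": (-0.95972, 1.68),
    "Level of coffee": (-0.88043, 1.3417),
}

# Two-sided 5% critical t value for 14 degrees of freedom, from standard
# tables (assumption: hard-coded, not computed)
T_CRIT = 2.1448


def summarize(estimate, std_error, t_crit=T_CRIT):
    """Return t statistic, 95% limits, and whether the interval excludes zero."""
    t_stat = estimate / std_error
    margin = t_crit * std_error
    lower, upper = estimate - margin, estimate + margin
    excludes_zero = lower > 0 or upper < 0
    return t_stat, lower, upper, excludes_zero


results = {name: summarize(*pair) for name, pair in coefficients.items()}
for name, (t_stat, lower, upper, excludes_zero) in results.items():
    print(f"{name}: t = {t_stat:.4f}, 95% CI = ({lower:.4f}, {upper:.4f}), "
          f"excludes zero: {excludes_zero}")
```

An interval that excludes zero corresponds to a p-value below 0.05. Small differences in the last decimal place against Table 1.7 come from the rounding of the standard errors.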
In order to make predictions about the dependent variable from the independent variables, a multiple linear regression was fitted. This regression modelling is important because, for a given variable, it gives the size of its effect on the dependent variable when all other factors are kept constant. From the regression model, it is clear that the predictors explain only 3.5% of the variation in the amount of coffee, as indicated by the adjusted R^2, and the remainder is not explained by the model. The reported p-value is 0.003620063, which is very significant.
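The adjusted R^2 corrects R^2 for the number of predictors, which is why it can fall far below the unadjusted value when the sample is small. A brief sketch of the standard formula, applied to both regression outputs reported above. The sample size of 20 for the second output is an assumption based on the 20 measurements described in the introduction.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Table 1.7: R Square = 0.412626 with 19 observations and 4 predictors
adj_table_1_7 = adjusted_r_squared(0.412626, n_obs=19, n_predictors=4)

# Second output: R^2 = explained variation / total variation
r2_second = 112919 / 473955
# n_obs=20 is an assumption (20 measurements per variable)
adj_second = adjusted_r_squared(r2_second, n_obs=20, n_predictors=4)

print(f"Table 1.7 adjusted R^2: {adj_table_1_7:.6f}")
print(f"Second output R^2: {r2_second:.7f}, adjusted: {adj_second:.7f}")
```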
From the coefficient of walking speed, we can see that the amount of coffee in the cup has a negative association with speed. Statistically, a unit increase in speed leads to the pouring of coffee by about 3.1 units. This association is significant, and it agrees with the theoretical expectation that as speed increases the amount of coffee in the cup decreases.
The amount of sugar is another explanatory variable that tells us more about the pouring of coffee in the cup. It also shows a negative association: if the amount of sugar in the cup increases by one unit, the amount of coffee decreases by about 11 units, and this association is significant.
The width of the cup that contains the coffee is another strong explanatory variable, and it also shows a negative association with the amount of coffee. If the width is larger, the surface area is bigger and the pouring of coffee is lower, while if the width is smaller the pouring of coffee is higher. The coefficient for width is 0.0489733 per unit of width.
Conclusion and recommendation
From the experiment, we have been able to investigate some of the parameters that affect the pouring of coffee in the cup. From the general linear regression, we have found that the amount of sugar, the walking speed and the level of the coffee in the cup are very significant factors in explaining the pouring of coffee. I recommend further research to investigate other factors that contribute to the pouring of coffee, as our model can only explain 3.5% of the variation.
References
Montgomery, D. C., Runger, G. C., & Hubele, N. F. (2009). Engineering statistics. John Wiley & Sons.
Merletti, R., & Parker, P. A. (2004). Electromyography: physiology, engineering, and non-invasive applications (Vol. 11). John Wiley & Sons.
Tidd, J., Pavitt, K., & Bessant, J. (2001). Managing innovation (Vol. 3). Chichester: Wiley.
Murray-Smith, R., & Johansen, T. (Eds.). (1997). Multiple model approaches to nonlinear modelling and control. CRC Press.
Zhang, H., Wang, J., & Ye, S. (2008). Predictions of acidity, soluble solids and firmness of pear using electronic nose technique. Journal of Food Engineering, 86(3), 370-378.