*(Sandipan Dey, July 25, 2017)*

The following appeared as a project assignment (using the **Open Science Framework**) in the Coursera course **Improving your Statistical Inferences** (by Eindhoven University of Technology). The project is available here.

First we need to do **pre-registration** to control the (type-I) error rates and reduce publication bias, as required by the OSF and shown below:

## Theoretical hypothesis

The theoretical hypothesis we are going to test is the following: both **Satyajit Ray** (from Kolkata, India) and **Akira Kurosawa** (from Japan) are great directors, and both of them won the *Academy Award* for *Lifetime Achievement*. Because they are both great, the *movies they directed* are *equally good*.

## Dependent Variables to be measured

- The dependent variables to be measured are the **IMDB ratings** (scores) and the **number of users who rated each movie**.
- First, an IMDB search will be run separately for each of the two legendary directors to get all the hits.
- Then the search results will be sorted by release date, and the 29 most recent full movies (excluding documentaries / TV series) will be used.
- So in this case, we shall use the last 29 movies that **Satyajit Ray** and **Akira Kurosawa** directed (excluding documentaries / TV series), counted back from the moment we did the IMDB search.
- The following table shows the data collected for **Satyajit Ray** movies.

Movie | Rating (Out of 10) | #Users Rated | Release Year |
---|---|---|---|
The Stranger (Agantuk) | 8.1 | 1760 | 1991 |
Shakha Prosakha | 7.6 | 453 | 1990 |
Ganashatru | 7.3 | 662 | 1989 |
Ghare Baire | 7.7 | 812 | 1984 |
Hirak Rajar Deshe | 8.8 | 1387 | 1980 |
Jai Baba Felunath | 7.9 | 1086 | 1979 |
Shatranj Ke Khilari | 7.8 | 2370 | 1977 |
Jana Aranya | 8.3 | 887 | 1976 |
Sonar Kella | 8.5 | 1308 | 1974 |
Distant Thunder (Ashani Sanket) | 8.2 | 908 | 1973 |
Company Limited (Seemabaddha) | 8.0 | 782 | 1971 |
Pratidwandi | 8.2 | 1051 | 1970 |
Days and Nights in the Forest (Aranyer Din Ratri) | 8.3 | 1720 | 1970 |
Goopy Gayen Bagha Bayen | 8.8 | 1495 | 1969 |
Chiriyakhana | 7.2 | 477 | 1967 |
Nayak | 8.3 | 1974 | 1966 |
Mahapurush | 7.3 | 719 | 1965 |
The Coward (Kapurush) | 7.8 | 858 | 1965 |
Charulata | 8.3 | 3597 | 1964 |
Mahanagar | 8.3 | 2275 | 1963 |
Abhijaan | 8.0 | 781 | 1962 |
Kanchenjungha | 8.0 | 706 | 1962 |
Teen Kanya | 8.2 | 991 | 1961 |
Devi | 8.0 | 1407 | 1960 |
The World of Apu (Apur Sansar) | 8.2 | 8058 | 1959 |
The Music Room (Jalshaghar) | 8.1 | 3872 | 1958 |
Paras-Pathar | 7.8 | 723 | 1958 |
Aparajito | 8.2 | 7880 | 1956 |
Pather Panchali | 8.4 | 15799 | 1955 |

- The following table shows the data collected for **Akira Kurosawa** movies.

Movie | Rating (Out of 10) | #Users Rated | Release Year |
---|---|---|---|
Maadadayo | 7.4 | 4035 | 1993 |
Rhapsody in August | 7.3 | 5131 | 1991 |
Dreams | 7.8 | 19373 | 1990 |
Ran | 8.2 | 84277 | 1985 |
Kagemusha | 8.0 | 25284 | 1980 |
Dersu Uzala | 8.3 | 18898 | 1975 |
Dodes’ka-den | 7.5 | 4839 | 1970 |
Red Beard | 8.3 | 12295 | 1965 |
High and Low | 8.4 | 19989 | 1963 |
Sanjuro | 8.2 | 22296 | 1962 |
Yojimbo | 8.3 | 80906 | 1961 |
The Bad Sleep Well | 8.1 | 8082 | 1960 |
The Hidden Fortress | 8.1 | 25980 | 1958 |
The Lower Depths | 7.5 | 3776 | 1957 |
Throne of Blood | 8.1 | 34723 | 1957 |
I Live in Fear | 7.4 | 3090 | 1955 |
Seven Samurai | 8.7 | 247406 | 1954 |
Ikiru | 8.3 | 46692 | 1952 |
The Idiot | 7.4 | 3533 | 1951 |
Rashomon | 8.3 | 112668 | 1950 |
Scandal | 7.4 | 2580 | 1950 |
Stray Dog | 7.9 | 11789 | 1949 |
The Quiet Duel | 7.5 | 2131 | 1949 |
Drunken Angel | 7.8 | 7422 | 1948 |
One Wonderful Sunday | 7.3 | 1988 | 1947 |
Waga seishun ni kuinashi | 7.2 | 2158 | 1946 |
Asu o tsukuru hitobito | 6.6 | 119 | 1946 |
The Men Who Tread on the Tiger’s Tail | 6.8 | 2567 | 1945 |
Zoku Sugata Sanshirô | 6.2 | 1419 | 1945 |
Zoku Sugata Sanshirô | 5.8 | 1229 | 1944 |
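The selection rule described in this section (drop documentaries and TV series, sort by release date, keep the 29 most recent films) can be sketched in a few lines of Python. The record keys `title`, `year` and `kind`, and the helper name `select_recent_features`, are hypothetical names chosen for illustration; they are not part of any IMDB API, and the data here were collected by hand from the search results.

```python
from typing import Dict, List


def select_recent_features(records: List[Dict], n: int = 29) -> List[Dict]:
    """Return the n most recent feature films, newest first.

    Each record is assumed to be a dict with the hypothetical keys
    'title', 'year' and 'kind' ('feature', 'documentary', 'tv', ...).
    """
    # Keep full-length movies only, as required by the pre-registration.
    features = [r for r in records if r["kind"] == "feature"]
    # Sort by release year, most recent first.
    features.sort(key=lambda r: r["year"], reverse=True)
    return features[:n]


# Tiny illustrative input; the 'kind' labels are assumed, not scraped.
sample = [
    {"title": "Pather Panchali", "year": 1955, "kind": "feature"},
    {"title": "The Stranger (Agantuk)", "year": 1991, "kind": "feature"},
    {"title": "A hypothetical documentary", "year": 1992, "kind": "documentary"},
]
picked = select_recent_features(sample, n=2)
print([r["title"] for r in picked])
```

Fixing this rule before looking at the ratings is the point of the pre-registration: the sample is determined by release date and film type, not by how well a film scored.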

## Justify the sample size

- We want to predict **no difference**, and thus we shall do a **power analysis** for an **equivalence test**. We want to be pretty sure that we can reject our **smallest effect size** of interest, so we shall design a study with 84% **power**. For this educational assignment, we do not collect a huge amount of data.
- As long as we can exclude a **large effect** (**Cohen’s d = 0.8** or larger) we shall be happy for this assignment.
- The **power analysis** estimates that the **sample size** we need to show that the difference between the ratings for movies directed by **Satyajit Ray** and **Akira Kurosawa** is smaller than **Cohen’s d = 0.8** (assuming the **true effect size** is **0**, with an α of 0.05, and aiming for 84% **power**) is **29** movie ratings from **Satyajit Ray** and **29** movie ratings from **Akira Kurosawa**, as can be seen from the following R code and the figures.
- The α-level we found acceptable is 0.05.
- We performed a **two-sided** test.
- We used 84% **power** for this study.
- The effect size expected is 0.78948 < 0.8, as shown below.
- Given that **Satyajit Ray** directed a total of 29 full movies, we can only collect 29 observations for him, so we collected an equal amount of sample data (29 movies) for each of the directors.
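As a rough cross-check of the sample size, here is a minimal Python sketch of a common normal-approximation formula for the power of a two-sample TOST with symmetric bounds, assuming a true effect size of 0. This is an assumption about the kind of calculation involved, not the original R code, and the function name `tost_n_per_group` is hypothetical.

```python
from statistics import NormalDist


def tost_n_per_group(alpha: float, power: float, d_bound: float) -> float:
    """Approximate per-group n for a two-sample TOST.

    Assumes symmetric equivalence bounds of +/- d_bound (in Cohen's d
    units), a true effect size of 0, and a normal approximation to the
    t-distribution.
    """
    z = NormalDist()  # standard normal distribution
    beta = 1.0 - power
    # Both one-sided tests must reject, so beta is split across the two tails.
    z_sum = z.inv_cdf(1.0 - alpha) + z.inv_cdf(1.0 - beta / 2.0)
    return 2.0 * z_sum ** 2 / d_bound ** 2


n = tost_n_per_group(alpha=0.05, power=0.84, d_bound=0.8)
print(round(n, 2))
```

With α = 0.05, 84% power and bounds of d = ±0.8, this approximation comes out at roughly 29 observations per group. Because it relies on the normal distribution, an exact calculation based on the t-distribution can differ slightly.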

The following **theory** is going to be used for the statistical tests:
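For reference, the standard two one-sided tests (TOST) formulation for two independent groups can be written as follows. This is a sketch of the textbook setup, with symbols chosen here for illustration: $M_1, M_2$ are the group means, $n_1, n_2$ the group sizes, $s_p$ the pooled standard deviation, and $\Delta_L < 0 < \Delta_U$ the equivalence bounds on the raw scale (a bound of $d = 0.8$ corresponds to $\Delta_U = 0.8\, s_p$).

```latex
\begin{align*}
  H_{01}&: \mu_1 - \mu_2 \le \Delta_L, &
  H_{02}&: \mu_1 - \mu_2 \ge \Delta_U, \\[4pt]
  t_L &= \frac{(M_1 - M_2) - \Delta_L}{s_p \sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}}}, &
  t_U &= \frac{(M_1 - M_2) - \Delta_U}{s_p \sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}}}, \\[4pt]
  s_p &= \sqrt{\frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}}, &
  d &= \frac{M_1 - M_2}{s_p}.
\end{align*}
```

Equivalence is claimed only when both one-sided tests reject at level $\alpha$, that is, when $t_L$ is significantly positive and $t_U$ is significantly negative. This is the same as requiring the $100(1 - 2\alpha)\%$ confidence interval (90% for $\alpha = 0.05$) to lie entirely between the two bounds.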

Results

- As can be seen from above, the sample size required to obtain **84% power** is **29**.

## Specify the statistical test to conduct

- We need to translate our **theoretical hypothesis** to a **statistical hypothesis**.
- Let’s calculate the (90%) **CI** around the **effect size**.
- When the 90% **CI** falls below, and excludes, a **Cohen’s d** of 0.8, we can consider the ratings of the movies directed by **Satyajit Ray** and **Akira Kurosawa** as equivalent.
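The decision rule above can be sketched in Python using only the standard library. The confidence interval here uses a large-sample approximation to the standard error of Cohen's d, which is an assumption for illustration; dedicated packages typically use the noncentral t-distribution instead. The helper names are hypothetical, and the example call uses the observed d = 0.69 with 29 movies per director from this analysis.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def cohens_d(x: Sequence[float], y: Sequence[float]) -> float:
    """Cohen's d for two independent samples, using the pooled SD."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * stdev(x) ** 2 + (n2 - 1) * stdev(y) ** 2) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


def d_confidence_interval(d: float, n1: int, n2: int,
                          level: float = 0.90) -> Tuple[float, float]:
    """Approximate CI for d (large-sample normal approximation)."""
    se = sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se


def is_equivalent(ci: Tuple[float, float], bound: float = 0.8) -> bool:
    """True when the whole CI lies strictly inside (-bound, +bound)."""
    lo, hi = ci
    return -bound < lo and hi < bound


lo, hi = d_confidence_interval(0.69, 29, 29)
print(round(lo, 2), round(hi, 2), is_equivalent((lo, hi)))
```

Under this approximation the interval runs from about 0.25 to about 1.13, close to the reported 90% CI of (0.24, 1.14). Since its upper end exceeds 0.8, the rule does not declare the ratings equivalent.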

- As can be seen from the **NHST** test above, the effect is **statistically significant**, since the **90% confidence interval** around the effect size does not contain **0**.
- Also, the **TOST** procedure results shown above indicate that the observed effect size **d=0.69** was **not significantly** within the **equivalence bounds** of **d=-0.8** and **d=0.8**, **t(29)=−2.86**, **p=0.997**.
- Also, the **90% CI (0.24, 1.14)** around the effect size includes a **Cohen’s d** of **0.8**; hence, we can consider the ratings of the movies directed by **Satyajit Ray** and **Akira Kurosawa** as **not equivalent**.
- Hence, the effect is **statistically significant**, but not **statistically equivalent**.
- **Supporting the alternative with Bayes Factors**: As can be seen from the following results, the **Bayes Factor** *50.17844* increases our belief in the **alternative hypothesis** (*H1*) over the **null hypothesis** (*H0*), starting with a **small prior** belief of *0.2* on the **effect size**.

The following code is taken from the course itself and modified as required. It was originally written by © Professor Daniel Lakens, 2016, and is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (https://creativecommons.org/licenses/by-nc-sa/4.0/).