An Open Science Project on Statistics: Doing the power analysis, the equivalence test (TOST), the NHST t-test and computing the Bayes Factor to compare the IMDB ratings of the most recent movies by the legendary film directors Satyajit Ray and Akira Kurosawa (in R)

The following appeared as a project assignment (using the Open Science Framework) in the Coursera course Improving your Statistical Inferences (by Eindhoven University of Technology). The project is available here.

First, we need to pre-register the study to control the (type-I) error rate and reduce publication bias, as required by the OSF and shown below:

f1.png

Theoretical hypothesis

The theoretical hypothesis we are going to test is the following: both Satyajit Ray (from Kolkata, India) and Akira Kurosawa (from Japan) are great directors, and both of them won the Academy Award for Lifetime Achievement. Because they are both great, the movies they directed are equally good.

Dependent Variables to be measured

  • The dependent variables to be measured are the IMDB ratings (scores) and the number of users who rated each movie.
  • First, an IMDB search will be run separately for each of the two directors to get all the hits.
  • Then the search results will be sorted by release date, and the 29 most recent full-length movies (excluding documentaries / TV series) will be used.
  • So in this case, we shall use the last 29 movies that Satyajit Ray and Akira Kurosawa each directed (excluding documentaries / TV series), counting back from the moment we did the IMDB search.
  • The following table shows the data collected for Satyajit Ray movies.
Movie | Rating (out of 10) | # Users Rated | Release Year
The Stranger (Agantuk) | 8.1 | 1760 | 1991
Shakha Prosakha | 7.6 | 453 | 1990
Ganashatru | 7.3 | 662 | 1989
Ghare Baire | 7.7 | 812 | 1984
Hirak Rajar Deshe | 8.8 | 1387 | 1980
Jai Baba Felunath | 7.9 | 1086 | 1979
Shatranj Ke Khilari | 7.8 | 2370 | 1977
Jana Aranya | 8.3 | 887 | 1976
Sonar Kella | 8.5 | 1308 | 1974
Distant Thunder (Ashani Sanket) | 8.2 | 908 | 1973
Company Limited (Seemabaddha) | 8.0 | 782 | 1971
Pratidwandi | 8.2 | 1051 | 1970
Days and Nights in the Forest (Aranyer Din Ratri) | 8.3 | 1720 | 1970
Goopy Gayen Bagha Bayen | 8.8 | 1495 | 1969
Chiriyakhana | 7.2 | 477 | 1967
Nayak | 8.3 | 1974 | 1966
Mahapurush | 7.3 | 719 | 1965
The Coward (Kapurush) | 7.8 | 858 | 1965
Charulata | 8.3 | 3597 | 1964
Mahanagar | 8.3 | 2275 | 1963
Abhijaan | 8.0 | 781 | 1962
Kanchenjungha | 8.0 | 706 | 1962
Teen Kanya | 8.2 | 991 | 1961
Devi | 8.0 | 1407 | 1960
The World of Apu (Apur Sansar) | 8.2 | 8058 | 1959
The Music Room (Jalshaghar) | 8.1 | 3872 | 1958
Paras-Pathar | 7.8 | 723 | 1958
Aparajito | 8.2 | 7880 | 1956
Pather Panchali | 8.4 | 15799 | 1955
  • The following table shows the data collected for Akira Kurosawa movies.
Movie | Rating (out of 10) | # Users Rated | Release Year
Maadadayo | 7.4 | 4035 | 1993
Rhapsody in August | 7.3 | 5131 | 1991
Dreams | 7.8 | 19373 | 1990
Ran | 8.2 | 84277 | 1985
Kagemusha | 8.0 | 25284 | 1980
Dersu Uzala | 8.3 | 18898 | 1975
Dodes’ka-den | 7.5 | 4839 | 1970
Red Beard | 8.3 | 12295 | 1965
High and Low | 8.4 | 19989 | 1963
Sanjuro | 8.2 | 22296 | 1962
Yojimbo | 8.3 | 80906 | 1961
The Bad Sleep Well | 8.1 | 8082 | 1960
The Hidden Fortress | 8.1 | 25980 | 1958
The Lower Depths | 7.5 | 3776 | 1957
Throne of Blood | 8.1 | 34723 | 1957
I Live in Fear | 7.4 | 3090 | 1955
Seven Samurai | 8.7 | 247406 | 1954
Ikiru | 8.3 | 46692 | 1952
The Idiot | 7.4 | 3533 | 1951
Rashomon | 8.3 | 112668 | 1950
Scandal | 7.4 | 2580 | 1950
Stray Dog | 7.9 | 11789 | 1949
The Quiet Duel | 7.5 | 2131 | 1949
Drunken Angel | 7.8 | 7422 | 1948
One Wonderful Sunday | 7.3 | 1988 | 1947
Waga seishun ni kuinashi | 7.2 | 2158 | 1946
Asu o tsukuru hitobito | 6.6 | 119 | 1946
The Men Who Tread on the Tiger’s Tail | 6.8 | 2567 | 1945
Zoku Sugata Sanshirô | 6.2 | 1419 | 1945
Zoku Sugata Sanshirô | 5.8 | 1229 | 1944

Justify the sample size

  • We want to predict no difference, and thus we shall do a power analysis for an equivalence test. We want to be pretty sure that we can reject our smallest effect size of interest, so we shall design a study with 84% power. For this educational assignment, we do not collect a huge amount of data.
  • As long as we can exclude a large effect (Cohen’s d = 0.8 or larger), we shall be happy for this assignment.
  • The power analysis estimates that the sample size we need to show that the difference between the ratings for movies directed by Satyajit Ray and Akira Kurosawa is smaller than Cohen’s d = 0.8 (assuming the true effect size is 0, with an α of 0.05, and aiming for 84% power) is 29 movie ratings from Satyajit Ray and 29 movie ratings from Akira Kurosawa, as can be seen from the following R code and the figures.
  • The α-level we found acceptable is 0.05.
  • We performed a two-sided test.
  • We used 84% power for this study.
  • The effect size expected is 0.78948 ≈ 0.8, as shown below.
  • Given that Satyajit Ray directed a total of 29 full movies, we can only collect 29 observations for him; we therefore collected the same number of observations (29 movies) for each of the directors.
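The design values above can be checked with a few lines of base R. This is only a minimal sketch, not the course code: it assumes the usual normal approximation for the power of a two-sample TOST when the true effect is 0, and it uses `power.t.test` from the `stats` package to find the effect size that a two-sided t-test can detect with 29 movies per director. The variable names are illustrative.

```r
# Design values stated in the text
d_bound <- 0.8    # smallest effect size of interest (Cohen's d)
alpha   <- 0.05   # alpha level
n       <- 29     # movies per director

# Power of a two-sample TOST when the true effect is 0
# (assumption: normal approximation, not necessarily what the course code uses)
tost_power <- 2 * pnorm(sqrt(n / 2) * d_bound - qnorm(1 - alpha)) - 1

# Effect size a two-sided two-sample t-test detects with 84% power at n = 29.
# sd defaults to 1, so the solved delta is expressed in Cohen's d units.
nhst <- power.t.test(n = n, power = 0.84, sig.level = alpha,
                     type = "two.sample", alternative = "two.sided")
d_detectable <- nhst$delta

cat("Approximate TOST power at n = 29:", round(tost_power, 3), "\n")
cat("Detectable effect size (Cohen's d):", round(d_detectable, 3), "\n")
```

Under these assumptions the TOST power comes out close to 84%, and the detectable effect size is close to the 0.78948 quoted above.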

The following theory is going to be used for the statistical tests:

p11.png

Results

p1
p2.png

  • As can be seen from above, the sample size required to obtain 84% power is 29.

Specify the statistical test to conduct

  • We need to translate our theoretical hypothesis to a statistical hypothesis.
  • Let’s calculate the (90%) CI around the effect size.
  • When the 90% CI falls entirely below a Cohen’s d of 0.8, and therefore excludes it, we can consider the ratings of the movies directed by Satyajit Ray and Akira Kurosawa as equivalent.
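The two one-sided tests (TOST) logic behind this decision rule can be sketched in base R. This is a hedged illustration, not the course's implementation: the function name `tost_two_sample` is hypothetical, it assumes equal variances (pooled standard deviation), and the two small rating vectors are made-up toy values rather than the IMDB data.

```r
# Hypothetical helper: two-sample TOST with equivalence bounds given in Cohen's d
tost_two_sample <- function(x, y, d_bound = 0.8, alpha = 0.05) {
  n1 <- length(x)
  n2 <- length(y)
  # Pooled standard deviation (equal-variance assumption)
  sd_pooled <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
  bound_raw <- d_bound * sd_pooled   # bounds converted to rating units

  # Test 1: is the difference larger than the lower bound?
  lower <- t.test(x, y, mu = -bound_raw, alternative = "greater", var.equal = TRUE)
  # Test 2: is the difference smaller than the upper bound?
  upper <- t.test(x, y, mu = bound_raw, alternative = "less", var.equal = TRUE)

  # Equivalence is declared only if both one-sided tests are significant
  p_tost <- max(lower$p.value, upper$p.value)
  list(d = (mean(x) - mean(y)) / sd_pooled,
       p_lower = lower$p.value,
       p_upper = upper$p.value,
       p_tost = p_tost,
       equivalent = p_tost < alpha)
}

# Toy ratings (made up for illustration only)
toy_a <- c(8.1, 7.9, 8.0, 8.2, 7.8, 8.0)
toy_b <- c(8.0, 8.1, 7.9, 8.1, 7.9, 8.0)
res_same  <- tost_two_sample(toy_a, toy_b)       # identical means
res_shift <- tost_two_sample(toy_a, toy_a - 1)   # one group shifted by a full point
print(res_same$p_tost)
print(res_shift$p_tost)
```

With only six toy ratings per group, even identical means do not reach equivalence, which illustrates why the sample size has to be justified before collecting the data.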

p3.png

p4.png

p5.png

  • As can be seen from the NHST test above, the effect is statistically significant, since the 90% confidence interval around the effect size does not contain 0.
  • Also, the TOST procedure results shown above indicate that the observed effect size d = 0.69 was not significantly within the equivalence bounds of d = -0.8 and d = 0.8, t(29) = 2.86, p = 0.997.
  • Also, the 90% CI (0.24, 1.14) around the effect size includes a Cohen’s d of 0.8; hence, we cannot consider the ratings of the movies directed by Satyajit Ray and Akira Kurosawa as equivalent.
  • Hence, the effect is statistically significant, but not statistically equivalent.
  • Supporting the alternative with Bayes Factors: As can be seen from the following results, the Bayes Factor of 50.17844 increases our belief in the alternative hypothesis (H1) over the null hypothesis (H0), starting with a small prior belief of 0.2 on the effect size.
  • The following code is taken from the course itself and modified as required. It was originally written by, and is © Professor Daniel Lakens, 2016, and is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (https://creativecommons.org/licenses/by-nc-sa/4.0/).

p6.png p7 p8 p9 p10
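Separately from the course code referenced above, two of the reported numbers can be sanity-checked with a short base R sketch. It rests on assumptions that are not stated in the post: the 90% CI uses a common large-sample approximation for the standard error of Cohen's d (the original analysis may have used a different method), and the prior odds of 1:1 between H1 and H0 are chosen purely for illustration.

```r
# Values reported in the text
d    <- 0.69       # observed Cohen's d
n1   <- 29         # movies per director, as stated in the design
n2   <- 29
bf10 <- 50.17844   # reported Bayes Factor for H1 over H0

# Approximate standard error of Cohen's d (large-sample formula, an assumption)
se_d <- sqrt((n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2)))

# 90% confidence interval around d
ci90 <- d + c(-1, 1) * qnorm(0.95) * se_d

# The interval reaches past the upper equivalence bound of 0.8
includes_bound <- ci90[1] < 0.8 && ci90[2] > 0.8

# Posterior odds = Bayes Factor * prior odds
prior_odds        <- 1   # assumption: H1 and H0 equally plausible beforehand
posterior_odds    <- bf10 * prior_odds
posterior_prob_h1 <- posterior_odds / (1 + posterior_odds)

cat("Approximate 90% CI for d:", round(ci90, 2), "\n")
cat("CI includes d = 0.8:", includes_bound, "\n")
cat("Posterior probability of H1:", round(posterior_prob_h1, 3), "\n")
```

Under these assumptions the interval lands close to the reported (0.24, 1.14) and contains 0.8, in line with the conclusion that the ratings are not statistically equivalent.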