The paper by Ruiz et al. (2017) questions the rating system currently used
in most mobile application stores. The paper concludes that the cumulative
user rating calculated for each application, which is displayed as the store
rating, does not accurately portray the customer experience with respect to
the changing versions of the application. This may discourage developers
from improving the quality of their applications. To examine the research
question, the authors study the rating system used by the Google Play app
store by mining the store ratings of free-to-download mobile applications
throughout 2011. Any noticeable change in the store rating of an app with
respect to the changing rating of a specific version of that app is
examined. In the case study results, two major observations are made:
First, a change in the version rating of an application does not produce a
correspondingly large change in the store rating; in other words, the store
rating is highly resilient to changes in the version rating. Second, the
store rating does not deviate much once a large number of users have rated
the app.
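The resilience observed above follows directly from cumulative averaging: once an app has accumulated many ratings, even a sharply different version rating barely moves the overall mean. A minimal sketch of this arithmetic (all numbers are synthetic, chosen only for illustration, not taken from the paper's data):

```python
# Sketch: how a cumulative (store) rating resists a new version's rating.
# The rating counts and values below are synthetic illustrations.

def store_rating(ratings):
    """Cumulative mean over all ratings of all versions."""
    return sum(ratings) / len(ratings)

# Suppose 10,000 users rated earlier versions at 4 stars on average.
old_ratings = [4.0] * 10_000

# A poor new version then receives 200 ratings averaging 2 stars.
new_version_ratings = [2.0] * 200

before = store_rating(old_ratings)
after = store_rating(old_ratings + new_version_ratings)

print(f"store rating before:  {before:.3f}")   # 4.000
print(f"store rating after:   {after:.3f}")    # ~3.961
print(f"new version's rating: {store_rating(new_version_ratings):.3f}")  # 2.000
```

Despite the new version rating two full stars lower, the store rating drops by only about 0.04 stars, which is the flaw the authors highlight.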
Hence, the paper concludes that the star-rating system for apps is flawed
and may discourage developers from improving their applications. However,
the article fails to describe the methodology used to mine the Google Play
app store. Although reference is made to publicly available APIs, some brief
information about the crawling tools and techniques should be provided so
that the findings can be validated.
First, the version rating may not be entirely accurate, as it is not clear
at what point users rate an app. Second, although the authors have tried to
remove anomalies in the ratings by filtering the apps, a more accurate
representation could have been achieved had they also filtered the user
ratings, since app store ratings often suffer from self-selection bias
(Henry, 2014).
Although it falls outside the scope of the current study, there could be
other factors that indirectly alter the impact of an application's store
rating on its developer. For instance, developers may ignore the store
rating, or may judge an application by other parameters, such as a change in
the number of downloads, once the store rating has stabilized after a
sufficiently large number of users have rated it.
The case study might yield more accurate results if paid applications were
considered, as users of paid apps may tend to give a fairer response.
Reviewing an app requires a six-step process that involves leaving the
currently open app and signing into the store (Walz, 2016), so those who
have actually paid for an application may be more willing to follow the
process and give an unbiased review.
The authors use a hexagonal binning plot to examine the research question.
Hexagonal binning provides a more coherent representation of the data by
encoding the frequency of the points at each location with a color,
replacing scatter plots, which often become too dense to interpret when the
dataset is large.
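To illustrate the technique (this is a sketch with synthetic data, not a reproduction of the paper's figure), such a plot can be produced with matplotlib's `hexbin`:

```python
# Illustrative hexagonal-binning plot over synthetic rating data.
# The data and axis names are hypothetical; only the technique matches
# the kind of plot described in the review.
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 50_000
version_rating = rng.uniform(1, 5, n)
# Synthetic store ratings: loosely related to the version rating,
# but with much less spread (mimicking the "resilience" finding).
store_rating = np.clip(0.3 * version_rating + 2.8
                       + rng.normal(0, 0.2, n), 1, 5)

fig, ax = plt.subplots()
hb = ax.hexbin(version_rating, store_rating, gridsize=40, cmap="viridis")
fig.colorbar(hb, ax=ax, label="number of apps per hexagon")
ax.set_xlabel("version rating")
ax.set_ylabel("store rating")
fig.savefig("hexbin.png")
```

Each hexagon's color reflects how many points fall inside it, so dense regions remain readable where an ordinary scatter plot would saturate into a single blob.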
The mined sample consists of 242,089 app versions of 131,649 mobile apps,
which is sufficiently large to give accurate statistics.
Apps with anomalous ratings, such as apps with fewer than 10 raters and only
one version, are filtered out.
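A filtering step of this kind could look roughly like the following sketch; the column names (`app_id`, `version`, `n_raters`) and thresholds are hypothetical, not taken from the paper's dataset:

```python
# Sketch of the kind of anomaly filtering the review describes.
# Column names and example rows are hypothetical.
import pandas as pd

apps = pd.DataFrame({
    "app_id":   ["a", "a", "b", "c", "c"],
    "version":  ["1.0", "1.1", "1.0", "1.0", "2.0"],
    "n_raters": [50, 60, 5, 200, 180],
})

# Drop versions rated by fewer than 10 users...
apps = apps[apps["n_raters"] >= 10]

# ...and drop apps that are left with only a single version,
# since a version-vs-store comparison needs at least two versions.
n_versions = apps.groupby("app_id")["version"].transform("nunique")
apps = apps[n_versions > 1]

print(apps)  # apps "a" and "c" remain; "b" is filtered out
```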
The dataset is made available, which makes the study replicable in the
future.