Semester Project Report

Audrey Loeffel edited this page Jun 23, 2016 · 2 revisions

# http://reminisce.me: a game to measure the memorability of autobiographical data

## Introduction

The reminisce.me project is a game based on the user's Facebook data, built to study the memorability of autobiographical data. It is an online game inspired by Tic Tac Toe: the user has to answer three questions about their Facebook activity in order to take a cell of the board. In my project I worked on the statistics module of the game, particularly on the back end. The module is an autonomous application that the Game Application calls to compute new statistics after a play or to retrieve existing ones.

## Technology

The statistics module is a Scala application built on the Akka toolkit and runtime, which emphasizes actor-based concurrency. The RESTful part is implemented with the Spray toolkit, which makes it possible to build REST/HTTP-based integration layers on top of Scala and Akka. The data are stored in a MongoDB database, and the interface between the application and the database is provided by ReactiveMongo, a Scala driver for MongoDB. MongoDB is a NoSQL database in which data are stored in a JSON-like form called BSON (Binary JSON).

Data architecture

There are two main data entities: Game and Statistics. The first represents the summary of a play between two players. It contains the userIDs, the scores, the questions, etc. The second contains the statistics for a user.

Each entity is stored in the database in its entirety, as a single document of a collection. The MongoDB database contains two collections:

  • gameCollection: Contains all Game entities that have been received
  • cacheCollection: Contains all Statistics entities that have been computed

Game Entity

{
    _id: String,
    player1: String,
    player2: String,
    player1Board: Board,
    player2Board: Board,
    status: String,
    playerTurn: Int,
    player1Scores: Int,
    player2Scores: Int,
    boardState: List[List[Score]],
    player1AvailableMoves: List[Move],
    player2AvailableMoves: List[Move],
    wonBy: Int,
    creationTime: Int
}
  • _id is the ID of the play
  • player1 and player2 are the IDs of the two players
  • player1Board and player2Board are the players' game boards for the play
  • status indicates whether the game has ended
  • playerTurn indicates which player's turn it is
  • player1Score and player2Score are the points obtained by each player
  • player1AvailableMoves and player2AvailableMoves keep the list of moves available to each player
  • wonBy indicates which player has won
  • creationTime is the creation time of the game entity

More detailed information about each subpart is available here

Statistics Entity

{	
    userID: String, 
    frequencies: FrequencyOfPlays
}
  • userID is the ID of the user for whom the statistics were computed
  • frequencies contains the list of all statistics for each type of interval

The statistics are computed over different intervals (time units): day, week, month, year and overall.

FrequencyOfPlays

{
    day: List[StatsOnInterval] = List(),
    week: List[StatsOnInterval] = List(),
    month: List[StatsOnInterval] = List(),
    year: List[StatsOnInterval] = List(),
    allTime: Option[List[StatsOnInterval]] = None
}
  • day lists the statistics computed on a daily interval
  • week lists the statistics computed on a weekly interval
  • month lists the statistics computed on a monthly interval
  • year lists the statistics computed on a yearly interval
  • allTime optionally holds the overall statistics

StatsOnInterval

{
    ago: Int,
    amount: Int, 
    won: Int, 
    lost: Int,
    questionsBreakDown: List[QuestionsBreakDown], 
    gamesPlayedAgainst: List[GamesPlayedAgainst]
}
  • ago is the number of time units before now for which the statistics were computed
  • amount is the number of games played by the user during the interval
  • won and lost are the numbers of plays won and lost, respectively
  • questionsBreakDown lists all questions that the player has answered
  • gamesPlayedAgainst lists all opponents against whom the user has played

GamesPlayedAgainst

{
    userID: String,
    numberOfGames: Int,
    won: Int,	    
    lost: Int 
}
  • userID is the ID of the opponent
  • numberOfGames is the number of plays against this opponent
  • won and lost are the numbers of plays won and lost, respectively, against this opponent

QuestionsBreakDown

{
    questionsBreakDownKind: QuestionsBreakDownKind,
    totalAmount: Int,
    correct: Int,
    percentCorrect: Double
}
  • questionsBreakDownKind is the type of the question
  • totalAmount is the number of questions of this type answered by the user
  • correct is the number of correct answers
  • percentCorrect is the percentage of correct answers

Request Handler

The module is a RESTful application that supports two requests:

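  • GET: statistics retrieval
    • address: http://localhost:7777/stats
    • parameters
      • userId: the ID of the user
      • frequency: an interval type and the number of intervals to return, written type:count (for example day:30); the parameter can be repeated
      • allTime: whether the overall summary should be included (true or false); defaults to false if omitted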

Example

http://localhost:7777/stats?userId=1userID&frequency=day:30&frequency=week:4&frequency=month:2&frequency=year:1&allTime=false

This request asks for the statistics of 1userID for the last 30 days, 4 weeks, 2 months and 1 year, without the overall summary.

More detailed information about the API is available here

When a request is received, the parameters are parsed into a Timeline object, which contains the number of time units to return for each interval type:

{
    userID: String,
    day: Int,
    week: Int,
    month: Int,
    year: Int
}

The default values are: day = 30, week = 5, month = 12, year = 10
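
The parsing step described above could look roughly like the following sketch. It is not the module's actual code: the names TimelineSketch and parseTimeline are hypothetical, and silently ignoring malformed or unknown entries is an assumption made here for brevity. Only the type:count format and the default values come from the text.

```scala
import scala.util.Try

object TimelineSketch {
  // Mirrors the Timeline object above; the defaults are the ones listed in the text.
  final case class Timeline(userID: String, day: Int = 30, week: Int = 5, month: Int = 12, year: Int = 10)

  // `frequencies` holds the raw values of the repeated `frequency` parameter, e.g. "day:30".
  // Assumption: malformed, negative or unknown entries are ignored and the default is kept.
  def parseTimeline(userID: String, frequencies: Seq[String]): Timeline =
    frequencies.foldLeft(Timeline(userID)) { (timeline, raw) =>
      raw.split(":") match {
        case Array(unit, count) =>
          (unit, Try(count.toInt).toOption.filter(_ >= 0)) match {
            case ("day", Some(n))   => timeline.copy(day = n)
            case ("week", Some(n))  => timeline.copy(week = n)
            case ("month", Some(n)) => timeline.copy(month = n)
            case ("year", Some(n))  => timeline.copy(year = n)
            case _                  => timeline
          }
        case _ => timeline
      }
    }
}
```

For instance, TimelineSketch.parseTimeline("1userID", Seq("day:30", "week:4")) keeps the defaults for month and year and yields Timeline("1userID", 30, 4, 12, 10).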

System's Architecture - Services

Each request received by the StatServer is dispatched to the relevant service. The application has three main services: one for inserting an entity in the database, one for computing the statistics and one for retrieving data from the database. All services delegate their tasks to workers. All services and workers are Akka actors.

The figure below shows how requests are dispatched to the Insertion and Retrieving Services.

System architecture

Data Insertion

Insertion is managed by the Insertion Service, which can insert either a Game or a Statistics entity. The figure below shows the workflow of a Statistics entity insertion in blue and that of a Game entity insertion in red.

Insertion workflow

Game Insertion Workflow Example

  1. A client sends an InsertEntity message to the Insertion Service.
  2. The service forwards the message to a worker.
  3. The worker inserts the entity in the database and sends back an Inserted(ids) message, which contains the list of userIDs found in the inserted game.
  4. The service sends an InsertionDone message to the client.
  5. The service creates two Computation Services and sends a ComputeStatistics(id) message to each of them.
  6. Each Computation Service computes the statistics and sends a Done message back to the Insertion Service.
  7. The insertion is complete; the service and all its children are stopped.
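
As a rough illustration of the steps above, the messages can be modelled as plain Scala case classes, with the service's reaction written as a pure function. This is only a sketch without Akka: the message names come from the text, but their payloads (in particular the fields of InsertEntity) and the helper names onInserted and isComplete are assumptions.

```scala
object InsertionProtocolSketch {
  sealed trait Message
  // Assumed payload: the real InsertEntity presumably carries the whole Game entity.
  final case class InsertEntity(gameId: String, playerIds: List[String]) extends Message
  final case class Inserted(ids: List[String]) extends Message
  case object InsertionDone extends Message
  final case class ComputeStatistics(id: String) extends Message
  case object Done extends Message

  // Steps 4 and 5: after the worker reports Inserted(ids), the service acknowledges
  // the client and emits one ComputeStatistics message per userID.
  def onInserted(msg: Inserted): List[Message] =
    InsertionDone :: msg.ids.map(ComputeStatistics(_))

  // Step 7: the service can stop once every Computation Service has answered Done.
  def isComplete(expected: Int, received: List[Message]): Boolean =
    received.count(_ == Done) >= expected
}
```

In the actor version, each element of the list returned by onInserted would be sent to the client or to a freshly created Computation Service instead of being returned.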

Statistics Computation

Since the Statistics entity covers five different time units, the Computation Service delegates the task to five managers, each in charge of computing the statistics for one time unit. A worker is in charge of computing the statistics for one interval and sends them back to its manager. The manager collects them and, as soon as all workers have finished, sends them to the service. The service collects the lists from all managers, creates the Statistics entity and sends it to the client. It also creates an Insertion Service and sends it the Statistics entity, so that it is cached in the database.


Statistics Computation Workflow Example

  1. The client sends a ComputeStatisticsWithTimline(userID, timeline) message to the Computation Service.
  2. The service creates five managers, each with an interval type as attribute, and forwards the same message to each of them. (One manager computes daily statistics, another computes weekly statistics, and so on.)
  3. Each manager reads from the Timeline object the number of time units it has to compute and creates one worker per time unit. It then sends a ComputeStatsOnInterval(userID, from, to) message to every worker (from and to are the beginning and the end of the interval, i.e. of a day or a week).
  4. The worker needs to compute the five sub-parts of the statistics entity (amount, won, lost, ...). To do so, it creates five sub-workers and sends each of them a ComputeSubStat(userID, SubStatType, from, to) message.
  5. Each sub-worker sends back the computed statistic.
  6. The worker collects all statistics from its sub-workers and sends them back to the manager via a ResponseStatOnInterval(stat) message.
  7. The manager collects all statistics from its workers and sends them back to the Computation Service via different kinds of messages (DailyStats, weeklyStats, monthlyStats, ...).
  8. The Computation Service instantiates an Insertion Service and sends it the Statistics entity in order to insert it in the database.
  9. The Insertion Service returns a Done message.
  10. The Computation Service sends the Statistics entity (StatResponse(userID, freq, date)) to the client.
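
Step 3 requires turning a count of time units into from/to boundaries. The sketch below shows one possible way to do it with java.time. It assumes sliding intervals that end just after the current day, half-open as [from, to); whether the module instead aligns intervals on calendar weeks or months is not stated in this report, and the names IntervalSketch and intervals are hypothetical.

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

object IntervalSketch {
  // `ago` = 0 is the most recent interval; each interval is half-open: [from, to).
  final case class Interval(ago: Int, from: LocalDate, to: LocalDate)

  def intervals(today: LocalDate, unit: ChronoUnit, count: Int): List[Interval] = {
    val end = today.plusDays(1) // exclusive upper bound, so that today's games are included
    (0 until count).toList.map { ago =>
      // Both bounds are derived from `end`, which keeps consecutive intervals contiguous
      // even when month lengths differ.
      Interval(ago, end.minus((ago + 1).toLong, unit), end.minus(ago.toLong, unit))
    }
  }
}
```

A daily manager would call intervals(today, ChronoUnit.DAYS, timeline.day) and hand one Interval to each worker; the weekly, monthly and yearly managers differ only in the unit.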

Data Retrieval

Like the Insertion Service, the Retrieving Service delegates the task to a worker, which queries the database and returns the data. When a Statistics entity is retrieved and is not up to date, the service instantiates a Computation Service and requests a new computation.

The figure below shows a classic workflow in blue and the Statistics retrieving workflow in red.

Data retrieving workflow

Statistics Retrieval Workflow Example

  1. The client sends a RetrieveStats(userID, frequences, allTime) message.
  2. The Retrieving Service parses the frequencies into a Timeline object and sends a RetrieveLastStatistic(userID) message to a new worker.
  3. The worker queries the entity and sends it back via a StatisticsRetrieved(stats) message, or sends a StatisticsNotFound message if no Statistics entity could be retrieved.
  4. If the service receives outdated Statistics (older than one day), or if no Statistics entity has been found, it instantiates a Computation Service and starts the computation with a ComputeStatisticsWithTimline(userID, timeline) message.
  5. The Computation Service sends back the freshly computed Statistics in a StatResponse message.
  6. Finally, the Retrieving Service sends the Statistics entity (from the database or from the computation) to the client via a StatisticsRetrieved(stats) message.
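
The decision taken in step 4 can be sketched as a small pure function. The one-day limit comes from the text; the CachedStats type, its computedAt field and the Decision values are hypothetical names introduced for this example only.

```scala
import java.time.{Duration, Instant}

object FreshnessSketch {
  // Assumption: the cached entity records when it was computed.
  final case class CachedStats(userID: String, computedAt: Instant)

  sealed trait Decision
  case object ServeFromCache extends Decision
  case object Recompute extends Decision

  val maxAge: Duration = Duration.ofDays(1)

  // None models a StatisticsNotFound answer from the worker.
  def decide(cached: Option[CachedStats], now: Instant): Decision =
    cached match {
      case Some(stats) if Duration.between(stats.computedAt, now).compareTo(maxAge) <= 0 =>
        ServeFromCache
      case _ =>
        Recompute // missing or older than one day: start a Computation Service
    }
}
```

Passing the current time as a parameter, rather than reading the clock inside the function, keeps the rule easy to test.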

Statistics Aggregations / Queries

The queries use the ReactiveMongo driver, which encodes queries as BSONDocument values. In order to simplify the queries, some fields and values are modified during the BSON serialization:

  • player1Score and player2Score: player1 and player2 are replaced by the actual userID, giving for example 1userID_Score (called userScore below).
  • player1Board and player2Board: same as above.
  • wonBy: the player number (Int) is replaced by the corresponding userID (wonBy: "1userID").

As mentioned above, during the computation each sub-worker receives the interval boundaries from and to.
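
The renaming can be illustrated with a plain Map standing in for the BSON document. This is a simplified sketch, not the module's serializer: the Game class below keeps only the fields affected by the renaming, the names SerializationSketch and toDocument are hypothetical, and omitting wonBy when no winner is recorded is an assumption.

```scala
object SerializationSketch {
  // Reduced stand-in for the Game entity (boards and other fields left out).
  final case class Game(player1: String, player2: String, player1Score: Int, player2Score: Int, wonBy: Int)

  def toDocument(game: Game): Map[String, Any] = {
    // The player number is replaced by the matching userID.
    val winner: Option[String] = game.wonBy match {
      case 1 => Some(game.player1)
      case 2 => Some(game.player2)
      case _ => None // assumption: no winner recorded, so the key is left out
    }
    val scores = Map[String, Any](
      s"${game.player1}_Score" -> game.player1Score, // e.g. "1userID_Score"
      s"${game.player2}_Score" -> game.player2Score
    )
    scores ++ winner.map(w => "wonBy" -> w).toList
  }
}
```

With keys of this shape, a query for one user only has to test whether the field named after that user exists.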

Amount

Since the field 1userID_Score (called userScore below) appears only in the games played by this user, the query selects only the documents in which this field exists.

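// Package of the ReactiveMongo 0.1x releases; 1.x moved it to reactivemongo.api.bson
import reactivemongo.bson.BSONDocument

// Ended games of this user created in the half-open interval [from, to)
val query = BSONDocument(
  "status" -> "ended",
  userScore -> BSONDocument("$exists" -> true),
  "creationTime" -> BSONDocument(
    "$gte" -> from.getMillis,
    "$lt" -> to.getMillis)
)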

This query returns a list of matched Game entities. The Amount statistic is the size of the list.

Won / Lost

// Ended games of this user in [from, to) that the user won
val queryWon = BSONDocument(
  "status" -> "ended",
  userScore -> BSONDocument("$exists" -> true),
  "wonBy" -> userID,
  "creationTime" -> BSONDocument(
    "$gte" -> from.getMillis,
    "$lt" -> to.getMillis)
)

// Same selection, but won by anyone other than the user
val queryLost = BSONDocument(
  "status" -> "ended",
  userScore -> BSONDocument("$exists" -> true),
  "wonBy" -> BSONDocument("$ne" -> userID),
  "creationTime" -> BSONDocument(
    "$gte" -> from.getMillis,
    "$lt" -> to.getMillis)
)

As above, each query returns a list of matched Game entities.
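
To make the semantics of these three queries explicit, here is an in-memory equivalent written over a plain Scala collection. It does not use ReactiveMongo; GameDoc is a hypothetical, reduced view of a serialized game, and the object and function names are chosen for this example only.

```scala
object QuerySemanticsSketch {
  // Reduced view of a serialized game: `scores` holds the "<userID>_Score" fields.
  final case class GameDoc(status: String, scores: Map[String, Int], wonBy: String, creationTime: Long)

  // Common part of the three queries: ended, played by the user, created in [from, to).
  private def playedIn(userID: String, from: Long, to: Long): GameDoc => Boolean =
    g => g.status == "ended" &&
      g.scores.contains(s"${userID}_Score") && // "$exists" -> true
      g.creationTime >= from &&                // "$gte"
      g.creationTime < to                      // "$lt"

  def amount(games: Seq[GameDoc], userID: String, from: Long, to: Long): Int =
    games.count(playedIn(userID, from, to))

  def won(games: Seq[GameDoc], userID: String, from: Long, to: Long): Int =
    games.count(g => playedIn(userID, from, to)(g) && g.wonBy == userID)

  // Like "$ne", this counts every ended game whose wonBy differs from the user,
  // which would include games without a winner if such games exist.
  def lost(games: Seq[GameDoc], userID: String, from: Long, to: Long): Int =
    games.count(g => playedIn(userID, from, to)(g) && g.wonBy != userID)
}
```

One consequence of this formulation is that amount always equals won + lost for the same user and interval.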

QuestionsBreakDown

All games played by the user in the interval are retrieved with a query similar to the one used for Amount. All aggregations are then done in user space, in Scala.
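
A user-space aggregation of this kind might look like the sketch below. The Answer type and the question kinds used as examples are hypothetical; the output fields follow the QuestionsBreakDown entity described earlier, with percentCorrect assumed to be on a 0 to 100 scale.

```scala
object QuestionsBreakDownSketch {
  // Hypothetical flattened input: one entry per question answered in the retrieved games.
  final case class Answer(kind: String, correct: Boolean)

  final case class QuestionsBreakDown(
    questionsBreakDownKind: String,
    totalAmount: Int,
    correct: Int,
    percentCorrect: Double)

  def breakDown(answers: Seq[Answer]): List[QuestionsBreakDown] =
    answers
      .groupBy(_.kind)
      .toList
      .sortBy(_._1) // stable output order, by kind name
      .map { case (kind, group) =>
        val total = group.size // never 0: groupBy only creates non-empty groups
        val correct = group.count(_.correct)
        QuestionsBreakDown(kind, total, correct, 100.0 * correct / total)
      }
}
```

Grouping in Scala keeps the database query identical to the Amount query, at the cost of transferring every matched game to the application.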

GamesPlayedAgainst

Same as QuestionsBreakDown
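
For this statistic, the user-space aggregation amounts to grouping the retrieved games by opponent. The following is a sketch under assumptions: PlayedGame is a hypothetical reduced view of a game in which wonBy already holds a userID, and lost is counted the same way as in the Lost query (any game not won by the user).

```scala
object GamesPlayedAgainstSketch {
  // Hypothetical reduced view of a retrieved game.
  final case class PlayedGame(player1: String, player2: String, wonBy: String)

  final case class GamesPlayedAgainst(userID: String, numberOfGames: Int, won: Int, lost: Int)

  def against(userID: String, games: Seq[PlayedGame]): List[GamesPlayedAgainst] =
    games
      .filter(g => g.player1 == userID || g.player2 == userID)
      .groupBy(g => if (g.player1 == userID) g.player2 else g.player1) // key = opponent
      .toList
      .sortBy(_._1)
      .map { case (opponent, played) =>
        val won = played.count(_.wonBy == userID)
        GamesPlayedAgainst(opponent, played.size, won, played.size - won)
      }
}
```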

Master Semester project, Audrey Loeffel, June 2016, EPFL. Supervised by Michele Catasta, with the help of Roger Küng.