…in the real data multiplied by the number of iterations per day (so that the time period of the simulation matches that of the real data).

rsos.royalsocietypublishing.org R. Soc. open sci. 3:…

5.1. Calibration

We now describe how we calibrated our model to our Twitter data. The purpose of the six global parameters is to make our ABM 'tuneable', so that we can fine-tune it to match the behaviour observed in different kinds of online community. Calibrating the model to a particular community means finding the values of the six parameters that maximize the match between the model and the real data, i.e. the parameter values that make the simulation runs of the model most closely resemble the real data. In our case, the specific metrics that we use to compare the simulated data with the real data are: the activity level (number of messages sent per day) of each individual user and its day-to-day volatility, together with the sentiment of the whole network and its day-to-day volatility. Comparing the real data and simulated data in this way is an instance of the method of simulated moments. We therefore propose the following function to score a particular simulation run (smaller scores mean a better match):

$$\alpha \sum_{i=1}^{N} \left| C_i - \hat{C}_i \right| + \beta \sum_{i=1}^{N} \left| \mathrm{std}(C_i) - \mathrm{std}(\hat{C}_i) \right| + \gamma \left| E_c - \hat{E}_c \right| + \delta \left| \mathrm{std}(E_c) - \mathrm{std}(\hat{E}_c) \right|.$$

Here $N$ is the number of users. We denote by $C_i$, $\mathrm{std}(C_i)$ the average and standard deviation, respectively, of the number of messages sent each day by user $i$ in the real data, and by $\hat{C}_i$, $\mathrm{std}(\hat{C}_i)$ the corresponding values in the simulation run. Similarly, $E_c$, $\mathrm{std}(E_c)$ denote the average and standard deviation of daily community sentiment in the real data, and $\hat{E}_c$, $\mathrm{std}(\hat{E}_c)$ those values in the simulation run. The relative sizes of the constants $\alpha$, $\beta$, $\gamma$ and $\delta$ are set to reflect how we prioritize the various aspects of the comparison between the real and simulated data. We have used $\alpha = 1$, $\beta = 0.1$, $\gamma = 10$ and $\delta = 100$, which means that we are putting a lot of emphasis on matching the volatility of daily community sentiment, and less emphasis on matching the level of daily community sentiment. Conversely, for the number of messages sent per day by each agent, we prioritize matching the level over matching the volatility.

We chose to model a small community so that we could trace through the simulations, in order to understand them better. We concentrated on modelling community 17 (friends chatting), which has 28 users. We calibrated the model for each of the three sentiment measures (MC), (SS) and (L). Each calibration was performed with an iterative grid search: we used five successive grid searches, each time zooming in on the area of the parameter space that appeared most promising in the previous search. The initial ranges searched for each parameter are given in appendix D. Because the simulation runs are randomized, we performed 50 simulation runs for each combination of parameters tested, taking the mean of the resulting 50 scores as the score for that choice of parameters. The parameters found by the repeated grid search were as follows:

(MC) number of iterations per day: 1536; mean number of messages per burst, contagion of sentiment factor, sentiment reset probability, sentiment noise level: 2.…
(SS) 1536, 2.…
(L) 1536, 2.…
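To make the scoring function concrete, here is a minimal Python sketch of how one simulation run might be scored against the real data. It is an illustration only, not the authors' code: the function name `score`, the data layout (one list of daily message counts per user, one list of daily community sentiment values) and the weight names `alpha`, `beta`, `gamma`, `delta` are assumptions made for this example. The weight values 1, 0.1, 10 and 100 are those reported in the text. The population standard deviation is used here, which is also an assumption, since the text does not say which estimator was applied.

```python
from statistics import fmean, pstdev

# Weights reported in the text, in the order of the four terms:
# message level, message volatility, sentiment level, sentiment volatility.
# The names alpha..delta are placeholders chosen for this sketch.
WEIGHTS = (1.0, 0.1, 10.0, 100.0)


def score(real_counts, sim_counts, real_sentiment, sim_sentiment,
          weights=WEIGHTS):
    """Score one simulation run; smaller means a closer match.

    real_counts / sim_counts: one list of daily message counts per user,
        with users in the same order in both arguments.
    real_sentiment / sim_sentiment: daily community sentiment values.
    """
    alpha, beta, gamma, delta = weights
    pairs = list(zip(real_counts, sim_counts))
    # Sum over users of the gap in average daily activity.
    level_gap = sum(abs(fmean(r) - fmean(s)) for r, s in pairs)
    # Sum over users of the gap in day-to-day volatility of activity.
    # pstdev (population std) is an assumption of this sketch.
    volatility_gap = sum(abs(pstdev(r) - pstdev(s)) for r, s in pairs)
    # Gaps in the level and volatility of community sentiment.
    sentiment_gap = abs(fmean(real_sentiment) - fmean(sim_sentiment))
    sentiment_vol_gap = abs(pstdev(real_sentiment) - pstdev(sim_sentiment))
    return (alpha * level_gap + beta * volatility_gap
            + gamma * sentiment_gap + delta * sentiment_vol_gap)
```

With these weights, a unit mismatch in the volatility of community sentiment costs 100 times as much as a unit mismatch in one user's average daily message count, which reflects the prioritization described in the text.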
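The calibration procedure (successive grid searches, each scored by the mean over repeated randomized runs) can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function names, the `run_and_score` callback (a stand-in for running the ABM once and scoring the result), the number of grid points per parameter, and the zoom rule (halving each parameter's range around the best point found so far, clipped to the initial bounds) are all assumptions. Only the five rounds and the 50 runs per parameter combination come from the text.

```python
import itertools
import random
from statistics import fmean


def axis(lo, hi, points):
    """Evenly spaced grid values on [lo, hi], endpoints included."""
    return [lo + (hi - lo) * k / (points - 1) for k in range(points)]


def iterative_grid_search(bounds, run_and_score, rounds=5, points=5,
                          n_runs=50, shrink=0.5, seed=0):
    """Repeated grid search that zooms in on the best point found so far.

    bounds: list of (low, high) initial ranges, one per parameter.
    run_and_score(params, rng): hypothetical callback that runs one
        randomized simulation and returns its score (smaller is better).
    Returns (best_params, best_mean_score).
    """
    rng = random.Random(seed)
    ranges = list(bounds)
    widths = [hi - lo for lo, hi in bounds]
    best_params, best_score = None, float("inf")
    for _ in range(rounds):
        grid = itertools.product(*(axis(lo, hi, points) for lo, hi in ranges))
        for params in grid:
            # Runs are randomized, so average the score over n_runs runs.
            mean_score = fmean(run_and_score(params, rng)
                               for _ in range(n_runs))
            if mean_score < best_score:
                best_params, best_score = params, mean_score
        # Assumed zoom rule: shrink each range around the current best
        # point, never leaving the initial bounds.
        widths = [w * shrink for w in widths]
        ranges = [(max(lo0, b - w / 2), min(hi0, b + w / 2))
                  for (lo0, hi0), b, w in zip(bounds, best_params, widths)]
    return best_params, best_score


def toy_run_and_score(params, rng):
    """Toy stand-in for one simulation run: noisy bowl with minimum (0.3, 0.7)."""
    x, y = params
    return (x - 0.3) ** 2 + (y - 0.7) ** 2 + rng.gauss(0, 0.01)


if __name__ == "__main__":
    best, best_mean = iterative_grid_search([(0.0, 1.0), (0.0, 1.0)],
                                            toy_run_and_score)
    print(best, best_mean)
```

The cost grows quickly with the number of parameters: with six parameters and five grid points each, one round already requires 5^6 = 15625 combinations, each simulated 50 times, so a coarse grid that is refined over several rounds is far cheaper than a single fine grid.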
