Microservice Recipe: Detecting Sentiment Inflection Within Social Data

02 Oct 2016

An example monitoring and alerting function to detect sentiment inflection points

Sentiment analysis is a much-debated measurement within the social analytics world. Many social data analysts and data scientists discard it because of the inaccuracies that misclassification is likely to cause. Complexities include longer-form text (which may contain both positive and negative sentiment), false positives, and dealing with slang, misspellings and shorthand, all of which are common in social data.

It is also sometimes perceived that understanding the sentiment of conversations can have limited value even when deemed to be accurate. For example, within a financial analysis use case, understanding that Twitter conversations around a specific company are largely positive may not be particularly useful.

Use Case

With the above context in mind, a more useful insight may be understanding when a sentiment score is changing. If something is moving from positive to negative or vice versa, this can be a powerful signal to social media analysts that something warrants further investigation. The concept for tracking these changes is simple: understand what normal looks like and, if this changes, raise an alert.

In this example, Eventn is used to monitor incoming sentiment scores and compare two sliding windows of time against each other to detect signal inflection. Because averages over each window are compared, a sustained shift in sentiment can be detected without a single outlier triggering an alert. This recipe does not cover measuring sentiment itself. That topic warrants a separate recipe post!

Sample Data

Sentiment engines typically produce a score value to indicate whether the content is positive, neutral or negative. For the purposes of this example, it is assumed that an integer between -100 and +100 is provided, where negative values indicate negative sentiment, zero is neutral and positive values indicate positive sentiment. The further the value is from zero, the stronger the sentiment signal.

To emulate social data (decorated with a sentiment score) being consumed by an Eventn service, a simple script was developed to generate a fictitious sentiment score and to POST this to the service for storage. An example JSON body:

  {
      "records": [
          {
              "sentiment": 2
          }
      ]
  }

Clearly the JSON data can be sent in any format. Here we send a records array containing a single sentiment value object, but multiple objects could of course be present within this array. Note that the GET function later in this recipe reads only the first object (records[0]).
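The generator script itself is not shown in this recipe, so the following is only a minimal sketch of how such a payload might be built. The helper names randomSentiment and buildPayload are hypothetical, and the endpoint in the closing comment is a placeholder rather than a real Eventn URL.

```javascript
// Hypothetical helpers for emulating incoming social data.

// Return a random integer sentiment score between min and max (inclusive),
// following the -100 to +100 convention described above.
function randomSentiment(min = -100, max = 100) {
    return Math.floor(Math.random() * (max - min + 1)) + min;
}

// Wrap one or more scores in the "records" array format shown above.
function buildPayload(scores) {
    return { records: scores.map(sentiment => ({ sentiment })) };
}

const body = JSON.stringify(buildPayload([randomSentiment()]));
console.log(body);

// The body could then be sent with any HTTP client, for example:
// fetch(SERVICE_URL, { method: "POST", headers: { "Content-Type": "application/json" }, body });
// where SERVICE_URL stands in for your own service endpoint.
```

Running the script on a timer (for example with setInterval) would produce a steady stream of scores for the service to store.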

Example POST Function

The POST function executes when a POST request is made to the Eventn service. Here we use the MySQL store to persist the sentiment values. Given that the default schema provides a data field of type JSON, no schema modifications are required.

The POST function is the default one, which simply saves the payload to the store:

function onPost(context, request) {
    return context.store().insert({ data: JSON.stringify(request.payload) });
}

module.exports = onPost;

Calculating Sentiment Inflection

For this example we create two sliding time windows containing all sentiment values for their respective periods. The "local" window will contain sentiment values for the last 10 minutes and the "global" window will contain sentiment values for the last 60 minutes. The output score is then calculated as follows:

score = (avg(local) - avg(global)) / standardDeviation(global)

Using this technique, while sentiment values stay within a constant range, the score will typically sit within a -1 to +1 range. If the sentiment values start to trend outside of the normal range, the score will increase or decrease accordingly, which makes it simple to alert on. For example, if the score rises above 1, sentiment has shifted in a positive direction.
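To make the formula concrete, here is a small worked sketch in plain JavaScript. It is written without any library so that it runs on its own; the helper names are illustrative, the sample values are invented, and the population form of the standard deviation is an assumption made for this sketch.

```javascript
// Arithmetic mean of a non-empty array of numbers.
function mean(values) {
    return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Population standard deviation (divides by n, not n - 1).
function standardDeviation(values) {
    const avg = mean(values);
    const variance = mean(values.map(v => (v - avg) ** 2));
    return Math.sqrt(variance);
}

// score = (avg(local) - avg(global)) / standardDeviation(global)
function inflectionScore(localWindow, globalWindow) {
    return (mean(localWindow) - mean(globalWindow)) / standardDeviation(globalWindow);
}

// Invented sample: sentiment sits at 10 for most of the hour,
// then jumps to 60 for the two most recent readings.
const globalWindow = [10, 10, 10, 10, 10, 10, 10, 10, 60, 60];
const localWindow = [60, 60];

// Global average is 20 and global deviation is 20, so (60 - 20) / 20:
console.log(inflectionScore(localWindow, globalWindow)); // → 2
```

A score of 2 is well outside the -1 to +1 band, so this sample would count as a positive inflection.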

Example GET Function

To put the formula into practice we need to collect all of the sentiment values for the local and global time windows. Given that we are storing all values within the store, this is simply two SQL statements with different time ranges. We store the result array of each query in a variable and then perform the calculation across both sets of values. Note the use of the Simple Statistics library, which makes the standard deviation calculation a breeze.

const ss = require("simple-statistics");

function onGet(context, request) {
    let localWindow;
    let debug = {};

    // "local" window: sentiment values stored during the last 10 minutes
    return context.stores.default().raw(`select data->"$.records[0].sentiment" as "local" FROM ?? WHERE ts_created BETWEEN DATE_SUB(NOW(), INTERVAL 10 minute) AND NOW();`, context.id)
        .then(function(rows) {
            localWindow = Object.keys(rows).map(f => rows[f]["local"]);
            // "global" window: sentiment values stored during the last 60 minutes
            return context.stores.default().raw(`select data->"$.records[0].sentiment" as "global" FROM ?? WHERE ts_created BETWEEN DATE_SUB(NOW(), INTERVAL 60 minute) AND NOW();`, context.id);
        })
        .then(function(rows) {
            let globalWindow = Object.keys(rows).map(f => rows[f]["global"]);
            debug.globalWindow_length = globalWindow.length;
            debug.localWindow_length = localWindow.length;

            let avgGlobal = ss.mean(globalWindow);
            let avgLocal = ss.mean(localWindow);
            debug.globalWindow_avg = avgGlobal;
            debug.localWindow_avg = avgLocal;

            // calculate score
            let score = (avgLocal - avgGlobal) / ss.standardDeviation(globalWindow);

            // alert flag: true when the score reaches the threshold in either direction
            let alert = score >= 1 || score <= -1;

            return { score, "alert": alert, debug };
        });
}

module.exports = onGet;
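One edge case worth considering is a window with no data, or a global window in which every value is identical. In the first case there is no average to compare, and in the second the standard deviation is 0, so the division yields Infinity or NaN. The guard below is a hypothetical helper, not part of Eventn or Simple Statistics, sketching one way to check for both before calculating a score.

```javascript
// Hypothetical guard: returns true only when the score formula is well defined.
function canScore(localWindow, globalWindow) {
    // An empty window has no average to compare.
    if (localWindow.length === 0 || globalWindow.length === 0) {
        return false;
    }
    // If every global value is identical the standard deviation is 0,
    // so require at least one value that differs from the first.
    return globalWindow.some(v => v !== globalWindow[0]);
}

console.log(canScore([], [12, 15]));      // false: nothing in the local window
console.log(canScore([12], [15, 15, 15])); // false: zero deviation
console.log(canScore([12], [12, 15]));    // true
```

When the guard returns false, the function could return a null score with the alert flag left as false, rather than attempting the calculation.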

Example Results

When the function is executed by making an HTTP GET request (which can be done using the Service Editor interface), the result data returns the score as well as some interesting debug information telling us how many values are within each window, and the average:

{
  "score": 0.0104121361603175,
  "alert": false,
  "debug": {
      "globalWindow_length": 3019,
      "localWindow_length": 879,
      "globalWindow_avg": 14.539913878767804,
      "localWindow_avg": 14.630261660978384
  }
}

You may also note an alert flag. This is a simple flag that is set to true when the score reaches a predefined threshold in either direction (in this case hard coded to 1, so a score of 1 or more, or -1 or less) and to false otherwise. A nice extension would be to allow this value to be set via a request parameter.
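As a rough sketch of that extension, the threshold could be read from the request and validated before use. How Eventn exposes query parameters is not covered in this recipe, so the query object and the threshold parameter name below are assumptions, and both helper names are hypothetical.

```javascript
const DEFAULT_THRESHOLD = 1;

// `query` is assumed to be a plain object of query-string parameters,
// e.g. { threshold: "1.5" }. Falls back to the default when the
// parameter is missing, not numeric, or not positive.
function parseThreshold(query, fallback = DEFAULT_THRESHOLD) {
    const value = Number(query && query.threshold);
    return Number.isFinite(value) && value > 0 ? value : fallback;
}

// True when the score reaches the threshold in either direction.
function isAlert(score, threshold) {
    return Math.abs(score) >= threshold;
}

console.log(isAlert(1.2, parseThreshold({ threshold: "1.5" }))); // false
console.log(isAlert(1.2, parseThreshold({})));                   // true
```

Validating the value matters here because a threshold of zero or a non-numeric string would otherwise make every score, or no score, raise an alert.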

When the score is plotted over time and higher sentiment values are introduced, the score increases accordingly and the "alert" flag is set once it exceeds 1.

Results
