Microservice Recipe: Latent Dirichlet Allocation (LDA) Topic Modeling

02 Jul 2017

LDA is a machine learning algorithm that extracts topics and their related keywords from a collection of documents.

Latent Dirichlet Allocation topic modeling is a powerful technique for unsupervised analysis of large document collections. Topic models represent latent topics in text as hidden random variables and discover that structure through posterior inference. They have a wide range of applications, such as tag recommendation, text categorization, keyword extraction, and similarity search, across text mining, information retrieval, and statistical language modeling.

This example microservice uses the LDA NPM package written by Kory Becker.

Example GET Function

Description

This example microservice extracts topics from a provided text string. Each topic is returned as a list of terms with their associated probabilities.

Parameters

text (STRING, required) - the text string for topic extraction.
topics (NUMBER, optional) - the number of topics to extract; defaults to 1.
terms (NUMBER, optional) - the number of terms to derive for each topic; defaults to 10.

Code

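const lda = require('lda');

function onGet(context, request) {
  // Defaults. Query-string values typically arrive as strings, so parse the numeric ones.
  const topics = parseInt(request.query.topics, 10) || 1;
  const terms = parseInt(request.query.terms, 10) || 10;
  const text = request.query.text || "";

  if (!text) return "ERROR: Text cannot be empty.";

  // Extract sentences; each sentence is passed to lda() as one document.
  // match() returns null when the text has no sentence terminator, so fall back to the whole text.
  const documents = text.match(/[^\.!\?]+[\.!\?]+/g) || [text];

  return lda(documents, topics, terms);
}

module.exports = onGet;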
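
To make the sentence-extraction step concrete, here is a minimal sketch that applies the same regular expression outside the service. The helper name splitSentences and the sample string are illustrative assumptions, not part of the recipe; trimming the matches and falling back to the whole text are choices made for this sketch.

```javascript
// Hypothetical helper (not part of the recipe): splits text into sentences
// using the same regular expression as the microservice code.
function splitSentences(text) {
  const sentences = text.match(/[^\.!\?]+[\.!\?]+/g);
  // match() returns null when no sentence terminator is found,
  // so this sketch falls back to treating the whole text as one document.
  return (sentences || [text]).map((s) => s.trim());
}

const docs = splitSentences("Drones are hard to trace. Who is the pilot? Nobody knows!");
console.log(docs);
// [ 'Drones are hard to trace.', 'Who is the pilot?', 'Nobody knows!' ]
```

Note that text after the last terminator is not matched by this expression, so a trailing fragment without a period, question mark, or exclamation mark is dropped when at least one complete sentence precedes it.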

Example Request

Here we take several paragraphs of text from an example news article, URL-encode them, and submit them in an HTTP GET request. The sample text is as follows:

It can be really hard to tell who is flying a drone, even if the aircraft is flying within a pilot’s line of sight. Just because you can see the drone doesn’t mean you can see the pilot, and when a drone is hundreds of feet in the air, the pilot could be anywhere.

The difficulty of identifying who is flying a drone has sparked alarm among law enforcement, which is one reason why the Federal Aviation Administration has opened a new rulemaking committee to try to find a solution that would allow police to identify drones remotely.

The FAA held its first meeting of that committee last week, and today the agency finally reported on what took place. At the meeting, participants — who included representatives from Amazon, Ford and the New York Police Department — talked about various remote identification solutions currently available, air traffic control for drones and concerns from law enforcement.

Though most drones that weigh over half a pound are registered, and thus should have an identification number on the drone, that ID is nearly impossible to see from the ground.

Not all drones over a half pound are registered, though, since a federal court nixed the FAA’s registration rules for non-commercial aircraft last month, saying the agency didn’t have the authority to require registration of drones that are being flown for fun.

Still, legislation is moving through Congress now that could restore the FAA’s authority to regulate non-commercial drones, which would allow the agency to reinstate the registration requirement.

Registration will likely be necessary for any remote identification system to work, since the drone would have to be listed in some sort of database in order to associate the aircraft with its operator or owner.

URL-encode the text and send it in an HTTP GET request:

curl -H 'authorization: Bearer {SERVICE_TOKEN}' -H 'content-type: application/json' 'https://service.eventn.com/{SERVICE_ID}?text=It%20can%20be%20really%20hard%20to%20tell%20who%20is%20flying%20a%20drone%2C%20even%20if%20the%20aircraft%20is%20flying%20within%20a%20pilot%E2%80%99s%20line%20of%20sight.%20Just%20because%20you%20can%20see%20the%20drone%20doesn%E2%80%99t%20mean%20you%20can%20see%20the%20pilot%2C%20and%20when%20a%20drone%20is%20hundreds%20of%20feet%20in%20the%20air%2C%20the%20pilot%20could%20be%20anywhere.%20The%20difficulty%20of%20identifying%20who%20is%20flying%20a%20drone%20has%20sparked%20alarm%20among%20law%20enforcement%2C%20which%20is%20one%20reason%20why%20the%20Federal%20Aviation%20Administration%20has%20opened%20a%20new%20rulemaking%20committee%20to%20try%20to%20find%20a%20solution%20that%20would%20allow%20police%20to%20identify%20drones%20remotely.%20The%20FAA%20held%20its%20first%20meeting%20of%20that%20committee%20last%20week%2C%20and%20today%20the%20agency%20finally%20reported%20on%20what%20took%20place.%20At%20the%20meeting%2C%20participants%20%E2%80%94%20who%20included%20representatives%20from%20Amazon%2C%20Ford%20and%20the%20New%20York%20Police%20Department%20%E2%80%94%20talked%20about%20various%20remote%20identification%20solutions%20currently%20available%2C%20air%20traffic%20control%20for%20drones%20and%20concerns%20from%20law%20enforcement.%20Though%20most%20drones%20that%20weigh%20over%20half%20a%20pound%20are%20registered%2C%20and%20thus%20should%20have%20an%20identification%20number%20on%20the%20drone%2C%20that%20ID%20is%20nearly%20impossible%20to%20see%20from%20the%20ground.%20Not%20all%20drones%20over%20a%20half%20pound%20are%20registered%2C%20though%2C%20since%20a%20federal%20court%20nixed%20the%20FAA%E2%80%99s%20registration%20rules%20for%20non-commercial%20aircraft%20last%20month%2C%20saying%20the%20agency%20didn%E2%80%99t%20have%20the%20authority%20to%20require%20registration%20of%20drones%20that%20are%20being%20flown%20for%20fun.%20Still%2C%20legislation%20is%20moving%20through%20Congress%20now%20that%20could%20restore%20the%20FAA%E2%80%99s%20authority%20to%20regulate%20non-commercial%20drones%2C%20which%20would%20allow%20the%20agency%20to%20reinstate%20the%20registration%20requirement.%20Registration%20will%20likely%20be%20necessary%20for%20any%20remote%20identification%20system%20to%20work%2C%20since%20the%20drone%20would%20have%20to%20be%20listed%20in%20some%20sort%20of%20database%20in%20order%20to%20associate%20the%20aircraft%20with%20its%20operator%20or%20owner.'
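
The encoding step can also be done programmatically rather than by hand. The following is a minimal sketch that builds the request URL with JavaScript's built-in encodeURIComponent. The helper name buildRequestUrl is a hypothetical choice for illustration, and {SERVICE_ID} remains a placeholder exactly as in the curl command; the authorization header still has to be supplied by whatever HTTP client sends the request.

```javascript
// Hypothetical helper (not part of the recipe): builds the GET URL for the service.
// The optional topics and terms parameters are appended only when provided.
function buildRequestUrl(serviceId, text, topics, terms) {
  const params = [`text=${encodeURIComponent(text)}`];
  if (topics !== undefined) params.push(`topics=${encodeURIComponent(topics)}`);
  if (terms !== undefined) params.push(`terms=${encodeURIComponent(terms)}`);
  return `https://service.eventn.com/${serviceId}?${params.join('&')}`;
}

// '{SERVICE_ID}' is a placeholder; substitute the real service identifier.
const url = buildRequestUrl('{SERVICE_ID}', 'It can be really hard to tell who is flying a drone, even', 2, 5);
console.log(url);
```

Long texts produce long URLs, and clients or servers may cap URL length, so very large inputs may need to be shortened before being sent this way.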

Example Response

{
  "status": "success",
  "data": [
    [
      {
        "term": "drone",
        "probability": 0.099
      },
      {
        "term": "registration",
        "probability": 0.033
      },
      {
        "term": "remotely",
        "probability": 0.025
      },
      {
        "term": "pilots",
        "probability": 0.025
      },
      {
        "term": "identification",
        "probability": 0.025
      },
      {
        "term": "flying",
        "probability": 0.025
      },
      {
        "term": "faa",
        "probability": 0.025
      },
      {
        "term": "aircraft",
        "probability": 0.025
      },
      {
        "term": "agency",
        "probability": 0.025
      }
    ]
  ]
}
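
A client will usually want just the terms for each topic. The sketch below assumes a response shaped like the example above: a "status" field, and a "data" array holding one array of term/probability objects per topic. The helper name summarizeTopics and the abbreviated sample object are illustrative assumptions.

```javascript
// Hypothetical helper (not part of the recipe): reduces a response shaped like
// the example to a list of terms per topic.
function summarizeTopics(response) {
  if (response.status !== 'success') {
    throw new Error('Unexpected response status: ' + response.status);
  }
  return response.data.map((topic, index) => ({
    topic: index + 1,
    terms: topic.map((entry) => entry.term),
  }));
}

// Abbreviated copy of the example response, keeping only the first two terms.
const sample = {
  status: 'success',
  data: [[
    { term: 'drone', probability: 0.099 },
    { term: 'registration', probability: 0.033 },
  ]],
};

console.log(summarizeTopics(sample));
// [ { topic: 1, terms: [ 'drone', 'registration' ] } ]
```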
