Microservice Recipe: URL Topic Extraction

19 Jul 2017

Extract topics for a given URL using Latent Dirichlet Allocation (LDA).

This recipe demonstrates how to develop a microservice that automatically extracts topics from web page content, which is useful for classifying content and generating metadata. Latent Dirichlet Allocation (LDA) is an unsupervised topic-modeling technique well suited to analyzing large document collections.

This example microservice uses the lda npm package written by Kory Becker and node-readability by Zihua Li.

Example GET Function

Description

Extract topics for a given URL using Latent Dirichlet Allocation (LDA).

Parameters

url STRING, REQUIRED - the URL of the page to extract topics from.
topics NUMBER, optional - the number of topics to extract, each with its related keywords.
terms NUMBER, optional - the number of terms to derive for each topic.
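
As a minimal sketch of how these parameters might be validated before any extraction runs, the helper below normalizes a parsed query object. The names parseParams and parseCount and the default values are hypothetical; the recipe does not state what the service uses when topics or terms are omitted.

```javascript
// Hypothetical defaults: the recipe does not say what the service uses
// when `topics` or `terms` are omitted.
const DEFAULT_TOPICS = 2;
const DEFAULT_TERMS = 5;

// Parse an optional positive-integer parameter, falling back to a default.
function parseCount(value, fallback, name) {
  if (value === undefined || value === null || value === '') return fallback;
  const n = Number(value);
  if (!Number.isInteger(n) || n < 1) {
    throw new Error(`${name} must be a positive integer`);
  }
  return n;
}

// Validate the query parameters listed above and return normalized values.
function parseParams(query) {
  if (typeof query.url !== 'string' || query.url.length === 0) {
    throw new Error('url is required');
  }
  // The WHATWG URL constructor throws a TypeError on malformed input.
  const pageUrl = new URL(query.url);
  if (pageUrl.protocol !== 'http:' && pageUrl.protocol !== 'https:') {
    throw new Error('url must use http or https');
  }
  return {
    url: pageUrl.href,
    topics: parseCount(query.topics, DEFAULT_TOPICS, 'topics'),
    terms: parseCount(query.terms, DEFAULT_TERMS, 'terms'),
  };
}
```

Query string values arrive as strings, so the counts are converted with Number and rejected if they are not positive integers.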

Code
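
The following is a rough sketch of the extraction flow, not the service's actual source. It assumes the call shape shown in the lda package README, lda(sentences, numberOfTopics, termsPerTopic), returning one array of { term, probability } objects per topic. The function names, the injected deps object, and the response shape are all hypothetical. The page fetcher and the LDA function are passed in so the flow can be exercised without network access.

```javascript
// Split text into sentence-like chunks, which serve as the LDA "documents".
function splitSentences(text) {
  const matches = text.match(/[^.!?]+[.!?]+/g) || [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Extract topics for a page.
//   params: { url, topics, terms }
//   deps.fetchText(url): hypothetical helper resolving to the page's readable text
//   deps.lda(sentences, topics, terms): assumed to match the lda package's call shape
async function extractTopics(params, deps) {
  const text = await deps.fetchText(params.url);
  const sentences = splitSentences(text);
  if (sentences.length === 0) return [];

  const result = deps.lda(sentences, params.topics, params.terms);

  // Hypothetical response shape: number each topic and keep its weighted terms.
  return result.map((topic, index) => ({
    topic: index + 1,
    terms: topic.map(({ term, probability }) => ({ term, probability })),
  }));
}
```

In a deployed function, deps.lda would presumably be the lda module itself, and deps.fetchText a small wrapper around node-readability that resolves with the article text.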

Example Request

Here we take an example web page URL, e.g. "https://techcrunch.com/2017/07/19/apple-launches-machine-learning-research-site/", URL-encode it, and send it as the url request parameter in an HTTP GET request:

curl -H "Content-Type: application/json" \
-H "Authorization: Bearer {SERVICE_TOKEN}" \
https://service.eventn.com/{SERVICE_NAME}?url=https%3A%2F%2Ftechcrunch.com%2F2017%2F07%2F19%2Fapple-launches-machine-learning-research-site%2F
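
The encoded query string in the request above can be produced with JavaScript's built-in encodeURIComponent. The helper below is a small sketch; buildRequestUrl is a hypothetical name, and {SERVICE_NAME} remains a placeholder for your own service.

```javascript
// Build the GET request URL for the service, percent-encoding each parameter.
// `options.topics` and `options.terms` are only appended when provided.
function buildRequestUrl(serviceName, pageUrl, options = {}) {
  const query = [`url=${encodeURIComponent(pageUrl)}`];
  if (options.topics !== undefined) {
    query.push(`topics=${encodeURIComponent(options.topics)}`);
  }
  if (options.terms !== undefined) {
    query.push(`terms=${encodeURIComponent(options.terms)}`);
  }
  return `https://service.eventn.com/${serviceName}?${query.join('&')}`;
}

const examplePage =
  'https://techcrunch.com/2017/07/19/apple-launches-machine-learning-research-site/';

// Prints the same URL used in the curl command above.
console.log(buildRequestUrl('{SERVICE_NAME}', examplePage));
```

Encoding matters because the page URL contains ":" and "/" characters, which would otherwise be ambiguous inside a query string.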

Example Response
