Microservice Recipe: URL Topic Extraction

19 Jul 2017

Extract topics for a given URL using Latent Dirichlet Allocation (LDA).

This recipe demonstrates how to develop a microservice that automatically extracts topics from web page content. This is useful for classifying content automatically and for generating metadata. Latent Dirichlet Allocation topic modeling is a technique for unsupervised analysis of large document collections.

This example microservice uses the LDA NPM package written by Kory Becker and Node Readability by Zihua Li.
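
As a rough sketch of how these two packages might fit together: the readable text of the page is split into sentences, and the sentences are passed to the topic model. The function below takes the LDA routine as an argument so that the sketch runs without any packages installed. The names `splitSentences` and `extractTopics` are hypothetical, and the call shape `ldaFn(sentences, topics, terms)` returning arrays of `{ term, probability }` objects is an assumption based on the LDA package's README, not something this recipe states.

```javascript
// Split plain text into sentences; each sentence is treated as one document.
function splitSentences(text) {
  return (text.match(/[^.!?]+[.!?]+/g) || [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Run a topic model over the text and return, per topic, its list of terms.
// `ldaFn` is ASSUMED to behave like the LDA NPM package:
//   ldaFn(sentences, numberOfTopics, termsPerTopic)
//     -> [[{ term, probability }, ...], ...]
function extractTopics(text, topics, terms, ldaFn) {
  const sentences = splitSentences(text);
  if (sentences.length === 0) return [];
  return ldaFn(sentences, topics, terms).map((topic) =>
    topic.map(({ term, probability }) => ({ term, probability }))
  );
}
```

In a real service, `ldaFn` would presumably be the function exported by the LDA package, and `text` would come from the article body that Node Readability extracts from the page.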





url (STRING, required) - the URL of the page to extract topics from.
topics (NUMBER, optional) - the number of topics to extract, each with its related keywords.
terms (NUMBER, optional) - the number of terms to derive for each topic.
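
The parameters above could be validated along the following lines before any extraction work starts. This is a minimal sketch: `parseParams` and `parsePositiveInt` are hypothetical helper names, and the default values of 2 topics and 5 terms are assumptions chosen for illustration, since the recipe does not state the service's defaults.

```javascript
// ASSUMED defaults; the recipe does not state the service's actual defaults.
const DEFAULT_TOPICS = 2;
const DEFAULT_TERMS = 5;

// Convert an optional parameter to a positive integer, or fall back to a default.
function parsePositiveInt(value, fallback, name) {
  if (value === undefined || value === null || value === "") return fallback;
  const n = Number(value);
  if (!Number.isInteger(n) || n < 1) {
    throw new Error(`${name} must be a positive integer`);
  }
  return n;
}

// Validate the request parameters: `url` is required, `topics` and `terms` are optional.
function parseParams(query) {
  if (typeof query.url !== "string" || query.url.trim() === "") {
    throw new Error("url is required");
  }
  // The WHATWG URL constructor throws a TypeError on malformed input.
  const url = new URL(query.url).toString();
  return {
    url,
    topics: parsePositiveInt(query.topics, DEFAULT_TOPICS, "topics"),
    terms: parsePositiveInt(query.terms, DEFAULT_TERMS, "terms"),
  };
}
```

Query string values arrive as strings, which is why the optional parameters go through `Number` before the integer check.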


Example Request

Here we take an example web page URL, e.g. "https://techcrunch.com/2017/07/19/apple-launches-machine-learning-research-site/", URL-encode it, and send it as a request parameter in an HTTP GET request:

curl -H "Content-Type: application/json" \
-H "Authorization: Bearer {SERVICE_TOKEN}" \
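
The URL-encoding step described above can be sketched in Node as follows. Only the example page URL comes from this recipe; the `topics` and `terms` values are illustrative, and the service endpoint is deliberately left out because it is not shown here.

```javascript
const pageUrl =
  "https://techcrunch.com/2017/07/19/apple-launches-machine-learning-research-site/";

// encodeURIComponent escapes ":" and "/" so the page URL is safe as a parameter value.
const encoded = encodeURIComponent(pageUrl);

// URLSearchParams applies similar escaping and is convenient with several parameters.
// The values 2 and 5 are illustrative only.
const queryString = new URLSearchParams({
  url: pageUrl,
  topics: "2",
  terms: "5",
}).toString();

console.log(encoded);
console.log(queryString);
```

The resulting query string would be appended to the service endpoint after a `?` when making the GET request.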

Example Response
