In this multi-part series, we will build an end-to-end search engine. This first part focuses on getting the right tools and the technology stack ready. We will build the search engine with an AngularJS front end and use Elasticsearch as the computation back end.

Most applications today are data driven. Single Page Applications (SPAs) are gaining a lot of traction because of their simplicity and their ability to act as a graceful front end to gigabytes of back-end data.

Features

Search engines, Google in particular, have become extremely intuitive over the past two decades. What we will attempt to build is a full-text search engine that works as a fast information retrieval system when you are sitting on lots of data. Some of the features we will look to incorporate are:

  • Fuzzy Search
  • Subset Pattern Matching
  • Auto-complete suggestions
  • Scoring Algorithm
  • Further sorting and filtering of the results
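
To give a feel for the first item, fuzzy search is commonly described in terms of edit distance: the number of single-character insertions, deletions, or substitutions needed to turn one word into another. The sketch below is only an illustration of that idea in plain Python. It is not how Elasticsearch implements fuzzy matching internally, and the names edit_distance, fuzzy_match, and max_edits are made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query, candidates, max_edits=2):
    """Return the candidates within max_edits of the query (hypothetical helper)."""
    q = query.lower()
    return [c for c in candidates if edit_distance(q, c.lower()) <= max_edits]


print(fuzzy_match("serach", ["search", "engine"]))
```

A misspelled query such as "serach" is two edits away from "search", so it still matches when two edits are allowed.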

AngularJS

AngularJS is an open-source front-end JavaScript framework developed by Google. It lets you build lightweight, dynamic web apps. It is a single JavaScript library that you include in your page, and that is enough to get started. You do not need to install anything separately.

Elasticsearch

In building an end-to-end search engine, the computation engine is the most crucial component. Elasticsearch is a search server built on top of Apache Lucene that exposes a REST interface. It is simple and blazingly fast, and it offers all the CRUD operations needed for indexing and retrieving data. More about Elasticsearch later. Installing Elasticsearch requires Java 8, and Oracle JDK version 1.8.0_73 is recommended. You can head over to Oracle’s website to install Java 8 for your operating system.

Once Java is set up, you are ready to install Elasticsearch. Download the Elasticsearch binaries from here.

After downloading, extract the contents into a folder of your choice.

Open a command prompt window, navigate to this folder, and change into its bin folder. Type elasticsearch and press Enter. If there are no errors, open the following URL in your browser:

http://localhost:9200

{
  "name" : "Hitman",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.3.2",
    "build_hash" : "b9e4a6acad4008027e4038f6abed7f7dba346f94",
    "build_timestamp" : "2016-04-21T16:03:47Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.0"
  },
  "tagline" : "You Know, for Search"
}

You should see a response similar to the one above. Any errors are likely to be system or platform specific. Drop a note in the comments and I’d love to take a look.
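
If you would rather check from a script than from the browser, the same request can be made with Python's standard library. This is a minimal sketch, assuming the node is listening on http://localhost:9200 as above. The function names summarize and check_node are made up for this example.

```python
import json
from urllib.request import urlopen


def summarize(body: str) -> str:
    """Build a one-line summary from the JSON returned by the root endpoint."""
    info = json.loads(body)
    version = info["version"]
    return "{} (cluster {}): Elasticsearch {}, Lucene {}".format(
        info["name"], info["cluster_name"],
        version["number"], version["lucene_version"],
    )


def check_node(url: str = "http://localhost:9200") -> str:
    """Fetch the root endpoint of a running node and summarize the response."""
    with urlopen(url, timeout=5) as response:
        return summarize(response.read().decode("utf-8"))
```

With Elasticsearch running, print(check_node()) should print the node name, cluster name, and version numbers from the JSON response. If nothing is listening on that port, urlopen raises an error instead.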

Python

I love using the Anaconda distribution of Python. Give it a chance if you haven’t already; it works wonders. Python will not be part of the live setup. We will only use it to apply a few configurations to Elasticsearch, because it has a very nice Elasticsearch client library. Once the distribution is installed, head over to the command prompt and type

pip install elasticsearch
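
To confirm that the client installed correctly and can reach your node, something like the following sketch may help. It assumes the node from the previous section is running on http://localhost:9200. The helper names client_available and cluster_version are hypothetical. Note that the major version of the Python client generally needs to match the major version of your Elasticsearch server, so a newer client may refuse to talk to an older node.

```python
import importlib.util


def client_available() -> bool:
    """Return True if the `elasticsearch` package can be imported."""
    return importlib.util.find_spec("elasticsearch") is not None


def cluster_version(url: str = "http://localhost:9200") -> str:
    """Ask a running node for its version number through the Python client."""
    # Imported here so that client_available() works even without the package.
    from elasticsearch import Elasticsearch

    es = Elasticsearch([url])  # assumes a local, unsecured node
    return es.info()["version"]["number"]
```

If client_available() returns True and the node is running, cluster_version() should return the same version number that the browser response reported.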

That is all you need to do to complete the installation and setup.
Finally, congratulations! You have taken the first step towards building an end-to-end search engine. Going forward, we will look directly at the Elasticsearch configuration and the Angular UI to go with it.

Building an end-end Search Engine: Part 1
