Intranet Search Engine
The client is a global public relations firm supporting large clients, including many from the FTSE 100, Nasdaq 100, and Nikkei 225. The firm trades out of 14 countries and focuses on producing high quality multimedia offerings, so worldwide collaboration among its staff is paramount.
For a firm advising on high profile PR announcements, any piece of work has a large number of different outputs: press releases, videos, blogging, website updates, and printed literature, to name a few. The <u>types</u> of work, though, can usually be categorised into one of a few common threads.
In many cases, new work is fairly similar to previous work in the same area - for example, the press releases for one IPO will likely follow a similar pattern to those for the previous one. The client was looking for ways to continuously improve the outputs they provided to their own clients: cherry picking appropriate similar work to draw upon, and reusing existing best practice where possible. This would allow them to move towards standardising their offerings, making sure that no aspect was missed, and improving them with each piece of work.
However, such a large volume of stored historic (and current) data raises three questions. How do you categorise that data? How do you identify best practice within it? And how do you surface the relevant information in response to a user request?
One particularly tricky aspect was the multitude of terminology used throughout the business. Is an IPO the same as a market floatation? How similar to an Annual Report is a financial release?
This complicated problem was approached from two different directions: the data and the language.
Data was spread across multiple data sources (SharePoint, network shares, and SQL and Oracle databases). Continuous indexing was implemented for each data source, along with all the rules necessary to extract, break down, and classify the data into a meaningful shape. The results from these disparate data sources were then amalgamated into a single authoritative feed, taking into account the relative weighting of each to produce the final results.
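As a rough illustration of amalgamating weighted results, the sketch below merges per-source hit lists into one ranked feed. It is written in Python for brevity rather than the C# of the actual application. The source names, the weights, and the `merge_results` function are all hypothetical, because the real weighting scheme is not described here.

```python
from collections import defaultdict

# Hypothetical per-source weights (assumption: the real values are not published).
SOURCE_WEIGHTS = {
    "sharepoint": 1.0,
    "network_share": 0.6,
    "sql": 0.8,
    "oracle": 0.8,
}


def merge_results(results_by_source, weights=SOURCE_WEIGHTS):
    """Combine per-source (doc_id, score) hits into a single ranked list.

    A document found in several sources accumulates the weighted score
    from each one. Sources without a configured weight contribute zero.
    """
    combined = defaultdict(float)
    for source, hits in results_by_source.items():
        weight = weights.get(source, 0.0)
        for doc_id, score in hits:
            combined[doc_id] += weight * score
    # Highest combined score first; ties are broken by document id.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


ranked = merge_results({
    "sharepoint": [("ipo-press-release", 0.5)],
    "sql": [("ipo-press-release", 0.5), ("annual-report", 0.5)],
})
```

Summing weighted scores is only one possible design. It assumes each source's raw scores have already been normalised to a comparable range, which a real system would need to do for itself.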
To help address the language issues, a taxonomy was produced in collaboration with business stakeholders and integrated deeply into the search engine. The custom-written natural language processor allowed users to ask for data using their own terminology, and had the system provide additional results inferred from its programmed knowledge of the business language. The taxonomy was managed through a self-service portal, which allowed it to be continuously updated so that it stayed relevant.
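The idea of inferring additional results from a taxonomy can be sketched as simple query expansion. The Python below is illustrative only and does not reproduce the custom natural language processor. The `TAXONOMY` groups and the `expand_query` function are hypothetical, and whether two terms really belong in the same group is a decision for the business stakeholders.

```python
# Hypothetical taxonomy: each set groups terms the business treats as related.
TAXONOMY = [
    {"ipo", "initial public offering", "market floatation"},
    {"annual report", "financial release"},
]


def expand_query(query, taxonomy=TAXONOMY):
    """Return the user's term plus every term the taxonomy groups with it."""
    term = query.strip().lower()
    expanded = {term}
    for group in taxonomy:
        if term in group:
            expanded |= group
    # Sorted so the expansion is deterministic.
    return sorted(expanded)


ipo_terms = expand_query("IPO")
unknown_terms = expand_query("press release")
```

A search for "IPO" would then also match documents that only mention a market floatation, while a term missing from the taxonomy is passed through unchanged.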
Finally, the results were consolidated into a custom portal, themed to fit with the client's existing intranet branding. Relevant commonly used information (such as the phone number for a contact) could be surfaced directly in the results, and more in-depth information (such as document previews) could be displayed inline on demand. This allowed users to find the data they needed quickly, and to make an informed decision before opening a document.
The solution was written as a C# web application and designed to run either inside Microsoft SharePoint or as a standalone application. Search was provided using Microsoft SharePoint, Microsoft SQL Server, and Lucene.