Automating big data insights to inform tourism marketing

sentiment-index.com
  • Technical Audit + Review
  • Feature Definition + Requirements Planning
  • Architectured Platform Infrastructure
  • Serverless Implementation
  • At-Scale Cloud Ingestion Pipeline + Interface APIs
  • UX/UI Design
  • Performance + Quality Assurance Tests
Working with Think! X, we empowered destination marketers to make informed, strategic decisions by automating their data analysis and providing an easily accessible dashboard of insights powered by machine learning.

Helping deliver results faster than ever

Think! X captures online sentiment from social media discussion for destination marketers with their unique Tourism Sentiment Index (TSI), enabling them to understand the online perception of their destination and tailor their messaging for maximum impact. Their process wasn’t simple, though. To produce these insights, Think! X had to manually perform imports, calculations, and reports at scale for each individual destination they served. This lengthy process inhibited real-time insights, restricted the number of clients Think! X could serve, and only allowed them to provide client-specific data.

This led Think! X to partner with Invoke to automate this process with TSI Live, enabling destination marketers to react immediately to ever-evolving public perceptions. We may have been creating the product from scratch, but the business already existed, and their clients had expectations about the data and insights they were paying for. To deliver value, our automated solution had to provide everything they already had and more.

Automating their Tourism Sentiment Index turns a benefit that arrived quarterly or even yearly into something that clients can access at any time to drive instant impact.

Building an accessible database

Data must be captured before it can be analyzed. Working with their data pipeline, we used a collection of serverless functions and technologies that import data, segment it, provide post analysis, and leverage the enterprise-grade natural language processing of Amazon Comprehend for text classification.
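As a rough illustration of the classification step, the sketch below shows how one serverless function might call Amazon Comprehend's `detect_sentiment` operation on a single post. The names `classify_post` and `truncate_utf8`, the shape of the returned record, and the 5,000-byte guard are assumptions made for this example, not the actual TSI Live code. The client is passed in, so a real `boto3.client("comprehend")` or a test stub can be used.

```python
from typing import Any, Dict

MAX_BYTES = 5000  # assumed guard against the service's per-document size limit


def truncate_utf8(text: str, limit: int = MAX_BYTES) -> str:
    """Cut text to at most `limit` UTF-8 bytes without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= limit:
        return text
    # errors="ignore" drops a trailing partial character, if any.
    return encoded[:limit].decode("utf-8", errors="ignore")


def classify_post(client: Any, text: str, language: str = "en") -> Dict[str, Any]:
    """Return a small sentiment record for one post.

    `client` is expected to behave like boto3.client("comprehend").
    """
    response = client.detect_sentiment(
        Text=truncate_utf8(text),
        LanguageCode=language,
    )
    return {
        "sentiment": response["Sentiment"],    # overall label, e.g. POSITIVE
        "scores": response["SentimentScore"],  # per-label confidence values
    }
```

Passing the client in keeps the function easy to exercise locally without AWS credentials.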

Then we store it, but not in a traditional database. It’s a database of indexed documents containing information like sentiment scores, text excerpts, emotions, locations, and more. From these documents, the product builds queries to aggregate and extract data and determine an analysis based on continual segmentations of the data. That could mean starting with a query like “Whistler”, then aggregating and analyzing that bucket of data to extract statistics and other analytics such as sentiment ranges and trending topics.
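To make the query-then-aggregate idea concrete, here is a minimal in-memory sketch. It assumes documents are plain dictionaries with hypothetical fields (`text`, `sentiment`, `emotion`, `topics`); a production document index would run the filter and aggregation inside the search engine rather than in application code.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List

# Hypothetical indexed documents; field names and values are illustrative only.
DOCUMENTS = [
    {"text": "Fresh powder in Whistler today", "sentiment": 0.8,
     "emotion": "joy", "topics": ["skiing"]},
    {"text": "Whistler lift lines were long", "sentiment": -0.4,
     "emotion": "annoyance", "topics": ["skiing", "crowds"]},
    {"text": "Great dinner in Whistler village", "sentiment": 0.6,
     "emotion": "joy", "topics": ["dining"]},
    {"text": "Rainy weekend in Tofino", "sentiment": -0.1,
     "emotion": "sadness", "topics": ["weather"]},
]


def summarize(query: str, documents: List[Dict]) -> Dict:
    """Filter documents by a text query, then aggregate the matching bucket."""
    bucket = [d for d in documents if query.lower() in d["text"].lower()]
    if not bucket:
        return {"count": 0}
    scores = [d["sentiment"] for d in bucket]
    topics = Counter(t for d in bucket for t in d["topics"])
    emotions = Counter(d["emotion"] for d in bucket)
    return {
        "count": len(bucket),
        "average_sentiment": round(mean(scores), 3),
        "sentiment_range": (min(scores), max(scores)),
        "top_topics": [t for t, _ in topics.most_common(2)],
        "top_emotion": emotions.most_common(1)[0][0],
    }
```

Calling `summarize("Whistler", DOCUMENTS)` selects the three Whistler documents and reports their count, average sentiment, sentiment range, most frequent topics, and dominant emotion.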

TSI Live is able to instantly provide all kinds of useful insights — average sentiments, changes over time, prevalent emotions, and more — that inform the decisions of destination marketers.

Making sense of a firehose of data

The biggest challenge for this product was scale: the platform processes a huge amount of data from sources like Twitter, Reddit, Instagram, news outlets, and more. All in, it processes between 100,000 and 170,000 posts per day. Over a 30-day month, that’s roughly three to five million posts.

Tackling this challenge required an approach that maintained live, accurate information for end users while keeping the cost of machine learning at that scale under control. Our multi-faceted approach relies on cutting-edge serverless technologies that allow us to ingest, queue, throttle, and serve data as soon as it is available.
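One common way to throttle a queue of incoming posts is a token bucket. The sketch below is an assumption about how such a throttle could look, not a description of the TSI Live implementation. The class `TokenBucket`, the helper `drain`, and their parameters are hypothetical names chosen for this example.

```python
import time
from collections import deque
from typing import Callable, Deque, List


class TokenBucket:
    """Allow roughly `rate` calls per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def drain(queue: Deque[str], bucket: TokenBucket) -> List[str]:
    """Take posts off the queue until it is empty or the bucket runs dry."""
    processed = []
    while queue and bucket.try_acquire():
        processed.append(queue.popleft())
    return processed
```

Posts that are not processed stay in the queue, so a later invocation can pick them up once the bucket has refilled. This keeps spending on downstream machine learning calls bounded without dropping data.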

An intelligent approach to data processing resulted in a platform that’s so efficient it runs for free for the first 10 days of the month.

The difference between London and London

To be a thorough representation of word of mouth, TSI Live pulls in data from a wide variety of social media sources. This leads to huge variations in the content itself; there are posts with text ranging from 100 to 5,000 characters, posts with images or videos, posts with hashtags and emojis, posts with colloquial grammar and nicknames, and even posts with raw HTML. Handling inconsistent data and extracting valuable entities is vital to the platform.
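A small normalization pass gives a sense of what handling this kind of inconsistent input can involve. This is a simplified sketch using only the Python standard library. The function name `normalize_post` and the output fields are assumptions for illustration, and regex-based tag stripping is a shortcut that a real pipeline would likely replace with a proper HTML parser.

```python
import html
import re
from typing import Dict, List, Union

TAG_RE = re.compile(r"<[^>]+>")     # crude match for HTML tags
HASHTAG_RE = re.compile(r"#(\w+)")  # captures the word after '#'
SPACE_RE = re.compile(r"\s+")


def normalize_post(raw: str) -> Dict[str, Union[str, List[str]]]:
    """Strip markup, decode entities, collect hashtags, collapse whitespace."""
    text = TAG_RE.sub(" ", raw)
    # Decode entities before hashtag matching so '&#39;' is not read as a tag.
    text = html.unescape(text)
    hashtags = [tag.lower() for tag in HASHTAG_RE.findall(text)]
    text = SPACE_RE.sub(" ", text).strip()
    return {"text": text, "hashtags": hashtags}
```

For example, `normalize_post("<p>Loving  <b>Whistler</b> &amp; the snow! #SkiTrip</p>")` yields the text `Loving Whistler & the snow! #SkiTrip` and the hashtag list `["skitrip"]`.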

However, it’s not just about extracting the subject from a sentence; it’s about telling similar subjects apart. How can the product consistently recognize the difference between London, Ontario and London, England? Or understand that Myrtle Beach is also a city, not just a beach? With a thorough process that applies broad heuristics and scrutinized targeting, TSI Live is now able to consistently classify items correctly.

How do you avoid classifying things incorrectly? Our algorithms handle the edge cases, helping TSI Live recognize when someone is discussing Hamilton the musical, Hamilton the city in Canada, or Hamilton County, Indiana, one of ten Hamilton Counties in the United States.
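One simple heuristic for this kind of disambiguation is to score each candidate meaning by how many of its context cues appear in the post. The sketch below is only an illustration of that idea. The `CANDIDATES` table, its cue words, and the `disambiguate` function are hypothetical and far smaller than anything a production system would need.

```python
import re
from typing import Dict, List, Optional

# Hypothetical gazetteer: each candidate lists context words that support it.
CANDIDATES: Dict[str, List[Dict]] = {
    "london": [
        {"id": "London, England", "cues": {"uk", "england", "thames", "tube"}},
        {"id": "London, Ontario", "cues": {"ontario", "canada"}},
    ],
    "hamilton": [
        {"id": "Hamilton (musical)", "cues": {"musical", "broadway", "tickets", "cast"}},
        {"id": "Hamilton, Ontario", "cues": {"ontario", "canada"}},
        {"id": "Hamilton County, Indiana", "cues": {"indiana", "county", "carmel", "noblesville"}},
    ],
}


def disambiguate(name: str, text: str) -> Optional[str]:
    """Pick the candidate whose cue words overlap most with the post text."""
    words = set(re.findall(r"[\w-]+", text.lower()))
    best_id, best_score = None, 0
    for candidate in CANDIDATES.get(name.lower(), []):
        score = len(candidate["cues"] & words)
        if score > best_score:
            best_id, best_score = candidate["id"], score
    # None means no cue matched, so the mention is left unresolved.
    return best_id
```

Returning `None` when nothing matches is a deliberate choice in this sketch: leaving an ambiguous mention unresolved is usually safer than guessing and attributing sentiment to the wrong place.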

Extending the system to tackle COVID sentiment

Part of our engagement with Think! X involved extending TSI Live to cover the specific use case of sentiment regarding how a location is addressing the COVID-19 pandemic.

The TSI is reborn as TSI Live, an interactive dashboard that covers both aggregated tourism and COVID-indexed data. By uncovering useful insights, destination marketers can understand changing trends, what is driving positive or negative sentiment, and the overall volume of conversation about both COVID-19 and general travel. As the travel industry evolves and changes going forward, this is the type of invaluable information that can help ensure thoughtful and successful messaging from destination marketers around the world.

100,000+ social media posts analyzed daily

We’re ready to make a plan that works for your organization.