Trak.in is a popular Indian Business, Technology, Mobile & Startup blog featuring trending News, views and analytical take on Technology, Business, Finance, Telecom, Mobile, startups & Social Media Space

Every Tweet Ever Posted Since 2006 Will be Indexed; Half Trillion Tweets Are Now Searchable


Twitter has decided to embrace Big Data in a major way: every Tweet ever posted on the website is now indexed in its database and searchable. For digital marketers, this is a huge announcement.

When Twitter filed for its IPO last year, it stated that 500 million Tweets are sent every day by 200 million active users. Its record for maximum Tweets per second (TPS) is 143,199, and on average, 360,000 Tweets are sent every minute all over the world.

Indexing this much data is a complex task, as Twitter needs to index each and every character ever posted on its platform. Using Big Data, it has now made this possible: almost half a trillion Tweets are searchable on its portal.


How Did They Do This?

In a blog post, Twitter has detailed each step involved in indexing and then retrieving this massive volume of Tweets. There are basically five main factors involved in this search:

Modularity: Earlier, Twitter had a real-time search on its website, which usually stored Tweets for up to one week. This mechanism has been split into several modules that use big data to search every Tweet ever posted.

Scalability: Several modules have been scaled up to index billions of Tweets every week, and Twitter has developed a new scalable system for this purpose.

Cost Effectiveness: Engineers at Twitter have developed a new approach to storing this information, which was earlier kept in RAM.

Simple Interface: The existing advanced search interface on Twitter now works with the new search mechanism.

Incremental Development: As shared by Twitter, this new search capability has been developed in phases since 2012, when the idea was first conceived.

Besides using Big Data, Twitter has used some advanced mechanisms such as batched data aggregation and preprocessing, inverted index building, Earlybird shards and Earlybird roots. You can learn more about the process on their blog here.
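To give a feel for what "inverted index building" means, here is a minimal sketch in Python. It is only an illustration of the general technique, not Twitter's actual implementation: the function names, the sample Tweets and the simple tokenizer are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import date

PUNCTUATION = ".,!?\"'"

def tokenize(text):
    """Lowercase the text and split it into words, dropping basic punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]

def build_inverted_index(tweets):
    """Map each term to the sorted list of Tweet IDs that contain it."""
    index = defaultdict(set)
    for tweet_id, (_, text) in tweets.items():
        for term in tokenize(text):
            index[term].add(tweet_id)
    return {term: sorted(ids) for term, ids in index.items()}

def search(index, tweets, term, start, end):
    """Return IDs of Tweets containing `term` posted between start and end (inclusive)."""
    candidate_ids = index.get(term.lower(), [])
    return [i for i in candidate_ids if start <= tweets[i][0] <= end]

# Hypothetical sample data: Tweet ID -> (date posted, text)
TWEETS = {
    1: (date(2010, 11, 2), "Happy birthday to me!"),
    2: (date(2010, 11, 5), "So happy today"),
    3: (date(2011, 1, 1), "Happy new year"),
    4: (date(2010, 11, 6), "Rainy day"),
}
INDEX = build_inverted_index(TWEETS)
```

With this toy index, `search(INDEX, TWEETS, "Happy", date(2010, 11, 1), date(2010, 11, 8))` looks up the word once and then filters by date, instead of scanning the text of every Tweet. A production system would split such an index across many machines, which is beyond this sketch.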

Live Example of Twitter Search

Visit the advanced search section of Twitter, enter “Happy” in the search box, set the dates as November 1, 2010 to November 8, 2010, and press search. You will get results showing every Tweet from that period that contains the word “Happy”.

Such a search across old Tweets was not possible earlier.
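The same search can also be written as a single query string. The sketch below assumes Twitter's `since:` and `until:` search operators and the `https://twitter.com/search` address; the helper function names are hypothetical, and the exact URL format may differ from what the advanced search form produces.

```python
from datetime import date
from urllib.parse import urlencode

def build_search_query(term, since, until):
    """Combine a keyword with since:/until: date operators (YYYY-MM-DD)."""
    return f"{term} since:{since.isoformat()} until:{until.isoformat()}"

def build_search_url(query, base="https://twitter.com/search"):
    """URL-encode the query and attach it as the q parameter (base URL is an assumption)."""
    return f"{base}?{urlencode({'q': query})}"

query = build_search_query("Happy", date(2010, 11, 1), date(2010, 11, 8))
url = build_search_url(query)
print(query)
print(url)
```

Typing the resulting query directly into the Twitter search box should have the same effect as filling in the word and date fields on the advanced search page.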

Possibilities Using This Advanced Search

This advanced search option can be immensely useful to every professional involved with digital media and its implications. For example, a showbiz blogger can dig deep into the Tweets posted by Kim Kardashian in 2010 or the romance that developed between Brad Pitt and Angelina Jolie in 2009. Digital entrepreneurs can check out which products were favorites in their niche during Christmas of 2011, and devise their strategy accordingly.

The happiest lot would be digital marketers, who now have a virtually endless supply of highly relevant, niche data that was out of reach until now.

Kudos to the Twitter engineering team for accomplishing this feat. Do share the possibilities and implications you see in this advanced search on Twitter by commenting right here.
