CSS Design Award nominee

Case study

Designing a Big Data platform

to rival the GAFA companies

Synthesio, Ipsos Group

E-reputation

Date

2015

Missions:

Back-end development
Technical Architecture

Location

Paris

FABERNOVEL was engaged by Synthesio, a specialist in e-reputation, to help the company completely redesign its digital platform so that it can manage an ever-increasing volume of data.

The existing platform was reaching its limits in terms of volume, performance and cost per transaction unit. The challenge was to completely redesign the underlying infrastructure so it could handle a growing number of processes and a growing volume of information.

Synthesio wanted to push back the obsolescence threshold and develop a solution that would be ready for the challenges to come.

The proposed solution was a massively decentralized, massively distributed architecture designed to support very large-scale Big Data processing.

4

months of work

400

terabytes of storage

160

physical servers

2 billion

indexed pages per month

Project rollout

FABERNOVEL defined the functional architecture of the platform and designed a reactive system. The FABERNOVEL team delivered the project using agile methods: a Proof of Concept (POC) to validate the technical choices and confirm the expected performance, a Minimum Viable Product (MVP) covering the technical and functional components, and a Proof of Delivery (POD) for the launch of the project.

A number of different activities were carried out to achieve the end result.

The team:

- designed a distributed database with decentralized processing;

- developed intelligent crawlers;

- gathered real-time data on customer search requests;

- implemented emotion-analysis algorithms (emotional computing);

- made sure the teams at Synthesio could work autonomously by involving them in the development of the solution.

Results

The final platform uses the same technology and architecture as the GAFA companies. It can grow, in terms of volume processed, at the same pace as the social media it observes, and the time it takes to generate customer relationships is now approaching real time.

Cédric Chantepie - Technical Lead

“The main aim was to successfully deal with any future problems regarding constraints of volume.”

What kind of impact did this project have on your job?

This project meant fully embracing the world of Big Data, its technologies and the technical challenges of processing data at scale. It was also an opportunity to discover and deepen my knowledge of Apache Spark, Apache Kafka, Elasticsearch and text analysis. We migrated the MySQL database to MySQL+ES (600 ES servers) by way of Spark, with Kafka as the distributed streaming platform. It’s an exciting project, firstly because it is more advanced than existing solutions, and also because breaking down processing that was originally performed in monolithic blocks has given the platform the properties it needs for scalability, partitioning and distribution. As a result, we can adapt individual platform components as future requirements change. Working with the client’s team meant I quickly understood the major challenges of the project and could come up with a clear solution to meet the client’s needs.
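To illustrate the partitioning and distribution properties mentioned in this answer, here is a minimal Python sketch. It is not the project's code and does not use Spark, Kafka or Elasticsearch: the function names (`partition_for`, `route`, `index_partition`), the record fields and the MD5-based hash are all assumptions chosen for illustration. It only shows the general idea of routing records to partitions by key so that each partition can be indexed independently.

```python
import hashlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition using a stable hash (illustrative choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def route(records, num_partitions):
    """Group records by partition, preserving arrival order within each partition."""
    partitions = defaultdict(list)
    for record in records:
        partitions[partition_for(record["source"], num_partitions)].append(record)
    return partitions


def index_partition(records):
    """Stand-in for an indexing worker: build a tiny inverted index (token -> ids)."""
    index = defaultdict(set)
    for record in records:
        for token in record["text"].lower().split():
            index[token].add(record["id"])
    return index


# Hypothetical sample data: records from the same source land in the same partition.
records = [
    {"id": 1, "source": "forum-a", "text": "Great product"},
    {"id": 2, "source": "blog-b", "text": "Slow delivery"},
    {"id": 3, "source": "forum-a", "text": "Great support"},
]
partitions = route(records, num_partitions=4)
# Each partition is indexed on its own, so the work could run on separate machines.
indexes = {p: index_partition(rs) for p, rs in partitions.items()}
```

Because every record with the same key goes to the same partition, adding partitions (and workers) is one way to spread a growing volume of data without the workers needing to coordinate on individual records.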

What is cutting edge about this project?

The main aim was to successfully deal with any future problems regarding constraints of volume. That meant developing a real-time processing chain that scales to cope with peak loads and that can also be upgraded. The platform now lets you quickly and flexibly retrieve and explore the results of data analysis, so our client can create detailed, personalized dashboards for their own customers. Opinion analysis is also part of the data processing. Real time is an important part of managing e-reputation, and we successfully created a platform that can manage relationships in pretty much real time.

How does this project bring you job satisfaction?

The first demo of the platform was available within a month, thanks to our partnership with DevOps developers. The collaboration meant we could validate the initial assumptions and the technology carried over onto this project while avoiding a tunnel effect (a long stretch of work with nothing visible to show). The client team also benefited from learning about tech they’d not had the chance to use before. Now they can develop the platform independently. The platform’s performance meets Synthesio’s requirements, and the platform can be quickly upgraded in the future.