Managing a 14TB Reporting Data Warehouse

Soumya Ranjan on April 22, 2015

It was a great pleasure to be part of the first-ever full-day PostgreSQL event in Bangalore, PGDay India 2015, which more than 70 people attended.

For this event, I was given the chance to present how we "Manage a 14TB Reporting Data Warehouse at Inmobi" using PostgreSQL.

To shed some light on the topic:

When dealing with a large volume of data (14 terabytes) on a single database server, it quickly becomes clear that the database has to be built and managed so that querying and data loading stay fast and easy. In PostgreSQL, the key to fast data fetches and data loads is the right partitioning and indexing. In this talk, I delved into how tables and indexes use up space on the server and elaborated on the techniques we use to analyse and fix that. I also focused on automating various data-loading and archival jobs and highlighted the prerequisites for managing huge databases. The talk ended with a few quick tips on database fitness and maintenance.
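
To make the partitioning and automation idea a little more concrete, here is a minimal sketch of how a scheduled job might generate the DDL for one monthly partition. It is not the setup from the talk: the table name `report_events`, the column `event_date`, the child naming scheme, and the helper functions are all hypothetical, and it assumes the inheritance-based partitioning (a child table with a CHECK constraint) that PostgreSQL offered at the time. The script only builds the SQL text; running it against a database is left out.

```python
from datetime import date


def next_month(d: date) -> date:
    """Return the first day of the month following d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def monthly_partition_ddl(parent: str, column: str, year: int, month: int) -> str:
    """Build DDL for one monthly child table of `parent`.

    Assumption: inheritance-based partitioning, where the CHECK constraint
    lets the planner skip children that cannot match a date filter.
    All names passed in are illustrative, not taken from the talk.
    """
    start = date(year, month, 1)
    end = next_month(start)
    child = f"{parent}_y{start.year}m{start.month:02d}"
    return (
        f"CREATE TABLE {child} (\n"
        f"    CHECK ({column} >= DATE '{start}' AND {column} < DATE '{end}')\n"
        f") INHERITS ({parent});\n"
        f"CREATE INDEX {child}_{column}_idx ON {child} ({column});"
    )


# Hypothetical usage: emit the DDL for the April 2015 partition.
print(monthly_partition_ddl("report_events", "event_date", 2015, 4))
```

A job like this could run ahead of each month so the partition and its index exist before data loading starts, and a matching job could drop or archive the oldest child table.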

I hope you find the content as useful as everyone else did.