Intellum, the pioneer in online learning, has revolutionized virtual education with its easy-to-use platform for on-demand learning. With its learning management system, companies can present, manage, track, and improve learning events to engage and educate audiences.
Intellum has revolutionized how educational content is consumed, helping some of the world's largest and fastest-growing companies educate their customers, partners, and employees. During the past two decades, their platform grew exponentially, presenting several business challenges.
Scaling to meet high customer demands
Intellum created a new market space for online learning platforms, which has attracted the interest of many competitors. With high-profile clients like Google, Facebook, and Amazon, Intellum needed a way to process data quickly and efficiently while scaling their platform to meet increasing customer demand.
Bringing online content into one platform
In recent years, the shift to distance learning led to an increase in global content creators, and more and more users started taking courses online. As a result, Intellum had to re-platform their solution, consolidating content generated by Intellum's many partners and customers into a single learning platform called "Evolve". With Evolve, users could access multiple e-learning courses simultaneously without opening separate tabs, and enjoy a seamless learning experience on their mobile devices. Users got a much richer, highly interactive learning experience, and Intellum could easily track e-learning metrics in a single platform without any headaches.
Learners completed Evolve content a staggering 90.3% of the time, indicating its effectiveness in encouraging students to participate and complete courses.
Meeting local compliance regulations
For Intellum, the customer data stored in their service was highly confidential. To meet local compliance requirements, it was therefore vital to ensure that data stayed strictly within the geographical service boundary. Business risk rose if Intellum leveraged any third-party services that were proprietary, uncertified, or running outside the allowed geographic boundaries.
For faster data analysis, Intellum needed a tool that could break the ETL process down into smaller, incremental steps. Their old data architecture failed to meet these requirements and had several problems.
Unreliable under heavy data volumes
To ingest data into BigQuery, the legacy data pipeline architecture relied heavily on home-grown custom scripts built by Intellum's DevOps teams, with Singer used for external data sources like Salesforce and Zendesk. A significant amount of data had to be lifted and shifted from multiple databases, each holding over 1 TB of data, making the process inefficient and hard to maintain. The home-grown scripts were also resource-intensive, sometimes overloading machines until they crashed. Intellum wanted more predictability in their business, and the lack of reliability was not helping.
We have large databases with 20 years' worth of data, and this is a lot to process. The manual system we were using was not effective, so we were looking for a reliable system that splits up the ETL process easily and handles things more incrementally.
Performance pitfalls in the script
Apart from reliability issues, the home-grown script also had several performance pitfalls. For example, it could only process one table at a time, which took a long time to complete. Table data also needed to be fully synced, and storing incremental state was complex. Configuring and tuning the script for optimal performance was also difficult because there were so many different data sources to export from.
Intellum handles a lot of data daily: they move 30 TB to BigQuery from more than 17 databases, with around 15 tables above 500 million rows each, and one schema holding up to 300 tables. To manage these large data pipelines and efficiently break down the process, Intellum bet their design on Airbyte's APIs, baking them into code running across different Docker instances. This gave them more control over each stage of the ETL process: they could freely pick an available connector, configure the needed dbt transformations on the data, and seamlessly load it into BigQuery. The container-based deployment approach also allowed Intellum to restart jobs easily and refresh or upgrade connectors on demand.
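As a rough illustration of what driving syncs through Airbyte's HTTP API from orchestration code can look like, here is a minimal sketch. The endpoint paths follow Airbyte's documented Configuration API, but the host, connection ID, and helper names are hypothetical placeholders, not Intellum's actual code:

```python
# Hedged sketch: building the requests an orchestration wrapper would send
# to an Airbyte server to trigger and monitor syncs. The host and IDs are
# illustrative; only the endpoint paths reflect Airbyte's documented API.

AIRBYTE_HOST = "http://airbyte-server:8000"  # assumed internal Airbyte server

def build_sync_request(connection_id: str):
    """URL and JSON body that trigger a manual sync for one connection
    (e.g. a source database replicating to BigQuery)."""
    return f"{AIRBYTE_HOST}/api/v1/connections/sync", {"connectionId": connection_id}

def build_job_status_request(job_id: int):
    """URL and JSON body that poll a sync job, so a wrapper can detect
    failures and restart the job from its own Docker instance."""
    return f"{AIRBYTE_HOST}/api/v1/jobs/get", {"id": job_id}

# Each container would POST these payloads with an HTTP client
# (e.g. requests.post(url, json=body)) for the connections it owns.
sync_url, sync_body = build_sync_request("example-connection-id")
```

Keeping the trigger/poll logic in a thin wrapper like this is what makes the container-based approach flexible: restarting a job is just re-sending a request from a fresh instance.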
We were looking for a reliable system to break down the ETL process into smaller steps, which Airbyte does effectively, and the process is made even easier with the Docker instances, connectors, and networking.
Fast incremental data syncing and increased scalability
Airbyte identifies and tracks changes to data and replicates them to the BigQuery data warehouse. Using change data capture (CDC) in Airbyte, incremental changes made across the Intellum platform could be easily captured and pushed to the warehouse in near real time. Since Airbyte is open source, Intellum could inspect the code under the hood and modify it as needed. Additionally, with Airbyte's support for Kubernetes, Intellum's new architecture could scale horizontally to sync large amounts of data while keeping server resources in check.
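The core idea behind incremental syncing can be shown with a toy example. This is not Airbyte's implementation, just a sketch of cursor-based replication: only records changed since the saved cursor are emitted, and the cursor (the sync "state") advances afterward. The field names are illustrative:

```python
# Toy sketch of cursor-based incremental syncing, the idea behind
# incremental replication: emit only records newer than the saved cursor,
# then advance the cursor so the next sync picks up where this one stopped.

def incremental_sync(records, state):
    """Return the records updated after the saved cursor, plus the new state."""
    cursor = state.get("updated_at", 0)
    new_records = [r for r in records if r["updated_at"] > cursor]
    if new_records:
        state = {"updated_at": max(r["updated_at"] for r in new_records)}
    return new_records, state

rows = [
    {"id": 1, "updated_at": 100},  # already synced in an earlier run
    {"id": 2, "updated_at": 200},  # changed since the last sync
]
batch, new_state = incremental_sync(rows, {"updated_at": 100})
# Only the row changed after cursor 100 is replicated; the cursor moves to 200.
```

Compared with the full table syncs of the legacy scripts, this is why each run stays small and cheap even on tables with hundreds of millions of rows.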
With Airbyte's quick and easy setup process, and a few answers from the Airbyte community on Slack, Intellum was able to launch the new platform within two weeks. Airbyte worked out of the box, with none of the lengthy setup that other tools required.
Looking into the future
Today, Airbyte is an essential component of Intellum's data processing systems. As Intellum grows and scales its use cases, the team is confident in Airbyte's capabilities and in the support available from experts in the community.
We are extremely impressed by the engagement in the Slack community, and the team makes it a point to respond to every question and issue on the Slack channel, for customers big and small.
Start breaking down your data silos with Airbyte.