A national standardized testing company needed to consolidate its data sources in order to perform advanced analytics and provide the organization with actionable insights. The company’s data came from multiple third parties, which made it difficult to combine the sources and analyze the data holistically.
As it planned a transition from paper tests to a digital exam, the company wanted the ability to perform real-time analytics as students took the test online; this would allow it to study student behavior and detect cheating as it happens. The company also offers preparatory courses but had no way to prove their effectiveness, since students’ prep data was isolated from exam results.
The testing company also had a significant opportunity to increase revenue by expanding into international markets, but it was legally constrained to certain data storage locations. Its existing architecture was rigid, expensive to maintain, and did not provide the flexibility needed to maximize revenue and get the most from its data.
The testing company partnered with Zirous to identify current limitations and determine how to quickly deliver a custom solution that would provide new analytical possibilities and operational insights.
The teams worked together to:
- Define and understand business goals
- Identify all data sources and formats
- Empower the company’s in-house resources
- Implement a sustainable data analytics solution
By consolidating data from multiple third-party systems, this company was able to connect data for an individual student across multiple interactions (pre-tests, preparatory classes, multiple exams, etc.).
This holistic view of students allows results to be analyzed over time and allows the company to demonstrate the positive correlation between preparatory participation and improved exam scores.
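As a rough illustration of the idea above (not the company’s actual implementation), the following minimal Python sketch merges records from two separate sources into one view per student and then computes a Pearson correlation between prep participation and exam score. The field names (`student_id`, `prep_hours`, `score`), the helper functions, and all of the sample values are hypothetical and invented purely for demonstration; they are not real results.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical extracts from two third-party systems (made-up values),
# assumed to share a common student identifier.
prep_records = [
    {"student_id": "s1", "prep_hours": 0},
    {"student_id": "s2", "prep_hours": 5},
    {"student_id": "s3", "prep_hours": 10},
    {"student_id": "s4", "prep_hours": 15},
]
exam_records = [
    {"student_id": "s1", "score": 62},
    {"student_id": "s2", "score": 68},
    {"student_id": "s3", "score": 81},
    {"student_id": "s4", "score": 88},
    {"student_id": "s5", "score": 75},  # no prep data for this student
]


def build_student_view(prep, exams):
    """Merge per-source rows into a single dict per student ID."""
    view = defaultdict(dict)
    for row in prep:
        view[row["student_id"]]["prep_hours"] = row["prep_hours"]
    for row in exams:
        view[row["student_id"]]["score"] = row["score"]
    return dict(view)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


student_view = build_student_view(prep_records, exam_records)

# Only students present in both sources can contribute to the correlation.
complete = [v for v in student_view.values() if "prep_hours" in v and "score" in v]
r = pearson([v["prep_hours"] for v in complete], [v["score"] for v in complete])
print(f"students in view: {len(student_view)}, with both sources: {len(complete)}")
print(f"correlation between prep hours and score: {r:.3f}")
```

A coefficient near 1 on real data would indicate a strong positive association, although correlation alone does not establish that prep participation caused the improvement.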
The scalability of the Big Data platform allows the company to spin up secure environments in international markets where an exam is administered so that the data is stored in accordance with all laws and regulations.
Ultimately, this data analytics solution can replace the organization’s traditional architecture, which will save millions of dollars in licensing fees.
– Kubernetes infrastructure orchestrated the use of NiFi and Kafka to ingest and stream internal and external data sources into a governed Hortonworks Data Platform, utilizing Apache Hadoop Distributed File System (HDFS) and Amazon Simple Storage Service (S3) as centralized data stores
– Hive was utilized to develop tables and data model layers to allow for querying and access to raw data
– Zeppelin notebooks allowed for in-depth analytical research using Spark, R, and Python scripts