Open edX Platform Atlas v1.0
Instructor Experiences

Analytics & Insights

Scaffold

Overview

Analytics & Insights covers the data and visualization tools that instructors use to understand learner behavior — engagement rates, video watch times, problem attempts, completion patterns, and course-level aggregations.

Open edX analytics has changed substantially: from an in-house analytics stack (edX Insights, backed by a Hadoop/Hive data pipeline) to the community-developed Aspects platform (ClickHouse + Apache Superset). Most active deployments are migrating to Aspects.

Current State (2026)

• Aspects: The current-generation analytics platform. Event data flows from the LMS to ClickHouse via `event-routing-backends`, and Apache Superset dashboards give instructors rich, near-real-time views

• Legacy Insights: The old edX Insights product (Python + Hadoop + Hive + Django) is effectively deprecated for the community; it may still run in some older deployments

• LMS instructor tab: The legacy instructor analytics tab in the LMS provides basic aggregate stats (enrollment count, grade distribution) — still available but not being enhanced

• Event tracking: `event-tracking` captures browser and server events; `event-routing-backends` routes them to ClickHouse and other backends
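The fan-out described in the last bullet can be illustrated with a small in-memory model. This is only a sketch: `ToyTracker`, `ToyBackend`, the event filter, and the `hypothetical.debug.event` name are invented for illustration and are not the actual `event-tracking` or `event-routing-backends` classes. It shows one emitted event being offered to every configured backend, with each backend deciding whether to keep it.

```python
class ToyBackend:
    """Hypothetical stand-in for a tracking backend; stores events in memory."""

    def __init__(self, name, event_filter=None):
        self.name = name
        # By default a backend accepts every event it is offered.
        self.event_filter = event_filter or (lambda event: True)
        self.received = []

    def send(self, event):
        if self.event_filter(event):
            self.received.append(event)


class ToyTracker:
    """Hypothetical tracker that offers each emitted event to all backends."""

    def __init__(self, backends):
        self.backends = list(backends)

    def emit(self, name, data):
        event = {"name": name, "data": data}
        for backend in self.backends:
            backend.send(event)
        return event


# One backend keeps everything; the other only keeps analytics-relevant events.
log_backend = ToyBackend("log")
analytics_backend = ToyBackend(
    "analytics",
    event_filter=lambda event: event["name"]
    in {"play_video", "edx.course.enrollment.activated"},
)

tracker = ToyTracker([log_backend, analytics_backend])
tracker.emit("play_video", {"course_id": "course-v1:Demo+X+2026", "current_time": 12.5})
tracker.emit("hypothetical.debug.event", {})
```

After these two calls, the log backend holds both events while the analytics backend holds only the `play_video` event. Keeping the filter on the backend rather than on the tracker is a design choice of this sketch, not a claim about how the real libraries are configured.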

Architecture

• Event pipeline: Browser/server → `event-tracking` → `openedx-platform` Celery → `event-routing-backends` → ClickHouse (Aspects)

• Aspects stack: Tutor plugin (`tutor-contrib-aspects`) installs ClickHouse + Superset; `aspects-dbt` transforms raw events into analytics models

• Superset dashboards: Pre-built dashboards for enrollment, engagement, completion, assessment analytics; instructors access via embedded Superset

• Real-time vs. batch: ClickHouse provides near-real-time analytics (seconds to minutes), whereas the old Hadoop pipeline took hours to days
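The two transformation stages in this architecture (raw event to a standardized statement, then statements to an analytics model) can be sketched in plain Python. Everything here is an assumption made for illustration: the flat statement shape is a simplification of xAPI, the verb mapping is not copied from `event-routing-backends`, and the `raw` field names (`user`, `course_id`, `time`) are hypothetical. In Aspects the second stage would be a dbt model running in ClickHouse, not Python.

```python
from collections import Counter

# Illustrative mapping from tracking event names to xAPI-style verb IRIs.
# The real mappings are maintained in event-routing-backends.
VERBS = {
    "edx.course.enrollment.activated": "http://adlnet.gov/expapi/verbs/registered",
    "play_video": "https://w3id.org/xapi/video/verbs/played",
}


def to_statement(raw):
    """Stage 1: turn a raw event into a simplified, flat statement (or None)."""
    verb = VERBS.get(raw["name"])
    if verb is None:
        return None  # Unmapped events are dropped in this sketch.
    return {
        "actor": raw["user"],
        "verb": verb,
        "object": raw["course_id"],
        "timestamp": raw["time"],
    }


def enrollments_per_course(statements):
    """Stage 2: aggregate statements, roughly what an analytics model would do."""
    registered = VERBS["edx.course.enrollment.activated"]
    return Counter(s["object"] for s in statements if s["verb"] == registered)


raw_events = [
    {"name": "edx.course.enrollment.activated", "user": "u1",
     "course_id": "course-v1:Demo+X+2026", "time": "2026-01-05T10:00:00Z"},
    {"name": "edx.course.enrollment.activated", "user": "u2",
     "course_id": "course-v1:Demo+X+2026", "time": "2026-01-05T10:05:00Z"},
    {"name": "play_video", "user": "u1",
     "course_id": "course-v1:Demo+X+2026", "time": "2026-01-05T10:07:00Z"},
    {"name": "hypothetical.unmapped.event", "user": "u1",
     "course_id": "course-v1:Demo+X+2026", "time": "2026-01-05T10:08:00Z"},
]

statements = [s for s in map(to_statement, raw_events) if s is not None]
summary = enrollments_per_course(statements)
```

Here three of the four raw events map to statements, and the summary counts two enrollments for the demo course. Separating the per-event transform from the aggregation mirrors the split between the routing layer and the dbt layer described in the bullets above.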

History

Origin

• Year introduced: ~2013 (basic analytics from early edX)

• Initial implementation: Django-rendered analytics pages in the LMS instructor tab; basic enrollment and grade stats

• Context: Instructors needed visibility into how learners were engaging with their courses; data was also used by edX researchers for learning science

Key Milestones

• ~2013: Basic instructor analytics tab in LMS

• ~2014–2015: edX Insights launched (Hadoop/Hive pipeline)

• ~2021: edX Insights begins to stagnate after the 2U acquisition

• ~2022–2023: Aspects project initiated by community

• ~2024: Aspects becomes the recommended analytics approach

Open Questions

  • Who built edX Insights and what was the original data pipeline architecture?
  • What drove the decision to build a Hadoop/Hive pipeline rather than using simpler approaches?
  • Who initiated the Aspects project and what was the community process?
  • How does the event schema in `event-tracking` compare to industry standards (xAPI, Caliper)?
  • What analytics questions do instructors most commonly ask that the platform struggles to answer?
  • How was learning analytics research (Learning Sciences) connected to the platform's analytics infrastructure?