
Data-Science Launchpad

Python
FastAPI
Docker
Ubuntu
Nginx

A self-hosted multi-user platform for managing individual RStudio and JupyterLab instances with email-based OTP authentication, enabling secure and isolated data science environments.


The GeDaC Data Science Launchpad is a private, multi-user platform for managing individual RStudio and JupyterLab (Data Science Notebook) instances on a Linux server. It gives researchers and data scientists on-demand, isolated computational environments through a simple web interface.

Overview

This platform addresses the need for secure, user-specific data science environments in an institutional setting. Deployed on a high-performance Ubuntu server with 256GB RAM and 64 CPU cores, it uses Docker containerization to provide each user with an isolated workspace featuring persistent storage, configurable resource limits, and automatic session management. The system implements passwordless authentication via email-based OTP, streamlining access while maintaining security.

Core Features

The platform provides:

  • Multi-User Isolation: Each user receives their own containerized environment with dedicated resources
  • Email-Based OTP Authentication: Passwordless login using 6-digit codes with automatic user registration
  • On-Demand Environments: Users can launch personal RStudio or JupyterLab instances from their dashboard
  • Persistent Storage: User data preserved in /home/username directories across sessions (1.5TB capacity)
  • Dynamic Port Management: Automatic port allocation for each container instance
  • Auto-Cleanup: Expired containers automatically removed via systemd timer
  • Admin Controls: Comprehensive management of users, instances, and resource quotas (memory, CPU)
  • Security Features: Rate limiting, attempt limiting, email validation, and secure session management
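
The OTP flow listed above (6-digit codes, attempt limiting) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `issue_otp`, `verify_otp`, `PendingOtp`, the in-memory `store` dictionary, the 5-minute validity window, and the limit of 5 attempts are all assumptions made for the example.

```python
import hashlib
import hmac
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional

OTP_TTL_SECONDS = 300  # assumed validity window (not stated in the source)
MAX_ATTEMPTS = 5       # assumed attempt limit (not stated in the source)


@dataclass
class PendingOtp:
    code_hash: str      # SHA-256 of the code, so the plain code is never stored
    expires_at: float   # UNIX timestamp after which the code is rejected
    attempts: int = 0   # number of verification attempts made so far


def _hash_code(code: str) -> str:
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


def issue_otp(store: Dict[str, PendingOtp], email: str,
              now: Optional[float] = None) -> str:
    """Create a 6-digit code for `email`; the caller would send it by email."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"  # cryptographically random, zero-padded
    store[email] = PendingOtp(_hash_code(code), now + OTP_TTL_SECONDS)
    return code


def verify_otp(store: Dict[str, PendingOtp], email: str, code: str,
               now: Optional[float] = None) -> bool:
    """Return True once for a correct, unexpired code within the attempt limit."""
    now = time.time() if now is None else now
    pending = store.get(email)
    if pending is None:
        return False
    if now > pending.expires_at or pending.attempts >= MAX_ATTEMPTS:
        del store[email]  # expired or locked out: a new code must be requested
        return False
    pending.attempts += 1
    if hmac.compare_digest(pending.code_hash, _hash_code(code)):
        del store[email]  # codes are single-use
        return True
    return False
```

Storing only a hash and comparing with `hmac.compare_digest` keeps the plain code out of storage and avoids timing differences during comparison. A real deployment would keep pending codes in the database rather than in a dictionary and would also rate-limit how often `issue_otp` can be called per address.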

Technology Stack

Built with:

  • Backend: Python with FastAPI framework
  • Frontend: Jinja2 Templates with HTML & CSS
  • Authentication: Email-based OTP delivered over SMTP, plus secure session-based authentication
  • Database: SQLite for lightweight, file-based data persistence
  • Email Service: SMTP integration (AWS SES support)
  • Containerization: Docker (using the rocker/rstudio and jupyter/datascience-notebook base images)
  • Reverse Proxy: Nginx for SSL termination and traffic routing
  • Deployment: Systemd service management on Ubuntu Linux

Architecture

The platform runs on the same high-performance Ubuntu server (64 CPU cores) and uses a multi-tiered architecture:

  • Nginx Reverse Proxy: Routes HTTPS traffic to the FastAPI backend and individual container instances
  • FastAPI Backend: Listens on port 8000 and orchestrates container lifecycle, user management, and authentication
  • Docker Daemon: Spawns and manages isolated RStudio and JupyterLab containers with dynamic port allocation
  • SQLite Database: Stores user information, container metadata, and session data
  • Persistent Storage: User workspaces mapped to /home/username (1.5TB), with an additional 16.4TB of data storage available
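
To make the role of the SQLite database concrete, here is a small sketch of how container metadata might be stored and how a periodic cleanup job could find expired instances. The table layout, column names, and the helper functions `open_db`, `expired_containers`, and `delete_container_rows` are hypothetical and chosen only for illustration; the project's real schema is not described in this document.

```python
import sqlite3
from typing import Iterable, List, Tuple

# Hypothetical schema: one row per user and one row per running container.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS containers (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    kind       TEXT NOT NULL CHECK (kind IN ('rstudio', 'jupyterlab')),
    host_port  INTEGER NOT NULL UNIQUE,
    expires_at REAL NOT NULL          -- UNIX timestamp
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database file and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def expired_containers(conn: sqlite3.Connection, now: float) -> List[Tuple[int, int]]:
    """Return (id, host_port) for every container whose expiry time has passed."""
    cur = conn.execute(
        "SELECT id, host_port FROM containers WHERE expires_at <= ? ORDER BY id",
        (now,),
    )
    return cur.fetchall()


def delete_container_rows(conn: sqlite3.Connection, ids: Iterable[int]) -> None:
    """Remove metadata rows once the corresponding containers have been stopped."""
    conn.executemany("DELETE FROM containers WHERE id = ?", [(i,) for i in ids])
    conn.commit()
```

A cleanup script triggered by the systemd timer could call `expired_containers`, stop and remove each matching Docker container, and then call `delete_container_rows` so the freed ports can be reused.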

Each user’s environment runs in an isolated Docker container with dedicated resources, and all data persists across sessions through volume mounts to the host filesystem.
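
As a rough sketch of how one such container could be launched, the snippet below picks a free host port and builds a `docker run` command with memory and CPU limits and a volume mount from the user's home directory. The helper names `find_free_port` and `docker_run_args`, the default limits (`8g`, `4` CPUs), the container naming scheme, and the mount targets inside the images are assumptions for illustration; the internal ports 8787 and 8888 are the usual defaults of the rocker/rstudio and jupyter/datascience-notebook images.

```python
import socket
from typing import List

# image name, port the service listens on inside the container,
# and an assumed mount target for the user's workspace
IMAGES = {
    "rstudio": ("rocker/rstudio", 8787, "/home/rstudio"),
    "jupyterlab": ("jupyter/datascience-notebook", 8888, "/home/jovyan/work"),
}


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def docker_run_args(username: str, kind: str, host_port: int,
                    memory: str = "8g", cpus: str = "4") -> List[str]:
    """Build the argument list for launching one user's container."""
    image, container_port, mount_target = IMAGES[kind]
    return [
        "docker", "run", "--detach",
        "--name", f"{kind}-{username}",          # hypothetical naming scheme
        "--memory", memory,                      # per-container RAM limit
        "--cpus", cpus,                          # per-container CPU limit
        "--publish", f"127.0.0.1:{host_port}:{container_port}",
        "--volume", f"/home/{username}:{mount_target}",
        image,
    ]
```

The resulting list can be passed to `subprocess.run(args, check=True)`. Publishing on 127.0.0.1 keeps the container reachable only through the Nginx reverse proxy. Note that the port returned by `find_free_port` could be taken by another process before Docker binds it, so a production version would retry on failure or reserve ports in the database.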

Figure: Architecture diagram

Links: GitHub Repository

Impact

The Data Science Launchpad gives researchers access to computational resources without requiring them to set up environments or manage infrastructure. By providing instant access to preconfigured RStudio and JupyterLab environments, it speeds up research workflows and supports reproducible computational analyses across the institution.