- Python 3.10+ installed
- Virtual environment created
- Dependencies installed from requirements.txt
- Project structure verified
Generate realistic synthetic data simulating 50,000 Notion users over 2 years.
- Configure parameters in `src/config.py`:
  - Set user count: 50,000
  - Date range: Jan 2023 - Dec 2025
  - Business metrics (MAU, revenue, etc.)
- Run data generator: `python src/data_generator.py`
- Verify outputs:
  - Check `data/synthetic/user_profiles.csv` (50,000 rows)
  - Check `data/synthetic/user_events.csv` (~500,000-1M events)
- Review data quality:
  - User segments distributed correctly
  - Event types realistic
  - Timestamps sequential
- 50,000 user profiles
- ~750,000 events
- 4 user segments (individual, small_team, enterprise, education)
- 6 acquisition channels
- ~13% paid conversion rate
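
To make the generation and verification steps concrete, here is a minimal sketch of producing seeded synthetic profiles and summarising them. It is not the project's `src/data_generator.py`: the function names, the segment weights, and the field names are assumptions chosen for illustration; only the four segment names and the ~13% paid rate come from the list above.

```python
import random
from collections import Counter
from datetime import date, timedelta

# Segment names come from the list above; the weights are illustrative assumptions.
SEGMENT_WEIGHTS = {"individual": 0.60, "small_team": 0.25, "enterprise": 0.10, "education": 0.05}
PAID_RATE = 0.13  # approximate paid conversion rate quoted above


def generate_profiles(n_users, start, end, seed=42):
    """Return a list of synthetic user-profile dicts with signup dates in [start, end]."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    span_days = (end - start).days
    segments = list(SEGMENT_WEIGHTS)
    weights = list(SEGMENT_WEIGHTS.values())
    profiles = []
    for user_id in range(1, n_users + 1):
        profiles.append({
            "user_id": user_id,
            "segment": rng.choices(segments, weights=weights, k=1)[0],
            "signup_date": start + timedelta(days=rng.randint(0, span_days)),
            "is_paid": rng.random() < PAID_RATE,
        })
    return profiles


def quality_report(profiles):
    """Summarise row count, ID uniqueness, segment mix, and paid share."""
    ids = [p["user_id"] for p in profiles]
    return {
        "rows": len(profiles),
        "unique_ids": len(set(ids)) == len(ids),
        "segment_counts": Counter(p["segment"] for p in profiles),
        "paid_share": sum(p["is_paid"] for p in profiles) / len(profiles),
    }


# A smaller user count keeps the example fast; the real run uses 50,000.
profiles = generate_profiles(5_000, date(2023, 1, 1), date(2025, 12, 31))
report = quality_report(profiles)
print(report["rows"], report["unique_ids"], round(report["paid_share"], 3))
```

Seeding the generator is a deliberate choice here: it lets the quality checks be rerun against identical data when a downstream metric looks off.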
Calculate North Star metric and all supporting KPIs.
- Define North Star Metric:
  - Metric: Weekly Active Collaborative Workspaces
  - Rationale: combines engagement (weekly active) with network effects (collaboration)
- Calculate engagement metrics: `python src/metrics_framework.py`
- Review outputs:
  - North Star: ~2.1M collaborative workspaces
  - WAU: ~7M users
  - MAU: ~10M users
  - DAU/MAU ratio: ~35%
  - Activation rate: ~60%
- Document insights:
  - Stickiness benchmark (a DAU/MAU ratio above 30% is good for productivity tools)
  - Activation rate comparison (60% is above industry average)
  - Feature adoption rates
- Daily Active Users (DAU)
- Weekly Active Users (WAU)
- Monthly Active Users (MAU)
- DAU/MAU Ratio (Stickiness)
- Activation Rate
- Feature Adoption Rates
- Power User Segmentation
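
The stickiness ratio in the list above can be sketched as follows. This is one common definition (average DAU over a period divided by MAU for the same period), and it may differ from what `src/metrics_framework.py` actually computes; the function names and the `(user_id, event_date)` event shape are assumptions.

```python
from datetime import date, timedelta


def active_users(events, start, end):
    """Distinct user IDs with at least one event on a day in [start, end]."""
    return {user_id for user_id, day in events if start <= day <= end}


def stickiness(events, period_start, period_end):
    """Average daily active users over the period divided by users active in the period."""
    n_days = (period_end - period_start).days + 1
    daily_counts = [
        len(active_users(events, d, d))
        for d in (period_start + timedelta(days=i) for i in range(n_days))
    ]
    avg_dau = sum(daily_counts) / n_days
    mau = len(active_users(events, period_start, period_end))
    return avg_dau / mau if mau else 0.0


# Toy events: user 1 is active every day of a 30-day window, user 2 only once.
events = [(1, date(2024, 1, d)) for d in range(1, 31)] + [(2, date(2024, 1, 15))]
ratio = stickiness(events, date(2024, 1, 1), date(2024, 1, 30))
print(f"DAU/MAU = {ratio:.1%}")
```

The per-day scan is quadratic in the worst case and is meant for readability; on the full event file a grouped aggregation would be the more practical route.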
Build and analyze 7-stage user acquisition funnel.
- Define funnel stages:
  - Awareness → Signup → Activation → Engagement → Habit → Collaboration → Monetization
- Run funnel analysis: `python src/funnel_analysis.py`
- Analyze conversions:
  - Stage-by-stage conversion rates
  - Identify biggest drop-offs
  - Segment-wise performance
- Compare segments:
  - Enterprise vs Individual users
  - Channel performance (referral vs paid ads)
  - Activation speed impact on retention
- Overall funnel conversion: ~1.2% (signup to paid)
- Biggest drop-off: Activation → Engagement (55% drop)
- Enterprise users convert 6x better than individuals
- Fast activators (< 24 hours) have 2x better retention
- `funnel_metrics.csv`
- `segment_funnel.csv`
- Drop-off analysis report
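
As a rough illustration of the stage-by-stage conversion and drop-off analysis, the sketch below computes adjacent-stage conversion rates and picks the weakest step. The stage names come from the funnel definition above; the user counts are invented for the example and are not the project's results, and the helper names are hypothetical.

```python
STAGES = [
    "Awareness", "Signup", "Activation", "Engagement",
    "Habit", "Collaboration", "Monetization",
]


def funnel_conversions(counts):
    """Return (from_stage, to_stage, conversion_rate) for each adjacent pair of stages."""
    steps = []
    for (stage_a, n_a), (stage_b, n_b) in zip(counts, counts[1:]):
        rate = n_b / n_a if n_a else 0.0
        steps.append((stage_a, stage_b, rate))
    return steps


def biggest_drop_off(steps):
    """The step with the lowest stage-to-stage conversion rate."""
    return min(steps, key=lambda step: step[2])


# Hypothetical user counts per stage, for illustration only.
counts = list(zip(STAGES, [100_000, 50_000, 30_000, 13_500, 9_000, 5_400, 2_700]))
steps = funnel_conversions(counts)
worst = biggest_drop_off(steps)
overall = counts[-1][1] / counts[1][1]  # signup-to-paid conversion

for stage_a, stage_b, rate in steps:
    print(f"{stage_a} -> {stage_b}: {rate:.1%}")
print(f"Biggest drop-off: {worst[0]} -> {worst[1]} ({1 - worst[2]:.0%} drop)")
print(f"Signup to paid: {overall:.1%}")
```

Running the same functions on counts filtered by segment or acquisition channel would give the segment-wise comparison.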
Analyze user retention by signup cohort over time.
- Create monthly cohorts:
  - Group users by signup month
  - Track activity in subsequent months
- Run cohort analysis: `python src/cohort_analysis.py`
- Calculate retention rates:
  - Day 1, 7, 14, 30, 60, 90 retention
  - Month-over-month retention curves
  - Cohort LTV estimates
- Compare early vs late cohorts:
  - Product improvements reflected in retention
  - Late cohorts show 8% better Month 1 retention
- Month 1 retention: 45%
- Month 3 retention: 29%
- Month 6 retention: 19%
- Power law: 10% of users drive 50% of activity
- `cohort_retention.csv`
- `retention_matrix.csv` (heatmap data)
- LTV by cohort analysis
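
The monthly cohort grouping can be sketched as below. It assumes calendar-month cohorts and counts a user as retained in month N if they have any event in the Nth calendar month after signup; the actual rules in `src/cohort_analysis.py` may differ, and the function names and input shapes are hypothetical.

```python
from collections import defaultdict
from datetime import date


def month_index(d):
    """Running month number, so that differences give whole-month offsets."""
    return d.year * 12 + d.month - 1


def retention_matrix(signups, events):
    """Map cohort (year, month) -> {month_offset: share of the cohort active that month}.

    signups: {user_id: signup_date}; events: iterable of (user_id, event_date).
    """
    cohort_users = defaultdict(set)
    for user_id, signup in signups.items():
        cohort_users[(signup.year, signup.month)].add(user_id)

    active = defaultdict(set)  # (cohort, offset) -> users seen in that month
    for user_id, day in events:
        signup = signups.get(user_id)
        if signup is None:
            continue  # event for an unknown user: skipped rather than guessed
        offset = month_index(day) - month_index(signup)
        if offset >= 0:
            active[((signup.year, signup.month), offset)].add(user_id)

    matrix = defaultdict(dict)
    for (cohort, offset), users in active.items():
        matrix[cohort][offset] = len(users) / len(cohort_users[cohort])
    return dict(matrix)


# Toy data: two January signups, one February signup.
signups = {1: date(2024, 1, 5), 2: date(2024, 1, 20), 3: date(2024, 2, 2)}
events = [
    (1, date(2024, 1, 6)), (2, date(2024, 1, 21)), (1, date(2024, 2, 10)),
    (3, date(2024, 2, 3)), (3, date(2024, 3, 1)),
]
matrix = retention_matrix(signups, events)
for cohort in sorted(matrix):
    print(cohort, matrix[cohort])
```

Each row of the result corresponds to one row of a retention heatmap, with the month offset as the column.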
Identify and quantify 5 major growth opportunities.
- Define growth levers:
  - Template discovery improvement
  - Viral sharing optimization
  - SEO content strategy
  - Mobile experience enhancement
  - API/integrations expansion
- Model each lever: `python src/growth_modeling.py`
- Calculate impact:
  - Revenue projections
  - User acquisition estimates
  - Confidence levels
- Prioritize by ROI:
  - Rank by revenue impact × confidence
  - Create implementation timeline
- Project compound impact:
  - Select top 3 levers
  - Model 12-month projection
  - Calculate cumulative revenue
- #1 Lever: SEO Content → $12.5M annual revenue
- #2 Lever: Viral Sharing → $7.8M annual revenue
- #3 Lever: Templates → $6.2M annual revenue
- Compound 12-month impact: $26.5M additional revenue
- `growth_levers.csv`
- `growth_projections.csv`
- Sensitivity analysis charts
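
The "revenue impact × confidence" ranking can be illustrated with a short sketch. The three revenue figures are the ones listed above; the confidence values are hypothetical placeholders (the source does not state them), and the `GrowthLever` class and `prioritize` function are illustrative names rather than the project's API.

```python
from dataclasses import dataclass


@dataclass
class GrowthLever:
    name: str
    annual_revenue_musd: float  # revenue impact in millions of USD
    confidence: float           # 0-1; the values used below are hypothetical

    @property
    def score(self):
        """Expected value used for ranking: revenue impact x confidence."""
        return self.annual_revenue_musd * self.confidence


def prioritize(levers):
    """Sort levers by score, highest first."""
    return sorted(levers, key=lambda lever: lever.score, reverse=True)


levers = [
    GrowthLever("Templates", 6.2, 0.8),
    GrowthLever("SEO Content", 12.5, 0.7),
    GrowthLever("Viral Sharing", 7.8, 0.7),
]
ranked = prioritize(levers)
top3_total = sum(lever.annual_revenue_musd for lever in ranked[:3])

for lever in ranked:
    print(f"{lever.name}: score {lever.score:.2f}")
print(f"Top-3 total: ${top3_total:.1f}M")
```

Note that the total here is a plain sum of the three annual figures; any interaction between levers would need to be modeled separately.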
Generate production-ready SQL queries for all key metrics.
- Generate query templates: `python src/sql_queries.py`
- Review queries for:
  - DAU/MAU metrics
  - Funnel analysis
  - Cohort retention
  - Power users
  - Feature adoption
  - Revenue metrics
  - North Star metric
- Customize for your database:
  - Adjust schema names
  - Optimize indexes
  - Add filters as needed
- 7 SQL files in `sql/` directory
- Ready to run on PostgreSQL
- Documented and commented
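
For a sense of what a DAU query might look like, here is a sketch that runs one against an in-memory SQLite database from Python. SQLite stands in for PostgreSQL only so the example is self-contained; the `user_events` table and its columns are an assumed schema, not the one in the project's `sql/` files. The query itself uses only standard SQL.

```python
import sqlite3

# Hypothetical schema: user_events(user_id, event_type, event_date).
DAU_QUERY = """
SELECT event_date, COUNT(DISTINCT user_id) AS dau
FROM user_events
GROUP BY event_date
ORDER BY event_date;
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_events (user_id INTEGER, event_type TEXT, event_date TEXT)"
)
conn.executemany(
    "INSERT INTO user_events VALUES (?, ?, ?)",
    [
        (1, "page_view", "2024-01-01"),
        (1, "page_edit", "2024-01-01"),  # same user, same day: counted once
        (2, "page_view", "2024-01-01"),
        (1, "page_view", "2024-01-02"),
    ],
)
dau_rows = conn.execute(DAU_QUERY).fetchall()
conn.close()

for event_date, dau in dau_rows:
    print(event_date, dau)
```

When adapting this to a real database, the schema name and any index on `(event_date, user_id)` are the parts most likely to need adjustment.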
Create beautiful, interactive dashboards.
- Generate all visualizations:
  - Included in main pipeline
  - Or run individually: `python src/visualization.py`
- Review outputs:
  - North Star metric gauge
  - Engagement trends
  - Funnel visualization
  - Cohort retention heatmap
  - Feature adoption charts
  - Growth levers bar chart
  - Executive dashboard
- Open dashboards: `start outputs\dashboards\executive_dashboard.html`
- 8+ interactive HTML visualizations
- Executive dashboard (comprehensive)
- Print-ready PNG charts
Generate executive summary and recommendations.
- Run complete pipeline: `python scripts\run_full_analysis.py`
- Review generated report:
  - Open `outputs/reports/analytics_framework_report.txt`
- Customize recommendations:
  - Add company-specific context
  - Adjust timelines
  - Include stakeholder notes
- Executive Summary
- North Star Metric Analysis
- Key Metrics Snapshot
- Funnel Analysis
- Retention Insights
- Growth Opportunities (ranked)
- Strategic Recommendations
- Next Steps
- No missing user IDs
- Timestamps sequential
- Conversion rates realistic (1-20%)
- Event distributions make sense
- Metrics calculated correctly
- Funnel stages flow logically
- Retention curves show decay
- Growth projections reasonable
- All charts render correctly
- Colors consistent with brand
- Labels clear and readable
- Interactive elements work
- All code commented
- README complete
- SQL queries documented
- Report clear and actionable
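
Several items in the checklist above lend themselves to automated checks. The sketch below shows one possible shape for four of them; the function names and input formats are assumptions, and the sample values are the retention and conversion figures quoted earlier in this document.

```python
def check_no_missing_ids(rows):
    """Every row must carry a non-empty user_id."""
    return all(row.get("user_id") not in (None, "") for row in rows)


def check_timestamps_sequential(timestamps):
    """Timestamps must be non-decreasing in the order given."""
    return all(a <= b for a, b in zip(timestamps, timestamps[1:]))


def check_conversion_realistic(rate, low=0.01, high=0.20):
    """Conversion rate must fall inside the 1-20% band from the checklist."""
    return low <= rate <= high


def check_retention_decays(curve):
    """Retention must not increase from one period to the next."""
    return all(a >= b for a, b in zip(curve, curve[1:]))


results = {
    "no_missing_ids": check_no_missing_ids([{"user_id": 1}, {"user_id": 2}]),
    "timestamps_sequential": check_timestamps_sequential(
        ["2024-01-01T09:00", "2024-01-01T09:05", "2024-01-02T08:00"]
    ),
    "conversion_realistic": check_conversion_realistic(0.13),
    "retention_decays": check_retention_decays([0.45, 0.29, 0.19]),
}
for name, passed in results.items():
    print(f"{name}: {'PASS' if passed else 'FAIL'}")
```

ISO-formatted timestamp strings compare correctly as plain strings, which is why no parsing is needed in this example.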
| Phase | Time | Cumulative |
|---|---|---|
| 1. Data Generation | 30 min | 30 min |
| 2. Metrics Framework | 45 min | 1h 15min |
| 3. Funnel Analysis | 60 min | 2h 15min |
| 4. Cohort Analysis | 60 min | 3h 15min |
| 5. Growth Modeling | 90 min | 4h 45min |
| 6. SQL Queries | 30 min | 5h 15min |
| 7. Visualizations | 45 min | 6h |
| 8. Final Report | 30 min | 6h 30min |
Total: ~6.5 hours (spread over 2-3 days for thorough analysis)
Actual Runtime: ~5 minutes (for code execution)
- Synthetic data generation at scale
- Complex funnel analysis with segments
- Cohort retention mathematics
- Growth modeling with confidence intervals
- SQL optimization for analytics
- North Star metrics must combine engagement + value
- Activation is the most critical stage
- Collaboration drives retention and monetization
- Early product improvements compound over time
- Data-driven prioritization beats intuition
- Visualizations must tell a story
- Executive dashboards need context
- Recommendations must be actionable
- Quantify everything (revenue, users, time)
- Present ranges, not point estimates
Issue: Data generation takes too long
- Solution: Reduce the `n_users` parameter to 10,000-25,000

Issue: Memory error during analysis
- Solution: Process data in chunks using the `chunksize` parameter

Issue: Visualizations don't render
- Solution: Install kaleido: `pip install kaleido`

Issue: SQL queries don't match your database
- Solution: Adjust schema names in `config.py`
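
For the memory-error issue above, the general idea of chunked processing can be sketched with the standard library alone. The source does not say which loader the `chunksize` parameter belongs to (pandas' `read_csv` accepts a parameter of that name); the helpers below are a dependency-free illustration of the same approach, with assumed column names.

```python
import csv
import io
from collections import Counter
from itertools import islice


def iter_chunks(reader, chunk_size):
    """Yield lists of at most chunk_size rows so only one chunk is held at a time."""
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def count_event_types(csv_file, chunk_size=10_000):
    """Aggregate event_type counts chunk by chunk instead of loading the whole file."""
    totals = Counter()
    reader = csv.DictReader(csv_file)
    for chunk in iter_chunks(reader, chunk_size):
        totals.update(row["event_type"] for row in chunk)
    return totals


# In-memory stand-in for an events file; the column names are assumptions.
sample = io.StringIO("user_id,event_type\n1,page_view\n2,page_edit\n1,page_view\n")
totals = count_event_types(sample, chunk_size=2)
print(dict(totals))
```

This works for aggregations that can be combined across chunks (counts, sums, distinct sets); metrics that need a global sort would require a different strategy.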
After completing this project:
- Adapt framework for your product
- Connect to real data sources
- Set up automated reporting
- Build real-time dashboards
- Implement top growth levers
- Track impact and iterate