This document outlines the comprehensive testing strategy for MentorMe to prevent regressions and ensure code quality.
MentorMe uses a test pyramid approach to ensure comprehensive coverage while maintaining fast test execution:
```
          /\
         /  \         Integration Tests (5-10%)
        /    \        - Full user flows
       /------\       - E2E scenarios
      /        \
     /  Widget  \     Widget Tests (20-30%)
    /   Tests    \    - Critical screens
   /--------------\   - Custom widgets
  /                \
 |    Unit Tests    | Unit Tests (60-70%)
 |    (Priority!)   | - Providers (state management)
 |                  | - Services (business logic)
  \________________/  - Utility functions
```
Testing Approach:
- Hybrid Strategy: Standard Dart tests for unit tests, Gherkin/BDD for integration tests
- Unit Tests: Fast, developer-focused, extensive coverage
- Integration Tests: BDD-style, stakeholder-readable, critical user flows
Current Coverage:
- ✅ Schema validation tests (data models)
- ✅ Legacy migration tests
- ✅ Provider tests (GoalProvider, JournalProvider, HabitProvider)
- ✅ BDD/Gherkin integration tests (goal management flows)
- ⚠️ Service tests (partially implemented)
- ⚠️ Widget tests (not yet implemented)
```
test/
├── providers/                        # Unit tests for providers (state management)
│   ├── goal_provider_test.dart
│   ├── journal_provider_test.dart
│   ├── habit_provider_test.dart
│   ├── checkin_provider_test.dart
│   ├── pulse_provider_test.dart
│   ├── pulse_type_provider_test.dart
│   └── chat_provider_test.dart
├── services/                         # Unit tests for services (business logic)
│   ├── ai_service_test.dart
│   ├── storage_service_test.dart
│   ├── mentor_intelligence_service_test.dart
│   ├── notification_service_test.dart
│   └── context_management_service_test.dart
├── widgets/                          # Widget tests (UI components)
│   ├── mentor_coaching_card_widget_test.dart
│   ├── goal_card_widget_test.dart
│   └── habit_card_widget_test.dart
├── screens/                          # Screen tests (full pages)
│   ├── home_screen_test.dart
│   ├── goals_screen_test.dart
│   └── chat_screen_test.dart
├── models/                           # Model tests (data structures)
│   ├── goal_test.dart
│   └── journal_entry_test.dart
├── features/                         # BDD/Gherkin feature files (plain text)
│   ├── goal_management.feature
│   ├── journal_writing.feature
│   └── habit_tracking.feature
├── steps/                            # Step definitions for Gherkin tests
│   ├── common_steps.dart             # Reusable steps across all features
│   ├── goal_steps.dart               # Goal-specific steps
│   ├── journal_steps.dart            # Journal-specific steps
│   └── habit_steps.dart              # Habit-specific steps
├── integration/                      # Integration tests (multi-component)
│   └── backup_restore_race_condition_test.dart  # Backup/restore safety tests
├── schema_validation_test.dart       # Schema synchronization tests
└── legacy_migration_test.dart        # Data migration tests

test_driver/                          # Integration test driver (Gherkin)
├── app.dart                          # App entry point for integration tests
└── app_test.dart                     # Gherkin test configuration & runner
```
```bash
# Run all tests
flutter test

# Run a specific test file
flutter test test/providers/goal_provider_test.dart

# Run with coverage
flutter test --coverage

# Generate HTML report
genhtml coverage/lcov.info -o coverage/html

# Open in browser
open coverage/html/index.html      # macOS
xdg-open coverage/html/index.html  # Linux

# Re-run tests on file changes
flutter test --watch
```

```bash
# Run all Gherkin tests
flutter drive --target=test_driver/app.dart

# Run specific feature file
flutter drive --target=test_driver/app.dart --feature=test/features/goal_management.feature

# Run tests with specific tags (e.g., only critical tests)
flutter drive --target=test_driver/app.dart --tags="@critical"

# Run tests excluding certain tags (e.g., skip work-in-progress tests)
flutter drive --target=test_driver/app.dart --tags="not @wip"

# Combine tag filters
flutter drive --target=test_driver/app.dart --tags="@integration and @critical"
```

Gherkin Test Reports:
- JSON report: reports/gherkin-report.json
- Console output: shows progress and summary
Purpose: Test individual functions, methods, and classes in isolation.
What to Test:
- Provider methods (add, update, delete, filter)
- Service logic (AI response generation, storage operations)
- Utility functions (streak calculations, date formatting)
- Data model serialization/deserialization
Example:
```dart
test('should add a new goal', () async {
  final goal = Goal(title: 'Test Goal', category: GoalCategory.personal);
  await goalProvider.addGoal(goal);

  expect(goalProvider.goals.length, 1);
  expect(goalProvider.goals.first.title, 'Test Goal');
});
```

Coverage Target: 70-80%
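The same pattern applies to utility functions such as streak calculations. A minimal sketch, assuming a hypothetical `calculateStreak` helper that counts consecutive days (the real MentorMe helper may differ):

```dart
import 'package:flutter_test/flutter_test.dart';

// Hypothetical utility under test: counts consecutive days,
// walking backward from the most recent completion date.
int calculateStreak(List<DateTime> completionDates) {
  if (completionDates.isEmpty) return 0;
  final sorted = completionDates.toList()..sort((a, b) => b.compareTo(a));
  var streak = 1;
  for (var i = 0; i < sorted.length - 1; i++) {
    final gap = sorted[i].difference(sorted[i + 1]).inDays;
    if (gap == 1) {
      streak++;
    } else if (gap > 1) {
      break; // a missed day ends the streak
    }
  }
  return streak;
}

void main() {
  test('should calculate current streak correctly for consecutive days', () {
    final dates = [
      DateTime(2024, 1, 3),
      DateTime(2024, 1, 2),
      DateTime(2024, 1, 1),
    ];
    expect(calculateStreak(dates), 3);
  });

  test('should stop counting at the first missed day', () {
    final dates = [
      DateTime(2024, 1, 5),
      DateTime(2024, 1, 4),
      DateTime(2024, 1, 1), // gap breaks the streak
    ];
    expect(calculateStreak(dates), 2);
  });
}
```

Pure functions like this need no mocks or setUp, which is what makes unit tests the fastest layer of the pyramid.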
Purpose: Test UI components and user interactions.
What to Test:
- Widget rendering (does it display correctly?)
- User interactions (tap, swipe, input)
- State changes (does UI update when state changes?)
- Navigation (does tapping button navigate correctly?)
Example:
```dart
testWidgets('should display goal title', (WidgetTester tester) async {
  final goal = Goal(title: 'Test Goal', category: GoalCategory.personal);

  await tester.pumpWidget(MaterialApp(
    home: GoalCardWidget(goal: goal),
  ));

  expect(find.text('Test Goal'), findsOneWidget);
});
```

Coverage Target: 50-60%
Purpose: Test complete user flows end-to-end.
What to Test:
- Create goal → Add milestone → Complete milestone
- Write journal entry → Link to goal → View in timeline
- Complete habit → Build streak → View stats
Example:
```dart
testWidgets('should complete full goal creation flow', (WidgetTester tester) async {
  await tester.pumpWidget(MyApp());

  // Tap "Add Goal" button
  await tester.tap(find.byIcon(Icons.add));
  await tester.pumpAndSettle();

  // Fill in goal details
  await tester.enterText(find.byType(TextField).first, 'New Goal');
  await tester.tap(find.text('Save'));
  await tester.pumpAndSettle();

  // Verify goal appears in list
  expect(find.text('New Goal'), findsOneWidget);
});
```

Coverage Target: 30-40% of critical flows
Purpose: Write integration tests in plain English using Gherkin syntax for stakeholder readability.
When to Use:
- ✅ Critical user flows (e.g., goal creation → milestone completion)
- ✅ Acceptance criteria from user stories
- ✅ Regression tests for high-value features
- ✅ Tests that non-technical stakeholders should understand
When NOT to Use:
- ❌ Unit tests (too verbose, use standard Dart tests)
- ❌ Simple logic tests (overhead not justified)
- ❌ Tests that change frequently (step definitions need updates)
Plain Text Feature Files:
```gherkin
# test/features/goal_management.feature
Feature: Goal Management
  As a user
  I want to create and manage goals with milestones
  So that I can track my progress toward meaningful achievements

  @critical @integration
  Scenario: Create a new goal successfully
    Given I am on the home screen
    When I navigate to the goals screen
    And I tap the "Add Goal" button
    And I enter "Launch my website" as the goal title
    And I select "Career" as the category
    And I tap the "Save" button
    Then I should see "Launch my website" in my goals list
    And the goal should be in "Active" status
```

Step Definitions (Dart Code):
```dart
// test/steps/goal_steps.dart
class WhenIEnterGoalTitle extends When1<String> {
  @override
  Future<void> executeStep(String title) async {
    final world = getWorld<FlutterWorld>();
    await world.appDriver.enterText(
      find.byKey(const Key('goal_title_field')),
      title,
    );
  }

  @override
  RegExp get pattern => RegExp(r'I enter "([^"]*)" as the goal title');
}
```

Running Gherkin Tests:
```bash
# Run all Gherkin tests
flutter drive --target=test_driver/app.dart

# Run only critical tests
flutter drive --target=test_driver/app.dart --tags="@critical"
```

Benefits:
- ✅ Plain English: Non-technical stakeholders can read and write tests
- ✅ Living Documentation: Feature files serve as up-to-date requirements
- ✅ Reusable Steps: Write once, use across many scenarios
- ✅ Acceptance Criteria: Maps directly to user stories
Considerations:
- ⚠️ Slower Execution: Parsing feature files adds overhead
- ⚠️ Setup Required: Need test_driver/, step definitions, configuration
- ⚠️ IDE Support: Less autocomplete compared to pure Dart tests
Use descriptive test names that explain what is being tested and what the expected outcome is:
✅ GOOD:
```dart
test('should calculate current streak correctly for consecutive days', () {});
test('should return null when goal not found', () {});
test('should persist completion history to SharedPreferences', () {});
```

❌ BAD:
```dart
test('test goal', () {});
test('streak test', () {});
test('it works', () {});
```

Structure tests using the AAA pattern:
```dart
test('should add a new goal', () async {
  // Arrange - Set up test data and preconditions
  final goal = Goal(title: 'Test Goal', category: GoalCategory.personal);

  // Act - Perform the action being tested
  await goalProvider.addGoal(goal);

  // Assert - Verify the expected outcome
  expect(goalProvider.goals.length, 1);
  expect(goalProvider.goals.first.title, 'Test Goal');
});
```

Each test should be independent and not rely on other tests:
✅ GOOD:
```dart
setUp(() async {
  SharedPreferences.setMockInitialValues({});
  goalProvider = GoalProvider();
  await goalProvider.loadGoals();
});

test('test 1', () async {
  // Fresh provider, clean state
});

test('test 2', () async {
  // Fresh provider, clean state
});
```

❌ BAD:
```dart
test('test 1', () async {
  await goalProvider.addGoal(goal1);
});

test('test 2', () async {
  // Assumes test 1 ran first - FRAGILE!
  expect(goalProvider.goals.length, 1);
});
```

Use mocks for external services (API calls, storage, etc.):
```dart
class MockAIService extends Mock implements AIService {}

test('should generate coaching response', () async {
  final mockAI = MockAIService();
  when(mockAI.generateCoachingResponse(prompt: anyNamed('prompt'))).thenAnswer(
    (_) async => 'Great progress!',
  );

  final response = await mockAI.generateCoachingResponse(prompt: 'How am I doing?');
  expect(response, 'Great progress!');
});
```

Don't just test the happy path - test edge cases:
```dart
group('Edge Cases', () {
  test('should handle null descriptions', () async { /* ... */ });
  test('should handle empty content', () async { /* ... */ });
  test('should handle very long content', () async { /* ... */ });
  test('should prevent duplicate IDs', () async { /* ... */ });
  test('should handle concurrent updates', () async { /* ... */ });
});
```

Organize related tests using group():
```dart
group('GoalProvider', () {
  group('CRUD Operations', () {
    test('should add a new goal', () {});
    test('should update an existing goal', () {});
    test('should delete a goal', () {});
  });

  group('Status Management', () {
    test('should complete a goal', () {});
    test('should abandon a goal', () {});
  });
});
```

Tests run automatically on every push and pull request:
```yaml
# .github/workflows/android-build.yml
- name: Run Flutter tests
  run: flutter test
  continue-on-error: true

- name: Run schema validation test
  run: flutter test test/schema_validation_test.dart
  continue-on-error: false  # FAIL build if schema validation fails
```

Coverage reports will be:
- ✅ Generated on every CI run
- ✅ Uploaded to coverage service (Codecov/Coveralls)
- ✅ Displayed in pull requests
- ✅ Enforced via minimum threshold (70% coverage)
| Category | Current | Target | Priority |
|---|---|---|---|
| Providers | 80% | 90% | HIGH |
| Services | 30% | 70% | HIGH |
| Models | 60% | 80% | MEDIUM |
| Widgets | 0% | 50% | MEDIUM |
| Screens | 0% | 30% | LOW |
| Integration | 0% | 20% | LOW |
| Overall | 40% | 70% | HIGH |
Install pre-commit hooks to run tests before committing:
```bash
#!/bin/bash
# .git/hooks/pre-commit
flutter test
if [ $? -ne 0 ]; then
  echo "❌ Tests failed! Commit aborted."
  exit 1
fi
```

Before merging a PR:
- ✅ All tests must pass
- ✅ No decrease in code coverage
- ✅ New features include tests
- ✅ Bug fixes include regression tests
For new features, consider TDD:
- Write test first (it fails)
- Implement feature (test passes)
- Refactor (test still passes)
Example:
```dart
// Step 1: Write failing test
test('should archive a goal', () async {
  final goal = Goal(title: 'Test', category: GoalCategory.personal);
  await goalProvider.addGoal(goal);

  await goalProvider.archiveGoal(goal.id); // Method doesn't exist yet!

  final archivedGoal = goalProvider.getGoalById(goal.id);
  expect(archivedGoal!.status, GoalStatus.archived);
});

// Step 2: Implement archiveGoal() method
// Step 3: Test passes, refactor if needed
```

Track regression metrics:
- Test pass rate: Should be 100%
- Code coverage: Should not decrease
- Test execution time: Should remain fast (<5 minutes)
- Flaky tests: Should be fixed immediately
- GoalProvider tests
- JournalProvider tests
- HabitProvider tests
- Schema validation tests
- Legacy migration tests
- AIService tests (mock API calls)
- StorageService tests
- MentorIntelligenceService tests
- NotificationService tests
- ContextManagementService tests
- MentorCoachingCardWidget tests
- GoalCardWidget tests
- HabitCardWidget tests
- Custom form widgets tests
- HomeScreen tests
- GoalsScreen tests
- ChatScreen tests
- JournalScreen tests
- Create goal → Complete milestone flow
- Journal entry → Link to goal flow
- Complete habit → Build streak flow
- Backup → Restore race condition tests
- Add test coverage reporting to CI/CD
- Set up coverage badges
- Implement pre-commit hooks
- Add performance tests for large datasets
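The performance-test item above could start with a simple Stopwatch-based check. This is a sketch that reuses the provider APIs from the earlier examples; the 1,000-item dataset and 2-second threshold are illustrative assumptions, not project requirements:

```dart
test('should load 1,000 goals within a reasonable time', () async {
  SharedPreferences.setMockInitialValues({});
  final goalProvider = GoalProvider();

  // Seed a large dataset (same addGoal API as the unit-test examples)
  for (var i = 0; i < 1000; i++) {
    await goalProvider.addGoal(
      Goal(title: 'Goal $i', category: GoalCategory.personal),
    );
  }

  // Measure a full reload from storage
  final stopwatch = Stopwatch()..start();
  await goalProvider.loadGoals();
  stopwatch.stop();

  expect(goalProvider.goals.length, 1000);
  expect(stopwatch.elapsedMilliseconds, lessThan(2000),
      reason: 'Loading 1,000 goals should stay under 2 seconds');
});
```

Keep thresholds generous enough to avoid flakiness on slow CI machines; the goal is to catch order-of-magnitude regressions, not micro-benchmarks.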
```dart
setUp(() async {
  SharedPreferences.setMockInitialValues({});
  provider = MyProvider();
  await provider.loadData();
});

test('should persist data', () async {
  await provider.addItem(item);

  // Create new instance to test loading
  final newProvider = MyProvider();
  await newProvider.loadData();
  expect(newProvider.items.length, 1);
});
```

```dart
test('should handle async operations', () async {
  final future = provider.fetchData();

  // Can test loading state here
  expect(provider.isLoading, true);

  await future;
  expect(provider.isLoading, false);
  expect(provider.data, isNotNull);
});
```

```dart
test('should handle errors gracefully', () async {
  // Simulate error condition
  when(mockService.getData()).thenThrow(Exception('Network error'));

  await provider.loadData();
  expect(provider.hasError, true);
  expect(provider.errorMessage, contains('Network error'));
});
```

```dart
test('should update stream when data changes', () async {
  final stream = provider.dataStream;

  provider.addItem(item1);
  await expectLater(stream, emits(containsAll([item1])));

  provider.addItem(item2);
  await expectLater(stream, emits(containsAll([item1, item2])));
});
```

Critical: Backup/restore operations have a race condition window where stale in-memory data could overwrite restored data.
The Problem:
```
Time 0: ✅ SharedPreferences written with restored data
Time 1: ✅ Providers start reloading...
Time 2: ❌ [RACE WINDOW] Background operation saves stale data
Time 3: ✅ Providers finish reloading
```
Protection Strategy:
- BackupService writes to SharedPreferences
- Immediately calls `reload()` on all providers
- Providers re-fetch from SharedPreferences
- UI shows success message
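The steps above can be sketched as follows. The class and method names here are illustrative assumptions, not the real MentorMe API; the point is the ordering: write first, then reload every provider before showing success.

```dart
import 'package:shared_preferences/shared_preferences.dart';

// Illustrative sketch only - names are assumptions.
abstract class Reloadable {
  Future<void> reload(); // re-fetches state from SharedPreferences
}

class BackupService {
  Future<void> restore(
    String backupJson,
    List<Reloadable> providers,
    void Function() onSuccess,
  ) async {
    // 1. Write the restored data to SharedPreferences
    final prefs = await SharedPreferences.getInstance();
    await prefs.setString('app_data', backupJson);

    // 2+3. Immediately reload all providers so their in-memory state
    //      is re-fetched from SharedPreferences, closing the race window
    for (final provider in providers) {
      await provider.reload();
    }

    // 4. Only now show the success message
    onSuccess();
  }
}
```

Because the reloads are awaited before `onSuccess()` runs, any later background save writes the restored data rather than the stale pre-restore state.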
Test Coverage:
See test/integration/backup_restore_race_condition_test.dart for comprehensive tests:
```dart
test('restore should not be overwritten by immediate save after import', () async {
  // 1. Create original data and export
  final originalGoal = Goal(title: 'Original Goal', ...);
  await goalProvider.addGoal(originalGoal);
  final backupData = await backupService.exportData();

  // 2. Modify data after backup
  final newGoal = Goal(title: 'New Goal After Backup', ...);
  await goalProvider.addGoal(newGoal);
  expect(goalProvider.goals.length, 2);

  // 3. Import backup (restores to SharedPreferences)
  await backupService.restoreFromBackup(backupData);

  // 4. CRITICAL: Try to save with stale provider data
  // (simulates background operation that hasn't reloaded)
  final staleGoal = Goal(title: 'Stale Goal', ...);
  await goalProvider.addGoal(staleGoal);

  // 5. Reload provider (simulating proper restore flow)
  await goalProvider.reload();

  // 6. Verify: Should have original + stale (restore wins)
  expect(goalProvider.goals.length, 2);
  expect(goalProvider.goals.map((g) => g.title),
      containsAll(['Original Goal', 'Stale Goal']));
  expect(goalProvider.goals.map((g) => g.title),
      isNot(contains('New Goal After Backup')));
});
```

Additional Test Scenarios:
- Concurrent operations: Multiple providers saving during restore
- Rapid saves: Background timers triggering rapid saves
- Empty backups: Restoring empty backup should clear data
- Provider reload verification: All providers implement a `reload()` method
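For instance, the empty-backup scenario might be tested like this (a sketch using the same hypothetical BackupService/provider APIs and `...` elisions as the test above):

```dart
test('restoring an empty backup should clear existing data', () async {
  // 1. Capture a backup while no data exists
  final emptyBackup = await backupService.exportData();

  // 2. Add data after the empty backup was taken
  await goalProvider.addGoal(Goal(title: 'Goal After Backup', ...));
  expect(goalProvider.goals.length, 1);

  // 3. Restore the empty backup and reload
  await backupService.restoreFromBackup(emptyBackup);
  await goalProvider.reload();

  // 4. The restored (empty) state should win
  expect(goalProvider.goals, isEmpty);
});
```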
Why These Tests Matter:
- Prevents data loss during restore operations
- Catches regressions if restore flow is refactored
- Documents expected behavior for developers
- Provides confidence that user data is safe
Running Backup/Restore Tests:
```bash
flutter test test/integration/backup_restore_race_condition_test.dart
```

Real-World Scenarios Tested:
- User imports backup but dismisses "restart app" dialog
- Background notification service triggers save during restore
- Auto-backup service runs during restore
- User rapidly interacts with UI during restore
- Multiple concurrent saves from different providers
Safety Guarantees:
- ✅ Restored data ALWAYS wins over stale in-memory data
- ✅ Provider `reload()` ensures fresh data from SharedPreferences
- ✅ BackupService writes atomically to SharedPreferences
- ✅ No partial writes (all-or-nothing restore)
Solution: Mock SharedPreferences in setUp:
```dart
setUp(() async {
  SharedPreferences.setMockInitialValues({});
});
```

Solution: Increase timeout or use pumpAndSettle():
```dart
testWidgets('my test', (tester) async {
  await tester.pumpWidget(myWidget);
  await tester.pumpAndSettle(); // Wait for all animations
});
```

Solution: Wrap widget with providers:
```dart
await tester.pumpWidget(
  MultiProvider(
    providers: [
      ChangeNotifierProvider(create: (_) => GoalProvider()),
    ],
    child: MaterialApp(home: MyScreen()),
  ),
);
```

Solution: Avoid time-dependent tests, use await, ensure test isolation:
```dart
// ❌ BAD - time-dependent
test('test', () {
  Future.delayed(Duration(seconds: 1), () => doSomething());
  expect(result, isTrue); // May fail due to timing
});

// ✅ GOOD - await completion
test('test', () async {
  await doSomething();
  expect(result, isTrue); // Waits for completion
});
```

- `flutter_test` - Built-in testing framework
- `mockito` - Mocking library
- `integration_test` - Integration testing
- `golden_toolkit` - Golden file testing (screenshot comparison)
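As an example of the last tool, a golden test compares a widget's rendered pixels against a stored baseline image. This sketch uses `matchesGoldenFile` from flutter_test (run `flutter test --update-goldens` once to record the baseline); the widget and path are assumed from the earlier examples:

```dart
testWidgets('GoalCardWidget matches its golden file', (tester) async {
  final goal = Goal(title: 'Test Goal', category: GoalCategory.personal);

  await tester.pumpWidget(MaterialApp(
    home: GoalCardWidget(goal: goal),
  ));

  // Compares the rendered widget pixel-for-pixel against the stored image
  await expectLater(
    find.byType(GoalCardWidget),
    matchesGoldenFile('goldens/goal_card_widget.png'),
  );
});
```

Golden tests are sensitive to fonts and platform rendering, so run them on a single, consistent CI platform.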
When adding new features or fixing bugs:
- Write tests first (TDD approach preferred)
- Ensure all tests pass before submitting PR
- Maintain or improve coverage (no decrease allowed)
- Add tests for bug fixes to prevent regressions
- Update this document if introducing new testing patterns
If you have questions about testing or need help writing tests:
- Check existing test files for examples
- Consult this document for patterns
- Ask in PR reviews for guidance
- Refer to Flutter testing documentation
Remember: Tests are not just about coverage - they're about confidence that your code works correctly and preventing regressions as the codebase evolves.