Behavioral Research
Focus on observed behaviors in real contexts. Use the Say‑Do Gap Diagnostic to choose methods. See Downloads.
TLDR: Behavioral research in the Behavioral Strategy context focuses on observing and measuring what people actually do (not what they say they’ll do), using mixed methods to validate problems, identify leverageable behaviors, and test interventions.
Term crosswalk for readers of the Overview article (see the Overview)
Older term | Where it lives here |
---|---|
Situational Survey | Context mapping and Environmental Audit
Behavioral Audit | Behavioral Observation, Diaries, Time-motion, Behavior Calendar
Worldview Analysis | Belief mapping, Emotional journey mapping
Problem Examination | Behavioral Event Interviews focused on problems
Overview
Behavioral research differs fundamentally from traditional market research. While market research asks “What do you want?” and “Would you buy this?”, behavioral research asks “What do you actually do?” and “What prevents you from doing X?”
This guide provides practical methods for conducting behavioral research throughout the DRIVE Framework.
Example (enterprise data workflow): In a platform rollout for analytics teams, the target behavior was “Publish a certified dataset with complete metadata.” Observation and time‑motion studies showed the weakest step was writing complete metadata. The team prototyped a metadata wizard + templates, then field‑tested with 30 analysts. SMF was measured as completion rate and time to first publish; follow‑on bPMF tracked certified‑publish behavior in 30‑day acquisition cohorts.
Say‑Do Gap Diagnostic
Context stability | Consequence salience | Observability | Method |
---|---|---|---|
High | High | High | Log analysis, field observation |
High | Low | Medium | Diary + spot checks |
Low | Medium | Low | Event‑based interviews + rapid probes |
Low | Low | Low | Structured experiments |
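As a minimal sketch, the diagnostic can be encoded as a simple lookup so teams apply it consistently; the ratings and recommended methods mirror the table above (the function and fallback choice are illustrative, not part of the framework):
# Sketch: the Say-Do Gap Diagnostic as a lookup (ratings/methods mirror the table above)
DIAGNOSTIC = {
    ("high", "high", "high"): "Log analysis, field observation",
    ("high", "low", "medium"): "Diary + spot checks",
    ("low", "medium", "low"): "Event-based interviews + rapid probes",
    ("low", "low", "low"): "Structured experiments",
}

def recommend_method(context_stability, consequence_salience, observability):
    """Return the recommended research method for a rated say-do gap."""
    key = (context_stability.lower(), consequence_salience.lower(), observability.lower())
    # Fall back to structured experiments when the combination is not in the table
    return DIAGNOSTIC.get(key, "Structured experiments")

print(recommend_method("High", "Low", "Medium"))  # -> Diary + spot checks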
Core Principles of Behavioral Research
1. Actions Over Intentions
Traditional: “Would you use a fitness app?” Behavioral: “Show me your phone. Which apps do you open daily? When did you last try to exercise?”
2. Context Matters
Traditional: Survey in isolation Behavioral: Observe in natural environment where behavior occurs
3. Barriers Are Key
Traditional: “Why don’t you save money?” Behavioral: “Walk me through what happened the last time you tried to save”
4. Past Predicts Future
Traditional: “Will you change your behavior?” Behavioral: “How many times have you tried before? What happened?”
Research Methods by DRIVE Phase
Define Phase: Problem Validation Research
Method 1: Behavioral Event Interviews
Purpose: Understand when and how problems manifest
Protocol:
1. "Tell me about the last time you experienced [problem]"
2. "What exactly happened?" (probe for specifics)
3. "What did you do about it?" (actual behavior)
4. "How often does this happen?" (frequency)
5. "What have you tried before?" (solution history)
What to Listen For:
- Specific behaviors indicating problem severity
- Workarounds revealing unmet needs
- Emotional language showing problem importance
- Frequency and impact on daily life
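One way to turn these signals into comparable evidence across interviews is a simple coding sheet; a sketch follows, where the field names and tally are illustrative assumptions rather than a prescribed format:
# Illustrative coding sheet for Behavioral Event Interviews (field names are examples)
from dataclasses import dataclass, field

@dataclass
class EventInterviewCode:
    participant_id: str
    problem_behaviors: list = field(default_factory=list)   # specific behaviors in the story
    workarounds: list = field(default_factory=list)         # improvised fixes revealing unmet needs
    emotional_language: list = field(default_factory=list)  # words signalling importance
    frequency: str = ""                                      # e.g. "daily", "weekly"
    prior_attempts: int = 0                                   # solution history

interviews = [
    EventInterviewCode("p01", workarounds=["exports to spreadsheet"], frequency="daily", prior_attempts=2),
    EventInterviewCode("p02", frequency="weekly", prior_attempts=0),
]

# Simple tally: how many participants report hitting the problem daily?
daily = sum(1 for i in interviews if i.frequency == "daily")
print(f"{daily}/{len(interviews)} participants hit the problem daily")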
Method 2: Problem Diary Studies
Purpose: Track problem occurrence in real-time
Implementation:
# Pseudocode for clarity
# Diary Study Protocol
diary_prompts = {
"trigger": "When you experience [problem], log it immediately",
"capture": [
"Time and location",
"What triggered it",
"How you responded",
"Impact on your day",
"Attempted solutions"
],
"duration": "7-14 days",
"medium": "Mobile app, SMS, or paper",
"incentive": "Small daily + completion bonus"
}
Analysis Framework:
- Frequency patterns
- Contextual triggers
- Severity variations
- Current coping behaviors
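A sketch of this analysis, assuming diary entries are exported to a table; the column names below are assumptions about how the diary fields might be captured:
# Sketch of diary-entry analysis (column names are illustrative)
import pandas as pd

entries = pd.DataFrame([
    {"participant": "p01", "day": 1, "trigger": "deadline", "severity": 4, "coping": "workaround"},
    {"participant": "p01", "day": 3, "trigger": "handoff",  "severity": 2, "coping": "ignored"},
    {"participant": "p02", "day": 2, "trigger": "deadline", "severity": 5, "coping": "asked colleague"},
])

# Frequency patterns: entries per participant across the study
frequency = entries.groupby("participant")["day"].count()

# Contextual triggers and severity variations
by_trigger = entries.groupby("trigger")["severity"].agg(["count", "mean"])

# Current coping behaviors
coping = entries["coping"].value_counts()

print(by_trigger)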
Research Phase: Behavior Identification
Method 3: Behavioral Observation
Purpose: See what people actually do vs. what they report
Observation Protocol:
- Shadow Sessions (with permission)
  - Follow users through their day
  - Note all behaviors related to the problem space
  - Document context and environment
  - Avoid intervening or asking questions during the session
- Time-Motion Analysis (a logging sketch follows this list)
  Behavior: [What they did]
  Duration: [How long it took]
  Frequency: [How often repeated]
  Context: [Where/when/with whom]
  Difficulty: [Observed struggle points]
  Outcome: [What resulted]
- Environmental Audit
  - Physical barriers/enablers
  - Social influences present
  - Available resources
  - Competing behaviors
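The time-motion template maps naturally onto a structured observation record; a minimal sketch, assuming episodes are timed and coded immediately after each observation (field names mirror the template, the example values are made up):
# Minimal time-motion observation record (fields mirror the template above)
from dataclasses import dataclass

@dataclass
class TimeMotionRecord:
    behavior: str        # what they did
    duration_min: float  # how long it took
    frequency: int       # how often repeated during the session
    context: str         # where/when/with whom
    difficulty: str      # observed struggle points
    outcome: str         # what resulted

session = [
    TimeMotionRecord("writes dataset metadata", 14.0, 1, "desk, end of day, alone",
                     "switches between three documentation tabs", "metadata left half-complete"),
]

total_minutes = sum(r.duration_min for r in session)
print(f"{len(session)} behaviors observed, {total_minutes:.0f} minutes total")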
Method 4: Behavioral Mapping
Purpose: Identify all possible behaviors that could address the problem
Process:
graph TD
A[Problem State] --> B{Current Behaviors}
B --> C[Effective Behaviors]
B --> D[Ineffective Behaviors]
B --> E[No Action]
C --> F[Why These Work?]
D --> G[Why These Fail?]
E --> H[What Prevents Action?]
F --> I[Leverageable Behaviors]
G --> I
H --> I
Behavior Inventory Template:
behavior:
name: "Specific behavior"
current_adoption: "% who do this"
effectiveness: "Problem-solving impact 0-10"
requirements:
ability_needed: "Skills required"
motivation_type: "Intrinsic/extrinsic"
environmental_needs: "Context requirements"
time_investment: "Minutes/hours"
barriers:
- "Barrier 1"
- "Barrier 2"
enablers:
- "Enabler 1"
- "Enabler 2"
Integrate Phase: Solution Testing Research
Method 5: Rapid Behavioral Prototyping
Purpose: Test if solutions actually enable target behaviors
Protocol:
# Pseudocode for clarity
def rapid_behavior_test(solution_prototype, target_behavior, users, duration="3 days"):
    # Capture pre-intervention behavior so lift can be judged against it
    baseline = measure_current_behavior(users, target_behavior)
    introduce_prototype(users, solution_prototype)
    results = track_behavior(users, target_behavior, duration)
    return {
        'baseline_rate': baseline,
        'adoption_rate': results.tried_it / len(users),
        'success_rate': results.completed / results.tried_it,
        'sustained_rate': results.repeated / results.completed,
        'barriers_found': results.failure_reasons,
        'time_to_first_use': results.avg_time_to_try
    }
What Makes a Good Prototype:
- Tests core behavior enablement
- Minimal features (just enough)
- Can deploy in <1 day
- Measurable behavior change
- Clear success criteria
Method 6: A/B Behavior Testing
Purpose: Compare behavioral interventions
Setup:
// Pseudocode for clarity
// A/B Test Configuration
const behaviorTest = {
control: {
name: "Current state",
intervention: null,
measurement: baselineBehaviorRate
},
variant_a: {
name: "Ability-focused",
intervention: "Simplify steps from 5 to 2",
hypothesis: "Reducing complexity increases adoption"
},
variant_b: {
name: "Motivation-focused",
intervention: "Add immediate reward",
hypothesis: "Instant gratification drives behavior"
},
success_metrics: {
primary: "% who perform target behavior",
secondary: [
"Time to first behavior",
"Repetition rate",
"Self-reported difficulty"
]
},
  sample_size: calculateSampleSize({ effectSize: 0.2, power: 0.8 })
}
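Once the variants have run, the primary metric (% who perform the target behavior) can be compared with a two-proportion z-test; a sketch using statsmodels, with made-up counts:
# Sketch: compare target-behavior rates between control and a variant (counts are made up)
from statsmodels.stats.proportion import proportions_ztest

performed = [48, 74]   # users who performed the target behavior (control, variant_a)
exposed = [400, 400]   # users exposed to each condition

z_stat, p_value = proportions_ztest(count=performed, nobs=exposed)
print(f"control {performed[0]/exposed[0]:.1%} vs variant {performed[1]/exposed[1]:.1%}, p = {p_value:.3f}")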
Verify Phase: Outcome Measurement
Method 7: Behavioral Analytics
Purpose: Track actual behavior at scale
Key Metrics:
# Pseudocode for clarity
behavioral_metrics = {
'first_use_rate': 'Users who try behavior / Total users',
'time_to_adopt': 'Average days from exposure to first use',
'adoption_curve': 'Plot of cumulative adoption over time',
'frequency': 'Behaviors per user per time period',
'consistency': 'Standard deviation of behavior frequency',
'streak_length': 'Consecutive periods with behavior',
'completion_rate': 'Completed behaviors / Attempted behaviors',
'error_rate': 'Failed attempts / Total attempts',
'time_per_behavior': 'Average duration to complete',
'retention_curve': 'Users still active at T+30, T+60, T+90',
'churn_points': 'When users typically stop',
'reactivation_rate': 'Dormant users who return'
}
Behavioral Funnel Analysis:
Awareness (85%) → First Attempt (42%) → Successful Completion (31%) → Repetition (18%) → Habit Formation (12%)
Where are users dropping off? Why?
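A sketch of the funnel calculation: given stage counts, step conversion between adjacent stages shows where the largest drop-offs occur (the counts below are illustrative and match the percentages above):
# Sketch of behavioral funnel analysis (counts are illustrative)
total_users = 1000
funnel = [
    ("Awareness", 850),
    ("First Attempt", 420),
    ("Successful Completion", 310),
    ("Repetition", 180),
    ("Habit Formation", 120),
]

for (stage, count), (next_stage, next_count) in zip(funnel, funnel[1:]):
    print(f"{stage} ({count/total_users:.0%}) -> {next_stage} ({next_count/total_users:.0%}): "
          f"{1 - next_count/count:.0%} drop-off at this step")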
Enhance Phase: Iterative Research
Method 8: Behavioral Cohort Analysis
Purpose: Understand how different user segments respond
Analysis Framework:
-- Behavioral Cohort Query Example
SELECT
user_segment,
behavior_version,
AVG(adoption_rate) as avg_adoption,
AVG(retention_30d) as avg_retention,
COUNT(DISTINCT user_id) as cohort_size
FROM behavioral_events
GROUP BY user_segment, behavior_version
ORDER BY avg_adoption DESC
Insights to Extract:
- Which segments struggle most?
- Which interventions work for whom?
- Are there unexpected champion users?
- What predicts long-term success?
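The first two questions can be answered directly from the query output; a sketch with pandas, assuming a dataframe shaped like the result of the cohort query above (segment names and values are made up):
# Sketch: inspect cohort results (dataframe mirrors the SQL output above)
import pandas as pd

cohorts = pd.DataFrame([
    {"user_segment": "new analysts", "behavior_version": "v2", "avg_adoption": 0.61, "avg_retention": 0.44, "cohort_size": 310},
    {"user_segment": "power users",  "behavior_version": "v2", "avg_adoption": 0.83, "avg_retention": 0.71, "cohort_size": 120},
    {"user_segment": "new analysts", "behavior_version": "v1", "avg_adoption": 0.38, "avg_retention": 0.25, "cohort_size": 290},
])

# Which segments struggle most? Lowest adoption on the latest version
latest = cohorts[cohorts["behavior_version"] == "v2"]
print(latest.sort_values("avg_adoption").head(1))

# Which interventions work for whom? Adoption by segment and version
print(cohorts.pivot_table(index="user_segment", columns="behavior_version", values="avg_adoption"))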
Mixed Methods Research
Combining Qualitative and Quantitative
Research Stack:
layer_1_exploratory:
methods: ["interviews", "observation", "diary studies"]
purpose: "Discover behaviors and patterns"
sample_size: "10-20 users"
output: "Behavioral hypotheses"
layer_2_validation:
methods: ["surveys", "analytics", "experiments"]
purpose: "Validate at scale"
sample_size: "100-1000 users"
output: "Quantified behaviors"
layer_3_optimization:
methods: ["A/B tests", "cohort analysis", "ML models"]
purpose: "Optimize interventions"
sample_size: "1000+ users"
output: "Refined solutions"
Research Ethics and Bias
Stopping rules: Stop exploratory sessions after 3 consecutive sessions that surface no new codes. For detecting a small effect (d = 0.3) in an A/B test at α = 0.05 and 0.8 power, plan on roughly 175 participants per group (≈350 total).
Consent: “We will observe your behavior related to X for Y minutes, record only these fields, and anonymize your data. Participation is voluntary and you can stop anytime.”
Ethical Considerations
- Informed Consent
  - Explain behavioral observation
  - Allow opt-out without penalty
  - Protect vulnerable populations
- Privacy Protection
  - Anonymize behavioral data
  - Secure storage protocols
  - Clear data retention policies
- Avoiding Harm
  - Don’t enable harmful behaviors
  - Consider long-term impacts
  - Provide support resources
Common Research Biases
Bias | Description | Mitigation |
---|---|---|
Social Desirability | People report idealized behaviors | Observe actual behavior |
Hawthorne Effect | Behavior changes when observed | Extended observation periods |
Selection Bias | Wrong user sample | Behavioral segmentation |
Recency Bias | Overweight recent events | Longitudinal tracking |
Confirmation Bias | See what we expect | Structured protocols |
Tools and Resources
Research Toolkit
- Observation Tools
  - Behavior coding sheets
  - Time-stamped note apps
  - Screen/video recording (with consent)
- Analysis Software
  - Qualitative: NVivo, ATLAS.ti
  - Quantitative: R, Python, SPSS
  - Mixed: Dedoose, MAXQDA
- Behavioral Tracking
  - Analytics: Mixpanel, Amplitude
  - Heatmaps: Hotjar, FullStory
  - Custom: Build your own
Sample Size Calculations
from statsmodels.stats.power import tt_ind_solve_power
import numpy as np
def calculate_sample_size(effect_size=0.5, power=0.8, significance=0.05):
"""
Calculate sample size for behavior change detection
effect_size: Expected behavior change (Cohen's d)
power: Statistical power (typically 0.8)
significance: Alpha level (typically 0.05)
"""
n = tt_ind_solve_power(
effect_size=effect_size,
power=power,
alpha=significance,
ratio=1
)
return {
'per_group': int(np.ceil(n)),
'total': int(np.ceil(n * 2)),
        'note': f"detect a {effect_size} effect with {power:.0%} power"
}
# Example
sample = calculate_sample_size(effect_size=0.3) # Small effect
print(f"Need {sample['total']} users to {sample['note']}")
Research Templates
Behavioral Research Plan Template
```