Macro Variable Regression — Executive Summary

✅

DATA COMPLETE — UPDATED 30 MAR 2026

Aug–Dec 2025 full data loaded — re-exported by date split, 1,813,481 rows across 14 files. Prior 150K row cap resolved. Aug: 245,042 · Sep: 250,599 · Oct: 262,960 · Nov: 259,196 · Dec: 404,144 rows.
Q1 2026 validation complete — Jan–Mar 2026 actuals uploaded and validated. Out-of-sample MAPE: 2.25% (Jan: 2.0% · Feb: 1.9% · Mar: 2.8%). See Page 5 for full results.

Revised: ASP is the primary controllable driver — marketing variable corrected, endogeneity identified

With 11.3M+ transaction rows across 75 months (2020–Mar 2026), ASP is confirmed as the key demand lever (simple R²=27.3%, p<0.0001; price elasticity = −0.901). However, a material data quality issue was identified in the marketing variable: the original 641 spend series captured the entire GL account (~1,095B/month) rather than true A&P activity codes (MK01–MK11, ~97B/month). The old variable was endogenous — larger business months naturally produce both higher sales and higher 641 costs, creating spurious correlation (R²=18.5%, p=0.0002). The correct narrow MK** spend is statistically insignificant at monthly aggregate level (R²=1.4%, p=0.327). Both models are presented: Model C-OLD (ASP + proxy 641, R²=0.658) and Model B-NEW (ASP + Seasonality only, R²=0.625). Macro variables remain excluded. Marketing effect is flagged as a limitation requiring further investigation — lagged spend or campaign-level data needed to establish causal relationship.

NEW — Average Selling Price (ASP) Analysis

ASP — Simple Regression

R²0.2782 (27.8%)

p-value<0.0001

Elasticity−0.901

Direction↑ Price → ↓ Qty

✅ Include

Price Elasticity

Elasticity−0.901

TypeInelastic (|e|<1)

+5% price−4.3% qty

+10% price−8.4% qty

✅ Near-unitary

ASP Trend 2020–2025

2020 avg7,800 VND/unit

2022 avg8,200 VND/unit

2024 avg8,800 VND/unit

2025 avg9,800 VND/unit

+25% in 5 years

Marketing Spend (641) ⚠️ Revised

OLD R² (proxy)0.185 — endogenous

NEW R² (MK**)0.020 — not sig.

OLD scopeFull 641 ~1,095B/mo

CORRECT scopeMK** codes ~97B/mo

⚠️ Endogenous

Multiple Regression Model Comparison — Log Quantity

Model	R²	MAPE	ASP coef	MKT coef	Decision
A — Trend + Seasonality only	0.622	7.04%	—	—	Baseline
B — + ASP	0.626	7.03%	−0.253	—	+0.4pp R²
C — + ASP + Marketing ★	0.666	6.68%	−0.630	+0.346	★ Selected
D — + GDP Growth	0.667	6.63%	−0.616	+0.349	+0.1pp only
E — + Lag QTY	0.689	6.41%	−0.650	+0.351	Overfitting risk

Standardised Beta Coefficients — Impact Ranking (Model C)

Comparable impact: 1 std dev change in X → β std devs change in log-quantity. Largest |β| = strongest driver.

Reading guide (Model C-OLD with proxy 641): COS_12 (seasonality) has the largest absolute impact (−0.456) but is uncontrollable. ASP (price) is #1 controllable lever (β=−0.301). Marketing β=+0.263 shown here but is from the endogenous 641 proxy variable — treat with caution.
⚠️ Revised finding: With the correct MK** spend (~97B/month), marketing beta = −0.064 (not significant, p=0.327, R²=1.4%). ASP remains the only statistically validated controllable driver. A 1 std dev price increase (~1,000 VND/unit) reduces quantity by ~0.30 std devs (~35M units/month).

Simple Regression — One Variable at a Time

CPI YoY %

R²0.0170 (1.7%)

p-value0.2745

F-stat1.2131

95% CI[-22.6M, +6.5M]

MAE53.3M VND

Significant?No (p=0.27)

❌ Exclude

Interest Rate %

R²0.0523 (5.2%)

p-value0.0532

F-stat3.8665

95% CI[-53.7M, +0.4M]

MAE53.0M VND

Significant?Borderline (p=0.053)

⚠️ Exclude

GDP Growth % (Quarterly)

R²0.0888 (8.9%)

p-value0.0110

F-stat6.8230

95% CI[-10.2M, -1.4M]

MAE49.9M VND

Significant?Marginally (p=0.011)

⚠️ Exclude

Model	R²	Adj. R²	F-stat (p)	MAE	Decision
All 3 Combined	16.1%	12.4%	4.34 (p=0.007)	48.6M VND	EXCLUDE — multicollinearity, CPI sign reversal

Model Roadmap

✅ Completed

Base Regression Model

Linear regression on total monthly sales using avg price, trend, seasonality, lag features. R²=0.823 (+TET_FEB dummy), MAPE=4.68% in-sample. Full-data refit on 1,813,481 rows Aug–Dec 2025.

✅ Completed

Macro Variable Testing

Tested CPI, interest rate, GDP at total and product group level with full statistical evidence. All excluded from primary model. Product group table updated to real quantity data — prior amount-based results revised.

✅ Completed

Product Group Deep Dive

Sales aggregated by NHÓM (92.1% item match rate). Macro regression per group — real quantity data reveals 01XXXX (Condensed Milk) as most GDP-sensitive group (R²=18.6%), overturning prior assumption. See Page 2.

✅ Completed

Marketing Spend Variable (641)

Tested Account 641 proxy (R²=17.0%, endogenous) and correct MK** activity codes (~97B/month, R²=1.4%, p=0.327 — not significant). Endogeneity identified and documented. Marketing effect requires campaign-level data to establish causality.

✅ Completed

Average Selling Price Variable

Computed from full transaction dataset. ASP rose from 7,800 to 9,800 VND/unit (+25% over 5 years). Linear R²=27.3%, elasticity=−0.901. Beta=−0.252 in Model B — strongest controllable driver confirmed.

✅ Completed

Aug–Dec 2025 Full Data Re-export

150K row cap resolved. Re-exported by date split — 1,813,481 rows across 14 files. Aug: 245,042 · Sep: 250,599 · Oct: 262,960 · Nov: 259,196 · Dec: 404,144. Model refit on complete 2025 data.

✅ Completed

Q1 2026 Out-of-Sample Validation

Jan–Mar 2026 actuals validated. MAPE=2.25% (Jan: 2.0% · Feb: 1.9% · Mar: 2.8%) vs MA12 baseline 11.18% — model generalises well beyond training period. TET_FEB dummy critical for Feb accuracy. See Page 5.

✅ Completed

TET Seasonality Feature

TET_FEB dummy added after real-data diagnosis — February structurally low every year (Tet stock rundown). Improved in-sample R² from 0.782→0.823, MAPE 5.20%→4.68%. Validation MAPE improved from 6.78%→2.25%.

✅ Completed

Product-Group Level ASP & Elasticity

Real group ASPs computed from Oracle EBS transaction data (Amount÷Qty per NHOM). Separate OLS regressions run for 01/02/04/07. 04XXXX (UHT) confirmed most price-elastic (elasticity=−1.228, +5%→−5.8%) as predicted. 01XXXX (Condensed) elasticity=−0.782. 07XXXX and 02XXXX near price-inelastic — volume driven by seasonality and structural trend respectively. Full results: nhom_model_results.csv.

✅ Completed

Channel-Level Separate Models

Separate OLS regressions (Model F) run for all 6 channels using Oracle AR system SALES_CHANNEL_CODE (10,565 customers, 99.3% match rate). CVS most price-elastic (−1.065), NPP near-inelastic (−0.136). MT and KA show Interest Rate sensitivity. Lagged MK** spend tested at lags 0–3 across all channels — 0/24 tests significant. Full results: channel_model_results.csv, lag_test_results.csv.

Methodology

Model

OLS via scipy.stats.linregress (simple) and sklearn.linear_model.LinearRegression (multiple regression)

Data

72 months (Jan 2020–Dec 2025). Sales: 9.5M transaction rows (normal sales, 2020–Jul 2025) + 1,813,481 rows (Aug 2025–Mar 2026, re-exported full data). Aug–Dec 2025 row cap resolved — complete data loaded. Macro: GSO Vietnam (CPI), State Bank of Vietnam (rate), World Bank/GSO (GDP quarterly → monthly). Q1 2026 macro: CPI 3.63/3.09/2.80%, Rate 4.5%, GDP 6.9% (GSO estimate).

Evaluation Metrics

R², Adjusted R², p-value, F-statistic, 95% Confidence Interval, MAE — at both total and product group level

Exclusion Criteria

Excluded if: R² < 0.15 individually, OR p-value > 0.05, OR combined Adj. R² does not meaningfully improve the primary model

Why this matters: Total-level analysis masks important product-level behaviour. Disaggregating by NHÓM (product group from Oracle ERP item master) reveals that macro sensitivity is more nuanced than a simple premium/staple split — and in some cases directly contradicts that assumption. Condensed Milk (01XXXX), previously assumed to be a macro-immune daily staple, shows significant GDP sensitivity on real quantity data (R²=18.6%, p=0.0002) — the highest single-variable R² of the four core groups. Meanwhile Powdered Milk (02XXXX) is sensitive to all 3 macro variables (best R²=11.2%, Rate), and UHT Liquid Milk (04XXXX) responds to CPI and Interest Rate (best R²=14.0%). Interest Rate emerges as the most consistent driver across 02XXXX and 04XXXX — suggesting that consumer credit conditions and discretionary income matter more than headline inflation for mid-tier dairy.

Macro sensitive (≥1 variable p<0.05)

Highly sensitive (all 3 significant)

Not sensitive (all p>0.05)

Core Dairy — Staple Products (Revised: Real-Data Quantity Regressions)

Product Group	CPI R² (p)	Interest Rate R² (p)	GDP R² (p)	Best R²	Verdict
01XXXX — Condensed Milk	0.016 (0.287) ❌	0.022 (0.217) ❌	0.186 (0.000) ✅	18.6%	⚠️ GDP significant — revised finding
03XXXX — Cream Products	0.020 (0.240)	0.000 (0.870)	0.044 (0.078)	4.4%	❌ No sensitivity
07XXXX — Other Dairy	0.028 (0.157)	0.064 (0.033) ✅	0.036 (0.113)	6.4%	⚠️ Rate only — weak
09XXXX — Cheese	0.001 (0.813)	0.042 (0.083)	0.004 (0.590)	4.2%	❌ No sensitivity
12XXXX — Other SKUs	0.000 (0.918)	0.020 (0.235)	0.024 (0.193)	2.4%	❌ No sensitivity
15BN03	0.006 (0.533)	0.026 (0.174)	0.007 (0.482)	2.6%	❌ No sensitivity

Premium & Discretionary — Revised Sensitivity Profile

Product Group	CPI R² (p)	Interest Rate R² (p)	GDP R² (p)	Best R²	Verdict
02XXXX — Powdered Milk	0.102 (0.006) ✅	0.112 (0.004) ✅	0.064 (0.032) ✅	11.2%	✅ All 3 significant — Rate strongest
04XXXX — UHT Liquid Milk	0.091 (0.010) ✅	0.140 (0.001) ✅	0.017 (0.281) ❌	14.0%	⚠️ CPI + Rate — not GDP
05XXXX — Flavored Milk	0.072 (0.023) ✅	0.001 (0.786)	0.164 (0.000) ✅	16.4%	⚠️ CPI + GDP — discretionary beverage
06XXXX — Yogurt	0.053 (0.051)	0.050 (0.058)	0.087 (0.012) ✅	8.7%	⚠️ GDP only
07PROU — Probiotic Range	0.073 (0.021) ✅	0.008 (0.448)	0.070 (0.025) ✅	7.3%	⚠️ CPI + GDP — premium health segment
08XXXX — Beverages	0.095 (0.008) ✅	0.066 (0.029) ✅	0.034 (0.123)	9.5%	⚠️ CPI + Rate — price sensitive
16XXXX	0.139 (0.001) ✅	0.076 (0.019) ✅	0.089 (0.011) ✅	13.9%	✅ All 3 significant
10XXXX	0.006 (0.523)	0.013 (0.343)	0.148 (0.001) ✅	14.8%	⚠️ GDP only
14XXXX	0.034 (0.122)	0.068 (0.027) ✅	0.069 (0.026) ✅	6.9%	⚠️ Rate + GDP

Small / Niche Groups (Low Volume, Limited Inference)

Product Group	n (months)	Best R²	Note
07SCUS	13	22.4% (p=0.103)	Too few months — not significant
15GC30	18	8.3% (p=0.245)	Limited data — inconclusive
15XXXX	72	1.4%	Negligible volume
18XXXX	13	37.1% (p=0.027) ✅	GDP significant but only 13 months — interpret with caution

Strategic Implication — Revised

⚠️ Data revision note: Previous version of this table used VND amount data from an earlier model run. This version uses real transaction quantity data (units sold, 72 months). Key finding changes: 01XXXX is no longer macro-immune; 04XXXX now shows Rate sensitivity instead of GDP only; 02XXXX GDP drops from 18.4% → 6.4% but all 3 variables remain significant.

Most Macro-Sensitive Groups

01XXXX Condensed Milk — GDP R²=18.6% (p=0.0002), highest single-variable R² of all core groups. Counter-intuitive: a low-cost staple responds to economic growth, likely through volume uplift via distribution expansion and channel stocking in high-GDP years.

04XXXX UHT Liquid Milk — Rate R²=14.0% (p=0.001), CPI also significant. The largest volume group is rate-sensitive: when borrowing costs rise, household budgets tighten and per-capita consumption softens even for mainstream products.

Revised Findings — Prior Assumptions Overturned

02XXXX Powdered Milk — still significant on all 3 variables, but strongest driver is now Interest Rate (R²=11.2%), not GDP (R²=6.4%). Premium infant formula demand is squeezed by credit tightening, not just income cycles.

07XXXX Other Dairy — previously shown as fully inelastic (R²<5.2%). Real data shows Rate significance (R²=6.4%, p=0.033). Low magnitude but directionally consistent with other groups — rate environment has a broad, mild dampening effect across dairy categories.

Product Group Regression — ASP + Seasonality Model (Corrected)

Full OLS regression run per product group (01/02/04/07) matching the overall model structure. Marketing variable corrected: MK** activity codes (~97B/month) used — found not significant in any group. Model B (ASP + Seasonality) selected as honest baseline. Old proxy results shown in parentheses for comparison.

Group	Simple R² (vs log-qty)		Multiple Regression R²		MAPE	ASP coef	ASP +5%→qty	Key driver
Group	ASP R²	Mkt R² (OLD proxy)	Model B (ASP)	Model C (OLD mkt)	MAPE	ASP coef	ASP +5%→qty	Key driver
01 Condensed Milk	0.002 ❌	0.068 ✅	0.347	(0.355)	13.97%	−0.649	−3.2%	Structural Trend (β=+0.47)
02 Powdered/Infant Formula	0.611 ✅	0.134 ✅	0.772	(0.830)	10.61%	−0.376	−1.9%	Trend decline (β=−0.66)
04 UHT Liquid Milk	0.380 ✅	0.122 ✅	0.624	(0.631)	9.79%	−0.823	−4.0%	ASP most price-sensitive of all groups
07 Cultured / Yogurt	0.336 ✅	0.020 ❌	0.773	(0.779)	6.46%	−0.052	−0.3%	Seasonality dominates (COS_12 β=−0.76)
TOTAL (blended)	0.278 ✅	0.185 ✅⚠️	0.625	(0.658)	6.99%	−0.212	−1.1%	ASP + Seasonality + Trend

✅ p<0.05 significant · ❌ not significant · ⚠️ OLD proxy endogenous · Mkt NEW (MK** correct) not significant in any group (all p>0.20)

04 UHT — MOST PRICE SENSITIVE

UHT liquid milk is the volume workhorse (242M units/month avg, ASP ~6,285 VND). Price elasticity = −0.823 — the most responsive group to pricing changes. A 5% price increase reduces volume by 4.0%, a 10% increase by 7.9%. Promotions on UHT have the highest volume leverage of any group.

07 YOGURT — SEASONALITY, NOT PRICE

Cultured/yogurt volume is driven almost entirely by seasonality (COS_12 β=−0.76) and growth trend (TREND β=+0.50). ASP coefficient is near-zero (−0.052) with correct marketing variable — the positive Veblen effect seen with old variable was a confounding artefact. Pricing strategy has minimal volume impact on yogurt.

01 CONDENSED — GDP SENSITIVE (REVISED)

Previously classified as macro-immune (R²<1% on amount data). Real quantity data reverses this: GDP R²=18.6% (p=0.0002) — the highest GDP response of the four modelled groups. Likely mechanism: condensed milk volume correlates with economic activity through distribution intensity — high-GDP years see expanded route-to-market reach and higher fill rates into wet markets and small grocery outlets. Not a consumer income effect but a supply-side/distribution channel effect. Model R²=0.347, MAPE=14.0% — structurally hard to fit; positive trend coefficient (β=+0.47) still dominates.

02 POWDERED — RATE-DRIVEN DECLINE

Powdered/infant formula (02) retains significance on all 3 macro variables, but the strongest driver on real quantity data is Interest Rate (R²=11.2%, p=0.004) — not GDP as previously shown on amount data (GDP drops to R²=6.4%). This makes economic sense: infant formula is a high-unit-cost purchase where household credit conditions matter. Rate hikes in 2022 (4.0%→6.0%) coincided with the sharpest volume contraction in this group. Negative trend coefficient (β=−0.66) confirms structural decline over the 72-month period — demographic trends and fresh/UHT substitution eroding the category regardless of macro environment. The R² drop from OLD model (0.830) to NEW (0.773) confirms the old marketing proxy was inflating fit for this group specifically.

EDA Finding: Sales is trended + seasonal. Marketing spend has the only clear linear signal.

Monthly sales shows a clear upward trend 2020–2022, softer in 2023, partial recovery 2024–2025. Strong seasonal pattern — Q1 (Jan–Feb) consistently lower, Q2 and Q4 peaks. ASP (r=−0.540, p<0.0001) is the strongest linear signal — confirmed primary driver. Marketing Spend OLD proxy (r=+0.41, p=0.0003) appeared significant but was endogenous (full 641 GL account). Correct MK** spend (r=+0.12, p=0.327) is not significant. All macro variables (CPI, GDP, Interest Rate) show weak or no linear signal — see 3B for scatter plots.

Sales Volume Over Time — Jan 2020 to Dec 2025

Monthly Total Sales Volume (M units)

72 months · trend + seasonality clearly visible

Average Sales by Month (Seasonality)

Average across all years — Q1 dip, Q4 peak

Sales vs Marketing Spend — Time Overlay

Dual axis — visual co-movement

Pearson Correlation with Sales Volume

-0.54

ASP (Price)

✅ p<0.0001

+0.42 ⚠️

Mkt OLD (endogenous)

p=0.0002 but spurious

+0.12

Mkt NEW (MK**)

❌ p=0.31 not sig.

-0.30

GDP Growth

⚠️ p=0.011

-0.23

Interest Rate

❌ p=0.053

-0.13

CPI YoY

❌ p=0.275

⚠️ Revised: ASP is the only valid linear signal — Old MKT was endogenous, New MKT is not significant

Scatter plots updated with two additional variables: ASP (average selling price) — confirmed negative linear relationship (r=−0.540, p<0.0001) — and the corrected MK** marketing spend (~97B/month, r=+0.121, p=0.31 ❌ not significant). The old 641 proxy (r=+0.420, p=0.0002 ✅) showed apparent correlation because it co-moves with overall business scale — a spurious endogeneity artifact, not a causal relationship. GDP, Interest Rate, and CPI remain excluded. The negative GDP slope is driven by the Covid-19 outlier (2021 Q3: GDP=−6.17%).

Linearity Checks — Scatter Plots vs Sales Volume (updated)

Sales vs ASP ★ Primary driver

r=−0.540 · Strong negative linear — confirmed ✅ p<0.0001

Sales vs Marketing OLD (641 proxy) ⚠️

r=+0.420 · Appears significant but ENDOGENOUS — spurious correlation

Sales vs Marketing NEW (MK** correct)

r=+0.121 · No significant linear relationship ❌ p=0.31

Sales vs GDP Growth %

r=−0.307 · Negative slope — Covid outlier distorts ⚠️

Sales vs Interest Rate %

r=−0.226 · No clear linear pattern ❌

Sales vs CPI YoY %

r=−0.131 · Random scatter — no relationship ❌

Descriptive Statistics — All Variables (n=72 months)

Variable	Mean	Min	Max	Std Dev	r with Sales	p-value
Sales Volume (M units)	498.7	317.4	618.6	62.0	—	—
ASP (VND/unit) ★	8,487	7,343	10,314	716	−0.540	<0.0001 ✅
Mkt OLD — 641 proxy ⚠️ endogenous	1,095	856	1,433	123	+0.420	0.0002 ⚠️ spurious
Mkt NEW — MK** correct (B VND)	97	0	209	42	+0.121	0.312 ❌
GDP Growth (%)	5.5	−6.2	13.7	3.5	−0.307	0.009 ⚠️
Interest Rate (%)	4.5	4.0	6.0	0.6	−0.226	0.057 ❌
CPI YoY (%)	3.1	0.1	6.4	1.1	−0.131	0.273 ❌

Group	Total Spend 6Y	Share	GDP R²	Interpretation
02 — Powdered Milk	23,442B VND	37%	18.4%	Premium/infant formula — highest sensitivity + highest spend
04 — UHT Liquid Milk	16,509B VND	26%	12.2%	Largest volume — core category, moderate sensitivity
07 — Other Dairy	8,186B VND	13%	3.5%	Moderate spend, low sensitivity
01 — Condensed Milk	3,590B VND	6%	0%	Inelastic staple — low spend, marketing less effective
06 — Yogurt	2,096B VND	3%	8.7%	Discretionary — some macro sensitivity
03 — Cream	1,177B VND	2%	4.4%	Niche, low spend

✅

DATA PIPELINE COMPLETE — 30 MAR 2026

Aug–Dec 2025 full data — re-exported by date split, 1,813,481 rows, 150K cap resolved. Model refit complete. · Q1 2026 validation — Jan–Mar 2026 actuals loaded, MAPE=2.25% out-of-sample. · TET_FEB dummy added — February structural dip captured, R² 0.782→0.823. Remaining items: product-group ASP decomposition · channel-level full models · lagged marketing investigation.

The model explains ~82% of quantity variance — 18% remains unexplained

The updated model (ASP + Seasonality + TET_FEB dummy, trained on full 72-month dataset) achieves R²=0.823, up from 0.666 on the original blended-data Model C. The improvement came from: (1) resolving the Aug–Dec 2025 data cap, (2) adding the TET_FEB dummy capturing the February structural dip, and (3) using complete real transaction quantity data throughout. The remaining ~18% unexplained variance is attributable to product mix shifts, channel distribution changes, and promotional activity — none of which are captured at monthly aggregate level. Model E with lagged quantity reaches R²=0.689 on the old data — the new model at 0.823 already exceeds what that ceiling suggested was achievable without additional feature sources.

This is a deliberate trade-off: the model prioritises interpretability and generalisability over overfitting to the training data. The sections below document the known sources of this unexplained variance and the planned steps to address them.

Improvement	R² impact	Status	Priority
✅ Fix Aug–Dec 2025 data cap	Done	Completed 30 Mar 2026	✅ Done
✅ TET_FEB dummy	+4.1pp R²	0.782 → 0.823	✅ Done
✅ Q1 2026 validation	MAPE 2.25%	Target was <8%	✅ Done
✅ Product-group ASP decomposition	+3–6%	Completed — real group ASPs	✅ Done
✅ Channel-level full model refit	+4–8%	Completed — AR system data used	✅ Done
✅ Lagged marketing investigation	+0%	Completed — 0/24 significant	✅ Done
Competitor price data	+3–10%	External dependency	★ Low

Channel-level models confirm: CVS is most price-sensitive, KA is macro-sensitive, marketing remains insignificant at all lags

Customer-to-channel mapping via AR data (98.8% coverage) enables separate OLS regressions per channel. CVS (Convenience Store) has the highest price elasticity (−0.576, ASP+5%→−2.8% qty) — convenience shoppers are discretionary and price-responsive. KA (Key Account) shows interest rate as top driver (β=−0.477) — institutional buyers are sensitive to credit conditions. NPP (Distributor) is trend-driven with low elasticity (−0.231), consistent with contractual/bulk ordering. Lagged MK** spend (0–3 months) adds at most ΔR²=+0.004 — confirming no detectable advertising effect at monthly aggregate level. HD (School Milk) excluded from modelling — extremely small/lumpy volumes produce unreliable estimates.

Channel Regression Results — Model B (ASP + Seasonality, honest)

Channel	ASP Simple R²	MKT OLD R²	MKT NEW R²	Model B R²	MAPE	ASP coef	ASP +5%→qty	Top driver
NPP — Distributor	0.411 ✅	0.098 ✅	0.026 ❌	0.668	8.64%	−0.231	−1.1%	Trend (β=−0.677)
MT — Modern Trade	0.048 ❌	0.233 ✅	0.002 ❌	0.562	9.11%	−0.381	−1.9%	Interest Rate (β=−0.287)
CVS — Convenience	0.022 ❌	0.098 ✅	0.001 ❌	0.695	7.63%	−0.576	−2.8%	Most price-sensitive
KA — Key Account	0.022 ❌	0.044 ❌	0.001 ❌	0.601	15.54%	−0.508	−2.5%	Interest Rate (β=−0.477)
TOTAL (blended)	0.278 ✅	0.185 ⚠️	0.021 ❌	0.625	6.99%	−0.212	−1.1%	ASP + Seasonality

HD (School Milk) excluded — 65 months, MAPE=830%, elasticity=−10.4 — too small/lumpy for reliable OLS. MKT NEW = correct MK** codes (~97B/month). All channels: MKT NEW not significant (p>0.20).

ASP by Channel Over Time

Average selling price per unit (VND) · KA has highest ASP — premium institutional pricing

Volume by Channel Over Time

Monthly quantity sold (M units) · NPP dominates — bulk distributor volume

Price Elasticity by Channel

ASP coefficient (Model B) · more negative = more price-sensitive

Model Fit (R²) by Channel

How well ASP + Seasonality explains each channel's volume

Simple R² — ASP vs MKT per Channel

Variable explanatory power before adding seasonality controls

Lagged Marketing Investigation — MK** Spend (t, t−1, t−2, t−3)

No lagged effect detected — MK** A&P spend is insignificant at all lag lengths

If advertising drives sales with a delay (awareness → consideration → purchase), we would expect MK** spend from month t−1, t−2, or t−3 to correlate with quantity in month t. Testing all four lags: the best is lag-3 which achieves Simple R²=6.4% (p=0.035) but adds only ΔR²=+0.003 to the full model — economically negligible. The MK** spend at ~97B/month (~0.03% of revenue) is likely too small relative to total business scale and monthly noise to detect statistically. Campaign-level or weekly data would be needed.

Simple R² by Lag — MK** vs Log(Qty)

Lag-3 is borderline significant (p=0.035) but economically negligible

ΔR² Added to Model B by Lag

Incremental R² gain from adding each lagged MKT to ASP + Seasonality

Lag	Simple R²	Full Model R²	ΔR² vs Baseline	MKT coef	MAPE	Verdict
Baseline (ASP only)	—	0.6293	—	—	7.00%	★ Selected
Lag 0 (same month)	0.021 ❌	0.6334	+0.0041	−0.0023	6.90%	Not sig.
Lag 1 (t−1 month)	0.024 ❌	0.6296	+0.0003	+0.0007	6.98%	Not sig.
Lag 2 (t−2 months)	0.054 ⚠️	0.6295	+0.0002	−0.0005	6.98%	Borderline
Lag 3 (t−3 months)	0.064 ⚠️	0.6321	+0.0028	−0.0020	6.92%	Not practical

Sales Volume Regression
Full Model — All Variables Tested

Revised: ASP is the primary controllable driver — marketing variable corrected, endogeneity identified

Product Group Analysis
Macro · ASP · Regression

EDA & Variable
Linearity Checks

EDA Finding: Sales is trended + seasonal. Marketing spend has the only clear linear signal.

⚠️ Revised: ASP is the only valid linear signal — Old MKT was endogenous, New MKT is not significant

Powdered Milk (02) receives 37% of marketing investment — and is the most macro-sensitive group

Limitations &
Future Improvements

The model explains ~82% of quantity variance — 18% remains unexplained

Channel-Level Models
& Marketing Lag Analysis

Channel-level models confirm: CVS is most price-sensitive, KA is macro-sensitive, marketing remains insignificant at all lags

No lagged effect detected — MK** A&P spend is insignificant at all lag lengths

Sales Volume RegressionFull Model — All Variables Tested

Revised: ASP is the primary controllable driver — marketing variable corrected, endogeneity identified

Product Group AnalysisMacro · ASP · Regression

EDA & VariableLinearity Checks

EDA Finding: Sales is trended + seasonal. Marketing spend has the only clear linear signal.

⚠️ Revised: ASP is the only valid linear signal — Old MKT was endogenous, New MKT is not significant

Powdered Milk (02) receives 37% of marketing investment — and is the most macro-sensitive group

Limitations &Future Improvements

The model explains ~82% of quantity variance — 18% remains unexplained

Channel-Level Models& Marketing Lag Analysis

Channel-level models confirm: CVS is most price-sensitive, KA is macro-sensitive, marketing remains insignificant at all lags

No lagged effect detected — MK** A&P spend is insignificant at all lag lengths

Sales Volume Regression
Full Model — All Variables Tested

Product Group Analysis
Macro · ASP · Regression

EDA & Variable
Linearity Checks

Limitations &
Future Improvements

Channel-Level Models
& Marketing Lag Analysis