Entity Stress Pipeline
Fundamental credit stress scoring for every company in financial_kg.companies — grounded in
CRISIL's published default and transition rates, FY2015–2025.
What this pipeline does
The pipeline reads every document in financial_kg.companies, extracts CRISIL rating signals
from crisilRatings[] and bankFacilities[], maps them to empirical
probability-of-default values from CRISIL's own published tables, and outputs a single composite
stress score (0–100) per entity.
The output CSV is sorted highest-stress first, making it immediately usable for watchlist monitoring, knowledge-graph edge weighting, or counterparty risk dashboards.
MongoDB fields consumed
| Field | Type | Used For |
|---|---|---|
| crisilRatings[].rating | String | Primary LT / ST rating — maps to CDR table PD |
| crisilRatings[].outlook | String | Outlook multiplier (Stable/Negative/Positive/Watch) |
| bankFacilities[].rating | String | Per-facility LT rating for exposure-weighted PD |
| bankFacilities[].amount | Number | ₹ Cr exposure weight for each facility |
| companyCode, crisilName, nseSymbol | String | Identity columns in output CSV |
| industryCode, industryName | String | Sector metadata in output CSV |
| listingStatus, ratingDate | String | Listing flag and rating staleness indicator |
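As a rough illustration of how these fields might be pulled out of one document, the sketch below works on a plain dictionary shaped like a `financial_kg.companies` record. The function name `extract_signals`, the sample company, and its amounts are hypothetical and only serve the example; the real pipeline may handle missing fields differently.

```python
def extract_signals(doc):
    """Collect the rating and facility fields listed in the table above."""
    ratings = doc.get("crisilRatings") or []
    facilities = doc.get("bankFacilities") or []
    return {
        "companyCode": doc.get("companyCode"),
        "crisilName": doc.get("crisilName"),
        "nseSymbol": doc.get("nseSymbol"),
        # (rating, outlook) pairs exactly as stored in the document.
        "ratings": [(r.get("rating"), r.get("outlook")) for r in ratings],
        # Assumption: a facility is usable only if it has a rating
        # and a numeric amount (in Rs Cr).
        "facilities": [
            (f["rating"], float(f["amount"]))
            for f in facilities
            if f.get("rating") and isinstance(f.get("amount"), (int, float))
        ],
    }


# Hypothetical document, for illustration only.
sample = {
    "companyCode": "X001",
    "crisilName": "Example Co Limited",
    "nseSymbol": "EXAMPLE",
    "crisilRatings": [
        {"rating": "Crisil AAA", "outlook": "Stable"},
        {"rating": "Crisil A1+", "outlook": "Stable"},
    ],
    "bankFacilities": [
        {"rating": "Crisil AAA", "amount": 500},
        {"rating": None, "amount": 120},  # unrated facility, skipped
    ],
}

signals = extract_signals(sample)
print(signals["ratings"])
print(signals["facilities"])
```

Working on plain dictionaries keeps the parsing step testable without a live MongoDB connection.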
Embedded CRISIL FY2025 tables
All PD and transition values are lifted directly from the CRISIL publication. No third-party estimates are used.
Table 1 — Long-Term 1-Year CDRs (FY15-25, Monthly Static Pools)
| Rating | Published 1-yr CDR | Notes |
|---|---|---|
| Crisil AAA | 0.00% | No AAA instrument has defaulted since inception (1987) |
| Crisil AA | 0.05% | 3 defaults since FY2020 — 2 pandemic, 1 operational glitch |
| Crisil A | 0.07% | — |
| Crisil BBB | 0.46% | Investment grade boundary; stability rate >91% |
| Crisil BB | 2.86% | Sub-investment grade begins here |
| Crisil B | 8.40% | — |
| Crisil C | 24.98% | Near-default category |
| Crisil D | 100.00% | Default by definition |
Modifier notches (+/−) are linearly interpolated between published anchors. E.g. AA+ = 0.03%, AA− = 0.08%, BBB− = 1.10%, BB+ = 1.90%.
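A minimal sketch of how Table 1 could be embedded as a lookup, assuming PDs are stored as fractions and ratings arrive with a "Crisil " prefix. The names `LT_PD`, `clean_rating`, and `lt_pd` are hypothetical. The sketch only tabulates the anchors and the four notch values quoted above; it does not reproduce the interpolation itself, so other notches return `None`.

```python
# Published 1-year CDRs from Table 1, as fractions (0.0005 == 0.05%).
LT_PD = {
    "AAA": 0.0000, "AA": 0.0005, "A": 0.0007, "BBB": 0.0046,
    "BB": 0.0286, "B": 0.0840, "C": 0.2498, "D": 1.0000,
    # Notch values quoted in the text; remaining notches are not listed here.
    "AA+": 0.0003, "AA-": 0.0008, "BBB-": 0.0110, "BB+": 0.0190,
}


def clean_rating(raw):
    """Drop the agency prefix and normalise the minus sign: 'Crisil AA−' -> 'AA-'."""
    text = raw.strip()
    if text.lower().startswith("crisil "):
        text = text[len("crisil "):]
    return text.replace("\u2212", "-").strip().upper()


def lt_pd(raw):
    """Return the 1-year PD as a fraction, or None if the rating is not tabulated."""
    return LT_PD.get(clean_rating(raw))


print(lt_pd("Crisil BBB"))   # 0.0046
print(lt_pd("Crisil XYZ"))   # None
```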
Table 3 — Short-Term 1-Year CDRs (FY15-25)
| ST Rating | 1-yr CDR |
|---|---|
| Crisil A1+ | 0.02% |
| Crisil A1 | 0.01% |
| Crisil A2 | 0.23% |
| Crisil A3 | 0.43% |
| Crisil A4 | 4.72% |
Table 2 — 1-Year Downgrade Probability
Derived from the transition matrix as: 1 − stability_rate − upgrade_rate
| Rating | Downgrade Prob. | Derivation |
|---|---|---|
| AAA | 0.00% | Cannot downgrade further |
| AA | 1.75% | 1 − 95.96% stable − 2.28% upgrade |
| A | 3.04% | 1 − 93.14% stable − 3.82% upgrade |
| BBB | 4.64% | Sum of BB+B+C+D transition columns |
| BB | 6.57% | Sum of B+C+D transition columns |
| B | 8.80% | B→C (0.40%) + B→D (8.40%) |
| C | 46.77% | 1 − 53.23% stable − 0% upgrades |
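The derivation column can be checked with a few lines of arithmetic. The sketch below recomputes the A, B, and C rows from the rates quoted in the table; the helper name `downgrade_prob` is hypothetical.

```python
def downgrade_prob(stability, upgrade=0.0):
    """1 - stability_rate - upgrade_rate, with all rates as fractions."""
    return 1.0 - stability - upgrade


a = downgrade_prob(0.9314, 0.0382)   # A: 93.14% stable, 3.82% upgrade
c = downgrade_prob(0.5323)           # C: 53.23% stable, no upgrades
b = 0.0040 + 0.0840                  # B: B->C plus B->D transition columns

print(f"A: {a:.2%}  B: {b:.2%}  C: {c:.2%}")
# A: 3.04%  B: 8.80%  C: 46.77%
```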
Outlook Multipliers
| Outlook | Multiplier | Effect |
|---|---|---|
| Positive / Watch Positive | 0.80× | Reduces stress — improving credit trajectory |
| Stable | 1.00× | No adjustment |
| Developing | 1.15× | Slight stress — uncertain direction |
| Under Watch / Credit Watch | 1.20–1.25× | Moderate stress — under review |
| Negative | 1.30× | Higher stress — deteriorating trajectory |
| Watch Negative | 1.50× | Significant stress — imminent review |
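One way to apply these multipliers is a dictionary keyed by a lower-cased outlook string, as sketched below. Two things here are assumptions rather than documented behaviour: unknown outlooks fall back to 1.00×, and the adjusted PD is capped at 100%. The "Under Watch / Credit Watch" row is left out because the table gives a range rather than a single value. `OUTLOOK_MULTIPLIER` and `adjusted_pd` are hypothetical names.

```python
OUTLOOK_MULTIPLIER = {
    "positive": 0.80,
    "watch positive": 0.80,
    "stable": 1.00,
    "developing": 1.15,
    "negative": 1.30,
    "watch negative": 1.50,
}


def adjusted_pd(pd, outlook):
    """Scale a PD fraction by the outlook multiplier."""
    key = (outlook or "").strip().lower()
    # Assumption: missing or unrecognised outlooks get no adjustment.
    multiplier = OUTLOOK_MULTIPLIER.get(key, 1.00)
    # Assumption: cap at 1.0 so the adjusted PD never exceeds 100%.
    return min(pd * multiplier, 1.0)


print(adjusted_pd(0.0046, "Negative"))        # BBB with a Negative outlook
print(adjusted_pd(1.0, "Watch Negative"))     # D stays at 1.0 under the cap
```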
How the stress score is computed
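Four Weighted Components
| Component | Signal | Weight |
|---|---|---|
| C1 | Primary PD | 35% |
| C2 | Exposure-Weighted PD | 30% |
| C3 | Downgrade Probability | 20% |
| C4 | Heterogeneity | 15% |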
Normalisation
Each component is mapped to [0, 1] before weighting:
- C1, C2 (PDs): Log-linear, ln(1 + 99·PD) / ln(100). AAA→0.0, D→1.0. The log scale captures the non-linear nature of credit risk.
- C3 (Downgrade prob): Linear, min(dp / 0.25, 1.0). Saturates at 25%, so any higher downgrade probability (e.g. C at 46.77%) maps to 1.0.
- C4 (Heterogeneity): Linear, min(stdev / 0.50, 1.0). Max theoretical stdev across an AAA–D mix ≈ 0.50.
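Composite Formula
stressScore = 0.35·C1 + 0.30·C2 + 0.20·C3 + 0.15·C4, with each component expressed on the 0–100 scale.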
Fallback rules:
· No LT rating → C1 uses adjusted EW-PD, or ST PD as proxy
· No rating at all → Score = 50.0, Label = "Unknown"
· Unknown PD in norm() → returns 0.30 (neutral-cautious default)
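The normalisation and fallback rules above can be sketched as follows, using the 35 / 30 / 20 / 15 weights documented for C1 to C4. The function names and the `None`-for-missing convention are hypothetical, and treating a missing downgrade probability or stdev as 0.0 is an assumption made only for this example.

```python
import math

WEIGHTS = (0.35, 0.30, 0.20, 0.15)  # C1, C2, C3, C4


def norm_pd(pd):
    """Log-linear map of a PD fraction to [0, 1]; unknown PD returns 0.30."""
    if pd is None:
        return 0.30
    return math.log(1 + 99 * pd) / math.log(100)


def norm_downgrade(dp):
    return min(dp / 0.25, 1.0)


def norm_heterogeneity(stdev):
    return min(stdev / 0.50, 1.0)


def stress_score(lt_pd, ew_pd, st_pd, downgrade, stdev):
    """Weighted sum of the four normalised components, scaled to 0-100."""
    if lt_pd is None and ew_pd is None and st_pd is None:
        return 50.0  # no rating at all: labelled "Unknown"
    # Fallback: with no LT rating, C1 uses the EW-PD, then the ST PD.
    primary = next(p for p in (lt_pd, ew_pd, st_pd) if p is not None)
    components = (
        norm_pd(primary),
        norm_pd(ew_pd),
        # Assumption: missing downgrade probability / stdev count as 0.0.
        norm_downgrade(downgrade or 0.0),
        norm_heterogeneity(stdev or 0.0),
    )
    total = sum(w * c for w, c in zip(WEIGHTS, components))
    return round(100 * total, 2)


# All-AAA entity: every component is zero, so the score is 0.0.
print(stress_score(0.0, 0.0, 0.0002, 0.0, 0.0))
# No rating anywhere: the fixed fallback score.
print(stress_score(None, None, None, None, None))
```

Keeping the three `norm_*` helpers separate makes it easy to inspect each component before weighting.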
Stress Labels
Risk Tier Classification
| Tier | Rating Categories |
|---|---|
| Investment Grade | AAA, AA+, AA, AA−, A+, A, A−, BBB+, BBB, BBB− / A1+, A1, A2+, A2, A2− |
| Sub-Investment Grade | BB+, BB, BB−, B+, B, B− / A3+, A3, A3−, A4+, A4, A4− |
| Near Default | C |
| Default | D |
| Unrated | No recognised CRISIL rating found in document |
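The tier table translates directly into set membership checks. The sketch below assumes the rating has already been cleaned to a bare symbol with an ASCII minus sign; `risk_tier` and the two set names are hypothetical.

```python
INVESTMENT_GRADE = {
    "AAA", "AA+", "AA", "AA-", "A+", "A", "A-", "BBB+", "BBB", "BBB-",
    "A1+", "A1", "A2+", "A2", "A2-",
}
SUB_INVESTMENT_GRADE = {
    "BB+", "BB", "BB-", "B+", "B", "B-",
    "A3+", "A3", "A3-", "A4+", "A4", "A4-",
}


def risk_tier(rating):
    """Map a cleaned LT or ST rating symbol to its risk tier."""
    if rating in INVESTMENT_GRADE:
        return "Investment Grade"
    if rating in SUB_INVESTMENT_GRADE:
        return "Sub-Investment Grade"
    if rating == "C":
        return "Near Default"
    if rating == "D":
        return "Default"
    return "Unrated"  # no recognised CRISIL rating


print(risk_tier("BBB-"))  # Investment Grade
print(risk_tier("A3"))    # Sub-Investment Grade
```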
entity_stress_scores.csv — full column reference
One row per entity, sorted descending by stressScore.
Identity
| Column | Description |
|---|---|
| companyCode | Internal MongoDB identifier |
| crisilName | Company name (crisilName or mcaName fallback) |
| nseSymbol | NSE ticker symbol |
| industryCode / industryName | CRISIL sector classification |
| listingStatus | Listed / Unlisted |
| ratingDate | Date of latest CRISIL rating action (staleness indicator) |
Rating & PD
| Column | Description |
|---|---|
| primaryLT_Rating | Cleaned LT rating from crisilRatings[] (e.g. AAA, BBB+) |
| primaryLT_Outlook | Normalised outlook key (stable, negative, positive…) |
| primaryST_Rating | Short-term rating if present (e.g. A1+) |
| primaryLT_PD_% | 1-yr CDR for primary LT rating (CRISIL Table 1) |
| primaryST_PD_% | 1-yr CDR for primary ST rating (CRISIL Table 3) |
| outlookMultiplier | Outlook adjustment factor applied to PD |
| adjustedLT_PD_% | primaryLT_PD × outlookMultiplier |
Facility Exposure
| Column | Description |
|---|---|
| numBankFacilities | Total count of bankFacilities[] entries |
| numRatedFacilities | Count of facilities with a recognised LT rating |
| totalExposure_Cr | Sum of facility amounts (₹ Cr) for rated facilities |
| ewPD_% | Exposure-weighted average PD across all rated facilities |
| adjustedEW_PD_% | ewPD × outlookMultiplier |
| ewGrade | Exposure-weighted numeric grade (1=AAA … 18=D) |
| ratingHeterogeneity | Std dev of facility PDs × 100 (higher = more spread) |
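A possible way to compute `ewPD_%` and `ratingHeterogeneity` from the rated facilities is sketched below. It assumes an unweighted population standard deviation, which is consistent with the stated ≈ 0.50 maximum for an AAA–D mix but is not confirmed by the source. Values are kept as fractions here; the CSV columns scale them by 100. Function names and the sample facilities are hypothetical.

```python
from statistics import pstdev


def exposure_weighted_pd(facilities):
    """facilities: list of (pd_fraction, amount_cr) for rated facilities only."""
    total = sum(amount for _, amount in facilities)
    if total <= 0:
        return None  # nothing to weight by
    return sum(pd * amount for pd, amount in facilities) / total


def rating_heterogeneity(facilities):
    """Population std dev of facility PDs (assumption: not exposure-weighted)."""
    pds = [pd for pd, _ in facilities]
    return pstdev(pds) if len(pds) > 1 else 0.0


# Hypothetical mix: Rs 300 Cr rated AAA and Rs 100 Cr rated BBB.
mixed = [(0.0, 300.0), (0.0046, 100.0)]
ew = exposure_weighted_pd(mixed)
het = rating_heterogeneity(mixed)
print(f"EW-PD = {ew:.4%}, heterogeneity = {het:.4f}")
```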
Score Components & Final Output
| Column | Description |
|---|---|
| 1yr_DowngradeProbab_% | P(≥1-notch LT downgrade in 1yr) from CRISIL Table 2 |
| comp_PrimaryPD | C1 component (0–100), pre-weighting |
| comp_ExposureWeightedPD | C2 component (0–100), pre-weighting |
| comp_DowngradeRisk | C3 component (0–100), pre-weighting |
| comp_Heterogeneity | C4 component (0–100), pre-weighting |
| stressScore | Composite score 0–100 (sorted descending) |
| stressLabel | Minimal / Low / Moderate / Elevated / High / Severe / Unknown |
| riskTier | Investment Grade / Sub-Investment Grade / Near Default / Default / Unrated |
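The output step can be approximated with the standard `csv` module: sort the rows by `stressScore` in descending order, then write them with a fixed header. The sketch below uses only four of the documented columns and two hypothetical rows; `write_scores` is not a name taken from the pipeline.

```python
import csv
import io

COLUMNS = ["companyCode", "stressScore", "stressLabel", "riskTier"]


def write_scores(rows, handle):
    """Write rows to an open text handle, highest stress first."""
    ordered = sorted(rows, key=lambda r: r["stressScore"], reverse=True)
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(ordered)


# Hypothetical rows, for illustration only.
rows = [
    {"companyCode": "X001", "stressScore": 0.0,
     "stressLabel": "Minimal", "riskTier": "Investment Grade"},
    {"companyCode": "X002", "stressScore": 50.0,
     "stressLabel": "Unknown", "riskTier": "Unrated"},
]

buffer = io.StringIO(newline="")
write_scores(rows, buffer)
lines = buffer.getvalue().splitlines()
print(lines[1])  # the X002 row comes first
```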
Running the pipeline
Prerequisites
pip install pymongo
Basic run (all documents)
python entity_stress_pipeline.py \
  --uri "mongodb://localhost:27017" \
  --db financial_kg \
  --col companies \
  --out entity_stress_scores.csv
Test run (10 documents)
python entity_stress_pipeline.py --limit 10 --out test_output.csv
MongoDB Atlas
python entity_stress_pipeline.py \
  --uri "mongodb+srv://user:<password>@<cluster-host>/"
CLI Arguments
| Argument | Default | Description |
|---|---|---|
| --uri | mongodb://localhost:27017 | MongoDB connection URI |
| --db | financial_kg | Database name |
| --col | companies | Collection name |
| --out | entity_stress_scores.csv | Output CSV file path |
| --limit | 0 (all) | Max documents to process; 0 = all |
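The argument table maps onto `argparse` in a straightforward way. The sketch below mirrors the documented flags and defaults; how the actual script defines its parser is not shown in this README, so `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Entity stress pipeline")
    parser.add_argument("--uri", default="mongodb://localhost:27017",
                        help="MongoDB connection URI")
    parser.add_argument("--db", default="financial_kg", help="Database name")
    parser.add_argument("--col", default="companies", help="Collection name")
    parser.add_argument("--out", default="entity_stress_scores.csv",
                        help="Output CSV file path")
    parser.add_argument("--limit", type=int, default=0,
                        help="Max documents to process; 0 = all")
    return parser


# Equivalent to the test run: --limit 10 --out test_output.csv
args = build_parser().parse_args(["--limit", "10", "--out", "test_output.csv"])
print(args.uri, args.db, args.col, args.out, args.limit)
```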
Console Output Example
# After processing completes:
✅ 1247 entities written → entity_stress_scores.csv (errors: 0)
Stress Label Distribution:
Minimal   312 ████████████
Low       418 ████████████████
Moderate  289 ███████████
Elevated  142 █████
High       63 ██
Severe     23 █
Stress Score min=0.0 max=68.4 mean=14.2 median=11.8
Indian Oil Corporation Limited (INDOIL)
IOC holds Crisil AAA/Stable on its Term Loan and Crisil A1+ on its Short-Term Loan, and all 36 bank facilities are rated AAA/Stable.
Step 1 — Parse crisilRatings[]
| Instrument | Rating | Outlook |
|---|---|---|
| Term Loan | Crisil AAA | Stable |
| Short Term Loan | Crisil A1+ | Stable |
- Primary LT = AAA → PD = 0.00%
- Primary ST = A1+ → PD = 0.02%
- Outlook = Stable → multiplier = 1.00×
- Adjusted LT PD = 0.00% × 1.00 = 0.00%
- 1-yr downgrade probability (AAA) = 0.00%
Step 2 — Exposure-Weighted PD from bankFacilities[]
All 36 facilities rated AAA → every facility PD = 0.00%.
- EW-PD = 0.00% | Adjusted EW-PD = 0.00%
- Rating heterogeneity (std dev) = 0.0000 — fully uniform
Step 3 — Normalise & Composite
| Component | Raw Value | Normalised (0–100) | Weight | Contribution |
|---|---|---|---|---|
| C1 Primary PD | 0.00% | 0.00 | 35% | 0.00 |
| C2 Exposure-Weighted PD | 0.00% | 0.00 | 30% | 0.00 |
| C3 Downgrade Probability | 0.00% | 0.00 | 20% | 0.00 |
| C4 Heterogeneity | 0.0000 | 0.00 | 15% | 0.00 |
Stress Score = 0.00 (sum of contributions) | Stress Label = Minimal | Risk Tier = Investment Grade
Design decisions & known limitations
Why log-linear normalisation for PD?
Credit risk is non-linear. The economic gap between BBB (0.46%) and B (8.40%) is far larger than the raw arithmetic difference suggests. A log transform compresses the upper tail and spreads the lower range, giving better resolution to investment-grade distinctions while capturing sub-IG severity accurately.
Why 35 / 30 / 20 / 15 weights?
The primary LT rating (C1) is CRISIL's authoritative credit opinion, so it anchors the score. Facility exposure weighting (C2) captures lender-level concentration risk. Downgrade probability (C3) adds a forward-looking dimension. Heterogeneity (C4) is a secondary signal for internal rating inconsistency across facilities.
Known Limitations
- Financial statement data (leverage, coverage ratios) is not included — this is a ratings-based pipeline only.
- Ratings may be stale; ratingDate is included in the CSV to flag this.
- Facilities without a recognised CRISIL LT rating contribute to numBankFacilities but not to EW-PD.
- INC (Issuer Not Cooperating) suffix is not extracted; these entities appear as Unrated with score = 50.
- Modifier notch interpolation (AA+, AA−, etc.) is a linear approximation between CRISIL's broad-category anchors.
Extending the Pipeline
- Financial ratio stress: multiply stressScore by a leverage factor derived from mcaPaidupCapital / mcaAuthorizedCapital.
- Industry adjustment: high-default industries (CRISIL Annexure 7) carry an industry beta multiplier.
- Time-to-default overlay: combine with CRISIL Annexure 8 survival curves for 2-year and 3-year variants.
- Graph integration: feed stressScore as edge weights into the financial_kg knowledge graph for counterparty contagion analysis.
Sources
- CRISIL Ratings Annual Default and Ratings Transition Study — FY2025 (Tables 1, 2, 3, 4)
- CRISIL Default Study FY2025 — Annexure 10: Methodology (monthly static pools, marginal default rate method)
- CRISIL Default Study FY2025 — Annexure 7: Industry-wise classification of defaults
- CRISIL Default Study FY2025 — Annexure 8: Analysis of defaults — Time to default
- financial_kg.companies MongoDB collection schema
- pymongo Documentation: pymongo.readthedocs.io