Coverage for src / jquantstats / portfolio.py: 100%

310 statements  

coverage.py v7.13.5, created at 2026-03-26 18:44 +0000

"""Portfolio analytics class for quant finance.

This module provides :class:`Portfolio`, a single frozen dataclass that
stores the raw portfolio inputs (prices, cash positions, AUM) and exposes
both the derived data series and the full analytics / visualisation suite:

- Derived data series: :attr:`profits`, :attr:`profit`, :attr:`nav_accumulated`,
  :attr:`returns`, :attr:`monthly`, :attr:`nav_compounded`, :attr:`highwater`,
  :attr:`drawdown`, :attr:`all`
- Lazy composition accessors: :attr:`stats`, :attr:`plots`, :attr:`report`
- Portfolio transforms: :meth:`truncate`, :meth:`lag`, :meth:`smoothed_holding`
- Attribution: :attr:`tilt`, :attr:`timing`, :attr:`tilt_timing_decomp`
- Turnover analysis: :attr:`turnover`, :attr:`turnover_weekly`, :meth:`turnover_summary`
- Cost analysis: :meth:`cost_adjusted_returns`, :meth:`trading_cost_impact`
- Utility: :meth:`correlation`
"""

import dataclasses
from typing import TYPE_CHECKING, Self

if TYPE_CHECKING:
    from ._stats import Stats as Stats
    from .data import Data as Data

import polars as pl
import polars.selectors as cs

from ._cost_model import CostModel
from ._plots import PortfolioPlots
from ._reports import Report
from .exceptions import (
    IntegerIndexBoundError,
    InvalidCashPositionTypeError,
    InvalidPricesTypeError,
    MissingDateColumnError,
    NonPositiveAumError,
    RowCountMismatchError,
)


@dataclasses.dataclass(frozen=True, slots=True)
class Portfolio:
    """Portfolio analytics class for quant finance.

    Stores the three raw inputs (cash positions, prices, and AUM) and
    exposes the standard derived data series, analytics facades, transforms,
    and attribution tools.

    Derived data series:

    - :attr:`profits`: per-asset daily cash P&L
    - :attr:`profit`: aggregate daily portfolio profit
    - :attr:`nav_accumulated`: cumulative additive NAV
    - :attr:`nav_compounded`: compounded NAV
    - :attr:`returns`: daily returns (profit / AUM)
    - :attr:`monthly`: monthly compounded returns
    - :attr:`highwater`: running high-water mark
    - :attr:`drawdown`: drawdown from high-water mark
    - :attr:`all`: merged view of all derived series

    - Lazy composition accessors: :attr:`stats`, :attr:`plots`, :attr:`report`
    - Portfolio transforms: :meth:`truncate`, :meth:`lag`,
      :meth:`smoothed_holding`
    - Attribution: :attr:`tilt`, :attr:`timing`, :attr:`tilt_timing_decomp`
    - Turnover: :attr:`turnover`, :attr:`turnover_weekly`,
      :meth:`turnover_summary`
    - Cost analysis: :meth:`cost_adjusted_returns`,
      :meth:`trading_cost_impact`
    - Utility: :meth:`correlation`

    Attributes:
        cashposition: Polars DataFrame of positions per asset over time
            (includes date column if present).
        prices: Polars DataFrame of prices per asset over time (includes date
            column if present).
        aum: Assets under management used as base NAV offset.
        cost_per_unit: One-way trading cost per unit of position change
            (Model A below). Defaults to 0.0 (no cost).
        cost_bps: One-way trading cost in basis points of AUM turnover.
            Defaults to 0.0 (no cost).

    Analytics facades
    -----------------
    - ``.stats``  : delegates to the legacy ``Stats`` pipeline via ``.data``; all 50+ metrics available.
    - ``.plots``  : portfolio-specific ``Plots``; NAV overlays, lead-lag IR, rolling Sharpe/vol, heatmaps.
    - ``.report`` : HTML ``Report``; self-contained portfolio performance report.
    - ``.data``   : bridge to the legacy ``Data`` / ``Stats`` / ``DataPlots`` pipeline.

    ``.plots`` and ``.report`` are intentionally *not* delegated to the legacy path: the legacy
    path operates on a bare returns series, while the analytics path has access to raw prices,
    positions, and AUM for richer portfolio-specific visualisations.

    Cost models
    -----------
    Two independent cost models are provided. They are not interchangeable:

    **Model A: position-delta (stateful, set at construction):**
        ``cost_per_unit: float``: one-way cost per unit of position change (e.g. 0.01 per share).
        Used by ``.position_delta_costs`` and ``.net_cost_nav``.
        Best for: equity portfolios where cost scales with shares traded.

    **Model B: turnover-bps (stateless, passed at call time):**
        ``cost_bps: float``: one-way cost in basis points of AUM turnover (e.g. 5 bps).
        Used by ``.cost_adjusted_returns(cost_bps)`` and ``.trading_cost_impact(max_bps)``.
        Best for: macro / fund-of-funds portfolios where cost scales with notional traded.

    To sweep a range of cost assumptions use ``trading_cost_impact(max_bps=20)`` (Model B).
    To compute a net-NAV curve set ``cost_per_unit`` at construction and read ``.net_cost_nav`` (Model A).

    Date column requirement
    -----------------------
    Most analytics work with or without a ``date`` column. The following features require a
    temporal ``date`` column (``pl.Date`` or ``pl.Datetime``):

    - ``portfolio.plots.correlation_heatmap()``
    - ``portfolio.plots.lead_lag_ir_plot()``
    - ``stats.monthly_win_rate()``: returns NaN per column when no date is present
    - ``stats.annual_breakdown()``: raises ``ValueError`` when no date is present
    - ``stats.max_drawdown_duration()``: returns period count (int) instead of days

    Portfolios without a ``date`` column (integer-indexed) are fully supported for
    NAV, returns, Sharpe, drawdown, cost analytics, and most rolling metrics.

    Examples:
        >>> import polars as pl
        >>> from datetime import date
        >>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
        >>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
        >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
        >>> pf.assets
        ['A']
    """

    cashposition: pl.DataFrame
    prices: pl.DataFrame
    aum: float
    cost_per_unit: float = 0.0
    cost_bps: float = 0.0
    _data_bridge: "Data | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _stats_cache: "Stats | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _plots_cache: "PortfolioPlots | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)
    _report_cache: "Report | None" = dataclasses.field(init=False, repr=False, compare=False, hash=False)

    @staticmethod
    def _build_data_bridge(ret: pl.DataFrame) -> "Data":
        """Build a :class:`~jquantstats._data.Data` bridge from a returns frame.

        Splits out the ``'date'`` column (if present) into an index and passes
        the ``'returns'`` column as returns. Used internally to populate
        ``_data_bridge`` on first access of the ``data`` property, so that
        later accesses are O(1).

        Args:
            ret: Returns DataFrame, optionally with a leading ``'date'`` column.

        Returns:
            A :class:`~jquantstats._data.Data` instance backed by *ret*.
        """
        from .data import Data

        returns_only = ret.select("returns")
        if "date" in ret.columns:
            return Data(returns=returns_only, index=ret.select("date"))
        return Data(returns=returns_only, index=pl.DataFrame({"index": list(range(ret.height))}))

    def __post_init__(self) -> None:
        """Validate input types, shapes, and parameters post-initialization."""
        if not isinstance(self.prices, pl.DataFrame):
            raise InvalidPricesTypeError(type(self.prices).__name__)
        if not isinstance(self.cashposition, pl.DataFrame):
            raise InvalidCashPositionTypeError(type(self.cashposition).__name__)
        if self.cashposition.shape[0] != self.prices.shape[0]:
            raise RowCountMismatchError(self.prices.shape[0], self.cashposition.shape[0])
        if self.aum <= 0.0:
            raise NonPositiveAumError(self.aum)
        # The dataclass is frozen, so the lazy caches are initialised via object.__setattr__.
        object.__setattr__(self, "_data_bridge", None)
        object.__setattr__(self, "_stats_cache", None)
        object.__setattr__(self, "_plots_cache", None)
        object.__setattr__(self, "_report_cache", None)

    def _date_range(self) -> tuple[int, object, object]:
        """Return (rows, start, end) for the portfolio's returns series.

        ``start`` and ``end`` are ``None`` when there is no ``'date'`` column.
        """
        ret = self.returns
        rows = ret.height
        if "date" in ret.columns:
            return rows, ret["date"].min(), ret["date"].max()
        return rows, None, None

    @property
    def cost_model(self) -> CostModel:
        """Return the active cost model as a :class:`~jquantstats.CostModel` instance.

        Returns:
            A :class:`CostModel` whose ``cost_per_unit`` and ``cost_bps`` fields
            reflect the values stored on this portfolio.
        """
        return CostModel(cost_per_unit=self.cost_per_unit, cost_bps=self.cost_bps)

    def __repr__(self) -> str:
        """Return a string representation of the Portfolio object."""
        rows, start, end = self._date_range()
        if start is not None:
            return f"Portfolio(assets={self.assets}, rows={rows}, start={start}, end={end})"
        return f"Portfolio(assets={self.assets}, rows={rows})"

    def describe(self) -> pl.DataFrame:
        """Return a tidy summary of shape, date range and asset names.

        Returns:
            pl.DataFrame: One row per asset with columns: asset, start, end, rows.

        Examples:
            >>> import polars as pl
            >>> from datetime import date
            >>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
            >>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
            >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
            >>> df = pf.describe()
            >>> list(df.columns)
            ['asset', 'start', 'end', 'rows']
        """
        rows, start, end = self._date_range()
        return pl.DataFrame(
            {
                "asset": self.assets,
                "start": [start] * len(self.assets),
                "end": [end] * len(self.assets),
                "rows": [rows] * len(self.assets),
            }
        )

    # ── Factory classmethods ──────────────────────────────────────────────────

    @classmethod
    def from_risk_position(
        cls,
        prices: pl.DataFrame,
        risk_position: pl.DataFrame,
        aum: float,
        vola: int | dict[str, int] = 32,
        vol_cap: float | None = None,
        cost_per_unit: float = 0.0,
        cost_bps: float = 0.0,
        cost_model: CostModel | None = None,
    ) -> Self:
        """Create a Portfolio from per-asset risk positions.

        De-volatilizes each risk position using an EWMA volatility estimate
        derived from the corresponding price series.

        Args:
            prices: Price levels per asset over time (may include a date column).
            risk_position: Risk units per asset aligned with prices.
            aum: Assets under management used as the base NAV offset.
            vola: EWMA lookback (span-equivalent) used to estimate volatility.
                Pass an ``int`` to apply the same span to every asset, or a
                ``dict[str, int]`` to set a per-asset span (assets absent from
                the dict default to ``32``). Every span value must be a
                positive integer; a ``ValueError`` is raised otherwise. Dict
                keys that do not correspond to any numeric column in *prices*
                also raise a ``ValueError``.
            vol_cap: Optional lower bound for the EWMA volatility estimate.
                When provided, the vol series is clipped from below at this
                value before dividing the risk position, preventing
                position blow-up in calm, low-volatility regimes. For
                example, ``vol_cap=0.05`` ensures the vol estimate is never
                below 5%. Must be positive when not ``None``.
            cost_per_unit: One-way trading cost per unit of position change.
                Defaults to 0.0 (no cost). Ignored when *cost_model* is given.
            cost_bps: One-way trading cost in basis points of AUM turnover.
                Defaults to 0.0 (no cost). Ignored when *cost_model* is given.
            cost_model: Optional :class:`~jquantstats.CostModel`
                instance. When supplied, its ``cost_per_unit`` and
                ``cost_bps`` values take precedence over the individual
                parameters above.

        Returns:
            A Portfolio instance whose cash positions are risk_position
            divided by EWMA volatility.

        Raises:
            ValueError: If any span value in *vola* is ≤ 0, or if a key in a
                *vola* dict does not match any numeric column in *prices*, or
                if *vol_cap* is provided but is not positive.
        """
        if cost_model is not None:
            cost_per_unit = cost_model.cost_per_unit
            cost_bps = cost_model.cost_bps
        assets = [col for col, dtype in prices.schema.items() if dtype.is_numeric()]

        # ── Validate vol_cap ──────────────────────────────────────────────────
        if vol_cap is not None and vol_cap <= 0:
            raise ValueError(f"vol_cap must be a positive number when provided, got {vol_cap!r}")  # noqa: TRY003

        # ── Validate vola ─────────────────────────────────────────────────────
        if isinstance(vola, dict):
            unknown = set(vola.keys()) - set(assets)
            if unknown:
                raise ValueError(  # noqa: TRY003
                    f"vola dict contains keys that do not match any numeric column in prices: {sorted(unknown)}"
                )
            for asset, span in vola.items():
                if int(span) <= 0:
                    raise ValueError(f"vola span for '{asset}' must be a positive integer, got {span!r}")  # noqa: TRY003
        else:
            if int(vola) <= 0:
                raise ValueError(f"vola span must be a positive integer, got {vola!r}")  # noqa: TRY003

        def _span(asset: str) -> int:
            """Return the EWMA span for *asset*, falling back to 32 if not specified."""
            if isinstance(vola, dict):
                return int(vola.get(asset, 32))
            return int(vola)

        def _vol(asset: str) -> pl.Series:
            """Return the EWMA volatility series for *asset*, optionally clipped from below."""
            vol = prices[asset].pct_change().ewm_std(com=_span(asset) - 1, adjust=True, min_samples=_span(asset))
            if vol_cap is not None:
                vol = vol.clip(lower_bound=vol_cap)
            return vol

        cash_position = risk_position.with_columns((pl.col(asset) / _vol(asset)).alias(asset) for asset in assets)
        return cls(prices=prices, cashposition=cash_position, aum=aum, cost_per_unit=cost_per_unit, cost_bps=cost_bps)

    @classmethod
    def from_position(
        cls,
        prices: pl.DataFrame,
        position: pl.DataFrame,
        aum: float,
        cost_per_unit: float = 0.0,
        cost_bps: float = 0.0,
        cost_model: CostModel | None = None,
    ) -> Self:
        """Create a Portfolio from share/unit positions.

        Converts *position* (number of units held per asset) to cash exposure
        by multiplying element-wise with *prices*, then delegates to
        :py:meth:`from_cash_position`.

        Args:
            prices: Price levels per asset over time (may include a date column).
            position: Number of units held per asset over time, aligned with
                *prices*. Non-numeric columns (e.g. ``'date'``) are passed
                through unchanged.
            aum: Assets under management used as the base NAV offset.
            cost_per_unit: One-way trading cost per unit of position change.
                Defaults to 0.0 (no cost). Ignored when *cost_model* is given.
            cost_bps: One-way trading cost in basis points of AUM turnover.
                Defaults to 0.0 (no cost). Ignored when *cost_model* is given.
            cost_model: Optional :class:`~jquantstats.CostModel` instance.
                When supplied, its ``cost_per_unit`` and ``cost_bps`` values
                take precedence over the individual parameters above.

        Returns:
            A Portfolio instance whose cash positions equal *position* x *prices*.

        Examples:
            >>> import polars as pl
            >>> prices = pl.DataFrame({"A": [100.0, 110.0, 105.0]})
            >>> pos = pl.DataFrame({"A": [10.0, 10.0, 10.0]})
            >>> pf = Portfolio.from_position(prices=prices, position=pos, aum=1e6)
            >>> pf.cashposition["A"].to_list()
            [1000.0, 1100.0, 1050.0]
        """
        assets = [col for col, dtype in prices.schema.items() if dtype.is_numeric()]
        cash_position = position.with_columns((pl.col(asset) * prices[asset]).alias(asset) for asset in assets)
        return cls.from_cash_position(
            prices=prices,
            cash_position=cash_position,
            aum=aum,
            cost_per_unit=cost_per_unit,
            cost_bps=cost_bps,
            cost_model=cost_model,
        )

    @classmethod
    def from_cash_position(
        cls,
        prices: pl.DataFrame,
        cash_position: pl.DataFrame,
        aum: float,
        cost_per_unit: float = 0.0,
        cost_bps: float = 0.0,
        cost_model: CostModel | None = None,
    ) -> Self:
        """Create a Portfolio directly from cash positions aligned with prices.

        Args:
            prices: Price levels per asset over time (may include a date column).
            cash_position: Cash exposure per asset over time.
            aum: Assets under management used as the base NAV offset.
            cost_per_unit: One-way trading cost per unit of position change.
                Defaults to 0.0 (no cost). Ignored when *cost_model* is given.
            cost_bps: One-way trading cost in basis points of AUM turnover.
                Defaults to 0.0 (no cost). Ignored when *cost_model* is given.
            cost_model: Optional :class:`~jquantstats.CostModel`
                instance. When supplied, its ``cost_per_unit`` and
                ``cost_bps`` values take precedence over the individual
                parameters above.

        Returns:
            A Portfolio instance with the provided cash positions.
        """
        if cost_model is not None:
            cost_per_unit = cost_model.cost_per_unit
            cost_bps = cost_model.cost_bps
        return cls(prices=prices, cashposition=cash_position, aum=aum, cost_per_unit=cost_per_unit, cost_bps=cost_bps)

    # ── Internal helpers ───────────────────────────────────────────────────────

    @staticmethod
    def _assert_clean_series(series: pl.Series, name: str = "") -> None:
        """Raise ValueError if *series* contains nulls or non-finite values."""
        if series.null_count() != 0:
            raise ValueError
        if not series.is_finite().all():
            raise ValueError

    # ── Core data properties ───────────────────────────────────────────────────

    @property
    def assets(self) -> list[str]:
        """List the asset column names from prices (numeric columns).

        Returns:
            list[str]: Names of numeric columns in prices; typically excludes
            ``'date'``.
        """
        return [c for c in self.prices.columns if self.prices[c].dtype.is_numeric()]

    @property
    def profits(self) -> pl.DataFrame:
        """Compute per-asset daily cash profits, preserving non-numeric columns.

        Each asset's profit is its percentage price change over the period
        multiplied by the cash position held at the previous step. Null and
        non-finite results are replaced with zero.

        Returns:
            pl.DataFrame: Per-asset daily profit series along with any
            non-numeric columns (e.g., ``'date'``).

        Examples:
            >>> import polars as pl
            >>> prices = pl.DataFrame({"A": [100.0, 110.0, 105.0]})
            >>> pos = pl.DataFrame({"A": [1000.0, 1000.0, 1000.0]})
            >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
            >>> pf.profits.columns
            ['A']
        """
        assets = [c for c in self.prices.columns if self.prices[c].dtype.is_numeric()]

        # Price return over the period times the position held at the previous step.
        result = self.prices.with_columns(
            (self.prices[asset].pct_change().fill_null(0.0) * self.cashposition[asset].shift(n=1).fill_null(0.0)).alias(
                asset
            )
            for asset in assets
        )

        # Replace non-finite values (e.g. from a zero price) with zero.
        if assets:
            result = result.with_columns(
                pl.when(pl.col(c).is_finite()).then(pl.col(c)).otherwise(0.0).fill_null(0.0).alias(c) for c in assets
            )
        return result
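The profit rule used by `profits` can be sketched in plain Python, without Polars. This is only an illustration under simplifying assumptions: the helper name `per_asset_profit` is hypothetical and not part of the module, and it assumes non-zero prices, whereas the property itself replaces non-finite results with zero.

```python
def per_asset_profit(prices: list[float], cash_position: list[float]) -> list[float]:
    """Profit per period: price return times the previous step's cash position."""
    profits = [0.0]  # first step has no previous price or position
    for t in range(1, len(prices)):
        pct_change = prices[t] / prices[t - 1] - 1.0  # assumes prices[t - 1] != 0
        profits.append(pct_change * cash_position[t - 1])
    return profits


example_profits = per_asset_profit([100.0, 110.0, 99.0], [1000.0, 1000.0, 1000.0])
```

With a constant cash position of 1000, a +10% price move yields a profit of about 100 and a -10% move a loss of about 100.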

    @property
    def profit(self) -> pl.DataFrame:
        """Return total daily portfolio profit including the ``'date'`` column.

        Aggregates per-asset profits into a single ``'profit'`` column and
        validates that no day's total profit is NaN/null.

        Raises:
            ValueError: If there are no numeric asset columns, or if the
                aggregated profit contains null or non-finite values.
        """
        df_profits = self.profits
        assets = [c for c in df_profits.columns if df_profits[c].dtype.is_numeric()]

        if not assets:
            raise ValueError

        non_assets = [c for c in df_profits.columns if c not in set(assets)]

        portfolio_daily_profit = pl.sum_horizontal([pl.col(c).fill_null(0.0) for c in assets]).alias("profit")
        result = df_profits.select([*non_assets, portfolio_daily_profit])

        self._assert_clean_series(series=result["profit"])
        return result

    @property
    def nav_accumulated(self) -> pl.DataFrame:
        """Compute cumulative additive NAV of the portfolio, preserving ``'date'``."""
        return self.profit.with_columns((pl.col("profit").cum_sum() + self.aum).alias("NAV_accumulated"))

    @property
    def returns(self) -> pl.DataFrame:
        """Return daily returns as profit scaled by AUM, preserving ``'date'``.

        The returned DataFrame keeps the ``'date'`` column (when present)
        together with ``'profit'`` and ``'NAV_accumulated'``, and adds a
        ``'returns'`` column equal to ``profit / aum`` (i.e., per-period
        returns) for downstream consumers.
        """
        return self.nav_accumulated.with_columns(
            (pl.col("profit") / self.aum).alias("returns"),
        )

    @property
    def monthly(self) -> pl.DataFrame:
        """Return monthly compounded returns and calendar columns.

        Aggregates daily returns (profit/AUM) by calendar month and computes
        the compounded monthly return: prod(1 + r_d) - 1. The resulting frame
        includes:

        - ``date``: label of the monthly grouping window as a Polars Date
          (the left edge of the window, because the aggregation uses
          ``label="left"``)
        - ``returns``: compounded monthly return
        - ``NAV_accumulated``: last NAV within the month
        - ``profit``: summed profit within the month
        - ``year``: integer year (e.g., 2020)
        - ``month``: integer month number (1-12)
        - ``month_name``: abbreviated month name (e.g., ``"Jan"``, ``"Feb"``)

        Raises:
            MissingDateColumnError: If the portfolio data has no ``'date'``
                column.
        """
        if "date" not in self.prices.columns:
            raise MissingDateColumnError("monthly")
        daily = self.returns.select(["date", "returns", "profit", "NAV_accumulated"])
        monthly = (
            daily.group_by_dynamic(
                "date",
                every="1mo",
                period="1mo",
                label="left",
                closed="right",
            )
            .agg(
                [
                    pl.col("profit").sum().alias("profit"),
                    pl.col("NAV_accumulated").last().alias("NAV_accumulated"),
                    (pl.col("returns") + 1.0).product().alias("gross"),
                ]
            )
            .with_columns((pl.col("gross") - 1.0).alias("returns"))
            .select(["date", "returns", "NAV_accumulated", "profit"])
            .with_columns(
                [
                    pl.col("date").dt.year().alias("year"),
                    pl.col("date").dt.month().alias("month"),
                    pl.col("date").dt.strftime("%b").alias("month_name"),
                ]
            )
            .sort("date")
        )
        return monthly
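The compounding formula prod(1 + r_d) - 1 used by `monthly` can be illustrated with a small standard-library sketch. The helper `compound_by_month` is hypothetical, and it groups strictly by calendar `(year, month)`; in the property itself, the `group_by_dynamic` window settings decide which window a boundary date falls into, so results may differ for dates on a window edge.

```python
from collections import defaultdict
from datetime import date


def compound_by_month(days: list[date], returns: list[float]) -> dict[tuple[int, int], float]:
    """Compound daily returns within each (year, month): prod(1 + r_d) - 1."""
    gross: dict[tuple[int, int], float] = defaultdict(lambda: 1.0)
    for day, r in zip(days, returns):
        gross[(day.year, day.month)] *= 1.0 + r  # accumulate the gross return
    return {key: g - 1.0 for key, g in gross.items()}


monthly_returns = compound_by_month(
    [date(2020, 1, 2), date(2020, 1, 3), date(2020, 2, 3)],
    [0.01, 0.02, -0.01],
)
```

Here January compounds two daily returns (1.01 x 1.02 - 1), while February contains a single day and so equals that day's return.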

    @property
    def nav_compounded(self) -> pl.DataFrame:
        """Compute compounded NAV from returns (profit/AUM), preserving ``'date'``."""
        return self.returns.with_columns(((pl.col("returns") + 1.0).cum_prod() * self.aum).alias("NAV_compounded"))

    @property
    def highwater(self) -> pl.DataFrame:
        """Return the cumulative maximum of NAV as the high-water mark series.

        The resulting DataFrame preserves the ``'date'`` column and adds a
        ``'highwater'`` column computed as the cumulative maximum of
        ``'NAV_accumulated'``.
        """
        return self.returns.with_columns(pl.col("NAV_accumulated").cum_max().alias("highwater"))

    @property
    def drawdown(self) -> pl.DataFrame:
        """Return drawdown as the distance from high-water mark to current NAV.

        Computes ``'drawdown'`` = ``'highwater'`` - ``'NAV_accumulated'`` and
        ``'drawdown_pct'`` = ``'drawdown'`` / ``'highwater'``, and preserves
        the ``'date'`` column alongside the intermediate columns.
        """
        return self.highwater.with_columns(
            (pl.col("highwater") - pl.col("NAV_accumulated")).alias("drawdown"),
            ((pl.col("highwater") - pl.col("NAV_accumulated")) / pl.col("highwater")).alias("drawdown_pct"),
        )
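As a rough illustration of how `highwater` and `drawdown` relate, the same quantities can be computed over a plain list of NAV values. The function `drawdown_series` is a hypothetical helper for this sketch, not part of the module, and it assumes a strictly positive NAV so the percentage is well defined.

```python
def drawdown_series(nav: list[float]) -> tuple[list[float], list[float], list[float]]:
    """Return (highwater, drawdown, drawdown_pct) for a NAV series."""
    highwater: list[float] = []
    drawdown: list[float] = []
    drawdown_pct: list[float] = []
    peak = float("-inf")
    for value in nav:
        peak = max(peak, value)  # running maximum of NAV so far
        highwater.append(peak)
        drawdown.append(peak - value)
        drawdown_pct.append((peak - value) / peak)  # assumes peak > 0
    return highwater, drawdown, drawdown_pct


hw, dd, dd_pct = drawdown_series([100.0, 110.0, 99.0, 120.0])
```

In this example the NAV falls from a peak of 110 to 99, a drawdown of 11 (10% of the peak), and the drawdown returns to zero once a new high of 120 is reached.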

    @property
    def all(self) -> pl.DataFrame:
        """Return a merged view of drawdown and compounded NAV.

        When a ``'date'`` column is present the two frames are joined on that
        column to ensure temporal alignment. When the data is integer-indexed
        (no ``'date'`` column) the frames are stacked horizontally; they are
        guaranteed to have identical row counts because both are derived from
        the same source portfolio.
        """
        left = self.drawdown
        if "date" in left.columns:
            right = self.nav_compounded.select(["date", "NAV_compounded"])
            return left.join(right, on="date", how="inner")
        else:
            right = self.nav_compounded.select(["NAV_compounded"])
            return left.hstack(right)

    # ── Lazy composition accessors ─────────────────────────────────────────────

    @property
    def data(self) -> "Data":
        """Build a legacy :class:`~jquantstats._data.Data` object from this portfolio's returns.

        This bridges the two entry points: ``Portfolio`` compiles the NAV curve from
        prices and positions; the returned :class:`~jquantstats._data.Data` object
        gives access to the full legacy analytics pipeline (``data.stats``,
        ``data.plots``, ``data.reports``).

        Returns:
            :class:`~jquantstats._data.Data`: A Data object whose ``returns`` column
            is the portfolio's daily return series and whose ``index`` holds the date
            column (or a synthetic integer index for date-free portfolios).

        Examples:
            >>> import polars as pl
            >>> from datetime import date
            >>> prices = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [100.0, 110.0]})
            >>> pos = pl.DataFrame({"date": [date(2020, 1, 1), date(2020, 1, 2)], "A": [1000.0, 1000.0]})
            >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e6)
            >>> d = pf.data
            >>> "returns" in d.returns.columns
            True
        """
        if self._data_bridge is not None:
            return self._data_bridge
        bridge = Portfolio._build_data_bridge(self.returns)
        object.__setattr__(self, "_data_bridge", bridge)
        return bridge

    @property
    def stats(self) -> "Stats":
        """Return a Stats object built from the portfolio's daily returns.

        Delegates to the legacy :class:`~jquantstats._stats.Stats` pipeline via
        :attr:`data`, so all analytics (Sharpe, drawdown, summary, etc.) are
        available through the shared implementation.

        The result is cached after first access so repeated calls are O(1).
        """
        if self._stats_cache is None:
            object.__setattr__(self, "_stats_cache", self.data.stats)
        return self._stats_cache  # type: ignore[return-value]

    @property
    def plots(self) -> PortfolioPlots:
        """Convenience accessor returning a PortfolioPlots facade for this portfolio.

        Use this to create Plotly visualizations such as snapshots, lagged
        performance curves, and lead/lag IR charts.

        The result is cached after first access so repeated calls are O(1).

        Returns:
            :class:`~jquantstats._plots.PortfolioPlots`: Helper object with
            plotting methods.
        """
        if self._plots_cache is None:
            object.__setattr__(self, "_plots_cache", PortfolioPlots(self))
        return self._plots_cache  # type: ignore[return-value]

    @property
    def report(self) -> Report:
        """Convenience accessor returning a Report facade for this portfolio.

        Use this to generate a self-contained HTML performance report
        containing statistics tables and interactive charts.

        The result is cached after first access so repeated calls are O(1).

        Returns:
            :class:`~jquantstats._reports.Report`: Helper object with
            report methods.
        """
        if self._report_cache is None:
            object.__setattr__(self, "_report_cache", Report(self))
        return self._report_cache  # type: ignore[return-value]

    # ── Portfolio transforms ───────────────────────────────────────────────────

    def truncate(self, start: object = None, end: object = None) -> "Portfolio":
        """Return a new Portfolio truncated to the inclusive [start, end] range.

        When a ``'date'`` column is present in both prices and cash positions,
        truncation is performed by comparing the ``'date'`` column against
        ``start`` and ``end`` (which should be date/datetime values or strings
        parseable by Polars).

        When the ``'date'`` column is absent, integer-based row slicing is
        used instead. In this case ``start`` and ``end`` must be non-negative
        integers representing 0-based row indices. Passing non-integer bounds
        to an integer-indexed portfolio raises :exc:`TypeError`.

        In all cases the ``aum`` value is preserved.

        Args:
            start: Optional lower bound (inclusive). A date/datetime or
                Polars-parseable string when a ``'date'`` column exists; a
                non-negative int row index when the data has no ``'date'``
                column.
            end: Optional upper bound (inclusive). Same type rules as
                ``start``.

        Returns:
            A new Portfolio instance with prices and cash positions filtered
            to the specified range.

        Raises:
            TypeError: When the portfolio has no ``'date'`` column and a
                non-integer bound is supplied.
        """
        has_date = "date" in self.prices.columns
        if has_date:
            cond = pl.lit(True)
            if start is not None:
                cond = cond & (pl.col("date") >= pl.lit(start))
            if end is not None:
                cond = cond & (pl.col("date") <= pl.lit(end))
            pr = self.prices.filter(cond)
            cp = self.cashposition.filter(cond)
        else:
            if start is not None and not isinstance(start, int):
                raise IntegerIndexBoundError("start", type(start).__name__)
            if end is not None and not isinstance(end, int):
                raise IntegerIndexBoundError("end", type(end).__name__)
            row_start = int(start) if start is not None else 0
            row_end = int(end) + 1 if end is not None else self.prices.height
            length = max(0, row_end - row_start)
            pr = self.prices.slice(row_start, length)
            cp = self.cashposition.slice(row_start, length)
        return Portfolio(
            prices=pr,
            cashposition=cp,
            aum=self.aum,
            cost_per_unit=self.cost_per_unit,
            cost_bps=self.cost_bps,
        )

738 def lag(self, n: int) -> "Portfolio": 

739 """Return a new Portfolio with cash positions lagged by ``n`` steps. 

740 

741 This method shifts the numeric asset columns in the cashposition 

742 DataFrame by ``n`` rows, preserving the ``'date'`` column and any 

743 non-numeric columns unchanged. Positive ``n`` delays weights (moves 

744 them down); negative ``n`` leads them (moves them up); ``n == 0`` 

745 returns the current portfolio unchanged. 

746 

747 Notes: 

748 Missing values introduced by the shift are left as nulls; 

749 downstream profit computation already guards and treats nulls as 

750 zero when multiplying by returns. 

751 

752 Args: 

753 n: Number of rows to shift (can be negative, zero, or positive). 

754 

755 Returns: 

756 A new Portfolio instance with lagged cash positions and the same 

757 prices/AUM as the original. 

758 """ 

759 if not isinstance(n, int): 

760 raise TypeError(f"n must be an int, got {type(n).__name__}") 

761 if n == 0: 

762 return self 

763 

764 assets = [c for c in self.cashposition.columns if c != "date" and self.cashposition[c].dtype.is_numeric()] 

765 cp_lagged = self.cashposition.with_columns(pl.col(c).shift(n) for c in assets) 

766 return Portfolio( 

767 prices=self.prices, 

768 cashposition=cp_lagged, 

769 aum=self.aum, 

770 cost_per_unit=self.cost_per_unit, 

771 cost_bps=self.cost_bps, 

772 ) 

773 

774 def smoothed_holding(self, n: int) -> "Portfolio": 

775 """Return a new Portfolio with cash positions smoothed by a rolling mean. 

776 

777 Applies a trailing rolling mean over the current and the previous 

778 ``n`` steps for each numeric asset column (excluding ``'date'``). 

779 The window length is therefore ``n + 1``, so that: 

780 

781 - n=0 returns the original weights (no smoothing), 

782 - n=1 averages the current and previous weights, 

783 - n=k averages the current and last k weights. 

784 

785 Args: 

786 n: Non-negative integer specifying how many previous steps to 

787 include. 

788 

789 Returns: 

790 A new Portfolio with smoothed cash positions and the same 

791 prices/AUM. 

792 """ 

793 if not isinstance(n, int): 

794 raise TypeError(f"n must be an int, got {type(n).__name__}") 

795 if n < 0: 

796 raise ValueError(f"n must be non-negative, got {n}") 

797 if n == 0: 

798 return self 

799 

800 assets = [c for c in self.cashposition.columns if c != "date" and self.cashposition[c].dtype.is_numeric()] 

801 window = n + 1 

802 cp_smoothed = self.cashposition.with_columns( 

803 pl.col(c).rolling_mean(window_size=window, min_samples=1).alias(c) for c in assets 

804 ) 

805 return Portfolio( 

806 prices=self.prices, 

807 cashposition=cp_smoothed, 

808 aum=self.aum, 

809 cost_per_unit=self.cost_per_unit, 

810 cost_bps=self.cost_bps, 

811 ) 

812 

813 # ── Attribution ──────────────────────────────────────────────────────────── 

814 

815 @property 

816 def tilt(self) -> "Portfolio": 

817 """Return the 'tilt' portfolio with constant average weights. 

818 

819 Computes the time-average of each asset's cash position (ignoring 

820 nulls/NaNs) and builds a new Portfolio with those constant weights 

821 applied across time. Prices and AUM are preserved. 

822 """ 

823 const_position = self.cashposition.with_columns( 

824 pl.col(col).drop_nulls().drop_nans().mean().alias(col) for col in self.assets 

825 ) 

826 return Portfolio.from_cash_position( 

827 self.prices, 

828 const_position, 

829 aum=self.aum, 

830 cost_per_unit=self.cost_per_unit, 

831 cost_bps=self.cost_bps, 

832 ) 

833 

834 @property 

835 def timing(self) -> "Portfolio": 

836 """Return the 'timing' portfolio capturing deviations from the tilt. 

837 

838 Constructs weights as the original cash positions minus the tilt's 

839 constant positions, per asset. This isolates the timing effect, i.e. 

840 the demeaned allocation. Prices and AUM are preserved. 

841 """ 

842 const_position = self.tilt.cashposition 

843 position = self.cashposition.with_columns((pl.col(col) - const_position[col]).alias(col) for col in self.assets) 

844 return Portfolio.from_cash_position( 

845 self.prices, 

846 position, 

847 aum=self.aum, 

848 cost_per_unit=self.cost_per_unit, 

849 cost_bps=self.cost_bps, 

850 ) 

851 

852 @property 

853 def tilt_timing_decomp(self) -> pl.DataFrame: 

854 """Return the portfolio's tilt/timing NAV decomposition. 

855 

856 When a ``'date'`` column is present the three NAV series are joined on 

857 it. When data is integer-indexed the frames are stacked horizontally. 

858 """ 

859 if "date" in self.nav_accumulated.columns: 

860 nav_portfolio = self.nav_accumulated.select(["date", "NAV_accumulated"]) 

861 nav_tilt = self.tilt.nav_accumulated.select(["date", "NAV_accumulated"]) 

862 nav_timing = self.timing.nav_accumulated.select(["date", "NAV_accumulated"]) 

863 

864 merged_df = nav_portfolio.join(nav_tilt, on="date", how="inner", suffix="_tilt").join( 

865 nav_timing, on="date", how="inner", suffix="_timing" 

866 ) 

867 else: 

868 nav_portfolio = self.nav_accumulated.select(["NAV_accumulated"]) 

869 nav_tilt = self.tilt.nav_accumulated.select(["NAV_accumulated"]).rename( 

870 {"NAV_accumulated": "NAV_accumulated_tilt"} 

871 ) 

872 nav_timing = self.timing.nav_accumulated.select(["NAV_accumulated"]).rename( 

873 {"NAV_accumulated": "NAV_accumulated_timing"} 

874 ) 

875 merged_df = nav_portfolio.hstack(nav_tilt).hstack(nav_timing) 

876 

877 merged_df = merged_df.rename( 

878 {"NAV_accumulated_tilt": "tilt", "NAV_accumulated_timing": "timing", "NAV_accumulated": "portfolio"} 

879 ) 

880 return merged_df 

881 

882 # ── Turnover ─────────────────────────────────────────────────────────────── 

883 

884 @property 

885 def turnover(self) -> pl.DataFrame: 

886 """Daily one-way portfolio turnover as a fraction of AUM. 

887 

888 Computes the sum of absolute position changes across all assets for 

889 each period, normalised by AUM. The first row is always zero because 

890 there is no prior position to form a difference against. 

891 

892 Returns: 

893 pl.DataFrame: Frame with an optional ``'date'`` column and a 

894 ``'turnover'`` column (dimensionless fraction of AUM). 

895 

896 Examples: 

897 >>> import polars as pl 

898 >>> from datetime import date 

899 >>> _d = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3)] 

900 >>> prices = pl.DataFrame({"date": _d, "A": [100.0, 110.0, 121.0]}) 

901 >>> pos = pl.DataFrame({"date": prices["date"], "A": [1000.0, 1200.0, 900.0]}) 

902 >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e5) 

903 >>> pf.turnover["turnover"].to_list() 

904 [0.0, 0.002, 0.003] 

905 """ 

906 assets = [c for c in self.cashposition.columns if c != "date" and self.cashposition[c].dtype.is_numeric()] 

907 daily_abs_chg = ( 

908 pl.sum_horizontal(pl.col(c).diff().abs().fill_null(0.0).fill_nan(0.0) for c in assets) / self.aum 

909 ).alias("turnover") 

910 cols: list[str | pl.Expr] = [] 

911 if "date" in self.cashposition.columns: 

912 cols.append("date") 

913 cols.append(daily_abs_chg) 

914 return self.cashposition.select(cols) 

915 

916 @property 

917 def turnover_weekly(self) -> pl.DataFrame: 

918 """Weekly aggregated one-way portfolio turnover as a fraction of AUM. 

919 

920 When a temporal ``'date'`` column is present, sums the daily turnover 

921 within each calendar week (Monday-based ``group_by_dynamic``). Otherwise 

922 a rolling 5-period sum with ``min_samples=5`` is returned (the first 

923 four rows will be ``null``). 

924 

925 Returns: 

926 pl.DataFrame: Frame with an optional ``'date'`` column (week 

927 start) and a ``'turnover'`` column (fraction of AUM, summed over 

928 the week). 

929 """ 

930 daily = self.turnover 

931 if "date" not in daily.columns or not daily["date"].dtype.is_temporal(): 

932 return daily.with_columns(pl.col("turnover").rolling_sum(window_size=5, min_samples=5)) 

933 return daily.group_by_dynamic("date", every="1w").agg(pl.col("turnover").sum()).sort("date") 

934 

935 def turnover_summary(self) -> pl.DataFrame: 

936 """Return a summary DataFrame of turnover statistics. 

937 

938 Computes three metrics from the daily turnover series: 

939 

940 - ``mean_daily_turnover``: mean of daily one-way turnover (fraction 

941 of AUM). 

942 - ``mean_weekly_turnover``: mean of weekly-aggregated turnover 

943 (fraction of AUM). 

944 - ``turnover_std``: standard deviation of daily turnover (fraction of 

945 AUM); complements the mean to detect regime switches. 

946 

947 Returns: 

948 pl.DataFrame: One row per metric with columns ``'metric'`` and 

949 ``'value'``. 

950 

951 Examples: 

952 >>> import polars as pl 

953 >>> from datetime import date, timedelta 

954 >>> import numpy as np 

955 >>> start = date(2020, 1, 1) 

956 >>> dates = pl.date_range(start=start, end=start + timedelta(days=9), interval="1d", eager=True) 

957 >>> prices = pl.DataFrame({"date": dates, "A": pl.Series(np.ones(10) * 100.0)}) 

958 >>> pos = pl.DataFrame({"date": dates, "A": pl.Series([float(i) * 100 for i in range(10)])}) 

959 >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e4) 

960 >>> summary = pf.turnover_summary() 

961 >>> list(summary["metric"]) 

962 ['mean_daily_turnover', 'mean_weekly_turnover', 'turnover_std'] 

963 """ 

964 daily_col = self.turnover["turnover"] 

965 _mean = daily_col.mean() 

966 mean_daily = float(_mean) if isinstance(_mean, (int, float)) else 0.0 

967 _std = daily_col.std() 

968 std_daily = float(_std) if isinstance(_std, (int, float)) else 0.0 

969 weekly_col = self.turnover_weekly["turnover"].drop_nulls() 

970 _weekly_mean = weekly_col.mean() 

971 mean_weekly = ( 

972 float(_weekly_mean) if weekly_col.len() > 0 and isinstance(_weekly_mean, (int, float)) else float("nan") 

973 ) 

974 return pl.DataFrame( 

975 { 

976 "metric": ["mean_daily_turnover", "mean_weekly_turnover", "turnover_std"], 

977 "value": [mean_daily, mean_weekly, std_daily], 

978 } 

979 ) 

980 

981 # ── Cost analysis ────────────────────────────────────────────────────────── 

982 

983 @property 

984 def position_delta_costs(self) -> pl.DataFrame: 

985 """Daily trading cost using the position-delta model. 

986 

987 Computes the per-period cost as:: 

988 

989 cost_t = sum_i( |x_{i,t} - x_{i,t-1}| ) * cost_per_unit 

990 

991 where ``x_{i,t}`` is the cash position in asset *i* at time *t* and 

992 ``cost_per_unit`` is the one-way cost per unit of traded notional. 

993 The first row is always zero because there is no prior position to 

994 form a difference against. 

995 

996 Returns: 

997 pl.DataFrame: Frame with an optional ``'date'`` column and a 

998 ``'cost'`` column (absolute cash cost per period). 

999 

1000 Examples: 

1001 >>> import polars as pl 

1002 >>> from datetime import date 

1003 >>> _d = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3)] 

1004 >>> prices = pl.DataFrame({"date": _d, "A": [100.0, 110.0, 121.0]}) 

1005 >>> pos = pl.DataFrame({"date": _d, "A": [1000.0, 1200.0, 900.0]}) 

1006 >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e5, cost_per_unit=0.01) 

1007 >>> pf.position_delta_costs["cost"].to_list() 

1008 [0.0, 2.0, 3.0] 

1009 """ 

1010 assets = [c for c in self.cashposition.columns if c != "date" and self.cashposition[c].dtype.is_numeric()] 

1011 abs_position_changes = pl.sum_horizontal(pl.col(c).diff().abs().fill_null(0.0).fill_nan(0.0) for c in assets) 

1012 daily_cost = (abs_position_changes * self.cost_per_unit).alias("cost") 

1013 cols: list[str | pl.Expr] = [] 

1014 if "date" in self.cashposition.columns: 

1015 cols.append("date") 

1016 cols.append(daily_cost) 

1017 return self.cashposition.select(cols) 

1018 

1019 @property 

1020 def net_cost_nav(self) -> pl.DataFrame: 

1021 """Net-of-cost cumulative additive NAV using the position-delta cost model. 

1022 

1023 Deducts :attr:`position_delta_costs` from daily portfolio profit and 

1024 computes the running cumulative sum offset by AUM. The result 

1025 represents the realised NAV path a strategy would achieve after paying 

1026 ``cost_per_unit`` on every unit of position change. 

1027 

1028 When ``cost_per_unit`` is zero the result equals :attr:`nav_accumulated`. 

1029 

1030 Returns: 

1031 pl.DataFrame: Frame with an optional ``'date'`` column, 

1032 ``'profit'``, ``'cost'``, and ``'NAV_accumulated_net'`` columns. 

1033 

1034 Examples: 

1035 >>> import polars as pl 

1036 >>> from datetime import date 

1037 >>> _d = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3)] 

1038 >>> prices = pl.DataFrame({"date": _d, "A": [100.0, 110.0, 121.0]}) 

1039 >>> pos = pl.DataFrame({"date": _d, "A": [1000.0, 1200.0, 900.0]}) 

1040 >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e5, cost_per_unit=0.0) 

1041 >>> net = pf.net_cost_nav 

1042 >>> list(net.columns) 

1043 ['date', 'profit', 'cost', 'NAV_accumulated_net'] 

1044 """ 

1045 profit_df = self.profit 

1046 cost_df = self.position_delta_costs 

1047 if "date" in profit_df.columns: 

1048 df = profit_df.join(cost_df, on="date", how="left") 

1049 else: 

1050 df = profit_df.hstack(cost_df.select(["cost"])) 

1051 return df.with_columns(((pl.col("profit") - pl.col("cost")).cum_sum() + self.aum).alias("NAV_accumulated_net")) 

1052 

1053 def cost_adjusted_returns(self, cost_bps: float | None = None) -> pl.DataFrame: 

1054 """Return daily portfolio returns net of estimated one-way trading costs. 

1055 

1056 Trading costs are modelled as a linear function of daily one-way 

1057 turnover: for every unit of AUM traded, the strategy incurs 

1058 ``cost_bps`` basis points (i.e. ``cost_bps / 10_000`` fractional 

1059 cost). The daily cost deduction is therefore:: 

1060 

1061 daily_cost = turnover * (cost_bps / 10_000) 

1062 

1063 where ``turnover`` is the fraction-of-AUM one-way turnover already 

1064 computed by :attr:`turnover`. The deduction is applied to the 

1065 ``returns`` column of :attr:`returns`, leaving all other columns 

1066 (including ``date``) untouched. 

1067 

1068 Args: 

1069 cost_bps: One-way trading cost in basis points per unit of AUM 

1070 traded. Must be non-negative. Defaults to ``self.cost_bps`` 

1071 set at construction time. 

1072 

1073 Returns: 

1074 pl.DataFrame: Same schema as :attr:`returns` but with the 

1075 ``returns`` column reduced by the per-period trading cost. 

1076 

1077 Raises: 

1078 ValueError: If ``cost_bps`` is negative. 

1079 

1080 Examples: 

1081 >>> import polars as pl 

1082 >>> from datetime import date 

1083 >>> _d = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3)] 

1084 >>> prices = pl.DataFrame({"date": _d, "A": [100.0, 110.0, 121.0]}) 

1085 >>> pos = pl.DataFrame({"date": _d, "A": [1000.0, 1200.0, 900.0]}) 

1086 >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e5) 

1087 >>> adj = pf.cost_adjusted_returns(0.0) 

1088 >>> float(adj["returns"][1]) == float(pf.returns["returns"][1]) 

1089 True 

1090 """ 

1091 effective_bps = cost_bps if cost_bps is not None else self.cost_bps 

1092 if effective_bps < 0: 

1093 raise ValueError(f"cost_bps must be non-negative, got {effective_bps}") 

1094 base = self.returns 

1095 daily_cost = self.turnover["turnover"] * (effective_bps / 10_000.0) 

1096 return base.with_columns((pl.col("returns") - daily_cost).alias("returns")) 

1097 

1098 def trading_cost_impact(self, max_bps: int = 20) -> pl.DataFrame: 

1099 """Estimate the impact of trading costs on the Sharpe ratio. 

1100 

1101 Computes the annualised Sharpe ratio of cost-adjusted returns for 

1102 each integer cost level from 0 up to and including ``max_bps`` basis 

1103 points (1 bp = 0.01 %). The result lets you quickly assess at what 

1104 cost level the strategy's edge is eroded. 

1105 

1106 Args: 

1107 max_bps: Maximum one-way trading cost to evaluate, in basis 

1108 points. Defaults to 20 (i.e., evaluates 0, 1, 2, …, 20 

1109 bps). Must be a positive integer. 

1110 

1111 Returns: 

1112 pl.DataFrame: Frame with columns ``'cost_bps'`` (Int64) and 

1113 ``'sharpe'`` (Float64), one row per cost level from 0 to 

1114 ``max_bps`` inclusive. 

1115 

1116 Raises: 

1117 ValueError: If ``max_bps`` is not a positive integer. 

1118 

1119 Examples: 

1120 >>> import polars as pl 

1121 >>> from datetime import date, timedelta 

1122 >>> import numpy as np 

1123 >>> start = date(2020, 1, 1) 

1124 >>> dates = pl.date_range( 

1125 ... start=start, end=start + timedelta(days=99), interval="1d", eager=True 

1126 ... ) 

1127 >>> rng = np.random.default_rng(0) 

1128 >>> prices = pl.DataFrame({ 

1129 ... "date": dates, 

1130 ... "A": pl.Series(np.cumprod(1 + rng.normal(0.001, 0.01, 100)) * 100), 

1131 ... }) 

1132 >>> pos = pl.DataFrame({"date": dates, "A": pl.Series(np.ones(100) * 1000.0)}) 

1133 >>> pf = Portfolio(prices=prices, cashposition=pos, aum=1e5) 

1134 >>> impact = pf.trading_cost_impact(max_bps=5) 

1135 >>> list(impact["cost_bps"]) 

1136 [0, 1, 2, 3, 4, 5] 

1137 """ 

1138 if not isinstance(max_bps, int) or max_bps < 1: 

1139 raise ValueError(f"max_bps must be a positive integer, got {max_bps!r}") 

1140 import numpy as np 

1141 

1142 periods = self.data._periods_per_year # one Data object, outside the loop 

1143 _eps = np.finfo(np.float64).eps 

1144 sqrt_periods = float(np.sqrt(periods)) 

1145 cost_levels = list(range(0, max_bps + 1)) 

1146 

1147 # Extract base returns and turnover once — O(1) allocations regardless of max_bps 

1148 base_rets = self.returns["returns"] 

1149 turnover_s = self.turnover["turnover"] 

1150 

1151 # Build all cost-adjusted return columns in one vectorised DataFrame construction, 

1152 # then compute means and stds in a single aggregate pass (no per-iteration allocation). 

1153 sweep = pl.DataFrame({str(bps): base_rets - turnover_s * (bps / 10_000.0) for bps in cost_levels}) 

1154 means_row = sweep.mean().row(0) 

1155 stds_row = sweep.std(ddof=1).row(0) 

1156 

1157 sharpe_values: list[float] = [] 

1158 for mean_raw, std_raw in zip(means_row, stds_row, strict=False): 

1159 mean_val = 0.0 if mean_raw is None else float(mean_raw) 

1160 if std_raw is None or float(std_raw) <= _eps * max(abs(mean_val), _eps) * 10: 

1161 sharpe_values.append(float("nan")) 

1162 else: 

1163 sharpe_values.append(mean_val / float(std_raw) * sqrt_periods) 

1164 return pl.DataFrame({"cost_bps": pl.Series(cost_levels, dtype=pl.Int64), "sharpe": pl.Series(sharpe_values)}) 

1165 

1166 # ── Utility ──────────────────────────────────────────────────────────────── 

1167 

1168 def correlation(self, frame: pl.DataFrame, name: str = "portfolio") -> pl.DataFrame: 

1169 """Compute a correlation matrix of asset returns plus the portfolio. 

1170 

1171 Computes percentage changes for all floating-point columns in ``frame``, 

1172 appends the portfolio profit series under the provided ``name``, and 

1173 returns the Pearson correlation matrix across all numeric columns (nulls filled with zero). 

1174 

1175 Args: 

1176 frame: A Polars DataFrame containing at least the asset price 

1177 columns (and a date column which will be ignored if 

1178 non-numeric). 

1179 name: The column name to use when adding the portfolio profit 

1180 series to the input frame. 

1181 

1182 Returns: 

1183 A square Polars DataFrame where each cell is the correlation 

1184 between a pair of series (values in [-1, 1]). 

1185 """ 

1186 p = frame.with_columns(cs.by_dtype(pl.Float32, pl.Float64).pct_change()) 

1187 p = p.with_columns(pl.Series(name, self.profit["profit"])) 

1188 corr_matrix = p.select(cs.numeric()).fill_null(0.0).corr() 

1189 return corr_matrix