Systematic Long-Only Equity Selection from Price Paths
Price-only LSTM ranking signal, execution-aware backtest, and long-only portfolio construction.
Price-only dataset · Long-only · Next-day close execution · 0.05% one-way cost · Top-18 / Buffer-30
Research Thesis and Workflow
Hypothesis: price paths contain cross-sectional ranking signals. This deck shows how the signal was built, tested, and converted into a portfolio.
Research Pipeline
Clean the raw price panel
Convert raw daily prices into a consistent research panel, remove invalid observations, control extreme returns, and define the eligible trading universe.
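The cleaning step can be sketched roughly as follows. This is a minimal illustration, not the deck's actual code: the function name, the clipping threshold `clip`, and the `min_history` eligibility rule are all assumptions, since the source only says that invalid observations are removed, extreme returns are controlled, and an eligible universe is defined.

```python
import pandas as pd

def clean_price_panel(prices: pd.DataFrame, clip: float = 0.5, min_history: int = 60):
    """prices: dates x tickers of raw daily closes.

    Returns (clean_prices, daily_returns, eligible_mask).
    `clip` and `min_history` are hypothetical settings chosen for illustration.
    """
    # Treat non-positive prices as invalid observations.
    clean = prices.where(prices > 0)
    # Simple daily returns; a missing price on either side yields NaN.
    rets = clean / clean.shift(1) - 1.0
    # Control extreme returns by clipping them (one possible choice).
    rets = rets.clip(lower=-clip, upper=clip)
    # Eligible on a date: valid price today and enough valid history so far.
    history = clean.notna().cumsum()
    eligible = clean.notna() & (history >= min_history)
    return clean, rets, eligible
```

The eligibility mask here depends only on past and current observations, so it can be reused as a tradable-universe filter without introducing look-ahead.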
Data preparation layer
Engineer price-path signals
Build momentum, reversal, trend, volatility, breakout, and path-quality features using only information available up to each signal date.
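A few of these feature families can be illustrated with backward-looking windows. The sketch below is an assumption-laden example: the feature names and the 5-day and 20-day lookbacks are hypothetical, and the source does not specify the actual definitions. The point it shows is that every value at date t uses only prices up to and including t.

```python
import pandas as pd

def price_path_features(prices: pd.DataFrame) -> dict:
    """prices: dates x tickers of cleaned closes.

    Returns a dict of dates x tickers frames. Window lengths are illustrative.
    """
    rets = prices / prices.shift(1) - 1.0
    return {
        # Momentum: trailing 20-day return.
        "mom_20": prices / prices.shift(20) - 1.0,
        # Short-term reversal: negative of the trailing 5-day return.
        "rev_5": -(prices / prices.shift(5) - 1.0),
        # Volatility: trailing 20-day standard deviation of daily returns.
        "vol_20": rets.rolling(20).std(),
        # Breakout: distance from the trailing 20-day high (0 at a new high).
        "breakout_20": prices / prices.rolling(20).max() - 1.0,
    }
```

A simple way to check for look-ahead is to perturb prices after some date and confirm that features on earlier dates do not change.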
Feature and label layer
Train and compare ranking models
Compare tabular baselines with sequence models, then retain the LSTM score, which became more stable once time-series input normalization was applied.
Alpha model layer
Translate scores into a portfolio
Apply Top-N selection, holding buffer, equal weighting, next-day close execution, transaction costs, benchmark comparison, and robustness checks.
Portfolio and evaluation layer
System Design
Inputs
Cleaned price panel, tradable-universe mask, and precomputed model scores.
Price-only data
Research Engine
Feature construction, execution-aware labels, sequence modeling, and cross-sectional score generation.
Signal generation
Portfolio Layer
Top-18 selection, Buffer-30 retention rule, equal weighting, cost deduction, and benchmark-relative metrics.
Backtest output
Research Iterations
Research path from raw prices to the final Top-18 / Buffer-30 portfolio.
Performance Evolution: From Baseline Signal to Final Portfolio
Sharpe improved through three levers: better signal design, cleaner timing alignment, and tighter portfolio construction.
Sharpe Improvement Ladder
Documented report-ready iterations. Each bar shows the best candidate in that stage, not every experiment attempted during research.
Data and Benchmark Sanity Checks
With no external index provided, I used an internal equal-weight eligible-universe benchmark.
The benchmark is an internal price-panel diagnostic, not an external market index.
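One way to build such an internal benchmark is to average daily returns across the eligible names. The sketch below assumes a one-day lag between observing eligibility and earning the return, which is an illustrative choice and not a detail confirmed by the source; the function name is hypothetical.

```python
import pandas as pd

def equal_weight_benchmark(rets: pd.DataFrame, eligible: pd.DataFrame) -> pd.Series:
    """rets, eligible: dates x tickers (daily returns, boolean membership).

    Membership observed on day d-1 determines who is averaged on day d.
    """
    # Lag the mask by one day so membership is known before the return accrues.
    mask = eligible.shift(1, fill_value=False).astype(bool)
    # Equal-weight average over members; days with no members earn 0.
    return rets.where(mask).mean(axis=1).fillna(0.0)
```

Because the benchmark is derived from the same price panel as the strategy, it works as a diagnostic for the selection effect and not as a stand-in for an investable index.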
Portfolio Construction Explorer
The final improvement came from separating score generation from portfolio construction. Top-N and holding-buffer sweeps showed that LSTM alpha was concentrated in the highest-ranked names.
Top-N / Buffer Sharpe Heatmap
The final Top-18 / Buffer-30 point is highlighted.
Selected Configuration
Sharpe 5.27. The strongest region is around Top-18 / Top-20, indicating that excessive diversification diluted the top-bucket signal.
Holding Buffer Rule
New positions must enter the Top-18 bucket. Existing holdings may remain if they are still ranked inside the broader Top-30 buffer. This reduces unnecessary churn around the selection boundary.
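The rule can be sketched as a small selection function. This is a minimal illustration under two assumptions that the source does not state: the portfolio is always filled back up to Top-N names, and survivors are kept before new entrants are considered. The function names are hypothetical.

```python
def select_with_buffer(scores: dict, held: set, top_n: int = 18, buffer_n: int = 30) -> set:
    """scores: ticker -> model score for one rebalance date; held: current holdings."""
    # Rank tickers from best to worst score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    buffer_zone = set(ranked[:buffer_n])
    # Existing holdings survive while they stay inside the wider buffer.
    portfolio = [name for name in ranked if name in held and name in buffer_zone]
    # New entrants must come from the Top-N bucket, best rank first.
    for name in ranked[:top_n]:
        if len(portfolio) >= top_n:
            break
        if name not in held:
            portfolio.append(name)
    return set(portfolio)

def equal_weights(names: set) -> dict:
    """Long-only equal weights that sum to 1 (empty dict if nothing is selected)."""
    return {name: 1.0 / len(names) for name in names} if names else {}
```

For example, with `top_n=3` and `buffer_n=5`, a holding ranked fifth is kept while a holding ranked seventh is sold, and only the freed slot is filled from the top three.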
Final Performance vs Internal Benchmark
The final strategy is evaluated net of 0.05% one-way transaction costs. It is compared with an internal equal-weight benchmark constructed from the same score-available eligible universe.
The strategy generated higher return and higher risk-adjusted return than the internal benchmark, while maintaining comparable drawdown.
The higher return did not come at the cost of a materially worse maximum drawdown than the internal benchmark.
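The two headline metrics can be computed from a daily net-return series along these lines. The sketch assumes 252 trading days per year and a zero risk-free rate; the source does not state which annualization convention its reported Sharpe uses, and the function names are hypothetical.

```python
import numpy as np
import pandas as pd

def annualized_sharpe(daily: pd.Series, periods: int = 252) -> float:
    """Mean over standard deviation of daily returns, scaled by sqrt(periods)."""
    sd = daily.std(ddof=1)
    if not sd > 0:
        return float("nan")
    return float(np.sqrt(periods) * daily.mean() / sd)

def max_drawdown(daily: pd.Series) -> float:
    """Largest peak-to-trough loss of the compounded equity curve (a value <= 0)."""
    # Start the curve at 1.0 so a loss on the first day counts as a drawdown.
    equity = np.concatenate([[1.0], np.cumprod(1.0 + daily.to_numpy())])
    peak = np.maximum.accumulate(equity)
    return float((equity / peak - 1.0).min())
```

Applying both functions to the strategy and to the benchmark series over the same dates gives a like-for-like comparison of return per unit of risk and of downside.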
Backtest Integrity Controls
The backtest explicitly addresses execution timing, transaction costs, benchmark alignment, and label construction to reduce common implementation errors.
No same-close trading assumption.
Returns start only after execution.
Future return starts from Price[t+1].
Turnover × 0.05% one-way transaction cost.
Non-negative weights; uninvested cash earns 0%.
Pre-threshold eligible universe prevents strategy filter contamination.
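The timing and cost controls above can be sketched as follows. This is an illustration under stated assumptions: the label `horizon` is a hypothetical parameter, weight drift between rebalances is ignored, and the cost is charged on the execution day. None of these details are confirmed by the source.

```python
import pandas as pd

def execution_aware_label(prices: pd.DataFrame, horizon: int = 1) -> pd.DataFrame:
    """Forward return for a signal formed at close t.

    Entry is at Price[t+1] (next-day close), exit is `horizon` days later.
    """
    entry = prices.shift(-1)
    exit_ = prices.shift(-(1 + horizon))
    return exit_ / entry - 1.0

def net_daily_returns(target_weights: pd.DataFrame, rets: pd.DataFrame,
                      one_way_cost: float = 0.0005) -> pd.Series:
    """target_weights: row t holds the weights decided at close t.

    They are traded at close t+1 and first earn the return of day t+2.
    """
    # Weights actually held while each day's return accrues.
    held = target_weights.shift(2).fillna(0.0)
    # Weights in place after each day's closing trade.
    traded = target_weights.shift(1).fillna(0.0)
    # One-way turnover: sum of absolute weight changes at each trade.
    turnover = traded.diff().fillna(0.0).abs().sum(axis=1)
    gross = (held * rets.fillna(0.0)).sum(axis=1)
    return gross - turnover * one_way_cost
```

With this alignment, a position bought from cash pays 0.05% on the day it is executed and contributes no return until the following day, so no return is credited at the signal close.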
Robustness and Rejected Variants
Robustness checks were used to test whether the final portfolio choice was stable and whether more complex alternatives genuinely improved the signal. The conclusion from the report and code is that the simple single-LSTM portfolio remained the strongest and most interpretable choice.
Local Edge Check
Sharpe 5.27. This is the selected final portfolio. Nearby cells stay around or above Sharpe 5, which suggests the result is not a single isolated optimum.
Rows represent the concentration level (Top-N holdings) and columns represent the holding buffer used to reduce churn. The chosen Top-18 / Buffer-30 point sits inside a stable high-Sharpe neighborhood.
Seed Bagging Check
Five-seed rank averaging reduced performance because weaker seeds diluted the strongest single-LSTM signal. The final selection therefore uses the best single LSTM rather than a broad seed ensemble.
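Rank averaging across seeds can be sketched as below. The helper name and the use of percentile ranks are assumptions made for illustration; the source only states that the ranks from five seeds were averaged.

```python
import pandas as pd

def rank_average(score_frames: list) -> pd.DataFrame:
    """score_frames: one dates x tickers score frame per seed.

    Each seed's scores become cross-sectional percentile ranks per date,
    and the ranks are then averaged across seeds.
    """
    ranks = [scores.rank(axis=1, pct=True) for scores in score_frames]
    return sum(ranks) / len(ranks)
```

Ranking before averaging puts every seed on the same scale, which also means a weak seed gets the same vote as a strong one. That is consistent with the dilution effect described above.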
Discussion Topics
Limitations and Future Extensions
Data Limitations
The case-study dataset contains daily prices only. No volume, liquidity, market cap, sector, or fundamental data is available.
Risk Model Layer
A production version should add Barra-style or ML factor-risk controls for style, sector, market, liquidity, and concentration exposures.
Validation
Longer OOS periods, more market regimes, capacity analysis, and externally defined investable benchmarks would be needed before real-money deployment.