1 Set up Ticker Universe
2 Optimize the Portfolios
- 2.1 define a Parallel Processing
- 2.2 Opimization Function
3 Analysis of the optimization Results
4 Graphical evaluation of the results
5 Conclusion

“To assess is to compare” - this quote from Erich Kosiol sums up what a sound analysis is based on.

In order to understand whether your portfolio is well positioned, it makes sense to benchmark the portfolio against other peer portfolios. An optimization function, that weighs up risk and return helps to analyze the portfolio composition and better align it with your own goals. A basic orientation to the CAPM model and its associated efficiency line gives you an idea of where your portfolio stands in terms of efficiency - in other words, whether you are getting the most out of your risk exposure. If your portfolio is on or close to the efficiency line, the balance between risk and return is right. A comparison therefore shows whether your portfolio has the potential to become more efficient or whether you could achieve a better return for an equal risk.

The usual brokers and websites do not give you the opportunity to make such a comparison. Therefore, the following script aims to do this with just a few time series: to get more return with less risk and make your portfolio as efficient as possible.

1 Set up Ticker Universe

In order to make a well-founded valuation, we use the stock data from our own database. In this database, we use various automated scripts to update around 1160 individual stocks that are of interest to us from free data sources. These data are updated daily and offer various sensitivities that help us to better understand market movements.

1.1 Rerieve all Time Series

We use our own function aikia::get_any_history() to pull all stocks from our database that have a complete history from start to end date (basically, that should be all of them - we just want to make sure the cleanup function didn’t have a hiccup). In total, our basic base of stocks that we consider to be of interest is ~1160 stocks

returns_data_init <- aikiaTrade::get_any_history(start_date = "2024-04-25",
                                             end_date = "2024-10-25",
                                             tables = c("ticker"),
                                             complete_cases = T,
                                             as.xts.return = F,
                                             tic_to_names = F,
                                             exact = F)

1.1.1 use ChatGPT do do so

Alternatively, we can also use the AI database approach here, with the connection to Chat GPT. The process is described in the article Ask Chatgpt about your daily numerical key figures The corresponding prompt would be:
“Provide all close values for all tickers between 2024-04-25 and 2024-10-25”.

1.2 Add a Trend

In order to optimize the portfolios not only on the basis of the pure history up to the current point in time, we extend the time series drawn to include the respective short-term trend for the next 20 days. Of course, the error rate rises as the forecasting horizon increases, but we get an indication of which individual stocks would perform better for the overall portfolio under the current circumstances.

The result is provided by our function aikia::single_mc_forecast() which outputs the existing time series of the individual stocks with an additional Monte Carlo trend forecast using the next 20 business days of the NYSE calendar.

allfc <- tibble::tibble()
k <- length(unique(returns_data_init$symbol))
for(i in unique(returns_data_init$symbol)){
  print(k)
  newfc <- aikia::single_mc_forecast(i, returns_data_init,
                            hist_length = 140,
                            days_forecast = 20,
                            scale = "price")
  allfc <- rbind(allfc,newfc)
  k <<- k-1
}

We now have to convert the long version of the tibble into a wide version in which each column represents a separate stock. In addition, we calculate the return on a daily basis and throw out all NAs that would hinder our optimization.

returns_data <- 
  allfc |> 
  dplyr::distinct(symbol,date,.keep_all = T) |> 
  tidyr::pivot_wider(id_cols = date, names_from = symbol,values_from = price) |> 
  janitor::clean_names()|>  
  tidyr::fill(dplyr::everything(),.direction = "downup") |>
  dplyr::arrange(date) |> 
  dplyr::mutate(dplyr::across(-c(date), ~ log(.) - log(dplyr::lag(.)))) |> 
  tidyr::drop_na()

We also convert the tibble into an xts format as a basic input for the subsequent optimization function

returns_data <- xts::xts(returns_data[,-1], order.by=as.POSIXct(returns_data[[1]]))

To be on the safe side, we make sure that the xts does not contain any NAs after the transformation, so that individual stocks do not sabotage the entire process.

sapply(returns_data, function(x) sum(is.na(x))) |> 
  as.data.frame() |> 
  tibble::rownames_to_column() |> 
  dplyr::as_tibble() |> 
  dplyr::rename(count = 2) |> 
  dplyr::filter(count > 0)

1.3 Create an initial Investment Portfolio

Before the optimization function starts, we create an initial investment portfolio. So that most people can identify themselves here, our portfolio definitely includes the popular “Signifikant 7” :) We then fill the portfolio to a total number of 30 stocks. We allocate the weighting equally.

cur_prt_clean <- setdiff(sample(colnames(returns_data), 23),c("nvda","aapl","tsla","googl","meta","msft","amzn"))
cur_prt_clean <- c(cur_prt_clean,c("nvda","aapl","tsla","googl","meta","msft","amzn"))

prt_xts <- returns_data[,cur_prt_clean]

2 Optimize the Portfolios

2.1 define a Parallel Processing

In order to have something to hold and evaluate our newly created investment portfolio against, we need to create additional peer group portfolios. From our investment universe of 1160 stocks, we create 5000 portfolios with a desired portfolio size of 30 stocks each.

Here we define the corresponding variables

num_stocks <- ncol(returns_data)
portfolio_size <- 30 # Number of stocks in a sample portfolio
num_portfolios <- 5000  # Number of sample portfolios to optimize

The optimization in this case relates to the distribution of weights within the 30 stocks in each of the 5000 portfolios. We want to use the DEoptim algorithm, which is robust and often finds suitable solutions even in highly non-linear and non-smooth search spaces. However, since DEoptim works iteratively, it can be time-consuming, especially with large portfolios and many restrictions. To ensure that we also obtain a result in time, we apply parallel processing. To do this, we use the number of cores (leaving out 1) and form clusters with the required libraries:

num_cores <- detectCores() - 1  # Number of cores for parallel processing
cl <- makeCluster(num_cores)
clusterEvalQ(cl, {
  library(quantmod)
  library(PerformanceAnalytics)
  library(PortfolioAnalytics)
  library(DEoptim)
})

2.2 Opimization Function

Now we create the function that will optimize the individual portfolios. DEoptim stands for “Differential Evolution Optimization” and is an optimization algorithm based on the principles of evolution. In our portfolio optimization, this means for the conditions that
\(\bullet\) DEoptim searches for an allocation of capital that best meets the objectives of “maximizing return and minimizing risk”.
\(\bullet\) As a secondary condition, we assume that the sum of the weights must be between 95% and 100% (we don’t want to have any unnecessary cash lying around 😉 ) and
\(\bullet\) thirdly that it can only be a long-only portfolio. The last condition is that the weighting of an individual security must not exceed 15%.

In our case, to have a standard benchmark, we optimize according to the risk of the Sharpe ratio. The Sharpe ratio is one of the most important risk/return ratios in finance: It puts the excess return of a portfolio (compared to a risk-free interest rate) in relation to volatility and thus indicates how much return is achieved per unit of risk.
Although it is not a perfect measure of risk, for this, it is a sufficiently good one

optimize_single_portfolio <- function(selected_stocks, returns_data) {

  returns_data_selected <- returns_data[, selected_stocks]

  # Create a portfolio specification for the selected stocks
  port_spec <- portfolio.spec(assets = colnames(returns_data_selected))
  
  # Add constraints (full investment, long-only portfolio, each stock has a maximum weight limit of 15%)
  port_spec <- add.constraint(portfolio=port_spec, type="weight_sum", min_sum=0.95, max_sum=1)
  port_spec <- add.constraint(portfolio = port_spec, type = "long_only")
  port_spec <- add.constraint(portfolio = port_spec, type = "box", min = rep(0, ncol(returns_data_selected)), 
                                                                   max = rep(0.15, ncol(returns_data_selected)))
  
  # Add objectives for mean-variance optimization
  port_spec <- add.objective(portfolio = port_spec, type = "risk", name = "StdDev")
  port_spec <- add.objective(portfolio = port_spec, type = "return", name = "mean")
  
  # Optimize the portfolio using DEoptim method
  opt_results <- optimize.portfolio(R = returns_data_selected, portfolio = port_spec, 
                                    optimize_method = "DEoptim",
                                    search_size = 500,
                                    trace = FALSE)  # Set trace to FALSE for non-interactive mode
  
  # Extract the optimal weights and compute portfolio metrics
  optimal_weights <- extractWeights(opt_results)
  portfolio_return <- mean(Return.portfolio(R = returns_data_selected, weights = optimal_weights))
  portfolio_risk <- StdDev(Return.portfolio(R = returns_data_selected, weights = optimal_weights))
  
  # Calculate Sharpe Ratio assuming a risk-free rate of 3%
  risk_free_rate <- 0.03 / 252  # Convert annual risk-free rate to daily
  sharpe_ratio <- (portfolio_return - risk_free_rate) / portfolio_risk

  # Return a list with results
  list(
    combination = list(selected_stocks),
    weights = list(optimal_weights),
    portfolio_return = portfolio_return,
    portfolio_risk = portfolio_risk,
    sharpe_ratio = sharpe_ratio
  )
}

Before the iteration can start, the example portfolios mentioned above are created. Against whose results we can compare our investment portfolio and see how well we have performed in the last six months compared to “the market”.

sample_portfolios <- lapply(1:num_portfolios, function(x) sample(colnames(returns_data), portfolio_size))

We track time and let the portfolios calculate their optimal weights and the resulting sharp ratio.

tictoc::tic() # In our case, 5000 portfolio samples of 30 shares each require ~320 minutes
  results <- pbapply::pblapply(sample_portfolios, optimize_single_portfolio, cl = cl, returns_data = returns_data)
tictoc::toc()

Next, we need to stop the cluster again and release the tied-up resources.

stopCluster(cl)

3 Analysis of the optimization Results

3.1 Key Figures of “Investment Universe”

Through the random composition of the 5000 portfolios with 30 stocks from our overall holdings, we represent an “investment universe” in which one could theoretically have invested as an investor. We have thus defined our “market”.

In the tibble “all_tested” we gather the information of all calculated portfolios and prepare them for visualization. To be able to show the volatility results on an annual basis to have a standardized benchmark, we need to convert the daily values into annual values: Volatility: the annual volatility (𝜎Year) is estimated by multiplying the daily volatility by the square root of the number of trading days in the year: \(\sigma_{(\text{year})}\) = \(\sigma_{(\text{day})} \times \sqrt{252}\)

Return: Here we assume that the daily returns are reinvested daily, which leads to a compound interest effect and use the formular: \(\text{r}_{(\text{year})} = \left(1 + \text{r}_{(\text{day})}\right)^{252} - 1\)

all_tested <- dplyr::bind_rows(lapply(results, function(res) {
  tibble::tibble(
    combination = res$combination,
    weights = res$weights,
    portfolio_return = (1+res$portfolio_return)^252 -1,
    portfolio_risk = res$portfolio_risk * sqrt(252),
    sharpe_ratio = res$sharpe_ratio
  )
})) |> dplyr::arrange(desc(sharpe_ratio))

In order to identify the globally optimal portfolio, we select the portfolio with the highest sharp ratio.

optimal_combination = all_tested$combination[1]
optimal_weights = all_tested$weights[1]
optimal_portfolio_return = all_tested$portfolio_return[1]
optimal_portfolio_risk = all_tested$portfolio_risk[1]
optimal_sharpe = all_tested$sharpe_ratio[1]

3.2 Key Figures of initial Portfolio

Now that the optimal global risk/return portfolio has been determined and can be marked as the target portfolio, we must first calculate the key figures for our initial investment portfolio with the equally distributed weighting of equities and…

cur_weights <- paste(cur_prt_clean,scales::percent(1/30),collapse = ", ")
cur_portfolio_return <- mean(Return.portfolio(R = prt_xts, weights = rep(1/30,30))) 
cur_portfolio_risk <- StdDev(Return.portfolio(R = prt_xts, weights = rep(1/30,30)))

… with the corresponding portfolio’s sharp ratio

risk_free_rate <- 0.03 / 252  
cur_sharpe <- (cur_portfolio_return - risk_free_rate) / cur_portfolio_risk

cur_portfolio_return <- (1+cur_portfolio_return)^252 -1
cur_portfolio_risk <- cur_portfolio_risk * sqrt(252)

3.3 Key Figures for optimized initial Portfolio

Now we also run our initial investment portfolio through the optimization function to see which weighting adjustment would have given our portfolio the best risk/return ratio.

cur_prt_opt <- optimize_single_portfolio(cur_prt_clean,returns_data)
cur_prt_opt_weights = cur_prt_opt$weights[1]
cur_prt_opt_portfolio_return = (1+cur_prt_opt$portfolio_return[1])^252 -1
cur_prt_opt_portfolio_risk = cur_prt_opt$portfolio_risk[1] * sqrt(252)
cur_prt_opt_sharpe = cur_prt_opt$sharpe_ratio[1]

After 50 iterations, the algorithm diverges and returns the optimized weights of the existing stocks over a holding period of the last 6 months and returns the following portfolio key figures:

4 Graphical evaluation of the results

Up to this point, we have completed all the calculations and created a lot of single data. So that we can understand them easily and intuitively, we want to visualize them. We find the Plotly package best suited for this, as we can display a basic representation and add a lot of additional information at certain points.

4.1 Helper functions for the graphic

Following the idea of CAPM, we create our “efficient frontier” dividing line using all the calculated portfolios. To do this, we sort all the tested portfolios by risk and find the maximum return for each level.

efficient_frontier <- all_tested %>%
  arrange(portfolio_risk) %>%
  mutate(cummax_return = cummax(portfolio_return)) %>%
  filter(portfolio_return == cummax_return) %>%
  select(portfolio_risk, portfolio_return)

To make the “efficient frontier” line look better, we smooth the curve.

smoothed_curve <- smooth.spline(efficient_frontier$portfolio_risk, efficient_frontier$portfolio_return, spar = 0.5)

Within the plot instruction, we use functions that we have to define here as preparation. The aim of this helper function is to display the number of shares with weights per line.

format_stocks_weights <- function(weights, n = 3) {
  
  new <- data.frame(lapply(weights, type.convert), stringsAsFactors=FALSE) |> stack() |> 
    dplyr::arrange(desc(values))
  
  input_string <- paste(new$ind,scales::percent(new$values),collapse = ", ")
  
  # Split the input string into parts based on commas
  parts <- unlist(strsplit(input_string, ", "))
  
  # Create a vector to hold the formatted string with line breaks
  formatted_parts <- sapply(seq_along(parts), function(i) {
    # Check if the index is a multiple of n and add a line break if it is
    if (i %% n == 0 && i != length(parts)) {
      paste0(parts[i], ",<br>")
    } else {
      parts[i]
    }
  })
  
  # Combine all parts back into a single string
  formatted_string <- paste(formatted_parts, collapse = " ")
  
  return(formatted_string)
}

Since the initial portfolio has a slightly different structure, we briefly write another function for the equivalent representation of all portfolios in the plotly plot.

prt_wgt <- function(weights, n = 1){
  prt_parts <- unlist(strsplit(weights, ", "))
  prt_formatted_parts <- sapply(seq_along(prt_parts), function(i) {
    # Check if the index is a multiple of n and add a line break if it is
    if (i %% n == 0 && i != length(prt_parts)) {
      paste0(prt_parts[i], ",<br>")
    } else {
      prt_parts[i]
    }
  })
  prt_formatted_string <- paste(prt_formatted_parts, collapse = " ")
  
  return(prt_formatted_string)
}
wgt_plot <- prt_wgt(cur_weights)

Finally, we put all the data together in the plot We create a risk-return scatter with the Efficient Frontier and
\(\boldsymbol{\Rightarrow}\) all 5000 portfolios
\(\boldsymbol{\Rightarrow}\) highlight the Optimum Portfolio
\(\boldsymbol{\Rightarrow}\) highlight the initial portfolio and
\(\boldsymbol{\Rightarrow}\) highlight the optimized initial portfolio

First, we define the base plot

fig <- plot_ly()

we add the scatter plot of all tested portfolios

fig <- fig %>% add_trace(data = all_tested, x = ~portfolio_risk, y = ~portfolio_return, type = 'scatter', mode = 'markers',
                         name = 'Tested Portfolios', marker = list(color = aikia::aikia_palette_eight()[2], size = 5),
                         opacity = 0.5,
                         text = ~paste(
                           "Stocks:<br>", sapply(1:nrow(all_tested), function(i) {
                            # format_stocks_weights(all_tested$combination[[i]], all_tested$weights[[i]])
                             format_stocks_weights(all_tested$weights[[i]], n = 1)
                           }), "<br><br>",
                           "Risk: ", round(portfolio_risk, 4), "<br>",
                           "Return: ", round(portfolio_return, 4), "<br>",
                           "Sharpe Ratio: ", round(sharpe_ratio, 4)),
                         hoverinfo = 'text')

Add smoothed line plot of the efficient frontier

fig <- fig %>% add_trace(x = smoothed_curve$x, y = smoothed_curve$y, type = 'scatter', mode = 'lines',
                         name = 'Efficient Frontier (Smoothed)', line = list(color = aikia::aikia_palette_eight()[8], width = 2))

Add the optimal portfolio point with a green marker

fig <- fig %>% add_trace(x = ~optimal_portfolio_risk, y = ~optimal_portfolio_return, type = 'scatter', mode = 'markers',
                         name = 'Optimal Portfolio', marker = list(color = aikia::aikia_palette_eight()[1], size = 10, symbol = 'circle'),
                         text = ~paste(
                           "Stocks: <br>", format_stocks_weights(optimal_weights[[1]], n = 1), "<br><br>",
                           "Risk: ", round(optimal_portfolio_risk, 4), "<br>",
                           "Return: ", round(optimal_portfolio_return, 4), "<br>",
                           "Sharpe Ratio: ", round(optimal_sharpe, 4)),
                         hoverinfo = 'text'
                       )

Add the initial portfolio point with a red marker

fig <- fig %>% add_trace(x = ~cur_portfolio_risk, y = ~cur_portfolio_return, type = 'scatter', mode = 'markers',
                         name = 'Current Portfolio', marker = list(color = aikia::aikia_palette_eight()[4], size = 10, symbol = 'circle'),
                         text = ~paste(
                           "Stocks: <br>", prt_wgt(cur_weights), "<br><br>",
                           "Risk: ", round(cur_portfolio_risk, 4), "<br>",
                           "Return: ", round(cur_portfolio_return, 4), "<br>",
                           "Sharpe Ratio: ", round(cur_sharpe, 4)),
                         hoverinfo = 'text'
                      )

Add the initial optimized portfolio point with a purple marker

fig <- fig %>% add_trace(x = ~cur_prt_opt_portfolio_risk, y = ~cur_prt_opt_portfolio_return, type = 'scatter', mode = 'markers',
                         name = 'Current Portfolio Optimized', marker = list(color = aikia::aikia_palette_eight()[7], size = 10, symbol = 'circle'),
                         text = ~paste(
                           "Stocks: <br>", format_stocks_weights(cur_prt_opt_weights[[1]], n = 1), "<br><br>",
                           "Risk: ", round(cur_prt_opt_portfolio_risk, 4), "<br>",
                           "Return: ", round(cur_prt_opt_portfolio_return, 4), "<br>",
                           "Sharpe Ratio: ", round(cur_prt_opt_sharpe, 4)),
                         hoverinfo = 'text'
                      )

Customize the plot layout

fig <- fig %>% layout(title = list(x = 0.5 , y = 1.0, text = "<b>Risk-Return Plot with Efficient Frontier and Optimal Portfolio</b>", 
                                   showarrow = F, xref='paper', yref='paper', font = list(size = 24)),
                      xaxis = list(title = "Risk (Standard Deviation)"),
                      yaxis = list(title = "Return (Mean)"),
                      legend = list(x = 0.8, y = 0.2))

4.2 The final plot

Print the plot

fig

Thanks to the interactive chart, the Plotly Plot allows us to browse through the entire result and extract specific information.

The current optimal global portfolio of our equity universe is made up as follows:

Finally, we are of course interested in the list of top investments. The individual stocks that are in the upper quartile of return and the lower half of risk.

top_invest <- all_tested |> dplyr::filter(portfolio_return > quantile(portfolio_return, 0.75), # highest 25%
                                          portfolio_risk < quantile(portfolio_risk, 0.5)) |>  # lowest 25%
  dplyr::select(weights)

We can now take a look at the average weighting of the individual shares contained therein:

alldf <- data.frame()
for(i in 1:nrow(top_invest)){
  newdf <- data.frame(lapply(top_invest$weights[[i]], type.convert), stringsAsFactors=FALSE) |> stack() 
  alldf <- rbind(alldf,newdf)
}

4.3 The Top 20

Following is the list of individual investments from the top-rated portfolios from the last 6 months with the current trend forecast for the next 20 days. The resulting top 20 individual stocks from our universe of 1160 stocks are:

df_single <- alldf |> dplyr::summarise(mean_wgt = mean(values),.by = ind) |> dplyr::arrange(desc(mean_wgt)) |> 
  dplyr::as_tibble() |> dplyr::top_n(20)

# get names from DB first
con <- aikia::connect_to_db()
allnames <- DBI::dbReadTable(con,"fin_ticker_meta_data")|> 
  dplyr::select(ticker_yh,name,country_iso,industry_sector,bics_level_2_industry_group_name) |> 
  dplyr::mutate(ind = stringr::str_replace_all(ticker_yh,"\\.","_"),
                ind = tolower(ind)) 
DBI::dbDisconnect(con)

# Create GT table
df_single |> 
  dplyr::mutate(ind = as.character(ind),
                ind = ifelse(stringr::str_detect(ind, "^x[0-9]"), 
                             stringr::str_remove(ind, "^x"), 
                             ind)) |>  # ticker mit Zahlen bekommen ein führendes x
  
  dplyr::left_join(allnames, by = "ind") |> 
  dplyr::select(name,ticker_yh,mean_wgt,industry_sector) |> 
  gt::gt() %>% 
  aikia::gt_theme_aikia() |>  
  gt::tab_header(
    title = gt::md("Investments with Highest Weights")) |> 
  gt::fmt_percent(columns = c(mean_wgt)) |> 
  gt::cols_label(name = "Name",
                 ticker_yh = "Ticker",
                 mean_wgt = "Mean Weight",
                 industry_sector = "Industry Sector")

5 Conclusion

Of course, the past is no guide to the future and we can only see who the top performers of the past weeks and months were and whose current trend could give us a head start over the next 20 days.
\(\boldsymbol{!}\) But at least we get an idea of where we have done well so far and where we have been totally wrong and, above all, where we should take a closer look.
\(\boldsymbol{!}\) We get an idea of which stocks and which sectors are interesting and can take a closer look at the valuations.
\(\boldsymbol{?}\) Is there still potential here
\(\boldsymbol{?}\) What is the forecast for the sectors
\(\boldsymbol{?}\) How high is the PE ratio (again - compared to the history)

As you can see - questions about questions….

Finally, I would like to emphasize that this analysis is of course not an investment recommendation. It is merely a simple way of demonstrating how you can easily assess and monitor the efficiency of your (current) portfolio using some basic steps in R.
I am always interested in exchanging ideas and inspiration. And please do so very informally ✌️🙂

DISCLAIMER
Unfortunately, people need to write one of these, even for blogs: All information and data on this website is for informational purposes only (what else would it be for?). I make no representations as to the accuracy, completeness, suitability, or validity of any information. I will not be liable for any errors, omissions, or any losses arising from its display or use. All information is provided AS IS with no warranties and confers no rights.
Because the information on this website is based on my personal opinion and experience, it should not be considered a specific purchase recommendation or professional financial investment advice. I am not liable for any losses incurred as a result of the implementation of the thoughts or ideas.

Compare your Portfolio to "the optimized Market" using R