Dataframe with Monte Carlo Simulation calculation next row Problem

Question

I want to build a DataFrame from scratch in which each value is calculated from the previous one, in order to value a barrier option. I know that I can use a Monte Carlo simulation to solve this, but I can't get it to work the way I want.

The formula is:

Value in row before * np.exp((r-sigma**2/2)*T/TradingDays+sigma*np.sqrt(T/TradingDays)*z)

The first code I wrote only calculates the first column. I know that I need a second loop, but I can't manage to write it.

The result should be that, for each simulation, a new value is calculated from the previous value for 500 days, i.e. S_1 through S_500, with a total of 1000 simulations. (I need to generate new columns from the previous value using the formula.) Something like this: 500 days for the 1st simulation, 500 days for the 2nd simulation, and so on.

import numpy as np
import pandas as pd
from scipy.stats import norm
import random as rd
import math 

simulation = 0
S_0 = 42
T = 2
r = 0.02
sigma = 0.20
TradingDays = 500

df = pd.DataFrame()

for i in range (0,TradingDays):
    z = norm.ppf(rd.random())
    simulation = simulation + 1

    S_1 = S_0*np.exp((r-sigma**2/2)*T/TradingDays+sigma*np.sqrt(T/TradingDays)*z)


    df = df.append ({

                    'S_1':S_1,    
                    'S_0':S_0

                     }, ignore_index=True)

    df = df.round  ({'Z':6,
                     'S_T':2
                     })
    df.index += 1
    df.index.name = 'Simulation'


print(df)

I found another piece of code here that does solve the problem, but only for one row; every other row is just the same calculation: Generate a Dataframe that follow a mathematical function for each column / row

If I just replace it with my formula, I get the same problem.

replacing:

exp(r - q * sqrt(sigma))*T+ (np.random.randn(nrows) * sqrt(deltaT)))

with:

exp((r-sigma**2/2)*T/nrows+sigma*np.sqrt(T/nrows)*z))

import numpy as np
import pandas as pd
from scipy.stats import norm
import random as rd
import math 

S_0 = 42
T = 2
r = 0.02
sigma = 0.20
TradingDays = 50
Simulation = 100

df = pd.DataFrame({'s0': [S_0] * Simulation})

for i in range(1, TradingDays):
    z = norm.ppf(rd.random())

    df[f's{i}'] = df.iloc[:, -1] * np.exp((r-sigma**2/2)*T/TradingDays+sigma*np.sqrt(T/TradingDays)*z)

print(df)

I would prefer to work with the last code and solve the problem with it.

There is something unclear to me: in your example table, does the value of column 2 row 2 depend only on the value of column 2 row 1, or also on the value of column 1 row 2? In other words: are the columns independent of each other? Are the rows independent of each other? Or does every value depend on both the previous value in the row and the column? — bartaelterman
Column 2 row 2 = 42 is the start value of each simulation, and c[3] r[2] = 41.63 is based on [2:2] * formula. The next value, c[4] r[2], is based on the value c[3] r[2] * formula, and so on. But I already have a solution for the problem; I will post it here as an answer. — Plutostone

bartaelterman · Accepted Answer · 2020-05-16T20:54:28

How about just overwriting the value of S_0 by the new value of S_1 while you loop and keeping all simulations in a list? Like this:

import numpy as np
import pandas as pd
import random
from scipy.stats import norm


S_0 = 42
T = 2
r = 0.02
sigma = 0.20
trading_days = 50
output = []

for i in range(trading_days):
    z = norm.ppf(random.random())
    value = S_0*np.exp((r - sigma**2 / 2) * T / trading_days + sigma * np.sqrt(T/trading_days) * z)
    output.append(value)
    S_0 = value

df = pd.DataFrame({'simulation': output})

Perhaps I'm missing something, but I don't see the need for a second loop.
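If several independent simulations are wanted, as in the question, one possible extension is to keep the single loop over days and draw one shock per simulation at each step. The second snippet in the question draws a single scalar z per day, so every row is multiplied by the same factor; the sketch below draws a vector instead. The names n_simulations, rng, dt, current and columns are assumptions made for this illustration, and the seed is only there for reproducibility.

```python
import numpy as np
import pandas as pd

S_0 = 42
T = 2
r = 0.02
sigma = 0.20
trading_days = 50
n_simulations = 100  # hypothetical name for the number of paths

rng = np.random.default_rng(seed=0)  # seed chosen only for reproducibility
dt = T / trading_days

# One entry per simulation; every path starts at S_0.
current = np.full(n_simulations, float(S_0))
columns = {'s0': current}

for day in range(1, trading_days + 1):
    # A separate standard-normal shock for each simulation on this day.
    z = rng.standard_normal(n_simulations)
    current = current * np.exp((r - sigma**2 / 2) * dt + sigma * np.sqrt(dt) * z)
    columns[f's{day}'] = current

# Build the DataFrame once, after the loop.
df = pd.DataFrame(columns)
df.index += 1
df.index.name = 'Simulation'
```

Each row of df is then one simulated path and each column one trading day, so s1 depends only on s0 in the same row, s2 on s1, and so on.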

Also, this eliminates calling df.append() in a loop, which should be avoided because each call copies the whole DataFrame. (See here)
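Going one step further, the loop can be dropped entirely. Because each step multiplies the previous value by an exponential, the log of the price is a running sum of the per-day log increments, so a cumulative sum along the day axis should give the same kind of paths. This is only a sketch under that assumption; n_simulations, rng, log_steps, prices and paths are names invented here, not part of the original answer.

```python
import numpy as np
import pandas as pd

S_0, T, r, sigma = 42, 2, 0.02, 0.20
trading_days = 50
n_simulations = 100  # hypothetical name for the number of paths

rng = np.random.default_rng(seed=0)  # seed chosen only for reproducibility
dt = T / trading_days

# One shock per simulation (row) and per trading day (column).
z = rng.standard_normal((n_simulations, trading_days))

# Per-day log increments, then their running sum along each path.
log_steps = (r - sigma**2 / 2) * dt + sigma * np.sqrt(dt) * z
prices = S_0 * np.exp(np.cumsum(log_steps, axis=1))

# Prepend the start value so column s0 holds S_0 for every simulation.
prices = np.hstack([np.full((n_simulations, 1), float(S_0)), prices])

paths = pd.DataFrame(prices, columns=[f's{i}' for i in range(trading_days + 1)])
```

The DataFrame is created in a single call from the finished array, so nothing is appended row by row or column by column.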
