C# Performance Improvement Experience

I have a personal project that I originally wrote 10 years ago, when I was starting to code at university. It is basically a stock price analysis tool.

Recently I started looking at it again and found a few ways to improve the performance of code that is heavy on both calculation and DB access. Actually, it shouldn't be DB-access heavy at all, and fixing that was part of the performance tuning.

This is what I have learnt…

  • Read all data from the DB at the start of the process.
  • Do the repetitive work against data held in memory.
  • Use multiple threads, and don't call the DB inside a multi-threaded task.
  • Once all tasks are completed, update the database at the end, in one go if possible.
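
The four points above can be sketched as a single pipeline. This is only a minimal illustration, not the project's actual code: the Price and Result classes, the Pipeline.Run method, and the loadAll and saveAll delegates are hypothetical stand-ins for the real entities and for the single read and single write against the database. PLINQ is used here for the parallel step as one possible choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only.
public class Price
{
    public string Code { get; set; }
    public DateTime Date { get; set; }
    public decimal Close { get; set; }
}

public class Result
{
    public string Code { get; set; }
    public decimal AverageClose { get; set; }
}

public static class Pipeline
{
    // loadAll and saveAll stand in for the one read and the one write against the database.
    public static int Run(Func<List<Price>> loadAll, Action<IReadOnlyCollection<Result>> saveAll)
    {
        // 1. One read at the start.
        List<Price> prices = loadAll();

        // 2. Keep the data in memory, grouped so repeated work never goes back to the database.
        Dictionary<string, List<Price>> byCode = prices
            .GroupBy(p => p.Code)
            .ToDictionary(g => g.Key, g => g.ToList());

        // 3. Run the CPU-bound work in parallel; the lambda touches only in-memory data.
        List<Result> results = byCode
            .AsParallel()
            .Select(pair => new Result
            {
                Code = pair.Key,
                AverageClose = pair.Value.Average(p => p.Close),
            })
            .ToList();

        // 4. One write at the end.
        saveAll(results);
        return results.Count;
    }
}
```

In a real application, loadAll would wrap a single query and saveAll a single batched save; the averaging is just a placeholder for the actual analysis.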

One more tip

  • Analyze the code with a profiler, which clearly shows which methods take the longest. This made it easy to focus on where to improve.

After these improvements, the time taken to process one cycle of the simulation decreased from 40 minutes to 10 minutes.

A few sample code changes are below.

  • Read data from the DB once at the beginning rather than querying it each time it is needed.
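
When a full profiler is not at hand, a rough first measurement can be taken with System.Diagnostics.Stopwatch. The sketch below is an assumption about how one might do that, not something from the original project; the Timing class and its Measure method are hypothetical names.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Runs the action once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

For example, Timing.Measure(() => simulator.Run(strategy)) would time a single strategy run. This only tells you how long one call took; a profiler is still the better tool for finding which method inside it is slow.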

From 

public List<StockToAnalyze> ReadPriceToAnalysisForward(string code, int days, DateTime dateToReadFrom)
{
    return _marketwatchDbContext
        .StockToAnalyzes
        .Where(a => a.Code == code
            && a.Date >= dateToReadFrom)
        .OrderBy(a => a.Date)
        .Take(days)
        .AsNoTracking()
        .ToList();
}

Initially, I was querying the DB each time stock prices were needed, as above.

 

To 

private readonly List<StockToAnalyze> _stockPrices;

// In the constructor: load every row once.
_stockPrices = _marketwatchDbContext.StockToAnalyzes.AsNoTracking().ToList();

public List<StockToAnalyze> ReadPriceToAnalysisForward(string code, int days, DateTime dateToReadFrom)
{
    return _stockPrices
        .Where(a => a.Code == code
            && a.Date >= dateToReadFrom)
        .OrderBy(a => a.Date)
        .Take(days)
        .ToList();
}

In the improved code, stock prices are loaded once in the constructor, and the field, _stockPrices, is used in the method.

 

  • Print debugging content to the debug output rather than to the console. Console output slows down the process, and debugging content is only useful while debugging. Calls to Debug.WriteLine are also compiled out of Release builds.

Before

Console.WriteLine($"Correlation coefficient: {leaderInPair.Coefficient}");

After

Debug.WriteLine($"Correlation coefficient: {leaderInPair.Coefficient}");

 

  • Sort once when loading the data. After profiling, I noticed that repetitive sorting is very costly even though it happens in memory. Isolated in a sample like this it looks obvious, but it can take a bit of investigation to find where to improve.

Before

private readonly List<StockToAnalyze> _stockPrices;

// In the constructor: load every row once, unsorted.
_stockPrices = _marketwatchDbContext.StockToAnalyzes.AsNoTracking().ToList();

public List<StockToAnalyze> ReadPriceToAnalysisForward(string code, int days, DateTime dateToReadFrom)
{
    return _stockPrices
        .Where(a => a.Code == code
            && a.Date >= dateToReadFrom)
        .OrderBy(a => a.Date)
        .Take(days)
        .ToList();
}

After

private readonly List<StockToAnalyze> _stockPrices;

// In the constructor: load every row once, already sorted by date.
_stockPrices = _marketwatchDbContext.StockToAnalyzes.AsNoTracking()
    .OrderBy(a => a.Date).ToList();

public List<StockToAnalyze> ReadPriceToAnalysisForward(string code, int days, DateTime dateToReadFrom)
{
    // Where keeps the list's order, so no per-call sort is needed.
    return _stockPrices
        .Where(a => a.Code == code
            && a.Date >= dateToReadFrom)
        .Take(days)
        .ToList();
}
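
A possible further step, which I have not measured, is to group the rows by code once as well, so that each call scans only one stock's prices instead of the whole list. The sketch below assumes a cut-down StockToAnalyze with just the two properties the query uses; the PriceCache class is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the entity; only the two properties the query needs.
public class StockToAnalyze
{
    public string Code { get; set; }
    public DateTime Date { get; set; }
}

public class PriceCache
{
    private readonly Dictionary<string, List<StockToAnalyze>> _byCode;

    public PriceCache(IEnumerable<StockToAnalyze> allPrices)
    {
        // Group and sort once, up front.
        _byCode = allPrices
            .GroupBy(p => p.Code)
            .ToDictionary(g => g.Key, g => g.OrderBy(p => p.Date).ToList());
    }

    public List<StockToAnalyze> ReadPriceToAnalysisForward(string code, int days, DateTime dateToReadFrom)
    {
        if (!_byCode.TryGetValue(code, out var prices))
        {
            // Unknown code: return an empty list rather than throwing.
            return new List<StockToAnalyze>();
        }

        // The per-code list is already in date order, and Where preserves that order.
        return prices
            .Where(p => p.Date >= dateToReadFrom)
            .Take(days)
            .ToList();
    }
}
```

The cache would be built once from the rows loaded at startup, for example new PriceCache(_stockPrices), and is safe to read from several threads as long as nothing modifies it afterwards.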

 

  • Use multiple threads. In this case, the thread pool can be very useful. This cut the processing time in half.

Before

private void IterateSimulationAsync(DateTime investDay, List<string> codes)
{
    var simulator = new Simulator(_logger, _marketWatchRepository, codes)
    {
        StartDay = investDay,
    };
    simulator.Initialize();

    var strategies = GetPotentialStrategies(_limit).ToList();

    // Run every strategy one after another on the calling thread.
    var strategyResults = new List<Strategy>();
    foreach (var strategy in strategies)
    {
        strategyResults.Add(simulator.Run(strategy));
    }
}

After

private void IterateSimulationAsync(DateTime investDay, List<string> codes)
{
    var simulator = new Simulator(_logger, _marketWatchRepository, codes)
    {
        StartDay = investDay,
    };
    simulator.Initialize();

    var strategies = GetPotentialStrategies(_limit).ToList();

    // Multi-threading. These limits apply to the whole process.
    ThreadPool.SetMinThreads(Settings.Default.MinNumberOfThread, Settings.Default.MinNumberOfThread);
    ThreadPool.SetMaxThreads(Settings.Default.MaxNumberOfThread, Settings.Default.MaxNumberOfThread);

    var strategyResults = new List<Strategy>();
    foreach (var strategy in strategies)
    {
        Action<Strategy> resultCallback = (result) =>
        {
            // List<T> is not thread-safe, so guard concurrent adds.
            lock (strategyResults)
            {
                strategyResults.Add(result);
            }
            Interlocked.Decrement(ref _threadCounter);
        };

        Interlocked.Increment(ref _threadCounter);

        WaitCallback workItem = (input) =>
        {
            // QueueUserWorkItem passes the state as object, so cast it back.
            var result = simulator.Run((Strategy)input);
            resultCallback(result);
        };

        ThreadPool.QueueUserWorkItem(workItem, strategy);
    }

    // Wait until all work items have finished, sleeping briefly instead of spinning.
    while (Volatile.Read(ref _threadCounter) != 0)
    {
        Thread.Sleep(10);
    }
}
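
The same fan-out can probably be written more compactly with Parallel.ForEach, which queues the work, collects the results, and blocks until everything has finished. This is a sketch of an alternative, not what the project uses: ParallelRunner and RunAll are hypothetical names, and it assumes the function passed in (for example simulator.Run) is safe to call from several threads at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelRunner
{
    // Applies run to every input in parallel and returns the results in no particular order.
    public static List<TResult> RunAll<TInput, TResult>(
        IEnumerable<TInput> inputs,
        Func<TInput, TResult> run,
        int maxDegreeOfParallelism)
    {
        // ConcurrentBag allows adds from many threads without an explicit lock.
        var results = new ConcurrentBag<TResult>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };

        // Parallel.ForEach returns only after every item has been processed,
        // so no manual counter or wait loop is needed.
        Parallel.ForEach(inputs, options, input => results.Add(run(input)));

        return new List<TResult>(results);
    }
}
```

In the method above, the loop and the wait could then be replaced by something like var strategyResults = ParallelRunner.RunAll(strategies, s => simulator.Run(s), Environment.ProcessorCount). An exception thrown inside run surfaces as an AggregateException from Parallel.ForEach instead of leaving a counter stuck above zero.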

Next steps to test for performance improvement would be converting this to a .NET Core app, coding it in F#, or parallel processing across multiple computers. What else could potentially increase the performance further?