Last Activity: 7/19/2022 6:39 AM (8 replies, 1189 viewings)
John W (Regular) | Posts: 96 | Joined: 6/18/2011 | Location: Sydney, NSW, Australia
When training GAs I often notice that some of the intervals being trained are just fantastic, others are so-so, and others worsen the training result.

Here's a simple example: training for 300 iterations and re-initializing every 100 iterations. In the first 100 iterations the rules formed did little to improve the equity curve. The rules generated in iterations 101-200 produced a dramatic lift in performance, and iterations 201-300 caused a performance decline.

Wouldn't it be great if the settings used by the GA in the 101-200 period could be extended for more iterations, since those settings appear to be finding valuable rules and getting good training results?

Question: would adding an 'n'-bar extension period (e.g. 2X the interval size) using the settings from the 'best' re-initialization interval result in better rules? In this simple example, after the 300 iterations had been run, the interval 101-200 settings would be applied for a further 200 iterations to see whether more valuable rules could be gleaned.

Or could the GA automatically use the best interval's settings and then decide for itself when to terminate, i.e. when those settings no longer generate rules that improve results at a rapid rate?

[Edited by John W on 2/1/2020 10:39 PM]
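A minimal sketch of the "extend the best interval" idea, assuming the GA exposes a fitness gain per re-initialization interval. All function names are illustrative, and the gain numbers are made-up stand-ins for the GA's observed fitness changes, not anything from the product:

```python
# Hypothetical sketch: after the scheduled re-initialization intervals
# finish, re-apply the settings from the interval that improved fitness
# the most, and keep iterating until the improvement rate fades.

def best_interval(gains):
    """Return the index of the re-init interval with the largest fitness gain."""
    return max(range(len(gains)), key=lambda i: gains[i])

def extend_training(gains, extension_gains, min_rate=0.1):
    """Continue with the best interval's settings; stop once the gain per
    extension chunk drops below min_rate (diminishing returns)."""
    total = sum(gains)
    for g in extension_gains:          # gain observed per extension chunk
        if g < min_rate:               # settings no longer productive
            break
        total += g
    return total

# Example: interval 101-200 (index 1) produced the biggest lift...
gains = [0.2, 1.5, -0.4]               # fitness change per 100 iterations
assert best_interval(gains) == 1
# ...so its settings are extended; training halts once gains fade.
print(round(extend_training(gains, [0.8, 0.5, 0.05, 0.7]), 2))
```

The third extension chunk falls below the threshold, so training stops there even though a later chunk would have been productive; a real implementation might use a rolling average instead of a single chunk.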
Jim Dean (Sage) | Posts: 3433 | Joined: 3/13/2006 | Location: L'ville, GA
I like this idea! | ||
John W (Regular) | Posts: 96 | Joined: 6/18/2011 | Location: Sydney, NSW, Australia
Here's a second GA improvement suggestion.

I ran an experiment where I kept the rules as generated, and deleted and renewed the data samples as the GA progressed through its iterations. Each red triangle in the graph indicates a data sample change. The starting data sample ceased generating meaningful additional rules at n=5,000. I changed the data sample at n=10,000, and that data set stopped generating much after n=18,000. I changed again at n=75,000 and saw some very quick rule formation for about 1,000 iterations. I changed the data set again at n=80,000, and that one has just kept on generating rules.

Based on the graph above, some data sets are way better (or way worse) for rule generation. If the GA can find and use the better data sets, that should enhance GA rule generation and GA fitness.

Suggestion: instead of importing one data set, could the GA import 'n' data sets at its formation, and allow the GA to gravitate over time to samples from those data sets that generate more rules? This could be as simple or as complex as Nirvana wishes, but a simple method would be: if a data set ceases to train meaningful rules after 'x' iterations, switch to the next data set.

[Edited by John W on 2/28/2020 4:16 PM]
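The simple "switch when stalled" method could be sketched as follows. This is an illustrative model only: `rules_found` stands in for the GA's per-iteration output, and the `patience` threshold plays the role of 'x':

```python
# Hypothetical sketch of rotating through several data sets: when a data
# set stops producing new rules for `patience` consecutive iterations,
# move on to the next one.

def rotate_datasets(datasets, rules_found, patience=2):
    """Walk through datasets, switching after `patience` consecutive
    unproductive iterations. Returns the order datasets were used in."""
    used, stall, idx = [], 0, 0
    for found in rules_found:
        if idx >= len(datasets):       # ran out of fresh data sets
            break
        if not used or used[-1] != datasets[idx]:
            used.append(datasets[idx])
        stall = 0 if found else stall + 1
        if stall >= patience:          # this data set has gone quiet
            idx += 1
            stall = 0
    return used

# 1 = iteration produced a rule, 0 = no rule found
print(rotate_datasets(["A", "B", "C"], [1, 1, 0, 0, 1, 0, 0, 1]))
```

Here each two-iteration dry spell triggers a switch, so all three data sets get used in order.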
mgerber (Regular) | Posts: 61 | Joined: 3/30/2006 | Location: Issaquah, WA
Both of these ideas are great, John. The mutation issue has been frustrating; there's nothing to do but watch and hope the good mutation runs keep going.

I have been doing a version of your data-recharging idea: doing multiple new-data runs and culling the effective genes. But your proposal sounds much better: keeping the knowledge base and improving it over the different data runs. Looking forward to trying that out over the weekend.

--Mark G.
Mel (Veteran) | Posts: 235 | Joined: 3/18/2006
This idea makes sense as long as you know for certain that the very good segment is not due to market luck or curve fitting. You need to validate the rules from any segment on out-of-sample data before deciding they are wonderful. If the good segment were due to market luck or curve fit, you would expect it to perform poorly in succeeding segments. It could even be that the extra rules in the next, "poor" segment are doing fine, and the curve-fitted rules from the "good" segment are dragging down results. Only some kind of out-of-sample validity testing can help you keep the good rules.

[Edited by Mel on 2/29/2020 8:17 AM]
John W (Regular) | Posts: 96 | Joined: 6/18/2011 | Location: Sydney, NSW, Australia
Mel, that's a really good point you make. There is a GA setting to 'enable internal validation'; I switch it on all the time now to reduce the risk of curve fit, and it's a must if employing the concept proposed.

With your thoughts in mind, perhaps there is an additional, stronger safeguard against curve fit for the proposed concept: have the GA keep two sets of results, one using the concept as suggested over 'n' iterations, and a second over 'n' iterations using ALL the sample data. Then compare and see which of the two sets, or both, holds up in the forward test.
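The two-result-set comparison reduces to a simple selection step once both rule sets have been scored out-of-sample. A toy sketch, with made-up metric values standing in for forward-test fitness:

```python
# Hypothetical sketch: score both rule sets on the forward (out-of-sample)
# test and keep whichever holds up best; ties keep both.

def pick_rule_sets(forward_scores):
    """forward_scores: dict of rule-set name -> forward-test fitness.
    Returns the name(s) with the best out-of-sample score, sorted."""
    best = max(forward_scores.values())
    return sorted(name for name, s in forward_scores.items() if s == best)

# "sampled" = rotating-data concept, "all_data" = trained on everything
print(pick_rule_sets({"sampled": 0.62, "all_data": 0.55}))
```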
Buffalo (Elite) | Posts: 603 | Joined: 7/11/2007 | Location: Braintree, MA
It would be great if the GA could use a form of walk-forward testing like ATM.

First, hold back some symbols from all testing for validation. Say you use a 10-yr backtest: the GA trains on the first 5 yrs, then starts walk-forward testing the old rules as it finds new rules over the next 5 years, maybe on a yr-to-yr basis. Then it compares the end result (the surviving WF rules) against fresh data from the symbols it didn't look at, to validate the end result: HR within +/- 5%, PPT within +/- .25, IDK - have them meet some validation criteria. Just spitballin', but I love the idea of validating the result.

I currently don't train on approx 30% of my symbol list. Then I compare the results from the trained data to results on just the untrained symbols. If they're not close, I start over - it's curve fit.
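That final validation gate could be sketched as a tolerance check between trained and held-out results. The thresholds follow the post (HR within 5 points, PPT within .25); the function and field names are illustrative, not part of any product:

```python
# Hypothetical sketch of Buffalo's validation gate: compare hit rate (HR)
# and profit per trade (PPT) between trained and held-out symbols, and
# flag curve fit when they diverge beyond tolerance.

def passes_validation(trained, held_out, hr_tol=5.0, ppt_tol=0.25):
    """trained/held_out: dicts with 'hr' (percent) and 'ppt' (points).
    Returns True when the held-out results track the trained results."""
    return (abs(trained["hr"] - held_out["hr"]) <= hr_tol
            and abs(trained["ppt"] - held_out["ppt"]) <= ppt_tol)

# Close agreement -> the strategy generalizes
print(passes_validation({"hr": 62.0, "ppt": 1.10}, {"hr": 59.5, "ppt": 1.00}))
# Large divergence -> likely curve fit, start over
print(passes_validation({"hr": 62.0, "ppt": 1.10}, {"hr": 48.0, "ppt": 0.40}))
```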
John W (Regular) | Posts: 96 | Joined: 6/18/2011 | Location: Sydney, NSW, Australia
Yes, it would be nice to see the fitness curve for the training period and for the forward test period on the same graph.
Mel (Veteran) | Posts: 235 | Joined: 3/18/2006
All the training fitness curve is good for is stopping the iterations. The only thing that tells you "fitness" is the forward test equity curve.

To automate the process, I would like to be able to train in walk-forward increments, say quarters. Having generated rules for quarter 1 and generated an equity curve, the GA would use them in the next quarter to generate an out-of-sample equity curve for quarter 2, then use quarter 2 to train and generate a training-set equity curve, adding rules to those from quarter 1. An equity curve for quarter 3 would be run, then more training with quarter 3, and so on.

This way, you could compare equity curves from the walk-forward segments with an equity curve from the training data. Any curve fitting would be apparent as it happens.
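The quarterly loop above can be sketched as follows. `train` and `score` are illustrative stand-ins for the GA's training step and out-of-sample equity evaluation; the toy versions below just accumulate one rule per quarter:

```python
# Hypothetical sketch of Mel's quarterly walk-forward: train on quarter q
# (accumulating rules), then score the accumulated rule set out-of-sample
# on quarter q+1.

def walkforward(quarters, train, score):
    """Returns a list of (quarter, out_of_sample_score) pairs."""
    rules, results = [], []
    for i, q in enumerate(quarters[:-1]):
        rules += train(q)                          # add this quarter's rules
        nxt = quarters[i + 1]
        results.append((nxt, score(rules, nxt)))   # out-of-sample check
    return results

# Toy stand-ins: each quarter yields one rule; "score" = rules held.
out = walkforward(["Q1", "Q2", "Q3", "Q4"],
                  train=lambda q: [f"rule_{q}"],
                  score=lambda rules, q: len(rules))
print(out)
```

Each out-of-sample score is produced before the scoring quarter enters training, so a divergence between training and walk-forward equity curves shows up in the quarter where the curve fitting happened.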