Implementation of Machine Learning systems at scale. Integration with existing business processes. Continuous Deployment and ongoing governance.
Definition of the problem scope. Identification of goals and success criteria. Overview of the end-to-end process to determine the key areas where Machine Learning models would be most beneficial.
What data sources are available in the current business context? What data types do the available sources provide?
Data quality considerations: is the available data structured or unstructured? What is the estimated effort required to make the data usable for Machine Learning algorithms?
When multiple data sources are available, compare and contrast quality, data type and effort required to ingest data.
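One way to make such a comparison concrete is to record a small profile per source and rank the profiles. The sketch below is illustrative only: the `SourceProfile` fields, the scoring weights, and the three example sources are all assumptions made up for this example, not measurements or a recommended formula.

```python
from dataclasses import dataclass


@dataclass
class SourceProfile:
    """Hypothetical summary of one candidate data source."""
    name: str
    structured: bool            # True for tabular/relational data
    completeness: float         # fraction of non-missing values, 0.0 to 1.0
    ingest_effort_days: float   # rough estimate of preparation effort


def rank_sources(profiles, max_effort_days=30.0):
    """Return the profiles sorted best-first by a simple, assumed score."""
    def score(p):
        # Effort is capped so one very expensive source does not dominate.
        effort_penalty = min(p.ingest_effort_days / max_effort_days, 1.0)
        structure_bonus = 0.1 if p.structured else 0.0
        return p.completeness + structure_bonus - 0.5 * effort_penalty

    return sorted(profiles, key=score, reverse=True)


# Made-up example sources, for illustration only.
sources = [
    SourceProfile("crm_export", structured=True, completeness=0.95, ingest_effort_days=3),
    SourceProfile("support_emails", structured=False, completeness=0.80, ingest_effort_days=20),
    SourceProfile("sensor_feed", structured=True, completeness=0.70, ingest_effort_days=10),
]

ranked = rank_sources(sources)
for profile in ranked:
    print(profile.name)
```

In practice the weights would need to be agreed with the business, since the relative value of completeness versus ingestion effort depends on the problem scope.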
Streaming vs. batching and data caching considerations. Can, or should, data be consumed directly from source systems continuously in real time, or is batch processing of data an acceptable option?
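The difference between the two consumption styles can be sketched with a single statistic computed both ways. This is a minimal illustration, assuming an in-memory list stands in for the source system; `batch_mean`, `streaming_mean`, and the `readings` values are hypothetical names and data introduced here.

```python
from typing import Iterable, Iterator, List


def batch_mean(values: List[float]) -> float:
    """Batch style: the full data set is available before processing starts."""
    return sum(values) / len(values)


def streaming_mean(stream: Iterable[float]) -> Iterator[float]:
    """Streaming style: update the mean incrementally, one record at a time.

    Only the running count and mean are kept, so memory use stays constant
    no matter how long the stream runs.
    """
    count = 0
    mean = 0.0
    for x in stream:
        count += 1
        mean += (x - mean) / count
        yield mean


# Made-up readings; in a real system these would arrive from a source system.
readings = [10.0, 12.0, 11.0, 15.0]

print(batch_mean(readings))
running = list(streaming_mean(readings))
print(running)
```

Both approaches converge on the same final value; the streaming version additionally provides an up-to-date estimate after every record, at the cost of having to design each computation to work incrementally.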
All the considerations above inform the overall architecture of the Machine Learning process.
Determine the key factors of the problem and how these map to available ML strategies. Compare baseline model performance across candidate model types and data types.
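A baseline comparison can be as simple as scoring every candidate against the same held-out data, starting with a trivial model that any real candidate has to beat. The following sketch uses only the standard library; the toy data, the `threshold_rule` candidate, and all function names are assumptions for illustration, not results from a real project.

```python
from collections import Counter


def accuracy(predict, X, y):
    """Fraction of examples for which predict(x) equals the true label."""
    correct = sum(1 for xi, yi in zip(X, y) if predict(xi) == yi)
    return correct / len(y)


def majority_class_baseline(y_train):
    """Trivial baseline: always predict the most frequent training label."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return lambda x: most_common


def threshold_rule(threshold):
    """Simple candidate: predict 1 when the single feature reaches a threshold."""
    return lambda x: 1 if x >= threshold else 0


# Toy data, made up for illustration only.
X_train, y_train = [1, 2, 3, 6, 7], [0, 0, 0, 1, 1]
X_test, y_test = [0, 4, 5, 8], [0, 0, 1, 1]

candidates = {
    "majority_class": majority_class_baseline(y_train),
    "threshold_at_5": threshold_rule(5),
}

# Every candidate is evaluated on the same test set so the scores are comparable.
results = {name: accuracy(model, X_test, y_test) for name, model in candidates.items()}
for name, score in results.items():
    print(name, score)
```

The same harness extends to more realistic models: anything exposing a prediction function can be added to `candidates` and compared against the trivial baseline.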
Depending on the outcome of the Data Availability analysis, determine suitable models for processing the information. For example, if the data is continually streaming, then time-series-based strategies are likely to be more effective, and recurrent architectures such as RNN or LSTM may be most appropriate.
Conversely, if there is strong correlation between neighboring data points in terms of proximity, as in image map analysis, this may suggest a CNN architecture.
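These rules of thumb can be written down as a small lookup so that the reasoning is explicit and reviewable. The sketch below is a deliberately simplified heuristic: the function name, its two boolean inputs, and the fallback suggestion are assumptions made for this example, and a real selection would also weigh data volume, latency, and baseline results.

```python
from typing import List


def candidate_architectures(is_streaming_sequence: bool,
                            has_spatial_locality: bool) -> List[str]:
    """Map two coarse data characteristics to candidate model families."""
    candidates = []
    if is_streaming_sequence:
        # Ordered, continually arriving data: consider recurrent models.
        candidates.append("RNN/LSTM")
    if has_spatial_locality:
        # Neighboring points are strongly correlated, as in images.
        candidates.append("CNN")
    # With neither property, start from something simple (an assumed default).
    return candidates or ["simple baseline model"]


print(candidate_architectures(True, False))
print(candidate_architectures(False, True))
print(candidate_architectures(False, False))
```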
Dealing with Legacy systems - With ever-evolving Enterprise Architectures and continuous upgrade cycles, 'legacy systems' effectively refers to any system that is currently in place within the enterprise. This may seem strange, since the term 'legacy' often conjures up visions of antiquated mainframes from the 1970s. However, almost anything installed today will at some point become 'legacy' as technology evolves. The key is therefore not to 'eliminate' the problem of 'legacy' systems, but to manage the migration of these components within a continually evolving infrastructure. Above all, the important constant should be the data, knowledge and information contained within the business infrastructure, not the particular components of the infrastructure themselves.
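One common way to keep the data as the constant while components change is to have downstream code depend on a small interface rather than on any particular system. The sketch below assumes Python 3.8+ for `typing.Protocol`; `CustomerStore`, `LegacyCsvStore`, `NewInMemoryStore`, and the sample records are hypothetical names and data invented for illustration.

```python
import csv
import io
from typing import Dict, List, Protocol


class CustomerStore(Protocol):
    """The stable data contract that downstream ML code depends on."""
    def fetch_customers(self) -> List[Dict[str, str]]: ...


class LegacyCsvStore:
    """Stand-in for an existing system that exports CSV text."""
    def __init__(self, csv_text: str) -> None:
        self._csv_text = csv_text

    def fetch_customers(self) -> List[Dict[str, str]]:
        reader = csv.DictReader(io.StringIO(self._csv_text))
        return [dict(row) for row in reader]


class NewInMemoryStore:
    """Stand-in for a replacement system that serves the same records."""
    def __init__(self, rows: List[Dict[str, str]]) -> None:
        self._rows = rows

    def fetch_customers(self) -> List[Dict[str, str]]:
        return [dict(row) for row in self._rows]


def customer_names(store: CustomerStore) -> List[str]:
    """Downstream logic: unaware of which system supplies the data."""
    return sorted(row["name"] for row in store.fetch_customers())


legacy = LegacyCsvStore("id,name\n1,Ada\n2,Grace\n")
modern = NewInMemoryStore([{"id": "1", "name": "Ada"}, {"id": "2", "name": "Grace"}])

# The component changed; the data contract and the downstream result did not.
print(customer_names(legacy) == customer_names(modern))
```

With this arrangement, migrating from one system to the next means writing a new class that satisfies the same contract, while feature pipelines and models built on `customer_names` stay untouched.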