Principles
Computational learning theory is the branch of theoretical computer
science that formally studies the design of machine learning algorithms,
characterizes which problems are ‘learnable’, and identifies the
computational limits of learning by machines. Its goal is to understand
fundamental issues in the learning process itself and to help in the
design of better automated learning methods.
Statistical learning theory is regarded as one of the more developed
branches of machine learning because it provides the theoretical basis for
the field. The two goals of machine learning are to understand the nature
of intelligence and learning, and to drive decisions from data.
According to Oleg Sergeykin (2019), the general principles
for any machine learning project are:
Transparency: Every aspect of a machine learning project
should be open to inspection, for example the order of the steps, the data
files, code, and configuration used, and the processing steps applied in the
project.
Reproducibility: The ability for co-workers to re-execute
the project precisely at any stage of its development.
- The processing steps should be written so that any person can rerun them
- Record the state of the project as it progresses. ‘State’ means code, configuration and datasets
- The ability to recreate the exact datasets available at any time in the project history is important for auditability to be useful
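One way to picture "recording the state" from the bullets above is to reduce code, configuration and data to a single fingerprint that can be stored with each run. The following is only a minimal sketch: the function names `fingerprint` and `record_state`, the in-memory `history` list, and the sample values are all hypothetical, and a real project would more likely rely on a version control or data versioning tool.

```python
import hashlib
import json

history = []  # hypothetical in-memory log, one entry per recorded state


def fingerprint(code: str, config: dict, dataset_rows: list) -> str:
    """Return a SHA-256 digest of the project state (code, configuration, data)."""
    state = {"code": code, "config": config, "data": dataset_rows}
    # sort_keys makes the serialisation independent of dict insertion order
    payload = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def record_state(label: str, code: str, config: dict, dataset_rows: list) -> dict:
    """Append a labelled fingerprint of the current state to the history."""
    entry = {"label": label, "state_id": fingerprint(code, config, dataset_rows)}
    history.append(entry)
    return entry


# Illustrative values only
first = record_state("baseline", "x = 1", {"lr": 0.1, "epochs": 5}, [[1, 2], [3, 4]])
second = record_state("tuned", "x = 1", {"lr": 0.2, "epochs": 5}, [[1, 2], [3, 4]])
print(first["state_id"] != second["state_id"])  # a config change yields a new state id
```

Two runs with the same fingerprint used the same code, configuration and data, which is what lets a co-worker confirm they are re-executing the same stage of the project.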
Auditability: The ability to inspect the intermediate results of a
pipeline as well as the final results.
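As a rough illustration of auditability, a pipeline runner can keep the output of every step rather than only the last one. This is a sketch under assumptions: `run_pipeline`, `drop_missing` and `scale` are hypothetical names, and the data is made up for the example.

```python
def run_pipeline(data, steps):
    """Run each (name, function) step in order and keep every intermediate result."""
    audit_log = []
    current = data
    for name, step in steps:
        current = step(current)
        # store a copy so later steps cannot alter what was recorded
        audit_log.append((name, list(current)))
    return current, audit_log


def drop_missing(rows):
    """Remove missing values (represented here as None)."""
    return [r for r in rows if r is not None]


def scale(rows):
    """Divide every value by the largest one."""
    top = max(rows)
    return [r / top for r in rows]


final, log = run_pipeline(
    [4, None, 2, 8],
    [("drop_missing", drop_missing), ("scale", scale)],
)
for name, result in log:
    print(name, result)
```

The `log` list holds the result after each named step, so a reviewer can check where a value changed instead of seeing only `final`.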
Scalability: Ability to support multiple co-workers working
on a project and the ability to work on multiple projects simultaneously.
According to Schelldorfer (2019) principles applied
for machine learning models are:
Data-Related Principles
Choice of appropriate data features: Selecting the features
that contribute most to the prediction variable or output. This helps improve
the accuracy of the model.
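One simple way to estimate how much a feature contributes to the output is to rank features by the absolute value of their correlation with the target. This is only one of many selection techniques and is not claimed to be the one the source has in mind. The function names `pearson` and `rank_features` and the housing-style data below are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Sort feature names by absolute correlation with the target, strongest first."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up data: price depends on area, loosely on rooms, not on door colour
features = {
    "area": [50, 60, 80, 100, 120],
    "rooms": [2, 2, 3, 3, 4],
    "door_colour_code": [3, 1, 2, 3, 1],
}
price = [150, 180, 240, 300, 360]

ranking = rank_features(features, price)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Correlation only captures linear, one-feature-at-a-time relationships, so a low score here does not prove a feature is useless in combination with others.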
Data quality and governance: Data quality refers to the
preprocessing of data, such as data cleansing and checking the validity of
the data from which models are created, whereas governance deals with
security and privacy, integrity, usability, integration, compliance,
availability, roles and responsibilities, and the overall management of the
internal and external data flows within an organization.
Feature engineering: Feature engineering, the process of
creating new input features for machine learning, is one of the most effective
ways to improve predictive models. Through feature engineering, one can isolate
key information, highlight patterns, and bring in domain expertise.
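As a small illustration of creating new input features, the sketch below derives three features from a raw record: a ratio that encodes domain knowledge, and two calendar features extracted from a date string. The field names (`height_m`, `weight_kg`, `signup_date`) and the function `engineer` are hypothetical and chosen only for the example.

```python
from datetime import date


def engineer(record: dict) -> dict:
    """Return a copy of the record with derived features added."""
    out = dict(record)
    # Domain expertise: body mass index combines two raw measurements
    out["bmi"] = record["weight_kg"] / record["height_m"] ** 2
    # Isolate key information hidden inside a date string
    signup = date.fromisoformat(record["signup_date"])
    out["signup_weekday"] = signup.weekday()  # Monday == 0, Sunday == 6
    out["signup_is_weekend"] = signup.weekday() >= 5
    return out


row = engineer({"height_m": 2.0, "weight_kg": 80.0, "signup_date": "2019-06-15"})
print(row)
```

The raw columns are kept alongside the derived ones, so a later feature selection step can decide which of them are worth keeping.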
Model Development Principles
- Performance metrics
- Model validation
- Model calibration
- Model uncertainty
- Robustness
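To make the first two items in the list above concrete, here is a minimal sketch of a performance metric (accuracy) evaluated with k-fold validation. The source does not say which metric or validation scheme it intends, so both are assumptions, as are the names `accuracy` and `k_fold_indices`, the labels, and the majority-class baseline standing in for a model.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


# Made-up labels; the "model" is a baseline that predicts the majority training label
labels = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
scores = []
for train, test in k_fold_indices(len(labels), k=3):
    train_labels = [labels[i] for i in train]
    majority = max(set(train_labels), key=train_labels.count)
    predictions = [majority] * len(test)
    scores.append(accuracy([labels[i] for i in test], predictions))

print(scores)
```

Each sample is used for testing exactly once, so the spread of the per-fold scores also gives a rough first look at how stable the model's performance is.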