Machine learning
Latest revision as of 11:01, 14 September 2024
Machine learning methods automatically learn statistical regularities in a training data set to make accurate predictions about new data. Two definitions are:
- "Any change in a system that allows it to perform better the second time on repetition of the same task or on another task drawn from the same population."[1]
- "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E".[2]
For example, a machine learning algorithm for machine translation may be presented with several thousand examples of sentences in two different languages during a training phase, and then use the statistical regularities it has learned to predict the most likely translation for new sentences. Such methods are often contrasted with rule-based methods, which give explicit instructions for selecting the best prediction (for example, the best translation). Note, however, that this division is not hard and fast: rule-based approaches are often used in tandem with the statistical techniques of machine learning methods.
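As a concrete illustration of learning regularities from training data, here is a minimal sketch (not from the original article; all names and data are illustrative) of a one-nearest-neighbour classifier. It "learns" simply by memorising labelled training points, and predicts the label of whichever memorised point lies closest to a new input:

```python
def predict(train, new_point):
    """train is a list of ((x, y), label) pairs; returns the label of the
    training point nearest to new_point by squared Euclidean distance."""
    def sq_dist(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    _, label = min(train, key=lambda pair: sq_dist(pair[0], new_point))
    return label

# Two clusters of labelled examples serve as the training set.
train = [((0.0, 0.1), "a"), ((0.2, 0.0), "a"),
         ((5.0, 5.1), "b"), ((5.2, 4.9), "b")]
print(predict(train, (0.1, 0.3)))  # near the first cluster, so "a"
print(predict(train, (5.1, 5.0)))  # near the second cluster, so "b"
```

Even this trivial method fits the definitions above: its performance on new points improves as more labelled experience is memorised.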
Classifications of Machine Learning Methods
Machine learning methods are divided into supervised and unsupervised methods depending on what sort of training data they use, and into generative and discriminative methods depending on how they arrive at their prediction.
Supervised and Unsupervised
In short, supervised methods get to see the right answer (or something like the right answer) when they are being trained, whereas unsupervised methods do not. The machine translation example in the introduction is an example of supervised training because the algorithm sees the same sentence in both languages.
Supervised methods are further divided into "learning with a teacher," where the algorithm is told the right answer explicitly, and "learning with a critic," where the algorithm is only told when it is wrong.
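The "learning with a critic" setting can be sketched with a perceptron-style learner (an illustrative example added here, not from the original article): the critic never supplies the correct answer directly, it only signals when a prediction is wrong, and the weights are adjusted solely on those mistakes.

```python
def train_perceptron(data, epochs=20):
    """data: list of ((x1, x2), label) with label +1 or -1.
    Weights change only when the critic flags a wrong prediction."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in data:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
            if pred != y:          # the critic says "wrong"
                w[0] += y * x1     # nudge the boundary towards the example
                w[1] += y * x2
                b += y
    return w, b

data = [((2.0, 1.0), 1), ((3.0, 2.0), 1),
        ((-1.0, -2.0), -1), ((-2.0, -1.0), -1)]
w, b = train_perceptron(data)
```

For linearly separable data such as this toy set, the mistake-driven updates converge to weights that classify every training example correctly.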
The decision between a supervised method and an unsupervised method is usually determined by the sort of data available for training. Supervised methods are generally used when optimal performance is required and the programmer has access to a large amount of data that has been labeled with the right answer. Unsupervised methods must be used when there is not much labeled training data. Researchers and engineers often pursue hybrid "semi-supervised" or "self-supervised" approaches as well.
Unsupervised methods may also be used if a researcher is interested in modeling cognitive behavior and believes a method which does not know the right answer is a more accurate representation of how humans learn in a particular domain.
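To make the contrast concrete, here is a minimal sketch of an unsupervised method (illustrative code, not from the original article): a bare-bones k-means clustering routine that groups unlabelled one-dimensional points without ever seeing a "right answer".

```python
import random

def kmeans(points, k=2, iters=10, seed=0):
    """Minimal k-means: partitions unlabelled 1-D points into k clusters."""
    random.seed(seed)
    centroids = random.sample(points, k)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(sorted(c) for c in clusters)

data = [1.0, 1.2, 0.8, 9.0, 9.3, 8.7]  # two obvious groups, no labels given
print(kmeans(data))
```

The algorithm recovers the two groups from the data's own structure; no labelled examples are involved at any point.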
Generative and Discriminative
Issues in training and evaluation
Overfitting
Examples of Machine Learning Methods
Supervised learning
- Decision trees
- Hidden Markov models
- Bayesian statistics
  - Naive Bayes classifier
  - Bayesian network
- Support vector machines
- Ensembles of classifiers
  - Bootstrap aggregating (bagging)
  - Boosting, including AdaBoost
  - Buckets
  - Stacking (stacked generalization)
- Regression analysis
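One entry from the list above, the Naive Bayes classifier, is simple enough to sketch in full (illustrative code and data, not from the original article). It estimates per-label word probabilities from labelled examples, with add-one (Laplace) smoothing, and picks the label with the highest log-probability for a new input:

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """examples: list of (word_list, label). Returns the counts a
    naive Bayes classifier needs."""
    label_counts = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)   # label -> Counter of words
    vocab = set()
    for words, label in examples:
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab

def classify_nb(model, words):
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best, best_lp = None, float("-inf")
    for label, n in label_counts.items():
        # log P(label) + sum of log P(word | label), Laplace-smoothed
        lp = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in words:
            lp += math.log((word_counts[label][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

examples = [(["free", "prize", "now"], "spam"),
            (["meeting", "tomorrow"], "ham"),
            (["free", "offer"], "spam"),
            (["lunch", "tomorrow"], "ham")]
model = train_nb(examples)
print(classify_nb(model, ["free", "prize"]))
```

The "naive" assumption is that words are conditionally independent given the label, which keeps the model to simple counts yet often works well in practice.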
Unsupervised learning
- Artificial neural networks
- Data clustering
Software for machine learning
- Zettair (http://www.seg.rmit.edu.au/zettair/)
References
- ↑ Simon HA (1983). “Why Should Machines Learn”, Machine learning: an artificial intelligence approach. Los Altos, Calif: M. Kaufmann, 28. ISBN 0-934613-09-5.
- ↑ Mitchell, Tom M. (1997). “Introduction”, Machine learning. New York: McGraw-Hill, 2. ISBN 0-07-042807-7.