cards-deck: 100-199_university::111-120_theoretic_cs::116_introduction_machine_learning

Overview of Machine learning

anchored to 116.00_anchor_machine_learning; may require prior knowledge from 105.00_math_stochastic and likely also information from 111.00_anchor


Overview

Some general definitions of the concept of machine learning:

[!Quote] Arthur Samuel - 1959 Field of study that gives computers the ability to learn without being explicitly programmed

[!Quote] Learning is any process by which a system improves performance from experience

Machine learning is concerned with computer programs that automatically improve their performance through experience ~ Herbert Alexander Simon (Turing Award 1975, Nobel Prize in Economics 1978)

Generally speaking: machine learning is not voodoo -> it uses data to automatically find a suitable function or algorithm that solves a given task.

Below is an overview of how machine learning can be used to solve tasks:

[!Example] Possible use cases:

  1. Weather forecast: we would like a system that predicts the weather / temperature for the next day based on the data supplied -> from the previous information it obtained, it should produce a function: temperature tomorrow = $f(\text{previous data})$
  2. Disease diagnosis: finding out whether a person is ill or not based on some information (images of tissue, blood data etc.) -> attempts to produce a function: $f(\text{patient data})$ -> probability of being ill
  3. Chatbots: predicting the next word(s) to answer a request

[!Tip] (Oliver G. Selfridge) Find a bug in a program and fix it, and the program will work today. Show the program how to find and fix a bug, and the program will work forever

-> at least that's the expectation, but it is not feasible in practice

Historic scope:

  • Around 1945 the first artificial intelligence structures were discussed -> such as a simple computer to play go against.
  • From the 1980s on, machine learning flourished -> coming into play are the aspects of
  • With advancements in the field, the whole topic of deep learning developed fast, starting in the 2010s

[!Tip] The friendship algorithm (TBBT)

[!Idea] Base concept of ML regarding available data and outputs #card?

The general approach to solving problems in ML is the following:

  • Input: data and the desired result - "I want you to find cats in images"
  • Output: an algorithm that does this - "I will find cats by doing things…"

-> we reverse the usual approach to working with information/data (classically, data and an algorithm go in and the result comes out)


Supervised Learning

Supervised learning is one of the three large categories of machine learning. Here the learning is controlled (supervised) because our data contains labels, i.e. information about the desired result for each data point.

[!Definition] Supervised learning | concept #card

With supervised learning we adapt the core principle of machine learning a little:

  • Input: data points $(x_i, y_i)_{i=1}^{N}$, where $x_i$ is the input (data) and $y_i$ is the desired output - a label stating what the result should be
  • We have a function space $\mathcal{F}$ with elements $f: X \to Y$
  • Objective: we would like to find a function $f \in \mathcal{F}$ such that $f(x_i) \approx y_i$ for the given task -> we want to find the function that best approximates the desired output for a given input
  • To assure this, we measure the quality of this function with a loss: $L(f(x_i), y_i)$

proceeding to minimize the loss altogether: $f^* = \arg\min_{f \in \mathcal{F}} \sum_{i=1}^{N} L(f(x_i), y_i)$ ^1721145490688

Typical tasks where this is applied:

  • recognizing handwritten digits
  • classifying cells -> pathology images
  • recognizing faces or objects in an image / video-stream
  • language modelling
  • regression
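
As a minimal sketch of the loss-minimization idea in the supervised learning definition, the following example picks the function space of straight lines $f(x) = w x + b$ and the squared loss. Both choices, the toy data and the names `fit_line`, `squared_loss` and `total_loss` are assumptions made for illustration; the notes do not prescribe a particular model or loss.

```python
from statistics import mean


def squared_loss(prediction, target):
    """Loss for a single data point: squared distance to the label."""
    return (prediction - target) ** 2


def fit_line(xs, ys):
    """Return (w, b) minimizing the summed squared loss over f(x) = w*x + b.

    Uses the closed-form least-squares solution for one input dimension.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    var_x = sum((x - x_bar) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("need at least two distinct x values")
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    w = cov_xy / var_x
    b = y_bar - w * x_bar
    return w, b


def total_loss(w, b, xs, ys):
    """Summed loss of the line (w, b) over all labelled data points."""
    return sum(squared_loss(w * x + b, y) for x, y in zip(xs, ys))


# Toy labelled data (x_i, y_i), generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = fit_line(xs, ys)
print(f"f(x) = {w:.2f} * x + {b:.2f}, loss = {total_loss(w, b, xs, ys):.4f}")
```

Any other line, for example `total_loss(1.0, 0.0, xs, ys)`, gives a larger loss on this data, which is what "best in the function space" means here.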

Unsupervised Learning

Once again we strive for the base principle of machine learning: having data and wanting to find a function that best interprets it in a certain way.

[!Definition] Unsupervised Learning | basic idea #card

With unsupervised learning we don't use labels on the given data, unlike in supervised learning. We have the following constraints / basics:

  • Input: data points $x_i \in \mathbb{R}^D$ (a vector space!)
  • Once again we are given a function space $\mathcal{F}$ with $f: \mathbb{R}^D \to \mathbb{R}^d$, $d < D$
  • We would like to find a function $f \in \mathcal{F}$ so that $z_i = f(x_i)$, where $z_i$ is the low-dimensional representation

-> This means we reduce the dimension from a high-dimensional input ($x_i$ is a vector in $\mathbb{R}^D$!) to a smaller space while not losing information about its traits, like its similarity to the original data point. In this new representation we then assign each $z_i$ to a given cluster.

Objective: the objective is not directly defined; it's rather the goal to reduce the dimension of the data reliably - how exactly is not defined. -> There are many algorithms that could solve this task ^1721145490694

This may be used for the following tasks:

  • Genome comparisons (this is clustering)
  • Finding descriptors for facial expressions (dimension reduction -> from a high dimension (the face) to a lower one -> Cartesian mapping of emotions)
  • Finding disentangled generating factors (take images of faces and find a few descriptive variables, maybe for generating new ones)
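
One possible instance of such a dimension-reducing function is principal component analysis (PCA). The sketch below is an assumption-laden illustration, not the method these notes prescribe: it reduces 2-D points to 1-D by projecting them onto the direction of largest variance, using the closed-form principal-axis angle that exists for 2x2 covariance matrices. The function name `pca_1d` and the toy points are made up for the example.

```python
import math


def pca_1d(points):
    """Project 2-D points onto their first principal axis.

    Returns (zs, direction): the 1-D representations z_i and the unit
    vector of the axis they were projected onto.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the 2x2 covariance matrix of the centered data.
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Angle of the major axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    zs = [(p[0] - mx) * direction[0] + (p[1] - my) * direction[1] for p in points]
    return zs, direction


# Unlabelled toy data lying on the diagonal of the plane (illustrative only).
points = [(-2.0, -2.0), (-1.0, -1.0), (1.0, 1.0), (2.0, 2.0)]
zs, direction = pca_1d(points)
print(direction, zs)
```

Because these toy points lie on a single line, the distances between the 1-D values equal the distances between the original points, so no similarity information is lost. A clustering step could then operate on `zs` instead of on the original vectors.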

Reinforcement Learning

This is a little different in its idea compared to the previous two approaches to training machines. Here we use a reward system so that the system trains itself and gains knowledge about its own performance.

[!Definition] Reinforcement learning in its core concept? #card

We provide a system to interact with: $s_{t+1} = \mathrm{system}(s_t, a_t)$ -> $s_t$ is the state, $a_t$ is an action (at time $t$)

We have the function space $\mathcal{F}$ again, with $f \in \mathcal{F}$ denoting a policy that decides which action to take based on the current state: $a_t = f(s_t)$

To evaluate / refine later we introduce a reward/utility function: $U$

Objective: we want to find the function $f$ that maximizes the expected reward $\mathbb{E}[U]$

Generally speaking, these stochastic systems are described by Markov decision processes.

Tasks of models in reinforcement learning are therefore:

  • collect their own data to develop on / with
  • simultaneously learn $f$ and potentially models of $U$ and the system (the utility function and the system to interact with)
  • the reward can be sparse - only at the end of a long action sequence or such ^1721145490696

Reinforcement learning can be deployed in various fields, for example:

  • robot control and movements
  • DeepMind's AlphaGo -> game bots
  • walking simulations
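
To make the objective "find the policy that maximizes the expected reward" concrete, here is a small sketch on a hypothetical Markov decision process: four states on a line, a move that succeeds with probability 0.8, and a sparse utility that pays only if the final state is the goal. Everything in it (the environment, the constants, the function names) is an assumption for illustration. It also evaluates every policy exactly by brute force instead of learning from interaction, which is only feasible because the example is tiny.

```python
from itertools import product

N_STATES, GOAL, HORIZON, P_SUCCESS = 4, 3, 5, 0.8
ACTIONS = (-1, +1)  # step left / step right


def system_step(state, action):
    """Hypothetical stochastic system: returns {next_state: probability}.

    The move succeeds with P_SUCCESS, otherwise the state stays unchanged.
    Moves past either end of the line are clamped.
    """
    target = min(max(state + action, 0), N_STATES - 1)
    dist = {}
    dist[target] = dist.get(target, 0.0) + P_SUCCESS
    dist[state] = dist.get(state, 0.0) + (1.0 - P_SUCCESS)
    return dist


def utility(final_state):
    """Sparse reward: paid only if the goal is occupied at the very end."""
    return 1.0 if final_state == GOAL else 0.0


def expected_utility(policy, start=0):
    """Expected utility of a policy (one action per state) after HORIZON steps."""
    dist = {start: 1.0}
    for _ in range(HORIZON):
        nxt = {}
        for s, p in dist.items():
            for s2, q in system_step(s, policy[s]).items():
                nxt[s2] = nxt.get(s2, 0.0) + p * q
        dist = nxt
    return sum(p * utility(s) for s, p in dist.items())


# The function space: all deterministic policies state -> action (2^4 = 16).
policies = list(product(ACTIONS, repeat=N_STATES))
best_policy = max(policies, key=expected_utility)
best_value = expected_utility(best_policy)
print(best_policy, round(best_value, 5))
```

In realistic settings the system and the utility are not known in advance, which is why the model has to collect its own data and learn them along the way, as listed above.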

Usage of ML

Generally speaking, ML has one large objective:

[!Definition] To solve problems for which we do not have good algorithms, but do have data (a lot!)

It is involved in many, many fields nowadays:

Daily life:

  • sorting pictures
  • predicting what will be done (advertisement / content recommendation)
  • diagnosis assistants in health
  • access to knowledge -> LLMs

Science:

  • automatic processing of experimental data
  • new tools for searching in vast spaces - protein conformations or similar

Data privacy and security:

We ought to be cautious about the origin of the data we use: -> machine learning systems work with a lot of data, hence it's important to select and source it well

Carelessly sourced data could:

  • introduce bias regarding gender/race/ethnicity
  • exploit people's personal information etc.

Other fields:

  • psychology
  • computer science …
  • statistics